Module 3.1 — LLM-Authorship Signals in Dropped Code
50-minute lecture · Day 3 morning
Learning objectives
By the end of this module, students can:
- Identify five categories of LLM-authorship signal that appear in malicious code samples in 2024-2026
- Apply a working YARA rule pack that flags suspect samples for human review
- Articulate the false-positive scenarios (junior developers, textbook code, intentional camouflage) that limit how LLM-authorship signals can be used in automated blocking
- Pair LLM-authorship signals with behavioral correlation to produce high-confidence detection
The honest read
LLM-authorship detection is a low-fidelity, high-volume signal. Treat it as a flag for human review, not as a basis for auto-blocking. The reasons:
- Junior developers and non-native English programmers produce code that looks LLM-shaped (verbose comments, textbook variable naming)
- Adversaries who realize defenders are looking for these signals will train themselves out of them (or post-process LLM output to remove the tells)
- LLMs themselves are getting better at producing varied, less idiomatic code
Despite the limitations, the signal is useful when paired with corroborating telemetry. A binary dropped to disk that both has LLM-authorship features and exhibits suspicious runtime behavior is a higher-priority alert than either signal alone. This module covers the authorship side; behavioral correlation is covered in Module 3.2.
What HP Wolf Security saw in May 2025
The May 2025 HP Wolf Security threat report (HP’s regular endpoint-protection threat-landscape publication) documented a French-targeted phishing campaign that delivered AsyncRAT through HTML smuggling. Its VBScript and JavaScript droppers exhibited clear LLM-authorship signals.
The specific signals HP Wolf called out:
- Over-explanatory comments narrating trivial code flow (“This function will initialize the connection to the C2 server”)
- AI-idiom variable names with redundant suffixes (result_data, final_output, processed_items)
- Verbose docstrings for trivial helper functions
- Defensive over-handling: try/except wrappers around operations that cannot fail
- Templating fingerprints: code structure that matches common LLM “tutorial” output patterns
HP characterized the campaign as “evidence that adversaries are using AI to author malware at scale.” The same signals were subsequently found in samples linked to other campaigns through late 2025.
Source: HP Wolf Security threat insights, May 2025 (verify current URL at hp.com/wolf-security).
The five LLM-authorship signal categories
A working taxonomy detection engineers should know:
Signal 1: Over-explanatory comments
LLMs trained on tutorial content default to narrating their code. Malware authors writing for stealth rarely narrate. The pattern:
# This function will initialize the connection to the remote server
# We use TCP because UDP would be unreliable for our purposes
# First, we create a socket object
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Now we connect to the server using the IP and port from our config
sock.connect((server_ip, server_port))
Real malware authors typically write:
s = socket.socket(2, 1); s.connect((ip, p))
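The contrast between the two snippets above can be roughly quantified as a comment-to-line ratio. The following is a minimal sketch, not part of the course rule pack: `comment_ratio` is a hypothetical helper, and it assumes Python-style full-line `#` comments only.

```python
def comment_ratio(source: str) -> float:
    """Fraction of non-blank lines that are full-line '#' comments."""
    lines = [line.strip() for line in source.splitlines() if line.strip()]
    if not lines:
        return 0.0
    comment_lines = sum(1 for line in lines if line.startswith("#"))
    return comment_lines / len(lines)


# The two snippets from this section, held as plain text (never executed).
VERBOSE = """\
# This function will initialize the connection to the remote server
# We use TCP because UDP would be unreliable for our purposes
# First, we create a socket object
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Now we connect to the server using the IP and port from our config
sock.connect((server_ip, server_port))
"""
TERSE = "s = socket.socket(2, 1); s.connect((ip, p))\n"

print(round(comment_ratio(VERBOSE), 2))  # 0.67
print(comment_ratio(TERSE))              # 0.0
```

A high ratio is only a weak hint on its own, since tutorial code and well-documented internal tooling score just as high.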
Signal 2: AI-idiom variable names
LLMs gravitate toward generic descriptive names with redundant suffixes — result_data, final_output, processed_items, return_value. Real malware uses short or obfuscated names (a, _x, deliberately misleading legitimate_handler).
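One way to triage this signal is to collect identifiers that end in the redundant suffixes. This is a sketch under stated assumptions: the suffix list is taken only from the examples above, `idiom_names` is a hypothetical helper, and the tokenizer is a plain regex that also picks up words inside comments and strings.

```python
import re

# Suffixes taken from the examples in this section; extend from your own corpus.
REDUNDANT_SUFFIXES = ("_data", "_output", "_items", "_value")
IDENTIFIER_RE = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*")


def idiom_names(source: str) -> set:
    """Return identifiers that end in one of the redundant suffixes."""
    return {
        token
        for token in IDENTIFIER_RE.findall(source)
        if token.endswith(REDUNDANT_SUFFIXES) and not token.startswith("_")
    }


sample = (
    "result_data = fetch()\n"
    "final_output = transform(result_data)\n"
    "a = final_output\n"
)
print(sorted(idiom_names(sample)))  # ['final_output', 'result_data']
```

Names like these are also common in ordinary human-written code, so the count is better used as one input to a combined score than as a verdict.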
Signal 3: Verbose imports and structured boilerplate
LLM-generated code typically has every import on its own line, sometimes with explanatory comments:
# We use os for filesystem operations
import os
# We use sys for argument parsing
import sys
# We use socket for network connections
import socket
Compare the compressed imports typical of hand-written malware: import os,sys,socket,base64,subprocess.
Signal 4: Specific phrasings
Certain phrases recur with high frequency in LLM-generated content: “for the purposes of this,” “important to note,” “in summary,” “let me explain.” These rarely appear in malware comments authored by humans.
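A phrase check of this kind reduces to case-insensitive substring counting. The sketch below uses only the four phrases quoted above; `phrase_counts` is a hypothetical helper, and a real list would need to be built from observed samples.

```python
# Phrases quoted in this section; not an exhaustive or validated list.
LLM_PHRASES = (
    "for the purposes of this",
    "important to note",
    "in summary",
    "let me explain",
)


def phrase_counts(text: str) -> dict:
    """Count case-insensitive occurrences of each phrase, omitting zeros."""
    lowered = text.lower()
    counts = {phrase: lowered.count(phrase) for phrase in LLM_PHRASES}
    return {phrase: n for phrase, n in counts.items() if n > 0}


comments = (
    "# It is important to note that the key is static\n"
    "# In summary, we send the buffer\n"
    "# Important to note: retry on failure\n"
)
print(phrase_counts(comments))  # {'important to note': 2, 'in summary': 1}
```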
Signal 5: Defensive over-handling
LLMs wrap operations in try/except even when no exception is possible:
try:
    x = 1 + 1
except Exception as e:
    print(f"An unexpected error occurred: {e}")
Real malware rarely catches everything; either it ignores errors or it has specific handling for specific exceptions.
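For Python samples, catch-all handlers can be located with the standard `ast` module. This is a minimal sketch, assuming the sample parses as Python; `broad_handlers` is a hypothetical helper, and it only finds broad handlers without proving that the guarded code cannot fail.

```python
import ast


def broad_handlers(source: str) -> list:
    """Return line numbers of bare `except:` or `except Exception` handlers."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ExceptHandler):
            continue
        is_bare = node.type is None
        is_broad = isinstance(node.type, ast.Name) and node.type.id in {
            "Exception",
            "BaseException",
        }
        if is_bare or is_broad:
            hits.append(node.lineno)
    return sorted(hits)


SAMPLE = '''try:
    x = 1 + 1
except Exception as e:
    print(f"An unexpected error occurred: {e}")
'''
print(broad_handlers(SAMPLE))  # [3]
```

Parsing never executes the sample, which makes the check reasonable to run in a static triage stage.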
The YARA rule pack (Codex-generated)
The full rule pack is at .boss-pattern-work/day3/llm_authorship.yar. Five rules cover the five signal categories above. Each rule’s meta.threshold field includes the disclaimer: “High-confidence only when combined with corroborating indicators; flag for human review, not auto-block.”
Rule 1: Over-explanatory comments
rule rule_1_overexplained_comments
{
    meta:
        author = "vExpertAI x SANS Course"
        description = "Detects dense over-explanatory comments narrating simple code flow in LLM-authored samples."
        threshold = "Flag for human review, not auto-block."
    strings:
        $py_this_function = /#[ \t]*(This function|This method|This script)[ \t]+(will|is designed to|is responsible for|helps to)/ nocase
        $py_here_we = /#[ \t]*(Here we|Now we|Next we|Then we)[ \t]+(will|are going to|can|need to)/ nocase
        $py_first_we = /#[ \t]*(First,?[ \t]+we|Next,?[ \t]+we|Finally,?[ \t]+we)[ \t]+(initialize|create|check|process|return)/ nocase
        $py_make_sure = /#[ \t]*(Make sure|Ensure that|We need to make sure)/ nocase
        $c_this_function = /\/\/[ \t]*(This function|This method|This routine)[ \t]+(will|is designed to)/ nocase
        $c_here_we = /\/\/[ \t]*(Here we|Now we|Next we|Then we)[ \t]+(will|are going to|can|need to)/ nocase
        $c_first_we = /\/\/[ \t]*(First,?[ \t]+we|Next,?[ \t]+we|Finally,?[ \t]+we)[ \t]+(initialize|create)/ nocase
        $c_make_sure = /\/\/[ \t]*(Make sure|Ensure that|We need to make sure)/ nocase
    condition:
        filesize < 2MB and
        (
            (3 of ($py_*) and (#py_this_function + #py_here_we + #py_first_we + #py_make_sure) >= 5) or
            (3 of ($c_*) and (#c_this_function + #c_here_we + #c_first_we + #c_make_sure) >= 5)
        )
}
The remaining four rules (rule_2_idiom_naming, rule_3_verbose_structure, rule_4_phrases, rule_5_defensive_overhandling) follow the same pattern. The full file is committed to the course materials.
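To reason about Rule 1's threshold without a YARA engine, the `#`-comment branch of its condition can be approximated with Python's `re` module. This is a sketch for unit-testing intuition only: `rule_1_py_branch` is a hypothetical name, the `//` branch is omitted, and Python's non-overlapping match counts may differ from YARA's in edge cases.

```python
import re

# The four $py_* patterns from Rule 1, copied verbatim.
PY_PATTERNS = {
    "py_this_function": r"#[ \t]*(This function|This method|This script)[ \t]+(will|is designed to|is responsible for|helps to)",
    "py_here_we": r"#[ \t]*(Here we|Now we|Next we|Then we)[ \t]+(will|are going to|can|need to)",
    "py_first_we": r"#[ \t]*(First,?[ \t]+we|Next,?[ \t]+we|Finally,?[ \t]+we)[ \t]+(initialize|create|check|process|return)",
    "py_make_sure": r"#[ \t]*(Make sure|Ensure that|We need to make sure)",
}
COMPILED = {name: re.compile(rx, re.IGNORECASE) for name, rx in PY_PATTERNS.items()}
MAX_SIZE = 2 * 1024 * 1024  # YARA's 2MB


def rule_1_py_branch(source: str) -> bool:
    """Approximate: filesize < 2MB and 3 of ($py_*) and total count >= 5."""
    if len(source.encode("utf-8")) >= MAX_SIZE:
        return False
    counts = {name: sum(1 for _ in rx.finditer(source)) for name, rx in COMPILED.items()}
    distinct = sum(1 for n in counts.values() if n > 0)
    return distinct >= 3 and sum(counts.values()) >= 5


DENSE = (
    "# This function will open the socket\n"
    "# Here we will build the request\n"
    "# First, we create the buffer\n"
    "# Make sure the buffer is not empty\n"
    "# Now we need to send it\n"
)
SPARSE = (
    "# This function will initialize the connection to the remote server\n"
    "# First, we create a socket object\n"
)
print(rule_1_py_branch(DENSE))   # True
print(rule_1_py_branch(SPARSE))  # False
```

The second case shows why the rule requires density: two narrating comments match two patterns but fall short of both the three-pattern and five-hit thresholds.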
Deployment guidance
- Run in shadow mode for 30 days against your incoming malware sample queue. Measure false-positive rate against known-clean code samples (open-source libraries, your dev team’s tutorial code, sandbox-detonator scripts).
- Set the alert as “flag for review,” not as “block.” A hit on this rule pack triggers a human-review queue, not a blocklist insertion.
- Pair with behavioral signals. Module 3.2 covers runtime-LLM-query detection. A sample that fires on the YARA pack and queries an LLM API at runtime is a much higher-confidence alert than either alone.
- Track precision over time. Adversaries who realize the rules are deployed will train themselves out of the signals. Treat the rule pack as having a 12-18 month half-life; refresh the patterns periodically based on observed sample evolution.
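The shadow-mode measurements in the guidance above come down to two ratios. The sketch below assumes each sample in the shadow period has an analyst verdict; `ShadowResult` and `shadow_metrics` are hypothetical names, not part of the course materials.

```python
from dataclasses import dataclass


@dataclass
class ShadowResult:
    sample_id: str
    rule_fired: bool     # did any rule in the pack match?
    is_malicious: bool   # analyst verdict (assumed available)


def shadow_metrics(results: list) -> dict:
    """Compute precision and false-positive rate; None when undefined."""
    tp = sum(r.rule_fired and r.is_malicious for r in results)
    fp = sum(r.rule_fired and not r.is_malicious for r in results)
    tn = sum(not r.rule_fired and not r.is_malicious for r in results)
    precision = tp / (tp + fp) if (tp + fp) else None
    fpr = fp / (fp + tn) if (fp + tn) else None
    return {"precision": precision, "false_positive_rate": fpr}


results = [
    ShadowResult("s1", True, True),     # true positive
    ShadowResult("s2", True, False),    # false positive
    ShadowResult("s3", False, False),   # true negative
    ShadowResult("s4", False, False),   # true negative
]
print(shadow_metrics(results)["precision"])  # 0.5
```

Recomputing these figures on a fixed schedule gives a concrete way to notice when adversary counter-tuning has started to erode the rules.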
Adjacent research
Beyond the HP Wolf 2025 report, other vendor research worth tracking:
- SentinelLABS (LabsCon 2025): published on LLM-enabled malware at runtime, including YARA rules targeting hardcoded LLM API keys, model identifiers (gpt-oss:20b), and default local API ports like Ollama’s 11434. Different angle (runtime query, not authorship) but a complementary detection layer.
- CERT-UA (July 2025): published on the LameHug / PROMPTSTEAL APT28 campaign, which queries Qwen2.5-Coder via Hugging Face at runtime. The LameHug malware itself has some LLM-authorship signals in the dropper code.
Cross-checking against these adjacent research lines helps your rule pack stay current.
False-positive scenarios to expect
When you deploy the rule pack, expect false positives in these populations:
| Population | Why they trip the rules | How to handle |
|---|---|---|
| Junior developers / bootcamp graduates | Their style mirrors tutorial code with verbose comments and explicit variable names | Allow-list internal dev-team commit hashes; exclude internal-source-tagged repositories |
| Non-native English programmers | Verbose explanatory comments are a common second-language pattern | Pair with binary-execution context; ignore at static-analysis-only stage |
| Educational / tutorial code samples | Designed to be verbose for pedagogy | Filter by file path; tutorial directories should be excluded |
| Generated code from legitimate copilot tools | GitHub Copilot, Cursor, Continue — your own dev team’s AI-assistant output | Allow-list internal repos; track which projects use AI-assisted authoring |
| Sandbox-detonator harnesses | Often verbose for analyst clarity | Allow-list known sandbox tooling |
The rule of thumb: if your alert volume is dominated by false positives, the rules are too sensitive for blocking but may still be useful as a triage-priority signal. Adjust thresholds; don’t disable.
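The path-based handling in the table can be sketched as a directory-name check applied before an alert reaches the review queue. The directory names below are hypothetical placeholders, and `is_excluded` is an illustrative helper rather than part of the rule pack.

```python
from pathlib import PurePosixPath

# Hypothetical directory names; replace with your organization's layout.
EXCLUDED_DIR_NAMES = {"tutorials", "examples", "training"}


def is_excluded(path: str) -> bool:
    """True if any parent directory of the file is on the exclusion list."""
    parent_dirs = {part.lower() for part in PurePosixPath(path).parts[:-1]}
    return bool(parent_dirs & EXCLUDED_DIR_NAMES)


print(is_excluded("repo/tutorials/socket_demo.py"))  # True
print(is_excluded("incoming/queue/dropper.py"))      # False
```

Because an adversary can choose file paths, it is safer to use an exclusion like this to lower an alert's priority than to drop the alert entirely.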
Pairing with behavioral correlation (preview of Module 3.2)
The single most valuable correlation is:
Sample fires on rule_1_overexplained_comments AND sample dropped to disk exhibits outbound HTTPS to a known LLM API endpoint AND the binary running was not a known developer tool.
Because all three conditions must hold at once, the conjunction should produce far fewer false positives than any single condition, while still catching the increasingly common pattern of LLM-authored droppers that fetch their next stage from an LLM API at runtime. Module 3.2 covers the runtime-query side of this in depth.
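The three-way conjunction can be written as a single predicate over an enriched event record. This is a sketch under stated assumptions: the `SampleEvent` fields, the host list, and the developer-tool allow-list are all illustrative placeholders, not a schema from any particular EDR or SIEM.

```python
from dataclasses import dataclass

# Illustrative values only; maintain real lists from your own telemetry.
KNOWN_LLM_API_HOSTS = {"api.openai.com", "api.anthropic.com"}
KNOWN_DEV_TOOLS = {"code.exe", "cursor.exe"}  # hypothetical allow-list


@dataclass
class SampleEvent:
    yara_rules_hit: set          # names of rules that matched the dropped file
    outbound_https_hosts: set    # hosts contacted over HTTPS at runtime
    process_name: str            # binary that was running


def high_priority(event: SampleEvent) -> bool:
    """All three conditions from the correlation must hold."""
    fired = "rule_1_overexplained_comments" in event.yara_rules_hit
    talks_to_llm = bool(event.outbound_https_hosts & KNOWN_LLM_API_HOSTS)
    not_dev_tool = event.process_name.lower() not in KNOWN_DEV_TOOLS
    return fired and talks_to_llm and not_dev_tool


suspicious = SampleEvent(
    {"rule_1_overexplained_comments"}, {"api.openai.com"}, "wscript.exe"
)
developer = SampleEvent(
    {"rule_1_overexplained_comments"}, {"api.openai.com"}, "Code.exe"
)
print(high_priority(suspicious))  # True
print(high_priority(developer))   # False
```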
Discussion questions (~10 min)
- The HP Wolf 2025 report named LLM-authored AsyncRAT droppers as evidence that adversaries are using AI at scale. Why are droppers (not the full malware payload) the place where LLM-authorship signals show up most? What does this tell you about the adversary’s workflow?
- Your YARA rule pack is producing a 12% false-positive rate against your sample queue. Most FPs are tutorial code from a security-training Slack channel. What’s the highest-leverage tuning change you can make without disabling rules?
- An adversary reads this module, then post-processes their LLM-generated droppers to compress comments and minify variable names. Are the rule-pack signals useless now? What residual signal might still survive their counter-measures?
Common mistakes
| Mistake | Better approach |
|---|---|
| Deploying YARA rules as auto-block | Use as triage-priority signal; combine with behavioral evidence before blocking |
| Treating high-comment-density as proof of LLM authorship | Junior devs and non-native English programmers produce similar code; require multiple signal types |
| Assuming the rules have permanent shelf-life | Adversary counter-tuning is fast; refresh quarterly based on observed sample evolution |
| Running rules only at endpoint detonation | Run at gateway, in code-review systems (for supply-chain attacks), in email attachment filtering |
What’s next
Module 3.2 covers polymorphic and runtime-generated malware — where the malicious behavior is not in the binary at all, but is fetched at runtime from an LLM API. The detection shifts from static analysis (this module) to behavioral / network analysis.