Module 3.1 — LLM-Authorship Signals in Dropped Code

50-minute lecture · Day 3 morning

Learning objectives

By the end of this module, students can:

  1. Identify five categories of LLM-authorship signal that appear in malicious code samples in 2024-2026
  2. Apply a working YARA rule pack that flags suspect samples for human review
  3. Articulate the false-positive scenarios (junior developers, textbook code, intentional camouflage) that limit how LLM-authorship signals can be used in automated blocking
  4. Pair LLM-authorship signals with behavioral correlation to produce high-confidence detection

The honest read

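LLM-authorship detection is a low-fidelity, high-volume signal. Treat it as a flag for human review, not as a basis for auto-blocking. The reasons are the false-positive populations (junior developers, textbook code, intentional camouflage) and the adversary counter-tuning discussed later in this module.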

Despite the limitations, the signal is useful when paired with corroborating telemetry. A file dropped to disk that has LLM-authorship features and also exhibits suspicious runtime behavior is a higher-priority alert than either signal alone. This module covers the authorship side; behavioral correlation is covered in Module 3.2.


What HP Wolf Security saw in May 2025

The May 2025 HP Wolf Security threat report (HP’s regular endpoint-protection threat-landscape publication) documented a French-targeted phishing campaign that delivered AsyncRAT through HTML smuggling. Its VBScript and JavaScript droppers exhibited clear LLM-authorship signals.

The specific signals HP Wolf called out:

HP characterized the campaign as “evidence that adversaries are using AI to author malware at scale.” The same signals were subsequently found in samples linked to other campaigns through late 2025.

Source: HP Wolf Security threat insights, May 2025 (verify current URL at hp.com/wolf-security).


The five LLM-authorship signal categories

A working taxonomy detection engineers should know:

Signal 1: Over-explanatory comments

LLMs trained on tutorial content tend to narrate their code. Malware authors writing for stealth rarely do. The pattern:

# This function will initialize the connection to the remote server
# We use TCP because UDP would be unreliable for our purposes
# First, we create a socket object
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Now we connect to the server using the IP and port from our config
sock.connect((server_ip, server_port))

Real malware authors typically write:

s = socket.socket(2, 1); s.connect((ip, p))
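The contrast between the two snippets above can be turned into a rough number. The following is a minimal sketch, not part of the course rule pack: the `NARRATION` opener list and the `narration_stats` helper are hypothetical names chosen for illustration, and the sketch only handles `#`-style comments.

```python
import re

# Hypothetical list of narration openers, loosely modeled on the
# tutorial-style comments in the example above. Tune against real samples.
NARRATION = re.compile(
    r"^\s*#\s*(this (function|method|script)|here we|now we|next we|then we|"
    r"first,? we|finally,? we|we use|make sure|ensure that)\b",
    re.IGNORECASE,
)

def narration_stats(source: str) -> dict:
    """Count non-blank lines, comment lines, and narrating comment lines."""
    lines = [ln for ln in source.splitlines() if ln.strip()]
    comments = [ln for ln in lines if ln.lstrip().startswith("#")]
    narrating = [ln for ln in comments if NARRATION.match(ln)]
    return {
        "lines": len(lines),
        "comments": len(comments),
        "narrating": len(narrating),
        "comment_ratio": len(comments) / len(lines) if lines else 0.0,
    }
```

A high `narrating` count together with a high `comment_ratio` is a reason to look closer, not a verdict; the false-positive populations discussed later in this module produce the same numbers.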

Signal 2: AI-idiom variable names

LLMs gravitate toward generic descriptive names with redundant suffixes: result_data, final_output, processed_items, return_value. Hand-written malware more often uses short or obfuscated names (a, _x) or deliberately misleading ones (legitimate_handler).
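For Python samples, the naming signal can be checked against the syntax tree rather than raw text. This is a sketch under stated assumptions: the `GENERIC_STEMS` and `REDUNDANT_SUFFIXES` word lists and the `idiom_names` helper are hypothetical, seeded only from the four names listed above, and the sample must parse as valid Python.

```python
import ast

# Hypothetical word lists for illustration; extend from your own corpus.
GENERIC_STEMS = {"result", "final", "processed", "return"}
REDUNDANT_SUFFIXES = {"data", "output", "items", "value"}

def idiom_names(source: str) -> list[str]:
    """Return assigned names shaped like <generic stem>_<redundant suffix>."""
    hits = set()
    for node in ast.walk(ast.parse(source)):
        # Only look at names being assigned to, not names being read.
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            parts = node.id.lower().split("_")
            if (
                len(parts) >= 2
                and parts[0] in GENERIC_STEMS
                and parts[-1] in REDUNDANT_SUFFIXES
            ):
                hits.add(node.id)
    return sorted(hits)
```

Counting distinct hits relative to the total number of assigned names gives a ratio that is easier to threshold than a raw count.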

Signal 3: Verbose imports and structured boilerplate

LLM-generated code typically has every import on its own line, sometimes with explanatory comments:

# We use os for filesystem operations
import os
# We use sys for argument parsing
import sys
# We use socket for network connections
import socket

Compare the compressed imports typical of hand-written malware: import os,sys,socket,base64,subprocess.
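The two import layouts above differ in ways that are easy to count. A minimal sketch, assuming Python source and a hypothetical helper name `import_layout`:

```python
import re

# Matches lines that begin an import statement.
IMPORT_LINE = re.compile(r"^\s*(import|from)\s+\w")

def import_layout(source: str) -> dict:
    """Summarize how imports are laid out in a Python source string."""
    lines = source.splitlines()
    total = commented = packed = 0
    for i, line in enumerate(lines):
        if not IMPORT_LINE.match(line):
            continue
        total += 1
        # A comment directly above the import counts as "explained".
        if i > 0 and lines[i - 1].lstrip().startswith("#"):
            commented += 1
        # "import a,b,c" style: several modules on one import statement.
        if line.lstrip().startswith("import") and "," in line:
            packed += 1
    return {"imports": total, "commented": commented, "packed": packed}
```

A sample where `commented` equals `imports` leans toward the verbose pattern; a sample where `packed` is non-zero leans toward the compressed one. Many legitimate style guides also require one import per line, so this signal is weak on its own.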

Signal 4: Specific phrasings

Certain phrases recur with high frequency in LLM-generated content: “for the purposes of this,” “important to note,” “in summary,” “let me explain.” These rarely appear in malware comments authored by humans.
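Phrase matching is the simplest of the five checks to prototype. The sketch below uses only the four phrases quoted above; the `PHRASES` tuple and the `phrase_hits` helper are hypothetical names, and a production list would need to be built from observed samples.

```python
# The four phrases quoted in the text, lowercased for matching.
PHRASES = (
    "for the purposes of this",
    "important to note",
    "in summary",
    "let me explain",
)

def phrase_hits(text: str) -> dict:
    """Return a case-insensitive count for each phrase that occurs at least once."""
    lowered = text.lower()
    counts = {phrase: lowered.count(phrase) for phrase in PHRASES}
    return {phrase: n for phrase, n in counts.items() if n}
```

Because these phrases are also common in ordinary documentation, the count is more useful when restricted to comments and string literals inside a sample than when run over whole files.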

Signal 5: Defensive over-handling

LLMs often wrap operations in try/except even when no exception is possible:

try:
    x = 1 + 1
except Exception as e:
    print(f"An unexpected error occurred: {e}")

Real malware rarely catches everything; either it ignores errors or it has specific handling for specific exceptions.
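One way to approximate this signal for Python samples is to look for catch-all handlers that report the error in a friendly message, as in the example above. This is a sketch under assumptions: the helper name `reported_catch_alls` is hypothetical, only `print` is treated as reporting, and silent handlers such as `except: pass` are deliberately not flagged because they fit the "ignores errors" style.

```python
import ast

CATCH_ALL_NAMES = {"Exception", "BaseException"}

def reported_catch_alls(source: str) -> list[int]:
    """Return line numbers of catch-all handlers whose body calls print()."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ExceptHandler):
            continue
        # A bare "except:" has type None; otherwise check the caught name.
        catch_all = node.type is None or (
            isinstance(node.type, ast.Name) and node.type.id in CATCH_ALL_NAMES
        )
        # Look for any print(...) call anywhere inside the handler body.
        prints = any(
            isinstance(inner, ast.Call)
            and isinstance(inner.func, ast.Name)
            and inner.func.id == "print"
            for stmt in node.body
            for inner in ast.walk(stmt)
        )
        if catch_all and prints:
            hits.append(node.lineno)
    return sorted(hits)
```

The sketch does not try to prove that the guarded code cannot raise; it only counts the handler shape, which is why it belongs in a review queue rather than a blocking rule.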


The YARA rule pack (Codex-generated)

The full rule pack is at .boss-pattern-work/day3/llm_authorship.yar. Five rules cover the five signal categories above. Each rule’s meta.threshold field includes the disclaimer: “High-confidence only when combined with corroborating indicators; flag for human review, not auto-block.”

Rule 1: Over-explanatory comments

rule rule_1_overexplained_comments
{
  meta:
    author = "vExpertAI x SANS Course"
    description = "Detects dense over-explanatory comments narrating simple code flow in LLM-authored samples."
    threshold = "High-confidence only when combined with corroborating indicators; flag for human review, not auto-block."

  strings:
    $py_this_function = /#[ \t]*(This function|This method|This script)[ \t]+(will|is designed to|is responsible for|helps to)/ nocase
    $py_here_we = /#[ \t]*(Here we|Now we|Next we|Then we)[ \t]+(will|are going to|can|need to)/ nocase
    $py_first_we = /#[ \t]*(First,?[ \t]+we|Next,?[ \t]+we|Finally,?[ \t]+we)[ \t]+(initialize|create|check|process|return)/ nocase
    $py_make_sure = /#[ \t]*(Make sure|Ensure that|We need to make sure)/ nocase
    $c_this_function = /\/\/[ \t]*(This function|This method|This routine)[ \t]+(will|is designed to)/ nocase
    $c_here_we = /\/\/[ \t]*(Here we|Now we|Next we|Then we)[ \t]+(will|are going to|can|need to)/ nocase
    $c_first_we = /\/\/[ \t]*(First,?[ \t]+we|Next,?[ \t]+we|Finally,?[ \t]+we)[ \t]+(initialize|create)/ nocase
    $c_make_sure = /\/\/[ \t]*(Make sure|Ensure that|We need to make sure)/ nocase

  condition:
    filesize < 2MB and
    (
      (3 of ($py_*) and (#py_this_function + #py_here_we + #py_first_we + #py_make_sure) >= 5) or
      (3 of ($c_*) and (#c_this_function + #c_here_we + #c_first_we + #c_make_sure) >= 5)
    )
}

The remaining four rules (rule_2_idiom_naming, rule_3_verbose_structure, rule_4_phrases, rule_5_defensive_overhandling) follow the same pattern. The full file is committed to the course materials.
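When tuning Rule 1, it can help to reason about its condition without a YARA installation. The following is a pure-Python approximation of the Python-comment branch only, written for this handout as a sketch: the function name `rule_1_py_branch` is hypothetical, Python's `re` engine is not identical to YARA's, and any real tuning decision should be confirmed against the actual rule.

```python
import re

# The four $py_* patterns from Rule 1, copied without the YARA delimiters.
PY_PATTERNS = {
    "py_this_function": r"#[ \t]*(This function|This method|This script)[ \t]+(will|is designed to|is responsible for|helps to)",
    "py_here_we": r"#[ \t]*(Here we|Now we|Next we|Then we)[ \t]+(will|are going to|can|need to)",
    "py_first_we": r"#[ \t]*(First,?[ \t]+we|Next,?[ \t]+we|Finally,?[ \t]+we)[ \t]+(initialize|create|check|process|return)",
    "py_make_sure": r"#[ \t]*(Make sure|Ensure that|We need to make sure)",
}
COMPILED = {name: re.compile(rx, re.IGNORECASE) for name, rx in PY_PATTERNS.items()}

MAX_BYTES = 2 * 1024 * 1024  # mirrors "filesize < 2MB"

def rule_1_py_branch(data: str) -> bool:
    """Approximate: filesize < 2MB and 3 of ($py_*) and total matches >= 5."""
    if len(data.encode("utf-8")) >= MAX_BYTES:
        return False
    counts = {
        name: sum(1 for _ in rx.finditer(data)) for name, rx in COMPILED.items()
    }
    distinct = sum(1 for n in counts.values() if n > 0)
    return distinct >= 3 and sum(counts.values()) >= 5
```

Under this approximation, the six-line socket example from Signal 1 matches only two of the four patterns and so does not satisfy the condition by itself. That is a useful reminder that the rule is tuned for dense narration across a whole file, not for a single commented function.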

Deployment guidance

  1. Run in shadow mode for 30 days against your incoming malware sample queue. Measure false-positive rate against known-clean code samples (open-source libraries, your dev team’s tutorial code, sandbox-detonator scripts).
  2. Set the alert as “flag for review,” not as “block.” A hit on this rule pack triggers a human-review queue, not a blocklist insertion.
  3. Pair with behavioral signals. Module 3.2 covers runtime-LLM-query detection. A sample that fires on the YARA pack and queries an LLM API at runtime is a much higher-confidence alert than either alone.
  4. Track precision over time. Adversaries who realize the rules are deployed will train themselves out of the signals. Treat the rule pack as having a 12-18 month half-life; refresh the patterns periodically based on observed sample evolution.
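Steps 1 and 4 above come down to simple bookkeeping over analyst verdicts. The following is a minimal sketch of that bookkeeping; the `ShadowResult` record and `shadow_metrics` helper are hypothetical and assume every sample in the shadow-mode window eventually receives a ground-truth verdict.

```python
from dataclasses import dataclass

@dataclass
class ShadowResult:
    sample_id: str
    rule_hit: bool    # did the YARA pack fire on this sample?
    malicious: bool   # analyst verdict, treated as ground truth

def shadow_metrics(results: list[ShadowResult]) -> dict:
    """Compute precision and false-positive rate for a shadow-mode window."""
    tp = sum(r.rule_hit and r.malicious for r in results)
    fp = sum(r.rule_hit and not r.malicious for r in results)
    tn = sum((not r.rule_hit) and (not r.malicious) for r in results)
    # Guard against empty denominators in small or one-sided windows.
    precision = tp / (tp + fp) if (tp + fp) else None
    fpr = fp / (fp + tn) if (fp + tn) else None
    return {"hits": tp + fp, "precision": precision, "false_positive_rate": fpr}
```

Recording these numbers per window (for example, per month) makes a drop in precision visible when adversaries start tuning away from the patterns.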

Adjacent research

Beyond the HP Wolf 2025 report, other vendor research worth tracking:

Cross-checking against these adjacent research lines helps your rule pack stay current.


False-positive scenarios to expect

When you deploy the rule pack, expect false positives in these populations:

| Population | Why they trip the rules | How to handle |
| --- | --- | --- |
| Junior developers / bootcamp graduates | Their style mirrors tutorial code with verbose comments and explicit variable names | Allow-list internal dev-team commit hashes; exclude internal-source-tagged repositories |
| Non-native English programmers | Verbose explanatory comments are a common second-language pattern | Pair with binary-execution context; ignore at static-analysis-only stage |
| Educational / tutorial code samples | Designed to be verbose for pedagogy | Filter by file path; tutorial directories should be excluded |
| Generated code from legitimate copilot tools | GitHub Copilot, Cursor, Continue: your own dev team’s AI-assistant output | Allow-list internal repos; track which projects use AI-assisted authoring |
| Sandbox-detonator harnesses | Often verbose for analyst clarity | Allow-list known sandbox tooling |

The rule of thumb: if your alert volume is dominated by false positives, the rules are too sensitive for blocking but may still be useful as a triage-priority signal. Adjust thresholds; don’t disable.
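The path-based exclusions listed for tutorial code and sandbox tooling can be applied as a filter between the rule hit and the review queue. This is a sketch only: the `ALLOW_PATTERNS` globs and the `should_queue` helper are hypothetical and would need to reflect your own repository layout.

```python
from fnmatch import fnmatchcase

# Hypothetical allow-list globs; replace with paths from your environment.
ALLOW_PATTERNS = (
    "*/tutorials/*",
    "*/examples/*",
    "*/sandbox-harness/*",
)

def should_queue(path: str, rule_hit: bool) -> bool:
    """Queue a sample for human review only if it hit and is not allow-listed."""
    if not rule_hit:
        return False
    # fnmatchcase keeps matching case-sensitive on every platform.
    return not any(fnmatchcase(path, pattern) for pattern in ALLOW_PATTERNS)
```

Keeping the allow-list outside the YARA rules means the rules themselves stay unchanged while the exclusions are tuned.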


Pairing with behavioral correlation (preview of Module 3.2)

The single most valuable correlation is:

The sample fires on rule_1_overexplained_comments AND the sample, once dropped to disk and run, makes outbound HTTPS connections to a known LLM API endpoint AND the process making those connections is not a known developer tool.

The conjunction should produce far fewer false positives than either signal alone, while still catching the increasingly common pattern of LLM-authored droppers that fetch their next stage from an LLM API at runtime. Module 3.2 covers the runtime-query side of this in depth.
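The three-way conjunction above can be expressed as a single predicate over a joined static-plus-runtime record. The sketch below is illustrative only: the `SampleEvent` record, the `LLM_API_HOSTS` and `KNOWN_DEV_TOOLS` sets, and the `high_priority` helper are all hypothetical, and a real deployment would source these lists from its own telemetry and asset inventory.

```python
from dataclasses import dataclass

# Illustrative placeholders, not a complete or authoritative list.
LLM_API_HOSTS = frozenset({"api.openai.com", "api.anthropic.com"})
# Hypothetical process names for developer tools allowed to call LLM APIs.
KNOWN_DEV_TOOLS = frozenset({"code.exe", "cursor.exe"})

@dataclass(frozen=True)
class SampleEvent:
    sha256: str
    yara_rules: frozenset      # names of rules that fired on the sample
    outbound_hosts: frozenset  # hostnames contacted over HTTPS at runtime
    process_name: str          # process that made the connections

def high_priority(event: SampleEvent) -> bool:
    """True only when all three conditions of the conjunction hold."""
    return (
        "rule_1_overexplained_comments" in event.yara_rules
        and bool(event.outbound_hosts & LLM_API_HOSTS)
        and event.process_name.lower() not in KNOWN_DEV_TOOLS
    )
```

Each condition is cheap to evaluate separately, so the join can run wherever static and runtime telemetry first meet.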


Discussion questions (~10 min)

  1. The HP Wolf 2025 report named LLM-authored AsyncRAT droppers as evidence that adversaries are using AI at scale. Why are droppers (not the full malware payload) the place where LLM-authorship signals show up most? What does this tell you about the adversary’s workflow?
  2. Your YARA rule pack is producing a 12% false-positive rate against your sample queue. Most FPs are tutorial code from a security-training Slack channel. What’s the highest-leverage tuning change you can make without disabling rules?
  3. An adversary reads this module, then post-processes their LLM-generated droppers to compress comments and minify variable names. Are the rule-pack signals useless now? What residual signal might still survive their counter-measures?

Common mistakes

| Mistake | Better approach |
| --- | --- |
| Deploying YARA rules as auto-block | Use as triage-priority signal; combine with behavioral evidence before blocking |
| Treating high-comment-density as proof of LLM authorship | Junior devs and non-native English programmers produce similar code; require multiple signal types |
| Assuming the rules have permanent shelf-life | Adversary counter-tuning is fast; refresh quarterly based on observed sample evolution |
| Running rules only at endpoint detonation | Run at gateway, in code-review systems (for supply-chain attacks), in email attachment filtering |

What’s next

Module 3.2 covers polymorphic and runtime-generated malware — where the malicious behavior is not in the binary at all, but is fetched at runtime from an LLM API. The detection shifts from static analysis (this module) to behavioral / network analysis.