Module 4.2 — Detection Signatures for Adversary Agents

50-minute lecture · Day 4 morning · Hands-on Sigma + Suricata in the lab

Learning objectives

By the end of this module, students can:

  1. Identify the three layers of adversary-agent detection signal: network (TLS SNI, JA3/JA4, DNS), endpoint (Sysmon process and network events), and behavioral (cadence, structured-JSON, agent-loop process trees)
  2. Deploy a working Sigma rule pack for adversary-agent telemetry, with explicit allow-listing for legitimate developer tools
  3. Deploy a working Suricata rule pack for network-level detection of LLM API traffic from anomalous sources
  4. Tune the signatures against your environment’s legitimate AI-tool usage to keep the false-positive rate manageable

The three detection layers

A defender catching GTG-1002 / PROMPTSTEAL-class adversaries has three complementary layers to deploy. Each layer catches different attack stages and has its own false-positive profile.

Layer 1: Network signatures

The defender’s view of an LLM-querying adversary on the wire: TLS SNI values, JA3/JA4 client fingerprints, and DNS queries for LLM API hostnames.

Layer 2: Endpoint signatures

The defender’s view from Sysmon and EDR telemetry: process events and network-connection events.

Layer 3: Behavioral signatures

The agent-loop pattern at the workload level: request cadence, structured-JSON exchanges, and agent-loop process trees.


The Codex-generated Sigma rule pack

The full pack is at .boss-pattern-work/day4/adversary_agent_sigma.yml. Three rules cover the three detection layers. Example excerpt:

title: LLM API HTTPS Connection From Non-Developer Process
id: 2fba710c-f1f2-4191-ab03-6674d3e7a1bc
status: experimental
description: |
  Detects outbound HTTPS connections to common LLM API endpoints from processes
  that are not in a local developer-tool allow-list. This pattern is intended to
  surface adversary-operated AI-agent telemetry and orchestration similar to the
  GTG-1002 use of Claude Code agents and MCP-connected tooling.
references:
  - https://www.anthropic.com/news/disrupting-the-first-reported-ai-orchestrated-cyber-espionage-campaign
  - https://atlas.mitre.org/
author: vExpertAI x SANS Course
date: 2026-05-14
logsource:
  product: windows
  service: sysmon
detection:
  selection_llm_api:
    EventID: 3
    DestinationPort: 443
    DestinationHostname|endswith:
      - api.openai.com
      - api.anthropic.com
      - generativelanguage.googleapis.com
      - api-inference.huggingface.co
      - router.huggingface.co
  filter_allowed_developer_process:
    Image|endswith:
      - '\Code.exe'
      - '\Cursor.exe'
      - '\Windsurf.exe'
      - '\claude.exe'
      - '\node.exe'
      - '\python.exe'
      - '\python3.exe'
      - '\git.exe'
      - '\curl.exe'
      - '\idea64.exe'
      - '\pycharm64.exe'
  condition: selection_llm_api and not filter_allowed_developer_process
fields:
  - UtcTime
  - Computer
  - User
  - Image
  - CommandLine
  - DestinationHostname
  - DestinationIp
falsepositives:
  - Approved enterprise AI clients not represented in the local allow-list
level: high
tags:
  - attack.command_and_control
  - attack.collection
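
To sanity-check the rule's condition before loading it into a SIEM, the same logic can be replayed offline against exported Sysmon events. The sketch below is a minimal, hypothetical re-implementation in Python, not part of the course pack: the function name `rule_matches`, the dictionary event shape, and the sample events are assumptions made for illustration. It lower-cases both sides on the assumption that the backend treats `endswith` as case-insensitive.

```python
# Hypothetical offline check of the excerpt's condition:
#   selection_llm_api and not filter_allowed_developer_process
# Event shape (a dict keyed by Sysmon field names) is an assumption.

LLM_API_HOSTS = (
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
    "api-inference.huggingface.co",
    "router.huggingface.co",
)

# Allow-list suffixes copied from the rule, lower-cased for comparison.
ALLOWED_IMAGE_SUFFIXES = (
    "\\code.exe", "\\cursor.exe", "\\windsurf.exe", "\\claude.exe",
    "\\node.exe", "\\python.exe", "\\python3.exe", "\\git.exe",
    "\\curl.exe", "\\idea64.exe", "\\pycharm64.exe",
)


def rule_matches(event: dict) -> bool:
    """Return True when the event would satisfy the rule's condition."""
    host = str(event.get("DestinationHostname", "")).lower()
    image = str(event.get("Image", "")).lower()
    selection = (
        event.get("EventID") == 3
        and event.get("DestinationPort") == 443
        and host.endswith(LLM_API_HOSTS)
    )
    allowed = image.endswith(ALLOWED_IMAGE_SUFFIXES)
    return selection and not allowed


# Invented sample events for illustration only.
SAMPLE_EVENTS = [
    {"EventID": 3, "DestinationPort": 443,
     "DestinationHostname": "api.anthropic.com",
     "Image": "C:\\Users\\Public\\svc_update.exe"},
    {"EventID": 3, "DestinationPort": 443,
     "DestinationHostname": "api.openai.com",
     "Image": "C:\\Program Files\\Microsoft VS Code\\Code.exe"},
    {"EventID": 3, "DestinationPort": 443,
     "DestinationHostname": "api.openai.com",
     "Image": "C:\\Python312\\python.exe"},
]

if __name__ == "__main__":
    for ev in SAMPLE_EVENTS:
        print(ev["Image"], "->", rule_matches(ev))
```

Only the first sample event matches. The third does not, because `\python.exe` is on the allow-list: agent tooling launched through an allow-listed interpreter is invisible to this rule on its own.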

The remaining Sigma rules cover long-lived polling to LLM endpoints and the agent-loop process tree.

Tuning the allow-list

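The most important tuning step is the developer-tool allow-list. Every org has a different set of legitimate AI-tool users. Initial allow-list candidates are the processes in the excerpt’s filter_allowed_developer_process block: editors and IDEs (Code.exe, Cursor.exe, Windsurf.exe, idea64.exe, pycharm64.exe), AI clients (claude.exe), and general-purpose runtimes and tools (node.exe, python.exe, python3.exe, git.exe, curl.exe).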

Run the rule in shadow mode (matches are logged but do not raise alerts) for 30 days before alerting. Measure the false-positive volume and tune the allow-list against observed legitimate usage. By the end of the 30 days, false positives from known developer processes should be near zero.
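
One way to turn shadow-mode output into tuning decisions is to tally matches per process image and review the most frequent ones as allow-list candidates. The Python sketch below assumes the shadow-mode matches have been exported as a list of dictionaries with an `Image` key. The function names, the `min_hits` cut-off, and the sample data are hypothetical.

```python
from collections import Counter


def allowlist_candidates(alerts, min_hits=5):
    """Return (image, hits) pairs seen at least min_hits times, most frequent first.

    Frequency only nominates a candidate; a human still has to confirm
    the process is an approved AI tool before it goes on the allow-list.
    """
    counts = Counter(a["Image"].lower() for a in alerts)
    return [(img, n) for img, n in counts.most_common() if n >= min_hits]


def daily_alert_rate(alerts, days):
    """Average number of shadow-mode matches per day over the window."""
    return len(alerts) / days


# Invented sample: six hits from one client, one hit from an unknown binary.
SAMPLE_ALERTS = (
    [{"Image": "C:\\Tools\\Approved\\aiclient.exe"}] * 6
    + [{"Image": "C:\\Users\\Public\\x.exe"}]
)

if __name__ == "__main__":
    print(allowlist_candidates(SAMPLE_ALERTS))
    print(daily_alert_rate(SAMPLE_ALERTS, days=30))
```

The rare image is the interesting one: a process that appears once and never becomes a candidate is exactly what the rule is meant to surface.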


The Codex-generated Suricata rule pack

For network-level detection without endpoint visibility, the Suricata pack at .boss-pattern-work/day4/adversary_agent_suricata.rules complements the Sigma rules. Three rules:

alert tls $HOME_NET any -> $EXTERNAL_NET 443 (msg:"AI Agent: TLS SNI to LLM API endpoint";
   tls.sni; content:"api.openai.com"; nocase;
   threshold: type both, track by_src, count 10, seconds 60;
   classtype:trojan-activity;
   sid:4000001; rev:1;)

alert tls $HOME_NET any -> $EXTERNAL_NET 443 (msg:"AI Agent: Long-lived TLS to LLM API";
   tls.sni; content:"api.anthropic.com"; nocase;
   flow:established,from_client;
   threshold: type both, track by_src, count 50, seconds 600;
   classtype:trojan-activity;
   sid:4000002; rev:1;)

alert dns $HOME_NET any -> any any (msg:"AI Agent: Bursty DNS to LLM endpoint";
   dns.query.name; content:"huggingface.co"; nocase;
   threshold: type both, track by_src, count 20, seconds 60;
   classtype:trojan-activity;
   sid:4000003; rev:1;)

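The Suricata rules use sid values in the 4000000-4999999 range, which the course pack sets aside for its custom rules. The range conventionally reserved for local rules is 1000000-1999999, so confirm that the 4000000 block does not collide with any vendor rule set you load. For cross-detection-tool correlation, each rule should also carry the UUID of its matching Sigma rule (for example in a metadata keyword); the three rules as printed above do not include one.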
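
A quick pre-deployment check is to confirm that every sid in the rules file is unique and falls inside the range chosen for custom rules. The sketch below is a hypothetical helper, not part of the course pack. It extracts sids with a regular expression rather than a full Suricata rule parser, and the sample rule text is abbreviated for illustration.

```python
import re

# Matches the sid keyword inside a rule body, e.g. "sid:4000001;".
SID_RE = re.compile(r"\bsid\s*:\s*(\d+)\s*;")


def check_sids(rules_text, low=4000000, high=4999999):
    """Report sids found, those outside [low, high], and any duplicates."""
    sids = [int(m) for m in SID_RE.findall(rules_text)]
    out_of_range = sorted({s for s in sids if not low <= s <= high})
    seen, dupes = set(), set()
    for s in sids:
        if s in seen:
            dupes.add(s)
        seen.add(s)
    return {"sids": sids, "out_of_range": out_of_range, "duplicates": sorted(dupes)}


# Abbreviated sample rules (options trimmed) for illustration only.
SAMPLE_RULES = """
alert tls $HOME_NET any -> $EXTERNAL_NET 443 (msg:"AI Agent: TLS SNI to LLM API endpoint"; tls.sni; content:"api.openai.com"; nocase; sid:4000001; rev:1;)
alert tls $HOME_NET any -> $EXTERNAL_NET 443 (msg:"AI Agent: Long-lived TLS to LLM API"; tls.sni; content:"api.anthropic.com"; nocase; sid:4000002; rev:1;)
"""

if __name__ == "__main__":
    print(check_sids(SAMPLE_RULES))
```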

Tuning thresholds
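
Before changing a `count` or `seconds` value, it helps to replay historical match timestamps and see how many alerts a given setting would have produced. The Python sketch below approximates the behaviour of `threshold: type both, track by_src`: at most one alert per source per window, raised when the match count in that window reaches `count`. It is a simplified model written for this module, not Suricata's implementation, and exact window-boundary handling may differ. The function name and sample traffic are hypothetical.

```python
def simulate_threshold_both(events, count, seconds):
    """Replay (timestamp_seconds, src_ip) pairs, sorted by time.

    Returns the (timestamp, src_ip) pairs at which an alert would fire
    under a 'type both' style threshold tracked per source.
    """
    state = {}   # src_ip -> [window_start, hits_in_window]
    alerts = []
    for ts, src in events:
        window = state.get(src)
        # Open a new window on first sight or once the old one has expired.
        if window is None or ts - window[0] > seconds:
            window = [ts, 0]
            state[src] = window
        window[1] += 1
        # Fire exactly once per window, at the moment the count is reached.
        if window[1] == count:
            alerts.append((ts, src))
    return alerts


# Invented traffic: one host matching every 5 s for two minutes,
# and one host with only three matches.
BUSY = [(t, "10.0.0.5") for t in range(0, 120, 5)]
QUIET = [(t, "10.0.0.9") for t in (3, 40, 90)]
SAMPLE_EVENTS = sorted(BUSY + QUIET)

if __name__ == "__main__":
    print(simulate_threshold_both(SAMPLE_EVENTS, count=10, seconds=60))
```

Running the same replay with several candidate settings over a week of shadow-mode data gives a rough alert-volume estimate for each setting before any rule is edited.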


MITRE ATLAS technique mapping

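Each rule maps to specific ATLAS techniques. ATLAS identifiers carry an AML. prefix (AML.Txxxx for techniques, AML.TAxxxx for tactics), whereas bare T-numbers are ATT&CK identifiers, so confirm each ID and name listed below against the current ATLAS matrix before copying it into a rule: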

| Rule | ATLAS techniques |
|---|---|
| LLM API HTTPS from non-developer process | T1620 (Inject LLM Behavior at Runtime), T1635 (AI Orchestrator Pattern), AML.TA0015 (Command and Control via AI) |
| Long-lived polling to LLM endpoints | T1635, AML.TA0015 |
| Agent-loop process tree | T1635, ATT&CK T1059 (Command and Scripting Interpreter) |

The detection engineer’s deliverable: map each adversary-agent rule to ATLAS techniques in the rule’s references and tags. This makes the rules legible to threat intelligence teams and traceable to the framework.


Vendor detection content to track

Detection engineering doesn’t happen in isolation. Vendors are publishing increasingly specific detection content for adversary-agent telemetry. Track:

The detection-engineering practice: pull each vendor’s relevant detection content, cross-reference with your own rule pack, and update quarterly.


False-positive management

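The Sigma + Suricata rules above will produce false positives, chiefly from approved AI clients that are missing from the local allow-list and from legitimate developer usage that the rules cannot tell apart from adversary activity.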

Mitigations:

  1. Allow-list maintenance: treat the allow-list as a living artifact; update on every new approved AI tool deployment
  2. User-group filtering: alert only when the source user is NOT in the developer-group AD/Okta group
  3. Time-of-day filtering: if your org has a clear work-hours pattern, prioritize alerts raised outside working hours, where legitimate usage (and therefore the FP rate) is lower
  4. Severity tiering: a Layer 1 hit on its own (TLS SNI to an LLM API endpoint) is medium-severity by default; escalate to high-severity when Layer 2 and Layer 3 telemetry corroborate it
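
The user-group, time-of-day, and severity-tiering mitigations can be combined in a small triage step that runs after the detection rules fire. The Python sketch below is illustrative only: the alert dictionary shape, the `DEVELOPER_GROUP` membership set (standing in for an AD/Okta group lookup), and the 08:00-18:00 working window are all assumptions, and weekends are not handled.

```python
from datetime import datetime, time
from typing import Optional

# Hypothetical stand-in for a lookup against the developer AD/Okta group.
DEVELOPER_GROUP = {"alice", "bob"}

# Hypothetical working hours; adjust to the org's observed pattern.
WORK_START, WORK_END = time(8, 0), time(18, 0)


def triage(alert: dict) -> Optional[dict]:
    """Return None to suppress the alert, else its severity and off-hours flag.

    alert keys (assumed): 'user' (str), 'layers' (iterable of 1/2/3),
    'time' (datetime of the detection).
    """
    # User-group filtering: only alert on users outside the developer group.
    if alert["user"].lower() in DEVELOPER_GROUP:
        return None
    # Severity tiering: high only when all three layers corroborate.
    layers = set(alert["layers"])
    severity = "high" if {1, 2, 3} <= layers else "medium"
    # Time-of-day: flag off-hours activity so analysts can prioritize it.
    off_hours = not (WORK_START <= alert["time"].time() < WORK_END)
    return {"severity": severity, "off_hours": off_hours}


# Invented example alert for illustration only.
EXAMPLE = triage({
    "user": "mallory",
    "layers": [1, 2, 3],
    "time": datetime(2026, 5, 14, 23, 30),
})

if __name__ == "__main__":
    print(EXAMPLE)
```

Suppressing every developer-group alert trades coverage for noise reduction: a compromised developer account would pass unnoticed, so some teams may prefer to downgrade those alerts rather than drop them.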

Discussion questions (~10 min)

  1. Your org’s developer-group has 200 engineers; 30 use AI coding assistants regularly; 170 do not. What’s the highest-leverage filter to add to the Sigma rule to focus alerts on the 170 users who shouldn’t be making LLM API calls?
  2. PROMPTSTEAL uses the Hugging Face inference API. Your rule catches it. What does the adversary do next to evade the rule? Walk through the cat-and-mouse and identify what residual signal you can still detect after their counter-move.
  3. The Sigma rule generates 500 alerts/day in shadow mode at your org. You need to be at <10 alerts/day before alerting goes live. Which tuning levers do you pull first, and in what order?

Common mistakes

| Mistake | Better approach |
|---|---|
| Deploying the rules without an allow-list | The allow-list is the FP control; without it, you generate too much noise to action |
| Blocking outbound to LLM APIs as the response action | Breaks legitimate developer workflows; treat the rule as detect-and-investigate, not block |
| Only deploying network signatures | Network-only misses on-prem LLMs; endpoint + behavioral layers cover the gap |
| Static allow-list that never updates | New AI tools appear monthly; allow-list maintenance is part of detection-engineering ownership |
| Ignoring user-group context | The rule is much more accurate when it knows which users are legitimately using AI tools |

What’s next

Module 4.3 covers the defender’s side of agentic systems — Anthropic’s Building Effective Agents patterns, LangGraph HITL primitives, the action-criticality matrix, and the working multi-agent SOC workflow that Codex generated for the course.