Module 4.2 — Detection Signatures for Adversary Agents

50-minute lecture · Day 4 morning · Hands-on Sigma + Suricata in the lab

Learning objectives

By the end of this module, students can:

Identify the three layers of adversary-agent detection signal: network (TLS SNI, JA3/JA4, DNS), endpoint (Sysmon process and network events), and behavioral (cadence, structured-JSON, agent-loop process trees)
Deploy a working Sigma rule pack for adversary-agent telemetry, with explicit allow-listing for legitimate developer tools
Deploy a working Suricata rule pack for network-level detection of LLM API traffic from anomalous sources
Tune the signatures against your environment’s legitimate AI-tool usage to keep the false-positive rate manageable

The three detection layers

A defender catching GTG-1002 / PROMPTSTEAL-class adversaries has three complementary layers to deploy. Each catches different attack stages and has a different false-positive profile.

Layer 1: Network signatures

The defender’s view of an LLM-querying adversary on the wire:

TLS SNI: api.openai.com, api.anthropic.com, generativelanguage.googleapis.com, api.cohere.ai, api.groq.com, api-inference.huggingface.co, router.huggingface.co, inference.huggingface.co. Outbound TLS to any of these from non-developer endpoints is the first-pass signal.
JA3/JA4 client fingerprints: Each common LLM client library has a characteristic TLS handshake fingerprint:
- Python httpx (the HTTP client underneath openai-python and the Anthropic Python SDK, with optional HTTP/2 support) produces a recognizable JA4 fingerprint
- Python requests (legacy LangChain integrations, custom scripts) produces a different JA4
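- These fingerprints persist even when the destination is masked behind a proxy, provided the sensor still sees the client’s own ClientHello (a TLS-terminating proxy in between replaces it with its own)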
DNS resolution patterns: Repeated DNS resolution of LLM API endpoints (Type 65 HTTPS records, Type A/AAAA records) at a regular, machine-paced cadence suggests an agentic “think-act-observe” loop rather than a human-driven browser session; resolver caching can hide part of this cadence, so treat it as a supporting signal
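
The first-pass SNI check described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the course rule pack: the event keys src_ip and sni and the developer_hosts set are hypothetical names, and the host list is simply the one given in the SNI bullet.

```python
# Hostnames taken from the SNI list above; extend for your environment.
LLM_API_HOSTS = (
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
    "api.cohere.ai",
    "api.groq.com",
    "api-inference.huggingface.co",
    "router.huggingface.co",
    "inference.huggingface.co",
)


def is_llm_api_sni(sni: str) -> bool:
    """True if the SNI equals a listed host or is a subdomain of one."""
    host = sni.strip().rstrip(".").lower()
    return any(host == h or host.endswith("." + h) for h in LLM_API_HOSTS)


def first_pass_hits(tls_events, developer_hosts):
    """Yield TLS events to an LLM API whose source is not a developer endpoint.

    tls_events: iterable of dicts with hypothetical keys 'src_ip' and 'sni'.
    developer_hosts: set of source IPs assumed to be developer endpoints.
    """
    for ev in tls_events:
        if is_llm_api_sni(ev.get("sni", "")) and ev.get("src_ip") not in developer_hosts:
            yield ev
```

Matching on an exact host or a dot-delimited subdomain avoids flagging look-alike names that merely end in the same characters.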

Layer 2: Endpoint signatures

The defender’s view from Sysmon and EDR telemetry:

Sysmon Event ID 3 (Network Connection): outbound HTTPS to LLM endpoints from processes NOT in the allow-listed developer tool set
Sysmon Event ID 1 (Process Creation): parent process is a script interpreter (python.exe, node.exe, pwsh.exe) and child process is a system command (cmd.exe, git.exe, curl.exe, net.exe) — the “act” phase of an agent loop
Temporal correlation: every 5-30 seconds, an Event ID 3 (outbound HTTPS to LLM API) is followed by an Event ID 1 (process spawn) — the agent loop fingerprint
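
The temporal correlation above can be prototyped offline before it is expressed in a SIEM query. The sketch below is one possible reading of it, under stated assumptions: events are dicts with hypothetical keys time, host, and event_id, the Event ID 3 records are already filtered to LLM API destinations, and the 30-second gap and three-pair minimum are illustrative defaults.

```python
from datetime import timedelta


def correlate_agent_loop(events, max_gap=timedelta(seconds=30), min_pairs=3):
    """Return hosts where an LLM-API connection is followed by a process spawn
    within max_gap at least min_pairs times.

    events: list of dicts with hypothetical keys 'time' (datetime), 'host',
            and 'event_id' (3 = network connection, 1 = process creation).
    """
    pairs = {}    # host -> number of (Event ID 3 -> Event ID 1) pairs seen
    pending = {}  # host -> time of the most recent unmatched Event ID 3
    for ev in sorted(events, key=lambda e: e["time"]):
        host = ev["host"]
        if ev["event_id"] == 3:
            pending[host] = ev["time"]
        elif ev["event_id"] == 1 and host in pending:
            # Consume the pending connection so each one pairs at most once.
            if ev["time"] - pending.pop(host) <= max_gap:
                pairs[host] = pairs.get(host, 0) + 1
    return {h for h, n in pairs.items() if n >= min_pairs}
```

Requiring several pairs per host, rather than one, is a deliberate choice here: a single connection followed by a spawn is common on any workstation, while a repeated sequence is closer to the loop fingerprint.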

Layer 3: Behavioral signatures

The agent-loop pattern at the workload level:

Polling cadence: sustained connections to LLM endpoints at regular intervals (every 2-30 seconds) — distinct from bursty human use (1-2 requests per minute clustered during a task)
Structured-JSON HTTP bodies (visible only where TLS is decrypted or the payload is captured on the endpoint): request payloads contain agent-orchestration keys: "thinking", "plan", "tool_call", "tool_use", "observation", "action", "final_answer"
Long-lived sessions from server workloads: A server (not a workstation) maintaining persistent HTTPS to an LLM API for hours is rarely legitimate outside known AI-integrated services and CI/CD runners; your servers don’t have human users running copilots on them
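
Two of the behavioral signals above, cadence regularity and agent-orchestration keys, lend themselves to a short Python sketch. The thresholds are assumptions chosen for illustration (mean gap between 2 and 30 seconds, coefficient of variation at most 0.25), and the body check assumes you already have decrypted request payloads.

```python
import json
from statistics import mean, pstdev

AGENT_KEYS = {"thinking", "plan", "tool_call", "tool_use",
              "observation", "action", "final_answer"}


def is_regular_cadence(timestamps, min_events=10, max_cv=0.25,
                       min_s=2.0, max_s=30.0):
    """Flag machine-like polling: mean gap within [min_s, max_s] seconds and
    low relative variation (coefficient of variation) between gaps."""
    if len(timestamps) < min_events:
        return False
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    avg = mean(gaps)
    if avg <= 0 or not (min_s <= avg <= max_s):
        return False
    return pstdev(gaps) / avg <= max_cv


def agent_keys_in_body(body: str):
    """Return the agent-orchestration keys found at any depth of a JSON body."""
    try:
        doc = json.loads(body)
    except (ValueError, TypeError):
        return set()
    found, stack = set(), [doc]
    while stack:
        node = stack.pop()
        if isinstance(node, dict):
            found.update(k for k in node if k in AGENT_KEYS)
            stack.extend(node.values())
        elif isinstance(node, list):
            stack.extend(node)
    return found
```

Bursty human use tends to fail the cadence test on variation even when its average gap falls inside the window, which is the distinction the polling-cadence bullet draws.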

The Codex-generated Sigma rule pack

The full pack is at .boss-pattern-work/day4/adversary_agent_sigma.yml. Three rules cover the three detection layers. Example excerpt:

title: LLM API HTTPS Connection From Non-Developer Process
id: 2fba710c-f1f2-4191-ab03-6674d3e7a1bc
status: experimental
description: |
  Detects outbound HTTPS connections to common LLM API endpoints from processes
  that are not in a local developer-tool allow-list. This pattern is intended to
  surface adversary-operated AI-agent telemetry and orchestration similar to the
  GTG-1002 use of Claude Code agents and MCP-connected tooling.
references:
  - https://www.anthropic.com/news/disrupting-the-first-reported-ai-orchestrated-cyber-espionage-campaign
  - https://atlas.mitre.org/
author: vExpertAI x SANS Course
date: 2026-05-14
logsource:
  product: windows
  service: sysmon
detection:
  selection_llm_api:
    EventID: 3
    DestinationPort: 443
    DestinationHostname|endswith:
      - api.openai.com
      - api.anthropic.com
      - generativelanguage.googleapis.com
      - api-inference.huggingface.co
      - router.huggingface.co
  filter_allowed_developer_process:
    Image|endswith:
      - '\Code.exe'
      - '\Cursor.exe'
      - '\Windsurf.exe'
      - '\claude.exe'
      - '\node.exe'
      - '\python.exe'
      - '\python3.exe'
      - '\git.exe'
      - '\curl.exe'
      - '\idea64.exe'
      - '\pycharm64.exe'
  condition: selection_llm_api and not filter_allowed_developer_process
fields:
  - UtcTime
  - Computer
  - User
  - Image
  - CommandLine
  - DestinationHostname
  - DestinationIp
falsepositives:
  - Approved enterprise AI clients not represented in the local allow-list
level: high
tags:
  - attack.command_and_control
  - attack.collection
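
Before deploying, it can help to replay sample events through the rule’s logic. The Python below is a hand-written emulation of the condition in the excerpt above, not a Sigma engine; the suffix lists are copied from the rule, and rule_matches is a hypothetical helper name.

```python
LLM_HOST_SUFFIXES = (
    "api.openai.com", "api.anthropic.com", "generativelanguage.googleapis.com",
    "api-inference.huggingface.co", "router.huggingface.co",
)
ALLOWED_IMAGE_SUFFIXES = (
    "\\Code.exe", "\\Cursor.exe", "\\Windsurf.exe", "\\claude.exe",
    "\\node.exe", "\\python.exe", "\\python3.exe", "\\git.exe",
    "\\curl.exe", "\\idea64.exe", "\\pycharm64.exe",
)


def _endswith_any(value, suffixes):
    # Sigma string matching is case-insensitive by default, so compare lowercased.
    v = str(value).lower()
    return any(v.endswith(s.lower()) for s in suffixes)


def rule_matches(event: dict) -> bool:
    """Emulate: selection_llm_api and not filter_allowed_developer_process."""
    selection = (
        event.get("EventID") == 3
        and event.get("DestinationPort") == 443
        and _endswith_any(event.get("DestinationHostname", ""), LLM_HOST_SUFFIXES)
    )
    allowed = _endswith_any(event.get("Image", ""), ALLOWED_IMAGE_SUFFIXES)
    return selection and not allowed
```

Replaying an event whose Image ends in \python.exe returns False, which makes the trade-off concrete: a script run under an allow-listed interpreter is exempt from this rule by image name alone.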

The remaining Sigma rules cover:

Long-lived polling traffic to LLM endpoints — sustained sessions with regular intervals
Agent-loop process trees — interpreter-spawning-system-command sequences correlated with LLM API egress

Tuning the allow-list

The most important tuning step is the developer-tool allow-list. Every org has a different set of legitimate AI-tool users. Initial allow-list candidates:

VSCode (Code.exe) — GitHub Copilot, Continue, Cody
Cursor (Cursor.exe) — built-in AI-assisted IDE
Windsurf (Windsurf.exe) — Codeium’s IDE
JetBrains IDEs (idea64.exe, pycharm64.exe, webstorm64.exe)
CLI tools (claude.exe, gemini, codex) — the CLIs this course itself uses
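Generic interpreters (python.exe, node.exe): broad but often necessary; because adversary agents run under these same interpreters, scope these entries by user group or host rather than by image name alone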

Run the rule in shadow mode for 30 days before alerting. Measure false-positive volume; tune the allow-list against observed legitimate usage. By the end of the 30 days, alerts attributable to known developer processes should be near zero; anything that remains is either a tool missing from the allow-list or worth investigating.
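
One way to work through shadow-mode output is to rank process images by their share of alerts and review the top entries as allow-list candidates. The sketch below assumes alerts exported as dicts carrying the Image field listed in the rule; allowlist_candidates and the 5% cut-off are illustrative choices.

```python
from collections import Counter


def allowlist_candidates(alerts, min_share=0.05):
    """Rank process images by share of shadow-mode alerts.

    alerts: iterable of dicts with an 'Image' key (one per alert).
    Returns (image, count, share) tuples for images at or above min_share,
    most frequent first. Images are lowercased so casing variants group together.
    """
    counts = Counter(a.get("Image", "<unknown>").lower() for a in alerts)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(img, n, n / total)
            for img, n in counts.most_common()
            if n / total >= min_share]
```

A high share is only a prompt for review: each candidate still needs a human to confirm it is an approved tool before it goes on the list.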

The Codex-generated Suricata rule pack

For network-level detection without endpoint visibility, the Suricata pack at .boss-pattern-work/day4/adversary_agent_suricata.rules complements the Sigma rules. Three rules:

alert tls $HOME_NET any -> $EXTERNAL_NET 443 (msg:"AI Agent: TLS SNI to LLM API endpoint";
   tls.sni; content:"api.openai.com"; nocase;
   threshold: type both, track by_src, count 10, seconds 60;
   classtype:trojan-activity;
   sid:4000001; rev:1;)

alert tls $HOME_NET any -> $EXTERNAL_NET 443 (msg:"AI Agent: Long-lived TLS to LLM API";
   tls.sni; content:"api.anthropic.com"; nocase;
   flow:established,from_client;
   threshold: type both, track by_src, count 50, seconds 600;
   classtype:trojan-activity;
   sid:4000002; rev:1;)

alert dns $HOME_NET any -> any any (msg:"AI Agent: Bursty DNS to LLM endpoint";
   dns.query.name; content:"huggingface.co"; nocase;
   threshold: type both, track by_src, count 20, seconds 60;
   classtype:trojan-activity;
   sid:4000003; rev:1;)

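The Suricata rules use sids in the 4000000-4999999 range, set aside here for custom rules; check that range against the SID allocations of any third-party rulesets you load (1000000-1999999 is the range conventionally reserved for local use). The rules are meant to reference the UUID of the matching Sigma rule for cross-detection-tool correlation; the excerpt above does not show that reference, so add one (for example in the metadata keyword) if your copy lacks it.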

Tuning thresholds

count 10, seconds 60 on TLS SNI catches sustained agent traffic; tune lower for high-stakes networks, higher for noisy ones
count 50, seconds 600 on the long-lived flow counts new TLS handshakes per source, so it catches repeated polling connections rather than one long-held session; tune to match your environment’s legitimate API-usage cadence
DNS thresholds depend heavily on local resolver caching — measure your baseline first
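
Measuring a baseline can be as simple as counting events per source in fixed windows and taking a high percentile. The sketch below assumes a list of (epoch_seconds, src_ip) tuples from traffic you already consider legitimate; baseline_count and the nearest-rank percentile method are illustrative choices, not part of the rule pack.

```python
from collections import Counter
from math import ceil


def baseline_count(events, window_s=60, percentile=0.99):
    """Return the per-source event count at the given percentile across
    fixed, non-overlapping windows of window_s seconds.

    events: iterable of (epoch_seconds, src_ip) tuples (hypothetical input,
            for example exported from flow or DNS logs).
    """
    buckets = Counter((src, int(ts // window_s)) for ts, src in events)
    counts = sorted(buckets.values())
    if not counts:
        return 0
    # Nearest-rank percentile: the smallest observed count with at least
    # `percentile` of all (source, window) samples at or below it.
    rank = max(1, ceil(percentile * len(counts)))
    return counts[rank - 1]
```

Setting the Suricata count somewhat above this value, with seconds equal to window_s, keeps ordinary legitimate bursts under the threshold.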

MITRE ATLAS technique mapping

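Each rule maps to specific ATLAS techniques. ATLAS technique IDs take the form AML.Txxxx and tactic IDs AML.TAxxxx, so confirm the bare T1620 and T1635 labels below against atlas.mitre.org before they go into a rule (in ATT&CK, T1620 is Reflective Code Loading):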

Rule	ATLAS techniques
LLM API HTTPS from non-developer process	T1620 (Inject LLM Behavior at Runtime), T1635 (AI Orchestrator Pattern), AML.TA0015 (Command and Control via AI)
Long-lived polling to LLM endpoints	T1635, AML.TA0015
Agent-loop process tree	T1635, ATT&CK T1059 (Command and Scripting Interpreter)

The detection engineer’s deliverable: map each adversary-agent rule to ATLAS techniques in the rule’s references and tags. This makes the rules legible to threat intelligence teams and traceable to the framework.

Vendor detection content to track

Detection engineering doesn’t happen in isolation. Vendors are publishing increasingly specific detection content for adversary-agent telemetry. Track:

Microsoft Entra — Entra Agent ID identity-first security for AI agents (treats agents as non-human identities subject to conditional access)
Google Cloud Security Operations — Gemini in Security Operations for autonomous alert triage (the defender’s-side agent)
Anthropic — Disrupting AI Misuse report series with operational detection patterns
SentinelOne — Purple AI integration for in-line agentic auto-investigation, including LiteLLM Trojan Detection (the LiteLLM/Mercor case from Module 4.4)
CrowdStrike — Non-Human Identity (NHI) Governance for GenAI-built malware detection

The detection-engineering practice: pull each vendor’s relevant detection content, cross-reference with your own rule pack, and update quarterly.

False-positive management

The Sigma + Suricata rules above will produce false positives from:

Legitimate enterprise AI deployments (e.g., your org runs an internal copilot platform that wasn’t in the allow-list)
CI/CD runners making LLM API calls for code review automation
Developer machines running tools not in the allow-list
Cloud-hosted dev environments (Codespaces, Gitpod) where the agent-like patterns are legitimate

Mitigations:

Allow-list maintenance: treat the allow-list as a living artifact; update on every new approved AI tool deployment
User-group filtering: alert only when the source user is NOT in the developer-group AD/Okta group
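Time-of-day filtering: if your org has a clear work-hours pattern, prioritize alerts that fire outside working hours, when legitimate developer use is rare and the FP rate is lower
Severity tiering: a single-layer hit (for example, Layer 1 TLS SNI from a non-developer source) is medium-severity by default; escalate to high-severity when Layer 2 and Layer 3 telemetry corroborate it. The Sigma excerpt above is set to level: high, so lower it to medium if you adopt this tiering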
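
The user-group filter and severity tiering above combine naturally into one triage step. The sketch below is a minimal illustration under stated assumptions: the alert keys user and layers, the layer names, and the triage function itself are hypothetical, and developer_users stands in for membership of the developer AD/Okta group.

```python
from typing import Optional

ALL_LAYERS = {"network", "endpoint", "behavioral"}


def triage(alert: dict, developer_users: set) -> Optional[str]:
    """Return 'high', 'medium', or None (suppressed) for a correlated alert.

    alert: dict with hypothetical keys 'user' and 'layers', where 'layers'
           is the set of detection layers that fired for this source.
    developer_users: usernames belonging to the developer group.
    """
    if alert["user"] in developer_users:
        return None       # user-group filtering: expected AI-tool user
    if ALL_LAYERS <= set(alert["layers"]):
        return "high"     # all three layers corroborate each other
    return "medium"       # single- or two-layer hit stays at the default tier
```

Suppressing on group membership trades some coverage for volume: a compromised developer account would pass this filter, so the suppressed events are still worth retaining for hunting.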

Discussion questions (~10 min)

Your org’s developer-group has 200 engineers; 30 use AI coding assistants regularly; 170 do not. What’s the highest-leverage filter to add to the Sigma rule to focus alerts on the 170 users who shouldn’t be making LLM API calls?
PROMPTSTEAL uses Hugging Face inference API. Your rule catches it. What does the adversary do next to evade the rule? Walk through the cat-and-mouse and identify what residual signal you can still detect after their counter-move.
The Sigma rule generates 500 alerts/day in shadow mode at your org. You need to be at <10 alerts/day before alerting goes live. Which tuning levers do you pull first, and in what order?

Common mistakes

Mistake	Better approach
Deploying the rules without an allow-list	The allow-list is the FP control; without it, you generate too much noise to action
Blocking outbound to LLM APIs as the response action	Breaks legitimate developer workflows; treat the rule as detect-and-investigate, not block
Only deploying network signatures	Network-only misses on-prem LLMs; endpoint + behavioral layers cover the gap
Static allow-list that never updates	New AI tools appear monthly; allow-list maintenance is part of detection-engineering ownership
Ignoring user-group context	The rule is much more accurate when it knows which users are legitimately using AI tools

What’s next

Module 4.3 covers the defender’s side of agentic systems — Anthropic’s Building Effective Agents patterns, LangGraph HITL primitives, the action-criticality matrix, and the working multi-agent SOC workflow that Codex generated for the course.