Module 4.2 — Detection Signatures for Adversary Agents
50-minute lecture · Day 4 morning · Hands-on Sigma + Suricata in the lab
Learning objectives
By the end of this module, students can:
- Identify the three layers of adversary-agent detection signal: network (TLS SNI, JA3/JA4, DNS), endpoint (Sysmon process and network events), and behavioral (cadence, structured-JSON, agent-loop process trees)
- Deploy a working Sigma rule pack for adversary-agent telemetry, with explicit allow-listing for legitimate developer tools
- Deploy a working Suricata rule pack for network-level detection of LLM API traffic from anomalous sources
- Tune the signatures against your environment’s legitimate AI-tool usage to keep the false-positive rate manageable
The three detection layers
A defender catching GTG-1002 / PROMPTSTEAL-class adversaries has three complementary layers to deploy. Each catches different attack stages and has a different false-positive profile.
Layer 1: Network signatures
The defender’s view of an LLM-querying adversary on the wire:
- TLS SNI: api.openai.com, api.anthropic.com, generativelanguage.googleapis.com, api.cohere.ai, api.groq.com, api-inference.huggingface.co, router.huggingface.co, inference.huggingface.co. Outbound TLS to any of these from non-developer endpoints is the first-pass signal.
- JA3/JA4 client fingerprints: Each common LLM client library has a characteristic TLS handshake fingerprint:
  - Python httpx (used by openai-python and anthropic-sdk, supporting HTTP/2) produces a recognizable JA4 fingerprint
  - Python requests (legacy LangChain integrations, custom scripts) produces a different JA4
  - These fingerprints persist even when the destination is masked behind a proxy
- DNS resolution patterns: High-frequency DNS resolution of LLM API endpoints (Type 65 HTTPS records, Type A/AAAA records) with sub-second polling cadence indicates agentic “think-act-observe” loops, not human-driven browser sessions
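The SNI first-pass check above can be sketched in Python. This is a minimal illustration, not part of the course rule pack: the event shape (dicts with `src_ip` and `sni` keys) and the `developer_sources` set are assumptions standing in for whatever your TLS log source and asset inventory actually provide. The host list is copied from the TLS SNI bullet.

```python
# Hostnames copied from the TLS SNI bullet in this module.
LLM_API_HOSTS = {
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
    "api.cohere.ai",
    "api.groq.com",
    "api-inference.huggingface.co",
    "router.huggingface.co",
    "inference.huggingface.co",
}


def is_llm_api_sni(sni):
    """True if the SNI equals a listed host or is a subdomain of one."""
    name = sni.strip().rstrip(".").lower()
    return any(name == host or name.endswith("." + host) for host in LLM_API_HOSTS)


def first_pass_hits(tls_events, developer_sources):
    """Keep events that reach an LLM API host from a non-developer source.

    tls_events: iterable of dicts with hypothetical keys 'src_ip' and 'sni'.
    developer_sources: set of source IPs known to belong to developer endpoints.
    """
    return [
        event
        for event in tls_events
        if is_llm_api_sni(event["sni"]) and event["src_ip"] not in developer_sources
    ]
```

Matching on a label boundary (an exact host, or a suffix preceded by a dot) avoids flagging an unrelated name such as notapi.openai.com, which a bare suffix match would accept.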
Layer 2: Endpoint signatures
The defender’s view from Sysmon and EDR telemetry:
- Sysmon Event ID 3 (Network Connection): outbound HTTPS to LLM endpoints from processes NOT in the allow-listed developer tool set
- Sysmon Event ID 1 (Process Creation): parent process is a script interpreter (python.exe, node.exe, pwsh.exe) and child process is a system command (cmd.exe, git.exe, curl.exe, net.exe); this is the “act” phase of an agent loop
- Temporal correlation: every 5-30 seconds, an Event ID 3 (outbound HTTPS to LLM API) is followed by an Event ID 1 (process spawn); this is the agent loop fingerprint
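The temporal correlation described above can be sketched as a small pairing routine. This is an illustrative sketch under stated assumptions: events are already filtered (Event ID 3 rows go to LLM endpoints, Event ID 1 rows have an interpreter parent) and normalized into dicts with the hypothetical keys `event_id`, `time`, `process_guid`, and `parent_process_guid`. The last two stand in for Sysmon's ProcessGuid and ParentProcessGuid fields. The `min_pairs` value is an arbitrary starting point, not a tested threshold.

```python
from datetime import datetime, timedelta


def agent_loop_pairs(events, window_seconds=30):
    """Pair each LLM connection (ID 3) with the next child spawn (ID 1) of the same process."""
    window = timedelta(seconds=window_seconds)
    pending = {}  # process GUID -> time of its latest unpaired LLM connection
    pairs = []
    for ev in sorted(events, key=lambda e: e["time"]):
        if ev["event_id"] == 3:
            pending[ev["process_guid"]] = ev["time"]
        elif ev["event_id"] == 1:
            started = pending.pop(ev["parent_process_guid"], None)
            if started is not None and ev["time"] - started <= window:
                pairs.append((ev["parent_process_guid"], started, ev["time"]))
    return pairs


def looks_like_agent_loop(events, min_pairs=3, window_seconds=30):
    """True if any single process shows at least min_pairs connect-then-spawn cycles."""
    per_process = {}
    for guid, _, _ in agent_loop_pairs(events, window_seconds):
        per_process[guid] = per_process.get(guid, 0) + 1
    return any(count >= min_pairs for count in per_process.values())


# Synthetic example: one interpreter doing three connect-then-spawn cycles.
t0 = datetime(2026, 1, 1, 12, 0, 0)
sample = []
for i in range(3):
    base = t0 + timedelta(seconds=10 * i)
    sample.append({"event_id": 3, "time": base, "process_guid": "{py-1}"})
    sample.append({"event_id": 1, "time": base + timedelta(seconds=2),
                   "parent_process_guid": "{py-1}"})
print(looks_like_agent_loop(sample))  # True
```

Keying the correlation on the process GUID rather than the host keeps unrelated process spawns on a busy machine from being paired with the interpreter's network activity.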
Layer 3: Behavioral signatures
The agent-loop pattern at the workload level:
- Polling cadence: sustained connections to LLM endpoints at regular intervals (every 2-30 seconds) — distinct from bursty human use (1-2 requests per minute clustered during a task)
- Structured-JSON HTTP bodies: request payloads contain agent-orchestration keys: "thinking", "plan", "tool_call", "tool_use", "observation", "action", "final_answer"
- Long-lived sessions from server workloads: A server (not a workstation) maintaining persistent HTTPS to an LLM API for hours is almost never legitimate, because your servers don’t have human users running copilots on them
The Codex-generated Sigma rule pack
The full pack is at .boss-pattern-work/day4/adversary_agent_sigma.yml. Three rules cover the three detection layers. Example excerpt:
title: LLM API HTTPS Connection From Non-Developer Process
id: 2fba710c-f1f2-4191-ab03-6674d3e7a1bc
status: experimental
description: |
  Detects outbound HTTPS connections to common LLM API endpoints from processes
  that are not in a local developer-tool allow-list. This pattern is intended to
  surface adversary-operated AI-agent telemetry and orchestration similar to the
  GTG-1002 use of Claude Code agents and MCP-connected tooling.
references:
  - https://www.anthropic.com/news/disrupting-the-first-reported-ai-orchestrated-cyber-espionage-campaign
  - https://atlas.mitre.org/
author: vExpertAI x SANS Course
date: 2026-05-14
logsource:
  product: windows
  service: sysmon
detection:
  selection_llm_api:
    EventID: 3
    DestinationPort: 443
    DestinationHostname|endswith:
      - api.openai.com
      - api.anthropic.com
      - generativelanguage.googleapis.com
      - api-inference.huggingface.co
      - router.huggingface.co
  filter_allowed_developer_process:
    Image|endswith:
      - '\Code.exe'
      - '\Cursor.exe'
      - '\Windsurf.exe'
      - '\claude.exe'
      - '\node.exe'
      - '\python.exe'
      - '\python3.exe'
      - '\git.exe'
      - '\curl.exe'
      - '\idea64.exe'
      - '\pycharm64.exe'
  condition: selection_llm_api and not filter_allowed_developer_process
fields:
  - UtcTime
  - Computer
  - User
  - Image
  - CommandLine
  - DestinationHostname
  - DestinationIp
falsepositives:
  - Approved enterprise AI clients not represented in the local allow-list
level: high
tags:
  - attack.command_and_control
  - attack.collection
The remaining Sigma rules cover:
- Long-lived polling traffic to LLM endpoints — sustained sessions with regular intervals
- Agent-loop process trees — interpreter-spawning-system-command sequences correlated with LLM API egress
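To sanity-check the allow-list logic of the excerpted rule before deploying it, the condition can be mirrored in a few lines of Python. This is a hand-written approximation for reasoning about the rule, not a Sigma engine. It assumes events arrive as dicts keyed by the Sysmon field names used in the rule, and it lower-cases values because Sigma string matching is case-insensitive by default.

```python
# Suffixes copied from the rule excerpt, lower-cased for case-insensitive matching.
LLM_HOST_SUFFIXES = (
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
    "api-inference.huggingface.co",
    "router.huggingface.co",
)
ALLOWED_IMAGE_SUFFIXES = (
    "\\code.exe", "\\cursor.exe", "\\windsurf.exe", "\\claude.exe",
    "\\node.exe", "\\python.exe", "\\python3.exe", "\\git.exe",
    "\\curl.exe", "\\idea64.exe", "\\pycharm64.exe",
)


def matches_rule(event):
    """Mirror 'selection_llm_api and not filter_allowed_developer_process'."""
    hostname = str(event.get("DestinationHostname", "")).lower()
    image = str(event.get("Image", "")).lower()
    selection = (
        event.get("EventID") == 3
        and event.get("DestinationPort") == 443
        and hostname.endswith(LLM_HOST_SUFFIXES)
    )
    allowed = image.endswith(ALLOWED_IMAGE_SUFFIXES)
    return selection and not allowed


suspicious = {
    "EventID": 3,
    "DestinationPort": 443,
    "DestinationHostname": "api.anthropic.com",
    "Image": "C:\\Windows\\System32\\rundll32.exe",
}
print(matches_rule(suspicious))  # True
```

Running the same check with Image set to a python.exe path returns False: because generic interpreters are in the filter, a Python-hosted agent passes this rule, which is why the polling and process-tree rules are needed alongside it.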
Tuning the allow-list
The most important tuning step is the developer-tool allow-list. Every org has a different set of legitimate AI-tool users. Initial allow-list candidates:
- VSCode (Code.exe): GitHub Copilot, Continue, Cody
- Cursor (Cursor.exe): built-in AI-assisted IDE
- Windsurf (Windsurf.exe): Codeium’s IDE
- JetBrains IDEs (idea64.exe, pycharm64.exe, webstorm64.exe)
- CLI tools (claude.exe, gemini, codex): the CLIs this course itself uses
- Generic interpreters (python.exe, node.exe): broad but necessary
Run the rule in shadow mode for 30 days before alerting. Measure false-positive volume; tune the allow-list against observed legitimate usage. After 30 days, the FP rate against your developer-process pattern should be near-zero.
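One way to work through the shadow-mode output is to rank the processes that triggered the rule and review the most common ones as allow-list candidates. The sketch below assumes the shadow-mode hits have been exported as dicts with `Image` and `User` keys. The `min_hits` and `min_users` cutoffs are hypothetical values to adjust, and a process appearing in the output is a prompt for human review, not an automatic approval.

```python
from collections import defaultdict


def allow_list_candidates(hits, min_hits=20, min_users=3):
    """Rank process images seen in shadow mode by hit volume and user spread.

    Returns (image, hit_count, distinct_users) tuples, highest volume first.
    """
    stats = defaultdict(lambda: {"hits": 0, "users": set()})
    for hit in hits:
        entry = stats[hit["Image"].lower()]
        entry["hits"] += 1
        entry["users"].add(hit["User"])
    rows = [
        (image, entry["hits"], len(entry["users"]))
        for image, entry in stats.items()
        if entry["hits"] >= min_hits and len(entry["users"]) >= min_users
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def daily_alert_rate(hits, days=30):
    """Average alerts per day over the shadow-mode period."""
    return len(hits) / days
```

Requiring several distinct users filters out one-off binaries: a tool used by many people is more likely to be a sanctioned client, while a high-volume process tied to a single account deserves investigation rather than allow-listing.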
The Codex-generated Suricata rule pack
For network-level detection without endpoint visibility, the Suricata pack at .boss-pattern-work/day4/adversary_agent_suricata.rules complements the Sigma rules. Three rules:
alert tls $HOME_NET any -> $EXTERNAL_NET 443 (msg:"AI Agent: TLS SNI to LLM API endpoint"; \
    tls.sni; content:"api.openai.com"; nocase; \
    threshold: type both, track by_src, count 10, seconds 60; \
    classtype:trojan-activity; \
    sid:4000001; rev:1;)
alert tls $HOME_NET any -> $EXTERNAL_NET 443 (msg:"AI Agent: Long-lived TLS to LLM API"; \
    tls.sni; content:"api.anthropic.com"; nocase; \
    flow:established,from_client; \
    threshold: type both, track by_src, count 50, seconds 600; \
    classtype:trojan-activity; \
    sid:4000002; rev:1;)
alert dns $HOME_NET any -> any any (msg:"AI Agent: Bursty DNS to LLM endpoint"; \
    dns.query.name; content:"huggingface.co"; nocase; \
    threshold: type both, track by_src, count 20, seconds 60; \
    classtype:trojan-activity; \
    sid:4000003; rev:1;)
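The Suricata rules use sid values in the 4000000-4999999 range, which this pack sets aside for custom rules; confirm the range does not collide with any rule sets you already load, since the common convention reserves 1000000-1999999 for local rules. The rules are meant to reference the Sigma UUIDs for cross-detection-tool correlation, for example through the reference or metadata keyword; the three rules as shown do not yet carry those references.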
Tuning thresholds
- count 10, seconds 60 on TLS SNI catches sustained agent traffic; tune lower for high-stakes networks, higher for noisy ones
- count 50, seconds 600 on the long-lived flow catches the polling pattern; tune to match your environment’s legitimate API-usage cadence
- DNS thresholds depend heavily on local resolver caching; measure your baseline first
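Measuring the baseline before choosing a count value can be sketched as follows. This is a rough approximation under stated assumptions: events are `(src_ip, epoch_seconds)` tuples pulled from your own TLS or DNS logs, buckets are fixed windows (not the rolling intervals Suricata tracks per source), and the 99th-percentile target is an arbitrary starting point.

```python
import math
from collections import Counter


def window_counts(events, window_seconds=60):
    """Count events per (source, fixed time window) bucket."""
    counts = Counter()
    for src_ip, ts in events:
        counts[(src_ip, int(ts // window_seconds))] += 1
    return counts


def suggested_count(events, window_seconds=60, percentile=0.99):
    """Suggest a threshold just above the chosen percentile of observed per-window counts."""
    values = sorted(window_counts(events, window_seconds).values())
    if not values:
        return None
    # Nearest-rank percentile: rank = ceil(p * N), converted to a 0-based index.
    index = min(len(values) - 1, max(0, math.ceil(percentile * len(values)) - 1))
    return values[index] + 1
```

Feeding this a week of known-legitimate traffic gives a count value that most normal sources stay below; compare it with the pack defaults (10 per 60 seconds, 50 per 600 seconds) before changing either.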
MITRE ATLAS technique mapping
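Each rule maps to specific ATLAS techniques. ATLAS identifiers carry an AML. prefix (AML.T for techniques, AML.TA for tactics), while unprefixed T-numbers are ATT&CK identifiers, so confirm every ID and name below against the current ATLAS and ATT&CK matrices before writing it into a rule: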
| Rule | ATLAS techniques |
|---|---|
| LLM API HTTPS from non-developer process | T1620 (Inject LLM Behavior at Runtime), T1635 (AI Orchestrator Pattern), AML.TA0015 (Command and Control via AI) |
| Long-lived polling to LLM endpoints | T1635, AML.TA0015 |
| Agent-loop process tree | T1635, ATT&CK T1059 (Command and Scripting Interpreter) |
The detection engineer’s deliverable: map each adversary-agent rule to ATLAS techniques in the rule’s references and tags. This makes the rules legible to threat intelligence teams and traceable to the framework.
Vendor detection content to track
Detection engineering doesn’t happen in isolation. Vendors are publishing increasingly specific detection content for adversary-agent telemetry. Track:
- Microsoft Entra — Entra Agent ID identity-first security for AI agents (treats agents as non-human identities subject to conditional access)
- Google Cloud Security Operations — Gemini in Security Operations for autonomous alert triage (the defender’s-side agent)
- Anthropic — Disrupting AI Misuse report series with operational detection patterns
- SentinelOne — Purple AI integration for in-line agentic auto-investigation, including LiteLLM Trojan Detection (the LiteLLM/Mercor case from Module 4.4)
- CrowdStrike — Non-Human Identity (NHI) Governance for GenAI-built malware detection
The detection-engineering practice: pull each vendor’s relevant detection content, cross-reference with your own rule pack, and update quarterly.
False-positive management
The Sigma + Suricata rules above will produce false positives from:
- Legitimate enterprise AI deployments (e.g., your org runs an internal copilot platform that wasn’t in the allow-list)
- CI/CD runners making LLM API calls for code review automation
- Developer machines running tools not in the allow-list
- Cloud-hosted dev environments (Codespaces, Gitpod) where the agent-like patterns are legitimate
Mitigations:
- Allow-list maintenance: treat the allow-list as a living artifact; update on every new approved AI tool deployment
- User-group filtering: alert only when the source user is NOT in the developer-group AD/Okta group
- Time-of-day filtering: if your org has a clear work-hours pattern, prioritize alerts that fire outside working hours, when legitimate developer use is rare and the FP rate is lower
- Severity tiering: a single-layer hit (for example, TLS SNI to an LLM endpoint from a non-developer source) is medium-severity by default; when Layer 2 and Layer 3 telemetry corroborate it, escalate to high-severity
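The user-group filter and severity tiering above can be combined in a small triage helper. This is a sketch under assumptions, not the course's implementation: alerts are dicts with hypothetical `user` and `layers` keys, the layer names are invented labels for the three layers in this module, and `min_layers_for_high` defaults to 3 to match the tiering described above (lower it to 2 if any corroborating layer should escalate).

```python
KNOWN_LAYERS = {"network", "endpoint", "behavioral"}


def tier_severity(layers_hit, min_layers_for_high=3):
    """Map the set of detection layers that fired to a severity label."""
    hits = set(layers_hit) & KNOWN_LAYERS
    if not hits:
        return None
    return "high" if len(hits) >= min_layers_for_high else "medium"


def triage(alert, developer_group, min_layers_for_high=3):
    """Suppress alerts from developer-group users; tier everything else.

    alert: dict with hypothetical keys 'user' and 'layers'.
    developer_group: set of usernames resolved from your AD/Okta group.
    """
    if alert["user"] in developer_group:
        return None
    return tier_severity(alert["layers"], min_layers_for_high)
```

Returning None for suppressed alerts keeps the decision explicit, so a caller can still log the event for later review instead of dropping it silently.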
Discussion questions (~10 min)
- Your org’s developer-group has 200 engineers; 30 use AI coding assistants regularly; 170 do not. What’s the highest-leverage filter to add to the Sigma rule to focus alerts on the 170 users who shouldn’t be making LLM API calls?
- PROMPTSTEAL uses Hugging Face inference API. Your rule catches it. What does the adversary do next to evade the rule? Walk through the cat-and-mouse and identify what residual signal you can still detect after their counter-move.
- The Sigma rule generates 500 alerts/day in shadow mode at your org. You need to be at <10 alerts/day before alerting goes live. Which tuning levers do you pull first, and in what order?
Common mistakes
| Mistake | Better approach |
|---|---|
| Deploying the rules without an allow-list | The allow-list is the FP control; without it, you generate too much noise to action |
| Blocking outbound to LLM APIs as the response action | Breaks legitimate developer workflows; treat the rule as detect-and-investigate, not block |
| Only deploying network signatures | Network-only misses on-prem LLMs; endpoint + behavioral layers cover the gap |
| Static allow-list that never updates | New AI tools appear monthly; allow-list maintenance is part of detection-engineering ownership |
| Ignoring user-group context | The rule is much more accurate when it knows which users are legitimately using AI tools |
What’s next
Module 4.3 covers the defender’s side of agentic systems — Anthropic’s Building Effective Agents patterns, LangGraph HITL primitives, the action-criticality matrix, and the working multi-agent SOC workflow that Codex generated for the course.