Day 4 — Agentic Adversaries + AI Supply-Chain Compromise

Course: SEC5xx - Detecting and Responding to AI-Generated Adversary Content
Day: 4 of 5 · ~6 hours of instruction + 2.5-hour lab + breaks
Prerequisite: Days 1–3 (Detector stack + Phishing + Deepfake BEC + Copilot prompt injection)

What Day 4 builds

Day 3 covered enterprise copilots as the inside attack surface: the EchoLeak class of zero-click exfiltration, the lethal trifecta, and the guardrails stack as SIEM telemetry. Day 4 applies the same architectural thinking to two of the most consequential 2025–2026 threat classes:

  1. Agentic adversaries — adversaries running their own LLM-orchestrated attack agents (GTG-1002 / Anthropic Nov 2025 was the first publicly documented case; PROMPTSTEAL / APT28 was the first operational deployment). The adversary’s agent telemetry becomes the detection signal.
  2. AI supply-chain compromise — malicious packages and models in the ML toolchain (LiteLLM/Mercor Mar 2026; JFrog Hugging Face Feb 2024), backdoored fine-tunes (Anthropic Sleeper Agents), poisoned RAG corpora. The detection signal is provenance — and where you don’t have provenance, behavioral monitoring is the fallback.

By end of Day 4, students leave with:

  1. A working multi-agent SOC workflow on LangGraph with explicit HITL gates and audit logging — the defender’s reference architecture for safe agentic SOCs
  2. A Sigma + Suricata rule pack for detecting adversary AI-agent telemetry (the GTG-1002 / PROMPTSTEAL pattern)
  3. A model SBOM generator for inventorying ML artifacts in their org and flagging supply-chain risk
  4. The action-criticality matrix that governs which agent actions can be auto-executed vs HITL-gated vs dual-control
  5. The honest read on backdoored fine-tunes — you cannot fully clear a third-party fine-tune; provenance + behavioral monitoring is the durable control
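
As a rough illustration of deliverable 4, the action-criticality matrix can be sketched as a lookup from agent action to approval tier. The three tiers (auto-execute, HITL-gated, dual-control) come from the list above; the action names, the tier assignments, and the `may_execute` helper are hypothetical placeholders, not the course's actual matrix.

```python
from enum import Enum


class Gate(Enum):
    AUTO = "auto-execute"
    HITL = "human-in-the-loop"
    DUAL = "dual-control"


# Hypothetical actions and tier assignments, for illustration only.
CRITICALITY_MATRIX = {
    "enrich_ioc": Gate.AUTO,
    "open_ticket": Gate.AUTO,
    "quarantine_email": Gate.HITL,
    "isolate_host": Gate.HITL,
    "disable_account": Gate.DUAL,
}

# Number of distinct human approvers each tier requires.
REQUIRED_APPROVERS = {Gate.AUTO: 0, Gate.HITL: 1, Gate.DUAL: 2}


def may_execute(action: str, approvers: set[str]) -> bool:
    """Return True if the action has enough distinct human approvals.

    Actions missing from the matrix fail closed to the strictest tier.
    """
    gate = CRITICALITY_MATRIX.get(action, Gate.DUAL)
    return len(approvers) >= REQUIRED_APPROVERS[gate]
```

Failing closed on unknown actions is a design choice in this sketch: an agent that invents a new tool call should hit dual-control by default rather than slip through as auto-executable.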

The six modules

| # | Module | Focus |
| --- | --- | --- |
| 4.1 | The agentic adversary | GTG-1002 deep dive, Anthropic disrupting-misuse reports, MITRE ATLAS agentic tactics |
| 4.2 | Detection signatures for adversary agents | Network + endpoint patterns, JA3/JA4 fingerprints, Sigma/Suricata rule pack |
| 4.3 | Hardening your own agents | Anthropic patterns, LangGraph HITL, action-criticality matrix, audit-log schema |
| 4.4 | Supply-chain compromise of ML artifacts | LiteLLM/Mercor case study, JFrog HF, model SBOM discipline |
| 4.5 | Backdoored fine-tunes and sleeper-agent models | Anthropic Sleeper Agents research, behavioral evals, the hard truth |
| 4.6 | Poisoned RAG corpora | Public-corpus and internal-corpus poisoning, canary tokens, instruction-stripping |

Lab 4

The lab is a red-vs-blue exercise in a controlled environment:

Key references for Day 4

Verified incident reports (cross-checked May 2026):

Frameworks and standards:

Tools introduced (working code in Modules 4.3, 4.2, 4.4):

How Day 4 changes the detector’s mental model

Day 3 introduced “the LLM-touching application is itself an attack surface.” Day 4 extends this in two directions:

Direction 1: Adversaries are running their own LLM agents. The detector’s adversary signal is now agent telemetry — outbound LLM API calls from non-developer processes, polling-shaped traffic patterns, structured-JSON HTTP bodies, agent-loop process trees. Detection moves to the network and process-context layer.
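
Two of the signals named above, outbound LLM API calls from non-developer processes and polling-shaped traffic, can be sketched as simple heuristics. This is a minimal sketch under stated assumptions: the event field names (`process`, `dest_host`), the process allowlist, the host list, and the thresholds are all hypothetical and would need tuning against real telemetry. It is not the course's rule pack.

```python
from statistics import mean, pstdev

# Illustrative, non-exhaustive list of hosted LLM API endpoints.
LLM_API_HOSTS = {
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
    "api-inference.huggingface.co",
}

# Hypothetical allowlist of processes expected to call LLM APIs.
EXPECTED_CLIENTS = {"code.exe", "pycharm64.exe"}


def is_unexpected_llm_egress(event: dict) -> bool:
    """Flag an LLM API connection made by a process outside the allowlist."""
    host = event.get("dest_host", "").lower()
    proc = event.get("process", "").lower()
    return host in LLM_API_HOSTS and proc not in EXPECTED_CLIENTS


def is_polling_shaped(
    timestamps: list[float], max_cv: float = 0.1, min_events: int = 5
) -> bool:
    """Flag near-constant inter-arrival gaps (low coefficient of variation)."""
    if len(timestamps) < min_events:
        return False
    ts = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    avg = mean(gaps)
    if avg <= 0:
        return False
    return pstdev(gaps) / avg <= max_cv
```

For example, `is_unexpected_llm_egress({"process": "winword.exe", "dest_host": "api.openai.com"})` returns `True`, while the same destination reached from `code.exe` does not. Neither heuristic is a verdict on its own; each is one input to correlate with process-tree context.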

Direction 2: The ML toolchain has become a supply-chain target. Every PyPI package, every Hugging Face model, and every dataset is a potential injection vector. The defender’s discipline shifts to SBOM-for-models, provenance pinning, and behavioral monitoring of model artifacts at load time. Detection becomes preventive: tag the artifacts you don’t trust, then watch what they do.
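
The "tag what you don't trust" step can be illustrated with a simplified artifact inventory: hash every model file, compare it against a pinned digest, and flag formats that may be pickle-serialized and can therefore execute code at load time. This is a sketch only. The `inventory` function, the flag names, and the suffix lists are assumptions made for illustration, and the output is a plain list of records rather than a standards-conformant SBOM document.

```python
import hashlib
from pathlib import Path

# Formats that are commonly pickle-serialized and can run code when loaded.
PICKLE_CAPABLE = {".pkl", ".pickle", ".pt", ".pth", ".bin", ".ckpt", ".joblib"}
MODEL_SUFFIXES = PICKLE_CAPABLE | {".safetensors", ".onnx", ".gguf"}


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large weights do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory(root: Path, pinned: dict[str, str]) -> list[dict]:
    """List model artifacts under root with provenance and format flags.

    `pinned` maps a relative POSIX path to its expected SHA-256 digest.
    """
    records = []
    for path in sorted(root.rglob("*")):
        suffix = path.suffix.lower()
        if not path.is_file() or suffix not in MODEL_SUFFIXES:
            continue
        rel = path.relative_to(root).as_posix()
        digest = sha256_of(path)
        flags = []
        if suffix in PICKLE_CAPABLE:
            flags.append("pickle-capable-format")
        if rel not in pinned:
            flags.append("no-pinned-hash")
        elif pinned[rel] != digest:
            flags.append("hash-mismatch")
        records.append({"path": rel, "sha256": digest, "flags": flags})
    return records
```

A record with an empty `flags` list has a pinned hash that matches and a non-pickle format. Anything else is a candidate for the behavioral monitoring described above.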

The architectural insight running through Day 4: the threat moves up the stack. Day 1’s adversary is at the SOC’s inbox. Day 4’s adversary is at the SOC’s toolchain. The detection engineer’s controls must operate at every layer simultaneously.

What Day 5 builds on this

Day 5 is the capstone: Operation Hollow Mirror. The Verdancy Health scenario chains together threats from all four prior days:

The defender’s stack from all four days is what survives the capstone. Day 4’s controls (agent telemetry detection, supply-chain hardening, action-criticality HITL gates) are specifically tested in Stage 4, where the adversary’s agent attempts to manipulate the defender’s own AI triage layer.