Day 1 — The Detector’s AI Stack + AI-Generated Phishing

Course: SEC5xx, Detecting and Responding to AI-Generated Adversary Content
Day: 1 of 5 · ~6 hours instruction + 2.5-hour lab + breaks
Prerequisite: SEC450 or equivalent SOC experience, plus Python literacy

What Day 1 builds

By the end of Day 1, students leave with:

A working understanding of the AI-augmented adversary landscape as of mid-2026 — named actors, named tooling, named tradecraft phases
The defender’s AI stack assembled: deployment-decision framework, embedding-based detection plays, hybrid-retrieval RAG with citation enforcement
First-class detection capability against AI-generated phishing at scale
An honest read on the failure modes — both architectural and operational — that have caused 60% of LLM-in-SOC deployments to underdeliver

The six modules

Each module is ~50 minutes of lecture and closes with instructor-led discussion questions.

#	Module	Focus
1.1	What changed when adversaries got LLMs	Threat landscape: 6 named actor disclosures 2024–2025, MITRE ATLAS framework, three-phase evolution
1.2	The detector’s AI deployment decision	Cloud vs on-prem economics, regulatory snapshot, four-axis decision matrix, hybrid architecture pattern
1.3	Embeddings as the detector’s highest-ROI primitive	Current MTEB picks, three security failure modes, dedup + clustering + sensitivity-classification
1.4	RAG for detection engineering	Hybrid retrieval mandate, citation enforcement, RAGAS evaluation, four production failure modes
1.5	Detecting AI-generated phishing	SpamGPT/KaliGPT market, five-signal detection stack, MITRE T1566 sub-techniques, Sigma rule pattern
1.6	Anti-patterns to avoid	Six wrong responses, SANS 2025 SOC Survey data, Sygnia/SOCpilot evidence, red-flags self-check
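
To make Module 1.4's two headline ideas concrete, here is a minimal sketch of hybrid retrieval followed by a citation check. It assumes reciprocal rank fusion (RRF) as the way keyword and vector rankings are combined, which is one common choice and not necessarily the method taught in the module. The document IDs, the [doc-id] citation format, and both function names are hypothetical.

```python
import re
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list, best first.

    k=60 is a commonly used smoothing constant; treat it as a tunable assumption.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score; break ties on doc ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def check_citations(answer, retrieved_ids):
    """Return (has_no_citations, unknown_ids) for an answer that cites as [doc-id]."""
    cited = set(re.findall(r"\[([A-Za-z0-9_-]+)\]", answer))
    unknown = cited - set(retrieved_ids)
    return (len(cited) == 0, unknown)


# Hypothetical result lists from a keyword index and a vector index.
keyword_hits = ["kb-17", "kb-03", "kb-42"]
vector_hits = ["kb-03", "kb-99", "kb-17"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])

# A hypothetical model answer: kb-77 was never retrieved, so it gets flagged.
answer = "Block the sender domain [kb-03] and reset credentials [kb-77]."
no_citations, unknown = check_citations(answer, fused)
print(fused, no_citations, unknown)
```

An answer with no citations, or with citations to documents that were never retrieved, would be rejected or sent back for regeneration under this kind of enforcement.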

Lab 1

Sample Lab — “Triage with Two Brains” — 2.5 hours, browser-based on pre-provisioned EC2.

Students work with a synthetic Windows EDR alert and a 5,000-email corpus. They:

Run alert triage with a local Llama 3.1-8B and with Claude Sonnet 4.6, diff the outputs
Discover a planted indirect-prompt-injection in alert metadata (Module 1.6 anti-pattern test)
Implement embedding-based campaign clustering on the email corpus
Build a Sigma rule that catches one of the four planted phishing campaigns in production traffic
Write a 200-word deployment-recommendation memo (cloud vs on-prem) with documented rationale
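
The campaign-clustering step above can be sketched as follows. This is an illustration under stated assumptions, not the lab solution: embed() is a bag-of-words stand-in for a real embedding model, the greedy single-pass clustering and the 0.6 threshold are arbitrary choices, and the five emails are invented.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in embedding: lowercase bag-of-words counts (NOT a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(texts, threshold=0.6):
    """Greedy single pass: join the first cluster whose seed is similar enough."""
    clusters = []  # each entry is (seed_vector, [member indices])
    for i, text in enumerate(texts):
        vec = embed(text)
        for seed, members in clusters:
            if cosine(vec, seed) >= threshold:
                members.append(i)
                break
        else:
            # No existing cluster matched, so this email seeds a new one.
            clusters.append((vec, [i]))
    return [members for _, members in clusters]


# Invented sample emails: two lure templates plus one benign message.
emails = [
    "your invoice 4471 is overdue please review the attached invoice",
    "your invoice 9023 is overdue please review the attached invoice",
    "team lunch moved to friday at noon",
    "your mailbox is full verify your password now",
    "your mailbox is full verify your password today",
]
groups = cluster(emails)
print(groups)
```

With a real embedding model in place of embed(), the same loop groups paraphrased lures that share no exact wording, which is the property that matters for AI-generated campaigns.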

Key references for Day 1

Threat intelligence and disclosures (all verified May 2026):

Microsoft + OpenAI, Staying ahead of threat actors in the age of AI (Feb 14, 2024)
OpenAI, October 2024 Influence and Cyber Operations Update (SweetSpecter, CyberAv3ngers disclosures)
CrowdStrike, 2025 Global Threat Report (FAMOUS CHOLLIMA section)
Google Threat Intelligence Group, AI Threat Tracker (PROMPTSTEAL technical analysis)
Anthropic, Disrupting the first reported AI-orchestrated cyber espionage campaign (Nov 2025, GTG-1002)
Sygnia, When Your Logs Lie to You: Log Prompt Poisoning & Injection Risks in XDR AI Summaries (Aug 2025)

Research and frameworks:

MITRE ATLAS — adversarial-AI tactics taxonomy
RAGAS evaluation framework — faithfulness, answer relevance, context precision metrics
SANS 2025 SOC Survey — AI/ML adoption and satisfaction data
SOCpilot (arXiv:2605.05501) — verifying policy compliance for LLM-assisted incident response

Industry references:

MTEB Leaderboard (huggingface.co/spaces/mteb/leaderboard) — current embedding model rankings
FedRAMP Marketplace — current LLM service authorization status
Microsoft Security Copilot architecture documentation
Gemini in Google Security Operations (formerly Duet AI) documentation

What Days 2-5 build on this foundation

Day 2 — Deepfake BEC, vishing, synthetic identity. Same detector stack applied to audio/video adversary artifacts.
Day 3 — LLM-authored malware, prompt-injection campaigns against enterprise copilots (EchoLeak class). Guardrails as detection telemetry.
Day 4 — Agentic adversaries (GTG-1002 class), AI supply-chain compromise (LiteLLM/Mercor case). Adversary agent telemetry detection.
Day 5 — Capstone: 8-hour immersive IR against the Operation Hollow Mirror attack on Verdancy Health.

Each day reuses Day 1’s deployment-decision framework, embedding primitive, RAG architecture, and anti-pattern discipline. The detector stack you assemble Monday is the stack you defend with all week.