Module 1.6 — Anti-Patterns to Avoid

50-minute lecture · Day 1 afternoon · Lab follows

Learning objectives

By end of this module, students can:

Name and argue against the six dominant anti-patterns in mid-2026 LLM-in-SOC deployments
Cite documented failures (Sygnia log-prompt-poisoning research, SOCpilot policy compliance findings) when objecting to bad architectural choices in their own org
Recognize the SANS 2025 SOC Survey findings on AI satisfaction (~40% adoption, satisfaction ranks last across surveyed tools, 42% deployed out-of-box without tuning, 69% still rely on manual reporting) as the empirical state of the industry
Defend the architectural patterns from Modules 1.2-1.5 when challenged by stakeholders pushing toward simpler-but-wrong alternatives

Why this module matters

The previous five modules were about doing it right. This module is about not doing it wrong. Both matter equally — a SOC that builds the right components but operates them with the wrong mental model will still produce incidents.

The empirical evidence motivates the discipline. The SANS 2025 SOC Survey found that across all surveyed SOC tools, AI/ML tools ranked dead last in analyst satisfaction. ~40% of SOCs have adopted them; 42% are running them out-of-the-box with zero customization; 69% of SOCs still rely on manual reporting because they don’t trust the AI output. (Source: SANS 2025 SOC Survey — sans.org/white-papers/sans-2025-soc-survey)

This isn’t an indictment of AI in the SOC. It’s an indictment of how it’s currently being deployed. Each of the six anti-patterns below contributes to that satisfaction gap.

Anti-pattern 1: “Block ChatGPT at the proxy and call it done”

This is the most common executive-level response to AI threat news. It is wrong in three ways:

It doesn’t stop adversaries. Phase 2 and Phase 3 adversaries (Module 1.1) don’t use corporate-proxy ChatGPT. PROMPTSTEAL calls Hugging Face. GTG-1002 used their own Anthropic API keys. SpamGPT runs on operator infrastructure. None of these touch your proxy.
It doesn’t protect internal copilots. Microsoft 365 Copilot, Google Duet, your internal RAG bot — all of these sit inside the trust boundary. EchoLeak (CVE-2025-32711, Day 3) was a zero-click exfil through M365 Copilot. The proxy block is irrelevant.
It creates shadow AI. Employees who can’t access AI tools at their desk will use personal devices. Their work product comes back into your network through email, Slack, etc. You’ve lost visibility, not gained it.

Better: Treat AI as a new identity and data plane. Log prompts/responses. Classify sensitive egress. Apply DLP on LLM I/O. Apply Day 3’s multi-agent guardrails to both employee-facing copilots and any externally-touching LLM surface.

Anti-pattern 2: “LLMs reduce false positives”

LLMs do not reduce FPs as an intrinsic property. They shift the error distribution to a different — and more attacker-influenceable — mode.

Pre-LLM triage: a noisy SIEM rule produces a known false-positive rate. The rate is stable, measurable, and the adversary can’t easily influence which alerts get suppressed.

Post-LLM triage: an LLM categorizes alerts. The FP rate may be lower against random noise but can be steered by adversaries who control alert content. The Sygnia “log prompt poisoning” research (sygnia.co/blog/log-prompt-poisoning-xdr-ai-risks/, August 2025) documented exactly this: attackers inserted bracketed instructions into PowerShell command-line arguments that overrode a Tier-1 triage agent’s system prompt and caused the agent to suppress a Mimikatz credential-theft alert as “scheduled kernel patch.”

Better: Measure your LLM triage against a held-out, adversary-controlled-content evaluation set. If accuracy degrades on the held-out set, the FP-reduction claim is false. Report adversary-influenced FP rate as a separate metric from natural FP rate.

Anti-pattern 3: “Our RAG demo enriches every alert beautifully”

The most common architecture failure in production RAG: the eval set and the index set are the same set. Every alert in the demo has a near-duplicate already in the index, so retrieval is trivially perfect. In production, novel alerts return weak retrievals; the LLM ignores the weak context and confabulates plausible-looking but invented ATT&CK technique IDs.

This is RAG failure mode 3 from Module 1.4 (hallucination despite retrieval). It is almost universal in first-deployment RAG systems.

Better:

Hold out a temporal slice (last 30 days of tickets) from your index. Never evaluate on indexed data.
Enforce a minimum retrieval-confidence floor. Below it, return “insufficient context, escalate to analyst” — not a confident wrong answer.
Validate every cited ATT&CK technique ID against the canonical STIX bundle post-generation. Reject responses citing IDs not present in the retrieval context.

Anti-pattern 4: “Our agent has confidence 0.92, so we auto-approved”

Never gate human-in-the-loop on the model’s self-reported confidence score. The model’s confidence is a function of its training distribution, not of the action’s blast radius. A model can be 0.95-confident about isolating the wrong host.

Action-criticality, not certainty, drives HITL design. Day 3 builds the formal version. The mental model:

Action class	Auto / HITL
Read-only enrichment, lookups, tagging	Auto
Ticket creation, user notification	Auto with audit
Email-to-employee about phishing	Auto with audit
Host isolation, credential reset, firewall rule change	HITL required
Cross-domain action (AD, Okta, cloud IAM)	Dual-control HITL

The SOCpilot research (arxiv 2605.05501) demonstrated that LLM-generated incident response plans can violate mandatory policies (e.g., restoring hosts before preserving evidence) when adversarial content is in the alert payload. A deterministic verifier successfully blocked 466 non-compliant actions across 200 real incidents — proving that human review or deterministic verification is needed regardless of model confidence.

Anti-pattern 5: “We have an audio detector, we’re deepfake-safe”

Single-detector deployments fail by design. The 0.7-confidence threshold problem is the canonical example: audio detector scores a deepfake at 0.61 (below alert threshold), the audio is fake anyway, and the org wires the money. Your audio detector is one signal in a layered defense, not a binary decision.

The durable control for deepfake-driven BEC is out-of-band verification as a workflow gate. A wire-transfer request with changed payment instructions requires confirmation through a second channel the attacker doesn’t control. The OOB verification’s absence in the workflow is itself a detection signal — your SIEM should flag “payment changed without OOB confirmation event in past 24h” as high-fidelity.

Day 2 covers this in depth. The principle applies broadly: the workflow gap is the detection, not the artifact analysis.

Anti-pattern 6: “We deployed an AI tool out-of-the-box”

The SANS 2025 SOC Survey finding that 42% of SOCs deploy AI tools without customization is the empirical face of this anti-pattern. Out-of-the-box AI tools are vendor demos. Your environment, your data schema, your alert taxonomy, your runbooks — none of these are in the vendor’s training data.

The work between “we bought the tool” and “the tool works” is:

Calibrate thresholds against your own data. The vendor’s defaults are tuned for an average customer that doesn’t exist.
Build your own evaluation set — 200-500 representative examples — and run it in CI against the tool. Regressions on this set are your single best signal that something changed.
Custom-prompt the system for your taxonomy and conventions. The default prompt produces answers in the vendor’s idiom; your analysts need answers in yours.
Integrate with your specific data sources. The vendor’s connector for Splunk-default-schema doesn’t understand your custom fields.
Train your analysts on the tool’s failure modes — and on the failure modes covered in this Day’s modules.

A SOC that does these five steps with a mid-tier tool will outperform a SOC that drops the best-in-class tool in unconfigured.

The red-flags self-check

When you (or a stakeholder) are about to make an architectural decision, run through these red flags. Any “yes” requires re-examination.

Are we deploying an LLM in production without an evaluation set?
Are we treating the LLM’s confidence score as our HITL decision boundary?
Are we evaluating retrieval quality only on data that’s already in the index?
Are we auto-actioning on the LLM’s output for any cross-domain or identity action?
Are we relying on a single detector for a high-impact threat class?
Are we using the vendor’s default configuration without calibration?
Are we blocking ChatGPT and considering that “AI threat mitigation”?
Are we shipping an LLM detection layer that hasn’t been adversarially tested?

The hardest of these to spot is the third (eval/index overlap). It hides in plain sight because the demo works beautifully. The remaining seven announce themselves at deployment review if anyone asks.

Discussion questions (~10 min)

Your VP of Security wants to procure an AI alert-triage product. The vendor demonstrates 99.2% accuracy in their demo. Walk through the six anti-patterns in this module and identify which is most likely hiding in that 99.2% figure.
The SANS 2025 SOC Survey shows AI tools ranked last in analyst satisfaction. Pick two of the six anti-patterns above and argue which is most likely responsible for the satisfaction gap, with reasoning.
The Sygnia log-prompt-poisoning research showed that attackers can plant instructions in PowerShell command-line arguments that override a triage agent’s behavior. Your org’s SOC uses GPT-5.4 for Tier-1 triage of EDR alerts. What defenses from Modules 1.2-1.5 close this attack class? Which are insufficient?

What’s next: Lab 1

The lab puts Day 1’s content together. Students work with a synthetic 5,000-email corpus containing four distinct AI-generated phishing campaigns. They:

Build the embedding-based deduplication and clustering pipeline (Module 1.3)
Implement hybrid retrieval over a MITRE ATT&CK + runbook corpus (Module 1.4)
Apply the five-signal AI-phishing detection stack (Module 1.5)
Encounter — and detect — a planted prompt-injection attempt in alert metadata (Module 1.6 anti-pattern test)
Write a Sigma rule that catches one campaign in production traffic

The lab handout is at the sample-day1-lab link. Allow 2.5 hours. The instructor walks through the planted prompt-injection moment at the 90-minute mark; senior students will discover it before that.

Closing the day

Day 1 has covered:

What changed when adversaries got LLMs (1.1) — six named state and criminal actors, three phases of evolution, MITRE ATLAS as framework
The deployment decision (1.2) — hybrid cloud + on-prem default, regulatory snapshot, four-axis matrix
Embeddings as the highest-ROI primitive (1.3) — current MTEB top picks, three security failure modes, the dedup + clustering + sensitivity-classification plays
RAG for detection engineering (1.4) — hybrid retrieval mandate, citation enforcement, RAGAS evaluation, four production failure modes
AI-generated phishing detection (1.5) — SpamGPT/KaliGPT as the 2025 commercial market, five-signal detection stack, MITRE ATT&CK T1566 mapping
Anti-patterns (1.6) — six wrong responses, empirical evidence from SANS 2025 survey + Sygnia + SOCpilot research

Students should leave Day 1 with: the detector’s AI stack assembled, the threat landscape’s first chapter (AI-generated phishing) understood, the failure modes named, and the lab artifacts in hand.

Days 2-5 build on this foundation. Day 2 takes the same detector toolkit and applies it to the next threat class: deepfake-driven BEC and synthetic identity. By Day 5, students defend a fictional org against an AI-orchestrated multi-stage attack in the Operation Hollow Mirror capstone.