Module 5.4 — Scoring Rubric and Required Deliverables

Day 5 capstone · Section 4 of 6

The 1000-point scoring rubric

The full breakdown:

Category	Max Points	Cascading Logic
Detection	400	100 per stage caught; missing a stage caps the next stage at 50%
Containment	200	50 per stage; over-block penalty -25 per inappropriate block
Attribution	150	50 each for: deepfake vendor, agent framework, exfil bucket; -50 if AI agent attribution accepted without verification
Reporting	150	50 for exec summary, 50 for IOC list, 50 for CISO memo
AI SOC hygiene	100	100 if student caught their own agent’s wrong attribution in Phase 4; 0 if auto-trusted
TOTAL	1000	Pass bar: 700/1000 for GIAC capstone credit

Detection (400 pts)

Stage	Points if caught	Cap on next stage if missed
Stage 1 (Recon)	100	Stage 2 max = 50
Stage 2 (BEC)	100	Stage 3 max = 50
Stage 3 (NoraBot injection)	100	Stage 4 max = 50
Stage 4 (Mirror Twist exfil)	100	—

The cascading penalty mirrors real IR: if you didn’t see the recon, your understanding of the BEC is incomplete; if you didn’t catch the BEC’s PDF injection, you have weaker grounding for the NoraBot incident; and the same holds for each later stage.
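The cascade can be sketched in a few lines of Python. This is an illustrative reading of the table, not the grader's actual logic: it assumes a cap affects only the stage immediately after a miss, and the stage keys (`recon`, `bec`, `injection`, `exfil`) are names chosen here for the example.

```python
# Illustrative sketch only; stage keys and the "cap applies to the
# immediately following stage" reading are assumptions.
STAGES = ["recon", "bec", "injection", "exfil"]  # Stages 1-4, in order
FULL_POINTS = 100
CAPPED_POINTS = 50  # a missed stage caps the next stage at 50%


def detection_score(caught: dict[str, bool]) -> int:
    """Sum detection points, halving a stage's value when the prior stage was missed."""
    total = 0
    previous_caught = True  # Stage 1 has no predecessor, so it is never capped
    for stage in STAGES:
        this_caught = caught.get(stage, False)
        if this_caught:
            total += FULL_POINTS if previous_caught else CAPPED_POINTS
        previous_caught = this_caught
    return total


all_caught = {stage: True for stage in STAGES}
missed_recon = {**all_caught, "recon": False}
print(detection_score(all_caught))    # 400
print(detection_score(missed_recon))  # 250: 0 + 50 + 100 + 100
```

Under this reading, missing Stage 1 alone costs 150 detection points: the 100 for recon plus half of the BEC stage.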

Containment (200 pts)

50 pts per stage where containment was both timely (within the stage’s defined window) and correct (the right scope of action, not the maximum possible blocking).

The over-block penalty: -25 per inappropriate block. Examples:

Disabling Brenda’s entire team because Brenda received a deepfake call: over-block
Disabling NoraBot entirely, without a documented justification, when sandboxing would have sufficed: over-block
Cutting all S3 access during Phase 4 instead of cutting the specific adversarial credentials: over-block
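As a rough sketch of the arithmetic, the containment score is the per-stage credit minus the over-block penalties. The function name and parameters below are hypothetical, and the rubric does not say whether the category floors at zero, so this sketch does not clamp the result.

```python
# Illustrative sketch; names are hypothetical and no floor at zero is assumed.
POINTS_PER_STAGE = 50
OVER_BLOCK_PENALTY = 25
STAGE_COUNT = 4


def containment_score(stages_contained: int, over_blocks: int) -> int:
    """Credit each timely-and-correct stage, then subtract 25 per inappropriate block."""
    if not 0 <= stages_contained <= STAGE_COUNT:
        raise ValueError("stages_contained must be between 0 and 4")
    if over_blocks < 0:
        raise ValueError("over_blocks cannot be negative")
    return stages_contained * POINTS_PER_STAGE - over_blocks * OVER_BLOCK_PENALTY


# All four stages contained, but two inappropriate blocks along the way.
print(containment_score(stages_contained=4, over_blocks=2))  # 150
```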

Attribution (150 pts)

50 pts each for correctly identifying:

Deepfake vendor used by PROMETHEUS-7 (Phase 2 forensics — examining the audio file metadata + voice-clone fingerprint patterns reveals which commercial deepfake-as-a-service was used)
Agent framework used by PROMETHEUS-7 (Phases 1 + 3 + 4 evidence — multi-agent orchestration patterns visible in the recon DNS, NoraBot trace artifacts, and CloudTrail event patterns)
Exfil infrastructure (Phase 4 forensics — the lookalike S3 bucket + the receiver IP infrastructure)

The -50 penalty applies if AI-agent attribution is accepted without verification. This applies in two specific scenarios:

Phase 2: the triage agent claims “Voicemail is not a deepfake — detector confidence 0.61 below threshold.” Students who copy-paste this into reports lose 50.
Phase 4: the triage agent claims “Vendor sync attributed to vendor-acme — confidence 0.93.” Students who copy-paste this without verifying against raw CloudTrail lose 50.

The -50 stacks with the -30 phase-detection penalty if students fully accept the agent. Counting only the attribution and AI SOC hygiene categories as defined in the rubric table, full acceptance of the agent’s Phase 4 attribution costs 150 points: the 50-point attribution penalty plus the 100 AI SOC hygiene points forfeited.
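A minimal sketch of the attribution arithmetic follows. It assumes the -50 is applied once per scenario in which an agent claim was pasted in unverified (the two scenarios listed above); the item keys and function name are hypothetical.

```python
# Illustrative sketch; item keys are hypothetical, and the penalty is
# assumed to apply once per unverified agent claim (Phase 2 and/or Phase 4).
ATTRIBUTION_ITEMS = frozenset({"deepfake_vendor", "agent_framework", "exfil_infrastructure"})
POINTS_PER_ITEM = 50
UNVERIFIED_PENALTY = 50


def attribution_score(correct_items: set[str], unverified_agent_claims: int) -> int:
    """50 per correctly attributed item, minus 50 per agent claim accepted without verification."""
    earned = POINTS_PER_ITEM * len(correct_items & ATTRIBUTION_ITEMS)
    return earned - UNVERIFIED_PENALTY * unverified_agent_claims


# Two items correct, but the Phase 4 agent claim was copied without checking CloudTrail.
print(attribution_score({"deepfake_vendor", "agent_framework"}, unverified_agent_claims=1))  # 50
```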

Reporting (150 pts)

50 pts each for:

Executive summary (exec_summary.md)

Single page. Must include:

Business impact statement (one paragraph): what was at risk, what was lost, financial/operational scope
Timeline summary (one paragraph): T+0 to T+8h key events
Three concrete asks of the executive team

Scored on: business-language clarity (not jargon), actionability of asks, accuracy of impact statement.

IOC list (ioc_list.txt)

Must include:

Hashes (SHA-256) of every preserved artifact
Domains, IPs, lookalike bucket name
Voice-print signature ID for the deepfake
The specific prompt-injection payload from Phase 3
The AssumeRole pattern from Phase 4

Scored on completeness against the instructor’s master IOC list.

CISO memo (ciso_memo.md)

Two pages. Must propose exactly three concrete control changes. At least one of the three must address AI SOC trust calibration (otherwise -25 from CISO memo score).

Scored on: specificity of proposals, feasibility, alignment with what the exercise actually exposed.

AI SOC hygiene (100 pts)

The single binary signal: did the student catch their own agent’s wrong attribution in Phase 4?

Yes (verified against raw CloudTrail, documented in post-mortem): 100 pts
No (accepted “vendor-acme” attribution, copied into reports): 0 pts

This category exists separately from attribution scoring because the lesson is so central to the course thesis that it deserves its own metric.

The 6 required deliverables

Students must produce all six within the 45-minute reporting block (6:30-7:15). The lab platform’s submission directory ~/submissions/ will be scored after the deadline.

1. timeline.csv

Comma-separated values, one row per detected event. Fields:

timestamp (ISO 8601 UTC)
phase (1-4)
event_type (recon | bec | injection | exfil | containment | reporting)
source (which data source the event was detected in)
description (brief)
mitre_attck_or_atlas_technique (e.g., T1566.004, T1635)
evidence_path (path to the supporting artifact in the evidence locker)
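Before submitting, a student could sanity-check the file with a short script like the one below. It is a sketch under assumptions the module does not state: that the file carries a header row with exactly these field names in this order, and that a trailing `Z` is an acceptable way to mark UTC. The sample row's source and evidence path are made up for illustration.

```python
import csv
import io
from datetime import datetime, timedelta

# Assumed header: the field names listed above, in the same order.
TIMELINE_FIELDS = [
    "timestamp", "phase", "event_type", "source", "description",
    "mitre_attck_or_atlas_technique", "evidence_path",
]
EVENT_TYPES = {"recon", "bec", "injection", "exfil", "containment", "reporting"}


def validate_timeline(text: str) -> list[str]:
    """Return human-readable problems; an empty list means no problems were found."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != TIMELINE_FIELDS:
        return [f"unexpected header: {reader.fieldnames}"]
    problems = []
    for row in reader:
        line_no = reader.line_num
        raw_ts = (row["timestamp"] or "").replace("Z", "+00:00")
        try:
            offset = datetime.fromisoformat(raw_ts).utcoffset()
        except ValueError:
            problems.append(f"line {line_no}: timestamp is not ISO 8601")
        else:
            if offset != timedelta(0):  # naive timestamps have offset None
                problems.append(f"line {line_no}: timestamp is not UTC")
        if row["phase"] not in {"1", "2", "3", "4"}:
            problems.append(f"line {line_no}: phase must be 1-4")
        if row["event_type"] not in EVENT_TYPES:
            problems.append(f"line {line_no}: unknown event_type {row['event_type']!r}")
    return problems


sample = (
    ",".join(TIMELINE_FIELDS) + "\n"
    "2025-01-01T14:05:00Z,2,bec,mail-gateway,Voicemail lure received,"
    "T1566.004,evidence/phase2/voicemail.wav\n"
)
print(validate_timeline(sample))  # []
```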

2. ioc_list.txt

Plain text, one IOC per line. Format:

SHA-256: <hash> | <description>
DOMAIN: <name> | <role>
IP: <addr> | <role>
BUCKET: <name> | <role>
VOICE-PRINT: <id> | <vendor>
PROMPT-INJECTION-PAYLOAD: "<payload-text>" | <location>
ROLE-ARN-PATTERN: <pattern> | <indicator>
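The line format above lends itself to a simple check. The parser below is a sketch, not part of the course tooling: it assumes the literal separator is a space, a pipe, and a space, and it splits on the last such separator so that a prompt-injection payload containing a pipe stays intact. The domain in the usage line is a placeholder.

```python
import re

# The seven prefixes shown in the format above.
IOC_TYPES = {
    "SHA-256", "DOMAIN", "IP", "BUCKET",
    "VOICE-PRINT", "PROMPT-INJECTION-PAYLOAD", "ROLE-ARN-PATTERN",
}
# Greedy value group: the split happens at the LAST " | " on the line.
LINE_RE = re.compile(r"^(?P<type>[A-Z0-9-]+): (?P<value>.+) \| (?P<note>.+)$")
SHA256_RE = re.compile(r"^[0-9a-fA-F]{64}$")


def parse_ioc_line(line: str) -> dict[str, str]:
    """Parse one '<TYPE>: <value> | <note>' line, raising ValueError if malformed."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not in '<TYPE>: <value> | <note>' form: {line!r}")
    ioc = match.groupdict()
    if ioc["type"] not in IOC_TYPES:
        raise ValueError(f"unknown IOC type: {ioc['type']}")
    if ioc["type"] == "SHA-256" and not SHA256_RE.match(ioc["value"]):
        raise ValueError("SHA-256 value must be 64 hex characters")
    return ioc


print(parse_ioc_line("DOMAIN: example.com | placeholder lookalike domain"))
```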

3. exec_summary.md (1 page)

For Verdancy Health CEO Marcus Wei. Non-technical. Business impact framed.

4. ciso_memo.md (2 pages)

For Verdancy Health CISO Dr. Marcus Wei. Three concrete control changes; at least one must address AI SOC trust calibration.

5. ai_soc_postmortem.md

Short writeup of where the student’s own agent stack failed and how they would retrain or reground it. Must explicitly address Phase 4’s manipulation moment.

This deliverable is the single most important pedagogical artifact of the day. It’s where students articulate the lesson the exercise was designed to teach.

6. containment_log.csv

Every block, quarantine, or revoke action with justification. Fields:

timestamp (ISO 8601 UTC)
action_taken (string)
scope (specific user/host/service/credential)
justification (one-line rationale)
reversibility (reversible | not-reversible)

The reversibility field matters: actions that are not-reversible should have stronger justifications.
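One way to act on that is to pull out the not-reversible rows and reread their justifications before submitting. The helper below is a sketch that assumes the log has a header row using the field names above; the function name and the sample rows are hypothetical.

```python
import csv
import io

ALLOWED_REVERSIBILITY = {"reversible", "not-reversible"}


def irreversible_actions(text: str) -> list[dict[str, str]]:
    """Return the rows marked not-reversible so their justifications can be reviewed."""
    flagged = []
    for row in csv.DictReader(io.StringIO(text)):
        value = (row.get("reversibility") or "").strip()
        if value not in ALLOWED_REVERSIBILITY:
            raise ValueError(f"unexpected reversibility value: {value!r}")
        if value == "not-reversible":
            flagged.append(row)
    return flagged


# Made-up sample rows for illustration only.
sample = (
    "timestamp,action_taken,scope,justification,reversibility\n"
    "2025-01-01T15:00:00Z,revoke,credential:example-key,Key seen in exfil calls,not-reversible\n"
    "2025-01-01T15:05:00Z,quarantine,host:example-host,Suspicious process,reversible\n"
)
for row in irreversible_actions(sample):
    print(row["action_taken"], "|", row["justification"])
```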

The Codex-generated auto-grader

The capstone_grader.py tool at .boss-pattern-work/day5/capstone_grader.py provides illustrative automated scoring against the rubric. Usage:

python3 capstone_grader.py --student-dir ~/submissions/student_42 \
                          --output-report ~/grades/student_42.json

Output: structured JSON with per-category breakdown, total score, pass/fail flag, and improvement recommendations.

The auto-grader uses simple regex + keyword matching as a starting point. Instructors can extend with LLM-based grading (with appropriate ethics — Day 1 Module 1.4’s citation-enforced RAG pattern applies to grading too).
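To give a feel for what regex and keyword matching can look like, here is a hypothetical check for the CISO memo's trust-calibration requirement. It is not taken from capstone_grader.py; the pattern, the function name, and the choice of phrases are all assumptions made for illustration, and a keyword match is only a weak proxy for whether a proposal really addresses the topic.

```python
import re

# Hypothetical phrase list; a real grader would need a broader, reviewed set.
TRUST_CALIBRATION_RE = re.compile(
    r"\b(trust calibration|verif\w+ against raw|human[- ]in[- ]the[- ]loop)\b",
    re.IGNORECASE,
)
MISSING_TOPIC_PENALTY = -25  # from the CISO memo rule in the rubric


def ciso_memo_penalty(memo_text: str) -> int:
    """Return -25 if no trust-calibration phrase is found in the memo, else 0."""
    if TRUST_CALIBRATION_RE.search(memo_text):
        return 0
    return MISSING_TOPIC_PENALTY


print(ciso_memo_penalty("Control 3: quarterly AI SOC trust calibration review."))  # 0
print(ciso_memo_penalty("Control 3: rotate all service credentials."))            # -25
```

A check like this can only flag candidates for instructor review; it cannot judge specificity or feasibility.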

Scoring tiers

Score	Tier	Implication
900-1000	Coin tier (top 10%)	SANS-Coin recognition; instructor-presented at hot wash
800-899	Excellent	GIAC capstone credit + strong portfolio piece
700-799	Pass	GIAC capstone credit
600-699	Near miss	No GIAC credit; specific remediation plan from instructor
<600	Fail	Instructor-led debrief on what to study before retake
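The tier boundaries translate directly into a lookup. The sketch below mirrors the table; the function names are chosen for this example.

```python
def tier(score: int) -> str:
    """Map a total score to the tier name from the table above."""
    if score >= 900:
        return "Coin tier"
    if score >= 800:
        return "Excellent"
    if score >= 700:
        return "Pass"
    if score >= 600:
        return "Near miss"
    return "Fail"


def earns_giac_credit(score: int) -> bool:
    """The pass bar for GIAC capstone credit is 700/1000."""
    return score >= 700


for total in (950, 700, 680, 580):
    print(total, tier(total), earns_giac_credit(total))
```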

Common scoring outcomes

The pedagogical intent shows in the score distribution we’ve calibrated for:

Students who pass all 4 detection stages but trust their AI agent in Phase 4: typical score ~680. Just below pass bar. The course is designed so that AI trust failure is the most common cause of near-miss.
Students who miss Stage 1 entirely: typical score ~580. Below pass bar. Recon detection is the gateway lesson.
Students who detect all stages and verify against raw telemetry: typical score 850-950. Pass with strong margin.
Students who over-block in Containment: -50 to -100 from cumulative over-block penalties. Even good detection can drop them below pass bar.

The score distribution is intentional: ~60% of students should clear the 700 bar on the first attempt; the other 40% should walk away with specific learning gaps identified.

What’s next

Module 5.5 covers the instructor materials: the 10-point nudge cheat sheet for stuck students, the deliberately seeded teachable moments, and the hot-wash structure that turns the exercise into a memorable learning event.