Incident: Model Output Causing Harm

v0.1.0 | Last reviewed 2026-05-09 | ~120 min | operational incident | high

Purpose

Run a consistent, evidence-backed process for responding to an incident in which model output causes harm. This runbook is a template; tailor owners and tools to your environment. Sections marked for IR review must be validated by a practitioner before you treat the text as authoritative.

When to use

  • A model output has caused harm, or is suspected of having done so, and your governance tier requires a run record.
  • You need a shared checklist across security, platform, and product teams.
  • An audit or customer asks for proof of operational discipline.

Prerequisites

  • Incident or change channel identified (ticket system, war room, or comms tree).
  • Access to model cards, monitoring dashboards, and access logs as applicable.
  • Legal/comms on standby if external notification may be required.

Steps

1. Initiate and assign roles

Type: manual | Owner: Incident commander or release owner | SLA: 1 hour

Open the tracking record, set the severity, and assign owners for the technical, comms, and legal streams. Confirm that there is a single accountable decision maker.

[IR / editorial review] Confirm escalation titles match your org.
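The role check in this step can be sketched as a small record type. This is a minimal illustration, not a prescribed schema: the class name, field names, and the idea of modelling the record in Python are assumptions, and only the three stream names come from the step text.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Streams named in this step; adjust to your own organization.
REQUIRED_STREAMS = ("technical", "comms", "legal")


@dataclass
class IncidentRecord:
    """Hypothetical tracking record; field names are illustrative only."""
    title: str
    severity: str
    decision_maker: str  # the single accountable decision maker
    stream_owners: dict[str, str] = field(default_factory=dict)
    opened_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def missing_streams(self) -> list[str]:
        """Return the required streams that have no owner assigned yet."""
        return [s for s in REQUIRED_STREAMS if not self.stream_owners.get(s)]

    def is_ready(self) -> bool:
        """True once a decision maker is named and every stream has an owner."""
        return bool(self.decision_maker.strip()) and not self.missing_streams()
```

Holding the decision maker in a single string field, rather than a list, makes it structurally impossible to record two accountable people on the same record.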

2. Stabilize and preserve evidence

Type: manual | Owner: Security engineering | SLA: same day

Capture timestamps, configuration snapshots, and log excerpts. Avoid destructive changes until the evidence is secured. Document the scope of affected systems.
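One way to take a log excerpt without touching the original log is to copy out only the lines inside the incident window. The sketch below assumes each log line begins with a timezone-aware ISO 8601 timestamp followed by a space; that layout and the function name `excerpt_window` are assumptions, so adapt the parsing to your real log format.

```python
from datetime import datetime


def excerpt_window(lines, start, end):
    """Return the lines whose leading timestamp falls within [start, end].

    Assumes `start` and `end` are timezone-aware datetimes. Lines without a
    parseable, timezone-aware leading timestamp are skipped, not modified.
    """
    kept = []
    for line in lines:
        stamp, _, _ = line.partition(" ")
        try:
            ts = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # no parseable timestamp at the start of the line
        if ts.tzinfo is None:
            continue  # cannot be compared safely with aware bounds
        if start <= ts <= end:
            kept.append(line)
    return kept
```

The function only reads its input and returns a new list, which fits the instruction to avoid destructive changes. In a real capture you would probably also record how many lines were skipped, so that unparseable entries are not silently lost.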

3. Upload primary evidence bundle

Type: file_upload | Owner: Security engineering | SLA: same day

Attach the initial evidence package: dashboards, queries, redacted samples, and chain-of-custody notes suitable for later audit export.
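Chain-of-custody notes are easier to defend in an audit when each file in the bundle has a recorded hash. The following is a minimal sketch, assuming the bundle is a local directory; the function names and the manifest fields (`path`, `bytes`, `sha256`) are illustrative and not a required format.

```python
import hashlib
from pathlib import Path


def build_manifest(bundle_dir):
    """List every file under bundle_dir with its size and SHA-256 digest."""
    root = Path(bundle_dir)
    entries = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        entries.append({
            "path": path.relative_to(root).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
        })
    return entries


def verify_manifest(bundle_dir, manifest):
    """Return manifest paths that are now missing or whose hash has changed."""
    current = {e["path"]: e["sha256"] for e in build_manifest(bundle_dir)}
    return [e["path"] for e in manifest if current.get(e["path"]) != e["sha256"]]
```

Storing the manifest alongside the upload lets a later reviewer rerun `verify_manifest` and confirm that the exported bundle matches what was originally attached. `read_bytes` loads each file fully into memory, so very large evidence files would need chunked hashing instead.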

4. Structured impact assessment

Type: form | Owner: Product or risk owner | SLA: 1 business day

Record data classes touched, user impact, regulatory triggers, and whether third parties are implicated. Use your internal risk rubric.
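The form fields listed in this step can be sketched as a structured record with a completeness check. The class and field names below are assumptions derived from the step text, and the check does not replace your internal risk rubric; it only flags questions that have not been answered yet.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ImpactAssessment:
    """Hypothetical form model; field names mirror the step text."""
    data_classes: list[str] = field(default_factory=list)
    user_impact: str = ""
    # An empty list is allowed here and means "none identified".
    regulatory_triggers: list[str] = field(default_factory=list)
    # None means the question has not been answered yet.
    third_parties_implicated: Optional[bool] = None

    def unanswered(self) -> list[str]:
        """Return the names of the fields that still need an answer."""
        gaps = []
        if not self.data_classes:
            gaps.append("data_classes")
        if not self.user_impact.strip():
            gaps.append("user_impact")
        if self.third_parties_implicated is None:
            gaps.append("third_parties_implicated")
        return gaps
```

Using `None` for the third-party question distinguishes "not yet assessed" from an explicit "no", which matters when the record is reviewed later.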

5. Approver sign-off for next phase

Type: approval | Owner: Security operations leader | SLA: 1 business day

A named approver authorizes remediation actions that may affect production (for example, model rollback, block rules, or vendor engagement).
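The sign-off can be enforced in tooling as a gate that refuses to proceed unless a named approval exists for the specific action. This is a sketch under assumed names (`Approval`, `NotApproved`, `require_approval`, and the action strings); your approval system will have its own data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Approval:
    """Hypothetical approval record; frozen so it cannot be edited afterwards."""
    approver: str
    action: str
    approved_at: datetime


class NotApproved(Exception):
    """Raised when no named approver has authorized the requested action."""


def require_approval(action, approvals):
    """Return the matching approval, or raise NotApproved if there is none."""
    for approval in approvals:
        if approval.action == action and approval.approver.strip():
            return approval
    raise NotApproved(f"no named approver recorded for {action!r}")


if __name__ == "__main__":
    recorded = [Approval("secops-lead", "model_rollback",
                         datetime.now(timezone.utc))]
    print(require_approval("model_rollback", recorded).approver)
```

Matching on the exact action means an approval for a model rollback does not implicitly authorize a block rule or vendor engagement.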

6. Integrations and follow-up

Type: webhook | Owner: Platform engineering | SLA: same day

Trigger ticketing or paging webhooks according to your standard procedure. Ensure the post-incident review is scheduled and linked from the record.
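A webhook call for this step might be prepared as below, using only the Python standard library. The URL, the payload fields, and the event name are placeholders invented for illustration; real ticketing and paging tools define their own payload schemas and authentication, which you would substitute.

```python
import json
import urllib.request


def build_followup_request(webhook_url, incident_id, review_scheduled_for):
    """Build, but do not send, a JSON POST announcing the scheduled review."""
    payload = {
        "incident_id": incident_id,                    # hypothetical field
        "event": "post_incident_review_scheduled",     # hypothetical event name
        "review_scheduled_for": review_scheduled_for,  # ISO 8601 string
    }
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Building the request separately from sending it lets you log or review the exact payload first. To send it, pass the result to `urllib.request.urlopen(request, timeout=10)` and handle `urllib.error.URLError` according to your retry policy.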