Incident Response Checklist

Steps the on-call incident commander and IR team run when a security or availability incident is declared, from identification through containment to postmortem. Aligned to NIST SP 800-61 phases.

6 sections, 23 steps
1

Detection and Triage

  1. Open the incident in PagerDuty
    • The on-call engineer creates the incident in PagerDuty (or Opsgenie / FireHydrant), assigns an incident commander, and opens a dedicated Slack channel using the #inc-YYYYMMDD-name convention. Do not investigate in the alerting channel — noise drowns the timeline.
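
A minimal sketch of the #inc-YYYYMMDD-name convention, assuming the name part should be lowercase and hyphen-separated. The function name `incident_channel_name` is hypothetical and not part of any Slack or PagerDuty API.

```python
import re
from datetime import date


def incident_channel_name(declared_on: date, short_name: str) -> str:
    """Build a channel name following the #inc-YYYYMMDD-name convention."""
    # Lowercase the name and collapse every run of other characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", short_name.lower()).strip("-")
    if not slug:
        raise ValueError("short_name must contain at least one letter or digit")
    return f"#inc-{declared_on:%Y%m%d}-{slug}"


print(incident_channel_name(date(2024, 3, 7), "API Gateway 5xx"))
# prints "#inc-20240307-api-gateway-5xx"
```

Generating the name from the declaration date keeps channels sortable and avoids ad hoc naming during the first minutes of an incident.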

  2. Classify severity (P1-P4)
    • Use the severity matrix: P1 = customer-impacting outage or confirmed data exposure; P2 = major degradation or suspected breach; P3 = limited impact with workaround; P4 = no customer impact. Severity drives paging, comms cadence, and exec notification — err high; you can downgrade later.
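
The severity matrix above can be sketched as a small decision function. The field names on `IncidentFacts` are hypothetical, and treating limited customer impact without a workaround as P2 is an assumption that follows the "err high" guidance rather than something the matrix states.

```python
from dataclasses import dataclass


@dataclass
class IncidentFacts:
    # Hypothetical triage inputs; adapt them to your own intake form.
    customer_outage: bool = False
    confirmed_data_exposure: bool = False
    major_degradation: bool = False
    suspected_breach: bool = False
    customer_impact: bool = False
    workaround_available: bool = False


def classify_severity(facts: IncidentFacts) -> str:
    """Map triage facts to P1-P4, checking the most severe conditions first."""
    if facts.customer_outage or facts.confirmed_data_exposure:
        return "P1"
    if facts.major_degradation or facts.suspected_breach:
        return "P2"
    if facts.customer_impact:
        # P3 requires a workaround; without one, err high (assumption).
        return "P3" if facts.workaround_available else "P2"
    return "P4"


print(classify_severity(IncidentFacts(suspected_breach=True)))  # prints "P2"
```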

  3. Validate the alert is not a false positive
    • Cross-check the originating signal against a second source: EDR alert against SIEM logs, monitoring against synthetic checks, user report against access logs. Poorly tuned detection rules and stale dashboards are common sources of false positives.

  4. Determine if this is a security incident
    • Security incidents (unauthorized access, malware, data exposure) trigger additional legal, regulatory, and forensic obligations — chain of custody, breach-notification clocks, possible law-enforcement coordination. Operational incidents (outage, capacity, deploy regression) do not.

2

Investigation and Scoping

  1. Assign incident commander, scribe, and comms lead
    • The IC drives decisions and is not hands-on-keyboard. The scribe maintains the running timeline in Slack with timestamps. The comms lead handles status-page updates and stakeholder notifications. On a P1, all three roles must be filled by separate people.

  2. Preserve volatile evidence before remediation
    • Snapshot affected EC2 / Azure VMs before terminating. Capture memory dumps from EDR (CrowdStrike RTR, SentinelOne) where supported. Export relevant SIEM queries with time ranges locked. Once you reboot or wipe, volatile evidence is gone — and so is the chain of custody for any later legal action.
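
One way to support chain of custody for exported artifacts is to record a cryptographic hash at collection time. The sketch below is an assumption about how such a record might look, not a forensic standard; `custody_record` and its field names are hypothetical.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def custody_record(path: Path, collected_by: str, source_host: str) -> dict:
    """Hash an evidence file and return a custody entry for the incident log."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in 1 MiB chunks so large memory dumps do not need to fit in RAM.
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return {
        "file": path.name,
        "size_bytes": path.stat().st_size,
        "sha256": digest.hexdigest(),
        "collected_by": collected_by,
        "source_host": source_host,
        "recorded_at_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    }
```

Re-hashing the file later and comparing against the recorded `sha256` shows whether the artifact changed after collection.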

  3. Pull SIEM and EDR logs for the incident window
    • Query Splunk / Datadog / Sumo for authentication events, network flows, and process executions across the suspected window. Pull EDR detections, IdP sign-in logs (Okta / Entra ID), and CloudTrail / Azure Activity Log entries. Default cloud retention is often short (CloudTrail event history keeps 90 days) and can fall below what PCI DSS or your SOC 2 commitments require, so pull the logs now and archive them to cold storage.

  4. Determine the scope of impacted systems and data
    • Enumerate every host, identity, SaaS tenant, and data store touched by the incident. For credential compromise, check sign-in logs across every IdP-connected app for the affected accounts. Scope drives breach-notification thresholds — undercounting here causes regulatory exposure later.
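
For the credential-compromise case, the sign-in check can be sketched as a grouping over exported IdP events. The record shape (dicts with `user` and `app` keys) is a hypothetical export format, not the schema of Okta or Entra ID.

```python
from collections import defaultdict


def apps_touched(sign_ins, affected_accounts):
    """Return {account: sorted list of apps} for the affected accounts only."""
    affected = {account.lower() for account in affected_accounts}
    touched = defaultdict(set)
    for event in sign_ins:
        user = event["user"].lower()  # normalize case so aliases match
        if user in affected:
            touched[user].add(event["app"])
    return {user: sorted(apps) for user, apps in touched.items()}


events = [
    {"user": "Dana@example.com", "app": "Salesforce"},
    {"user": "dana@example.com", "app": "GitHub"},
    {"user": "lee@example.com", "app": "GitHub"},
]
print(apps_touched(events, ["dana@example.com"]))
# prints {'dana@example.com': ['GitHub', 'Salesforce']}
```

Every app in the result is a candidate for the scope list until its own logs rule it out.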

  5. Build the incident timeline
    • Cover the period from the first malicious action through detection to the current state. Use UTC throughout to avoid timezone confusion across the team. The scribe maintains this in real time; the IC validates it during handoff between shifts.
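
A minimal sketch of normalizing timeline entries to UTC, assuming each entry arrives as an ISO 8601 timestamp with an explicit numeric offset plus a free-text note. The function names are hypothetical.

```python
from datetime import datetime, timezone


def to_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp and convert it to UTC."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # A naive timestamp is ambiguous across the team, so reject it.
        raise ValueError(f"timestamp has no UTC offset: {timestamp}")
    return parsed.astimezone(timezone.utc)


def build_timeline(entries):
    """Return timeline lines sorted chronologically, all rendered in UTC."""
    normalized = sorted((to_utc(ts), note) for ts, note in entries)
    return [f"{dt:%Y-%m-%d %H:%M:%S}Z  {note}" for dt, note in normalized]


for line in build_timeline([
    ("2024-03-07T09:15:00-05:00", "EDR alert fired"),
    ("2024-03-07T15:02:00+01:00", "First suspicious sign-in"),
]):
    print(line)
# prints "2024-03-07 14:02:00Z  First suspicious sign-in"
# then   "2024-03-07 14:15:00Z  EDR alert fired"
```

Rejecting timestamps without an offset forces whoever adds the entry to state the timezone instead of leaving the scribe to guess.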

3

Containment

  1. Isolate affected hosts via EDR network containment
    • Use CrowdStrike network containment, SentinelOne disconnect, or Defender for Endpoint isolation rather than pulling cables: the agent stays online for forensics while lateral movement is blocked. For cloud workloads, move the instance to an isolation security group that allows traffic only to and from your forensic jump host; AWS security groups have no deny rules, so isolation means removing the allow rules.

  2. Revoke compromised credentials and active sessions
    • In the IdP, force sign-out and reset for every implicated account. Rotate API tokens, OAuth grants, and SSH keys associated with the identity. For service accounts, rotate the secret in Vault / Secrets Manager and redeploy. SMS / email MFA is bypassable via SIM swap and phishing, so move affected users to FIDO2 / passkeys before re-enabling access.

  3. Block IOCs at the firewall and DNS layer
    • Push known-bad IPs, domains, and file hashes to the NGFW (Palo Alto, Fortinet), DNS filter (Umbrella, NextDNS), and EDR custom-IOC list. Before declaring the IOC list complete, cross-reference the IOCs against recent threat-intel feeds and check the exploited vulnerability against the CISA KEV catalog.
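
Before pushing IOCs to three different tools, it helps to deduplicate them and split them by type. The sketch below is one possible approach using only the standard library; the domain pattern is deliberately simple and the function names are hypothetical.

```python
import ipaddress
import re

# Hex digest lengths for the hash types most often seen in IOC lists.
HASH_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256"}


def classify_ioc(value: str) -> str:
    """Return 'ip', 'md5', 'sha1', 'sha256', 'domain', or 'unknown'."""
    v = value.strip().lower()
    try:
        ipaddress.ip_address(v)  # accepts both IPv4 and IPv6
        return "ip"
    except ValueError:
        pass
    if re.fullmatch(r"[0-9a-f]+", v) and len(v) in HASH_LENGTHS:
        return HASH_LENGTHS[len(v)]
    if re.fullmatch(r"(?:[a-z0-9-]+\.)+[a-z]{2,}", v):
        return "domain"
    return "unknown"


def build_blocklists(raw_iocs):
    """Group normalized, deduplicated IOCs by type."""
    lists = {}
    for raw in raw_iocs:
        v = raw.strip().lower()
        lists.setdefault(classify_ioc(v), set()).add(v)
    return {kind: sorted(values) for kind, values in lists.items()}
```

IPs and domains would then go to the firewall and DNS filter, hashes to the EDR custom-IOC list, and anything classified as `unknown` back to an analyst for review.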

4

Eradication and Recovery

  1. Remove malware and persistence mechanisms
    • For confirmed compromise, rebuild from a known-good image rather than cleaning in place — scheduled tasks, registry run keys, cron jobs, and rogue IAM roles are easy to miss. Validate the gold image predates the initial compromise based on the timeline.

  2. Patch the exploited vulnerability
    • Identify the CVE or misconfiguration that enabled initial access. Push the patch through your patch management tool (Action1, Automox, Intune) to all hosts running the affected version, not just the compromised one. Re-scan with Tenable / Qualys / Wiz to confirm closure.

  3. Restore impacted services from clean backups
    • Restore from immutable backups (Veeam hardened repos, AWS Backup vault lock, Rubrik). Verify the restore point predates compromise based on your timeline — restoring from a backup taken after initial access reintroduces the foothold. Validate RPO and RTO against SLA commitments.
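
The restore-point check can be sketched as picking the newest backup that predates initial access. The one-hour safety margin is an assumption to cover uncertainty in the timeline, and `latest_clean_restore_point` is a hypothetical helper, not a Veeam, AWS Backup, or Rubrik API.

```python
from datetime import datetime, timedelta, timezone


def latest_clean_restore_point(backup_times, initial_access, margin=timedelta(hours=1)):
    """Return the newest backup taken before (initial_access - margin), or None."""
    cutoff = initial_access - margin
    candidates = [taken_at for taken_at in backup_times if taken_at < cutoff]
    return max(candidates, default=None)


utc = timezone.utc
backups = [
    datetime(2024, 3, 5, 2, 0, tzinfo=utc),
    datetime(2024, 3, 6, 2, 0, tzinfo=utc),
    datetime(2024, 3, 7, 2, 0, tzinfo=utc),
]
initial_access = datetime(2024, 3, 6, 2, 30, tzinfo=utc)
print(latest_clean_restore_point(backups, initial_access))
# prints "2024-03-05 02:00:00+00:00"
```

In this example the 6 March backup falls inside the margin, so the function falls back to 5 March. A result of `None` means no backup predates the compromise and a rebuild is the safer path.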

  4. Verify containment is complete before bringing systems online
    • Run a fresh EDR scan, re-query SIEM for any IOC matches in the last 24 hours, and confirm no anomalous outbound traffic from the rebuilt hosts. The IC signs off before the comms lead announces resolution.

5

Notification and Regulatory Reporting

  1. Engage legal and privacy counsel
    • For confirmed security incidents, loop in legal before drafting external communications — attorney-client privilege over the investigation depends on counsel directing the work. Privacy counsel evaluates breach-notification triggers under GDPR Article 33, HIPAA, and applicable state laws.

  2. File regulatory breach notifications within required windows
    • GDPR requires notification to the supervisory authority within 72 hours of becoming aware of the breach. State breach-notification statutes typically range from 30 to 90 days, with varying recipient lists (AG, credit bureaus, affected residents). A HIPAA breach affecting 500 or more individuals requires HHS notification without unreasonable delay and no later than 60 days after discovery. Track each filing separately with submission timestamps.
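
The two fixed clocks named above can be sketched as simple date arithmetic. This assumes both clocks start at the same moment of awareness; state deadlines are left out because they vary, and counsel decides which clocks actually apply. The function name and dictionary keys are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def notification_deadlines(aware_at: datetime) -> dict:
    """Compute the outer deadlines for the GDPR and HIPAA notifications."""
    return {
        "gdpr_supervisory_authority": aware_at + timedelta(hours=72),
        "hipaa_hhs_500_or_more": aware_at + timedelta(days=60),
    }


aware = datetime(2024, 3, 7, 14, 0, tzinfo=timezone.utc)
for filing, due in notification_deadlines(aware).items():
    print(f"{filing}: {due:%Y-%m-%d %H:%M} UTC")
# prints "gdpr_supervisory_authority: 2024-03-10 14:00 UTC"
# then   "hipaa_hhs_500_or_more: 2024-05-06 14:00 UTC"
```

Recording these dates in the incident channel at declaration time makes the clocks visible to the IC and the comms lead.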

  3. Notify impacted customers and partners
    • Comms lead sends notifications per the templates approved by legal. Update the public status page (Statuspage, Instatus) and customer success / account managers' talking points. Avoid speculation about cause until forensics is complete; correcting public statements later is worse than initial silence.

6

Postmortem and Improvement

  1. Schedule the blameless postmortem
    • Hold the postmortem within 5 business days while the timeline is fresh. Invite the IR team, the service owner, legal (for security incidents), and an exec sponsor. Blameless framing focuses on systems and processes, not individuals — the goal is durable fixes, not blame.

  2. Document root cause, MTTD, and MTTR
    • Capture the technical root cause, the contributing factors, MTTD (time from first malicious action to detection), and MTTR (detection to full recovery). Trend these across incidents quarterly to know whether your detective controls are improving.
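
Using the definitions above, the per-incident intervals and their means can be computed directly from the timeline. This is a minimal sketch; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def detection_and_recovery(first_malicious_action, detected_at, recovered_at):
    """Return (time to detect, time to recover) for one incident."""
    if not first_malicious_action <= detected_at <= recovered_at:
        raise ValueError("timestamps must be in chronological order")
    return detected_at - first_malicious_action, recovered_at - detected_at


def mean_duration(durations):
    """Average a non-empty list of timedeltas, e.g. to get MTTD for a quarter."""
    return sum(durations, timedelta()) / len(durations)


utc = timezone.utc
ttd, ttr = detection_and_recovery(
    datetime(2024, 3, 7, 14, 2, tzinfo=utc),
    datetime(2024, 3, 7, 14, 15, tzinfo=utc),
    datetime(2024, 3, 7, 20, 15, tzinfo=utc),
)
print(ttd, ttr)  # prints "0:13:00 6:00:00"
```

Feeding each quarter's per-incident values into `mean_duration` gives the MTTD and MTTR figures to trend.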

  3. File action items with owners and due dates
    • Every action item lands in Jira / Linear with a named owner and a due date. Track completion in the monthly security ops review; postmortem actions that don't ship are a reliable predictor of the same incident recurring.

  4. Update the IR runbook with lessons learned
    • Fold detection-rule changes, new IOCs, and process gaps into the runbook in Confluence / Notion / IT Glue. Surface the changes at the next IR tabletop so the team practices against the updated playbook before the next real incident.

Category Information Technology