Software Project Risk Management Checklist
Quarterly risk-management cycle for a software engineering team — identify, score, and mitigate technical, security, vendor, and schedule risks across the project portfolio. Run by an engineering manager or technical program manager with input from tech leads, SRE, and AppSec.
Risk Identification Kickoff
-
Schedule the pre-mortem workshop
Block 90 minutes with tech leads, SRE on-call, AppSec, and the product manager. Pre-mortem framing: "It's six months from now and the project failed — what went wrong?" Async brainstorming in a shared doc 24 hours ahead surfaces more than a cold-start meeting.
-
Pull contributing factors from past PIRs
Review the last 4 quarters of post-incident reviews in Confluence/Notion. Extract recurring contributing factors — flaky CI, untested rollbacks, certificate expiry, unowned services. Recurring factors are the strongest signal for risks worth registering.
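One way to make "recurring" concrete is to tally contributing factors across the reviews. The sketch below assumes the PIRs have been exported by hand into a list of dictionaries; the `pirs` data, the IDs, and the `min_count` cutoff are hypothetical and not tied to any Confluence or Notion API.

```python
from collections import Counter

# Hypothetical export: one entry per post-incident review,
# each tagged with its contributing factors.
pirs = [
    {"id": "PIR-101", "factors": ["flaky CI", "untested rollback"]},
    {"id": "PIR-102", "factors": ["certificate expiry"]},
    {"id": "PIR-103", "factors": ["flaky CI", "unowned service"]},
    {"id": "PIR-104", "factors": ["untested rollback", "flaky CI"]},
]


def recurring_factors(reviews, min_count=2):
    """Return (factor, count) pairs seen in at least min_count reviews,
    most frequent first."""
    counts = Counter()
    for review in reviews:
        # set() so a factor listed twice in one review counts once
        counts.update(set(review["factors"]))
    return [(factor, n) for factor, n in counts.most_common() if n >= min_count]


print(recurring_factors(pirs))
```

Factors that clear the cutoff are candidates for the register; one-off factors may still matter, so treat the tally as a prompt for discussion rather than a filter.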
-
Inventory third-party dependencies and SaaS vendors
Generate the SBOM (Syft, Trivy, or your registry's built-in) and list paid vendors from the procurement system. Watch for transitive critical-CVE dependencies (think Log4Shell-class), single-vendor lock-in (auth provider, payments), and packages without a maintained upstream.
-
Capture risks raised by on-call engineers
On-call sees the rough edges first — alert noise, runbooks that don't match reality, services with a single SME. Ask the last two rotations directly; don't rely on tickets alone.
Risk Analysis and Scoring
-
Score each risk on probability and impact
Rate probability and impact each on a 1–5 scale and multiply them, giving a score from 1 to 25. Impact dimensions: customer-facing downtime, data exposure, revenue, and engineering toil. Anything scoring 15+ goes on the top-tier list and needs a named owner this cycle.
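The scoring arithmetic can be sketched in a few lines. The checklist does not say how the four impact dimensions combine into one number; taking the worst rating is an assumption made here, and the function and key names are illustrative only.

```python
TOP_TIER_THRESHOLD = 15  # from the checklist: 15+ is top tier


def overall_impact(dimensions: dict[str, int]) -> int:
    """Assumption: overall impact is the worst (highest) dimension rating."""
    return max(dimensions.values())


def risk_score(probability: int, impact: int) -> int:
    """Multiply probability by impact, each rated 1 to 5."""
    for name, value in (("probability", probability), ("impact", impact)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return probability * impact


impact = overall_impact(
    {"downtime": 4, "data_exposure": 2, "revenue": 3, "toil": 5}
)
score = risk_score(3, impact)
print(score, score >= TOP_TIER_THRESHOLD)  # 15 True
```

A team that prefers averaging or weighting the dimensions can swap out `overall_impact` without touching the rest.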
-
Classify the project's regulatory scope
Confirm what regulated data the in-scope services touch. PHI pulls in HIPAA + BAA review; cardholder data pulls in PCI scope; EU resident data pulls in GDPR sub-processor obligations. Misclassification here is the most common reason auditors find a control gap later.
-
Log entries in the risk register
Single source of truth — Jira, Linear, or a Notion table linked from the engineering wiki. Each entry gets: ID, description, category (technical / security / vendor / schedule / compliance), score, owner, mitigation, status. Avoid private spreadsheets; auditors and successors won't find them.
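The entry fields listed above map naturally onto a small record type. This is a sketch of the shape only, not a schema for Jira, Linear, or Notion; the `RiskEntry` name, the default `"open"` status, and the validation rules are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    TECHNICAL = "technical"
    SECURITY = "security"
    VENDOR = "vendor"
    SCHEDULE = "schedule"
    COMPLIANCE = "compliance"


@dataclass
class RiskEntry:
    id: str
    description: str
    category: Category
    score: int          # probability x impact, 1 to 25
    owner: str          # one named person, not a team
    mitigation: str
    status: str = "open"  # assumed default; the checklist names no statuses

    def __post_init__(self):
        if not 1 <= self.score <= 25:
            raise ValueError(f"score must be between 1 and 25, got {self.score}")
        if not self.owner.strip():
            raise ValueError("every entry needs a named owner")


entry = RiskEntry(
    id="RISK-12",  # hypothetical ID
    description="Single auth provider with no tested fallback",
    category=Category.VENDOR,
    score=16,
    owner="Jane Doe",
    mitigation="Document and drill a break-glass login path",
)
print(entry.category.value, entry.status)  # vendor open
```

Rejecting a blank owner at creation time is one way to keep ownerless entries out of the register.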
-
Run the SOC 2 / HIPAA / PCI control mapping review
Map each compliance-relevant risk to the affected control (CC6.x for access, CC7.x for monitoring, CC8.x for change management under SOC 2). Loop in the compliance lead or your Vanta/Drata/Secureframe owner to confirm the gap is registered and an evidence task exists.
Mitigation Planning
-
Assign a named owner to each top-tier risk
One human per risk, not a team. The owner drives the mitigation plan, reports status at the monthly review, and closes the entry. Rotate ownership when people change roles — orphaned risks are how a tracked gap becomes a Sev1.
-
Draft mitigation plans for top-tier risks
Each plan needs: concrete engineering work (linked tickets), a target completion date, and the residual-risk score after mitigation. Vague mitigations ("improve observability") don't ship; "add SLO burn-rate alert on checkout-service p99" does.
-
Define rollback triggers and kill-switch flags
For each release-related risk, document the trigger condition (error rate > X%, p99 > Y ms, customer support tickets > Z/hr) and the operator action (flip the LaunchDarkly flag, redeploy previous container tag, run the rollback migration). The PagerDuty runbook link goes here too.
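Writing triggers down as data makes them easy to review and to check against live metrics. The sketch below is a minimal illustration: the metric names and threshold values are hypothetical stand-ins for the X, Y, and Z above, and the `observed` dictionary would come from whatever monitoring system the team uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RollbackTrigger:
    metric: str       # name of the observed metric
    threshold: float  # trigger fires when the metric exceeds this value
    action: str       # operator action from the runbook


def fired_triggers(triggers, observed):
    """Return triggers whose observed metric strictly exceeds its threshold.
    Metrics missing from `observed` do not fire; treat missing data as its
    own alert in a real setup."""
    return [
        t for t in triggers
        if t.metric in observed and observed[t.metric] > t.threshold
    ]


# Hypothetical thresholds; each team sets its own values.
triggers = [
    RollbackTrigger("error_rate_pct", 2.0, "flip the LaunchDarkly flag"),
    RollbackTrigger("p99_latency_ms", 800.0, "redeploy previous container tag"),
    RollbackTrigger("support_tickets_per_hr", 20.0, "run the rollback migration"),
]

observed = {"error_rate_pct": 3.5, "p99_latency_ms": 420.0}
for trigger in fired_triggers(triggers, observed):
    print(f"{trigger.metric} exceeded {trigger.threshold}: {trigger.action}")
```

Pairing each threshold with its action in one record keeps the runbook and the alert definition from drifting apart.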
-
Confirm residual risk is within appetite
After applying mitigations, re-score each top-tier risk. Anything still scoring 15+ is residual exposure leadership needs to accept explicitly — it doesn't go away because you wrote a plan.
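The re-scoring step amounts to recomputing probability × impact with post-mitigation ratings and comparing against the same 15+ line. A minimal sketch, assuming each risk carries `residual_probability` and `residual_impact` fields (hypothetical names) and hypothetical IDs:

```python
APPETITE_THRESHOLD = 15  # same 15+ line used for top-tier risks


def residual_report(risks):
    """Split risks into those within appetite and those that still need
    an explicit acceptance decision from leadership."""
    within, needs_acceptance = [], []
    for risk in risks:
        residual = risk["residual_probability"] * risk["residual_impact"]
        bucket = needs_acceptance if residual >= APPETITE_THRESHOLD else within
        bucket.append((risk["id"], residual))
    return within, needs_acceptance


risks = [
    # Inherent score was 4 x 5 = 20; mitigation lowered probability.
    {"id": "RISK-7", "residual_probability": 2, "residual_impact": 5},
    # Inherent score was 5 x 4 = 20; mitigation only partly effective.
    {"id": "RISK-12", "residual_probability": 4, "residual_impact": 4},
]

within, needs_acceptance = residual_report(risks)
print("within appetite:", within)
print("needs explicit acceptance:", needs_acceptance)
```

Here RISK-7 drops to 10 and falls within appetite, while RISK-12 stays at 16 and goes to leadership for a decision.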
Monitoring and Control
-
Wire risk indicators into Datadog or Grafana
If the risk has a leading indicator (Dependabot critical-CVE count, certificate days-to-expiry, p99 latency budget burn), it goes on a dashboard with an alert routing to the risk owner — not a deprecated #alerts channel. "Backup nightly green for 18 months" without a restore test is not monitoring.
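As an illustration of one leading indicator, certificate days-to-expiry can be computed from the `notAfter` string in the format that Python's `ssl.SSLSocket.getpeercert()` reports. The sketch below works on a literal string so it runs offline; the 30-day `WARN_DAYS` threshold and the sample dates are assumptions, and wiring the number into Datadog or Grafana is left out.

```python
import ssl
from datetime import datetime, timezone

WARN_DAYS = 30  # hypothetical alert threshold


def days_to_expiry(not_after: str, now: datetime) -> int:
    """Whole days until the certificate expires (negative if expired).
    `not_after` uses the certificate time format, e.g.
    'Jan 15 00:00:00 2025 GMT'."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def needs_alert(not_after: str, now: datetime) -> bool:
    return days_to_expiry(not_after, now) <= WARN_DAYS


# Fixed "now" so the example is reproducible.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(days_to_expiry("Jan 15 00:00:00 2025 GMT", now))  # 14
print(needs_alert("Jan 15 00:00:00 2025 GMT", now))     # True
```

In practice `now` would be `datetime.now(timezone.utc)` and the result would be emitted as a gauge metric, with the alert routed to the risk owner.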
-
Hold the monthly risk register review
30 minutes, calendar-recurring. Owners report status on their entries, retire mitigated risks, add new ones surfaced since last cycle. Skipping the review is how registers become museum pieces.
-
Re-test rollback and restore procedures
Run a quarterly drill in a non-prod environment: restore the latest backup, redeploy the previous container tag, run the down migration. The first restore attempt usually fails on a rotated credential or a missing IAM permission; finding that during a drill is the point.
Stakeholder Communication
-
Brief the CTO on accepted residual risk
For any risk still scoring 15+ after mitigation, schedule a 15-minute briefing with the CTO or engineering leadership. Capture the explicit accept/reject decision in the register so it's defensible at the next audit walkthrough.
-
Post the risk summary in #engineering
One Slack post per cycle: top three risks, owners, target dates, and a link to the register. Async visibility prevents "nobody told me" surprises during release weeks.
-
Hold the quarterly risk retrospective
Look back at the cycle: which risks materialized despite mitigation, which we missed entirely, and which controls actually held. Feed the answers into next quarter's identification step — that's how risk management compounds instead of resetting.