Change Management Checklist
Steps a release manager or platform engineer runs to push a production change through CAB review, deploy, verification, and closure. Designed for SaaS teams operating under SOC 2 change-management controls.
Request Intake and Scoping
-
Open the change ticket in Jira
Use the CHG project in Jira (or your equivalent in Linear / ServiceNow). Link the originating PR, incident, or feature epic. Include the target environment, customer-visible impact, and the engineer accountable for the change.
-
Classify the change risk tier
Standard = pre-approved low-risk (dependency patch, doc update). Normal = scheduled change requiring CAB review. Emergency = break-glass for a SEV1 or an expiring TLS cert. Misclassifying a normal change as an emergency to bypass CAB review is a common SOC 2 audit finding.
-
Identify blast radius and dependent services
Pull the service from Backstage or your service catalog. Note upstream callers, downstream dependencies, shared databases, and any feature flags this change depends on. Database schema changes always require a separate migration plan.
-
Notify dependent service owners in Slack
Post in #eng-releases and tag the on-call for each dependent service. Don't rely on CODEOWNERS auto-review alone; an async Slack notice gives owners time to flag conflicts before CAB.
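The blast-radius step above amounts to a walk over the service dependency graph. The sketch below assumes the catalog can be exported as a plain mapping of each service to the services it calls. The `CALLS` data and the `upstream_callers` helper are hypothetical and not part of Backstage or any catalog API.

```python
from collections import deque

# Hypothetical catalog export: service -> services it calls directly.
CALLS = {
    "web": ["api"],
    "api": ["billing", "users-db"],
    "worker": ["billing"],
    "billing": ["users-db"],
    "users-db": [],
}


def upstream_callers(calls, target):
    """Return every service that reaches `target` directly or transitively."""
    # Invert the edges so the walk goes from the target back to its callers.
    callers = {}
    for service, deps in calls.items():
        for dep in deps:
            callers.setdefault(dep, set()).add(service)

    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen


if __name__ == "__main__":
    # Owners of these services are the ones to tag in #eng-releases.
    print(sorted(upstream_callers(CALLS, "billing")))
```

For a change to `billing`, the result contains `api`, `web`, and `worker`, which is the list of owners to notify before CAB.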
Change Plan and Rollback
-
Document the deploy steps in the runbook
Write the runbook as commands a different engineer could execute. Include Terraform plan output, kubectl commands, migration scripts, and the order they run in. Database migration always deploys before the backend that depends on it.
-
Write the rollback procedure
Confirm the previous container image is still in the registry and not pruned. For DB migrations, write a reverse migration or document the forward-compatible path. "Restore from backup" is not a rollback plan — that's a disaster recovery plan.
-
Define success metrics and SLO impact
Name the Datadog or Grafana dashboards to watch post-deploy. Document the p99 latency and error-rate thresholds that constitute a rollback trigger. If this change consumes error budget, note it in the SLO ticket.
-
Assign the release captain and on-call backup
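The rollback triggers from the success-metrics step are easier to act on when written down as data rather than prose. A minimal sketch, assuming the thresholds are a p99 latency ceiling and an error-rate ceiling; the `RollbackTriggers` class, the `breached` helper, and the numeric values are illustrative placeholders and should come from the service's own SLO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RollbackTriggers:
    # Placeholder fields; real values belong in the change request.
    max_p99_latency_ms: float
    max_error_rate: float  # fraction of requests, e.g. 0.01 == 1%


def breached(triggers, p99_latency_ms, error_rate):
    """Return the names of the thresholds the observed metrics exceed."""
    reasons = []
    if p99_latency_ms > triggers.max_p99_latency_ms:
        reasons.append("p99_latency")
    if error_rate > triggers.max_error_rate:
        reasons.append("error_rate")
    return reasons


if __name__ == "__main__":
    triggers = RollbackTriggers(max_p99_latency_ms=800.0, max_error_rate=0.01)
    # An empty list means no trigger fired; anything else is a rollback reason.
    print(breached(triggers, p99_latency_ms=920.0, error_rate=0.004))
```

Returning the list of breached thresholds, rather than a bare boolean, gives the release captain a reason to record in the CR ticket when a rollback is called.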
CAB Review and Approval
-
Submit the CR to the Change Advisory Board
Attach the runbook, rollback plan, blast-radius summary, and risk classification. CAB meets weekly; submitting after the cutoff pushes the change back a full week.
-
Walk through the change in CAB
Cover scope, blast radius, rollback, and the deploy window. Be explicit about which dependent services have signed off. CAB members will challenge the rollback plan — have it tested or expect pushback.
-
Capture the CAB decision
Record the outcome in the CR ticket. Approved-with-conditions means the conditions are blocking — don't deploy until they're resolved and re-confirmed in writing. SOC 2 auditors trace approvals back to this artifact.
-
Resolve CAB-imposed conditions
Common conditions: load test in staging, security review for IAM changes, cross-team sign-off, expanded canary window. Re-attach evidence to the CR before re-requesting approval.
Deploy Window Execution
-
Confirm no active SEV1 or SEV2 incidents
Check PagerDuty and the #incidents channel. An active incident on a dependent service means the deploy waits — even a green build doesn't justify shipping into a degraded environment.
-
Lock main branch for the deploy window
Set branch protection to require release-captain approval, or post the freeze notice in #eng-releases. Concurrent merges during a canary muddy the rollback decision.
-
Run the database migration first
Watch replication lag in RDS during the migration. For Postgres, use CONCURRENTLY for index creation; for column adds with defaults on large tables, batch the backfill rather than locking the table. Skip this step if the change includes no schema changes.
-
Deploy backend canary at 5%
Hold canary for at least 10 minutes. Watch error rate, p99 latency, and saturation on the dashboard named in the success-metrics step. If any threshold is breached, abort and roll back before continuing.
-
Roll out backend to 25, 50, then 100 percent
-
Deploy frontend after backend is stable
Frontend ships last because the new backend stays backward-compatible with the old frontend. Reversing the order means the frontend calls API endpoints the backend doesn't yet serve. For assets served through Cloudflare or CloudFront, the CDN cache purge is part of this step.
-
Confirm deploy outcome
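The canary and staged-rollout steps above follow one loop: shift traffic, hold, check health, then continue or roll back. The sketch below assumes the deploy tooling exposes three callables (`set_traffic`, `is_healthy`, `rollback`); these are hypothetical hooks, not a real deploy API. Holding 10 minutes at every stage is also an assumption, since the checklist only specifies the hold for the 5% canary.

```python
import time

STAGES = [5, 25, 50, 100]  # percent of traffic, matching the steps above


def staged_rollout(set_traffic, is_healthy, rollback,
                   hold_seconds=600, sleep=time.sleep):
    """Advance through STAGES; roll back on the first failed health check.

    Returns the last percentage that passed its health check (0 if none did).
    """
    passed = 0
    for percent in STAGES:
        set_traffic(percent)
        sleep(hold_seconds)  # let metrics accumulate before judging the stage
        if not is_healthy():
            rollback()
            return passed
        passed = percent
    return passed


if __name__ == "__main__":
    # Dry run with stand-in hooks and no real waiting.
    result = staged_rollout(
        set_traffic=lambda pct: print(f"traffic -> {pct}%"),
        is_healthy=lambda: True,
        rollback=lambda: print("rolling back"),
        hold_seconds=0,
    )
    print("reached", result)
```

Passing `sleep` as a parameter keeps the loop testable without waiting out the real hold time.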
Rollback Path
-
Page the incident commander
Trigger a SEV2 in PagerDuty, open the war-room Zoom, and post in #incidents. The release captain is not the IC — split the roles so one person drives the rollback while the other coordinates comms.
-
Execute the documented rollback
Follow the runbook from the Change Plan section. Redeploy the previous container image tag; if the migration was non-reversible, run the documented forward-fix instead. Don't improvise — improvised rollbacks are how outages double in length.
-
Verify production is healthy after rollback
Watch the same dashboards used during the deploy. Confirm error rate and p99 latency return to the pre-deploy baseline. Update the status page to resolved only after 30 minutes of clean signal.
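The "30 minutes of clean signal" rule in the last step can be checked mechanically instead of by eye. A minimal sketch, assuming metrics arrive as time-sorted `(timestamp, error_rate, p99_ms)` tuples; the `clean_for` helper and the sample format are hypothetical, and a real check would likely allow some tolerance above the baseline.

```python
from datetime import datetime, timedelta


def clean_for(samples, baseline_error_rate, baseline_p99_ms,
              window=timedelta(minutes=30)):
    """True if history covers `window` and every sample in it is at or below baseline.

    `samples` is a list of (timestamp, error_rate, p99_ms) tuples sorted by time.
    """
    if not samples:
        return False
    cutoff = samples[-1][0] - window
    if samples[0][0] > cutoff:
        return False  # not enough history to cover the full window yet
    recent = [s for s in samples if s[0] >= cutoff]
    return all(err <= baseline_error_rate and p99 <= baseline_p99_ms
               for _, err, p99 in recent)


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 12, 0)
    # One sample every 5 minutes for 35 minutes, all at healthy levels.
    samples = [(start + timedelta(minutes=5 * i), 0.001, 400.0) for i in range(8)]
    print(clean_for(samples, baseline_error_rate=0.002, baseline_p99_ms=450.0))
```

The explicit "not enough history" branch prevents the status page from being marked resolved after only a few good minutes.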
Post-Deploy Verification
-
Run the production smoke test suite
Trigger the Playwright or Cypress synthetic suite against production. Cover the critical user journeys: login, primary CRUD path, billing webhook. A green CI build does not substitute for a production smoke test.
-
Watch error rate and p99 for 30 minutes
-
Check Sentry for new exception signatures
New unique error fingerprints in the first hour are the tell — even at low volume they often grow. Triage each new signature; assign a ticket or roll back depending on severity.
-
Flip planned feature flags
If the change ships dark behind a LaunchDarkly or Unleash flag, schedule the rollout per the launch plan. Note the flag's owner and cleanup ticket — stale flags accumulate fast.
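Spotting new exception signatures, as in the Sentry step above, is a set difference between the fingerprints seen before and after the deploy. The sketch below assumes both snapshots have already been exported as `{fingerprint: event_count}` dictionaries. It does not call the Sentry API, and the `new_signatures` helper is hypothetical.

```python
def new_signatures(before, after):
    """Return (fingerprint, count) pairs present only after the deploy, most frequent first."""
    fresh = {fp: count for fp, count in after.items() if fp not in before}
    return sorted(fresh.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Illustrative fingerprints; real ones come from the error tracker.
    before = {"KeyError:checkout": 10, "Timeout:billing": 3}
    after = {"KeyError:checkout": 12, "TypeError:cart": 2, "ValueError:invoice": 7}
    for fingerprint, count in new_signatures(before, after):
        print(fingerprint, count)
```

Low-volume entries stay in the output on purpose, since the checklist treats even rare new signatures as worth triage.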
Closure and SOC 2 Evidence
-
Tag the release in git
Use a consistent version scheme, e.g., the CalVer-style v2024.45.0, and push an annotated tag pointing at the deployed SHA. The tag is the artifact auditors map back to the CAB approval.
-
Publish the changelog and release notes
-
Attach evidence to the Vanta change-management control
Link the CR ticket, CAB approval, deploy log, and post-deploy verification screenshots in Vanta or Drata. SOC 2 Type II auditors sample CRs across the audit window — missing evidence on one sampled change becomes a control exception.
-
Schedule a retro if the deploy went sideways
Rolled-back or partial deploys get a blameless PIR within 5 business days. Capture contributing factors, not just the surface cause — alert tuning gaps, missing runbook steps, and review-process misses are the durable lessons.
-
Close the change request ticket
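Before closing the ticket, it can help to confirm that every evidence item named in the Vanta step is attached. A minimal sketch, assuming the change record is available as a dictionary; the field names in `REQUIRED_EVIDENCE` are hypothetical and would need to match whatever the ticketing or compliance tool actually exports.

```python
# Hypothetical field names mirroring the evidence listed in the Vanta step.
REQUIRED_EVIDENCE = (
    "cr_ticket",
    "cab_approval",
    "deploy_log",
    "verification_screenshots",
)


def missing_evidence(record):
    """Return the required evidence fields that are absent or empty in a change record."""
    return [key for key in REQUIRED_EVIDENCE if not record.get(key)]


if __name__ == "__main__":
    record = {
        "cr_ticket": "CHG-0000",           # placeholder ticket id
        "cab_approval": "approved",
        "deploy_log": "",                  # empty counts as missing
    }
    print(missing_evidence(record))
```

An empty result means the change request is ready to close; any returned field is a gap an auditor could sample.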