Release Management Checklist
The change-management workflow that IT operations runs to take a release from RFC through CAB approval, the deployment window, and post-release closure. It is built around normal/standard/emergency change classification with rollback discipline.
Change Planning and RFC
-
Document the change scope and business driver
Capture the RFC summary in ServiceNow / Jira Service Management / ConnectWise: what is changing, the business reason, and which application or infrastructure components are touched. Vague RFCs ("upgrade middleware") get bounced at CAB — name the version, the host, and the customer-visible behavior.
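As a rough illustration of the "name the version, the host, and the customer-visible behavior" rule, the sketch below flags blank fields before an RFC goes to CAB. The `RfcSummary` class and its field names are hypothetical and do not correspond to any ServiceNow, Jira Service Management, or ConnectWise schema.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class RfcSummary:
    # Hypothetical field names; real ITSM tools label these differently.
    what_is_changing: str
    business_reason: str
    target_version: str
    target_hosts: list[str]
    customer_visible_behavior: str


def missing_details(rfc: RfcSummary) -> list[str]:
    """Return the names of fields that are blank or empty, in declaration order."""
    gaps = []
    for name, value in vars(rfc).items():
        if isinstance(value, str):
            empty = not value.strip()
        else:
            empty = len(value) == 0
        if empty:
            gaps.append(name)
    return gaps
```

An RFC that says only "upgrade middleware" would come back with `target_version` and `target_hosts` in the list, which is the same gap a CAB reviewer would bounce it for.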
-
Map impacted systems and downstream dependencies
Pull the CMDB record and trace upstream / downstream dependencies — load balancers, scheduled jobs, integrations, monitoring agents. Common gotcha: a "single app" upgrade silently breaks the nightly batch job that pulls from its API.
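The dependency trace can be pictured as a graph walk. This is a minimal sketch that assumes the CMDB relationships have already been exported into a plain dictionary mapping each component to the things that consume it. The component names are invented for illustration.

```python
from collections import deque


def downstream_of(component, consumers):
    """Breadth-first walk returning everything that directly or
    indirectly consumes the given component."""
    seen = set()
    queue = deque([component])
    while queue:
        current = queue.popleft()
        for dependent in consumers.get(current, []):
            if dependent not in seen and dependent != component:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Hypothetical export: key -> components that depend on it.
consumers = {
    "orders-api": ["nightly-batch", "load-balancer-pool"],
    "nightly-batch": ["finance-report"],
}
impacted = downstream_of("orders-api", consumers)
```

Here the nightly batch job and the report it feeds both show up as impacted, even though only the API is being upgraded.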
-
Attach the rollback runbook
Document the exact rollback procedure: snapshot revert commands, package downgrade syntax, config restore steps, and the named decision point at which the engineer aborts and rolls back. "We can roll back" is not a plan — the runbook is the plan.
-
Classify the change type
Standard = pre-approved, repeatable, low-risk (e.g., routine WSUS patch ring). Normal = requires CAB review. Emergency = bypasses normal CAB cadence with expedited approval. Misclassifying a normal change as standard is the most common audit finding in SOX ITGC reviews.
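One way to guard against misclassification is to make "normal" the default and require positive evidence for anything else. The sketch below assumes a hypothetical catalog of pre-approved template IDs and a flag for expedited approval. Neither the names nor the rule order come from any specific ITSM product.

```python
from enum import Enum


class ChangeType(Enum):
    STANDARD = "standard"
    NORMAL = "normal"
    EMERGENCY = "emergency"


def classify(template_id, preapproved_templates, needs_expedited_approval):
    """Default to NORMAL unless the change matches a pre-approved
    template (STANDARD) or needs expedited approval (EMERGENCY)."""
    if template_id is not None and template_id in preapproved_templates:
        return ChangeType.STANDARD
    if needs_expedited_approval:
        return ChangeType.EMERGENCY
    return ChangeType.NORMAL


# Hypothetical catalog of pre-approved, repeatable changes.
preapproved = {"wsus-patch-ring"}
```

A change with no matching template can never land in the standard bucket by accident, so it goes through CAB review.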
-
Schedule the maintenance window
Check the change calendar for blackout periods (month-end close, retail freeze, fiscal cutover). Pick a window that gives the on-call team daylight to roll back if smoke tests fail — Friday 5pm is the runbook-author's enemy.
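The blackout check comes down to an interval-overlap test. The sketch below assumes the change calendar is available as a dictionary of named periods with start and end timestamps. The dates in the usage lines are made up.

```python
from datetime import datetime


def blackout_conflicts(window_start, window_end, blackouts):
    """Return the names of blackout periods that overlap the proposed window."""
    return [
        name
        for name, (b_start, b_end) in blackouts.items()
        if window_start < b_end and b_start < window_end
    ]


# Hypothetical calendar entry.
blackouts = {"month-end close": (datetime(2024, 3, 29), datetime(2024, 4, 2))}
clashes = blackout_conflicts(
    datetime(2024, 3, 30, 22, 0), datetime(2024, 3, 31, 2, 0), blackouts
)
```

An empty list means the window is clear of blackouts. It says nothing about whether the on-call team has enough time left to roll back, which still needs a human judgment.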
Pre-Deployment Validation
-
Deploy the build to the test ring
Use the staging or pilot OU / VLAN / cluster that mirrors production topology. Test rings that diverge from prod (different OS build, different agents installed) provide false confidence. Patch Tuesday horror stories almost always trace to a staging environment that wasn't really staging.
-
Run the smoke test suite on staging
Hit the named user-facing checks: SSO login, primary report runs, scheduled job fires, monitoring agent reports in. The smoke-test list lives in the runbook — if it's not written down, it's not a test, it's vibes.
-
Verify the most recent backup is restorable
Confirm Veeam / Datto / Rubrik shows a successful job within the past 24 hours and that the restore point is mountable — "green dashboard" is not the same as "restorable." If the change touches a database, take a fresh backup before the window starts; do not rely on last night's job.
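The backup conditions above can be written as a go/no-go gate. This sketch assumes the job timestamp and the mount-test result have been gathered by hand or by script. It does not call any Veeam, Datto, or Rubrik API, and the function and parameter names are illustrative.

```python
from datetime import datetime, timedelta

MAX_BACKUP_AGE = timedelta(hours=24)


def backup_blockers(last_success, restore_point_mounted,
                    touches_database, fresh_backup_taken, now):
    """Return reasons the change should not start yet; empty list means go."""
    blockers = []
    if last_success is None or now - last_success > MAX_BACKUP_AGE:
        blockers.append("no successful backup job in the past 24 hours")
    if not restore_point_mounted:
        blockers.append("restore point has not been test-mounted")
    if touches_database and not fresh_backup_taken:
        blockers.append("database change without a fresh pre-window backup")
    return blockers
```

A job that succeeded 20 hours ago and mounts cleanly still blocks a database change until the fresh pre-window backup exists.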
-
Dry-run the rollback procedure on staging
Walk the rollback runbook step by step on the staging system after the change is applied there. The first time you discover the rollback depends on a credential that rotated should not be at 2am on production.
-
Log defects and known issues for the change record
CAB Review and Communication
-
Submit the RFC to the CAB queue
Most CABs require RFCs in the queue 48-72 hours before the meeting. Late-submitted RFCs get deferred to the next cycle by default, which usually means slipping the deployment window by a week.
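The lead-time arithmetic is simple but easy to get wrong across a weekend. This is a small sketch assuming a 72-hour default, the conservative end of the 48-72 hour range. Individual CABs set their own cutoff.

```python
from datetime import datetime, timedelta


def submission_deadline(cab_meeting, lead_hours=72):
    """Latest time an RFC can enter the queue for this CAB meeting."""
    return cab_meeting - timedelta(hours=lead_hours)


def is_late(submitted_at, cab_meeting, lead_hours=72):
    """True if the RFC would miss the cutoff and roll to the next cycle."""
    return submitted_at > submission_deadline(cab_meeting, lead_hours)
```

For a meeting on a Thursday at 10:00, the 72-hour cutoff falls on Monday at 10:00, so an RFC filed Tuesday morning is already late under that rule.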
-
Present the change at CAB
Walk the board through scope, blast radius, rollback plan, and test evidence. Be ready for the question "who owns the rollback decision during the window?" — name the engineer and their backup.
-
Send maintenance notice to affected users
Notify via the standing channel — status page, Teams / Slack #announcements, email to affected DLs. Include start time, expected duration, services impacted, and where to file tickets if something is broken after the window closes.
-
Confirm on-call coverage for the window
Page the primary, secondary, and the named escalation owner via PagerDuty / Opsgenie before the window opens. For MSP work, confirm the client's after-hours contact is reachable in case decisions need their sign-off mid-window.
Deployment Execution
-
Snapshot production hosts before cutover
Take VM-level snapshots in vCenter / Hyper-V / Proxmox immediately before the deployment runbook starts. Note: snapshots are not backups. They are usually purged on an age-based retention counter, and their delta files grow for as long as they are kept. Tag the snapshot with the change number and a 7-day expiry.
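The tagging convention can be generated rather than typed by hand. The label format below is an assumption, not a vCenter, Hyper-V, or Proxmox requirement, and the change number used in testing would be a placeholder.

```python
from datetime import datetime, timedelta


def snapshot_label(change_number, taken_at, keep_days=7):
    """Build a snapshot name carrying the change number and expiry date."""
    expires = taken_at + timedelta(days=keep_days)
    return f"{change_number}_pre-cutover_expires-{expires:%Y-%m-%d}"
```

Putting the expiry date in the name lets a cleanup script, or a person scanning the snapshot list, spot stale snapshots without looking up the change record.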
-
Execute the deployment runbook
Follow the runbook exactly — do not improvise during the window. If a step fails or surfaces unexpected behavior, stop and call the change owner before continuing. Off-script execution is the most common cause of post-mortem findings.
-
Run post-deploy smoke tests against production
Execute the same smoke-test list used on staging. Verify monitoring (PRTG, Datadog, LogicMonitor) shows agents reporting and no new alerts. Customer-facing endpoints should be checked from outside the corporate network.
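Running the same list on staging and production is easier when the list is data. The sketch below assumes each smoke test is a zero-argument callable that returns a truthy value on success. The check names and stub lambdas are placeholders for real probes.

```python
def run_smoke_tests(checks):
    """Run every named check and record pass/fail.
    A check that raises is recorded as a failure, not a crash."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return results


def all_passed(results):
    return all(results.values())


# Placeholder checks; real ones would hit SSO, reports, jobs, and agents.
checks = {
    "sso_login": lambda: True,
    "primary_report_runs": lambda: True,
    "monitoring_agent_reports_in": lambda: 1 / 0,  # simulated failure
}
results = run_smoke_tests(checks)
```

Because a raised exception counts as a failed check, one broken probe does not hide the results of the others.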
-
Execute the rollback procedure
If the production smoke tests fail or the runbook's abort decision point is reached, follow the rollback runbook attached to the RFC. Restore from snapshot or run the documented downgrade steps, re-run smoke tests, and mark the change as failed. Notify users on the same channel where the maintenance notice was posted.
Post-Release Closure
-
Monitor systems through the stabilization period
Watch SIEM, EDR, and APM dashboards for 48-72 hours post-cutover. Track any new tickets tagged to the change in the PSA / ITSM queue. A spike in helpdesk volume on day 2 is often the first signal of a partial regression.
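A helpdesk spike can be flagged with a simple baseline comparison. This sketch assumes daily ticket counts have been exported from the PSA / ITSM queue. The 1.5x threshold is an arbitrary starting point, not a recommended value.

```python
from statistics import mean


def spike_days(baseline_counts, post_counts, factor=1.5):
    """Return the post-cutover days (1-indexed) whose ticket count
    exceeds factor times the pre-change daily average."""
    threshold = mean(baseline_counts) * factor
    return [
        day
        for day, count in enumerate(post_counts, start=1)
        if count > threshold
    ]
```

With a pre-change average of about ten tickets a day, a day-2 count of nineteen would be flagged while eleven or twelve would not.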
-
Hold the post-implementation review
Walk through what went per plan, what deviated, and which runbook steps need correction. Capture action items with owners and due dates — a PIR with no follow-through is theater.
-
Update the CMDB and runbook documentation
Push CI updates into ServiceNow / IT Glue / Hudu — version, host, dependencies, owner. Stale CMDB entries are how the next change owner trips on the same dependency you just discovered.
-
Close the change record with outcome and artifacts
Set final status, attach evidence (smoke test output, monitoring screenshots, communication artifacts), and link the PIR notes. SOX, SOC 2, and HIPAA audits sample closed change records — a half-closed ticket is an audit finding.
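A closure check can catch a half-closed ticket before an auditor does. The artifact names below mirror the evidence listed above, but they are illustrative labels and not fields from any ITSM tool.

```python
REQUIRED_ARTIFACTS = {
    "smoke_test_output",
    "monitoring_screenshots",
    "communication_artifacts",
    "pir_notes",
}


def closure_gaps(final_status, attached):
    """Return what is still missing before the change record can be closed."""
    gaps = sorted(REQUIRED_ARTIFACTS - set(attached))
    if not final_status:
        gaps.insert(0, "final_status")
    return gaps
```

An empty list means the record has a status and all four kinds of evidence. Anything else names exactly what to chase.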