Release Checklist

Weekly production release runbook for a SaaS engineering team. Covers pre-release verification, the deploy window itself with canary and rollback paths, and post-deploy monitoring and wrap-up.

5 sections, 26 steps

1. Pre-Release Preparation

  1. Cut the release branch from main
    • Branch off the latest green main commit and tag the candidate (e.g., v2024.45.0-rc.1). The release captain owns this step. If main is red, hold the cut until the failing build is investigated — don't branch from a broken tip.

  2. Confirm changelog entries for merged PRs
    • Diff merged PRs since the last release tag against the changelog. Customer-visible changes need a public note; internal refactors get an internal-only line. Missing changelog entries are the most common reason support gets blindsided post-release.

  3. Run the full e2e suite in staging
    • Trigger the Playwright/Cypress suite against the release-candidate build deployed to staging. Investigate every failure, and do not classify a failure as flaky without a linked ticket. Rerunning until green hides real regressions.

  4. Verify the rollback plan
    • Confirm the previous container image still exists in ECR/GAR and has not been pruned. Walk the rollback procedure on paper: if the deploy fails at 50% traffic, what command brings us back? If a migration is in this release, it must either reverse cleanly or be forward-only safe, meaning the previous application version still runs correctly against the migrated schema.

  5. Determine whether a DB migration ships
    • Review the migrations folder for new entries since the last release tag. Flag any migration that adds a column with a default, drops a column, or renames a column on a table over 1M rows — these need a multi-step expand/contract pattern, not a single deploy.
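
The migration screen in step 5 can be partly automated. The sketch below is a minimal illustration, assuming migrations are plain SQL files with simple single-clause ALTER TABLE statements and that approximate row counts are available from somewhere else. The names `flag_risky_migrations`, `Flag`, and `row_counts` are hypothetical, and the sketch applies the 1M-row threshold to all three operations, which is one reading of the step.

```python
import re
from dataclasses import dataclass

LARGE_TABLE_ROWS = 1_000_000  # threshold taken from the step above

# One pattern per risky operation named in the step; group 1 is the table name.
# Assumption: unqualified table names and one clause per ALTER statement.
RISKY_PATTERNS = {
    "add column with default": re.compile(
        r"ALTER\s+TABLE\s+(\w+)\s+ADD\s+COLUMN\s+\w+[^;]*?\bDEFAULT\b",
        re.IGNORECASE,
    ),
    "drop column": re.compile(
        r"ALTER\s+TABLE\s+(\w+)\s+DROP\s+COLUMN\b", re.IGNORECASE
    ),
    "rename column": re.compile(
        r"ALTER\s+TABLE\s+(\w+)\s+RENAME\s+COLUMN\b", re.IGNORECASE
    ),
}


@dataclass(frozen=True)
class Flag:
    table: str
    reason: str


def flag_risky_migrations(sql: str, row_counts: dict) -> list:
    """Return a Flag for each risky ALTER on a table above the row threshold."""
    flags = []
    for reason, pattern in RISKY_PATTERNS.items():
        for match in pattern.finditer(sql):
            table = match.group(1).lower()
            # Tables with no known row count are treated as large,
            # so they still get a human review.
            rows = row_counts.get(table, LARGE_TABLE_ROWS + 1)
            if rows > LARGE_TABLE_ROWS:
                flags.append(Flag(table, reason))
    return flags
```

A flag from this screen is a prompt to plan an expand/contract sequence, not a verdict. Multi-clause statements and schema-qualified names would slip past these patterns, so the manual review of the migrations folder still applies.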

2. Release Day Pre-Deploy

  1. Confirm no active SEV1 or SEV2 incidents
    • Check PagerDuty and the #incidents channel. Releasing on top of an active incident makes triage impossible — you can't tell whether new symptoms are from the release or the underlying issue. If a SEV is open, push the deploy to the next window.

  2. Confirm release captain and on-call coverage
    • Both the release captain and the primary on-call engineer must be at keyboard for the full deploy window plus 30 minutes of post-deploy monitoring. Split-brain coverage (captain leaves at deploy, on-call takes over for monitoring) is how the first error spike gets missed.

  3. Post the release announcement to engineering
    • Drop the release window, scope summary, and rollback contact in #engineering. Cross-post to #customer-support so they can triage tickets that arrive during the window. Pin the message until wrap-up is complete.

  4. Lock main to release-blocking PRs only
    • Enable the branch-protection lock or post the freeze in #engineering. Hotfixes for the in-flight release are fine; unrelated merges are not. The freeze ends when the release summary goes out in the Wrap-Up section.

3. Deploy

  1. Deploy the database migration first
    • Run the migration ahead of the application deploy so the new code lands on a schema that already supports it. Watch replication lag during the migration: if replica lag exceeds 30 seconds, pause and investigate before continuing. Run backfills in batches with a sleep between batches, never as a single transaction.

  2. Deploy the backend canary at 5% traffic
    • Promote the release image to the canary deployment and shift 5% of production traffic via the load balancer or service mesh. Hold here for 10 full minutes — early canary errors usually appear within 2 minutes, but slow leaks (memory, connection pool) need the longer window.

  3. Evaluate the canary smoke result
    • Compare canary error rate, p99 latency, and saturation against the baseline pods. The canary fails if its error rate is more than 2x baseline or its p99 latency is more than 20% worse than baseline. New error signatures in Sentry that weren't present pre-deploy also fail it. Marking the result Fail triggers the rollback step automatically.

  4. Roll back to the previous container image
    • This step is conditional: it runs when the canary evaluation is marked Fail. If the canary passed, continue with the staged rollout. Re-deploy the previous image tag verified during pre-release prep, shift canary traffic back to the stable pool, and confirm error rate returns to baseline. If a migration shipped in this release, run the down-migration only after confirming the new code is fully drained; otherwise the still-running new code hits a schema it no longer recognizes.

  5. Roll out backend to 25%, 50%, 100%
    • Shift traffic in three steps with at least 5 minutes of dwell at each stage. Watch error rate and latency at each step; a regression at 50% is easier to roll back than at 100%. Don't skip stages even if the canary looked clean — load patterns at higher percentages can surface issues the canary missed.

  6. Deploy the frontend after backend stabilizes
    • Backend goes first because the new backend is backward-compatible with the old frontend; the reverse is not true. Invalidate the CDN cache for the frontend bundle so users get the new client immediately. Confirm the build hash served from the CDN matches the deployed artifact.

  7. Run the production smoke test
    • Execute the synthetic user journey against production: login, core CRUD path, billing read, logout. This catches the misconfigurations that staging doesn't have — production secrets, third-party webhooks, real DNS.
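
The canary gate in step 3 reduces to three comparisons, which can be sketched as below. This is a minimal illustration and not the team's actual tooling: `PoolMetrics` and `evaluate_canary` are hypothetical names, the metrics are assumed to be already aggregated over the 10-minute hold, and saturation is left out because the checklist gives no threshold for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolMetrics:
    error_rate: float                       # failed requests / total requests
    p99_latency_ms: float
    error_signatures: frozenset = frozenset()  # e.g. exception class names


def evaluate_canary(baseline, canary, max_error_ratio=2.0, max_p99_increase=0.20):
    """Return a list of failure reasons; an empty list means the canary passes."""
    reasons = []

    # Rule 1: canary error rate more than 2x the baseline pods.
    if canary.error_rate > baseline.error_rate * max_error_ratio:
        reasons.append(
            f"error rate {canary.error_rate:.4f} exceeds "
            f"{max_error_ratio}x baseline {baseline.error_rate:.4f}"
        )

    # Rule 2: p99 latency degraded by more than 20%.
    p99_limit = baseline.p99_latency_ms * (1 + max_p99_increase)
    if canary.p99_latency_ms > p99_limit:
        reasons.append(
            f"p99 {canary.p99_latency_ms:.0f} ms exceeds limit {p99_limit:.0f} ms"
        )

    # Rule 3: error signatures that were not present before the deploy.
    new_signatures = canary.error_signatures - baseline.error_signatures
    if new_signatures:
        reasons.append("new error signatures: " + ", ".join(sorted(new_signatures)))

    return reasons
```

Returning the reasons instead of a bare boolean gives the release captain something concrete to paste into the rollback note. With a baseline error rate of zero, any canary error fails the first rule, which may be stricter than intended for low-traffic services.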

4. Post-Deploy Monitoring

  1. Watch the error-rate dashboard for 30 minutes
    • Use the Datadog/Sentry dashboard, filtered to the post-deploy window. Triage new error signatures immediately: even one new exception class warrants a look before declaring the release stable. Cross-reference Sentry release tags against the deployed sha.

  2. Watch p99 latency for 30 minutes
    • Compare p50/p95/p99 against the trailing 7-day baseline at the same hour-of-day. A 20%+ p99 regression that holds for 15 minutes is a release issue, not a noise spike — file a hotfix ticket and consider rolling back if it persists.

  3. Monitor support inbound for spikes
    • Check Zendesk/Intercom inbound rate against the same-hour baseline. Customer-reported regressions often arrive 10–20 minutes after deploy, after the first wave of users hits the new code. A 2x ticket spike with overlapping symptoms is a release signal.

  4. Flip planned feature flags
    • Flags that gate new functionality flip after the deploy is confirmed stable, not at deploy time. This keeps the deploy and the feature launch as independent variables if something regresses. Flip via LaunchDarkly/Statsig with a gradual rollout cohort if the feature is risky. Note any flag flipped here in the release summary.

  5. Update the customer-facing status page
    • Post a Statuspage maintenance-completed note if you opened a maintenance window. Skip if the release was transparent to users. Don't post anything that contradicts the changelog — they get cross-referenced.
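
The p99 rule in step 2 separates a sustained regression from a noise spike. A minimal sketch of that distinction follows, assuming one p99 sample per minute and a single same-hour baseline value already computed from the trailing 7 days. The function name `sustained_p99_regression` and its parameters are hypothetical.

```python
def sustained_p99_regression(p99_per_minute, baseline_ms,
                             threshold=0.20, hold_minutes=15):
    """Return True if p99 stays at least `threshold` above baseline
    for `hold_minutes` consecutive one-minute samples."""
    limit = baseline_ms * (1 + threshold)
    streak = 0
    for value in p99_per_minute:
        # A sample below the limit resets the streak, so short spikes
        # separated by normal readings never add up to a regression.
        streak = streak + 1 if value >= limit else 0
        if streak >= hold_minutes:
            return True
    return False
```

For example, with a 100 ms baseline, five minutes at 150 ms followed by normal readings returns False, while fifteen consecutive minutes at 130 ms returns True. The same streak logic could be applied to the support-ticket rate in step 3 with a 2x multiplier, though the checklist does not give a hold time for that signal.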

5. Wrap-Up

  1. Tag the deployed sha as the release version
    • Promote the rc tag to the final release tag (e.g., v2024.45.0-rc.1 → v2024.45.0). The deployed sha is what you'll need for the post-incident review if anything regresses next week — capture it now while it's fresh.

  2. Publish release notes and changelog
    • Customer-visible notes go to the public changelog (docs site or in-app). Internal-only changes stay in the engineering changelog. Attach the published notes here for the audit trail.

  3. Send the release summary to engineering
    • Post in #engineering: tag, sha, deploy duration, any rollback or partial-rollback events, error-rate and p99 deltas vs. baseline, flags flipped. Unfreeze main in the same message.

  4. File hotfix tickets for issues found
    • Anything weird seen during deploy or monitoring becomes a Jira/Linear ticket — even minor anomalies, even if they self-resolved. Patterns across releases only emerge if you write them down.

  5. Schedule retro if anything went sideways
    • If a rollback fired, p99 spiked, customer tickets surged, or the deploy ran more than 2x the median window, schedule a blameless retro within 5 business days. If the deploy was uneventful, skip this step.
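
The retro trigger in step 5 can be written down as a small decision function plus a deadline calculation. This is a sketch under stated assumptions: `needs_retro` and `retro_deadline` are hypothetical names, past deploy durations are assumed to be recorded in minutes, and business days are taken to be Monday through Friday with holidays ignored.

```python
from datetime import date, timedelta
from statistics import median


def needs_retro(rollback_fired, p99_spiked, tickets_surged,
                deploy_minutes, past_deploy_minutes):
    """Return True if any of the four retro triggers from the step applies."""
    # "More than 2x the median window" needs history; with none, skip that check.
    ran_long = bool(past_deploy_minutes) and (
        deploy_minutes > 2 * median(past_deploy_minutes)
    )
    return rollback_fired or p99_spiked or tickets_surged or ran_long


def retro_deadline(deploy_day, business_days=5):
    """Return the date `business_days` working days after `deploy_day`."""
    day = deploy_day
    remaining = business_days
    while remaining > 0:
        day += timedelta(days=1)
        if day.weekday() < 5:  # Monday is 0, Friday is 4
            remaining -= 1
    return day
```

For a deploy on Thursday 2024-11-07, `retro_deadline(date(2024, 11, 7))` returns `date(2024, 11, 14)`, the following Thursday.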
