Deployment Checklist

Pre-Release Preparation

    Branch from main and push a release-candidate tag (e.g., v2024.45.0-rc.1) following semver. The release captain owns this step. Make sure no late-merging PRs slip in after the cut without re-tagging.

    Cross-check PRs merged since the last release tag against the changelog. Flag breaking changes, deprecations, and customer-visible behavior so support and docs aren't surprised on release day.

    Deploy the RC to staging and run Playwright/Cypress e2e + integration tests. Do not ignore flakes — re-run once, then triage any persistent red. Required status checks must be green before sign-off.

    QA exercises sign-up, login, billing, and the top-three customer workflows in staging. Capture any regressions in Linear/Jira tickets linked to the release ticket.

    Verify the previous container image is still in the registry (not pruned), the DB migration is reversible or has no destructive change, and the rollback runbook references the correct sha. Quarterly rollback drill is what proves this works — release-day verification just confirms artifacts exist.

    Send a release summary to #customer-support with the deploy window, scope, and any new error messages or UI changes. Surprise UI changes drive ticket spikes that could have been pre-empted.

Database Migration Review

    On large tables, ADD COLUMN with a default, ALTER COLUMN, or non-concurrent index creation can take exclusive locks for hours. Use CREATE INDEX CONCURRENTLY, split adds from defaults, and batch backfills with sleeps to keep replication lag bounded.

    Restore last night's prod backup into the staging DB and run the migration end-to-end. Record duration and peak replication lag so the on-call knows what to expect during the real run.

    Backend code must read from both old and new schema during the rollout window — expand-migrate-contract, not a flip. Confirm the contract step is scheduled for a follow-up release, not this one.

Release Day Pre-Deploy

    Check PagerDuty and the incidents channel. Don't deploy on top of an active incident — you'll obscure the contributing factor and complicate rollback.

    Both the release captain and the primary on-call should be at a keyboard for the duration of the deploy plus the 30-minute monitoring window. No deploys with the on-call boarding a flight.

    Announcement includes the window, scope summary, RC tag, and rollback contact. Lock main to release-blocking PRs only until post-deploy monitoring completes.

Deploy

    Apply the migration before the backend deploy so the new code lands on a compatible schema. Watch replication lag in Datadog/Grafana — pause if lag climbs past the SLO.

    Route 5% of traffic via the load balancer or service mesh weight. Watch error rate, p99 latency, and saturation for 10 minutes against the golden-signals dashboard.

    Step the weight up gradually with a watch interval at each stage. If the error rate or latency dashboards show regression at any step, hold and investigate before proceeding.

    Frontend ships after backend is at 100% — backend is forward-compatible, frontend assumes the new API. Invalidate CDN caches per the release runbook.

    Hit the production synthetic that exercises sign-up, login, and a representative API call. A green CI suite plus a failing prod synthetic = config or env drift, not code.

Rollback

    Open an incident in PagerDuty/Incident.io, name an IC, and post to #incidents. Don't try to debug forward — get back to the last known-good state first, then investigate.

    Use the previous image tag captured pre-deploy. If a migration ran, follow the documented down-migration or use the expand-migrate-contract fallback to keep the old code compatible with the new schema.

    Watch the golden-signals dashboard for 15 minutes after rollback. Update the status page once metrics are clean.

Post-Deploy Monitoring

    Stay on the golden-signals dashboard — latency, traffic, errors, saturation. Sentry / Bugsnag spikes by release tag are the fastest signal of a regression that synthetic tests didn't catch.

    Only flip flags listed in the release plan, one at a time, with a few minutes between each so you can attribute any regression to the specific flip. Note flag owner so cleanup happens in the next quarterly flag review.

    Coordinate with support lead on a Zendesk/Intercom view filtered to the deploy window. A 3x ticket spike in 30 minutes is a release signal even if dashboards look green.

Wrap-Up

    Promote the RC tag to the GA version (e.g., v2024.45.0). Push the tag and confirm the release shows up in GitHub Releases / GitLab Releases.

    Customer-visible release notes go to the public changelog or status page. Internal-only details (infra changes, refactors) stay in the engineering changelog.

    Hotfix candidates, runbook gaps, dashboard gaps, alert tuning — log them in Linear/Jira while the context is fresh. These become the action items if a retro is scheduled.

    Record whether the release went clean, had issues but recovered, or required rollback. If anything went sideways, schedule a blameless PIR within 5 business days while contributing factors are still recoverable.

Use this template in Manifestly

Start a Free 14 Day Trial
Use Slack? Start your trial with one click

Related Software Development Checklists

Ready to take control of your recurring tasks?

Start Free 14-Day Trial


Use Slack? Sign up with one click

With Slack