Acceptance Testing Checklist

Test Planning & Preparation

    Walk through each user story's acceptance criteria with the product owner (PO) before testing starts. Ambiguous criteria ("works as expected", "performs well") must be tightened into observable, testable outcomes; otherwise functional sign-off becomes a debate during the release window.

    Refresh the UAT database from a sanitized production snapshot so test data resembles real customer shapes. Confirm feature flags, secrets, and third-party sandbox endpoints (Stripe test mode, sandbox SSO IdP) match what production will see.

    Map test cases one-to-one against acceptance criteria. Mark each case as automated or manual: the manual list is what the QA team executes in this run, while the automated list lives in the regression suite covered later.

    Set up tenants, users, and edge-case data (long names, unicode, expired trials, multi-currency). Toggle the LaunchDarkly / Unleash flags to the state that will ship; testing the wrong flag combination is a common cause of release-day surprises.
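
    The one-to-one mapping between test cases and acceptance criteria described above can be checked mechanically before the run starts. The following is a minimal sketch, assuming criteria and cases are exported as simple IDs; the names AcceptanceCase, check_mapping, and manual_cases and the AC-/TC- identifiers are hypothetical and not part of any tracker's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceptanceCase:
    case_id: str
    criterion_id: str  # acceptance criterion this case verifies
    automated: bool    # True: regression suite, False: manual run


def check_mapping(criterion_ids, cases):
    """Report where the criteria-to-case mapping is not one-to-one."""
    by_criterion = {cid: [] for cid in criterion_ids}
    orphans = []  # cases pointing at a criterion that does not exist
    for case in cases:
        if case.criterion_id in by_criterion:
            by_criterion[case.criterion_id].append(case.case_id)
        else:
            orphans.append(case.case_id)
    return {
        "uncovered": [c for c, ids in by_criterion.items() if not ids],
        "duplicated": {c: ids for c, ids in by_criterion.items() if len(ids) > 1},
        "orphans": orphans,
    }


def manual_cases(cases):
    """IDs of the cases the QA team executes by hand in this run."""
    return [c.case_id for c in cases if not c.automated]


# Hypothetical sample data.
criteria = ["AC-1", "AC-2", "AC-3"]
cases = [
    AcceptanceCase("TC-10", "AC-1", automated=True),
    AcceptanceCase("TC-11", "AC-2", automated=False),
    AcceptanceCase("TC-12", "AC-2", automated=False),
    AcceptanceCase("TC-13", "AC-9", automated=False),
]
report = check_mapping(criteria, cases)
print(report)
print(manual_cases(cases))
```

    In this sample, AC-3 has no case, AC-2 has two, and TC-13 points at a criterion that does not exist, so the mapping would need fixing before execution begins.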

Functional Acceptance

    Execute each story's manual cases in the order an end user would encounter them. Record defects directly in Jira/Linear with the story link, environment, build SHA, and reproduction steps, not just a screenshot.

    Run the e2e suite against the UAT build. Triage any failures into real regressions vs. flakes; flakes get tracked separately so the green/red signal stays meaningful.

    Run contract tests (Schemathesis, Dredd, or Pact) so undocumented field changes don't slip through. Breaking changes to public endpoints require a deprecation cycle, not a silent ship.
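
    To illustrate what "undocumented field changes" means in the contract-testing step above, here is a hand-rolled sketch that compares one response payload with a documented field list. It is a simplified stand-in, not how Schemathesis, Dredd, or Pact work or are invoked; the function name, the invoice fields, and the payload are all hypothetical.

```python
def diff_against_contract(documented, payload):
    """Compare one response payload with a documented field -> type map."""
    missing = sorted(set(documented) - set(payload))
    undocumented = sorted(set(payload) - set(documented))
    wrong_type = sorted(
        name
        for name, expected in documented.items()
        if name in payload and not isinstance(payload[name], expected)
    )
    return {
        "missing": missing,            # documented but absent from the response
        "undocumented": undocumented,  # present in the response but not documented
        "wrong_type": wrong_type,      # present, but not the documented type
    }


# Hypothetical contract and response for an invoice endpoint.
documented_invoice = {"id": str, "amount_cents": int, "currency": str}
response_payload = {"id": "inv_123", "amount_cents": "1999", "tax_cents": 160}

result = diff_against_contract(documented_invoice, response_payload)
print(result)
```

    Any non-empty list here is the kind of drift a real contract-testing tool would flag; a removed or retyped field on a public endpoint is what triggers the deprecation cycle.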

Performance & Load Testing

    Drive expected peak RPS for at least 30 minutes against a UAT cluster sized to match production. Capture p50/p95/p99 latency, error rate, and saturation on the golden signals dashboard.

    Ramp load past expected peak until the system degrades. Note where it breaks first — DB connection pool, downstream API rate limit, CPU saturation on a specific service — and confirm autoscaling reacts before users see errors.

    Compare measured p95/p99 against the documented SLO. A regression vs. last release's baseline counts as a breach even if the absolute numbers are still under SLO — that's how slow drift gets caught.

    For each SLO breach or baseline regression, open a release-blocking ticket with the flame graph, slow query log, or APM trace attached. Decide with engineering and product whether to fix-and-retest or descope before sign-off; "ship and monitor" is rarely the right call for a known SLO breach.
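
    The comparison of measured p95/p99 against both the SLO and the last release's baseline can be sketched as follows. This assumes raw latency samples in milliseconds and uses the nearest-rank percentile method, which may differ slightly from what a given load tool or APM reports. The tolerance parameter is a hypothetical noise allowance that is not in the checklist; its default of 0 treats any regression as a breach. All numbers are illustrative, not measurements.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def evaluate(latencies_ms, slo_ms, baseline_ms, tolerance=0.0):
    """Flag each percentile that exceeds its SLO or regresses past the baseline.

    tolerance is a hypothetical noise allowance (0.05 means 5%).
    """
    verdicts = {}
    for pct, slo in slo_ms.items():
        measured = percentile(latencies_ms, pct)
        over_slo = measured > slo
        regressed = measured > baseline_ms[pct] * (1 + tolerance)
        verdicts[pct] = {
            "measured_ms": measured,
            "over_slo": over_slo,
            "regressed": regressed,
            "breach": over_slo or regressed,
        }
    return verdicts


# Illustrative numbers only.
latencies = list(range(1, 101))   # 1..100 ms, one sample each
slo = {95: 120, 99: 150}          # documented SLO per percentile
baseline = {95: 90, 99: 100}      # last release's measurements
verdicts = evaluate(latencies, slo, baseline)
print(verdicts)
```

    In this sample the p95 is under its SLO but slower than the baseline, so it is still reported as a breach, which is the slow-drift case described above.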

Security Validation

    Trigger Semgrep / CodeQL for SAST and Snyk / Dependabot for SCA against the release SHA. Triage criticals and highs; transitive CVEs without a fix path get an explicit risk acceptance, not a silent skip.

    Pair with AppSec on any auth, authz, or data-handling changes in this release. New endpoints get a quick threat-model review against the OWASP Top 10 — IDOR and broken access control are the most common findings on feature releases.

    Cross-check the latest pentest report against this release's scope. SOC 2 auditors will look for evidence that critical/high findings were closed before promotion; a screenshot of the ticket-closed state attached here is the audit trail.

    If any critical or high findings remain open, notify the release captain and the CISO delegate that promotion is on hold. Each open finding is either fixed and rescanned or formally risk-accepted with sign-off captured in the ticket; it is never bypassed in the deploy gate.
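
    The deploy-gate rule above (critical and high findings are either fixed and rescanned or risk-accepted with sign-off) can be expressed as a small check. This is a sketch under assumed field names; Finding, its status values, and the SEC- identifiers are hypothetical and do not reflect the output format of Semgrep, CodeQL, Snyk, or Dependabot.

```python
from dataclasses import dataclass

BLOCKING_SEVERITIES = {"critical", "high"}


@dataclass(frozen=True)
class Finding:
    finding_id: str
    severity: str            # "critical", "high", "medium", or "low"
    status: str              # "open", "fixed", or "risk_accepted"
    rescanned: bool = False  # a fix only counts once the rescan is clean
    signoff_by: str = ""     # who approved a risk acceptance


def blocking_findings(findings):
    """Return IDs of critical/high findings that should hold promotion."""
    blockers = []
    for f in findings:
        if f.severity not in BLOCKING_SEVERITIES:
            continue
        fixed = f.status == "fixed" and f.rescanned
        accepted = f.status == "risk_accepted" and bool(f.signoff_by)
        if not (fixed or accepted):
            blockers.append(f.finding_id)
    return blockers


# Hypothetical sample data.
findings = [
    Finding("SEC-1", "critical", "fixed", rescanned=True),
    Finding("SEC-2", "high", "risk_accepted", signoff_by="ciso-delegate"),
    Finding("SEC-3", "high", "risk_accepted"),  # no sign-off captured
    Finding("SEC-4", "critical", "fixed"),      # fixed but not rescanned
    Finding("SEC-5", "medium", "open"),         # below the blocking bar
]
blockers = blocking_findings(findings)
print("promotion on hold:" if blockers else "clear to promote:", blockers)
```

    A risk acceptance without a named approver and a fix without a rescan both keep promotion on hold, so neither can slip past the gate silently.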

Usability & Accessibility

    Recruit 5–8 representative users for the new flows and observe via Loom or a moderated UserTesting session. Watch for hesitation, dead-end states, and confused terminology — these rarely surface in internal dogfooding.

    Run axe or Lighthouse on every new screen, and navigate the changed flows using only the keyboard. Insufficient color contrast, missing labels, and broken focus trapping in modals are the recurring offenders. Public-sector and EU customers expect WCAG AA conformance.

    Categorize findings as release-blocking, fast-follow, or backlog. The bar for blocking is "a target user cannot complete the core task" — papercuts get logged but don't gate the ship.
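
    For the color-contrast findings mentioned in the accessibility step, it can help to see the arithmetic that tools such as axe and Lighthouse apply. The sketch below follows the WCAG 2.x definitions of relative luminance and contrast ratio, with the AA thresholds of 4.5:1 for normal text and 3:1 for large text. The function names are illustrative, and the automated tools remain the source of truth for the actual audit.

```python
def _linearize(channel_8bit):
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (r, g, b) tuple of 0-255 integers."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1.0 (identical colors) up to 21.0."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground, background, large_text=False):
    """AA asks for 4.5:1 on normal text and 3:1 on large text."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold


white = (255, 255, 255)
print(round(contrast_ratio((0, 0, 0), white), 2))   # black on white
print(passes_aa((0x76, 0x76, 0x76), white))         # gray #767676 on white
print(passes_aa((0x77, 0x77, 0x77), white))         # gray #777777 on white
```

    Two grays that look nearly identical can land on opposite sides of the 4.5:1 threshold, which is why contrast is checked with a tool rather than by eye.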

Regression & Sign-Off

    Kick off the nightly regression pipeline against the release candidate SHA. Attach the CI run URL here; a green run alone isn't sign-off — flakes have to be investigated, not retried until green.

    For each failure, decide: real regression (fix and re-run), known flake (quarantine with a tracking ticket), or environmental (UAT-only, document and override). "Just rerun it" without classification is how flaky suites lose their signal.
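
    The three-way classification above can be enforced with a small pre-sign-off check that refuses unclassified failures. This is a sketch under assumed conventions: the Verdict values mirror the categories in this step, while Failure, signoff_blockers, the test names, and the ticket ID are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    UNCLASSIFIED = "unclassified"
    REGRESSION = "real regression"
    FLAKE = "known flake"
    ENVIRONMENTAL = "environmental"


@dataclass(frozen=True)
class Failure:
    test_name: str
    verdict: Verdict = Verdict.UNCLASSIFIED
    ticket: str = ""  # tracking ticket for a quarantined flake
    note: str = ""    # documentation for an environmental override


def signoff_blockers(failures):
    """List reasons the run cannot be signed off yet; empty means ready."""
    reasons = []
    for f in failures:
        if f.verdict is Verdict.UNCLASSIFIED:
            reasons.append(f"{f.test_name}: not classified")
        elif f.verdict is Verdict.REGRESSION:
            reasons.append(f"{f.test_name}: fix and re-run")
        elif f.verdict is Verdict.FLAKE and not f.ticket:
            reasons.append(f"{f.test_name}: flake has no tracking ticket")
        elif f.verdict is Verdict.ENVIRONMENTAL and not f.note:
            reasons.append(f"{f.test_name}: override is undocumented")
    return reasons


# Hypothetical sample data.
failures = [
    Failure("checkout_total", Verdict.REGRESSION),
    Failure("export_csv", Verdict.FLAKE, ticket="QA-101"),
    Failure("sso_login", Verdict.ENVIRONMENTAL),
    Failure("search_paging"),
]
reasons = signoff_blockers(failures)
print(reasons)
```

    Only the quarantined flake with a tracking ticket passes; everything else produces a reason that has to be resolved before sign-off.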
