System Testing Checklist

QA-led system testing workflow for a release candidate, run from environment setup through test closure. Used by QA engineers, SDETs, and release captains to validate functional, non-functional, and integration behavior before promoting ...

1

Test Environment Setup

  1. Confirm staging is isolated from prod
    • Verify the staging VPC, IAM roles, and database credentials do not overlap with production. A common gotcha: a staging Lambda still pointing at the prod RDS endpoint because someone copied an env var. Spot-check the secrets manager paths and the outbound network ACLs before kicking off tests.

  2. Match staging parity to production
    • Confirm OS version, Postgres major version, Redis version, Kubernetes node image, and feature flag defaults match prod. Drift on minor versions (PG 15.4 vs 15.7) is fine; drift on major versions invalidates the test run.

  3. Seed staging with anonymized prod data
    • Use the nightly anonymized snapshot, not synthetic fixtures — synthetic data hides cardinality and locale issues. Confirm PII scrubbing ran (no real emails, names hashed) before exposing the dataset to non-engineering testers.
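A minimal sketch of the parity rule in step 2, assuming each environment's component versions can be exported as a plain name-to-version mapping. The function names, component keys, and version values are hypothetical, and "major version" is taken as the first dotted component, which fits Postgres and Redis but may not fit every OS or node-image naming scheme. Feature flag defaults are not versions and would need a separate equality check.

```python
from typing import Dict, List


def major_version(version: str) -> str:
    """Return the leading component of a dotted version, e.g. '15' from '15.4'."""
    return version.strip().split(".")[0]


def parity_report(staging: Dict[str, str], prod: Dict[str, str]) -> Dict[str, List[str]]:
    """Classify each prod component as missing, major drift, or minor drift in staging."""
    report: Dict[str, List[str]] = {"missing": [], "major_drift": [], "minor_drift": []}
    for component, prod_version in sorted(prod.items()):
        staging_version = staging.get(component)
        if staging_version is None:
            report["missing"].append(component)
        elif major_version(staging_version) != major_version(prod_version):
            # Major drift invalidates the test run.
            report["major_drift"].append(component)
        elif staging_version != prod_version:
            # Minor drift is tolerated but worth recording.
            report["minor_drift"].append(component)
    return report


if __name__ == "__main__":
    # Hypothetical inventories; real values would come from your own tooling.
    prod = {"postgres": "15.7", "redis": "7.2.4"}
    staging = {"postgres": "15.4", "redis": "6.2.14"}
    print(parity_report(staging, prod))
```

Treat any entry under `missing` or `major_drift` as a reason to stop before running tests; `minor_drift` entries can go into the test summary as context.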

2

Test Planning and Preparation

  1. Review the release-candidate scope in Jira
    • Pull the fix-version filter for the release candidate and walk the linked PRs. Flag any story without acceptance criteria back to the PM before writing test cases — a fuzzy AC is the most common reason a defect gets bounced as 'works as designed' later.

  2. Update the regression test suite
    • Add or amend Playwright / Cypress cases for new user stories; mark superseded cases as deprecated rather than deleting them, so the audit trail is preserved. Tag flaky cases with a quarantine label so they run but don't gate the suite.

  3. Stage test data for boundary conditions
    • Cover empty state, single record, max page size, Unicode names, RTL locales, and the largest plausible tenant. The Unicode and RTL cases are the ones most often missed, and the ones support tickets get filed against.
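The boundary conditions in step 3 can be sketched as a small fixture builder. This is an illustration only: the record shape (`id`, `name`), the function names, and the sample names are assumptions, and the largest-plausible-tenant case is left out because its size depends on the product.

```python
import unicodedata
from typing import Dict, List


def is_rtl(text: str) -> bool:
    """True if any character has a right-to-left bidirectional class (R or AL)."""
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


def boundary_datasets(max_page_size: int) -> Dict[str, List[dict]]:
    """Build hypothetical fixtures for the boundary cases named in the checklist."""

    def record(i: int, name: str = "Test User") -> dict:
        return {"id": i, "name": name}

    return {
        "empty": [],
        "single": [record(1)],
        "max_page": [record(i) for i in range(1, max_page_size + 1)],
        "unicode_names": [
            record(1, "Zoë Müller"),
            record(2, "李小龙"),
            record(3, "Nguyễn Thị Hồng"),
        ],
        "rtl_names": [
            record(1, "مريم"),  # Arabic script
            record(2, "דוד"),  # Hebrew script
        ],
    }


if __name__ == "__main__":
    data = boundary_datasets(max_page_size=50)
    for label, rows in data.items():
        print(label, len(rows))
```

The `is_rtl` helper is there so a test can confirm the RTL fixtures really contain right-to-left text instead of trusting the label.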

3

Functional Testing

  1. Run the regression suite against the RC build
    • Trigger the suite from CI against the tagged release candidate (e.g., v2024.45.0-rc.1), not against latest main. Capture the run ID and any quarantined-test results separately so the report is reproducible.

  2. Execute exploratory testing on new features
    • Time-box charters at 60-90 minutes per tester per feature. Take notes in a session log; file any unexpected behavior as a defect even if it doesn't violate a written AC — the AC may be incomplete.

  3. Record overall functional test result
    • Mark Pass only if every required case passed and any quarantined failures were investigated. Mark Pass-with-issues when a SEV3 was found and the release captain has agreed to ship with a follow-up ticket.
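The rule in step 3 can be written down as a small decision function. This is a sketch under assumptions: the checklist names Pass and Pass-with-issues only, so the `FAIL` outcome and the three input counts are hypothetical names introduced here for illustration.

```python
from enum import Enum


class Result(Enum):
    PASS = "Pass"
    PASS_WITH_ISSUES = "Pass-with-issues"
    FAIL = "Fail"  # assumed label; the checklist does not name the failing outcome


def overall_result(
    required_failures: int,
    uninvestigated_quarantined_failures: int,
    accepted_sev3_count: int,
) -> Result:
    """Apply the functional sign-off rule to three counts from the test run."""
    # Any failed required case, or a quarantined failure nobody looked at, rules out Pass.
    if required_failures > 0 or uninvestigated_quarantined_failures > 0:
        return Result.FAIL
    # SEV3 defects the release captain agreed to ship with a follow-up ticket.
    if accepted_sev3_count > 0:
        return Result.PASS_WITH_ISSUES
    return Result.PASS


if __name__ == "__main__":
    print(overall_result(0, 0, 1).value)
```

Keeping the quarantined failures as a separate input makes it harder to mark Pass while a quarantined test is hiding a real regression.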

4

Non-Functional Testing

  1. Run k6 load test against staging
    • Hit 2x peak prod RPS for 15 minutes against the critical user paths. Watch p95 and p99 latency, error rate, and DB connection pool saturation in Datadog. A green load test where p99 doubles is still a fail — eyeball the dashboard, don't trust the exit code.

  2. Run SAST and dependency scans
    • Trigger Semgrep / CodeQL on the RC sha and review Snyk or Dependabot output for new high or critical findings. Any new CVE rated CVSS 7.0 or higher (High or Critical) blocks the release unless the AppSec lead signs off on a documented exception.

  3. Validate WCAG 2.1 AA on changed screens
    • Run axe-core in CI against each changed route; manually keyboard-navigate critical paths. The European Accessibility Act (applicable from 28 June 2025) makes this non-optional for in-scope products and services sold into the EU.
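Step 1 warns that a load test can exit green while p99 latency doubles. One way to make that check explicit is to compare the run against the prior release's numbers, sketched below. The function name and inputs are hypothetical, the 2x ratio comes from the "p99 doubles" wording, and the 1% error-rate ceiling is an assumed placeholder, not a value from this checklist.

```python
from typing import List, Tuple


def load_test_verdict(
    exit_code: int,
    baseline_p99_ms: float,
    run_p99_ms: float,
    error_rate: float,
    max_p99_ratio: float = 2.0,    # "p99 doubles is still a fail"
    max_error_rate: float = 0.01,  # assumed ceiling; set your own
) -> Tuple[bool, List[str]]:
    """Return (passed, reasons). A zero exit code alone is not enough to pass."""
    if baseline_p99_ms <= 0:
        raise ValueError("baseline_p99_ms must be positive")

    reasons: List[str] = []
    if exit_code != 0:
        reasons.append(f"load tool exited with code {exit_code}")
    ratio = run_p99_ms / baseline_p99_ms
    if ratio >= max_p99_ratio:
        reasons.append(f"p99 is {ratio:.2f}x baseline")
    if error_rate > max_error_rate:
        reasons.append(f"error rate {error_rate:.2%} exceeds {max_error_rate:.2%}")
    return (not reasons, reasons)


if __name__ == "__main__":
    # Hypothetical numbers: the tool exited 0, but p99 went from 400 ms to 800 ms.
    print(load_test_verdict(0, 400.0, 800.0, 0.0))
```

This does not replace looking at the dashboard; it only keeps a doubled p99 from passing silently on the exit code.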

5

Integration and Interoperability Testing

  1. Test third-party API integrations end-to-end
    • Hit the sandbox endpoints for Stripe, Auth0/Okta, Segment, and any other third-party service used in the changed code paths. Confirm webhooks deliver and signatures verify; a silent webhook failure won't show in the regression suite.

  2. Verify message queue and event flow
    • Publish synthetic events through SQS / Kafka / RabbitMQ and confirm consumers process them with the expected schema. Check the DLQ count before and after — silent schema drift surfaces here first.

  3. Run database migration against a staging clone
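    • Apply the migration to a clone seeded with prod-sized data, time it, and confirm reversibility. ADD COLUMN with a default on a 50M-row table is a classic foot-gun: on Postgres 11 and later a constant default is a metadata-only change, but a volatile default (for example clock_timestamp()) still rewrites the whole table under an ACCESS EXCLUSIVE lock, and the deploy stalls. Even the fast path can queue behind a long-running transaction while it waits for that lock. Capture wall-clock duration in the data field.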
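For the "signatures verify" check in step 1, the sketch below shows the general shape of HMAC-based webhook verification. It is a generic illustration, not the scheme of any provider named above: each provider defines its own header format and signed payload, so a real test should use that provider's SDK or documented procedure. The function names, secret, and payload are hypothetical.

```python
import hashlib
import hmac


def sign(secret: bytes, payload: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the raw request body."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify(secret: bytes, payload: bytes, received_signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign(secret, payload)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), received_signature.encode())


if __name__ == "__main__":
    secret = b"sandbox-signing-secret"  # hypothetical value
    body = b'{"event": "invoice.paid"}'  # hypothetical payload
    signature = sign(secret, body)
    print(verify(secret, body, signature))
    print(verify(secret, b'{"event": "tampered"}', signature))
```

In an integration test, the useful negative cases are a tampered body and a wrong secret: both should be rejected, and the rejection should show up in logs or metrics so it is not silent.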

6

Defect Reporting and Management

  1. File defects in Jira with reproduction steps
    • Each ticket needs build sha, browser/OS, exact repro steps, expected vs actual, and a screenshot or HAR. Tickets without a sha get bounced — the first thing engineering will ask is which build it reproduced on.

  2. Triage defects by severity and impact
    • SEV1 = data loss or auth bypass, blocks release. SEV2 = critical path broken, blocks release unless workaround exists. SEV3 = ships with a follow-up. Triage with the release captain and a PM in the room — severity is a business call, not a QA call.

  3. Confirm release-blocker defect status
    • If any SEV1/SEV2 remains open, the release does not promote — the team cuts a new RC. If all blockers are closed or accepted, proceed to closure.

  4. Cut a new release candidate build
    • Tag the next RC (e.g., v2024.45.0-rc.2), notify the release captain in #engineering, and restart functional regression against the new sha. Do not cherry-pick fixes onto the prior RC tag — re-cut from the release branch.

  5. Retest fixed defects on the new build
    • Run the original repro steps verbatim and add a regression case to the automated suite so the same defect cannot ship again. Close the ticket only after the regression case is committed.
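The severity rules in step 2 and the gate in step 3 can be sketched as a filter over open defects. The `Defect` shape and ticket keys are hypothetical, and `has_workaround` is assumed to mean a workaround that was accepted in triage with the release captain and PM, not merely one QA is aware of.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Defect:
    key: str            # hypothetical ticket key, e.g. "QA-101"
    severity: str       # "SEV1", "SEV2", or "SEV3"
    is_open: bool = True
    has_workaround: bool = False  # assumed: workaround accepted in triage


def release_blockers(defects: Iterable[Defect]) -> List[str]:
    """Return the keys of open defects that block promotion."""
    blockers: List[str] = []
    for defect in defects:
        if not defect.is_open:
            continue
        if defect.severity == "SEV1":
            blockers.append(defect.key)  # always blocks
        elif defect.severity == "SEV2" and not defect.has_workaround:
            blockers.append(defect.key)  # blocks unless a workaround exists
        # SEV3 ships with a follow-up ticket and never blocks here.
    return blockers


if __name__ == "__main__":
    open_defects = [
        Defect("QA-101", "SEV1"),
        Defect("QA-102", "SEV2", has_workaround=True),
        Defect("QA-103", "SEV3"),
    ]
    print(release_blockers(open_defects))
```

An empty result means the run can proceed to closure; a non-empty one means a new RC is cut and the listed tickets are retested on it.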

7

Test Closure and Sign-Off

  1. Publish the test summary report
    • Post to Confluence: cases run, pass rate, defects opened by severity, accepted-with-follow-up tickets, performance deltas vs prior release. Link the CI run, the load test dashboard, and the migration timing.

  2. Capture QA sign-off for release promotion
    • QA lead signs off only after the summary is reviewed by the release captain. This signature is the SOC 2 change-management artifact auditors ask for; do not skip it even on a quiet release.

  3. Hold the test-cycle retrospective
    • Walk through what slipped past the suite, which charters surfaced the most defects, and where flaky tests masked real failures. File action items in Jira with named owners; an action item without an owner does not get done.
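The headline numbers for the test summary report in step 1 (cases run, pass rate, defects by severity) can be computed with a few lines, sketched below. The input shapes are assumptions: a flat list of per-case outcomes where "pass" marks a passing case, and a flat list of severity labels for defects opened during the cycle.

```python
from collections import Counter
from typing import Dict, List


def summarize(case_results: List[str], defect_severities: List[str]) -> Dict[str, object]:
    """Compute cases run, pass rate, and defect counts by severity."""
    total = len(case_results)
    passed = sum(1 for outcome in case_results if outcome == "pass")
    # Guard against an empty run so the report never divides by zero.
    pass_rate_pct = round(100.0 * passed / total, 1) if total else 0.0
    return {
        "cases_run": total,
        "passed": passed,
        "pass_rate_pct": pass_rate_pct,
        "defects_by_severity": dict(Counter(defect_severities)),
    }


if __name__ == "__main__":
    # Hypothetical cycle data.
    print(summarize(["pass", "pass", "fail", "pass"], ["SEV3", "SEV2", "SEV3"]))
```

Performance deltas, accepted-with-follow-up tickets, and the links to the CI run and dashboards still need to be added by hand or pulled from their own sources.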