Bug Tracking and Resolution Checklist

Bug Identification

    Confirm the issue reproduces consistently or document the specific conditions that trigger it (browser version, user role, feature-flag state, tenant). Heisenbugs that only appear once get deprioritized — invest the time now to find a reliable repro recipe.

    Record the exact steps, expected vs. actual behavior, and attach Sentry/Datadog trace IDs, browser console output, network HAR, or a Loom screen recording. The point is that the assignee should not have to come back asking "what did you click?"

    SEV1 = production outage or data loss affecting many customers; SEV2 = major feature broken with no workaround; SEV3 = minor feature or workaround exists; SEV4 = cosmetic. SEV1 triggers the hotfix path and pages the on-call engineer immediately rather than going through normal triage.

Bug Reporting

    Use the Bug template — title formatted as "[Area] short description", labels for affected component and environment, link to any related Sentry issue. Paste the repro steps and trace IDs from the prior step rather than describing them again.

    Check the affected service in Backstage or the CODEOWNERS file to find the owning team. Don't drop bugs into a generic engineering queue — they sit there. If ownership is genuinely ambiguous, tag the platform team rather than guessing.

    Page the on-call engineer in PagerDuty, open an incident in Incident.io or FireHydrant, and post in #incidents with the customer impact summary. Update the public status page within 15 minutes if customer-facing. The hotfix branch comes off the latest production tag, not main.

Triage and Prioritization

    Tech lead or triage rotation reviews accuracy and completeness. Common rejections: missing repro steps, user error mistaken for a bug, behavior matches the spec. Push back to the reporter rather than letting an incomplete ticket sit in the backlog.

    Search Sentry by error fingerprint and Jira by component label. If duplicate, link with "is duplicated by" and close the new ticket; bump the priority on the original if this report adds new affected customers.

    Weight severity, number of customers affected, and revenue impact against current sprint capacity. P0 jumps the queue; P1 lands in the current sprint; P2 in the next; P3 goes to the backlog with no commitment. Don't promise a sprint slot you can't deliver — it erodes trust with support.

Resolution and Code Review

    Engineering manager assigns based on code familiarity and current WIP. For SEV1 hotfixes, pull from the on-call rotation rather than interrupting the assigned sprint owner.

    The test should fail on the unfixed branch and pass after the fix — that's the proof the fix actually addresses the reported behavior. Skipping this is the most common reason bugs come back six months later after someone refactors the surrounding code.

    PR title references the ticket ID. CI must be green — do not bypass branch protection because "the failing test is flaky." Reviewer reads the diff against the regression test, not just LGTM. Keep PRs under ~400 lines; split if larger.

    Run the original repro steps against staging — not just the regression test. Confirm no new errors appear in Sentry for related components in the 30 minutes after deploy.

QA Sign-Off

    Trigger the Playwright or Cypress suites for the components touched by the fix. Cross-browser matrix matters when the bug was browser-specific. Investigate any new flakes rather than retrying until green.

    Bug fixes regress neighbors more often than they regress themselves. Walk the related user paths the customer would touch — login, the affected feature, the settings panel that calls the same API.

    QA lead records the verification result, attaches the test run output, and signs off. "Verified with notes" means it ships but with a follow-up ticket attached. "Failed" sends it back to the assignee with new repro details.

Deploy and Post-Resolution

    Squash-merge the PR, let GitHub Actions or ArgoCD ship the build, and watch the canary error rate and p99 latency for 30 minutes before promoting to 100%. For SEV1 hotfixes, cherry-pick to the release branch and tag a patch version (e.g., v2024.45.1).

    Add an entry under the Fixed section with the customer-facing description (not the engineering one). Link the ticket. If the bug was reported via support, write the entry so the support team can copy it into the customer reply verbatim.

    Reply on the original support tickets, post in #customer-support with the deployed version, and resolve the status page incident if one was opened. Customers who took the time to file a bug deserve to know it shipped.

    Required for any SEV1 or SEV2 within 5 business days of resolution. Focus on contributing factors — missing alert, gap in test coverage, deployment automation hole — not individuals. File action items in Jira with named owners and due dates; review at the next sprint retro.

Use this template in Manifestly

Start a Free 14 Day Trial
Use Slack? Start your trial with one click

Related Software Development Checklists

Ready to take control of your recurring tasks?

Start Free 14-Day Trial


Use Slack? Sign up with one click

With Slack