Software Development Plan Checklist

Project Planning

    The product manager and tech lead draft a one-page scope: in-scope deliverables, explicit non-goals, target launch quarter, and rough budget. Non-goals are the part teams skip — name them. Park the doc in Confluence or Notion and link it from the project's Jira/Linear epic.

    Assign RACI roles: name the responsible engineer(s), the accountable EM or tech lead, consulted stakeholders (security, design, support), and informed parties (sales, CS). Ambiguity here is the most common reason architecture review stalls four weeks in.

    Tier the project by T-shirt size so downstream review depth scales with risk. The size determines whether full architecture review and threat modeling are required, or whether the team can move faster with lighter ceremony.

    Break the work into milestones with target dates: design complete, alpha (internal), beta (gradual rollout), GA. Add 20% buffer for unknowns and explicitly account for on-call rotations and holidays in the timeline.
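
    The buffer arithmetic can be sketched as follows. This is a minimal illustration, not a scheduling tool: the milestone estimates and start date are invented, durations are calendar days, and holidays and on-call weeks are not modeled, so those still have to be added by hand.

```python
from datetime import date, timedelta

def plan_milestones(start, estimates, buffer_pct=20):
    """Return {milestone: target date}, padding each estimate by buffer_pct.

    `estimates` is an ordered list of (name, calendar_days) pairs; each
    milestone starts when the previous one ends.
    """
    targets = {}
    cursor = start
    for name, days in estimates:
        # Integer ceiling so a partial buffer day rounds up to a whole day.
        padded = (days * (100 + buffer_pct) + 99) // 100
        cursor = cursor + timedelta(days=padded)
        targets[name] = cursor
    return targets

# Hypothetical estimates, in calendar days.
plan = plan_milestones(
    date(2025, 1, 6),
    [("design complete", 10), ("alpha", 20), ("beta", 15), ("GA", 10)],
)
for name, target in plan.items():
    print(f"{name}: {target.isoformat()}")
```

    Padding each milestone separately, rather than adding one lump at the end, keeps the intermediate dates honest as well as the GA date.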

Requirements and User Stories

    Run a working session with PM, support, and at least one customer-facing engineer. Capture jobs-to-be-done, not solutions. Watch for the gotcha where sales-team requests get translated into spec without validation against actual customer behavior.

    Spell out the non-functional requirements (NFRs): target SLOs (p95 latency, availability), expected request volume, data retention, accessibility (WCAG 2.1 AA if public-sector or EU), and any regulatory scope (HIPAA, PCI, GDPR, SOC 2). NFRs missed at planning surface as production fires.

    Each story has Given/When/Then acceptance criteria a QA engineer can test against. Avoid stories larger than 5 points — break them down. "As a user I want X" without acceptance criteria is a known anti-pattern that pushes ambiguity into code review.
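
    One way Given/When/Then criteria might map onto automated tests, assuming pytest-style test functions. The `apply_coupon` function and the `SAVE10` code are hypothetical stand-ins for whatever the story actually covers.

```python
def apply_coupon(total_cents: int, code: str) -> int:
    """Hypothetical code under test: SAVE10 takes 10% off, other codes do nothing."""
    if code == "SAVE10":
        return total_cents - total_cents // 10
    return total_cents

def test_valid_coupon_reduces_total():
    # Given a cart totalling $50.00
    total_cents = 5000
    # When the shopper applies the coupon SAVE10
    result = apply_coupon(total_cents, "SAVE10")
    # Then the total drops by 10%
    assert result == 4500

def test_unknown_coupon_leaves_total_unchanged():
    # Given a cart totalling $50.00
    total_cents = 5000
    # When the shopper applies a code that does not exist
    result = apply_coupon(total_cents, "BOGUS")
    # Then the total is unchanged
    assert result == 5000
```

    Each acceptance criterion becomes one test, so a QA engineer can check the story against the criteria line by line.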

    Pick one prioritization method, MoSCoW or weighted shortest job first (WSJF), and stick with it. With MoSCoW, tag stories Must / Should / Could; the Coulds become the cut list when timeline pressure hits in week 6.
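
    For teams that choose weighted shortest job first, the ranking can be sketched as below. It assumes the common formulation in which cost of delay is the sum of business value, time criticality, and risk reduction, divided by job size. The stories and scores are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Story:
    name: str
    business_value: int
    time_criticality: int
    risk_reduction: int
    job_size: int  # relative effort, e.g. story points

    @property
    def wsjf(self) -> float:
        """Cost of delay divided by job size; higher means do it sooner."""
        cost_of_delay = self.business_value + self.time_criticality + self.risk_reduction
        return cost_of_delay / self.job_size

# Hypothetical backlog with relative scores.
backlog = [
    Story("SSO login", 8, 5, 3, 8),
    Story("CSV export", 3, 2, 1, 2),
    Story("Audit log", 5, 8, 8, 3),
]

ranked = sorted(backlog, key=lambda s: s.wsjf, reverse=True)
for story in ranked:
    print(f"{story.name}: {story.wsjf:.2f}")
```

    Small jobs with a high cost of delay float to the top, which is why a large but valuable story can rank below a quick win.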

Design and Architecture

    One ADR per significant choice: data store, API style (REST/gRPC/GraphQL), sync vs. async messaging, deployment topology. Include the alternatives considered and why they were rejected — future-you will thank present-you.

    Run a STRIDE walk-through with AppSec on the data flow diagram. Identify trust boundaries, authn/authz checkpoints, and PII flows. If the project handles PHI or cardholder data, this is a hard gate, not a checkbox.

    Present the ADR and threat model to senior engineers from at least two other teams. The goal is to surface blast-radius concerns and hidden coupling — not to debate code style. Capture decisions and unresolved questions in writing.

Development Environment Setup

    Create the repo with CODEOWNERS, branch protection on main (required reviews, required status checks, no direct push), Dependabot or Renovate enabled, and secret scanning on. Trunk-based or GitHub Flow — pick one and document it in the README.

    Set up a GitHub Actions or CircleCI workflow that runs lint, type check, unit tests, SAST (Semgrep or CodeQL), and SCA (Snyk or Dependabot) on every PR. Mark these jobs as required status checks so branch protection enforces them. Fix or quarantine flaky tests promptly: "Just rerun, it's flaky" becomes a habit if you let it.

    Use Terraform or Pulumi modules so the three environments (dev, staging, prod) are structurally identical. Staging gets prod-shaped, anonymized data. Wire up secrets from AWS Secrets Manager or Vault; never bake secrets into images or commit .env files.

Quality and Compliance Gates

    Set coverage targets per layer: unit tests (Jest/pytest/RSpec) at 70-80% coverage, integration tests covering the critical paths, and e2e tests (Playwright or Cypress) only on the top 5-10 user journeys. e2e-heavy pyramids become flaky and ignored.

    If the project is in SOC 2 scope, confirm what change-management, access-review, and vulnerability-management evidence Vanta/Drata/Secureframe needs. Catching this at planning beats retrofitting audit trails six months in.

    External pentest engagements book 4-6 weeks out. Schedule against the beta milestone, not GA — you need time to remediate findings before customer traffic.

    Write k6 or Locust scenarios that drive 2x expected peak load against staging. Capture p50/p95/p99 latency and error rate; agree on pass/fail thresholds before running, not after seeing results.
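
    Fixing the thresholds up front can be as simple as committing them next to the scenario and checking results mechanically. The sketch below is tool-agnostic and assumes the latency samples have already been exported from the load test. The threshold values and sample data are hypothetical, and the percentile uses the nearest-rank method, which may differ slightly from what k6 or Locust report.

```python
def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling of pct% of n
    return ordered[rank - 1]

# Hypothetical limits, agreed and committed before the run.
THRESHOLDS = {"p95_ms": 300, "p99_ms": 800, "error_rate": 0.01}

def evaluate(latencies_ms, errors, total):
    """Return (observed metrics, passed) for one load-test run."""
    observed = {
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        "p99_ms": percentile(latencies_ms, 99),
        "error_rate": errors / total,
    }
    passed = all(observed[key] <= limit for key, limit in THRESHOLDS.items())
    return observed, passed

# Invented samples: a healthy run, and one with a slow tail.
healthy_run, healthy_ok = evaluate(list(range(1, 101)), errors=0, total=100)
slow_run, slow_ok = evaluate([100] * 90 + [900] * 10, errors=0, total=100)
print(healthy_ok, slow_ok)
```

    Because the limits live in code, nobody can loosen them quietly after seeing a disappointing p99.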

Observability and Incident Readiness

    Pick 2-3 SLIs (request success rate, p95 latency, freshness for async pipelines) and corresponding SLOs. The error budget makes the deploy-vs-stabilize tradeoff a number, not a debate.
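
    The error budget arithmetic for a request-based SLI, as a rough sketch. The 99.9% target, the request counts, and the 25% deploy-freeze policy are all assumptions chosen for illustration.

```python
def budget_remaining(slo: float, total: int, failed: int) -> float:
    """Fraction of the error budget left in the window; negative means the SLO is breached."""
    allowed_failures = (1 - slo) * total
    if allowed_failures <= 0:
        raise ValueError("an SLO of 100% (or an empty window) leaves no error budget")
    return 1 - failed / allowed_failures

# Hypothetical 30-day window: 99.9% success SLO, 1,000,000 requests, 250 failures.
# The budget is about 1,000 failed requests, so about three quarters of it remains.
remaining = budget_remaining(0.999, 1_000_000, 250)

# Hypothetical policy: keep shipping features while more than 25% of the budget is left.
can_deploy = remaining > 0.25
print(f"budget remaining: {remaining:.0%}, deploy allowed: {can_deploy}")
```

    The policy threshold is the part worth agreeing on in advance; the calculation itself is trivial.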

    Build a RED dashboard (rate, errors, duration) per service, plus the four golden signals (latency, traffic, errors, saturation). Alerts should page only on user-impact symptoms; alerting on CPU at 80% is a known noise generator. Tie PagerDuty routing to the on-call rotation.

    One runbook per alert: what it means, how to triage, how to mitigate, how to escalate. Living doc in the repo or Confluence — link it from the alert payload itself so the on-call doesn't have to grep at 3am.

Release and Handover

    Feature-flagged dark launch, then canary at 5% / 25% / 50% / 100%. Rollback plan covers reversing DB migrations (or marking them forward-only and shipping a kill switch instead) and pinning to the previous container image. Test the rollback path before the launch, not during.
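
    The staged canary with an automatic rollback can be sketched as a small control loop. The `set_traffic` and `healthy` callables are hypothetical hooks into whatever the deployment tooling and monitoring provide; here they are faked so the rollback path can be exercised before launch.

```python
from typing import Callable

STAGES = (5, 25, 50, 100)  # percent of traffic on the new version

def run_canary(set_traffic: Callable[[int], None], healthy: Callable[[int], bool]) -> bool:
    """Step through the stages; on the first unhealthy stage, drop to 0% and stop."""
    for pct in STAGES:
        set_traffic(pct)
        if not healthy(pct):
            set_traffic(0)  # rollback: all traffic returns to the previous version
            return False
    return True

# Fake hooks that record what the loop did.
history = []
completed = run_canary(history.append, lambda pct: True)

failed_history = []
rolled_out = run_canary(failed_history.append, lambda pct: pct < 50)  # degrade at 50%
print(history, completed)
print(failed_history, rolled_out)
```

    A real health check would wait for a soak period and compare error rate and latency against the SLOs at each stage; this sketch only shows the control flow.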

    Hold a go/no-go meeting with the EM, AppSec, support, and on-call. Walk the launch readiness review (LRR) checklist: SLOs configured, runbook published, pentest findings closed, support trained, rollback verified, status page templated.

    Support engineers shadow a debugging session, get the runbook walkthrough, and confirm Zendesk macros for the top 3 expected ticket categories. Name the long-term maintenance owner explicitly — orphaned services are the source of every "why is this on fire" 18 months from now.
