Refactoring Checklist

Steps an engineering team runs to plan, execute, and ship a code refactor safely — from identifying smells through test coverage, technique selection, peer review, and gradual rollout.

6 sections, 24 steps

Scope and Smell Assessment

Define the refactor scope and exit criteria
- The tech lead names the modules, packages, or services in scope and the concrete signal that marks the work done: cyclomatic complexity below a threshold, a SonarQube quality gate passing, or a specific class extracted. Vague scopes ("clean up the auth code") drift into multi-sprint rewrites.
Identify code smells with static analysis
- Run SonarQube, Code Climate, or Semgrep against the target paths. Flag long parameter lists (typically 4+), god classes, duplicated blocks, and high cyclomatic complexity. Capture the baseline metrics so post-refactor improvement is measurable.
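As a rough illustration of what such a check does, the sketch below uses Python's standard `ast` module to flag functions with long parameter lists. It is not how SonarQube, Code Climate, or Semgrep work internally; the function name `find_long_parameter_lists`, the `SAMPLE` source, and the threshold constant are assumptions made for this example.

```python
import ast

# Assumed threshold, taken from the "typically 4+" guideline; tune per team.
MAX_PARAMS = 4


def find_long_parameter_lists(source: str, max_params: int = MAX_PARAMS) -> list[tuple[str, int]]:
    """Return (function_name, parameter_count) for functions at or above the threshold.

    Counts positional-only, regular, and keyword-only parameters.
    *args and **kwargs are ignored; `self` is counted like any other parameter.
    """
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args
            count = len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
            if count >= max_params:
                flagged.append((node.name, count))
    return flagged


# Hypothetical code under analysis.
SAMPLE = '''
def create_user(name, email, street, city, postcode):
    pass

def greet(name):
    pass
'''

baseline = find_long_parameter_lists(SAMPLE)
```

Storing the returned list alongside the tool reports gives a small, diffable baseline to compare against after the refactor.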
Map duplicated code across the module
- Use jscpd, PMD CPD, or SonarQube duplication detection to surface copy-paste blocks. Resist the urge to DRY accidental duplicates — if two services happen to validate emails the same way today, extracting a shared helper couples them forever.
Review responsibilities for class extraction
- Look for classes that mix concerns — persistence with business logic, transport with domain rules. Note candidates for Extract Class, Move Method, or splitting along seams suggested by Michael Feathers' "Working Effectively with Legacy Code."

Test Coverage and Safety Net

Measure baseline coverage on the target code
- Run the suite with coverage (Istanbul, JaCoCo, coverage.py, SimpleCov) and capture line and branch coverage for the files in scope. Below 70% branch coverage is a red flag — write characterization tests before touching anything.
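The 70% rule can be turned into a small gate. The sketch below assumes the branch counts have already been extracted from the coverage report into a plain dictionary; the function name, the file paths, and the input shape are hypothetical, and no real coverage tool's API is used.

```python
# Assumed threshold from the step above: below 70% branch coverage is a red flag.
BRANCH_THRESHOLD = 0.70


def files_below_threshold(
    branch_counts: dict[str, tuple[int, int]],
    threshold: float = BRANCH_THRESHOLD,
) -> list[str]:
    """Return the in-scope files whose branch coverage is below the threshold.

    branch_counts maps a file path to (covered_branches, total_branches).
    Files with no branches at all are skipped, since there is nothing to cover.
    """
    flagged = []
    for path, (covered, total) in sorted(branch_counts.items()):
        if total == 0:
            continue
        if covered / total < threshold:
            flagged.append(path)
    return flagged


# Hypothetical numbers for three files in scope.
report = {
    "auth/session.py": (8, 10),   # 80% branch coverage
    "auth/token.py": (3, 10),     # 30% branch coverage
    "auth/constants.py": (0, 0),  # no branches
}
needs_characterization_tests = files_below_threshold(report)
```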
Write characterization tests for legacy paths
- For undertested code, pin the current behavior with characterization tests before changing anything, including behavior that looks like a bug. The point is to detect behavior change during refactoring, not to validate that the existing behavior is correct.
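A minimal sketch of a characterization test, using Python's `unittest`. The function `legacy_shipping_fee` and its odd boundary behavior are invented for illustration; the expected values are whatever the code returns today, recorded as-is.

```python
import unittest


def legacy_shipping_fee(weight_kg: float) -> float:
    """Hypothetical legacy function with a suspicious boundary case."""
    if weight_kg < 5:
        return 4.0
    if weight_kg > 5:
        return 9.0
    return 0.0  # exactly 5 kg ships free: probably a bug, but it is current behavior


class ShippingFeeCharacterization(unittest.TestCase):
    def test_pins_current_outputs(self):
        # Values observed by running the existing code, not taken from a spec.
        observed = {1: 4.0, 4.99: 4.0, 5: 0.0, 5.01: 9.0, 20: 9.0}
        for weight, fee in observed.items():
            with self.subTest(weight=weight):
                self.assertEqual(legacy_shipping_fee(weight), fee)
```

The free-shipping case at exactly 5 kg is pinned on purpose. If the refactor changes it, the test fails and the team decides explicitly whether to fix the bug in a separate change.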
Run the full unit and integration suite green
- The full suite must pass on main before the refactor branch is cut. Quarantine flaky tests explicitly — don't start a refactor with a yellow CI status, because regressions will hide in the noise.
Confirm e2e smoke tests pass in staging
- Run Playwright, Cypress, or Selenium critical-path suites against staging on the pre-refactor sha. This is the reference recording — any divergence after the refactor is a regression to investigate, not a flake to retry.

Apply Refactoring Techniques

Cut a feature branch from main
- Branch from a known-green main sha. Plan to keep the branch short-lived (under a week) and rebase daily — long-lived refactor branches collide with concurrent feature work and become a separate merge project.
Apply Extract Method to long functions
- Break methods longer than ~30 lines or with multiple levels of abstraction into named helpers. Use the IDE's refactoring tool (IntelliJ, Rider, VS Code) rather than hand-editing — automated extraction preserves call sites and signatures correctly.
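To make the mechanics concrete, here is a small before/after sketch of Extract Method. The invoice example and all function names are hypothetical; in practice the IDE performs the extraction, and the example only shows the shape of the result.

```python
def invoice_total_before(lines, tax_rate):
    """Before: summing and tax calculation are mixed into one function."""
    subtotal = 0.0
    for quantity, unit_price in lines:
        subtotal += quantity * unit_price
    tax = subtotal * tax_rate
    return round(subtotal + tax, 2)


def subtotal_of(lines):
    """Extracted helper: sum of quantity * unit price over all lines."""
    subtotal = 0.0
    for quantity, unit_price in lines:
        subtotal += quantity * unit_price
    return subtotal


def tax_on(amount, tax_rate):
    """Extracted helper: tax owed on an amount."""
    return amount * tax_rate


def invoice_total_after(lines, tax_rate):
    """After: the function reads at a single level of abstraction."""
    subtotal = subtotal_of(lines)
    return round(subtotal + tax_on(subtotal, tax_rate), 2)
```

Both versions perform the same arithmetic in the same order, so a characterization test comparing them on recorded inputs should pass unchanged.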
Rename symbols for intention-revealing names
- Replace `data`, `tmp`, `mgr`, `helper` with names that say what the value represents in the domain. Use the IDE rename refactor so references update across the repo. Commit renames as their own small commits to keep diffs reviewable.
Replace conditional logic with polymorphism where it pays
- Switch statements over a type tag that recur in three or more places are good candidates. Don't reach for polymorphism for a single 3-branch conditional — the indirection cost outweighs the win. See Fowler's "Replace Conditional with Polymorphism" for the canonical mechanics.
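A compact sketch of the transformation, assuming a hypothetical shape model. `area_by_tag` stands in for a type-tag conditional that recurs across the codebase; the class hierarchy is one possible replacement, not the only one.

```python
import math
from abc import ABC, abstractmethod


def area_by_tag(shape: dict) -> float:
    """Before: a conditional over a type tag, repeated wherever shapes are handled."""
    if shape["kind"] == "circle":
        return math.pi * shape["radius"] ** 2
    if shape["kind"] == "rectangle":
        return shape["width"] * shape["height"]
    raise ValueError(f"unknown shape kind: {shape['kind']}")


class Shape(ABC):
    """After: each variant owns its behavior, so callers no longer branch on a tag."""

    @abstractmethod
    def area(self) -> float:
        ...


class Circle(Shape):
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius ** 2


class Rectangle(Shape):
    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height


def total_area(shapes: list[Shape]) -> float:
    """Call sites dispatch through the interface instead of switching on kind."""
    return sum(shape.area() for shape in shapes)
```

The payoff shows up when a new variant arrives: it is added as one class instead of a new branch in every conditional.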
Introduce parameter objects for long argument lists
- Methods with 4+ parameters that move together (date range, address fields, pagination) get bundled into a value object or DTO. This often surfaces a missing domain concept that has been hiding in plain sight.
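One way this can look for a date range, sketched with a frozen dataclass. `DateRange`, `orders_between`, and the order tuples are hypothetical names and data for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateRange:
    """Parameter object replacing separate start and end arguments."""

    start: date
    end: date

    def __post_init__(self) -> None:
        # The invariant now lives in one place instead of at every call site.
        if self.end < self.start:
            raise ValueError("end must not be before start")

    def contains(self, day: date) -> bool:
        return self.start <= day <= self.end


def orders_between(orders: list[tuple[str, date]], period: DateRange) -> list[str]:
    """Return the ids of orders placed within the period (inclusive)."""
    return [order_id for order_id, placed_on in orders if period.contains(placed_on)]


# Hypothetical data: (order_id, placed_on)
orders = [
    ("A1", date(2024, 1, 5)),
    ("A2", date(2024, 2, 10)),
    ("A3", date(2024, 3, 1)),
]
q1_window = DateRange(date(2024, 1, 1), date(2024, 2, 29))
```

Once the value object exists, behavior such as `contains` tends to migrate onto it, which is often where the missing domain concept becomes visible.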
Commit in small, reviewable increments
- Each commit should compile and pass tests on its own. Keep commits under ~400 lines of diff so the eventual PR can be reviewed without reviewer fatigue. Mechanical refactors (renames, extractions) go in separate commits from behavior changes.

Performance Verification

Compare benchmarks against the pre-refactor baseline
- For hot paths, run JMH, BenchmarkDotNet, pytest-benchmark, or k6 load tests on both shas and compare. Watch for the classic refactor regression: a clean Extract Method that re-allocates inside a tight loop, or a polymorphic dispatch that defeats inlining.
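The comparison itself can be reduced to a simple rule. The sketch below assumes timings in milliseconds have already been collected on both shas by whichever benchmark tool is in use; the function name, the sample numbers, and the 5% tolerance are assumptions for illustration, and the tolerance should be chosen from the observed noise of the benchmark.

```python
import statistics


def is_regression(baseline_ms: list[float], candidate_ms: list[float], tolerance: float = 0.05) -> bool:
    """Return True when the candidate median is slower than the baseline median
    by more than the tolerance (0.05 means 5%).

    Medians are used so a single noisy run does not decide the outcome.
    """
    baseline = statistics.median(baseline_ms)
    candidate = statistics.median(candidate_ms)
    return candidate > baseline * (1 + tolerance)


# Hypothetical timings from repeated runs on the pre- and post-refactor shas.
pre_refactor = [10.0, 10.2, 9.9]
post_refactor = [12.0, 12.1, 11.8]
regressed = is_regression(pre_refactor, post_refactor)
```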
Profile and optimize the regressed paths
- Use a sampling profiler — async-profiler, py-spy, dotnet-trace, Chrome DevTools — to find the hot frame. Common culprits after a refactor: N+1 queries from a newly extracted repository method, boxed allocations, lost short-circuit evaluation.
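The N+1 pattern is easy to reproduce in miniature. The sketch below uses an in-memory fake repository that counts calls in place of a real database; the class, its methods, and the data are all hypothetical.

```python
class FakeOrderRepository:
    """In-memory stand-in for a database-backed repository; counts queries issued."""

    def __init__(self, rows: list[tuple[int, int]]) -> None:
        self._rows = rows  # (customer_id, amount)
        self.query_count = 0

    def find_by_customer(self, customer_id: int) -> list[tuple[int, int]]:
        self.query_count += 1
        return [row for row in self._rows if row[0] == customer_id]

    def find_by_customers(self, customer_ids: list[int]) -> list[tuple[int, int]]:
        self.query_count += 1
        wanted = set(customer_ids)
        return [row for row in self._rows if row[0] in wanted]


def totals_per_customer_n_plus_one(repo: FakeOrderRepository, customer_ids: list[int]) -> dict[int, int]:
    """Calls the extracted single-customer method in a loop: one query per customer."""
    return {
        cid: sum(amount for _, amount in repo.find_by_customer(cid))
        for cid in customer_ids
    }


def totals_per_customer_batched(repo: FakeOrderRepository, customer_ids: list[int]) -> dict[int, int]:
    """Fetches all rows in one query, then groups in memory."""
    totals = {cid: 0 for cid in customer_ids}
    for cid, amount in repo.find_by_customers(customer_ids):
        totals[cid] += amount
    return totals
```

Both functions return the same totals, so unit tests on the result alone will not catch the difference; only the query count, or a profiler, reveals it.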

Code Review and Documentation

Open the PR with before/after context
- The PR description names the smell, the technique applied, and the metric improved (cyclomatic complexity from 24 to 8, duplication from 12% to 2%). Link the baseline SonarQube run and the new one. Reviewers shouldn't have to reverse-engineer the intent.
Request review from CODEOWNERS of touched files
- GitHub auto-requests CODEOWNERS, but page the relevant tech lead in Slack for refactors that span ownership boundaries. Avoid LGTM-without-reading on large refactors — split the PR if a reviewer can't realistically read it in one sitting.
Address review feedback and re-run CI
- Push fixup commits during review; squash on merge so the history reads as a coherent change. Confirm the GitHub Actions or CircleCI pipeline goes green on the final sha — required status checks should make a red merge impossible, but re-verify.
Update ADRs and module-level docs
- If the refactor changes architectural seams — a new boundary, a new abstraction, a removed pattern — file an ADR (Architecture Decision Record) in the repo's docs/adr directory. Update the README or Backstage entry so the next engineer doesn't repeat the question.

Merge and Gradual Rollout

Merge to main behind a feature flag if behavior changed
- Pure refactors merge directly. Anything that changes observable behavior — even subtly, like a new code path — goes behind a LaunchDarkly, Unleash, or Statsig flag with a named owner and a cleanup ticket.
Register the feature flag with an owner and expiry
- Set a 30-day expiry on the flag and assign a named owner. Stale flags are a chronic source of dead code paths and exploding test matrices — the quarterly flag review should already be on the calendar.
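A flag registry entry can be as small as the sketch below. It does not use the LaunchDarkly, Unleash, or Statsig APIs; `FlagRecord`, its fields, and the sample values are hypothetical and only model the owner-plus-expiry rule.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class FlagRecord:
    """Registry entry for a temporary feature flag."""

    key: str
    owner: str
    created: date
    ttl_days: int = 30  # the 30-day expiry from the step above

    @property
    def expires(self) -> date:
        return self.created + timedelta(days=self.ttl_days)

    def is_stale(self, today: date) -> bool:
        """A flag is stale once today is past its expiry date."""
        return today > self.expires


def stale_flags(flags: list[FlagRecord], today: date) -> list[str]:
    """Return the keys of flags that should already have been cleaned up."""
    return [flag.key for flag in flags if flag.is_stale(today)]


# Hypothetical registry contents.
registry = [
    FlagRecord("auth-session-refactor", "alice", date(2024, 1, 1)),
    FlagRecord("billing-extract-class", "bob", date(2024, 2, 20)),
]
```

Running `stale_flags` in CI or during the quarterly review turns the expiry from a convention into a check.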
Canary the deploy to 5% of traffic
- Shift 5% of traffic with Argo Rollouts, Spinnaker, or a manually weighted ALB target group. Watch error rate and p99 latency on the Datadog or Grafana dashboard for at least 30 minutes before promoting. Roll back on any divergence from baseline.
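The promote-or-roll-back rule can be written down explicitly so it is not decided by feel during the deploy. The sketch below is a plain decision function, not an Argo Rollouts or Spinnaker configuration; the names and the tolerance values that define "divergence" are assumptions that each team needs to set for itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Metrics read off the dashboard for one deployment."""

    error_rate: float  # fraction of failed requests, e.g. 0.002 means 0.2%
    p99_ms: float      # 99th percentile latency in milliseconds


def canary_decision(
    baseline: Snapshot,
    canary: Snapshot,
    observed_minutes: int,
    max_error_delta: float = 0.001,  # assumed tolerance, not from the checklist
    max_p99_ratio: float = 1.10,     # assumed tolerance, not from the checklist
    min_minutes: int = 30,           # observation window from the step above
) -> str:
    """Return "rollback", "wait", or "promote" for the canary."""
    errors_diverged = canary.error_rate - baseline.error_rate > max_error_delta
    latency_diverged = canary.p99_ms > baseline.p99_ms * max_p99_ratio
    if errors_diverged or latency_diverged:
        return "rollback"  # divergence triggers rollback even inside the window
    if observed_minutes < min_minutes:
        return "wait"
    return "promote"


# Hypothetical readings.
stable = Snapshot(error_rate=0.002, p99_ms=180.0)
healthy_canary = Snapshot(error_rate=0.002, p99_ms=185.0)
failing_canary = Snapshot(error_rate=0.010, p99_ms=185.0)
```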
Promote to 100% and monitor for 24 hours
- Confirm Sentry error rate is flat and the RED-method dashboard (rate, errors, duration) shows no drift. After 24 hours of steady state, schedule the flag-cleanup ticket so the temporary scaffolding doesn't outlive its usefulness.
