Refactoring Checklist
Steps an engineering team runs to plan, execute, and ship a code refactor safely — from identifying smells through test coverage, technique selection, peer review, and gradual rollout.
Scope and Smell Assessment
-
Define the refactor scope and exit criteria
The tech lead names the modules, packages, or services in scope and the concrete signal that marks the work done: cyclomatic complexity below a threshold, a SonarQube quality gate passing, or a specific class extracted. Vague scopes ("clean up the auth code") drift into multi-sprint rewrites.
-
Identify code smells with static analysis
Run SonarQube, Code Climate, or Semgrep against the target paths. Flag long parameter lists (typically 4+), god classes, duplicated blocks, and high cyclomatic complexity. Capture the baseline metrics so post-refactor improvement is measurable.
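As a rough illustration of what the long-parameter-list rule checks, here is a minimal sketch in Python using only the standard `ast` module. It is not how SonarQube, Code Climate, or Semgrep implement the rule; the function name `long_parameter_functions` and the sample source are hypothetical.

```python
import ast

MAX_PARAMS = 3  # flag functions with 4 or more parameters, matching the "typically 4+" guideline


def long_parameter_functions(source: str, max_params: int = MAX_PARAMS) -> list[tuple[str, int]]:
    """Return (function name, parameter count) for every function over the limit."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args
            # Count positional-only, regular, and keyword-only parameters.
            # Note: for methods this count includes `self`.
            count = len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
            if count > max_params:
                flagged.append((node.name, count))
    return flagged


# Hypothetical code under analysis.
SAMPLE = """
def create_user(name, email, street, city, postcode):
    pass

def greet(name):
    pass
"""

print(long_parameter_functions(SAMPLE))  # prints [('create_user', 5)]
```

Saving this output before the refactor gives one small, comparable baseline number alongside the tool's own report.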
-
Map duplicated code across the module
Use jscpd, PMD CPD, or SonarQube duplication detection to surface copy-paste blocks. Resist the urge to DRY accidental duplicates — if two services happen to validate emails the same way today, extracting a shared helper couples them forever.
-
Review responsibilities for class extraction
Look for classes that mix concerns — persistence with business logic, transport with domain rules. Note candidates for Extract Class, Move Method, or splitting along seams suggested by Michael Feathers' "Working Effectively with Legacy Code."
Test Coverage and Safety Net
-
Measure baseline coverage on the target code
Run the suite with coverage (Istanbul, JaCoCo, coverage.py, SimpleCov) and capture line and branch coverage for the files in scope. Below 70% branch coverage is a red flag — write characterization tests before touching anything.
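The 70% check can be scripted once the coverage numbers are extracted. The sketch below assumes the branch counts have already been pulled out of the coverage report into a plain dictionary; that input shape, the file paths, and the function name are hypothetical and not the output format of any of the tools named above.

```python
BRANCH_COVERAGE_FLOOR = 0.70  # threshold taken from the checklist step above


def files_needing_characterization(
    branch_coverage: dict[str, tuple[int, int]],
    floor: float = BRANCH_COVERAGE_FLOOR,
) -> list[str]:
    """branch_coverage maps file path -> (covered branches, total branches)."""
    flagged = []
    for path, (covered, total) in sorted(branch_coverage.items()):
        # Files with no branches at all cannot fall below the floor.
        if total and covered / total < floor:
            flagged.append(path)
    return flagged


# Hypothetical numbers for the files in scope.
in_scope = {
    "auth/session.py": (12, 40),   # 30% branch coverage
    "auth/tokens.py": (45, 50),    # 90% branch coverage
    "auth/__init__.py": (0, 0),    # no branches
}

print(files_needing_characterization(in_scope))  # prints ['auth/session.py']
```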
-
Write characterization tests for legacy paths
For undertested code, pin the current behavior with characterization tests before changing anything — even bugs. The point is to detect behavior change during refactoring, not to validate that the existing behavior is correct.
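A characterization test might look like the sketch below. The function `legacy_shipping_cost` and its recorded values are hypothetical; the point is that the expected numbers come from running the existing code, including the suspicious boundary at 10 kg.

```python
def legacy_shipping_cost(weight_kg: float) -> float:
    """Hypothetical legacy function whose 10 kg boundary looks like an off-by-one bug."""
    if weight_kg < 10:  # arguably should be <=, but this is what production does today
        return 5.0
    return 5.0 + (weight_kg - 10) * 0.5 + 2.0


def test_characterizes_current_shipping_cost():
    # Expected values were recorded from the current implementation, not derived from a spec.
    recorded = {0: 5.0, 9.99: 5.0, 10: 7.0, 20: 12.0}
    for weight, expected in recorded.items():
        assert abs(legacy_shipping_cost(weight) - expected) < 1e-9, weight
```

If the refactor changes the result at exactly 10 kg, this test fails, and the team then decides deliberately whether to keep or fix that behavior in a separate change.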
-
Run the full unit and integration suite green
The full suite must pass on main before the refactor branch is cut. Quarantine flaky tests explicitly — don't start a refactor with a yellow CI status, because regressions will hide in the noise.
-
Confirm e2e smoke tests pass in staging
Run Playwright, Cypress, or Selenium critical-path suites against staging on the pre-refactor sha. This is the reference recording — any divergence after the refactor is a regression to investigate, not a flake to retry.
Apply Refactoring Techniques
-
Cut a feature branch from main
Branch from a known-green main sha. Plan to keep the branch short-lived (under a week) and rebase daily — long-lived refactor branches collide with concurrent feature work and become a separate merge project.
-
Apply Extract Method to long functions
Break methods longer than ~30 lines or with multiple levels of abstraction into named helpers. Use the IDE's refactoring tool (IntelliJ, Rider, VS Code) rather than hand-editing — automated extraction preserves call sites and signatures correctly.
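A small before/after sketch of Extract Method, written by hand in Python for illustration. The invoice example and all names are hypothetical; an IDE refactoring would produce the same shape mechanically.

```python
def invoice_total_before(items: list[tuple[float, int]], tax_rate: float) -> float:
    # Before: subtotal, discount, and tax logic are interleaved in one body.
    subtotal = 0.0
    for price, quantity in items:
        subtotal += price * quantity
    if subtotal > 100:
        subtotal = subtotal - subtotal * 0.1
    return round(subtotal * (1 + tax_rate), 2)


def subtotal_of(items: list[tuple[float, int]]) -> float:
    """Extracted helper: sum of price times quantity."""
    return sum(price * quantity for price, quantity in items)


def apply_bulk_discount(subtotal: float) -> float:
    """Extracted helper: 10% off orders above 100."""
    return subtotal - subtotal * 0.1 if subtotal > 100 else subtotal


def invoice_total_after(items: list[tuple[float, int]], tax_rate: float) -> float:
    # After: the top-level function reads at a single level of abstraction.
    discounted = apply_bulk_discount(subtotal_of(items))
    return round(discounted * (1 + tax_rate), 2)


order = [(40.0, 2), (15.5, 2)]
assert invoice_total_before(order, 0.2) == invoice_total_after(order, 0.2)
```

The final assertion is the kind of equivalence check the characterization tests should already cover; the extraction itself changes no behavior.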
-
Rename symbols for intention-revealing names
Replace `data`, `tmp`, `mgr`, `helper` with names that say what the value represents in the domain. Use the IDE rename refactor so references update across the repo. Commit renames as their own small commits to keep diffs reviewable.
-
Replace conditional logic with polymorphism where it pays
Switch statements over a type tag that recur in three or more places are good candidates. Don't reach for polymorphism for a single 3-branch conditional — the indirection cost outweighs the win. See Fowler's "Replace Conditional with Polymorphism" for the canonical mechanics.
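To make the trade-off concrete, here is a minimal sketch of the transformation in Python. The billing plans, fee formulas, and names are hypothetical; it assumes the same branch on `plan` recurs at several call sites, which is what justifies the extra classes.

```python
from abc import ABC, abstractmethod


def monthly_fee_before(plan: str, seats: int) -> int:
    # Before: every call site that cares about the plan repeats this branch.
    if plan == "free":
        return 0
    elif plan == "team":
        return 10 * seats
    elif plan == "enterprise":
        return 500 + 8 * seats
    raise ValueError(f"unknown plan: {plan}")


class Plan(ABC):
    @abstractmethod
    def monthly_fee(self, seats: int) -> int:
        """Return the monthly fee in whole currency units."""


class FreePlan(Plan):
    def monthly_fee(self, seats: int) -> int:
        return 0


class TeamPlan(Plan):
    def monthly_fee(self, seats: int) -> int:
        return 10 * seats


class EnterprisePlan(Plan):
    def monthly_fee(self, seats: int) -> int:
        return 500 + 8 * seats


# The type tag is resolved once, at the boundary; the rest of the code calls methods.
PLANS: dict[str, Plan] = {
    "free": FreePlan(),
    "team": TeamPlan(),
    "enterprise": EnterprisePlan(),
}


def monthly_fee_after(plan: str, seats: int) -> int:
    return PLANS[plan].monthly_fee(seats)


print(monthly_fee_after("enterprise", 12))  # prints 596
```

With only `monthly_fee` varying, the original function is arguably simpler; the classes start to pay off once a second and third plan-dependent behavior would otherwise add their own copies of the branch.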
-
Introduce parameter objects for long argument lists
Methods with 4+ parameters that move together (date range, address fields, pagination) get bundled into a value object or DTO. This often surfaces a missing domain concept that has been hiding in plain sight.
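One possible shape for a parameter object, sketched in Python with a frozen dataclass. `DateRange`, `orders_in_range`, and the sample orders are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateRange:
    """Value object bundling the start and end dates that always travel together."""
    start: date
    end: date

    def __post_init__(self) -> None:
        # The invariant now lives in one place instead of at every call site.
        if self.end < self.start:
            raise ValueError("end must not precede start")

    def contains(self, day: date) -> bool:
        return self.start <= day <= self.end


def orders_in_range(orders: list[tuple[str, date]], period: DateRange) -> list[str]:
    """Return the ids of orders placed within the period (inclusive)."""
    return [order_id for order_id, placed_on in orders if period.contains(placed_on)]


q1 = DateRange(date(2024, 1, 1), date(2024, 3, 31))
orders = [("A1", date(2024, 2, 14)), ("B2", date(2024, 4, 2))]
print(orders_in_range(orders, q1))  # prints ['A1']
```

The `contains` method is an example of behavior that tends to migrate onto the new type once the concept has a name.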
-
Commit in small, reviewable increments
Each commit should compile and pass tests on its own. Keep commits under ~400 lines of diff so the eventual PR can be reviewed without reviewer fatigue. Mechanical refactors (renames, extractions) go in separate commits from behavior changes.
Performance Verification
-
Compare benchmarks against the pre-refactor baseline
For hot paths, run JMH, BenchmarkDotNet, pytest-benchmark, or k6 load tests on both shas and compare. Watch for the classic refactor regression: a clean Extract Method that re-allocates inside a tight loop, or a polymorphic dispatch that defeats inlining.
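The allocation-in-a-loop regression can be reproduced in miniature. The sketch below uses Python's standard `timeit` module rather than any of the tools named above; the functions and data are hypothetical, and the timings it prints will vary by machine.

```python
import timeit

ALLOWED = frozenset(range(0, 1000, 7))  # built once, outside the hot loop


def count_allowed_hoisted(values) -> int:
    # Pre-refactor shape: membership test against a set built once.
    return sum(1 for v in values if v in ALLOWED)


def count_allowed_reallocating(values) -> int:
    # Post-refactor shape: a tidy extracted helper that rebuilds the set on every call.
    def is_allowed(v: int) -> bool:
        allowed = set(range(0, 1000, 7))
        return v in allowed

    return sum(1 for v in values if is_allowed(v))


def best_of(fn, values, repeat: int = 3, number: int = 5) -> float:
    """Best total time in seconds over `repeat` runs of `number` calls each."""
    return min(timeit.repeat(lambda: fn(values), repeat=repeat, number=number))


if __name__ == "__main__":
    data = range(1000)
    # Same answer from both versions; only the cost differs.
    assert count_allowed_hoisted(data) == count_allowed_reallocating(data)
    print(f"hoisted:      {best_of(count_allowed_hoisted, data):.4f}s")
    print(f"reallocating: {best_of(count_allowed_reallocating, data):.4f}s")
```

Both functions pass the same tests, which is why this class of regression only shows up when the two shas are benchmarked side by side.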
-
Profile and optimize the regressed paths
Use a sampling profiler — async-profiler, py-spy, dotnet-trace, Chrome DevTools — to find the hot frame. Common culprits after a refactor: N+1 queries from a newly extracted repository method, boxed allocations, lost short-circuit evaluation.
Code Review and Documentation
-
Open the PR with before/after context
The PR description names the smell, the technique applied, and the metric improved (cyclomatic complexity from 24 to 8, duplication from 12% to 2%). Link the baseline SonarQube run and the new one. Reviewers shouldn't have to reverse-engineer the intent.
-
Request review from CODEOWNERS of touched files
GitHub auto-requests CODEOWNERS, but page the relevant tech lead in Slack for refactors that span ownership boundaries. Avoid LGTM-without-reading on large refactors — split the PR if a reviewer can't realistically read it in one sitting.
-
Address review feedback and re-run CI
Push fixup commits during review; squash on merge so the history reads as a coherent change. Confirm the GitHub Actions or CircleCI pipeline goes green on the final sha — required status checks should make a red merge impossible, but re-verify.
-
Update ADRs and module-level docs
If the refactor changes architectural seams — a new boundary, a new abstraction, a removed pattern — file an ADR (Architecture Decision Record) in the repo's docs/adr directory. Update the README or Backstage entry so the next engineer doesn't repeat the question.
Merge and Gradual Rollout
-
Merge to main behind a feature flag if behavior changed
Pure refactors merge directly. Anything that changes observable behavior — even subtly, like a new code path — goes behind a LaunchDarkly, Unleash, or Statsig flag with a named owner and a cleanup ticket.
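A minimal sketch of gating a behavior change behind a flag. The in-memory `FLAGS` dictionary stands in for a real flag service and is an assumption, as are the flag name and pricing functions; it does not reflect the LaunchDarkly, Unleash, or Statsig SDKs.

```python
# Hypothetical in-memory flag store; a real setup would query the flag service instead.
FLAGS: dict[str, bool] = {"use-refactored-pricing": False}


def flag_enabled(name: str) -> bool:
    # Unknown flags default to off, so a missing flag keeps the legacy path.
    return FLAGS.get(name, False)


def price_legacy(amount_cents: int) -> int:
    # Old path: 20% tax, truncated to whole cents.
    return amount_cents + amount_cents * 20 // 100


def price_refactored(amount_cents: int) -> int:
    # New path: 20% tax, rounded half up. This is an observable behavior change.
    return amount_cents + (amount_cents * 20 + 50) // 100


def price(amount_cents: int) -> int:
    if flag_enabled("use-refactored-pricing"):
        return price_refactored(amount_cents)
    return price_legacy(amount_cents)


print(price(1003))  # prints 1203 while the flag is off
FLAGS["use-refactored-pricing"] = True
print(price(1003))  # prints 1204 once the flag is on
```

Keeping both paths callable behind one switch is what makes the later canary step reversible without a redeploy.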
-
Register the feature flag with an owner and expiry
Set a 30-day expiry on the flag and assign a named owner. Stale flags are a chronic source of dead code paths and exploding test matrices — the quarterly flag review should already be on the calendar.
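The expiry rule is easy to check mechanically if flags are tracked in a registry. The sketch below assumes a hypothetical in-repo record type, `FlagRecord`; hosted flag services keep this metadata in their own systems.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class FlagRecord:
    """Hypothetical registry entry for one feature flag."""
    name: str
    owner: str
    created_on: date
    ttl_days: int = 30  # the 30-day expiry from the step above

    @property
    def expires_on(self) -> date:
        return self.created_on + timedelta(days=self.ttl_days)


def expired_flags(records: list[FlagRecord], today: date) -> list[str]:
    """Names of flags whose expiry date has passed."""
    return [record.name for record in records if record.expires_on < today]


registry = [
    FlagRecord("use-refactored-pricing", "alice", date(2024, 1, 1)),
    FlagRecord("new-session-store", "bob", date(2024, 2, 1)),
]

print(expired_flags(registry, today=date(2024, 2, 15)))  # prints ['use-refactored-pricing']
```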
-
Canary the deploy to 5% of traffic
Shift 5% of traffic to the new version with Argo Rollouts, Spinnaker, or a manually configured ALB weighted target group. Watch error rate and p99 latency on the Datadog or Grafana dashboard for at least 30 minutes before promoting. Roll back on any divergence from baseline.
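"Any divergence from baseline" needs a tolerance in practice, since metrics are noisy. The sketch below shows one way to encode the decision; the `Snapshot` type and both thresholds are assumptions to be tuned per service, not values from Argo Rollouts, Spinnaker, Datadog, or Grafana.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Metrics for one deployment over the observation window."""
    error_rate: float       # fraction of failed requests, e.g. 0.002 for 0.2%
    p99_latency_ms: float


def should_roll_back(
    baseline: Snapshot,
    canary: Snapshot,
    max_error_delta: float = 0.001,   # assumed tolerance: +0.1 percentage points
    max_latency_ratio: float = 1.10,  # assumed tolerance: p99 up to 10% slower
) -> bool:
    error_regressed = canary.error_rate - baseline.error_rate > max_error_delta
    latency_regressed = canary.p99_latency_ms > baseline.p99_latency_ms * max_latency_ratio
    return error_regressed or latency_regressed


baseline = Snapshot(error_rate=0.002, p99_latency_ms=180.0)

print(should_roll_back(baseline, Snapshot(0.0021, 185.0)))  # prints False
print(should_roll_back(baseline, Snapshot(0.002, 240.0)))   # prints True
```

Writing the thresholds down before the canary starts avoids negotiating them while staring at a dashboard.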
-
Promote to 100% and monitor for 24 hours
Confirm Sentry error rate is flat and the RED-method dashboard (rate, errors, duration) shows no drift. After 24 hours of steady state, pick up the flag-cleanup ticket and remove the flag so the temporary scaffolding doesn't outlive its usefulness.