Performance Optimization Checklist

A structured pass for engineering teams investigating production latency, throughput, or cost regressions — covering profiling, database tuning, infrastructure, frontend, network, and ongoing monitoring. Run quarterly or whenever SLO burn-rate alerts cross the 14-day threshold.

6 sections, 26 steps
1. Baseline and Profiling

  1. Capture current p50, p95, p99 latency
    • Pull latency from Datadog / New Relic / Honeycomb for the last 14 days, broken out by the top 5 endpoints by traffic. Record traffic volume and error rate alongside latency — optimizing p99 on a low-traffic endpoint is rarely worth the effort.

  2. Run a CPU and memory profile
    • Use pprof, py-spy, async-profiler, or the equivalent for your runtime. Capture a 60-second profile under representative load — not idle. Flame graphs from production traffic catch hot paths that synthetic benchmarks miss.

  3. Identify the top three bottlenecks
    • Rank by share of total wall-clock time, not by intuition. Common surprises: JSON serialization, ORM N+1 queries, synchronous logging in hot paths, gzip on already-compressed payloads.

  4. Define target SLOs for the optimization pass
    • Write the goal as a measurable target on a specific SLI: "p95 checkout latency under 400ms over a 7-day window." Vague targets like "make it faster" produce vague results. Get sign-off from the product manager so the team isn't optimizing in the dark.
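
As a rough illustration of steps 1 and 4, the sketch below computes p50/p95/p99 from raw latency samples and checks one percentile against a target. It assumes per-request latencies in milliseconds have already been exported from your observability tool. The function names (latency_percentiles, meets_target) and the sample data are hypothetical, and the percentile method used here may differ slightly from what your vendor's dashboard reports.

```python
from statistics import quantiles


def latency_percentiles(samples_ms):
    """Return p50, p95 and p99 for a list of latency samples (milliseconds)."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns 99 cut points: index 0 is p1, index 98 is p99.
    cuts = quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def meets_target(samples_ms, percentile="p95", target_ms=400.0):
    """True when the chosen percentile is strictly under the target."""
    return latency_percentiles(samples_ms)[percentile] < target_ms


if __name__ == "__main__":
    # Made-up samples for one endpoint; replace with exported data.
    samples = [120, 135, 150, 180, 210, 240, 310, 390, 450, 900]
    print(latency_percentiles(samples))
    print("meets p95 < 400 ms:", meets_target(samples))
```

Running the same function over the baseline window and again after the change keeps the before/after comparison on identical definitions.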

2. Application Code Optimization

  1. Eliminate N+1 queries on hot paths
    • Use the ORM's eager loading (Rails includes, Django select_related/prefetch_related, SQLAlchemy joinedload) or write the join explicitly. Add Bullet (Rails) or nplusone (Python) to the test suite so CI fails builds that introduce new N+1 queries.

  2. Replace synchronous calls with batched or async patterns
    • Loops that hit external APIs sequentially are a common culprit. Batch through DataLoader, asyncio.gather, or a background job (Sidekiq, Celery, BullMQ) when the work doesn't need to block the request.

  3. Add memoization for expensive pure functions
    • Cache results of pure deterministic functions in-process (functools.lru_cache, lodash memoize) or in Redis for cross-process. Watch the cache key — including a mutable object as a key is a bug magnet.

  4. Open a code-review PR with profiler-driven changes
    • Keep PRs under ~400 lines so reviewers actually read them. Include before/after profile snapshots in the PR description. Route through CODEOWNERS to engineers who own the touched modules.
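
To make step 2 concrete, here is a minimal sketch of replacing a sequential loop of external calls with asyncio.gather, bounded by a semaphore so the downstream service is not flooded. fetch_price is a hypothetical stand-in that sleeps instead of making a real network call, and the concurrency limit of 10 is an arbitrary assumption to tune against the provider's rate limits.

```python
import asyncio


async def fetch_price(item_id):
    """Hypothetical external call; the sleep simulates network latency."""
    await asyncio.sleep(0.01)
    return {"id": item_id, "price": item_id * 10}


async def fetch_sequential(item_ids):
    """The slow pattern: each call waits for the previous one to finish."""
    return [await fetch_price(i) for i in item_ids]


async def fetch_concurrent(item_ids, limit=10):
    """Issue the calls concurrently, at most `limit` in flight at a time."""
    sem = asyncio.Semaphore(limit)

    async def bounded(item_id):
        async with sem:
            return await fetch_price(item_id)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(bounded(i) for i in item_ids))


if __name__ == "__main__":
    results = asyncio.run(fetch_concurrent(range(20)))
    print(len(results), "results")
```

If the work does not need to finish inside the request at all, a background job is usually the simpler fix than concurrency in the request path.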

3. Database Performance

  1. Review slow query logs from the last 7 days
    • In Postgres enable pg_stat_statements; in MySQL enable the slow query log with long_query_time=0.5. Sort by total time (calls × mean), not single-call time: a query running 100k times at 50ms costs about 5,000 seconds in total, far more than one running once at 8s.

  2. Run EXPLAIN ANALYZE on the worst offenders
    • Look for sequential scans on large tables, hash joins where a nested-loop with an index would be faster, and rows-removed-by-filter numbers that dwarf rows-returned. Paste plans into explain.depesz.com or pev2 for readability.

  3. Add indexes concurrently in production
    • Use CREATE INDEX CONCURRENTLY in Postgres and ALGORITHM=INPLACE or pt-online-schema-change in MySQL. Plain CREATE INDEX in Postgres takes a lock that blocks writes (though not reads) on the table for the duration of the build, which will stall a busy table.

      Verify with EXPLAIN that the planner actually picks up the new index — composite index column order matters.

  4. Tune connection pooling and statement timeouts
    • For Postgres, use PgBouncer in transaction-pooling mode in front of the application. Set a server-side statement_timeout so a runaway query can't tie up a connection forever. Size the pool starting from the common rule of thumb (cores × 2) + effective_spindle_count, then adjust from measurements rather than setting it arbitrarily large.

  5. Plan a backfill or denormalization if needed
    • If a query joins five tables on every request, a denormalized read model or materialized view may be the right fix. Backfill in batches with sleeps so replication lag stays under your alert threshold; do not run a single transaction over millions of rows.
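
The batched backfill in step 5 can be sketched as follows. This example uses the standard-library sqlite3 module purely as a stand-in for the production database, and the orders table with its quantity, unit_price, and total_cached columns is hypothetical. It also assumes no concurrent writers touch the rows mid-batch, and uses a fixed pause where a real backfill would more likely check replication lag before continuing.

```python
import sqlite3
import time


def backfill_in_batches(conn, batch_size=1000, pause_s=0.1):
    """Fill orders.total_cached in small committed batches; return rows updated."""
    last_id = 0
    updated = 0
    while True:
        # Keyset pagination: take the next slice of unfilled rows by primary key.
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM orders "
            "WHERE id > ? AND total_cached IS NULL "
            "ORDER BY id LIMIT ?",
            (last_id, batch_size),
        )]
        if not ids:
            return updated
        cur = conn.execute(
            "UPDATE orders SET total_cached = quantity * unit_price "
            "WHERE id BETWEEN ? AND ? AND total_cached IS NULL",
            (ids[0], ids[-1]),
        )
        conn.commit()          # one short transaction per batch
        updated += cur.rowcount
        last_id = ids[-1]
        time.sleep(pause_s)    # give replicas time to catch up
```

Advancing by last_id means the loop always terminates, even if some rows cannot be filled and remain NULL.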

4. Caching and Infrastructure

  1. Add a Redis or Memcached layer for hot reads
    • Cache at the read-model boundary, not inside the ORM. Use cache-aside with a sensible TTL plus explicit invalidation on writes. The two hard problems are still cache invalidation and naming things — write down your key schema before shipping.

  2. Configure CDN caching for static assets
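    • Set long Cache-Control: max-age values for fingerprinted assets, short or no-cache for HTML. Verify with curl -I against CloudFront / Cloudflare / Fastly that the second request reports a cache hit; the header differs by provider (for example X-Cache on CloudFront and Fastly, CF-Cache-Status on Cloudflare).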

  3. Right-size autoscaling and HPA thresholds
    • An HPA target of 80% CPU is often too high for latency-sensitive services, because saturation creates queueing well before CPU pegs. Consider scaling on p95 latency or request concurrency instead. Ensure the cluster has burst headroom for sudden scale-ups.

  4. Decide whether to escalate to a load test
    • If the regression severity is moderate or severe, run a k6 or Locust load test in staging before shipping. For minor optimizations where the service is still within its SLO, a production canary is usually sufficient.

  5. Run k6 load test against staging
    • Replay realistic traffic shapes from production logs — synthetic uniform load misses tail behaviors. Hold for at least 30 minutes at target RPS to surface memory leaks, GC pauses, and connection-pool exhaustion. Compare p95/p99 against the SLO defined earlier.
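
The cache-aside pattern from step 1 (TTL plus explicit invalidation on writes) might look roughly like this. To stay self-contained, the sketch uses an in-process dictionary as a stand-in for Redis or Memcached; the TTLCache class, the user_profile_key helper, and the key schema are all hypothetical names chosen for illustration.

```python
import time


class TTLCache:
    """In-process stand-in for Redis/Memcached: maps key -> (value, expires_at)."""

    def __init__(self, ttl_s=60.0, clock=time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock      # injectable so expiry can be tested
        self._store = {}

    def get_or_load(self, key, loader):
        """Cache-aside read: return a fresh cached value, else load and store it."""
        entry = self._store.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]                      # hit
        value = loader()                         # miss or expired: read source of truth
        self._store[key] = (value, self._clock() + self._ttl_s)
        return value

    def invalidate(self, key):
        """Call on every write to the underlying record."""
        self._store.pop(key, None)


def user_profile_key(user_id):
    """Written-down key schema: entity, id, view, schema version."""
    return f"user:{user_id}:profile:v1"
```

Putting a version suffix in the key is one way to roll out a changed payload shape without serving stale entries; it is a design choice, not a requirement of the pattern.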

5. Frontend and Network

  1. Audit Core Web Vitals with Lighthouse
    • Run Lighthouse against production from a throttled mobile profile, not your fiber-connected laptop. Track LCP, INP, and CLS — Google's thresholds are 2.5s / 200ms / 0.1. Field data from CrUX is more honest than lab data.

  2. Code-split and lazy-load heavy routes
    • Use dynamic import() in webpack/Vite/Next.js for non-critical routes and components. Defer below-the-fold images with native loading="lazy". Watch the bundle analyzer: a whole-library import such as all of lodash landing in the main chunk is a common find.

  3. Enable HTTP/2 or HTTP/3 at the edge
    • Most CDNs support HTTP/3 (QUIC) with a flag flip. The win is largest on lossy mobile networks. Confirm with curl --http3 or Chrome DevTools Network panel that the protocol is actually negotiated.

  4. Reduce TLS and DNS handshake time
    • Enable OCSP stapling and TLS 1.3, and allow 0-RTT resumption only for requests that are safe to replay (such as idempotent GETs), since early data is not protected against replay. Use a low-latency DNS provider (Route53, NS1, Cloudflare) and short TTLs only where needed; over-aggressive TTLs hurt more than they help.
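
For step 1, a small check like the one below can flag which Core Web Vitals miss the thresholds quoted above (2.5s / 200ms / 0.1). It is only a sketch: the metric key names (lcp_s, inp_ms, cls) and the failing_vitals function are hypothetical, and it assumes you have already extracted the three values from Lighthouse or CrUX output yourself.

```python
# Thresholds as quoted in the checklist: LCP 2.5 s, INP 200 ms, CLS 0.1.
GOOD_THRESHOLDS = {"lcp_s": 2.5, "inp_ms": 200.0, "cls": 0.1}


def failing_vitals(measured):
    """Return {metric: (value, threshold)} for every metric above its threshold."""
    missing = GOOD_THRESHOLDS.keys() - measured.keys()
    if missing:
        raise KeyError(f"missing metrics: {sorted(missing)}")
    return {
        name: (measured[name], limit)
        for name, limit in GOOD_THRESHOLDS.items()
        if measured[name] > limit
    }


if __name__ == "__main__":
    # Made-up field values for one page.
    page = {"lcp_s": 3.2, "inp_ms": 180, "cls": 0.05}
    for name, (value, limit) in failing_vitals(page).items():
        print(f"{name}: {value} exceeds {limit}")
```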

6. Validation and Ongoing Monitoring

  1. Canary deploy at 5 percent traffic
    • Watch error rate and p99 on the canary fleet for at least 30 minutes before progressing. Have the rollback path tested and a kill-switch feature flag ready — performance regressions sometimes only show up under real production cardinality.

  2. Compare post-deploy metrics against the baseline
    • Pull the same dashboard captured at baseline. The optimization is only "done" if the SLI defined in the kickoff actually moved — declaring victory off a green CI run is how regressions ship.

  3. Add a perf regression test to CI
    • Pin the optimization in place with a benchmark gate in GitHub Actions / GitLab CI. Options include bundle-size budgets (size-limit), Lighthouse CI, or a k6 smoke test on every PR; pick the layer where the win was achieved.

  4. Open a follow-up incident review
    • Schedule a blameless review focused on why the regression went undetected for so long — alert tuning, missing dashboard, or a coverage gap in synthetic checks. Track action items to closure in Jira / Linear with named owners.
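
Steps 2 and 3 both come down to comparing current numbers against the recorded baseline. A minimal sketch of such a gate is shown below, assuming lower-is-better metrics stored as plain dictionaries. The function names, the metric names, and the 10% tolerance are hypothetical choices, and a real CI job would load the two dictionaries from whatever your benchmark tool emits.

```python
def find_regressions(baseline, current, tolerance=0.10):
    """Return {metric: fractional_change} for metrics worse than baseline by more than tolerance.

    Assumes lower is better (latency, bundle size) and every baseline value is > 0.
    """
    regressions = {}
    for name, base in baseline.items():
        if name not in current:
            raise KeyError(f"metric missing from current run: {name}")
        if base <= 0:
            raise ValueError(f"baseline for {name} must be positive")
        change = (current[name] - base) / base
        if change > tolerance:
            regressions[name] = change
    return regressions


def gate(baseline, current, tolerance=0.10):
    """Return a process exit code: 0 when within tolerance, 1 on any regression."""
    regressions = find_regressions(baseline, current, tolerance)
    for name, change in regressions.items():
        print(f"REGRESSION {name}: {change:+.1%} vs baseline")
    return 1 if regressions else 0
```

In CI, the return value of gate would typically be passed to sys.exit so the job fails when a metric regresses.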
