Capacity Planning Checklist

Current Resource Assessment

    Export the current device, VM, and SaaS inventory from your RMM (NinjaOne, Datto, ConnectWise Automate) and CMDB (ServiceNow, Hudu, IT Glue). Reconcile against vCenter, Intune, and cloud provider inventories — orphaned VMs and decommissioned hosts still showing as active are the most common discrepancy.
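
    The reconciliation above amounts to a set comparison between what the CMDB records and what the platform reports as running. A minimal sketch, assuming both exports have already been reduced to lists of hostnames; the function name, the sample hostnames, and the result keys are all hypothetical:

```python
def reconcile(cmdb_names, live_names):
    """Compare CMDB records against what the hypervisor or cloud API reports."""
    # Normalize case and whitespace so "SQL01 " and "sql01" match.
    cmdb = {name.strip().lower() for name in cmdb_names}
    live = {name.strip().lower() for name in live_names}
    return {
        # Recorded as active but not running: likely decommissioned.
        "stale_in_cmdb": sorted(cmdb - live),
        # Running but never recorded: likely orphaned VMs.
        "untracked_live": sorted(live - cmdb),
    }


# Hypothetical exports; real ones would come from CSV or API pulls.
cmdb_export = ["SQL01", "web01", "old-file01"]
vcenter_export = ["sql01", "WEB01 ", "test-vm-7"]
report = reconcile(cmdb_export, vcenter_export)
```

    Matching on hostname alone is a simplification; serial numbers or instance IDs are usually a safer key where both systems expose them.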

    Export CPU, memory, storage, and network utilization metrics from PRTG, LogicMonitor, Datadog, or Auvik over a rolling 90-day window. Capture p50, p95, and peak — averages hide saturation events.
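
    To see why averages hide saturation, it can help to compute the summary by hand on a small sample. A minimal sketch using the nearest-rank percentile method; the sample data is made up, and monitoring tools may interpolate percentiles differently:

```python
def percentile(samples, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding float rounding.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def summarize(samples):
    """Return mean, p50, p95, and peak for utilization samples."""
    return {
        "mean": sum(samples) / len(samples),
        "p50": percentile(samples, 50),
        "p95": percentile(samples, 95),
        "peak": max(samples),
    }


# Hypothetical CPU samples (%): mostly quiet, with two saturation events.
cpu = [20] * 18 + [98, 99]
summary = summarize(cpu)
```

    On this sample the mean stays under 30% while p95 and peak both sit near 100%, which is the pattern the step above warns about.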

    Flag any host running above 80% sustained CPU or memory, or above 75% datastore consumption. Also flag VMs idling below 5% — right-sizing reclaims capacity before purchase requests get approved.
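
    The flagging rules above can be expressed as a small filter. This is a sketch under two assumptions not stated in the checklist: p95 over the window stands in for "sustained", and "idling" is read as CPU p95 below 5%. The `HostStats` type and the flag names are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class HostStats:
    name: str
    cpu_p95: float        # percent, over the 90-day window
    mem_p95: float        # percent, over the 90-day window
    datastore_pct: float  # percent of datastore consumed


def flag(host):
    """Return the list of capacity flags that apply to one host or VM."""
    flags = []
    if host.cpu_p95 > 80 or host.mem_p95 > 80:
        flags.append("saturated")
    if host.datastore_pct > 75:
        flags.append("datastore")
    if host.cpu_p95 < 5:
        flags.append("idle")  # right-sizing candidate
    return flags


busy = flag(HostStats("sql01", cpu_p95=91, mem_p95=60, datastore_pct=50))
quiet = flag(HostStats("legacy-app", cpu_p95=2, mem_p95=10, datastore_pct=80))
```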

    Check Veeam, Datto, or Rubrik repository fill rates and projected exhaustion dates. Backup storage runs out faster than primary because of retention growth. A common gotcha is a restore test that fails because a full repository had started skipping jobs.
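
    Where the backup product does not report an exhaustion date, a rough one can be derived from free space and recent growth. A minimal sketch assuming constant daily growth, which understates the risk if retention policies are still ramping up; the function name and figures are illustrative:

```python
from datetime import date, timedelta


def projected_exhaustion(capacity_tb, used_tb, daily_growth_tb, today):
    """Estimate the date a repository fills at a constant growth rate."""
    if daily_growth_tb <= 0:
        return None  # flat or shrinking usage: no exhaustion projected
    days_left = (capacity_tb - used_tb) / daily_growth_tb
    return today + timedelta(days=int(days_left))


# Hypothetical repository: 100 TB total, 80 TB used, growing 0.25 TB/day.
full_on = projected_exhaustion(100, 80, 0.25, date(2025, 1, 1))
```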

    Publish the baseline to IT Glue, Hudu, or Confluence with a date stamp. This becomes the trend anchor for the next quarterly review.

Forecasting Future Demand

    Run linear and seasonal projections against the last 12 months of utilization. Include user-count growth, mailbox growth, and storage consumption rates. Tools like LogicMonitor and Datadog forecast natively; otherwise pull to a spreadsheet.
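
    For teams without native forecasting, the linear part of the projection is an ordinary least-squares line through the monthly values. A minimal sketch assuming one value per month, oldest first; it does not model seasonality, which would need a separate adjustment:

```python
def linear_fit(values):
    """Least-squares slope and intercept for evenly spaced monthly values."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def project(values, months_ahead):
    """Extend the fitted line months_ahead past the last observation."""
    slope, intercept = linear_fit(values)
    return intercept + slope * (len(values) - 1 + months_ahead)


# Hypothetical storage use in TB: 40 TB a year ago, growing 2 TB per month.
history = [40 + 2 * month for month in range(12)]
in_six_months = project(history, 6)
```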

    Talk to product, finance, and ops leadership about hiring plans, M&A activity, new product launches, and any pending data-residency commitments. Headcount-driven endpoint and license forecasts are usually 70% of the next quarter's spend.

    List Windows Server EOL, VMware/Broadcom licensing changes, M365 license SKU shifts, and any planned cloud-region migrations. These reset the capacity baseline and frequently invalidate prior forecasts.

    Model three demand trajectories with explicit assumptions (headcount, retention, M&A). Each scenario should produce a CPU, memory, storage, and license-count projection at 6 and 12 months out.
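
    One way to keep the three trajectories comparable is to drive them all from the same baseline with a different growth assumption each. The sketch below collapses the headcount, retention, and M&A assumptions into a single monthly growth rate per scenario, which is a deliberate simplification; the scenario names, rates, and baseline figures are hypothetical:

```python
# Hypothetical baseline taken from the current resource assessment.
BASELINE = {"cpu_ghz": 400.0, "memory_gb": 2048.0,
            "storage_tb": 120.0, "licenses": 250.0}

# Assumed compound monthly growth per scenario; state these explicitly.
SCENARIOS = {"low": 0.00, "expected": 0.01, "high": 0.03}


def project(baseline, monthly_growth, months):
    """Apply compound monthly growth to every metric in the baseline."""
    factor = (1 + monthly_growth) ** months
    return {metric: round(value * factor, 1)
            for metric, value in baseline.items()}


# One projection per scenario at the 6- and 12-month horizons.
plan = {name: {months: project(BASELINE, growth, months)
               for months in (6, 12)}
        for name, growth in SCENARIOS.items()}
```

    In practice each metric would likely get its own rate, since storage and license counts rarely grow in step.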

Capacity Plan and Thresholds

    Tie each forecast line to a budget code and a named owner. Reference the IT operating budget and any approved capex carryover. Tag items needing CAB review separately from standard adds.

    Configure warning and critical thresholds for CPU, memory, datastore, and bandwidth in PRTG, Datadog, or LogicMonitor. Common pattern: warn at 70%, critical at 85%, with auto-ticket creation in the PSA at critical.
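
    The warn-at-70, critical-at-85 pattern above reduces to a two-level comparison. A minimal sketch of that logic only; the function names are hypothetical, and the actual ticket creation would go through whatever integration the monitoring tool and PSA provide:

```python
def classify(value_pct, warn=70.0, critical=85.0):
    """Map a utilization percentage to ok, warning, or critical."""
    if value_pct >= critical:
        return "critical"
    if value_pct >= warn:
        return "warning"
    return "ok"


def should_open_ticket(value_pct):
    """Only critical readings create a PSA ticket automatically."""
    return classify(value_pct) == "critical"


# Hypothetical datastore readings.
levels = [classify(pct) for pct in (42.0, 70.0, 84.9, 85.0)]
```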

    This step is required only when the forecast flagged an expansion or refresh. Confirm Dell, HPE, and Cisco lead times; current cycles run 8-16 weeks for enterprise gear. For cloud, confirm reserved-instance commitments before quarter-end pricing locks.

    For each scaling action, write the steps: who triggers, what change request is needed, what rollback looks like. Auto-scaling groups in AWS, AKS node pools, and VMware DRS each need their own playbook.

Risk and Continuity

    Confirm backup windows still complete inside SLA and that DR-site capacity can host failover workloads. RTO drift is invisible until you actually fail over — verify against the published business RPO/RTO targets.

    Verify three copies, two media types, one offsite — with at least one immutable (object lock, hardened Linux repo, or air-gapped). Capacity plans that grow primary storage often forget to grow the immutable copy alongside.
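
    The rule above can be checked mechanically once each copy is described by its media type and location. A minimal sketch; the `BackupCopy` type and the sample copies are hypothetical, and whether the production copy counts toward the three is left to the caller:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    media: str       # e.g. "disk", "object", "tape"
    offsite: bool
    immutable: bool


def check_321(copies):
    """Report which parts of the 3-2-1 rule (plus immutability) hold."""
    return {
        "three_copies": len(copies) >= 3,
        "two_media": len({c.media for c in copies}) >= 2,
        "one_offsite": any(c.offsite for c in copies),
        "one_immutable": any(c.immutable for c in copies),
    }


# Hypothetical layout: production, a hardened repo, and object-locked cloud.
copies = [
    BackupCopy("production", "disk", offsite=False, immutable=False),
    BackupCopy("hardened-repo", "disk", offsite=False, immutable=True),
    BackupCopy("cloud-bucket", "object", offsite=True, immutable=True),
]
result = check_321(copies)
```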

    Reconcile Microsoft, VMware, and Oracle license counts against forecasted growth. Vendor audits surface six-figure true-ups when capacity grows ahead of license entitlement — Broadcom's VMware bundles especially.
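
    Reconciling entitlements against the forecast is a per-product subtraction. A minimal sketch; the product names and counts are illustrative, and real entitlement data would come from the vendor portals or a license management tool:

```python
def license_shortfalls(entitled, forecast):
    """Return products whose forecasted need exceeds current entitlement."""
    return {
        product: needed - entitled.get(product, 0)
        for product, needed in forecast.items()
        if needed > entitled.get(product, 0)
    }


# Hypothetical counts at the 12-month horizon.
entitled = {"M365 E3": 250, "vSphere cores": 128}
forecast = {"M365 E3": 280, "vSphere cores": 112, "Oracle DB cores": 8}
gaps = license_shortfalls(entitled, forecast)
```

    A product missing from the entitlement list is treated as zero entitled, so it surfaces as a gap instead of being silently skipped.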

    Book the next quarterly drill on the calendar with named participants. Capacity decisions made this quarter should be validated by the next drill, which closes the loop.

Approval and Communication

    Walk through scenarios, threshold changes, and procurement asks at the change advisory board. Capture approver names and any conditions attached to approval.

    Address every condition or open question raised by the board, then resubmit. Keep the revision log inline so the audit trail is intact.

    Post the approved plan to IT Glue, Hudu, or Confluence in the standard capacity-planning folder. Link it from the CMDB so future ticket triage can reference it.

    Send the summary to business-unit leads, the vCIO, and the service desk. Tier 1 needs the new thresholds so they don't escalate noise; business leads need the procurement timeline so they don't promise their teams capacity that isn't ordered yet.
