Cloud Cost Management Checklist

Resource Inventory and Tagging

    Export from AWS Config / Resource Explorer, Azure Resource Graph, and GCP Cloud Asset Inventory across every linked account, subscription, and project. Include the payer account and dev/sandbox accounts: orphaned spend usually hides there, not in production.

    Required tags typically include CostCenter, Owner, Environment, and Application. Use AWS Tag Policies, Azure Policy, or GCP Org Policy to surface non-compliant resources. Coverage below 95% on cost-allocation tags leaves too much spend unattributed for chargeback to hold up.
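
    The coverage check above can be sketched in a few lines. This is a minimal illustration, assuming the inventory has already been exported into a list of dicts with hypothetical `id` and `tags` fields; a real export from AWS, Azure, or GCP will have a different shape.

```python
REQUIRED_TAGS = ("CostCenter", "Owner", "Environment", "Application")

def tag_coverage(resources, required=REQUIRED_TAGS):
    """Return (coverage_fraction, non_compliant_ids) for a list of resource dicts.

    A resource counts as compliant only if every required tag is present
    and non-empty. The dict layout is an assumption for this sketch.
    """
    non_compliant = [
        r["id"]
        for r in resources
        if any(not r.get("tags", {}).get(tag) for tag in required)
    ]
    total = len(resources)
    coverage = 1.0 if total == 0 else (total - len(non_compliant)) / total
    return coverage, non_compliant

# Hypothetical sample inventory.
inventory = [
    {"id": "i-001", "tags": {"CostCenter": "1001", "Owner": "ana",
                             "Environment": "prod", "Application": "web"}},
    {"id": "vol-002", "tags": {"CostCenter": "1001", "Owner": "ana"}},
    {"id": "i-003", "tags": {}},
    {"id": "db-004", "tags": {"CostCenter": "2002", "Owner": "raj",
                              "Environment": "dev", "Application": "etl"}},
]

coverage, missing = tag_coverage(inventory)
print(f"coverage={coverage:.0%} below_target={coverage < 0.95} missing={missing}")
```

    Returning the non-compliant IDs alongside the percentage keeps the output directly usable as the remediation list.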

    Drive the tag-compliance list to zero before billing close. For shared resources without an obvious owner, tag with the platform team's CostCenter and open a follow-up ticket — never leave production resources untagged.

    Unattached EBS volumes, idle Elastic IPs, snapshots older than 90 days, load balancers with no registered targets, and unused NAT gateways are classic orphans. Send the list to suspected owners with a 7-day deletion deadline.
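
    As a rough sketch of the orphan sweep, the rules above can be applied to an exported inventory. The record fields (`type`, `attached_to`, `associated_with`, `created`) are hypothetical names chosen for this example, and only three of the orphan classes are covered.

```python
from datetime import date, timedelta

def find_orphans(resources, today, snapshot_max_age_days=90):
    """Return (resource_id, reason) pairs for likely orphans."""
    orphans = []
    for r in resources:
        kind = r["type"]
        if kind == "volume" and r.get("attached_to") is None:
            orphans.append((r["id"], "unattached volume"))
        elif kind == "elastic_ip" and r.get("associated_with") is None:
            orphans.append((r["id"], "idle Elastic IP"))
        elif kind == "snapshot" and (today - r["created"]).days > snapshot_max_age_days:
            orphans.append((r["id"], f"snapshot older than {snapshot_max_age_days} days"))
    return orphans

# Hypothetical sample data with a fixed "today" so the result is reproducible.
today = date(2024, 6, 1)
resources = [
    {"id": "vol-1", "type": "volume", "attached_to": None},
    {"id": "vol-2", "type": "volume", "attached_to": "i-9"},
    {"id": "eip-1", "type": "elastic_ip", "associated_with": None},
    {"id": "snap-1", "type": "snapshot", "created": date(2024, 1, 1)},
    {"id": "snap-2", "type": "snapshot", "created": date(2024, 5, 1)},
]

orphans = find_orphans(resources, today)
deadline = today + timedelta(days=7)  # 7-day deletion deadline from the checklist
for resource_id, reason in orphans:
    print(f"{resource_id}: {reason}, delete after {deadline.isoformat()}")
```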

    Sync the reconciled inventory into the CMDB (ServiceNow, Hudu, IT Glue) so downstream change-management and access-review workflows operate against current truth.

Cost Allocation and Budgeting

    Map AWS CUR / Azure cost exports / GCP billing export rows to the finance chart of accounts. Untagged spend lands in a default 'unallocated' bucket — track its absolute size and percentage; over 5% is a finance escalation.
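
    The unallocated-bucket check is simple arithmetic. The sketch below assumes billing rows have already been mapped to a hypothetical `cost_center` field, with `None` standing in for untagged spend; real CUR or export column names differ.

```python
def unallocated_summary(rows, threshold=0.05):
    """Summarize unallocated spend and whether it crosses the escalation threshold."""
    total = sum(r["cost"] for r in rows)
    unallocated = sum(r["cost"] for r in rows if not r.get("cost_center"))
    share = unallocated / total if total else 0.0
    return {"unallocated": unallocated, "share": share, "escalate": share > threshold}

# Hypothetical monthly rows after mapping to the chart of accounts.
rows = [
    {"cost_center": "1001", "cost": 700},
    {"cost_center": "2002", "cost": 200},
    {"cost_center": None, "cost": 100},   # untagged spend
]

summary = unallocated_summary(rows)
print(f"unallocated={summary['unallocated']} share={summary['share']:.1%} "
      f"escalate={summary['escalate']}")
```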

    Pull AWS Budgets, Azure Cost Management budgets, or your FinOps tool (CloudHealth, Apptio Cloudability, Vantage) and compare to MTD actuals. Flag any cost center over 100% of period budget or trending past forecast.

    Schedule a 30-minute call with the cost-center owner to walk the variance. Common drivers: a forgotten dev environment, data-transfer egress from a new integration, or a load test that ran on-demand instead of spot.

    Generate the per-cost-center showback (or chargeback if you cross-bill) with shared-service allocations broken out separately. Include shared-savings allocation from RIs / Savings Plans so teams see net effective rate, not list price.

    Use trailing 3-month run rate plus committed roadmap deltas (new app launches, migrations) to set next month's budget. Encode the new numbers back into AWS Budgets / Azure budgets so alerts fire at the right thresholds.
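
    The budgeting rule above reduces to a small calculation. The sketch below assumes monthly actuals are listed oldest to newest and that roadmap deltas are expressed as signed monthly amounts; both the function name and the sample figures are illustrative.

```python
def next_month_budget(monthly_actuals, roadmap_deltas):
    """Trailing 3-month average run rate plus committed roadmap deltas."""
    trailing = monthly_actuals[-3:]
    run_rate = sum(trailing) / len(trailing)
    return run_rate + sum(roadmap_deltas.values())

# Hypothetical history (oldest first) and roadmap items.
actuals = [9000, 10000, 11000, 12000]
deltas = {"new app launch": 1500, "legacy environment retired": -500}

budget = next_month_budget(actuals, deltas)
print(f"next month budget: {budget:.0f}")
```

    Keeping the deltas as named entries makes it easy to show the cost-center owner which roadmap items moved the number.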

Resource Optimization

    Pull AWS Compute Optimizer, Azure Advisor, and GCP Recommender output. Filter to high-confidence recommendations only — the medium-confidence list churns weekly. Hand the actionable list to app owners with a 14-day implementation window.

    EC2 instances / VMs with under 5% CPU and under 5 MB of network I/O for two weeks are decommission candidates. Stop first, snapshot, then terminate after a 7-day grace period so owners have a chance to object before data is gone.
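
    The idle test can be sketched as a filter over daily metrics. The checklist does not say how the thresholds are measured, so this example assumes one record per day with a hypothetical `cpu_pct` (daily average) and `network_mb` (daily total), and requires every day in the window to be under both limits.

```python
def is_decommission_candidate(daily_metrics, cpu_pct=5.0, network_mb=5.0, window_days=14):
    """True if every one of the last `window_days` days is under both thresholds."""
    if len(daily_metrics) < window_days:
        return False  # not enough history to judge
    recent = daily_metrics[-window_days:]
    return all(d["cpu_pct"] < cpu_pct and d["network_mb"] < network_mb for d in recent)

# Hypothetical metric histories.
quiet = [{"cpu_pct": 2.0, "network_mb": 1.5} for _ in range(14)]
busy_once = quiet[:13] + [{"cpu_pct": 40.0, "network_mb": 1.5}]
too_new = quiet[:10]

print(is_decommission_candidate(quiet))
print(is_decommission_candidate(busy_once))
print(is_decommission_candidate(too_new))
```

    Treating short histories as "not a candidate" errs on the side of keeping recently launched instances.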

    Check that S3 / Blob lifecycle rules are tiering cold data to Glacier / Archive / Coldline. Watch for buckets without lifecycle policies and for Intelligent-Tiering small-object overhead — not every workload benefits from it.

    Pull RI utilization, Savings Plans coverage, and Azure Reservations / GCP CUDs reports. Target is 95%+ utilization on existing commitments and 70-80% coverage of steady-state compute. Below 90% utilization means you bought too much; below 60% coverage means you're leaving discount on the table.
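
    The two ratios are easy to confuse, so a small worked sketch may help. It assumes all three inputs are in the same normalized unit (for example instance-hours), which is a simplification; provider reports compute these figures with their own normalization rules.

```python
def commitment_health(purchased, used, eligible_usage):
    """Compute utilization and coverage and flag the checklist's two failure modes.

    utilization = commitment actually consumed / commitment purchased
    coverage    = usage covered by commitments / all steady-state eligible usage
    """
    utilization = used / purchased
    coverage = used / eligible_usage
    findings = []
    if utilization < 0.90:
        findings.append("over-committed")   # bought too much
    if coverage < 0.60:
        findings.append("under-covered")    # leaving discount on the table
    return {"utilization": utilization, "coverage": coverage, "findings": findings}

# Hypothetical figures in normalized hours.
healthy = commitment_health(purchased=1000, used=960, eligible_usage=1300)
unhealthy = commitment_health(purchased=1000, used=850, eligible_usage=1500)
for label, result in [("healthy", healthy), ("unhealthy", unhealthy)]:
    print(f"{label}: utilization={result['utilization']:.0%} "
          f"coverage={result['coverage']:.0%} findings={result['findings']}")
```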

    If undercovered, purchase additional Compute Savings Plans or Convertible RIs for the steady-state baseline, never for spiky workloads. If underutilized, sell Standard RIs on the RI Marketplace where eligible (Convertible RIs and Savings Plans cannot be resold there) or pause new purchases until existing commitments expire. Always keep a 12-month rolling view, not a single month.

    Review ASG / VMSS / GCE MIG min-max bounds and scaling policies. Common gotcha: scale-out is aggressive but scale-in is missing or too conservative, so capacity ratchets up over weeks and never comes down.

Cost Monitoring and Reporting

    Confirm AWS CUR 2.0, Azure cost exports, and GCP billing exports are landing in S3 / ADLS / BigQuery on schedule. A silent pipeline failure means anomaly detection runs on stale data and dashboards lie for days before anyone notices.
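
    A freshness check on the export pipeline might look like the sketch below. It assumes the last delivery timestamp for each export has already been collected (for example from object metadata or table load times); the 24-hour staleness limit and the export names are assumptions for this example.

```python
from datetime import datetime, timedelta, timezone

def stale_exports(last_delivered, now, max_age=timedelta(hours=24)):
    """Return the names of exports whose latest delivery is older than max_age."""
    return sorted(name for name, ts in last_delivered.items() if now - ts > max_age)

# Hypothetical delivery timestamps, with a fixed "now" for reproducibility.
now = datetime(2024, 6, 2, 12, 0, tzinfo=timezone.utc)
last_delivered = {
    "aws_cur": datetime(2024, 6, 2, 3, 0, tzinfo=timezone.utc),
    "azure_export": datetime(2024, 5, 31, 3, 0, tzinfo=timezone.utc),
    "gcp_billing": datetime(2024, 6, 2, 10, 0, tzinfo=timezone.utc),
}

stale = stale_exports(last_delivered, now)
print(f"stale exports: {stale}")
```

    Running a check like this before anomaly review makes a silent pipeline failure visible the same day.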

    Review AWS Cost Anomaly Detection, Azure cost alerts, and any FinOps-tool anomaly flags. Common real findings: a NAT gateway spike because a VPC endpoint was missed for a new workload, CloudWatch Logs ingestion from a chatty new service, or KMS request volume from a misconfigured retry loop.

    Open a ticket against the owning team with the anomaly window, drill-down by service / usage type, and a remediation deadline. Capture root cause in the ticket so the same pattern is recognizable next month.

    Update unit-economics views (cost per active user, cost per transaction, cost per environment) — those are what executives act on, not raw service totals. Annotate any one-time events so MoM deltas are interpretable.
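
    A short worked example shows why unit economics can tell a different story than raw totals. The figures below are made up for illustration, and the helper names are not from any particular tool.

```python
def unit_cost(total_cost, units):
    """Cost per unit (user, transaction, environment); None if there are no units."""
    return total_cost / units if units else None

def mom_delta(current, previous):
    """Month-over-month change as a fraction of the previous value."""
    return (current - previous) / previous

# Hypothetical months: spend rises while active users rise faster.
prev_cost, prev_users = 50000, 100000
curr_cost, curr_users = 54000, 120000

raw_delta = mom_delta(curr_cost, prev_cost)
unit_delta = mom_delta(unit_cost(curr_cost, curr_users), unit_cost(prev_cost, prev_users))
print(f"raw spend MoM: {raw_delta:+.1%}, cost per active user MoM: {unit_delta:+.1%}")
```

    In this example total spend is up 8% while cost per active user is down 10%, which is the distinction the unit-economics view is meant to surface.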

    Send the report to engineering leads, finance, and the vCIO or platform director. Include MoM deltas, top 5 cost-center movers, savings realized this period, and the open-action list with owners.

Vendor Management and Negotiation

    Check expiry on AWS EDP / PPA, Microsoft EA / MCA, and GCP CUDs. Negotiation leverage starts 6 months before expiry; inside 90 days the negotiation is reactive and the discount is usually worse.
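
    The renewal windows above can be turned into a simple classifier. In this sketch "6 months" is approximated as 183 days, and the status labels are invented for illustration.

```python
from datetime import date

def renewal_status(expiry, today):
    """Classify a contract by how far it is from expiry."""
    days_left = (expiry - today).days
    if days_left < 0:
        return "expired"
    if days_left <= 90:
        return "reactive"        # inside 90 days: little leverage left
    if days_left <= 183:
        return "negotiate_now"   # roughly the 6-month window
    return "monitor"

# Hypothetical contracts, evaluated against a fixed date.
today = date(2024, 6, 1)
contracts = {
    "AWS EDP": date(2024, 7, 15),
    "Microsoft EA": date(2024, 10, 1),
    "GCP CUD": date(2025, 6, 1),
}
for name, expiry in contracts.items():
    print(f"{name}: {renewal_status(expiry, today)}")
```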

    Pull effective rate per service from the CUR and compare to public list pricing and any prior negotiated discounts. Watch egress and premium support — those are typically least discounted and highest leverage to push on at renewal.
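
    Effective rate is billed cost divided by usage quantity, and the realized discount follows from comparing it to list price. The sketch below uses made-up numbers and assumes the usage quantity and list rate are in the same unit (for example per GB or per hour).

```python
def effective_discount(billed_cost, usage_quantity, list_rate):
    """Return (effective_rate, discount_vs_list) for one service or usage type."""
    effective_rate = billed_cost / usage_quantity
    discount = 1 - effective_rate / list_rate
    return effective_rate, discount

# Hypothetical line items: (service, billed cost, usage quantity, public list rate).
items = [
    ("compute", 850.0, 10000, 0.10),
    ("egress", 880.0, 10000, 0.09),
]
for service, cost, qty, list_rate in items:
    rate, discount = effective_discount(cost, qty, list_rate)
    print(f"{service}: effective rate {rate:.4f}, discount vs list {discount:.1%}")
```

    Sorting services by the smallest realized discount gives a starting list of items to push on at renewal.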

    Bring trailing 12-month spend by service, growth trajectory, and roadmap workloads (e.g., GenAI inference, data lake migration) to the QBR with the cloud account team. Vendors offer better discounts on net-new commit categories than on existing baseline.

    Explore AWS Migration Acceleration Program funding, Azure Hybrid Benefit, GCP migration credits, and ISV co-sell discounts. Marketplace private offers can route software spend through committed cloud spend so that it draws down the commitment; confirm in your agreement how much marketplace spend is eligible.

    Store the signed addendum, discount schedule, and renewal calendar entry in the contract management system. Tag with renewal date so the 6-month-out reminder fires automatically next cycle.
