Cloud Migration Checklist
Steps an IT operations team or MSP runs to migrate workloads from on-premises infrastructure to a public cloud provider, from discovery and landing-zone build through cutover and post-migration optimization.
Discovery and Assessment
-
Inventory servers, VMs, and dependencies
Run discovery against the existing estate using AWS Application Discovery Service, Azure Migrate, or Google Migration Center. Capture VM specs, OS versions, installed software, and network dependencies. Undocumented service dependencies are the most common cause of cutover surprises — a forgotten cron job calling a legacy file share will break post-migration.
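One way to surface undocumented dependencies is to compare observed network connections with the documented dependency map. This is a minimal sketch, assuming connection records have already been exported as (source, destination) pairs; the host names and the `find_undocumented` helper are hypothetical and not part of any discovery tool's API.

```python
from collections import defaultdict

def find_undocumented(observed, documented):
    """Return {source: sorted destinations} for observed connections that
    are missing from the documented dependency map."""
    missing = defaultdict(set)
    for src, dst in observed:
        if dst not in documented.get(src, set()):
            missing[src].add(dst)
    return {src: sorted(dsts) for src, dsts in missing.items()}

# Hypothetical sample data, not real discovery output
observed = [
    ("app01", "db01"),
    ("app01", "legacy-fs01"),  # e.g. a forgotten cron job calling a file share
    ("batch01", "db01"),
]
documented = {"app01": {"db01"}, "batch01": {"db01"}}

gaps = find_undocumented(observed, documented)
print(gaps)
```

Anything the function returns is a candidate for a conversation with the application owner before the migration wave is planned.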
-
Classify workloads by migration strategy
Tag each workload with one of the 6 Rs: Rehost (lift-and-shift), Replatform, Refactor, Repurchase (SaaS), Retain, or Retire. Most estates have 30-50% Retire candidates that nobody admits to until forced — pressure-test that list before sizing the target environment.
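The tagging step can be checked mechanically before sizing. The sketch below assumes the inventory is available as a name-to-strategy mapping; the workload names and the `summarize` helper are hypothetical.

```python
from collections import Counter

SIX_RS = {"rehost", "replatform", "refactor", "repurchase", "retain", "retire"}
# Strategies that still need capacity in the target cloud environment
MIGRATING = {"rehost", "replatform", "refactor"}

def summarize(workloads):
    """workloads: dict of name -> strategy.
    Returns (counts per strategy, share tagged retire, names to size)."""
    invalid = {n: s for n, s in workloads.items() if s not in SIX_RS}
    if invalid:
        raise ValueError(f"untagged or invalid strategies: {invalid}")
    counts = Counter(workloads.values())
    retire_share = counts["retire"] / len(workloads) if workloads else 0.0
    to_size = sorted(n for n, s in workloads.items() if s in MIGRATING)
    return counts, retire_share, to_size

# Hypothetical inventory
inventory = {
    "crm": "repurchase",
    "intranet": "rehost",
    "old-reports": "retire",
    "billing": "replatform",
}
counts, retire_share, to_size = summarize(inventory)
print(counts, retire_share, to_size)
```

Only the workloads in `to_size` should feed the target-environment sizing; a low `retire_share` is a prompt to pressure-test the list again.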
-
Map compliance and data-residency requirements
Identify workloads subject to HIPAA, PCI DSS, SOC 2, GDPR, or CMMC. Confirm provider region selection meets data-residency obligations and that the provider offers a signed BAA / DPA for in-scope workloads. Schrems II concerns rule out some EU-to-US patterns even with SCCs.
-
Build the TCO model
Use the AWS Pricing Calculator, Azure TCO Calculator, or GCP Pricing Calculator. Include reserved/savings-plan commitments, egress charges, NAT gateway costs, and licensing (BYOL vs. license-included for Windows / SQL Server). Egress is the line item most often underestimated: model real traffic patterns, including peak and batch days, not a steady-state average.
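To illustrate why steady-state assumptions understate egress, here is a small arithmetic sketch. The flat per-GB rate and the daily volumes are hypothetical; real provider pricing is tiered and varies by region and destination, so treat this as a shape for the model rather than a price.

```python
def monthly_egress_cost(daily_gb, rate_per_gb):
    """Sum the egress cost over a list of per-day outbound volumes (GB)."""
    return sum(daily_gb) * rate_per_gb

RATE = 0.09  # hypothetical USD per GB; check the provider's current price sheet

steady = [100] * 30                # steady-state assumption: 100 GB every day
real = [100] * 26 + [900] * 4      # same baseline plus four heavy batch days

steady_cost = monthly_egress_cost(steady, RATE)
real_cost = monthly_egress_cost(real, RATE)
print(f"steady-state: ${steady_cost:.2f}  real pattern: ${real_cost:.2f}")
```

With these made-up numbers, four heavy days roughly double the monthly egress bill, which is the kind of gap the steady-state model hides.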
Landing Zone and Network Design
-
Provision the landing zone and account structure
Use AWS Control Tower, Azure Landing Zones (CAF), or GCP Cloud Foundation Toolkit. Separate accounts/subscriptions for prod, non-prod, security, logging, and shared services. Apply guardrails (SCPs, Azure Policy, Org Policy) before any workload lands.
-
Design VPC, subnets, and connectivity
Choose between Direct Connect / ExpressRoute / Cloud Interconnect for production traffic, or site-to-site VPN for lower-volume needs. Plan CIDR ranges to avoid overlap with on-prem and future M&A targets — re-IPing later is far more expensive than a one-hour design session now.
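CIDR overlap can be checked before any network is provisioned. This sketch uses Python's standard `ipaddress` module; the range labels and addresses are hypothetical placeholders for the real on-prem, cloud, and acquisition-target allocations.

```python
import ipaddress
from itertools import combinations

def find_overlaps(ranges):
    """ranges: dict of label -> CIDR string.
    Returns a list of (label_a, label_b) pairs whose networks overlap."""
    nets = {label: ipaddress.ip_network(cidr) for label, cidr in ranges.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]

# Hypothetical address plan
plan = {
    "on-prem": "10.0.0.0/16",
    "cloud-prod": "10.0.128.0/20",      # falls inside the on-prem range
    "cloud-nonprod": "10.1.0.0/16",
    "acquisition-target": "172.16.0.0/12",
}

overlaps = find_overlaps(plan)
print(overlaps)
```

An empty result is the goal; any pair returned needs a new range before the connectivity design is signed off.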
-
Configure identity federation and SSO
Federate Entra ID or Okta to AWS IAM Identity Center / Azure RBAC / GCP IAM via SAML or OIDC. Enforce MFA on all human access; use IAM roles (not long-lived access keys) for workloads. Break-glass accounts stay local with hardware MFA, sealed credentials, and quarterly access tests.
-
Codify infrastructure with Terraform or Bicep
Click-ops landing zones become unmaintainable by month three. Commit Terraform / Bicep / CloudFormation modules to source control with PR review, drift detection, and automated plan/apply pipelines.
Pilot and Cutover Preparation
-
Run a pilot migration of a non-critical workload
Pick a low-risk app such as an internal tool or a dev environment. Use AWS MGN (the successor to CloudEndure Migration), Azure Migrate, or Google Cloud Migrate to Virtual Machines to replicate. Time the cutover window, measure data-transfer throughput, and document every gotcha encountered. The pilot is where the runbook gets written.
-
Verify backups and snapshot baseline
Take a verified backup of every system in scope before the cutover window. Test a restore on at least one critical workload: a backup job that reports success but cannot be restored is a classic discovery on the day of a ransomware incident. Follow the 3-2-1 rule (three copies, two media types, one offsite) with immutable storage (S3 Object Lock, Azure immutable blobs).
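A restore test is only meaningful if the restored data is compared with the source. One simple approach for file-level data is a checksum comparison, sketched below with throwaway temp files standing in for a source file and its restored copies. The helper names are hypothetical, and database restores would need an application-level check instead.

```python
import hashlib
import os
import tempfile

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large files do not load into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def restore_matches(source_path, restored_path):
    """True if the restored file is byte-identical to the source."""
    return sha256_of(source_path) == sha256_of(restored_path)

# Demo with temporary files: one faithful restore, one truncated restore
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "source.bin")
    good = os.path.join(tmp, "restored_ok.bin")
    bad = os.path.join(tmp, "restored_truncated.bin")
    payload = b"ledger-export" * 1000
    for path, data in ((source, payload), (good, payload), (bad, payload[:-1])):
        with open(path, "wb") as f:
            f.write(data)
    good_result = restore_matches(source, good)
    bad_result = restore_matches(source, bad)

print(good_result, bad_result)
```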
-
Document the rollback plan
Define the rollback trigger (e.g., data-validation failure, app unavailability past T+4 hours), the technical steps to revert DNS / connection strings, and the named decision-maker authorized to call it. CAB approval should reference this document, not a verbal plan.
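Writing the trigger down as explicit conditions removes ambiguity during the change window. The sketch below encodes the two example triggers from the text; the `CutoverStatus` fields and the four-hour threshold are illustrative assumptions that the real runbook should define, and the decision to roll back still belongs to the named decision-maker.

```python
from dataclasses import dataclass

@dataclass
class CutoverStatus:
    hours_since_cutover: float
    data_validation_passed: bool
    app_available: bool

MAX_UNAVAILABLE_HOURS = 4.0  # mirrors the T+4 hours example; set per runbook

def rollback_reasons(status):
    """Return the list of rollback triggers that currently apply."""
    reasons = []
    if not status.data_validation_passed:
        reasons.append("data-validation failure")
    if not status.app_available and status.hours_since_cutover > MAX_UNAVAILABLE_HOURS:
        reasons.append("app unavailable past T+4 hours")
    return reasons

print(rollback_reasons(CutoverStatus(5.0, True, False)))
```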
-
Submit the change request to CAB
Include the migration runbook, rollback plan, blackout-window confirmation, communication plan to end users, and named on-call engineers. Production cutover requires CAB approval — not the manager's verbal yes.
-
Train operations staff on the target platform
Helpdesk and Tier 2 need walkthroughs of the cloud console, monitoring dashboards (CloudWatch / Azure Monitor / Cloud Operations), and incident-response runbooks before cutover, not after the first P1.
Cutover Execution
-
Replicate data to the target environment
For block-level replication: AWS MGN / Azure Migrate continuous replication. For databases: AWS DMS, Azure DMS, or native log shipping. For large datasets: AWS Snowball / Azure Data Box for the seed, network for the delta. Monitor replication lag; cutover requires lag near zero.
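"Lag near zero" is easier to enforce as a concrete gate. A minimal sketch, assuming lag samples in seconds have been collected from whichever replication tool is in use; the five-second threshold and the three-sample window are hypothetical values to tune per workload.

```python
def ready_for_cutover(lag_samples_s, max_lag_s=5.0, required_consecutive=3):
    """True if the most recent samples are all at or below the lag threshold.

    lag_samples_s: replication lag readings in seconds, oldest first.
    """
    if len(lag_samples_s) < required_consecutive:
        return False
    recent = lag_samples_s[-required_consecutive:]
    return all(sample <= max_lag_s for sample in recent)

print(ready_for_cutover([120, 40, 3, 2, 1]))
print(ready_for_cutover([1, 1, 30]))
```

Requiring several consecutive readings avoids starting the cutover on a single lucky sample.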
-
Execute the cutover during the change window
Stop writes on source, drain final replication, flip DNS / load-balancer / connection strings, validate. Two engineers on bridge minimum — one executing, one cross-checking. Stay on the approved runbook; deviations get logged and reviewed post-change.
-
Validate application functionality and data integrity
Run the smoke-test checklist: app login, top-10 transactions, batch jobs, integrations to upstream/downstream systems, row counts on critical tables. Compare the results to baseline numbers captured on the source systems before cutover.
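The row-count part of the smoke test lends itself to scripting. This sketch assumes the counts have already been gathered into dictionaries keyed by table name; the table names and numbers are hypothetical.

```python
def compare_row_counts(baseline, target):
    """Return {table: (expected, actual)} for every table whose count differs.
    actual is None when the table is missing from the target."""
    return {
        table: (expected, target.get(table))
        for table, expected in baseline.items()
        if target.get(table) != expected
    }

# Hypothetical counts
baseline = {"orders": 1200, "customers": 300, "invoices": 950}
target = {"orders": 1200, "customers": 299}

mismatches = compare_row_counts(baseline, target)
print(mismatches)
```

Matching counts do not prove the data is identical, so this complements the transaction and integration checks rather than replacing them.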
-
Execute rollback if validation fails
Follow the documented rollback runbook: revert DNS, re-enable on-prem services, communicate to stakeholders. Capture a full timeline for the post-incident review. Don't try to fix forward in the cutover window; that's how 4-hour outages become 12-hour outages.
Post-Migration Operations
-
Right-size instances and apply savings plans
After 2-3 weeks of real traffic, review Compute Optimizer / Azure Advisor / Recommender. Lift-and-shift typically over-provisions by 30-50%. Commit to Reserved Instances or Savings Plans only after right-sizing — locking in oversized instances wastes the discount.
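The ordering matters because a commitment discount applies to whatever size is locked in. The arithmetic below uses hypothetical hourly rates and a hypothetical 30% commitment discount purely to show the effect; actual rates and discounts depend on provider, instance family, and term.

```python
HOURS_PER_MONTH = 730

def monthly_cost(hourly_rate, discount=0.0):
    """Monthly cost of one always-on instance at the given hourly rate."""
    return hourly_rate * (1 - discount) * HOURS_PER_MONTH

OVERSIZED_RATE = 0.40   # hypothetical on-demand USD/hour as migrated
RIGHTSIZED_RATE = 0.20  # hypothetical rate after right-sizing
COMMIT_DISCOUNT = 0.30  # hypothetical commitment discount

commit_oversized = monthly_cost(OVERSIZED_RATE, COMMIT_DISCOUNT)
rightsized_on_demand = monthly_cost(RIGHTSIZED_RATE)
rightsized_committed = monthly_cost(RIGHTSIZED_RATE, COMMIT_DISCOUNT)

print(f"commit while oversized:   ${commit_oversized:.2f}")
print(f"right-sized, on-demand:   ${rightsized_on_demand:.2f}")
print(f"right-sized, then commit: ${rightsized_committed:.2f}")
```

In this made-up case even the right-sized instance at full on-demand price is cheaper than the oversized instance with the discount applied.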
-
Enable cloud-native monitoring and alerting
Wire CloudWatch / Azure Monitor / Cloud Operations into PagerDuty or Opsgenie. Define SLOs for the migrated services and alert on burn-rate, not on every CPU spike. Decommission on-prem monitoring agents only after cloud monitoring proves reliable for 14+ days.
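Burn rate is the observed error ratio divided by the error ratio the SLO allows. A minimal sketch follows; the 99.9% target and request counts are hypothetical, and the 14.4 paging threshold is an assumption based on the common rule of thumb of spending 2% of a 30-day error budget in one hour (0.02 × 720 hours = 14.4).

```python
def burn_rate(errors, requests, slo_target):
    """Error-budget burn rate for one window.

    1.0 means the budget is being spent exactly as fast as the SLO allows.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1, exclusive")
    if requests == 0:
        return 0.0
    allowed_error_ratio = 1.0 - slo_target
    return (errors / requests) / allowed_error_ratio

def should_page(rate, threshold=14.4):
    """Page only when the budget is burning faster than the threshold."""
    return rate >= threshold

slow = burn_rate(errors=50, requests=10_000, slo_target=0.999)
fast = burn_rate(errors=200, requests=10_000, slo_target=0.999)
print(should_page(slow), should_page(fast))
```

A CPU spike that causes no user-facing errors leaves the burn rate unchanged, which is why it makes a quieter paging signal than resource thresholds.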
-
Apply security baselines and run a vulnerability scan
Enable GuardDuty / Defender for Cloud / Security Command Center. Scan with Tenable or Qualys. Review CIS Benchmark compliance on the new estate. Public S3 buckets and over-permissive security groups are among the most common findings on cloud-migration scans.
-
Decommission on-premises systems
Wait at least 30 days post-cutover before decommissioning. Take a final cold backup, archive to immutable storage with a 7-year retention (or whatever legal hold requires), then power down. Update CMDB and license inventory; cancel maintenance contracts on the right vendor anniversary to avoid auto-renewal.
-
Hold the post-migration review
Compare actual cloud spend against the TCO model, MTTR before vs. after, incident count, and end-user satisfaction. Document lessons learned for the next migration wave — most cloud programs run as a sequence of waves, not a single cutover.
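The spend comparison is simple arithmetic once both sets of figures use the same line items. A sketch with hypothetical monthly figures in USD; the line-item names are placeholders for whatever the TCO model actually used.

```python
def spend_variance(modeled, actual):
    """Return {line_item: (modeled, actual, percent_delta)}.

    percent_delta is None when the modeled figure is zero.
    """
    report = {}
    for item, planned in modeled.items():
        spent = actual.get(item, 0.0)
        pct = round((spent - planned) / planned * 100, 1) if planned else None
        report[item] = (planned, spent, pct)
    return report

# Hypothetical monthly figures
modeled = {"compute": 10000.0, "egress": 500.0, "nat_gateway": 300.0}
actual = {"compute": 9000.0, "egress": 1200.0, "nat_gateway": 300.0}

report = spend_variance(modeled, actual)
for item, (planned, spent, pct) in report.items():
    print(f"{item}: modeled {planned:.0f}, actual {spent:.0f}, delta {pct}%")
```

Line items with large deltas are the ones to correct in the model before sizing the next wave.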