Cloud Migration Checklist

Steps an IT operations team or MSP runs to migrate workloads from on-premises infrastructure to a public cloud provider, from discovery and landing-zone build through cutover and post-migration optimization.

5 sections 22 steps Collects data

Discovery and Assessment

Inventory servers, VMs, and dependencies
- Run discovery against the existing estate using AWS Application Discovery Service, Azure Migrate, or Google Migration Center. Capture VM specs, OS versions, installed software, and network dependencies. Undocumented service dependencies are the most common cause of cutover surprises — a forgotten cron job calling a legacy file share will break post-migration.
Collects file
Classify workloads by migration strategy
- Tag each workload with one of the 6 Rs: Rehost (lift-and-shift), Replatform, Refactor, Repurchase (SaaS), Retain, or Retire. Most estates have 30-50% Retire candidates that nobody admits to until forced — pressure-test that list before sizing the target environment.
Collects list
Map compliance and data-residency requirements
- Identify workloads subject to HIPAA, PCI DSS, SOC 2, GDPR, or CMMC. Confirm provider region selection meets data-residency obligations and that the provider offers a signed BAA / DPA for in-scope workloads. Schrems II concerns rule out some EU-to-US patterns even with SCCs.
Build the TCO model
- Use the AWS Pricing Calculator, Azure TCO Calculator, or GCP Pricing Calculator. Include reserved/savings-plan commitments, egress charges, NAT gateway costs, and licensing (BYOL vs. license-included for Windows / SQL Server). Egress is the line item that surprises everyone — model real traffic patterns, not steady-state.
Collects number

Landing Zone and Network Design

Provision the landing zone and account structure
- Use AWS Control Tower, Azure Landing Zones (CAF), or GCP Cloud Foundation Toolkit. Separate accounts/subscriptions for prod, non-prod, security, logging, and shared services. Apply guardrails (SCPs, Azure Policy, Org Policy) before any workload lands.
Design VPC, subnets, and connectivity
- Choose between Direct Connect / ExpressRoute / Cloud Interconnect for production traffic, or site-to-site VPN for lower-volume needs. Plan CIDR ranges to avoid overlap with on-prem and future M&A targets — re-IPing later is far more expensive than a one-hour design session now.
Configure identity federation and SSO
- Federate Entra ID or Okta to AWS IAM Identity Center / Azure RBAC / GCP IAM via SAML or OIDC. Enforce MFA on all human access; use IAM roles (not long-lived access keys) for workloads. Break-glass accounts stay local with hardware MFA, sealed credentials, and quarterly access tests.
Codify infrastructure with Terraform or Bicep
- Click-ops landing zones become unmaintainable by month three. Commit Terraform / Bicep / CloudFormation modules to source control with PR review, drift detection, and automated plan/apply pipelines.

Pilot and Cutover Preparation

Run a pilot migration of a non-critical workload
- Pick a low-risk app — internal tools, dev environments. Use AWS MGN, Azure Migrate, or CloudEndure to replicate. Time the cutover window, measure data-transfer throughput, and document every gotcha encountered. The pilot is where the runbook gets written.
Verify backups and snapshot baseline
- Take a verified backup of every system in scope before the cutover window. Test restore on at least one critical workload — backup-success-green-but-restore-fails is the classic ransomware-day discovery. Use 3-2-1 with immutable storage (S3 Object Lock, Azure immutable blobs).
Collects list
Document the rollback plan
- Define the rollback trigger (e.g., data-validation failure, app unavailability past T+4 hours), the technical steps to revert DNS / connection strings, and the named decision-maker authorized to call it. CAB approval should reference this document, not a verbal plan.
Submit the change request to CAB
- Include the migration runbook, rollback plan, blackout-window confirmation, communication plan to end users, and named on-call engineers. Production cutover requires CAB approval — not the manager's verbal yes.
Train operations staff on the target platform
- Helpdesk and Tier 2 need walkthroughs of the cloud console, monitoring dashboards (CloudWatch / Azure Monitor / Cloud Operations), and incident-response runbooks before cutover, not after the first P1.

Cutover Execution

Replicate data to the target environment
- For block-level: AWS MGN / Azure Migrate continuous replication. For databases: DMS, Azure DMS, or native log shipping. For large datasets: AWS Snowball / Azure Data Box for the seed, network for the delta. Monitor replication lag — cutover requires lag near zero.
Execute the cutover during the change window
- Stop writes on source, drain final replication, flip DNS / load-balancer / connection strings, validate. Two engineers on bridge minimum — one executing, one cross-checking. Stay on the approved runbook; deviations get logged and reviewed post-change.
Validate application functionality and data integrity
- Run the smoke-test checklist: app login, top-10 transactions, batch jobs, integrations to upstream/downstream systems, row counts on critical tables. Compare to pre-cutover baseline numbers captured during the pilot.
Collects list Collects paragraph
Execute rollback if validation fails
- Follow the documented rollback runbook: revert DNS, re-enable on-prem services, communicate to stakeholders. Capture a full timeline for the post-incident review. Don't try to fix-forward in the cutover window — that's how 4-hour outages become 12-hour outages.

Post-Migration Operations

Right-size instances and apply savings plans
- After 2-3 weeks of real traffic, review Compute Optimizer / Azure Advisor / Recommender. Lift-and-shift typically over-provisions by 30-50%. Commit to Reserved Instances or Savings Plans only after right-sizing — locking in oversized instances wastes the discount.
Enable cloud-native monitoring and alerting
- Wire CloudWatch / Azure Monitor / Cloud Operations into PagerDuty or Opsgenie. Define SLOs for the migrated services and alert on burn-rate, not on every CPU spike. Decommission on-prem monitoring agents only after cloud monitoring proves reliable for 14+ days.
Apply security baselines and run a vulnerability scan
- Enable GuardDuty / Defender for Cloud / Security Command Center. Scan with Tenable or Qualys. Review CIS Benchmark compliance on the new estate. Public S3 buckets and over-permissive security groups are the top two findings on every cloud-migration scan.
Decommission on-premises systems
- Wait at least 30 days post-cutover before decommissioning. Take a final cold backup, archive to immutable storage with a 7-year retention (or whatever legal hold requires), then power down. Update CMDB and license inventory; cancel maintenance contracts on the right vendor anniversary to avoid auto-renewal.
Hold the post-migration review
- Compare actual cloud spend against the TCO model, MTTR before vs. after, incident count, and end-user satisfaction. Document lessons learned for the next migration wave — most cloud programs run as a sequence of waves, not a single cutover.
Collects file

Use this template

Copy it to your account, customize the steps, and run it with your team in minutes.

Use this workflow Start free trial

Sections 5

Steps 22

Category Systems Administration

Price Free to start

Need a different process

Browse hundreds of free templates across every team and industry.

Back to template library

Related templates

More workflows your team can run.

Systems Administration

Run Cloud Migration Checklist with your team

Customize the steps, assign roles, set a schedule, and keep a complete record for every run.