Cloud Migration Checklist
Discovery and Assessment
Run discovery against the existing estate using AWS Application Discovery Service, Azure Migrate, or Google Migration Center. Capture VM specs, OS versions, installed software, and network dependencies. Undocumented service dependencies are the most common cause of cutover surprises — a forgotten cron job calling a legacy file share will break post-migration.
Tag each workload with one of the 6 Rs: Rehost (lift-and-shift), Replatform, Refactor, Repurchase (SaaS), Retain, or Retire. Most estates have 30-50% Retire candidates that nobody admits to until forced — pressure-test that list before sizing the target environment.
Identify workloads subject to HIPAA, PCI DSS, SOC 2, GDPR, or CMMC. Confirm provider region selection meets data-residency obligations and that the provider offers a signed BAA / DPA for in-scope workloads. Schrems II concerns rule out some EU-to-US patterns even with SCCs.
Build the TCO model with the AWS Pricing Calculator, Azure TCO Calculator, or Google Cloud Pricing Calculator. Include reserved/savings-plan commitments, egress charges, NAT gateway costs, and licensing (BYOL vs. license-included for Windows / SQL Server). Egress is the line item that most often surprises teams: model real traffic patterns, including peaks and bulk transfers, rather than a steady-state average.
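The gap between a steady-state egress estimate and a traffic-shaped one can be sketched with a little arithmetic. The flat per-GB rate and the 30-day sample below are made-up illustration values, not any provider's published pricing; real egress pricing is tiered and varies by region and destination.

```python
from statistics import median

# Hypothetical flat egress rate in USD per GB (an assumption for illustration).
EGRESS_USD_PER_GB = 0.09

def egress_cost(daily_gb, rate=EGRESS_USD_PER_GB):
    """Cost of the egress volume actually observed, summed day by day."""
    return sum(daily_gb) * rate

def steady_state_cost(daily_gb, rate=EGRESS_USD_PER_GB):
    """Naive estimate that assumes every day looks like the median day."""
    return median(daily_gb) * len(daily_gb) * rate

# Illustrative 30-day sample: 27 quiet days plus three month-end export days.
daily_gb = [100] * 27 + [2000] * 3

naive = steady_state_cost(daily_gb)
actual = egress_cost(daily_gb)
print(f"steady-state estimate:   ${naive:,.2f}")
print(f"traffic-shaped estimate: ${actual:,.2f}")
```

In this made-up sample the three export days account for most of the bill, which is the kind of pattern a median-day estimate hides.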
Landing Zone and Network Design
Use AWS Control Tower, Azure Landing Zones (CAF), or GCP Cloud Foundation Toolkit. Separate accounts/subscriptions for prod, non-prod, security, logging, and shared services. Apply guardrails (SCPs, Azure Policy, Org Policy) before any workload lands.
Choose between Direct Connect / ExpressRoute / Cloud Interconnect for production traffic, or site-to-site VPN for lower-volume needs. Plan CIDR ranges to avoid overlap with on-prem and future M&A targets — re-IPing later is far more expensive than a one-hour design session now.
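A CIDR plan can be checked for overlaps before anything is provisioned. The following is a minimal sketch using Python's standard `ipaddress` module; the network names and ranges are placeholders, not a recommended address plan.

```python
import ipaddress
from itertools import combinations

def find_overlaps(ranges):
    """Return (name, name) pairs of CIDR blocks that overlap each other."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in ranges.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]

# Hypothetical address plan; every name and range here is an assumption.
plan = {
    "on-prem-dc": "10.0.0.0/16",
    "cloud-prod": "10.0.128.0/20",   # sits inside the on-prem /16
    "cloud-nonprod": "10.64.0.0/16",
    "acquisition-target": "172.16.0.0/12",
}

for a, b in find_overlaps(plan):
    print(f"overlap: {a} <-> {b}")
```

Running a check like this in the IaC pipeline catches a conflicting range at review time rather than after peering or a VPN is already in place.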
Federate Entra ID or Okta to AWS IAM Identity Center / Azure RBAC / GCP IAM via SAML or OIDC. Enforce MFA on all human access; use IAM roles (not long-lived access keys) for workloads. Break-glass accounts stay local with hardware MFA, sealed credentials, and quarterly access tests.
Click-ops landing zones become unmaintainable by month three. Commit Terraform / Bicep / CloudFormation modules to source control with PR review, drift detection, and automated plan/apply pipelines.
Pilot and Cutover Preparation
Pick a low-risk app for the pilot, such as an internal tool or a dev environment. Use AWS Application Migration Service (MGN), Azure Migrate, or Google Cloud Migrate to Virtual Machines to replicate. Time the cutover window, measure data-transfer throughput, and document every gotcha encountered. The pilot is where the runbook gets written.
Take a verified backup of every system in scope before the cutover window. Test restore on at least one critical workload — backup-success-green-but-restore-fails is the classic ransomware-day discovery. Use 3-2-1 with immutable storage (S3 Object Lock, Azure immutable blobs).
Define the rollback trigger (e.g., data-validation failure, app unavailability past T+4 hours), the technical steps to revert DNS / connection strings, and the named decision-maker authorized to call it. CAB approval should reference this document, not a verbal plan.
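Writing the triggers down as explicit conditions removes ambiguity during the change window. The sketch below encodes the two example triggers from the text; the `CutoverStatus` fields and function name are hypothetical, and the output is only an input to the named decision-maker, not a replacement for that call.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# T+4 hours is the example threshold from the text; adjust per change request.
MAX_UNAVAILABLE = timedelta(hours=4)

@dataclass
class CutoverStatus:
    cutover_started: datetime
    now: datetime
    app_available: bool
    data_validation_passed: bool

def rollback_reasons(status):
    """List the documented rollback triggers that have tripped, if any."""
    reasons = []
    if not status.data_validation_passed:
        reasons.append("data-validation failure")
    elapsed = status.now - status.cutover_started
    if not status.app_available and elapsed > MAX_UNAVAILABLE:
        reasons.append("app unavailable past T+4 hours")
    return reasons

start = datetime(2024, 1, 13, 22, 0)
status = CutoverStatus(
    cutover_started=start,
    now=start + timedelta(hours=5),
    app_available=False,
    data_validation_passed=True,
)
print(rollback_reasons(status))
```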
The CAB change request should include the migration runbook, rollback plan, blackout-window confirmation, communication plan to end users, and named on-call engineers. Production cutover requires CAB approval, not the manager's verbal yes.
Helpdesk and Tier 2 need walkthroughs of the cloud console, monitoring dashboards (CloudWatch / Azure Monitor / Cloud Operations), and incident-response runbooks before cutover, not after the first P1.
Cutover Execution
For block-level replication: AWS MGN / Azure Migrate continuous replication. For databases: AWS DMS, Azure Database Migration Service, or native log shipping. For large datasets: AWS Snowball / Azure Data Box for the seed, network for the delta. Monitor replication lag; cutover requires lag near zero.
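"Lag near zero" is easier to enforce as a gate than as a judgment call. A minimal sketch follows, assuming lag samples (in seconds) have already been collected from whatever replication tool is in use; the 5-second threshold and three-sample window are illustrative assumptions, not values from any provider.

```python
def ready_for_cutover(lag_samples_seconds, max_lag_seconds=5.0, min_samples=3):
    """True when the most recent samples all show lag at or below the threshold.

    Requiring several consecutive low readings avoids cutting over on a single
    lucky sample while replication is still catching up.
    """
    recent = lag_samples_seconds[-min_samples:]
    return len(recent) >= min_samples and all(
        sample <= max_lag_seconds for sample in recent
    )

# Lag falling steadily as the final delta drains.
print(ready_for_cutover([120, 40, 3, 2, 1]))
# Latest reading has spiked again, so the gate stays closed.
print(ready_for_cutover([1, 2, 30]))
```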
Stop writes on source, drain final replication, flip DNS / load-balancer / connection strings, validate. Two engineers on bridge minimum — one executing, one cross-checking. Stay on the approved runbook; deviations get logged and reviewed post-change.
Run the smoke-test checklist: app login, top-10 transactions, batch jobs, integrations to upstream/downstream systems, row counts on critical tables. Compare to pre-cutover baseline numbers captured during the pilot.
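The row-count part of the smoke test lends itself to automation. The sketch below assumes the baseline and post-cutover counts have already been gathered into plain dictionaries; the table names and numbers are hypothetical.

```python
def compare_row_counts(baseline, target, tolerance=0):
    """Return {table: (expected, actual)} for tables that are missing or differ.

    `actual` is None when the table does not appear in the target at all.
    """
    mismatches = {}
    for table, expected in baseline.items():
        actual = target.get(table)
        if actual is None or abs(actual - expected) > tolerance:
            mismatches[table] = (expected, actual)
    return mismatches

# Hypothetical counts for three critical tables.
baseline = {"orders": 1000, "customers": 250, "invoices": 75}
target = {"orders": 1000, "customers": 249}

for table, (expected, actual) in compare_row_counts(baseline, target).items():
    print(f"{table}: expected {expected}, found {actual}")
```

Matching row counts are a necessary check rather than a sufficient one; checksums on key columns are a common next step when counts agree.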
If validation fails, follow the documented rollback runbook: revert DNS, re-enable on-prem services, communicate to stakeholders. Capture a full timeline for the post-incident review. Don't try to fix-forward in the cutover window; that's how 4-hour outages become 12-hour outages.
Post-Migration Operations
After 2-3 weeks of real traffic, review Compute Optimizer / Azure Advisor / Recommender. Lift-and-shift typically over-provisions by 30-50%. Commit to Reserved Instances or Savings Plans only after right-sizing — locking in oversized instances wastes the discount.
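As a rough illustration of the right-sizing arithmetic, the sketch below sizes a VM from its 95th-percentile CPU utilization. This is a deliberate simplification and not how Compute Optimizer, Azure Advisor, or Recommender work internally; those tools also weigh memory, network, and disk. The 60% target and the sample data are assumptions.

```python
import math

def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("need at least one utilization sample")
    ordered = sorted(samples)
    rank = math.ceil(95 * len(ordered) / 100)
    return ordered[rank - 1]

def suggested_vcpus(current_vcpus, cpu_percent_samples, target_percent=60.0):
    """vCPU count that would put p95 CPU utilization near the target."""
    needed = current_vcpus * p95(cpu_percent_samples) / target_percent
    return max(1, math.ceil(needed))

# Hypothetical 8-vCPU VM: mostly idle, with occasional bursts to 30%.
samples = [15] * 90 + [30] * 10
print(suggested_vcpus(8, samples))
```

Sizing from a high percentile rather than the mean leaves headroom for the bursts that an average would smooth away.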
Wire CloudWatch / Azure Monitor / Cloud Operations into PagerDuty or Opsgenie. Define SLOs for the migrated services and alert on burn-rate, not on every CPU spike. Decommission on-prem monitoring agents only after cloud monitoring proves reliable for 14+ days.
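Burn rate is the observed error ratio divided by the error budget (1 minus the SLO target). A minimal sketch of a two-window check follows; the 14.4 threshold is the commonly cited paging value for a 30-day SLO (2% of the budget consumed in one hour), and the window contents below are made-up numbers.

```python
def burn_rate(errors, requests, slo_target):
    """How many times faster than sustainable the error budget is being spent."""
    if requests == 0:
        return 0.0
    error_budget = 1.0 - slo_target
    return (errors / requests) / error_budget

def should_page(long_window, short_window, slo_target, threshold=14.4):
    """Page only when both windows burn fast, which filters out brief spikes.

    Each window is an (errors, requests) tuple.
    """
    return (
        burn_rate(*long_window, slo_target) >= threshold
        and burn_rate(*short_window, slo_target) >= threshold
    )

SLO = 0.999  # hypothetical availability target

# Sustained problem: both the 1-hour and 5-minute windows are burning fast.
print(should_page((200, 10_000), (30, 1_000), SLO))
# Brief spike: the short window is hot but the long window is healthy.
print(should_page((20, 10_000), (50, 1_000), SLO))
```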
Enable GuardDuty / Defender for Cloud / Security Command Center. Scan with Tenable or Qualys. Review CIS Benchmark compliance on the new estate. Public S3 buckets and over-permissive security groups are among the most common findings on post-migration scans.
Wait at least 30 days post-cutover before decommissioning the on-prem source systems. Take a final cold backup, archive to immutable storage with a 7-year retention (or whatever legal hold requires), then power down. Update CMDB and license inventory; cancel maintenance contracts on the right vendor anniversary to avoid auto-renewal.
Compare actual cloud spend against the TCO model, MTTR before vs. after, incident count, and end-user satisfaction. Document lessons learned for the next migration wave — most cloud programs run as a sequence of waves, not a single cutover.
