Database Backup Checklist
Pre-Backup Preparations
Check free capacity on the backup destination (SAN LUN, NAS share, S3 bucket, Azure Blob, or Veeam repository) against the size of the last full backup plus expected log growth. A common gotcha: the repository has room but the staging volume the backup tool writes to first does not, so check both.
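As a rough illustration of the capacity check, the sketch below compares free space with the last full backup size plus expected log growth. The function names and the 20% headroom factor are assumptions for this example, not part of any backup tool. `shutil.disk_usage` only works for locally mounted paths, so object stores such as S3 or Azure Blob would need their own quota check.

```python
import shutil


def has_capacity(free_bytes, last_full_bytes, expected_log_growth_bytes, headroom=1.2):
    """Return True if free space covers last full + log growth, scaled by headroom.

    headroom=1.2 (20% safety margin) is an illustrative default, not a standard.
    """
    required = (last_full_bytes + expected_log_growth_bytes) * headroom
    return free_bytes >= required


def check_paths(paths, last_full_bytes, expected_log_growth_bytes):
    """Check every mounted path (for example the repository and the staging volume)."""
    return {
        path: has_capacity(
            shutil.disk_usage(path).free, last_full_bytes, expected_log_growth_bytes
        )
        for path in paths
    }
```

Passing both the repository mount and the staging mount to `check_paths` covers the "check both" case in one call.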
For SQL Server, run DBCC CHECKDB on the target databases or confirm the most recent run was clean. For PostgreSQL, check pg_stat_activity for long-running or idle-in-transaction sessions, which can hold locks that stall pg_dump and prevent vacuum from reclaiming space during the backup. For MySQL/InnoDB, check the TRANSACTIONS section of SHOW ENGINE INNODB STATUS for long-running or abandoned transactions.
Post in the #db-ops channel and email app owners with the planned start, expected end, and any application impact (read-only, brief I/O latency, none). Routine nightly jobs do not need a notice; ad-hoc or extended-window backups do.
Index rebuilds, statistics updates, ETL loads, and replication reseeds can collide with the backup window. Review the SQL Agent / cron / Airflow schedule for the host and any downstream replicas. Patch Tuesday and quarterly DR tests are common collision points.
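The schedule review above amounts to an interval-overlap check. A minimal sketch, assuming the job windows have already been exported from SQL Agent, cron, or Airflow into a dictionary of start and end times (the job names and data shape are hypothetical):

```python
from datetime import datetime


def overlaps(start_a, end_a, start_b, end_b):
    """Two half-open windows [start, end) overlap if each starts before the other ends."""
    return start_a < end_b and start_b < end_a


def find_collisions(backup_window, jobs):
    """Return the names of scheduled jobs whose window overlaps the backup window.

    backup_window: (start, end) datetimes
    jobs: {job_name: (start, end)} for the host and its downstream replicas
    """
    b_start, b_end = backup_window
    return [
        name
        for name, (start, end) in jobs.items()
        if overlaps(b_start, b_end, start, end)
    ]
```

Windows that merely touch (one ends exactly when the other starts) are not reported as collisions; widen the backup window by a safety margin if job durations are uncertain.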
Open the active job definition in Veeam, Commvault, Rubrik, or your scripted pg_dump/mysqldump wrapper. Confirm that the schedule and retention match the documented RPO and retention policy, and that the immutable / object-lock copy is still configured: a ransomware-resilient backup requires that the offsite copy cannot be modified or deleted using production credentials.
Backup Execution
Trigger the configured job in the backup tool (Veeam, Commvault, Rubrik, native SQL Agent, pgBackRest, mysqldump). For ad-hoc runs, use the documented runbook command — do not invent flags at the prompt. Capture the job ID for the audit trail.
Watch the tool's live job view for read/write MB/s, retry counts, and warnings. A throughput drop usually means the source disk is under contention or the network path to the repo is saturated. Page on errors; warnings get noted in the run log.
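If the throughput figures can be exported as periodic samples, a drop can be flagged automatically rather than by eye. The sketch below is one possible heuristic, not a feature of any listed tool: it compares the median of the most recent samples with the median of everything before them. The window size and the 50% threshold are assumed values to tune per environment.

```python
from statistics import median


def throughput_dropped(samples_mb_s, window=5, threshold=0.5):
    """Return True if recent throughput fell below threshold * baseline.

    samples_mb_s: MB/s readings in chronological order
    window: number of most recent samples treated as "recent" (assumed default)
    threshold: fraction of the baseline below which a drop is flagged (assumed default)
    """
    if len(samples_mb_s) <= window:
        return False  # not enough history to form a baseline
    baseline = median(samples_mb_s[:-window])
    recent = median(samples_mb_s[-window:])
    return baseline > 0 and recent < baseline * threshold
```

Using medians keeps a single slow sample from triggering a page.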
Full backups alone do not meet a sub-day RPO. Verify the log-chain job is also running on its schedule (SQL Server transaction log backups, PostgreSQL WAL archiving via archive_command, MySQL binlog shipping). A broken log chain is silent until restore day.
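For SQL Server, one way to detect a broken chain before restore day is to confirm that each log backup starts at the LSN where the previous one ended. The sketch below assumes the (first_lsn, last_lsn) pairs for log backups have already been read from msdb.dbo.backupset; the function name and input shape are illustrative.

```python
def find_chain_breaks(log_backups):
    """Return (expected_first_lsn, actual_first_lsn) for every gap in the chain.

    log_backups: iterable of (first_lsn, last_lsn) pairs for one database,
    assumed to come from log-type rows in msdb.dbo.backupset.
    """
    ordered = sorted(log_backups, key=lambda backup: backup[0])
    breaks = []
    for previous, current in zip(ordered, ordered[1:]):
        # In an unbroken chain, each log backup begins where the last one ended.
        if current[0] != previous[1]:
            breaks.append((previous[1], current[0]))
    return breaks
```

An empty result means no gap was found among the supplied rows; it does not prove the chain connects back to the last full backup, which needs a separate check.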
Confirm the backup file or chunk set appears in the primary repo, the secondary copy, and the immutable / offsite tier (3-2-1: 3 copies, 2 media, 1 offsite). Spot-check file size against the prior night's run; a 10x size delta usually means a misconfigured filter or full vs. incremental confusion.
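The size spot-check can be expressed as a ratio test. This is a minimal sketch: the 10x factor comes from the rule of thumb above, while the function name and the choice to flag empty or missing backups are assumptions.

```python
def size_delta_suspicious(current_bytes, previous_bytes, factor=10.0):
    """Return True if tonight's backup is at least `factor` times larger or smaller.

    A zero or negative size on either side is also flagged, since an empty
    backup is never an expected result.
    """
    if current_bytes <= 0 or previous_bytes <= 0:
        return True
    ratio = current_bytes / previous_bytes
    return ratio >= factor or ratio <= 1 / factor
```

Compare like with like: a full backup should be checked against the previous full, and an incremental against the previous incremental, otherwise the check flags the normal full vs. incremental difference.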
Log start time, end time, total bytes, dedup ratio, and the final status code in the run sheet or PSA ticket. This timing data is what feeds the next quarter's window-sizing decision.
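Capturing these fields in a consistent structure makes the later window-sizing analysis easier. The sketch below shows one possible record shape; the class and field names are hypothetical and are not tied to any run sheet or PSA schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime


@dataclass
class BackupRun:
    job_id: str
    start: datetime
    end: datetime
    total_bytes: int
    dedup_ratio: float
    status_code: int

    @property
    def duration_seconds(self):
        """Wall-clock duration of the run, the input to window sizing."""
        return (self.end - self.start).total_seconds()

    def to_json(self):
        """Serialize the record with ISO 8601 timestamps for the run sheet or ticket."""
        record = asdict(self)
        record["start"] = self.start.isoformat()
        record["end"] = self.end.isoformat()
        record["duration_seconds"] = self.duration_seconds
        return json.dumps(record, sort_keys=True)
```

The JSON string can be pasted into a ticket or appended to a log file, one line per run.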
Post-Backup Verification
Run RESTORE VERIFYONLY for SQL Server (page checksums are validated only if the backup was taken WITH CHECKSUM), pg_verifybackup for PostgreSQL base backups that include a backup manifest (PostgreSQL 13 or later), or the equivalent verification in your backup tool, such as SureBackup in Veeam or Live Mount in Rubrik. This catches checksum corruption that a successful job-status code can hide.
Restore the most recent backup into the DR lab VLAN, mount the database, and run a smoke query (row count on a known table, latest timestamp on the audit log). This is the step that catches the silent failures: a rotated credential the script depends on, a vendor format change, an encryption key the team no longer holds. Cadence: at least quarterly per the DR policy.
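The smoke query can be scripted so every drill runs the same checks. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for the restored database; against SQL Server, PostgreSQL, or MySQL the same two queries would run through that engine's own driver. The table and column names are hypothetical.

```python
import sqlite3


def smoke_check(conn, table, audit_table, ts_column):
    """Return (row_count, latest_timestamp) from a restored database.

    Identifiers are interpolated directly, so they must come from the
    runbook, never from user input.
    """
    cur = conn.cursor()
    row_count = cur.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    latest = cur.execute(f"SELECT MAX({ts_column}) FROM {audit_table}").fetchone()[0]
    return row_count, latest


if __name__ == "__main__":
    # In-memory database standing in for the restored copy in the DR lab.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY)")
    demo.execute("CREATE TABLE audit_log (event_time TEXT)")
    demo.executemany("INSERT INTO customers (id) VALUES (?)", [(1,), (2,), (3,)])
    demo.execute("INSERT INTO audit_log VALUES ('2024-01-01T02:00:00')")
    print(smoke_check(demo, "customers", "audit_log", "event_time"))
```

Compare the returned row count and timestamp with values recorded from production at backup time; a restore that mounts but returns stale or empty data should still be treated as a failed drill.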
A failed restore drill is a P2: the backups are not proven usable. Open a ticket in ServiceNow / Jira Service Management / ConnectWise PSA, page the on-call DBA, and do not close this run until a successful restore is demonstrated against an alternate backup point.
Skim the job log for VSS writer failures, snapshot quiesce timeouts, deduplication errors, or skipped objects. Warnings that recur across runs become tomorrow's failed restore, so file a low-priority ticket rather than letting them accumulate.
Record the run in IT Glue / Hudu / Confluence with backup set ID, retention expiry, restore-drill date, and any deviations. This is the artifact a SOC 2 or HIPAA auditor asks for; without it, the controls are not demonstrable.
Close the loop with app owners and the on-call rotation. For MSP-managed clients, push the result into the monthly report or quarterly business review (QBR) so the customer has visible evidence the RPO/RTO commitments are being met.
