Business Continuity Plan Checklist

Annual BCP/DR review run by IT operations to validate that critical systems, recovery objectives, and incident playbooks still match the business. Covers BIA refresh, DR testing, incident response, alternate-site readiness, and post-test remediation.

6 sections, 22 steps, collects data

Business Impact Analysis

Refresh the critical systems inventory
- Pull the current asset list from the RMM (NinjaOne, Datto, ConnectWise Automate) and reconcile it against IT Glue or Hudu. Tag each system Tier 0 (identity, DNS, DHCP, AD/Entra), Tier 1 (line-of-business apps), or Tier 2 (supporting). Stale inventories are a common reason DR tests fail: a system nobody uses anymore gets restored while a new SaaS dependency goes unrecovered.
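The reconciliation step can be sketched as a set comparison. This is a minimal illustration, not an RMM or IT Glue integration: the function name `reconcile_inventory` and the hostnames are hypothetical, and it assumes both tools can export a plain list of hostnames.

```python
def reconcile_inventory(rmm_assets, documented_assets):
    """Return assets present in one source but missing from the other."""
    # Normalize case and whitespace so "DC01" and "dc01 " compare equal.
    rmm = {name.strip().lower() for name in rmm_assets}
    docs = {name.strip().lower() for name in documented_assets}
    return {
        "undocumented": sorted(rmm - docs),  # in the RMM, missing from the docs
        "stale": sorted(docs - rmm),         # documented, no longer in the RMM
    }


# Hypothetical exports from the RMM and the documentation platform.
result = reconcile_inventory(
    ["DC01", "FS01", "sql01"],
    ["dc01", "fs01", "legacy-erp"],
)
print(result)
```

Anything in `undocumented` needs a tier assigned; anything in `stale` is a candidate for removal from the recovery plan.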
Set RTO and RPO per Tier 0 and Tier 1 system
- Work with system owners to confirm the Recovery Time Objective (RTO) and Recovery Point Objective (RPO) for each Tier 0 and Tier 1 system. Flag any RTO or RPO that the current backup posture cannot meet for remediation rather than quietly accepting it.
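One rough way to test whether the backup posture can meet an objective is simple arithmetic: worst-case data loss is about one backup interval, and restore time is about data volume divided by sustained restore throughput. The sketch below assumes those two simplifications; the function names and all figures are hypothetical, and a measured drill result should replace the estimate once one exists.

```python
def estimated_restore_minutes(data_gb, throughput_mb_per_s):
    """Rough restore time: data volume divided by sustained restore throughput."""
    seconds = (data_gb * 1024) / throughput_mb_per_s
    return seconds / 60


def unmet_objectives(rto_min, rpo_min, backup_interval_min, data_gb, throughput_mb_per_s):
    """Return which objectives ("RPO", "RTO") the current posture cannot meet."""
    gaps = []
    # Worst case, everything written since the last backup is lost.
    if backup_interval_min > rpo_min:
        gaps.append("RPO")
    if estimated_restore_minutes(data_gb, throughput_mb_per_s) > rto_min:
        gaps.append("RTO")
    return gaps


# Hypothetical figures: a 2 TB database restored at 200 MB/s, backed up hourly.
erp_gaps = unmet_objectives(rto_min=120, rpo_min=240, backup_interval_min=60,
                            data_gb=2000, throughput_mb_per_s=200)
# Hypothetical figures: a 50 GB domain controller backed up once a day.
ad_gaps = unmet_objectives(rto_min=60, rpo_min=240, backup_interval_min=1440,
                           data_gb=50, throughput_mb_per_s=200)
print(erp_gaps, ad_gaps)
```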
Map upstream and downstream dependencies
- Document SaaS, API, and on-prem dependencies for each critical app — SSO (Entra ID, Okta), DNS, certificate authority, payment processor, MX provider. A DR plan that restores the app but not its identity provider doesn't restore service.
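Once dependencies are written down, a restore order falls out of a topological sort: every service is restored only after the services it depends on. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (3.9+); the service names and the dependency map are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key depends on the services in its set.
dependencies = {
    "erp-app": {"entra-id", "dns", "erp-db"},
    "erp-db": {"dns"},
    "entra-id": {"dns"},
    "dns": set(),
}

# static_order() yields each service after all of its dependencies.
restore_order = list(TopologicalSorter(dependencies).static_order())
print(restore_order)
```

A circular dependency raises `graphlib.CycleError`, which is itself a finding worth recording in the plan.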
Identify single points of failure
- Look for the classics: one DC running both DNS and DHCP, a single hypervisor host with no HA peer, an MFA service account with hardcoded credentials, a backup target writable from the production AD account.

Backup and Recovery Posture

Verify 3-2-1 with an immutable copy
- Confirm three copies on two media with one offsite, and that at least one copy is immutable — Veeam hardened repo, S3 Object Lock, write-once tape, or a separate cloud account the production AD cannot reach. Ransomware-encrypted backups are the failure mode this exists to prevent.
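The 3-2-1 rule plus immutability reduces to four checks over the list of copies. The following is a minimal sketch under stated assumptions: the `BackupCopy` fields and the example locations are hypothetical, and the production copy is counted as one of the three.

```python
from dataclasses import dataclass


@dataclass
class BackupCopy:
    location: str    # e.g. "production", "local-repo", "s3-bucket"
    media: str       # e.g. "disk", "object-storage", "tape"
    offsite: bool
    immutable: bool


def check_3_2_1(copies):
    """Return the 3-2-1-plus-immutability rules that the given copies fail."""
    failures = []
    if len(copies) < 3:
        failures.append("fewer than three copies")
    if len({c.media for c in copies}) < 2:
        failures.append("fewer than two media types")
    if not any(c.offsite for c in copies):
        failures.append("no offsite copy")
    if not any(c.immutable for c in copies):
        failures.append("no immutable copy")
    return failures


# Hypothetical posture: three copies, two media, one offsite, none immutable.
copies = [
    BackupCopy("production", "disk", offsite=False, immutable=False),
    BackupCopy("local-repo", "disk", offsite=False, immutable=False),
    BackupCopy("s3-bucket", "object-storage", offsite=True, immutable=False),
]
failures = check_3_2_1(copies)
print(failures)
```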
Open a remediation ticket for immutability gaps
- If any system lacks an immutable copy, file a P2 in the PSA (ConnectWise, Autotask, Jira Service Management) naming the affected systems, the proposed control (object lock, hardened repo, separate cloud account), and the target close date. Do not pass the BCP review while an immutability gap is open and undocumented.
Review backup job success rates over 90 days
- Pull the Veeam / Datto / Cohesity / Rubrik report. A green dashboard with a job that's been silently skipping a VM for six weeks is the canonical failure pattern.
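A job-level success rate hides per-VM gaps, so it can help to check the newest successful restore point for every VM that is supposed to be protected. This is a sketch only: `stale_vms`, the VM names, and the dates are hypothetical, and it assumes the backup product's report can be exported as a mapping of VM to last successful backup date.

```python
from datetime import date, timedelta


def stale_vms(expected_vms, last_success, today, max_age_days=1):
    """Return VMs with no successful restore point within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    stale = []
    for vm in sorted(expected_vms):
        latest = last_success.get(vm)  # None means the VM was never backed up
        if latest is None or latest < cutoff:
            stale.append(vm)
    return stale


# Hypothetical report data: fs01 last succeeded six weeks ago, sql01 never.
today = date(2024, 6, 30)
last_success = {"dc01": date(2024, 6, 30), "fs01": date(2024, 5, 19)}
flagged = stale_vms({"dc01", "fs01", "sql01"}, last_success, today)
print(flagged)
```

The expected-VM set should come from the refreshed inventory rather than from the backup product, otherwise a VM that was never added to a job stays invisible.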
Validate offsite replication lag
- Confirm that replication to the offsite or cloud target is keeping up with the RPO. Lag exceeding the stated RPO means the offsite copy is not what the BCP claims it is.

DR Test Execution

Schedule the isolated restore drill
- File the change request through CAB. Build the restore target in an isolated VLAN or sandbox tenant — never restore Tier 0 systems into production for a drill.
Restore a Tier 0 or Tier 1 system end-to-end
- Pick AD/Entra, the file server, or the LOB database. Walk the full path: locate backup, decrypt, restore to isolated environment, validate application start, validate user authentication, validate data integrity against a known checkpoint.
Record the actual recovery time and data loss

File a P1 remediation plan for missed RTO or RPO
- If the drill missed RTO or RPO, the BCP is wrong — either the objective or the architecture has to change. Open the remediation ticket with named owner, target date, and the architecture change required (warm standby, additional replica, faster restore tier).
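Comparing the drill's measured numbers with the stated objectives gives the text for the remediation ticket. The sketch below is illustrative: `drill_findings`, the system name, and the figures are hypothetical, and all values are in minutes.

```python
def drill_findings(system, rto_min, rpo_min, actual_recovery_min, actual_data_loss_min):
    """Compare measured drill results with the stated objectives."""
    findings = []
    if actual_recovery_min > rto_min:
        findings.append(f"{system}: RTO missed by {actual_recovery_min - rto_min} min")
    if actual_data_loss_min > rpo_min:
        findings.append(f"{system}: RPO missed by {actual_data_loss_min - rpo_min} min")
    return findings


# Hypothetical drill: recovery took 310 minutes against a 240-minute RTO.
findings = drill_findings("file-server", rto_min=240, rpo_min=60,
                          actual_recovery_min=310, actual_data_loss_min=45)
print(findings)
```

An empty list means the drill met both objectives and no P1 is needed for that system.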

Incident Response Readiness

Update the on-call rotation in PagerDuty
- Confirm primary, secondary, and escalation tiers in PagerDuty / Opsgenie / xMatters match current staffing. Test that a synthetic alert pages the right person.
Refresh the incident commander runbook
- Update the SEV1 runbook in IT Glue / Hudu / Confluence: who declares, bridge call number, status page owner, executive notification, legal/PR triggers, customer communication template.
Verify the break-glass account works
- Test the emergency-access account in Entra ID / Okta. Confirm the credentials are sealed in two physical locations, the account's MFA exclusion matches policy, and sign-in is monitored. A real outage is the worst time to discover that the break-glass account is locked out.
Run a tabletop exercise with the IR team
- Walk through a ransomware scenario or a Tier 0 outage with the named IR team. Capture decision points where the runbook was unclear; those become the post-tabletop edits.

Alternate Site and Workforce Continuity

Confirm VPN and ZTNA capacity for full remote
- Verify the FortiGate / Palo Alto / Meraki concentrator (or Cloudflare / Zscaler ZTNA) can sustain 100% of staff remote. License headroom and tunnel limits are common surprise constraints.
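The headroom check is basic arithmetic against two limits, licensed users and concurrent tunnels. This is a minimal sketch with hypothetical numbers and a hypothetical `remote_capacity_gaps` helper; real limits come from the vendor's licensing and platform datasheet, and bandwidth is not modeled here.

```python
def remote_capacity_gaps(headcount, licensed_users, max_tunnels, devices_per_user=1):
    """Return the shortfall per constraint when all staff connect remotely."""
    gaps = {}
    if licensed_users < headcount:
        gaps["licenses"] = headcount - licensed_users
    tunnels_needed = headcount * devices_per_user
    if max_tunnels < tunnels_needed:
        gaps["tunnels"] = tunnels_needed - max_tunnels
    return gaps


# Hypothetical figures: 250 staff, 200 licensed users, 500 concurrent tunnels.
gaps = remote_capacity_gaps(headcount=250, licensed_users=200, max_tunnels=500)
print(gaps)
```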
Validate the alternate communication channel
- If M365 / Teams is the primary channel, the BCP needs an out-of-band fallback — Signal group, personal-email tree, mass-notification service (Everbridge, AlertMedia). Test it; don't just document it.
Review vendor and SaaS escalation contacts
- Update the after-hours phone numbers and support tier for the ISP, M365, the EDR vendor (CrowdStrike, SentinelOne), and the backup vendor. A general support queue at 2am is not an escalation path.

Plan Sign-Off and Maintenance

Capture lessons learned from the drill
- Run a blameless retro covering what worked, what didn't, and which runbook steps were ambiguous. Feed each item into the IT Glue / Hudu BCP doc as a tracked edit.
Schedule the next quarterly restore drill
Director of IT signs off on the BCP


Category Systems Administration
