Patch Management Checklist

Patch Identification & Prioritization

    Export the current month's advisories from your vulnerability scanner (Tenable, Qualys, Rapid7 InsightVM) and the Microsoft Patch Tuesday bulletin. Cross-reference third-party software advisories — Chrome, Firefox, Adobe, Java — that don't show up in the Microsoft feed.

    CVSS alone misses what's actually being exploited. Pull the CISA Known Exploited Vulnerabilities catalog and flag any CVE that appears there as P1 regardless of CVSS score. Also pull EPSS scores for anything CVSS 7.0+ to refine prioritization.

    Combine CVSS, KEV presence, EPSS, and asset exposure (internet-facing vs. internal) to produce a ranked list. KEV-listed vulnerabilities on internet-facing assets are emergency-cycle, not monthly-cycle.
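
    One way to make the ranking repeatable is a small scoring script. The sketch below is illustrative only: the `Finding` fields, the tier names, and the 0.10 EPSS cutoff are assumptions, not values taken from any scanner or from CISA. Tune them to your own policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    cve: str
    cvss: float            # CVSS base score, 0.0-10.0
    epss: float            # EPSS probability, 0.0-1.0
    in_kev: bool           # listed in the CISA KEV catalog
    internet_facing: bool  # asset exposure


# Hypothetical tier names; lower number sorts first.
TIER_ORDER = {"EMERGENCY": 0, "P1": 1, "P2": 2, "P3": 3}


def tier(f: Finding) -> str:
    """Map a finding to a priority tier (assumed policy, adjust as needed)."""
    if f.in_kev and f.internet_facing:
        return "EMERGENCY"          # emergency cycle, not monthly cycle
    if f.in_kev:
        return "P1"                 # KEV overrides CVSS
    if f.cvss >= 7.0 and (f.epss >= 0.10 or f.internet_facing):
        return "P2"                 # 0.10 is an assumed EPSS cutoff
    return "P3"


def rank(findings):
    """Sort by tier, then by EPSS and CVSS (highest first) within a tier."""
    return sorted(findings, key=lambda f: (TIER_ORDER[tier(f)], -f.epss, -f.cvss))
```

    Note that a CVSS 9.8 finding with low EPSS on an internal asset lands below a CVSS 5.0 finding that is KEV-listed, which is the behaviour the step above asks for.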

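    If a KEV-listed vulnerability affects production assets, page the on-call security engineer via PagerDuty and convene the emergency CAB. These vulnerabilities bypass the standard monthly cycle. CISA BOD 22-01 itself binds federal civilian agencies to the per-CVE due date published in the KEV catalog (by default two weeks for CVEs assigned from 2021 onward), so treat 72 hours as a stricter internal target: patch within it, or implement compensating controls and document the exception.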

    Identify patches that require reboots, prerequisite patches, and superseded patches. .NET and SQL Server cumulative updates frequently chain dependencies — missing a prerequisite leaves the system in a half-patched state that scanners flag inconsistently.
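
    Prerequisite chains can be ordered mechanically once they are written down. The sketch below uses the standard-library `graphlib` module (Python 3.9+). The patch IDs and the two input mappings are hypothetical; in practice you would build them from your patch tool's metadata.

```python
from graphlib import TopologicalSorter


def resolve(patch, superseded_by):
    """Follow the supersedence chain to the newest replacement patch."""
    seen = set()
    while patch in superseded_by and patch not in seen:
        seen.add(patch)
        patch = superseded_by[patch]
    return patch


def install_order(prereqs, superseded_by):
    """Return patches in an order where every prerequisite comes first.

    prereqs:       patch -> set of patches that must be installed before it
    superseded_by: old patch -> newer patch that replaces it
    """
    graph = {}
    for patch, deps in prereqs.items():
        if patch in superseded_by:
            continue  # a superseded patch is never installed directly
        # Point each dependency at whatever currently supersedes it.
        graph[patch] = {resolve(d, superseded_by) for d in deps} - {patch}
    # static_order() raises graphlib.CycleError if the chain is circular.
    return list(TopologicalSorter(graph).static_order())
```

    A `CycleError` here usually means the dependency data is wrong, which is worth catching before the maintenance window rather than during it.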

Change Management Approval

    Document scope, blast radius, rollback plan, and maintenance window in the PSA / ITSM tool (ServiceNow, Jira Service Management, ConnectWise Manage). Include the prioritized patch list as an attachment so CAB members aren't approving in the dark.

    Walk CAB through the impacted systems, the rollback runbook, and any known incompatibilities. Don't skip CAB on "small" patches: blast radius matters more than patch size, and a 1 KB config change can take prod down as fast as a kernel update.

    Send the standard 72-hour-ahead notice via the firm's comms channel (email, Slack, status page). Include start time, expected duration, affected services, and a contact for issues. Check for blackout windows — quarter-end finance close, payroll runs, scheduled customer demos.
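
    The notice deadline and the blackout check are simple date arithmetic, so they are easy to script. This is a minimal sketch: the function names and the shape of the `blackouts` mapping are assumptions, and it uses naive datetimes, so keep every value in the same time zone.

```python
from datetime import datetime, timedelta

NOTICE_LEAD = timedelta(hours=72)  # the standard notice period from the step above


def notice_deadline(window_start: datetime) -> datetime:
    """Latest time the maintenance notice can go out."""
    return window_start - NOTICE_LEAD


def overlaps_blackout(start: datetime, end: datetime, blackouts: dict) -> list:
    """Return the names of blackout periods that overlap [start, end).

    blackouts: name -> (blackout_start, blackout_end)
    """
    return [
        name
        for name, (b_start, b_end) in blackouts.items()
        if start < b_end and b_start < end  # standard interval-overlap test
    ]
```

    An empty list from `overlaps_blackout` means the window is clear; anything else is a reason to reschedule before the notice is sent.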

Staging & Pilot

    Push to the non-production ring via your patch tool (Intune, Action1, Automox, NinjaOne, SCCM/MECM). Mirror the production OS / application stack as closely as possible — staging that doesn't match prod produces false-clean test results.

    Validate that line-of-business applications still launch, authenticate, and complete a representative transaction. Pay attention to apps with kernel drivers, browser extensions, or legacy .NET dependencies — those are the usual breakage points.

    Roll out to 5-10% of endpoints — typically the IT team itself plus volunteer power users from each department. Monitor EDR (CrowdStrike, SentinelOne, Defender) for new alerts and the helpdesk queue for tickets attributable to the patch.

    If the pilot surfaces failures, open a ticket for each failure mode, isolate the offending patch (or interaction), and decide: defer the patch, deploy with a workaround, or escalate to vendor support. Document everything in the RFC. Production rollout doesn't proceed until the pilot is clean or exceptions are signed off.

Production Deployment

    Confirm last night's backups completed in Veeam / Datto / Rubrik / AWS Backup, and take pre-patch snapshots of critical VMs. Patching without a known-good restore point turns a bad patch into an outage.

    Stage the rollout in waves — workstations first, non-critical servers next, then production servers in the approved maintenance window. Domain controllers and database servers go last and one-at-a-time, never in parallel.
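
    The wave order above can be expressed as a batch plan, where hosts in the same batch may be patched in parallel. The role names and wave numbers below are assumptions made for illustration; map them to whatever grouping your patch tool uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    role: str  # one of the keys in WAVE_OF_ROLE


# Hypothetical role names; a lower wave number patches earlier.
WAVE_OF_ROLE = {
    "workstation": 1,
    "server": 2,             # non-critical servers
    "prod-server": 3,
    "domain-controller": 4,  # last, one at a time
    "database": 4,           # last, one at a time
}
SERIAL_ROLES = {"domain-controller", "database"}


def plan_batches(hosts):
    """Return a list of batches; each batch is a list of host names."""
    batches = []
    for wave in sorted(set(WAVE_OF_ROLE.values())):
        members = sorted(
            (h for h in hosts if WAVE_OF_ROLE[h.role] == wave),
            key=lambda h: h.name,
        )
        parallel = [h.name for h in members if h.role not in SERIAL_ROLES]
        if parallel:
            batches.append(parallel)
        # Serial roles get a batch of one each, so they never run in parallel.
        batches.extend([h.name] for h in members if h.role in SERIAL_ROLES)
    return batches
```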

    Watch the RMM dashboard (NinjaOne, Datto RMM, Kaseya) for failed installs and the EDR console for new alerts. Set a threshold — e.g., 5% install failure rate or any P1 alert — that pauses the rollout automatically.
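
    The pause rule is worth writing down as code so nobody has to argue about it mid-rollout. A minimal sketch, assuming you can pull install counts and P1 alert counts from the RMM and EDR consoles; the function name and parameters are hypothetical, and it treats a failure rate at or above the threshold as a breach.

```python
def should_pause(attempted: int, failed: int, p1_alerts: int,
                 max_failure_rate: float = 0.05) -> bool:
    """Decide whether to pause the rollout.

    Pauses on any P1 alert, or when the install failure rate reaches
    max_failure_rate (5% by default, matching the example threshold).
    """
    if p1_alerts > 0:
        return True
    if attempted == 0:
        return False  # nothing attempted yet, so no failure rate to judge
    return failed / attempted >= max_failure_rate
```

    With very small waves a single failure can cross 5%, so you may want a minimum sample size before the rate check applies.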

    If the rollout breaches the pause threshold or breaks a production service, trigger uninstall via the patch tool or restore from the pre-patch snapshot. Page the incident commander, open the war room channel, and declare a P1 or P2 incident depending on user impact. The rollback runbook drafted in the RFC is the source of truth, so don't improvise.

Post-Patch Validation

    Run a Tenable / Qualys / Rapid7 authenticated scan and confirm the targeted CVEs no longer appear. Unauthenticated scans miss patch-state on most modern OSes — always run authenticated.
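
    Confirming that the targeted CVEs are gone is a set comparison against the post-patch scan export. The sketch below assumes you have already parsed the export into a host-to-CVE mapping; the function name and the input shape are hypothetical, not any scanner's API.

```python
def unresolved(targeted: set, scan_findings: dict) -> dict:
    """Return hosts that still report any CVE this cycle was meant to fix.

    targeted:      CVE IDs the patch cycle was supposed to remediate
    scan_findings: host -> set of CVE IDs detected by the authenticated scan
    """
    return {
        host: sorted(cves & targeted)
        for host, cves in scan_findings.items()
        if cves & targeted
    }
```

    An empty result means every targeted CVE cleared. Findings outside the targeted set are ignored here on purpose, since they belong in next cycle's prioritization rather than this cycle's validation.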

    Check Datadog / New Relic / Grafana dashboards for SLI regressions — latency, error rate, throughput. Validate authentication, email flow, file shares, VPN, and any line-of-business services covered by SLA.
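
    A regression check is easier to trust when "regression" has a number attached. This sketch compares pre-patch and post-patch SLI values and flags anything that got worse by more than a tolerance. The metric names, the 10% tolerance, and the function itself are assumptions; feed it values exported from whichever dashboard you use.

```python
def regressions(baseline: dict, current: dict, tolerance: float = 0.10,
                higher_is_better=("throughput",)) -> dict:
    """Return metric -> relative change for SLIs that regressed.

    baseline / current: metric name -> value before / after patching
    tolerance:          allowed relative worsening (0.10 = 10%, assumed)
    higher_is_better:   metrics where a drop, not a rise, is a regression
    """
    flagged = {}
    for name, before in baseline.items():
        if before == 0:
            continue  # relative change is undefined; review these by hand
        change = (current[name] - before) / before
        worsening = -change if name in higher_is_better else change
        if worsening > tolerance:
            flagged[name] = round(change, 3)
    return flagged
```

    Compare like-for-like windows (same weekday and hour) where you can, otherwise normal traffic patterns will show up as false regressions.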

    Export patch-completion reports for SOC 2, PCI-DSS, or HIPAA evidence. Vanta / Drata / Secureframe can pull this automatically if connectors are wired up; otherwise upload manually to the GRC platform with the cycle date in the filename.

    Edit the runbook in IT Glue / Hudu / Confluence with anything learned this cycle — new app incompatibilities, vendors that need extra notice, dependency chains that bit you. Next month's run benefits from this month's discoveries.