Firewall Configuration Checklist

Initial Setup and Hardening

    Document whether this is a perimeter firewall, internal segmentation firewall, or branch SD-WAN edge. Confirm the upstream ISP handoff, downstream VLANs, and any HA pairing (active/passive on FortiGate, HA on Palo Alto, warm spare on Meraki MX). Topology assumptions get baked into rule design, so a wrong assumption here cascades through every later step.

    Check the vendor's recommended release train (FortiOS Mature, PAN-OS preferred, Meraki stable) rather than a bleeding-edge or unsupported build. Read the release notes for known issues affecting your feature set (SSL inspection, IPsec, BGP). Record the installed version in the configuration record.

    Replace any vendor default account (admin/admin, cisco/cisco) with named admin accounts tied to the identity provider where possible — TACACS+ or RADIUS bound to Entra ID / Okta. Store the local break-glass credential in the password vault (1Password, Keeper, Hudu) with named owners and a rotation schedule.

    Disable HTTP, Telnet, and SNMPv1/v2c on every interface. Restrict HTTPS and SSH management to a dedicated mgmt VLAN or jump host subnet. Enforce MFA on the admin GUI where the platform supports it (FortiToken, Duo, Okta integration). Confirm idle-timeout and admin lockout thresholds are set.
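
    The management-plane rules above can be checked mechanically once the per-interface service list is exported. The sketch below is a minimal illustration, assuming a hypothetical inventory format (interface name mapped to a set of enabled services); the service names and the `audit_mgmt_services` helper are not any vendor's API.

```python
# Hypothetical inventory format: interface name -> set of enabled management services.
FORBIDDEN = {"http", "telnet", "snmpv1", "snmpv2c"}
RESTRICTED = {"https", "ssh"}  # acceptable only on dedicated management interfaces


def audit_mgmt_services(interfaces, mgmt_interfaces):
    """Return a sorted list of (interface, service, reason) findings."""
    findings = []
    for name, services in interfaces.items():
        for svc in sorted(s.lower() for s in services):
            if svc in FORBIDDEN:
                findings.append((name, svc, "insecure protocol enabled"))
            elif svc in RESTRICTED and name not in mgmt_interfaces:
                findings.append((name, svc, "management service outside mgmt interface"))
    return sorted(findings)


if __name__ == "__main__":
    sample = {
        "wan1": {"https", "ping"},
        "lan": {"ssh", "snmpv2c"},
        "mgmt": {"https", "ssh"},
    }
    for finding in audit_mgmt_services(sample, mgmt_interfaces={"mgmt"}):
        print(finding)
```

    An empty result means the exported service list matches the policy; MFA, idle-timeout, and lockout settings still need a manual check.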

    Export the running config to your documentation system (IT Glue, Hudu, Confluence) and your offline backup. Encrypted exports require the decryption passphrase be stored separately — losing the passphrase makes the backup useless during a recovery.

Access Control Policies

    Build rules from explicit allow lists with a default-deny at the bottom. Tag every rule with a description, owner, and ticket reference — undocumented rules are the first thing a future auditor flags. Application-aware platforms (Palo Alto App-ID, FortiGate Application Control) should use app identifiers rather than raw ports where possible.
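
    One way to enforce the tagging and default-deny conventions is to lint an exported rulebase before each audit. This is a sketch under stated assumptions: the `Rule` fields and the ticket format are hypothetical and would need mapping to whatever the platform's export actually contains.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    action: str          # "allow" or "deny"
    source: str
    destination: str
    description: str = ""
    owner: str = ""
    ticket: str = ""


def lint_rulebase(rules):
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    for rule in rules:
        for field in ("description", "owner", "ticket"):
            if not getattr(rule, field).strip():
                problems.append(f"{rule.name}: missing {field}")
    last = rules[-1] if rules else None
    ends_with_deny = (
        last is not None
        and last.action == "deny"
        and last.source == "any"
        and last.destination == "any"
    )
    if not ends_with_deny:
        problems.append("rulebase does not end with an explicit deny any/any")
    return problems
```

    Platforms with an implicit deny may not list it in the export, in which case the final check would need adjusting rather than satisfying with a redundant rule.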

    Limit SSH/HTTPS-mgmt source addresses to the admin jump-host subnet, the MSP's NOC range, or a ZTNA-fronted endpoint. Never expose the management interface to the public internet — Shodan finds it within hours and credential-spray attempts start the same day.
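
    The source restriction can be verified by comparing the configured trusted-host entries against the approved ranges. The sketch below uses Python's standard `ipaddress` module; the two allowed networks are placeholder values (a private subnet and a documentation prefix), not real ranges.

```python
import ipaddress

# Placeholder allow list: a jump-host subnet and an MSP NOC range.
ALLOWED_MGMT_SOURCES = [
    ipaddress.ip_network("10.20.30.0/28"),
    ipaddress.ip_network("198.51.100.0/29"),
]


def offending_sources(configured_sources, allowed=ALLOWED_MGMT_SOURCES):
    """Return configured source networks not fully contained in an allowed network."""
    bad = []
    for text in configured_sources:
        net = ipaddress.ip_network(text, strict=False)
        # subnet_of() raises TypeError across IP versions, so compare versions first.
        if not any(net.version == a.version and net.subnet_of(a) for a in allowed):
            bad.append(text)
    return bad


if __name__ == "__main__":
    print(offending_sources(["10.20.30.4/32", "0.0.0.0/0"]))
```

    Any entry returned here is wider than the approved ranges and is a candidate for tightening.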

    Turn on session logging at start and end for every allow rule, plus the implicit deny. Without hit counts you cannot identify dead rules during the next audit, and without deny logging you have no signal for active reconnaissance.

    Backup, monitoring, and config-management integrations (RMM, SIEM, vuln scanner) get scoped read-only or feature-specific roles — not super-admin. Document each integration's account, role, and the system that uses it.

    Pull the rule-hit report and disable any rule with zero hits over the last 90 days. Disable before delete — leave the rule in place but inactive for one cycle so you can re-enable quickly if a quarterly job or a forgotten integration needs it.
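
    The 90-day review can be scripted against the rule-hit report. This sketch assumes a hypothetical record layout (`name`, `enabled`, `created`, `last_hit`) and adds one assumption not stated above: rules younger than the window are skipped, since they have not had a full cycle to accumulate hits.

```python
from datetime import date, timedelta


def stale_rules(rules, today, window_days=90):
    """Return names of enabled rules with no hits inside the window.

    `rules` is a list of dicts with hypothetical keys: name, enabled,
    created (date), last_hit (date or None).
    """
    cutoff = today - timedelta(days=window_days)
    stale = []
    for rule in rules:
        if not rule["enabled"]:
            continue  # already disabled, nothing to do
        if rule["created"] > cutoff:
            continue  # assumption: too new to judge over a full window
        if rule["last_hit"] is None or rule["last_hit"] < cutoff:
            stale.append(rule["name"])
    return stale


if __name__ == "__main__":
    report = [
        {"name": "legacy-ftp", "enabled": True,
         "created": date(2023, 1, 10), "last_hit": date(2024, 1, 15)},
        {"name": "web-in", "enabled": True,
         "created": date(2023, 1, 10), "last_hit": date(2024, 6, 29)},
    ]
    print(stale_rules(report, today=date(2024, 6, 30)))
```

    The output is a list of candidates to disable, not delete, in line with the one-cycle holding period described above.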

Network Segmentation

    Standard zoning: untrust (internet), DMZ (public services), trust (corporate LAN), server (datacenter), guest, IoT/OT, and PCI/CDE if cardholder data applies. Each zone gets explicit policies — flat networks are how a single compromised endpoint becomes a domain-wide incident.

    Trunk VLANs from the access switches into the firewall on tagged subinterfaces. Confirm 802.1Q tagging matches the switch side — mismatched native VLANs are a common cause of phantom connectivity issues that look like firewall problems.
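
    A quick way to catch tagging mismatches is to diff the trunk definition on the switch port against the firewall's subinterfaces. The sketch below assumes both sides have been reduced to a hypothetical dict with a `native` VLAN ID and a set of `tagged` VLAN IDs; extracting that from each device is left out.

```python
def compare_trunk(switch_side, firewall_side):
    """Compare two trunk definitions and return a list of mismatches.

    Each side is a dict with hypothetical keys: native (int) and tagged (set of ints).
    """
    issues = []
    if switch_side["native"] != firewall_side["native"]:
        issues.append(
            f"native VLAN mismatch: switch {switch_side['native']} "
            f"vs firewall {firewall_side['native']}"
        )
    only_switch = sorted(switch_side["tagged"] - firewall_side["tagged"])
    only_firewall = sorted(firewall_side["tagged"] - switch_side["tagged"])
    if only_switch:
        issues.append(f"tagged on switch only: {only_switch}")
    if only_firewall:
        issues.append(f"tagged on firewall only: {only_firewall}")
    return issues


if __name__ == "__main__":
    switch = {"native": 1, "tagged": {10, 20, 30}}
    firewall = {"native": 99, "tagged": {10, 20, 40}}
    for issue in compare_trunk(switch, firewall):
        print(issue)
```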

    Default-deny between zones. Open only the protocols each business flow actually needs — AD replication, SQL, SMB to the file server, RDP to jump hosts. Document the source application or owner for every cross-zone allow.

    If the environment processes cardholder data, run a segmentation-validation scan from each non-CDE zone against the CDE. Any reachable host outside expected ports expands PCI scope and triggers QSA findings. Skip this step if the org is not in PCI scope.
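
    Once the scan results are in, the comparison against expected ports is simple set arithmetic. This sketch assumes the scanner output has already been parsed into a hypothetical mapping of (source zone, CDE host) to open ports; the zone names and addresses in the example are illustrative.

```python
def segmentation_findings(scan_results, expected_ports):
    """Return sorted (source_zone, cde_host, port) tuples that were reachable but not expected.

    scan_results:   {(source_zone, cde_host): set of open TCP ports seen from that zone}
    expected_ports: {(source_zone, cde_host): set of ports documented as allowed}
    """
    findings = []
    for key, open_ports in scan_results.items():
        allowed = expected_ports.get(key, set())
        for port in sorted(open_ports - allowed):
            findings.append((key[0], key[1], port))
    return sorted(findings)


if __name__ == "__main__":
    scan = {
        ("trust", "10.50.0.10"): {443, 3389},
        ("guest", "10.50.0.10"): set(),
    }
    expected = {("trust", "10.50.0.10"): {443}}
    print(segmentation_findings(scan, expected))
```

    Each returned tuple is a path into the CDE that the documented policy does not account for and should be reviewed before the assessment.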

Monitoring and Logging

    Turn on traffic, threat, URL, and SSL-decryption logs at the level your platform supports. Confirm the logging disk has enough capacity for the expected event rate — log buffer overruns silently drop events and you find out only during incident response.
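
    Log disk sizing is straightforward arithmetic: event rate times average event size times the local retention period, plus headroom. The sketch below is a rough estimate only; the event rate, event size, and 30 percent headroom are illustrative assumptions that should be replaced with figures measured on the actual device.

```python
def log_storage_gib(events_per_second, avg_event_bytes, retention_days, headroom=0.3):
    """Estimate disk needed for local logs, in GiB, including fractional headroom."""
    seconds = retention_days * 86_400
    raw_bytes = events_per_second * avg_event_bytes * seconds
    return raw_bytes * (1 + headroom) / 2**30


if __name__ == "__main__":
    # Illustrative inputs: 500 events/s, 400 bytes per event, 30 days kept on the device.
    estimate = log_storage_gib(500, 400, 30)
    print(f"Estimated local log storage: {estimate:.0f} GiB")
```

    If the estimate exceeds the logging disk, shorten local retention or reduce log verbosity before enabling everything, rather than discovering dropped events later.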

    Send firewall syslog to Splunk, Microsoft Sentinel, QRadar, or whichever SIEM the org runs. Confirm the parser recognizes the vendor format, and validate that a test event lands in the index within the expected latency. Local-only logs are not auditable.
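
    A uniquely tagged test event makes the end-to-end check easy to search for in the SIEM. The sketch below uses Python's standard `logging.handlers.SysLogHandler` over UDP to send one marker message from a test host; it exercises the collector path only, so the firewall's own export and the vendor-format parser still need a real device event. The collector address and port are whatever the environment uses.

```python
import logging
import logging.handlers
import socket
import uuid


def send_test_event(host, port=514):
    """Send one uniquely tagged syslog message over UDP and return the marker."""
    marker = f"fw-siem-test-{uuid.uuid4()}"
    handler = logging.handlers.SysLogHandler(
        address=(host, port), socktype=socket.SOCK_DGRAM
    )
    logger = logging.getLogger("fw_siem_test")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    try:
        logger.info("pipeline check marker=%s", marker)
    finally:
        # Detach and close so repeated calls do not stack handlers.
        logger.removeHandler(handler)
        handler.close()
    return marker
```

    Search the SIEM for the returned marker and note how long it takes to appear; that interval is the ingestion latency to compare against the expected value.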

    Tune alerts for IPS-block bursts, repeated denied admin logins, VPN brute force, and outbound C2 destinations. Route high-severity alerts to PagerDuty / Opsgenie and everything else to a ticket queue. Untuned alerts cause fatigue, so start strict and relax thresholds based on observed noise.

    Log retention requirements vary by framework: PCI DSS requires one year of audit log history with the most recent three months immediately available; HIPAA defers to org policy but six years is common; SOC 2 typically aligns to one year. Set the SIEM retention plus archival storage to the longest applicable requirement.
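
    Picking the longest applicable requirement reduces to taking a maximum. The sketch below encodes the figures from the paragraph above in days; the HIPAA and SOC 2 values reflect common practice as described there, not fixed mandates, and the framework keys are hypothetical labels.

```python
# Retention in days, from the figures above. HIPAA and SOC 2 values are
# common practice rather than fixed mandates.
RETENTION_DAYS = {
    "pci_dss": 365,
    "hipaa": 6 * 365,
    "soc2": 365,
}


def required_retention_days(frameworks, org_minimum_days=0):
    """Return the longest retention, in days, across frameworks and org policy."""
    return max([org_minimum_days] + [RETENTION_DAYS[f] for f in frameworks])


if __name__ == "__main__":
    print(required_retention_days(["pci_dss", "hipaa"]))
```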

Maintenance and Change Management

    Push the candidate config to a lab device matching the production model and firmware. Validate inbound, outbound, and east-west flows for each affected zone. A virtual appliance (FortiGate-VM, PAN VM-Series) is acceptable when a hardware twin isn't available.

    If the lab validation fails, walk through each failure: misordered rule, NAT mismatch, missing route, broken SSL inspection. Fix in the lab, re-run the validation flows, and only then proceed to the RFC. Do not push known-broken changes to production under deadline pressure.

    File the change request in ServiceNow / Jira Service Management / ConnectWise with rule diff, blast-radius assessment, rollback plan, and notification list. Standard changes pre-approved in the catalog can skip CAB review; normal changes need CAB sign-off before the maintenance window.

    Execute the approved plan exactly as written. If you find yourself improvising mid-window, stop and roll back — out-of-script edits are how a routine change becomes a Sev-1. Capture a timestamped console log of every command issued.

    Run the validation checks from the RFC: synthetic transactions through each affected zone, VPN tunnel health, monitoring agent check-ins. Catch breakage inside the window so rollback stays in scope, not the next morning when users start logging tickets.
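
    Part of the post-change validation can be scripted as reachability checks that assert both what should work and what should stay blocked. This is a minimal sketch using TCP connects from the standard `socket` module; it is a simplification of full synthetic transactions, and the check labels, hosts, and ports are placeholders to be taken from the RFC.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_checks(checks, probe=tcp_reachable):
    """Run (label, host, port, expect_open) checks and return the labels that failed.

    `probe` is injectable so the logic can be exercised without touching the network.
    """
    failures = []
    for label, host, port, expect_open in checks:
        if probe(host, port) != expect_open:
            failures.append(label)
    return failures
```

    Running the same check list before the change gives a baseline, so any label that appears only afterward points at the change rather than a pre-existing fault.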

    If post-change validation fails, restore the pre-change config export and re-validate the same flows. File a post-incident review within five business days documenting why the lab test passed but production failed. Environmental drift between lab and prod is the most common cause and is worth fixing before the next change.

    Update IT Glue / Hudu / Confluence with the new rule set, diagram, and the change ticket reference. Re-export the configuration backup so the new state is the recovery baseline. Skipping this step is how next quarter's audit finds undocumented rules.
