Network Troubleshooting Checklist
Ticket Intake and Scope
Record what the user sees — "no internet," "slow Teams calls," "shared drive unreachable" — plus when it started and what changed recently (patch window, ISP work, office move). Vague tickets like "network is down" almost always narrow to one app, one subnet, or one VLAN once you ask.
Scope drives the next move. A single user is an endpoint problem; one VLAN or floor points at a switch or AP; a site-wide outage points at the firewall, ISP, or DNS. Confirm by asking a second user on the same subnet, or check PRTG / Auvik / Meraki dashboard for affected devices.
Site-wide or multi-site impact escalates to P1 — page the on-call engineer via PagerDuty / Opsgenie, post in the NOC channel, and start the incident timeline. Single-user issues stay at standard helpdesk priority.
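The scope-to-priority mapping above can be sketched as a small lookup. This is a minimal illustration, not a prescribed workflow: the `Scope` enum and `triage` function are hypothetical names, and treating a single VLAN or floor as standard priority is an assumption, since the checklist only fixes the priority for single-user and site-wide cases.

```python
from enum import Enum
from typing import Dict, List


class Scope(Enum):
    SINGLE_USER = "single user"
    SEGMENT = "one VLAN or floor"
    SITE_WIDE = "site-wide"
    MULTI_SITE = "multi-site"


# First suspects per scope, taken from the checklist text.
# Assumption: multi-site outages share the site-wide suspect list.
SUSPECTS: Dict[Scope, List[str]] = {
    Scope.SINGLE_USER: ["endpoint"],
    Scope.SEGMENT: ["switch", "AP"],
    Scope.SITE_WIDE: ["firewall", "ISP", "DNS"],
    Scope.MULTI_SITE: ["firewall", "ISP", "DNS"],
}


def triage(scope: Scope) -> dict:
    """Return the priority and first suspects for a confirmed scope."""
    is_p1 = scope in (Scope.SITE_WIDE, Scope.MULTI_SITE)
    return {
        # Assumption: anything below site-wide stays at standard priority.
        "priority": "P1" if is_p1 else "standard",
        "page_on_call": is_p1,
        "suspects": SUSPECTS[scope],
    }
```

For example, `triage(Scope.SITE_WIDE)` yields a P1 with on-call paging and the firewall, ISP, and DNS as first suspects.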
Physical and Endpoint Checks
Confirm the patch cable is seated at both ends, the switchport link LED is lit (what amber versus green means varies by vendor, so check the model's documentation), and PoE devices are drawing power. A surprising fraction of "network down" tickets are a kicked cable or a tripped PoE budget on an aging switch.
Run ipconfig /all (Windows), ifconfig (macOS), or ip addr (Linux). An APIPA address (169.254.x.x) means DHCP failed; skip ahead to the DHCP section. A correct lease with wrong DNS points at scope options or a static override on the NIC.
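The address check can be automated with Python's standard `ipaddress` module. This is a sketch under stated assumptions: `check_lease` and its parameters are hypothetical, and the expected DNS servers would come from your own DHCP scope options.

```python
import ipaddress
from typing import List

# APIPA (IPv4 link-local) range that clients self-assign when DHCP fails.
APIPA_NET = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(address: str) -> bool:
    """True if the address falls in 169.254.0.0/16."""
    return ipaddress.ip_address(address) in APIPA_NET


def check_lease(address: str, dns_servers: List[str], expected_dns: List[str]) -> str:
    """Classify an endpoint's IP configuration per the checklist step."""
    if is_apipa(address):
        return "DHCP failed: go to the DHCP section"
    if set(dns_servers) != set(expected_dns):
        return "Wrong DNS: check scope options or a static override on the NIC"
    return "Lease looks correct"
```

Calling `check_lease("169.254.10.5", [], ["10.0.0.2"])` returns the DHCP-failure message; the addresses here are illustrative only.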
Ping the gateway first to isolate LAN vs. WAN. If the gateway responds but 8.8.8.8 doesn't, the problem is upstream of the firewall (ISP, WAN circuit). If the gateway itself doesn't respond, it's a LAN-side switch, VLAN, or cable issue.
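A rough sketch of the gateway-then-internet test follows. The `ping` helper shells out to the system `ping` command and trusts its exit code, which is only an approximate signal on some platforms; `isolate` is a hypothetical name, and its third branch (both targets respond) is an assumption that goes beyond what the checklist states.

```python
import platform
import subprocess


def ping(host: str) -> bool:
    """Send one echo request using the system ping command."""
    # Windows uses -n for the count; macOS and Linux use -c.
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    result = subprocess.run(
        ["ping", count_flag, "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def isolate(gateway_ok: bool, external_ok: bool) -> str:
    """Map the two ping results to the LAN vs. WAN verdict."""
    if not gateway_ok:
        return "LAN side: switch, VLAN, or cable"
    if not external_ok:
        return "upstream: ISP or WAN circuit"
    # Assumption: with both paths up, move on to DNS or the application.
    return "LAN and WAN paths respond; look at DNS or the application"
```

In practice you would call something like `isolate(ping(gateway_ip), ping("8.8.8.8"))`, where `gateway_ip` is the default gateway read from the endpoint's configuration.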
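Use tracert (Windows) or traceroute (macOS/Linux) to the destination the user can't reach. The hop where latency spikes and stays high, or where replies stop for good, is your suspect; a single silent hop followed by normal replies is usually just a router that rate-limits ICMP. Asymmetric routing or an MPLS handoff is a common gotcha at the ISP boundary.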
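Picking the suspect hop from traceroute output can be sketched as below. The function takes one round-trip time per hop (with `None` for a hop that did not reply) rather than parsing raw output, and the 50 ms spike threshold is an arbitrary assumption to tune for your network.

```python
from typing import List, Optional


def suspect_hop(rtts_ms: List[Optional[float]], spike_ms: float = 50.0) -> Optional[int]:
    """Return the 1-based hop where replies stop or latency jumps, else None."""
    previous = None
    for hop, rtt in enumerate(rtts_ms, start=1):
        if rtt is None:
            # A silent hop is only a suspect if every later hop is silent too;
            # otherwise the router is probably just not answering probes.
            if all(later is None for later in rtts_ms[hop:]):
                return hop
            continue
        # Flag a jump of spike_ms or more over the last hop that replied.
        if previous is not None and rtt - previous >= spike_ms:
            return hop
        previous = rtt
    return None
```

A silent hop in the middle of an otherwise healthy path is skipped on purpose, since many routers deprioritize or drop traceroute probes.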
Network Device Diagnosis
Log into the upstream switch or firewall and pull its logs (Meraki dashboard event log, Cisco IOS show logging, FortiGate diagnose, Aruba show log). Look for err-disable, STP topology change, port flap, or duplex mismatch entries within the incident window.
Run show interface status and show interface counters errors on Cisco, or the equivalent on your platform. CRC errors point at a bad cable or NIC; input drops point at a microburst or saturated uplink; err-disabled ports often mean a port-security or BPDU guard violation.
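The counter-to-cause mapping can be written down as a small helper. This is an illustrative sketch: `interpret_port` is a hypothetical function, the inputs are assumed to be counter values you have already read from the switch, and comparing against zero is a simplification (counter deltas over the incident window are a better signal than lifetime totals).

```python
from typing import List


def interpret_port(crc_errors: int, input_drops: int, err_disabled: bool) -> List[str]:
    """Translate port state and error counters into likely causes."""
    findings = []
    if err_disabled:
        findings.append("err-disabled: check for a port-security violation")
    if crc_errors > 0:
        findings.append("CRC errors: suspect a bad cable or NIC")
    if input_drops > 0:
        findings.append("input drops: suspect a microburst or saturated uplink")
    return findings
```

An empty list means none of the three signals fired, so the port itself is probably not the problem.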
Confirm the access port is in the right VLAN and the upstream trunk carries it (show interface trunk). A native-VLAN mismatch across a trunk is a classic STP and broadcast-loop trigger.
Check show ip route for the destination prefix and confirm the active gateway peer (HSRP/VRRP) is the one you expect. A failover that didn't fail back is a common cause of intermittent connectivity after a maintenance window.
DNS and DHCP
Run nslookup or dig against the internal DNS server (DC, Windows Server DNS, BIND) and against an external resolver (1.1.1.1, 8.8.8.8, Quad9). If the internal server resolves internal names but not external ones, check its forwarders; if the external resolver answers but the internal server does not, the DC's DNS service is the suspect.
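The DNS decision can be sketched as a function over three lookup results. This reads the forwarder case as "the internal server fails to resolve external names," which is an interpretation of the step above; `diagnose_dns` and its parameter names are hypothetical, and the both-fail branch is an added assumption.

```python
def diagnose_dns(internal_name_ok: bool, external_name_ok: bool, public_resolver_ok: bool) -> str:
    """Classify a DNS fault from three lookups.

    internal_name_ok:   internal server resolved an internal name
    external_name_ok:   internal server resolved an external name
    public_resolver_ok: an external resolver resolved an external name
    """
    if not internal_name_ok and public_resolver_ok:
        return "internal DNS service is the suspect"
    if internal_name_ok and not external_name_ok:
        return "check forwarders on the internal server"
    if not internal_name_ok and not public_resolver_ok:
        # Assumption: if nothing answers, the fault is likely below DNS.
        return "no resolver answers: recheck connectivity before blaming DNS"
    return "DNS resolution looks healthy"
```

The three booleans would come from your own nslookup or dig runs; the function only encodes the decision, not the queries.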
Open the DHCP console (Windows Server DHCP, ISC Kea, Meraki, FortiGate) and check the scope. A scope at 100% utilization leaves new clients with no lease, so they self-assign APIPA addresses and the result looks identical to a "network down" report. Expand the scope or shorten the lease as a temporary fix; investigate the device-count spike afterward.
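Scope utilization is simple arithmetic: leases in use divided by usable addresses in the range. The sketch below assumes a single contiguous IPv4 range; `scope_utilization` and the `exclusions` parameter are hypothetical names, not part of any DHCP server's API.

```python
import ipaddress


def scope_utilization(start: str, end: str, leases_in_use: int, exclusions: int = 0) -> float:
    """Return scope utilization as a percentage of usable addresses."""
    first = int(ipaddress.ip_address(start))
    last = int(ipaddress.ip_address(end))
    # Both endpoints are inclusive; excluded addresses cannot be leased.
    usable = (last - first + 1) - exclusions
    if usable <= 0:
        raise ValueError("scope has no usable addresses")
    return 100.0 * leases_in_use / usable
```

With an illustrative range of 10.0.0.10 through 10.0.0.209 (200 addresses) and 200 active leases, the function returns 100.0, the exhausted state described above.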
Check Event Viewer (DHCP-Server, DNS-Server channels) or the equivalent on your platform for repeated NACKs, scope-exhausted entries, or zone-transfer failures. Cross-reference timestamps with the user's report.
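Cross-referencing timestamps amounts to filtering log events to a window around the reported start time. This sketch assumes the events have already been exported as (timestamp, message) pairs; `events_near` is a hypothetical helper and the 15-minute default window is an arbitrary assumption.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

Event = Tuple[datetime, str]


def events_near(
    events: Iterable[Event],
    reported_at: datetime,
    window: timedelta = timedelta(minutes=15),
) -> List[Event]:
    """Keep events whose timestamp is within `window` of the user's report."""
    return [(ts, msg) for ts, msg in events if abs(ts - reported_at) <= window]
```

Widen the window when the reported start time is a guess, since users often notice a problem well after it begins.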
Wireless Diagnosis
Have the user plug into a wired port (or test with a known-good wired endpoint nearby). If wired works and wireless doesn't, the problem is the AP, SSID, or RF environment — not the upstream network.
In the Meraki / UniFi / Aruba / Mist dashboard, confirm the nearest AP is online, on the right firmware, and not stuck with 60+ clients on a single radio. A single overloaded AP is the most common "wifi is slow" cause in conference rooms.
Use the controller's RF spectrum view or a tool like Ekahau / NetSpot to confirm signal strength at the affected location is above -67 dBm and the channel isn't being clobbered by a neighbor or rogue AP. 2.4 GHz channel overlap is the usual culprit in dense offices.
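The AP and RF thresholds from the last two steps can be combined into one check. This is a minimal sketch: `wireless_findings` is a hypothetical function, and its inputs are assumed to be values read off the controller dashboard or a survey tool.

```python
from typing import List


def wireless_findings(ap_online: bool, clients_on_radio: int, rssi_dbm: float) -> List[str]:
    """Flag the AP and RF conditions named in the checklist."""
    findings = []
    if not ap_online:
        findings.append("nearest AP is offline")
    if clients_on_radio >= 60:
        findings.append("radio is overloaded (60+ clients)")
    # dBm values are negative; a lower number means a weaker signal.
    if rssi_dbm < -67:
        findings.append("signal is weaker than -67 dBm")
    return findings
```

An empty result does not rule out channel interference, which still needs the spectrum view.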
Confirm the SSID is broadcast on the correct AP group, and for 802.1X SSIDs, test a RADIUS auth from the controller against NPS / ClearPass / Cisco ISE. A mismatched RADIUS shared secret or a certificate renewal that didn't propagate is a classic post-maintenance failure.
Resolution and Documentation
Don't close on "should be working now." Have the original reporter reproduce their workflow — the Teams call, the file open, the SaaS login — and confirm it succeeds. Restored ping is not the same as restored business function.
Write the ticket close-out in IT Glue / Hudu / Confluence with the symptom, scope, root cause, and the exact fix command or config change. Future-you (or the next on-call) will search this in six months when it recurs.
If the incident was site-wide or multi-site, hold a 30-minute blameless review within 48 hours. Capture preventive actions — monitoring gap, runbook gap, config drift — as tickets, not as wishes in meeting notes.
