Network Troubleshooting Checklist
Ticket Intake and Scope
Record what the user sees — "no internet," "slow Teams calls," "shared drive unreachable" — plus when it started and what changed recently (patch window, ISP work, office move). Vague tickets like "network is down" almost always narrow to one app, one subnet, or one VLAN once you ask.
Scope drives the next move. A single affected user usually points at the endpoint; one VLAN or floor points at a switch or AP; a site-wide outage points at the firewall, ISP, or DNS. Confirm by asking a second user on the same subnet, or check the PRTG, Auvik, or Meraki dashboard for affected devices.
Site-wide or multi-site impact escalates to P1 — page the on-call engineer via PagerDuty / Opsgenie, post in the NOC channel, and start the incident timeline. Single-user issues stay at standard helpdesk priority.
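The scope and priority rules above can be sketched as a small lookup. This is only an illustration: the scope labels, the `triage` function, and the "standard" priority string are hypothetical names, and leaving VLAN-level issues at standard priority is an assumption, since the checklist does not assign them one.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical scope labels mapped to the suspects named in the checklist.
# "site_wide" here also covers multi-site impact.
SUSPECTS = {
    "single_user": ["endpoint"],
    "vlan_or_floor": ["switch", "access point"],
    "site_wide": ["firewall", "ISP", "DNS"],
}


@dataclass
class Triage:
    priority: str
    suspects: List[str]
    page_on_call: bool


def triage(scope: str) -> Triage:
    """Map an incident scope to a priority and a list of likely suspects."""
    if scope not in SUSPECTS:
        raise ValueError(f"unknown scope: {scope!r}")
    # Only site-wide or multi-site impact escalates to P1 in the checklist.
    # Assumption: everything else stays at standard helpdesk priority.
    escalate = scope == "site_wide"
    return Triage(
        priority="P1" if escalate else "standard",
        suspects=SUSPECTS[scope],
        page_on_call=escalate,
    )
```

Calling `triage("site_wide")` yields a P1 with the on-call page flag set, while `triage("single_user")` stays at standard priority.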
Physical and Endpoint Checks
Confirm the patch cable is seated at both ends, the switchport link LED is amber/green, and PoE devices are drawing power. A surprising fraction of "network down" tickets are a kicked cable or a tripped PoE budget on an aging switch.
Run ipconfig /all (Windows), ifconfig (macOS), or ip addr (Linux). An APIPA address (169.254.x.x) means DHCP failed, so skip ahead to the DHCP section. A correct lease with wrong DNS servers points at scope options or a static override on the NIC.
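When checking many endpoints, the APIPA test can be automated. The sketch below uses Python's standard `ipaddress` module; the function name `classify_ipv4` and its return labels are made up for this example.

```python
import ipaddress

# APIPA / IPv4 link-local range that a client self-assigns when DHCP fails.
APIPA_NET = ipaddress.IPv4Network("169.254.0.0/16")


def classify_ipv4(addr: str) -> str:
    """Return 'apipa', 'private', or 'other' for an IPv4 address string."""
    ip = ipaddress.IPv4Address(addr)
    if ip in APIPA_NET:
        return "apipa"      # DHCP failed: go to the DHCP checks
    if ip.is_private:
        return "private"    # RFC 1918-style address, typical of a LAN lease
    return "other"
```

For example, `classify_ipv4("169.254.10.20")` returns `"apipa"`, while a normal lease such as `192.168.1.50` returns `"private"`.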
Ping the gateway first to isolate LAN vs. WAN. If the gateway responds but 8.8.8.8 doesn't, the problem is at or upstream of the firewall (firewall policy, ISP, WAN circuit). If the gateway itself doesn't respond, it's a LAN-side switch, VLAN, or cable issue.
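A minimal sketch of the gateway-then-external test follows, assuming the system `ping` binary is on the PATH. The helper names `ping_once` and `isolate_fault` are hypothetical. Treat the ping result as a hint only: on Windows, `ping` can exit with 0 when an intermediate device answers "destination host unreachable".

```python
import platform
import subprocess


def ping_once(host: str, timeout_s: float = 3.0) -> bool:
    """Send one echo request using the system ping; True if it exits with 0."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def isolate_fault(gateway_ok: bool, external_ok: bool) -> str:
    """Apply the checklist's LAN vs. WAN rule to two ping results."""
    if not gateway_ok:
        return "lan"        # switch, VLAN, or cable on the LAN side
    if not external_ok:
        return "upstream"   # beyond the gateway: ISP or WAN circuit
    return "reachable"      # both answered; look higher up the stack
```

A typical call would be `isolate_fault(ping_once(gateway_ip), ping_once("8.8.8.8"))`, where `gateway_ip` is the default gateway you read from the endpoint's interface configuration.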
Use tracert (Windows) or traceroute (macOS/Linux) to the destination the user can't reach. The hop where latency spikes or replies stop is your suspect. Asymmetric routing or an MPLS handoff is a common gotcha at the ISP boundary.
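The "latency spikes or replies stop" rule can be written down as a small function over per-hop round-trip times. This sketch assumes you have already parsed the traceroute output into a list with one RTT per hop (`None` for a hop that did not answer). The 50 ms spike threshold is an arbitrary assumption, and `suspect_hop` is a hypothetical name.

```python
from typing import Optional, Sequence


def suspect_hop(rtts_ms: Sequence[Optional[float]],
                spike_ms: float = 50.0) -> Optional[int]:
    """Return the 1-based hop number to investigate, or None if the path looks clean."""
    previous = None
    last_reply = -1
    for i, rtt in enumerate(rtts_ms):
        if rtt is None:
            continue  # a silent hop may just be filtering ICMP
        last_reply = i
        # Latency spike: RTT jumps by at least spike_ms over the last answering hop.
        if previous is not None and rtt - previous >= spike_ms:
            return i + 1
        previous = rtt
    # Replies stop: every hop after the last answering one is silent.
    if last_reply < len(rtts_ms) - 1:
        return last_reply + 2
    return None
```

A hop that stays silent while later hops answer is skipped, since many routers rate-limit or drop ICMP replies without affecting forwarded traffic.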
Network Device Diagnosis
Log in to the upstream switch or firewall and pull its logs (Meraki dashboard event log, Cisco IOS show logging, FortiGate diagnose, Aruba show log). Look for err-disable, STP topology change, port flap, or duplex mismatch entries within the incident window.
On Cisco, run show interfaces status and show interfaces counters errors; use the equivalent on your platform. CRC errors point at a bad cable or NIC; input drops point at a microburst or saturated uplink; err-disabled ports often mean a port-security violation, though BPDU guard and link-flap detection can also put a port in that state.
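The counter-to-cause mapping above can be captured in a helper for scripts that already collect these values. This is a sketch only: the function name, its parameters, and the "any non-zero count" rule are assumptions, and real thresholds depend on how long the counters have been accumulating.

```python
from typing import List


def interpret_counters(crc_errors: int, input_drops: int,
                       err_disabled: bool) -> List[str]:
    """Translate interface counters into the likely causes named in the checklist."""
    hints = []
    if crc_errors > 0:
        hints.append("bad cable or NIC")
    if input_drops > 0:
        hints.append("microburst or saturated uplink")
    if err_disabled:
        hints.append("port-security violation or other err-disable trigger")
    return hints
```

An empty list means none of the three indicators fired, not that the port is healthy. Compare counters before and after a known interval to see whether they are still incrementing.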
Confirm the access port is in the right VLAN and the upstream trunk carries it (show interface trunk). A native-VLAN mismatch across a trunk is a classic STP and broadcast-loop trigger.
Check show ip route for the destination prefix and confirm the active gateway peer (HSRP/VRRP) is the one you expect. A failover that didn't fail back is a common cause of intermittent connectivity after a maintenance window.
DNS and DHCP
Run nslookup or dig against the internal DNS server (DC, Windows Server DNS, BIND) and against an external resolver (1.1.1.1, 8.8.8.8, Quad9). If the internal server resolves internal names but fails on external ones, check its forwarders; if the external resolver answers but the internal server does not, the internal server's DNS service is the suspect.
Open the DHCP console (Windows Server DHCP, ISC Kea, Meraki, FortiGate) and check the scope. A scope at 100% utilization gives new clients APIPA addresses and looks identical to a "network down" report. Expand the scope or shorten the lease as a temporary fix; investigate the device-count spike afterward.
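Scope utilization is simple arithmetic: leases in use divided by usable addresses in the range. The sketch below assumes you copy the range boundaries and lease count by hand from the DHCP console. The function names and the 90% warning threshold are assumptions for this example.

```python
import ipaddress


def scope_utilization(start: str, end: str, leases_in_use: int,
                      excluded: int = 0) -> float:
    """Percent of the usable scope range [start, end] that is currently leased."""
    first = int(ipaddress.IPv4Address(start))
    last = int(ipaddress.IPv4Address(end))
    usable = (last - first + 1) - excluded  # the range is inclusive at both ends
    if usable <= 0:
        raise ValueError("scope has no usable addresses")
    return 100.0 * leases_in_use / usable


def scope_status(pct: float, warn_pct: float = 90.0) -> str:
    """Label a utilization figure; warn_pct is an arbitrary early-warning level."""
    if pct >= 100.0:
        return "exhausted"  # new clients will fall back to APIPA
    if pct >= warn_pct:
        return "warning"
    return "ok"
```

For a range of 10.0.0.10 through 10.0.0.209 (200 addresses) with 150 leases, utilization is 75%. With 50 of those addresses excluded, the same 150 leases fill the scope.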
Check Event Viewer (DHCP-Server, DNS-Server channels) or the equivalent on your platform for repeated NACKs, scope-exhausted entries, or zone-transfer failures. Cross-reference timestamps with the user's report.
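Cross-referencing timestamps amounts to filtering log events to a window around the reported start time. A minimal sketch, assuming events have already been exported as `(timestamp, message)` pairs in the same timezone as the report. The 15-minute window and the name `events_near` are assumptions.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

Event = Tuple[datetime, str]


def events_near(events: Iterable[Event], reported_at: datetime,
                window: timedelta = timedelta(minutes=15)) -> List[Event]:
    """Keep events whose timestamp is within `window` of the reported time."""
    # All datetimes must share a timezone (or all be naive) before comparing.
    return [(ts, msg) for ts, msg in events if abs(ts - reported_at) <= window]
```

Widen the window if users tend to report late. The onset in the logs often precedes the ticket timestamp.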
Wireless Diagnosis
Have the user plug into a wired port (or test with a known-good wired endpoint nearby). If wired works and wireless doesn't, the problem is the AP, SSID, or RF environment — not the upstream network.
In the Meraki / UniFi / Aruba / Mist dashboard, confirm the nearest AP is online, on the right firmware, and not stuck with 60+ clients on a single radio. A single overloaded AP is the most common "wifi is slow" cause in conference rooms.
Use the controller's RF spectrum view or a tool like Ekahau / NetSpot to confirm signal strength at the affected location is stronger than -67 dBm (closer to zero) and the channel isn't being clobbered by a neighbor or rogue AP. 2.4 GHz channel overlap is the usual culprit in dense offices.
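The two numeric thresholds in this section (-67 dBm signal strength and 60 clients per radio) can be checked together once you have the readings from the controller. The sketch assumes you supply both numbers yourself. The function name and finding labels are hypothetical.

```python
from typing import List


def ap_findings(rssi_dbm: float, clients_on_radio: int,
                min_rssi_dbm: float = -67.0,
                max_clients: int = 60) -> List[str]:
    """Flag weak signal or an overloaded radio using the checklist thresholds."""
    findings = []
    # Signal must be above (stronger than) the threshold, so -67 itself is flagged.
    if rssi_dbm <= min_rssi_dbm:
        findings.append("weak_signal")
    # "60+ clients" is read as 60 or more on a single radio.
    if clients_on_radio >= max_clients:
        findings.append("overloaded_radio")
    return findings
```

A reading of -72 dBm with 10 clients reports only `weak_signal`; -55 dBm with 64 clients reports only `overloaded_radio`.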
Confirm the SSID is broadcast on the correct AP group, and for 802.1X SSIDs, test a RADIUS auth from the controller against NPS / ClearPass / Cisco ISE. A mismatched RADIUS shared secret or a certificate renewal that didn't propagate is a classic post-maintenance failure.
Resolution and Documentation
Don't close on "should be working now." Have the original reporter reproduce their workflow — the Teams call, the file open, the SaaS login — and confirm it succeeds. Restored ping is not the same as restored business function.
Write the ticket close-out in IT Glue / Hudu / Confluence with the symptom, scope, root cause, and the exact fix command or config change. Future-you (or the next on-call) will search this in six months when it recurs.
If the incident was site-wide or multi-site, hold a 30-minute blameless review within 48 hours. Capture preventive actions — monitoring gap, runbook gap, config drift — as tickets, not as wishes in meeting notes.
