CCNA Lab 9: Load Troubleshooting and Switch Performance

A switch that is still forwarding packets but struggling under load produces intermittent failures that are hard to diagnose. Every L2 engineer needs to know how to read CPU utilization, memory pressure, TCAM exhaustion, and interface error counters.

CPU Utilization

The CPU on a switch handles control plane traffic (STP, routing protocols, management) and software-forwarded data packets. High CPU does not always mean a loop — process starvation, ACL logging, and SNMP polling can all cause it.

Checking CPU

show processes cpu
show processes cpu sorted
show processes cpu history

The history output draws ASCII bar charts of CPU utilization (y-axis, 0 to 100%) against time (x-axis) for the last 60 seconds, the last 60 minutes, and the last 72 hours.

Utilization that stays above 80% across multiple intervals indicates a problem; an isolated spike usually does not.
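The "sustained across multiple intervals" rule can be sketched in Python for CPU samples collected off-box. This is a minimal illustration, not a Cisco tool: the function name sustained_high, the default of three consecutive intervals, and the sample values are all assumptions made for the example.

```python
def sustained_high(samples, threshold=80, min_run=3):
    """Return (start, end) index pairs for runs of at least min_run
    consecutive samples strictly above threshold."""
    runs = []
    start = None
    for i, value in enumerate(samples):
        if value > threshold:
            if start is None:
                start = i          # a new run begins here
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None           # run (if any) has ended
    # Close a run that lasts until the final sample
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples) - 1))
    return runs


# Hypothetical per-interval CPU percentages: one isolated spike, one sustained run
samples = [35, 92, 40, 85, 88, 91, 86, 30]
print(sustained_high(samples))  # [(3, 6)]
```

The single 92% reading is ignored, while the four consecutive readings above 80% are reported as one run.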

Identifying the Hog Process

show processes cpu sorted | exclude 0.00

Look for processes consuming non-zero CPU:

PID    Runtime(ms)    Invoked     uSecs    5Sec    1Min    5Min    Process
269    1234567        98765       12493     12.23%  8.45%   5.67%   ARP Input
112    456789         12345       36987     8.45%   6.12%   3.89%   IP Input

High values in ARP Input often point to an ARP storm or to a host with an overly broad subnet mask that ARPs directly for off-subnet addresses. High CPU in IP Input suggests routed traffic is being process-switched by the CPU instead of forwarded by CEF.
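When the output is captured to a file, the same filtering can be done off-box. The sketch below parses rows in the column layout shown above and lists processes by their 5-second CPU share. The function name busy_processes and the 1% cutoff are assumptions; the optional TTY column is included because some IOS releases print one before the process name.

```python
import re

# PID, Runtime(ms), Invoked, uSecs, 5Sec, 1Min, 5Min, optional TTY, Process
LINE = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<runtime>\d+)\s+(?P<invoked>\d+)\s+(?P<usecs>\d+)"
    r"\s+(?P<five_sec>[\d.]+)%\s+(?P<one_min>[\d.]+)%\s+(?P<five_min>[\d.]+)%"
    r"\s+(?:(?P<tty>\d+)\s+)?(?P<process>.+?)\s*$"
)


def busy_processes(output, min_five_sec=1.0):
    """Return (process, five_sec_pct) pairs at or above min_five_sec, busiest first."""
    rows = []
    for line in output.splitlines():
        m = LINE.match(line)
        if m and float(m.group("five_sec")) >= min_five_sec:
            rows.append((m.group("process"), float(m.group("five_sec"))))
    return sorted(rows, key=lambda row: row[1], reverse=True)


SAMPLE = """\
PID    Runtime(ms)    Invoked     uSecs    5Sec    1Min    5Min    Process
269    1234567        98765       12493     12.23%  8.45%   5.67%   ARP Input
112    456789         12345       36987     8.45%   6.12%   3.89%   IP Input
"""
print(busy_processes(SAMPLE))  # [('ARP Input', 12.23), ('IP Input', 8.45)]
```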

Common CPU Hogs

Process          Likely Cause
ARP Input        ARP flooding, subnet scan, STP TCN storm
IP Input         CEF disabled, punted traffic
Spanning Tree    Topology changes, TCN flood
Cat4k Mgmt       High SNMP polling, too many OIDs
ACL Logging      ACL with log keyword hit by every packet
DHCP Snooping    High DHCP request rate

Memory Management

Check Memory

show memory
show memory statistics
show processes memory
show processes memory sorted

Memory Leak Detection

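! Filter for one process; substitute the name you are tracking
show processes memory | include <process-name>
! Sorted by memory held, largest first (classic IOS has no "| head" filter)
show processes memory sorted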

Compare memory usage across snapshots taken hours or days apart within the same uptime; a reload resets the numbers. A process whose held memory grows continuously without release is leaking.
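The comparison can be sketched in Python once the per-process figures have been collected. This is an illustration under assumptions: growing_processes is a hypothetical helper, each snapshot is a dict of process name to bytes held, and the byte counts are invented.

```python
def growing_processes(snapshots):
    """snapshots: list of dicts mapping process name -> bytes held, oldest first.
    Return {process: total growth} for processes that grew in every interval."""
    if len(snapshots) < 2:
        return {}
    # Only compare processes present in every snapshot
    common = set(snapshots[0])
    for snap in snapshots[1:]:
        common &= set(snap)
    suspects = {}
    for name in sorted(common):
        series = [snap[name] for snap in snapshots]
        if all(later > earlier for earlier, later in zip(series, series[1:])):
            suspects[name] = series[-1] - series[0]
    return suspects


# Hypothetical snapshots taken a day apart
snapshots = [
    {"ARP Input": 1000, "SNMP ENGINE": 5000},
    {"ARP Input": 1000, "SNMP ENGINE": 7000},
    {"ARP Input": 900, "SNMP ENGINE": 9500},
]
print(growing_processes(snapshots))  # {'SNMP ENGINE': 4500}
```

Requiring growth in every interval is deliberately strict; a process that grows and then releases memory is not flagged.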

Low Memory Symptoms

  • Configuration changes fail with % Not enough space
  • SSH sessions drop
  • SNMP polling fails intermittently
  • show commands return partial output
  • Syslog: %SYS-2-MALLOCFAIL

Emergency recovery: Reload the switch during a maintenance window. There is no graceful memory reclamation on most Catalyst switches.

TCAM Exhaustion

TCAM (Ternary Content Addressable Memory) holds ACL entries, QoS policies, and forwarding entries so that lookups run at wire speed. When TCAM fills, new entries are either rejected or the affected traffic is handled in software by the CPU.

Check TCAM Utilization

show platform tcam utilization
show platform tcam counts
show sdm prefer

Example output:

CAM Utilization for ASIC 0
                                                 Max            Used
                                             Masks/Values   Masks/Values
 Unicast mac addresses:                       6384/6384       512/512
 IPv4 IGMP groups + multicast routes:         1024/1024         8/8
 IPv4 unicast routes:                         2816/2816        12/12
 IPv4 direct adjacencies before load share:   2816/2816        12/12

If any category exceeds 80%, plan for expansion or TCAM optimization.
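The 80% check is simple arithmetic: used values divided by maximum values for each row. The sketch below assumes rows of the form "label: max_masks/max_values used_masks/used_values"; the function name tcam_over_threshold is hypothetical and the route count in the sample is invented to trigger the threshold.

```python
import re

ROW = re.compile(
    r"^\s*(?P<label>.+?):\s+(?P<max_masks>\d+)/(?P<max_values>\d+)"
    r"\s+(?P<used_masks>\d+)/(?P<used_values>\d+)\s*$"
)


def tcam_over_threshold(output, threshold=80.0):
    """Return (label, percent_used) for rows whose used values exceed threshold."""
    flagged = []
    for line in output.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # headers and rows in other layouts are skipped
        max_values = int(m.group("max_values"))
        if max_values == 0:
            continue  # avoid dividing by zero on unused categories
        pct = round(100.0 * int(m.group("used_values")) / max_values, 1)
        if pct > threshold:
            flagged.append((m.group("label"), pct))
    return flagged


# Hypothetical capture: MAC table nearly empty, route table close to full
SAMPLE = """\
CAM Utilization for ASIC 0
 Unicast mac addresses:  6384/6384  512/512
 IPv4 unicast routes:    2816/2816  2500/2500
"""
print(tcam_over_threshold(SAMPLE))  # [('IPv4 unicast routes', 88.8)]
```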

SDM Templates

Switches use SDM (Switch Database Management) templates to allocate TCAM:

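show sdm prefer

! Change the template in global configuration mode. Pick one.
! Template names vary by platform; check "sdm prefer ?" on your switch.
! Maximizes MAC addresses
sdm prefer vlan
! Maximizes IPv4 routes
sdm prefer routing
! Maximizes ACEs
sdm prefer access

The new template takes effect only after a reload, so schedule the change for a maintenance window.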

Interface Error Counters

High CRC, runts, giants, or collisions point to physical layer issues.

show interfaces Gi0/1
show interfaces Gi0/1 counters errors
show interface statistics

Key Error Types

Error            Meaning
CRC              FCS error: bad cable, SFP, or duplex mismatch
Runts            Frame < 64 bytes: collisions or bad NIC
Giants           Frame > 1518 bytes: misconfigured MTU
Input errors     Sum of all receive-side errors
Output errors    Sum of all transmit-side errors
Collisions       Late collisions = duplex mismatch
Overruns         Switch cannot keep up with ingress rate
Underruns        Switch cannot feed the egress line rate

Error Threshold

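A CRC counter should not increment on a healthy link. After the counters are cleared, any new CRC errors point to a physical issue; a nonzero value that never changes may be left over from an earlier event.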

! Clear counters for fresh measurement
clear counters GigabitEthernet0/1
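Where clearing counters is not desirable, two readings taken a few minutes apart give the same information. The sketch below is an illustration only: incrementing_counters is a hypothetical helper and the readings are invented.

```python
def incrementing_counters(before, after):
    """Return {counter: increase} for counters that rose between two readings."""
    return {
        name: after[name] - before[name]
        for name in before
        if name in after and after[name] > before[name]
    }


# Hypothetical readings taken a few minutes apart on the same interface
before = {"CRC": 4523, "Runts": 234, "Giants": 0}
after = {"CRC": 4610, "Runts": 234, "Giants": 0}
print(incrementing_counters(before, after))  # {'CRC': 87}
```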

Duplex Mismatch Detection

show interfaces Gi0/1 | include duplex

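The most reliable sign of a duplex mismatch is late collisions on the half-duplex side and CRC errors on the full-duplex side. Configure both ends of the link identically: hard-setting full duplex on one end while the other end auto-negotiates is itself a common cause of mismatch.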
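The late-collision and CRC signature can be written down as a small decision rule. This is a rough heuristic sketch, not a diagnostic tool: the function name duplex_mismatch_hint and the returned strings are assumptions, and CRC errors alone can equally come from a bad cable or SFP.

```python
def duplex_mismatch_hint(duplex, late_collisions, crc_errors):
    """Map one side's duplex setting and error counters to a troubleshooting hint."""
    if duplex == "half" and late_collisions > 0:
        return "possible mismatch: far end may be running full duplex"
    if duplex == "full" and crc_errors > 0:
        return "possible mismatch: far end may be running half duplex (also check cable/SFP)"
    return "no mismatch signature"


# Hypothetical readings from the half-duplex side of a link
print(duplex_mismatch_hint("half", late_collisions=89, crc_errors=4523))
```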

Switch Backplane Saturation

Backplane saturation means the switch fabric cannot carry the aggregate traffic offered by its ports.

Monitoring Backplane

show platform port-asic statistics
show platform backplane rate
show controllers ethernet-controller

High overrun/submit errors on multiple ports simultaneously suggest the backplane is saturated.

Real-World Scenarios

Scenario 1: “Slow SSH and intermittent SNMP timeouts”

show processes cpu | include CPU
# CPU: 5 sec = 92%, 1 min = 88%, 5 min = 75%
 
show processes cpu sorted | exclude 0.00
# ARP Input at 25%, IP Input at 18%, ACL-Log at 12%
 
show interfaces | include broadcast
# Gi0/24: 12000 broadcast packets/sec
 
show running-config | include log
# access-list 100 permit tcp any any log

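An ACL entry with the log keyword punts every matched packet to the CPU. Remove the log keyword from ACL entries in the forwarding path. The 12000 broadcast packets/sec on Gi0/24 likely explains the ARP Input load, so trace the broadcast source on that port as well.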

Scenario 2: “Users report random drops on one switch”

show interfaces Gi0/1 counters errors
# CRC: 4523, Runts: 234, Late Collisions: 89
 
show interfaces Gi0/1 | include duplex
# Half-duplex
 
show interface Gi0/1 | include speed
# 10 Mbps

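The port came up at 10/half instead of 100/full, which usually means auto-negotiation failed: a bad cable, a faulty NIC, or a far end that is hard-set. Replace the cable first. If you hard-set the interface, configure the far end identically: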

interface Gi0/1
 speed 100
 duplex full

Scenario 3: “Show commands take forever”

show processes cpu | include CPU
# CPU: 5 sec = 15%, 1 min = 12%, 5 min = 10%
 
show memory statistics
# Free memory: 45MB out of 256MB (low)

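Classic IOS does not swap memory to disk. When free memory runs low, allocations begin to fail (watch for %SYS-2-MALLOCFAIL), and commands that need memory for their output stall or return partial results. Identify the largest holders with show processes memory sorted, then schedule a reload.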

TCAM Troubleshooting

! Check if an ACL entry was installed in TCAM
show platform tcam interface Gi0/1 acl
 
! Check if QoS policy matches in hardware
show mls qos interface Gi0/1 statistics

If TCAM is full, the switch either silently fails to install new ACL entries in hardware or punts the matching traffic to software. Always check TCAM after adding large ACLs or QoS policies.

Proactive Monitoring Commands

Run these weekly on every switch:

show processes cpu sorted
show memory statistics
show platform tcam utilization
show interfaces counters errors | include CRC|error
show interfaces | include line protocol|rate
show environment
show logging | include down|err|flap|MALLOCFAIL
show controllers ethernet-controller | include overrun

Best Practices

  • Enable CEF: ip cef keeps routed traffic in the hardware forwarding path. With CEF disabled, routed traffic is process-switched by the CPU.
  • Limit SNMP polling: Poll no more than once per 5 minutes on production switches.
  • Remove ACL logging: Never use the log keyword on ACLs in the forwarding path.
  • Hard-set speed/duplex: On critical links, configure both ends identically instead of relying on auto-negotiation.
  • Monitor TCAM: Check utilization before ACL or QoS changes.
  • Schedule maintenance reloads: Memory fragmentation grows over time.
  • Use the SDM template that matches the switch role: Access switches need MAC addresses; distribution switches need routes.

Quick Commands

Command                                 When to Use
show processes cpu sorted               High CPU: find the hog process
show processes memory sorted            Memory leak: find the leaking process
show platform tcam utilization          TCAM exhaustion: before adding ACLs
show interfaces counters errors         Physical layer issues
clear counters                          After replacing cable/SFP, verify fix
show environment                        Temperature and power health
show controllers ethernet-controller    Backplane / ASIC errors
show sdm prefer                         Current TCAM allocation template
terminal monitor                        View logs in SSH session
show logging                            Recent events and errors