5.3 Checking Whether the Problem Is Caused by Network Flapping
When network flapping occurs, the network topology changes frequently. The switch is kept busy processing topology change events, which causes high CPU usage. Network flapping includes STP flapping and OSPF route flapping.
STP Flapping
When STP flapping occurs, the switch frequently recalculates the STP topology and updates the MAC address table and ARP table, causing high CPU usage.
1. Fault Location
If you suspect STP flapping, run the display stp topology-change command multiple times at an interval of several seconds to view STP topology information. Alternatively, check the trap and log information on the switch to determine whether the STP topology has changed.
# Run the command multiple times. Check whether the value of Number of topology changes increases.
display stp topology-change
CIST topology change information
Number of topology changes :35
Time since last topology change :0 days 1h:7m:30s
Topology change initiator(notified) :GigabitEthernet2/0/6
Topology change last received from :101b-5498-d3e0
Number of generated topologychange traps : 38
Number of suppressed topologychange traps: 8
MSTI 1 topology change information
Number of topology changes :0
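The comparison can also be scripted. The following is a minimal sketch, assuming the command output has already been captured as text (for example, over an SSH session). The function names are illustrative and are not part of any Huawei tool. It reads the CIST counter from two captures and reports whether the counter increased.

```python
import re

def cist_topology_changes(output: str) -> int:
    """Return the first 'Number of topology changes' value (the CIST counter)."""
    match = re.search(r"Number of topology changes\s*:\s*(\d+)", output)
    if match is None:
        raise ValueError("counter not found in output")
    return int(match.group(1))

def topology_is_changing(earlier: str, later: str) -> bool:
    """True if the CIST counter grew between two captures of the command."""
    return cist_topology_changes(later) > cist_topology_changes(earlier)

# First capture is abridged from the sample output above.
first = """CIST topology change information
Number of topology changes :35
MSTI 1 topology change information
Number of topology changes :0"""
# Hypothetical second capture taken a few seconds later.
second = first.replace(":35", ":41")

print(topology_is_changing(first, second))
```

A counter that keeps growing across captures points to ongoing STP flapping. A counter that stays the same suggests the last topology change was an isolated event.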
After you confirm that the network topology changes frequently, run the display stp tc-bpdu statistics command, and run it again after several seconds. Check whether interfaces on the switch have received Topology Change (TC) BPDUs. If so, find the source of the TC BPDUs, that is, the device causing the topology change.
■ If only the TC(Send) value increases, the topology change is caused by the local switch.
  □ If only the TC(Send) value of a single interface increases, the topology change is caused by this interface.
  □ If the TC(Send) values of multiple interfaces increase, check the events and logs on the NMS to analyze the STP topology change reason. Find out the interface causing the flapping.
■ If multiple values in the TC(Send/Receive) column increase, check the event and log information on the NMS to determine whether the local switch causes the topology change, and check whether STP flapping occurs on the device connected to the problematic interface.
# View statistics about TC/TCN packets on an interface.
display stp tc-bpdu statistics
-------------------------- STP TC/TCN information --------------------------
MSTID Port TC(Send/Receive) TCN(Send/Receive)
0 GigabitEthernet2/0/6 21/4 0/1
0 GigabitEthernet2/0/7 93/0 0/1
0 GigabitEthernet2/0/8 115/0 0/0
0 GigabitEthernet2/0/9 110/0 0/0
0 GigabitEthernet3/0/23 29/5 0/0
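The rules above can be applied to two captures of this table. The following is a minimal sketch, assuming the row layout shown in the sample output. The function names are illustrative. Treating any growth in TC(Receive) as the "multiple values" case is an interpretation of the rules, not documented switch behavior.

```python
import re

# One table row: MSTID, port, TC send/receive, TCN send/receive.
ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\d+)/(\d+)\s+(\d+)/(\d+)\s*$")

def parse_tc_stats(output: str) -> dict:
    """Map (MSTID, port) to (TC sent, TC received)."""
    stats = {}
    for line in output.splitlines():
        m = ROW.match(line)
        if m:
            stats[(int(m.group(1)), m.group(2))] = (int(m.group(3)), int(m.group(4)))
    return stats

def classify(earlier: str, later: str) -> str:
    """Compare two captures and suggest where to look next."""
    before, after = parse_tc_stats(earlier), parse_tc_stats(later)
    send_up = [k[1] for k in after if k in before and after[k][0] > before[k][0]]
    recv_up = [k[1] for k in after if k in before and after[k][1] > before[k][1]]
    if not send_up and not recv_up:
        return "no TC activity between captures"
    if recv_up:
        return "TC BPDUs received on " + ", ".join(recv_up) + "; check the connected devices"
    if len(send_up) == 1:
        return "local change caused by " + send_up[0]
    return "local change on multiple interfaces; check events and logs"

# First capture uses two rows from the sample output; the second is hypothetical.
first = "0 GigabitEthernet2/0/6 21/4 0/1\n0 GigabitEthernet2/0/7 93/0 0/1"
second = "0 GigabitEthernet2/0/6 21/4 0/1\n0 GigabitEthernet2/0/7 99/0 0/1"

print(classify(first, second))
```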
2. Suggestion
a. Enable the TC protection trap to help you understand how the switch processes TC BPDUs.
Run the snmp-agent trap enable feature-name mstp and stp tc-protection commands in the system view to enable the TC protection trap.
By default, a switch is enabled to prevent topology change attacks. That is, within the period set by stp tc-protection interval, the switch processes at most the number of TC BPDUs set by stp tc-protection threshold.
After the trap is enabled, the switch reports the MSTP_1.3.6.1.4.1.2011.5.25.42.4.2.15 hwMstpiTcGuarded and MSTP_1.3.6.1.4.1.2011.5.25.42.4.2.16 hwMstpProTcGuarded traps.
For details about the traps, see 8.1.2 Alarm Information.
b. Perform operations according to topology changes.
■ The STP topology changes when an access interface alternates between Up and Down.
Run the stp edged-port enable command in the interface view to set the access interface as an edge port, and run the stp bpdu-protection command in the system or STP process view to enable BPDU protection.
■ The root bridge changes unexpectedly.
Run the display stp command and check whether CIST Root/ERPC shows the MAC address of the expected root bridge. If not, the root bridge has changed unexpectedly.
Run the stp root-protection command in the interface view to enable root protection, ensuring the correct topology.
display stp
-------[CIST Global Info][Mode MSTP]-------
CIST Bridge:4096 .707b-e8c8-00e9
Config Times:Hello 2s MaxAge 20s FwDly 15s MaxHop 20
Active Times:Hello 2s MaxAge 20s FwDly 15s MaxHop 20
CIST Root/ERPC:4096 .707b-e8c8-00e9 / 0 (This bridge is the root)
CIST RegRoot/IRPC:4096 .707b-e8c8-00e9 / 0 (This bridge is the root)
CIST RootPortId:0.0
BPDU-Protection:Disabled
CIST Root Type:Secondary root
TC or TCN received:1
TC count per hello:0
STP Converge Mode:Normal
Share region-configuration :Enabled
Time since last TC:1 days 14h:25m:38s
Number of TC:2
Last TC occurred:GigabitEthernet0/0/1
----[Port18(GigabitEthernet0/0/1)][LEARNING]----
Port Protocol:Enabled
Port Role:Designated Port
Port Priority:128
Port Cost(Dot1T ):Config=auto / Active=20000
Designated Bridge/Port:4096.707b-e8c8-00e9 / 128.18
Port Edged:Config=default / Active=disabled
Point-to-point:Config=auto / Active=true
Transit Limit:6 packets/s
Protection Type:None
Port STP Mode:STP
Port Protocol Type:Config=auto / Active=dot1s
BPDU Encapsulation:Config=stp / Active=stp
PortTimes:Hello 2s MaxAge 20s FwDly 15s RemHop 20
TC or TCN send:0
TC or TCN received:0
BPDU Sent:11
TCN: 0, Config: 12, RST: 0, MST: 1
BPDU Received:0
TCN: 0, Config: 1, RST: 0, MST: 0
c. If the reason for the topology change is unknown or the fault persists, collect network information (including interface connections) and logs (the log.log file or the display logbuffer command output), and provide the collected information to Huawei switch agents.
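The root bridge check in step b can be scripted along the following lines. This is a minimal sketch, assuming the CIST Root/ERPC line has the layout shown in the sample display stp output. The function names and the expected MAC address passed in are illustrative.

```python
import re

# Bridge ID is printed as "<priority> .<MAC>"; the space before the dot is optional.
ROOT = re.compile(
    r"CIST Root/ERPC\s*:\s*(\d+)\s*\.\s*"
    r"([0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4})"
)

def cist_root(output: str) -> tuple:
    """Return (priority, MAC address) of the current CIST root bridge."""
    m = ROOT.search(output)
    if m is None:
        raise ValueError("CIST Root/ERPC line not found")
    return int(m.group(1)), m.group(2).lower()

def root_is_expected(output: str, expected_mac: str) -> bool:
    """True if the root bridge MAC matches the one the network design expects."""
    return cist_root(output)[1] == expected_mac.lower()

# Line taken from the sample output above.
sample = "CIST Root/ERPC:4096 .707b-e8c8-00e9 / 0 (This bridge is the root)"

print(cist_root(sample))
print(root_is_expected(sample, "707B-E8C8-00E9"))
```

Running this check periodically and recording any mismatch gives a simple history of when the root bridge moved.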
OSPF Routing Protocol
Routing protocol flapping causes route re-advertisement and recalculation, which increases the load of the CPU. Generally, OSPF is configured to manage dynamic routing information. Therefore, OSPF route flapping is described here.
1. Fault Location
Run the display ospf peer last-nbr-down command to check the reason why the OSPF neighbor relationship goes Down.
The reason is displayed in the Immediate Reason and Primary Reason fields.
Check logs on the switch to determine why the OSPF neighbor becomes Down.
Run the display logbuffer command, and you can find the following log information:
OSPF/3/NBR_DOWN_REASON:Neighbor state leaves full or changed to Down. (ProcessId=[USHORT], NeighborRouterId=[IPADDR],NeighborAreaId=[ULONG], NeighborInterface=[STRING],NeighborDownImmediate reason=[STRING], NeighborDownPrimeReason=[STRING],NeighborChangeTime=[STRING])
The NeighborDownImmediate reason field indicates the cause for the OSPF neighbor Down event.
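When many such entries need to be reviewed, the fields can be extracted programmatically. The following is a minimal sketch, assuming the fields are comma-separated key=value pairs inside the parentheses, as in the template above. The function name is illustrative, and every field value in the sample line is invented for illustration only.

```python
def parse_nbr_down(log_line: str) -> dict:
    """Split the parenthesized part of an NBR_DOWN_REASON log into fields."""
    start = log_line.index("(")
    end = log_line.rindex(")")
    fields = {}
    for part in log_line[start + 1:end].split(","):
        key, sep, value = part.partition("=")
        if sep:  # skip fragments that are not key=value pairs
            fields[key.strip()] = value.strip()
    return fields

# Hypothetical log line built from the template; all values are made up.
line = (
    "OSPF/3/NBR_DOWN_REASON:Neighbor state leaves full or changed to Down. "
    "(ProcessId=1, NeighborRouterId=2.2.2.2,NeighborAreaId=0, "
    "NeighborInterface=Vlanif10,"
    "NeighborDownImmediate reason=Neighbor Down Due to Inactivity, "
    "NeighborDownPrimeReason=Hello Not Seen,"
    "NeighborChangeTime=2024-01-01 10:00:00)"
)

fields = parse_nbr_down(line)
print(fields["NeighborDownImmediate reason"])
```

Grouping the parsed entries by the NeighborDownImmediate reason field shows which cause dominates.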
2. Suggestion
Determine the cause from the key fields and take the corresponding measures.
Possible causes of the fault are as follows:
Neighbor Down Due to Inactivity
The Hello packet is not received within the deadtime (set by the ospf timer dead command in the interface view).
When an OSPF neighbor goes Down for this reason, either the neighbor relationship flaps or it cannot be set up. Run the display ospf peer brief command to determine which of the two problems has occurred.
■ OSPF neighbor relationship flaps.
OSPF neighbor flapping may be caused by a small CPCAR value for OSPF, link flapping or congestion on interfaces, or LSA flooding.
1) Run the display cpu-defend statistics packet-type ospf command to view statistics about the OSPF packets sent to the CPU. If too many OSPF packets are discarded, check whether the switch undergoes an OSPF attack or the CPCAR value for OSPF is too small.
2) View the log to check whether interfaces alternate between Up and Down. If link flapping or congestion occurs, check the link on the interface.
3) If the holdtime of the OSPF neighbor relationship is smaller than 20s, run the ospf timer dead interval command to change the holdtime to be larger than 20s.
4) Run the sham-hello enable command in the OSPF view to enable the OSPF sham-hello function, so that the switch can maintain the neighbor relationship using non-Hello packets such as LSU packets. This allows the switch to track the state of OSPF neighbor relationships more responsively.
5) If the fault persists after the preceding operations are performed, contact Huawei switch agents.
■ OSPF neighbor relationship cannot be set up.
Check whether the configurations in the OSPF view of devices on both ends are the same. If the configurations such as the OSPF area ID or area type (NSSA, stub area, or common area) are different, the two devices cannot establish an OSPF neighbor relationship.
Run the display ospf [ process-id ] interface command to check whether OSPF is successfully enabled on the interfaces.
display ospf 1 interface
OSPF Process 1 with Router ID 2.2.2.2
Interfaces
Area: 0.0.0.0 (MPLS TE not enabled)
Interface IP Address Type State Cost Pri
Eth0/1/1 10.1.1.2 Broadcast Waiting 1 1
□ If OSPF is not enabled on interfaces, run the ospf enable [ process-id ] area area-id command in the interface view to enable OSPF.
□ If the OSPF process has been enabled on the related interface, run the display ospf error command multiple times at an interval of several seconds to check whether OSPF authentication information on the two devices is the same according to the Bad authentication type and Bad authentication key fields.
display ospf 1 error
OSPF Process 1 with Router ID 2.2.2.2
OSPF error statistics
General packet errors:
0 : IP: received my own packet 3 : Bad packet
0 : Bad version 0 : Bad checksum
0 : Bad area id 0 : Drop on unnumbered interface
0 : Bad virtual link 3 : Bad authentication type
0 : Bad authentication key 0 : Packet too small
0 : Packet size > ip length 0 : Transmit error
0 : Interface down 0 : Unknown neighbor
0 : Bad net segment 0 : Extern option mismatch
- If the Bad authentication type or Bad authentication key value keeps increasing, OSPF authentication information on the two devices is different. To configure the same authentication information for the two devices, run the ospf authentication-mode command in the interface views or run the authentication-mode command in the OSPF process view.
- If the Bad authentication type or Bad authentication key value does not increase, the authentication information is the same. If the neighbor intermittently disappears when the display ospf peer command is executed, OSPF neighbor relationship flaps. To resolve this problem, see section 6.4.
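The counter comparison can be scripted. The following is a minimal sketch, assuming the two-column "value : name" layout shown in the sample display ospf error output. The function names are illustrative, and the second capture is hypothetical.

```python
import re

# Each line holds up to two "value : name" pairs; a name ends where the next pair starts.
COUNTER = re.compile(r"(\d+)\s*:\s*(.+?)(?=\s+\d+\s*:|$)")

def parse_ospf_errors(output: str) -> dict:
    """Map each error name to its counter value."""
    counters = {}
    for line in output.splitlines():
        for value, name in COUNTER.findall(line):
            counters[name.strip()] = int(value)
    return counters

def auth_mismatch(earlier: str, later: str) -> bool:
    """True if either authentication error counter grew between captures."""
    before, after = parse_ospf_errors(earlier), parse_ospf_errors(later)
    names = ("Bad authentication type", "Bad authentication key")
    return any(after.get(n, 0) > before.get(n, 0) for n in names)

# Two lines from the sample output above, then a hypothetical later capture.
first = (
    "0 : Bad virtual link 3 : Bad authentication type\n"
    "0 : Bad authentication key 0 : Packet too small"
)
second = first.replace("3 : Bad authentication type", "7 : Bad authentication type")

print(auth_mismatch(first, second))
```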
Neighbor Down Due to Kill Neighbor
If the interface is Down, BFD is Down, or the reset ospf process command is executed, the OSPF neighbor relationship goes Down.
View the NeighborDownPrimeReason field to determine the reason.
Neighbor Down Due to 1-Wayhello Received or Neighbor Down Due to SequenceNum Mismatch
When the OSPF status of the peer device goes Down first, the peer device sends a 1-Way Hello packet to the local device, causing OSPF on the local device to go Down.
Determine why OSPF status of the peer device becomes Down.
For other reasons, see OSPF/3/NBR_DOWN_REASON in 8.1.3 Log Information.
5.4 Checking Whether the Problem Is Caused by Network Loop
A network loop causes MAC address flapping and sends a large number of protocol packets to the CPU, overwhelming it.
1. Fault Location
A network loop may have the following symptoms:
The CPU usage of a switch exceeds 80%.
Indicators of interfaces in the VLAN where a loop has occurred blink faster than usual.
MAC flapping frequently occurs.
The administrator cannot remotely log in to the switch, and the switch responds slowly to operations on the console port.
A lot of ICMP packets are lost in ping tests.
The display interface command output shows a large number of broadcast packets received on an interface.
Loop alarms are generated after loop detection is enabled.
The PCs connected to the switch receive a large number of broadcast or unknown unicast packets.
2. Suggestion
a. Observe interface indicators and collect traffic statistics on interfaces to locate the interfaces undergoing broadcast storms.
b. Check the devices hop by hop according to the topology to locate the devices that cause the loop.
c. Locate the interface that causes the loop and shut down the interface to remove the loop.
d. If the fault persists after the preceding operations are performed, collect network information (including interface connections) and logs (the log.log file or the display logbuffer command output), and provide the collected information to Huawei switch agents.
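For step a, interfaces can be ranked by how fast their broadcast counters grow. The following is a minimal sketch, assuming the received broadcast packet count of each interface has already been collected twice, for example from the display interface command output. The function name, the dictionaries, and the counter values are hypothetical.

```python
def broadcast_rates(earlier: dict, later: dict, interval_s: float) -> list:
    """Return (interface, broadcast packets per second) pairs, highest rate first."""
    rates = []
    for port, count in later.items():
        if port in earlier:
            rates.append((port, (count - earlier[port]) / interval_s))
    return sorted(rates, key=lambda item: item[1], reverse=True)

# Hypothetical counters sampled 10 seconds apart.
t0 = {"GigabitEthernet0/0/1": 1000, "GigabitEthernet0/0/2": 1200}
t1 = {"GigabitEthernet0/0/1": 1050, "GigabitEthernet0/0/2": 901200}

top = broadcast_rates(t0, t1, 10)
print(top[0][0])
```

The interfaces at the top of the list are the first candidates to trace hop by hop in step b.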

This chapter describes only the method of locating network loops and handling suggestions. For more information, see the network loop troubleshooting guide.