Hello everyone,
Today I will share how to handle a case where BFD for OSPF goes Down on an Eth-Trunk interface of the NE40E.
Issue Description
NE40E-1 and NE40E-2 are connected to each other through Eth-Trunk 10. BFD for OSPF does not behave as expected on the Eth-Trunk interface: when one of the member ports goes down, the OSPF neighbor relationship is torn down and rebuilt.
Alarm Information
Mar 30 2018 11:47:04 NE40E_V6R7 %BFD/4/STACHG_TOUP(l)[0]:Slot=1,Vcpu=0;BFD session changed to Up. (SlotNumber=1, Discriminator=8210, FormerStatus=Down, Applications=OSPF, BindInterfaceName=Eth-Trunk10, ProcessPST=False, PeerIp=12.1.1.1, SessType=D_IP_IF, RemoteDiscriminator=8210)
Mar 30 2018 11:47:03 NE40E_V6R7 %OSPF/4/NBR_CHANGE_E(l)[1]:Neighbor changes event: neighbor status changed. (ProcessId=1, NeighborAddress=12.1.1.1, NeighborEvent=LoadingDone, NeighborPreviousState=Loading, NeighborCurrentState=Full)
Mar 30 2018 11:47:03 NE40E_V6R7 %OSPF/4/NBR_CHANGE_E(l)[2]:Neighbor changes event: neighbor status changed. (ProcessId=1, NeighborAddress=12.1.1.1, NeighborEvent=ExchangeDone, NeighborPreviousState=Exchange, NeighborCurrentState=Loading)
Mar 30 2018 11:47:03 NE40E_V6R7 %OSPF/4/NBR_CHANGE_E(l)[3]:Neighbor changes event: neighbor status changed. (ProcessId=1, NeighborAddress=12.1.1.1, NeighborEvent=NegotiationDone, NeighborPreviousState=ExStart, NeighborCurrentState=Exchange)
Mar 30 2018 11:47:03 NE40E_V6R7 %OSPF/4/NBR_CHANGE_E(l)[4]:Neighbor changes event: neighbor status changed. (ProcessId=1, NeighborAddress=12.1.1.1, NeighborEvent=2WayReceived, NeighborPreviousState=Init, NeighborCurrentState=ExStart)
Mar 30 2018 11:46:57 NE40E_V6R7 %OSPF/4/NBR_CHANGE_E(l)[5]:Neighbor changes event: neighbor status changed. (ProcessId=1, NeighborAddress=12.1.1.1, NeighborEvent=HelloReceived, NeighborPreviousState=Down, NeighborCurrentState=Init)
Mar 30 2018 11:46:54 NE40E_V6R7 %BFD/4/STACHG_TODWN(l)[6]:Slot=1,Vcpu=0;BFD session changed to Down. (SlotNumber=1, Discriminator=8209, Diagnostic=DetectDown, Applications=OSPF, ProcessPST=False, BindInterfaceName=Eth-Trunk10, InterfacePhysicalState=Down, InterfaceProtocolState=Down, PeerIp=12.1.1.1, SessType=D_IP_IF, RemoteDiscriminator=8209)
Mar 30 2018 11:46:54 NE40E_V6R7 %BFD/4/STACHG_TODWN(l)[7]:Slot=1,Vcpu=0;BFD session changed to Down. (SlotNumber=1, Discriminator=20, Diagnostic=DetectDown, Applications=None, ProcessPST=False, BindInterfaceName=GigabitEthernet1/1/6, InterfacePhysicalState=Down, InterfaceProtocolState=Down, PeerIp=224.0.0.184, SessType=S_IP_IF, RemoteDiscriminator=21)
Mar 30 2018 11:46:54 NE40E_V6R7 %PHY/4/PHY_STATUS_UP2DOWN(l)[8]:Slot=1,Vcpu=0;GigabitEthernet1/1/6 change status to down due to being shut.
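To put a number on the impact, the timestamps in the log above can be subtracted. The following is a minimal Python sketch, not a Huawei tool: the helper names are made up for illustration, the two sample lines are shortened versions of the BFD Down and OSPF Full entries above, and it assumes an English locale so that "Mar" parses as a month name.

```python
from datetime import datetime

# Timestamp layout used at the start of each log line, e.g. "Mar 30 2018 11:46:54".
LOG_TIME_FORMAT = "%b %d %Y %H:%M:%S"


def parse_log_time(line: str) -> datetime:
    """Parse the leading timestamp (first four space-separated fields) of a log line."""
    fields = line.split()[:4]
    return datetime.strptime(" ".join(fields), LOG_TIME_FORMAT)


def seconds_between(first_line: str, second_line: str) -> float:
    """Return the time from the first log line to the second, in seconds."""
    delta = parse_log_time(second_line) - parse_log_time(first_line)
    return delta.total_seconds()


# Shortened copies of two entries from the log above (message text abbreviated).
BFD_DOWN = "Mar 30 2018 11:46:54 NE40E_V6R7 BFD session changed to Down"
OSPF_FULL = "Mar 30 2018 11:47:03 NE40E_V6R7 NeighborCurrentState=Full"

if __name__ == "__main__":
    print(seconds_between(BFD_DOWN, OSPF_FULL))  # prints 9.0
```

In this capture, the OSPF neighbor needed about 9 seconds after the BFD Down event to return to Full, even though only one member port of the trunk had been shut down.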
Handling Process
1. Check the BFD for OSPF parameters. The RX and TX detection intervals are left at their defaults; the default interval is 10 ms.
#
ospf 1 router-id 10.159.240.147
 bfd all-interfaces enable
 area 0.0.0.0
  network 12.1.1.0 0.0.0.3
#
#
ospf 1 router-id 10.159.240.148
 bfd all-interfaces enable
 area 0.0.0.0
  network 12.1.1.0 0.0.0.3
#
2. When one member port of the Eth-Trunk is shut down, the BFD session goes Down and the OSPF neighbor relationship is rebuilt.
Root Cause
The root cause is that the NE40E needs 50 ms to detect that an Eth-Trunk member link is down, but the default BFD detection time is only 10 ms * 3 = 30 ms. When one of the member ports goes down, BFD packets are still sent to the port which has already been shut down until the Eth-Trunk detects the failure, so the BFD session times out first and brings the OSPF neighbor down with it.
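The timing comparison can be written out as a small Python sketch. It is only an illustration under simplifying assumptions: the function names are hypothetical, both ends are assumed to use the same interval and a detect multiplier of 3, and the 50 ms Eth-Trunk figure is the one quoted in this post. It does not model why copper and fiber ports call for different values.

```python
def bfd_detection_time_ms(interval_ms: int, multiplier: int = 3) -> int:
    """BFD detection time, assuming the same interval on both ends."""
    return interval_ms * multiplier


def bfd_fires_before_trunk(interval_ms: int,
                           multiplier: int = 3,
                           trunk_detect_ms: int = 50) -> bool:
    """True if BFD declares the session Down before the Eth-Trunk
    has detected the failed member link (50 ms in this post)."""
    return bfd_detection_time_ms(interval_ms, multiplier) < trunk_detect_ms


if __name__ == "__main__":
    for interval in (10, 100, 500):
        detect = bfd_detection_time_ms(interval)
        verdict = "goes Down" if bfd_fires_before_trunk(interval) else "stays Up"
        print(f"interval={interval} ms, detection={detect} ms: BFD session {verdict}")
```

With the default 10 ms interval the detection time is 30 ms, which is shorter than the 50 ms trunk detection time, so the session goes Down. With 100 ms or 500 ms the detection time is well above 50 ms.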
Solution
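Increase the BFD RX and TX intervals according to the cable type:
1. When the cable is RJ45 (copper), configure BFD RX and TX intervals of at least 500 ms.
2. When the cable is fiber, configure BFD RX and TX intervals of at least 100 ms.
For example, with both intervals set to 100 ms: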
#
ospf 1 router-id 10.159.240.147
 bfd all-interfaces enable
 bfd all-interfaces min-tx-interval 100 min-rx-interval 100
 area 0.0.0.0
  network 12.1.1.0 0.0.0.3
#
That is all I want to share with you! Thank you!