Keywords :
BFD, static route tracking, software upgrade
Abstract :
Before an upgrade from V6R8 to V8R10, a BFD session was down but did not impact connectivity. BFD was used to track the availability of a static route so that, if the route went down, VRRP would be informed and would switch service to the redundant node. After the upgrade, the new software acted on the down BFD session, which broke connectivity and took the service down.
Fault Type :
Operation and maintenance>>Upgrade or Update
Issue Description :
Fault Symptom :
After the NE40E-X3 router was upgraded from V600R008C10SPC300 to V800R010C10SPC500, connectivity to the LTE RAN sites was lost and LTE service was impacted for a whole region.
Version Information :
Before the upgrade, the router was running the version below :
##########################################
Huawei Versatile Routing Platform Software
VRP (R) software, Version 5.160 (NE40E&80E V600R008C10SPC300)
Copyright (C) 2000-2014 Huawei Technologies Co., Ltd.
HUAWEI NE40E-X3 uptime is 1434 days, 15 hours, 50 minutes
Patch version : V600R008SPH091
##########################################
After the upgrade, the router was running the version below :
Huawei Versatile Routing Platform Software
VRP (R) software, Version 8.180 (NE40E V800R010C10SPC500)
Copyright (C) 2012-2018 Huawei Technologies Co., Ltd.
HUAWEI NE40E-X3 uptime is 4 days, 7 hours, 24 minutes
Patch Version: V800R010SPH085

Relevant section of the configuration script :
#######################################################################################
#
#
N.B : CONFIGURATIONS ARE SIMILAR FOR THE OTHER PE NODE
########################################################################
For the IPRAN Router :
No BFD is configured on the IPRAN router. This is why the BFD session was down before the operation; with software version V600R008C10SPC300, this did not impact connectivity to the eNodeBs.
#
interface Eth-TrunkX.Y
vlan-type dot1q Y
description To_NE40E-X3-PE-Router_Eth-TrunkX.Y
set flow-stat interval 300
ip binding vpn-instance LTE_VRF
ip address *.*.1.2 255.255.255.248
vrrp vrid 25 virtual-ip *.*.1.2
vrrp vrid 25 priority 110
vrrp vrid 25 preempt-mode timer delay 60
statistic enable
#
ip route-static vpn-instance LTE_VRF 0.0.0.0 0.0.0.0 VRRP_IP_ON_PE description SGSN(MME)
#
N.B : THE OTHER IPRAN ROUTER HAS A SIMILAR CONFIGURATION TO THE ABOVE
##########################################################################
Operation Scenario:
1. During the night-time activity, initial checks were done on the node to be upgraded (running configuration saved; routing tables, ARP tables, MAC address tables, device and PIC status, memory, CPU, connectivity and BFD sessions checked).
At this stage, the BFD session was already down, but this had no impact on eNodeB connectivity :
Session MIndex : 32997 (Multi Hop) State : Down Name : vrrp833
--------------------------------------------------------------------------------
Session Type : Static
Bind Type : IP
Local/Remote Discriminator :Y/X
Vpn Instance Name : LTE_VRF
Received Packets : 0
Send Packets : 9231037
Received Bad Packets : 0
Send Bad Packets : 0
Down Count : 1
ShortBreak Count : 0
Send Lsp Ping Count : 0
Dynamic Session Delete Count : 0
Create Time : 2016-07-11 03:30+01:00
Last Down Time : 2016-07-11 03:30+01:00
Down Status Lasting Time : 1126D:08H:05M:24S
Last Up Time : 2016-07-11 03:30+01:00
Last Up Lasting Time : 059D:03H:37M:57S
Total Time From Create : 1185D:11H:43M:21S
--------------------------------------------------------------------------------
2. Copy the new software and configuration file to the device.
3. Set the next-startup files to the new software and configuration.
4. Reboot the node. After the reboot, all eNodeBs were unreachable, impacting LTE service (the BFD session was still down because BFD was not configured on the IPRAN node).
########################################################################################
<PE Router>
<PE ROUTER>dis bfd session all
(w): State in WTR
(*): State is invalid
--------------------------------------------------------------------------------
Local Remote PeerIpAddr State Type InterfaceName
--------------------------------------------------------------------------------
Y X VRRP_IP_RAN Down S_IP_PEER Eth-TrunkX.Y
##########################################################################################
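The check behind this output can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the sample text is the masked row shown above, and the parser assumes the six-column layout of this particular output (rows whose state carries a (w) or (*) marker are not handled).

```python
# Sample copied from the PE router output above (masked values kept as-is).
SAMPLE = """\
Local Remote PeerIpAddr State Type InterfaceName
--------------------------------------------------------------------------------
Y X VRRP_IP_RAN Down S_IP_PEER Eth-TrunkX.Y
"""

# States recognised by this sketch; anything else is treated as a non-session line.
KNOWN_STATES = {"Up", "Down", "Init", "AdmDown"}


def parse_bfd_summary(text):
    """Return one dict per session row; header and ruler lines are skipped."""
    sessions = []
    for line in text.splitlines():
        fields = line.split()
        # A session row has six columns with a recognised state in the fourth.
        if len(fields) == 6 and fields[3] in KNOWN_STATES:
            local, remote, peer, state, kind, interface = fields
            sessions.append({
                "local": local,
                "remote": remote,
                "peer": peer,
                "state": state,
                "type": kind,
                "interface": interface,
            })
    return sessions


def down_sessions(sessions):
    """Sessions that are Down (AdmDown is a deliberate shutdown and is not flagged)."""
    return [s for s in sessions if s["state"] == "Down"]


if __name__ == "__main__":
    for s in down_sessions(parse_bfd_summary(SAMPLE)):
        print("BFD session to", s["peer"], "on", s["interface"], "is Down")
```

Running such a check before and after a maintenance window makes a session that has been Down for a long time visible instead of silently tolerated.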
Troubleshooting Process :
1. A ping from the PE router to an eNodeB site was attempted and failed.
2. A ping to the next-hop address was attempted and also failed.
3. The route was checked and BFD was found to be configured for it.
4. The BFD session for this route was shut down; a ping to the eNodeB was then successful.
#####################################################################
bfd LTE_Route_BFD bind peer-ip *.*.1.2 vpn-instance LTE_VRF source-ip *.*.1.2
Root Cause :
The root cause was the BFD session being down, which impacted connectivity.
It has to be noted, however, that with the old software version (V600R008C10SPC300) the BFD session was down but was ignored by the software (the software was not sensitive to this configuration), so the route remained reachable.
After the upgrade to the new version (V800R010C10SPC500), the BFD session, which was already down, impacted connectivity (the new software is sensitive to this configuration).
This is in line with RFC 5880 (https://tools.ietf.org/html/rfc5880), under which a BFD session in the Down state indicates that the forwarding path is unavailable, so that applications monitoring the session (here, the static route) are expected to act on it.
This means the old software (V600R008C10SPC300) had a bug that was corrected in the new software (V800R010C10SPC500).
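The difference between the two releases can be illustrated with a very small model. This is a simplified sketch of the behaviour observed in this case, not the vendor's implementation: the function name and the `honour_bfd` flag are hypothetical, and only the three session states that appear in this case are considered.

```python
def route_usable(bfd_state, honour_bfd):
    """Simplified model of a static route bound to a BFD session.

    bfd_state:  "Up", "Down" or "AdmDown"
    honour_bfd: False models the old release as described in this case
                (session state ignored); True models the new release.
    """
    if not honour_bfd:
        # Old behaviour observed here: the route stays usable whatever BFD says.
        return True
    # New behaviour observed here: only a Down session withdraws the route.
    # AdmDown (session shut down by the operator) left the route usable.
    return bfd_state != "Down"


if __name__ == "__main__":
    for release, honour in (("V600R008C10SPC300", False), ("V800R010C10SPC500", True)):
        for state in ("Up", "Down", "AdmDown"):
            print(release, state, "usable" if route_usable(state, honour) else "withdrawn")
```

In this model the only combination that withdraws the route is a Down session on the new release, which matches the sequence of events in this case: Down before the upgrade (no impact), Down after the upgrade (outage), AdmDown after the workaround (service restored).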
Solution :
Workaround :
Since the node was already running the new software version (V800R010C10SPC500), we had to shut down the BFD session to restore connectivity to the eNodeBs.
#############################################################################
<PE-ROUTER>dis bfd session all
(w): State in WTR
(*): State is invalid
--------------------------------------------------------------------------------
Local Remote PeerIpAddr State Type InterfaceName
--------------------------------------------------------------------------------
Y X VRRP_IP_RAN AdmDown S_IP_PEER -
############################################################################
Actual Solution :
BFD was configured on the IPRAN router to match the BFD configuration on the PE router, so that the session comes up, connectivity is established, and switchover is rapid when a link or node becomes unavailable.
N.B: This configuration is similar to the one on the PE shown above, except that the local and remote discriminators are swapped and the BFD session is applied to the static route used to reach the MME.
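For a statically configured BFD session, the local discriminator on one end must equal the remote discriminator on the other end, and vice versa. A minimal sketch of that alignment check follows; the class name, function name and the discriminator values 10 and 20 are hypothetical, since the real values are masked as X and Y in this case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BfdEnd:
    """One end of a static BFD session (illustrative fields only)."""
    name: str
    local_discr: int
    remote_discr: int


def discriminators_match(a, b):
    """True when each end's local discriminator is the other end's remote one."""
    return a.local_discr == b.remote_discr and a.remote_discr == b.local_discr


if __name__ == "__main__":
    # Hypothetical values standing in for the masked Y/X discriminators.
    pe = BfdEnd("PE-Router", local_discr=10, remote_discr=20)
    ipran = BfdEnd("IPRAN-Router", local_discr=20, remote_discr=10)
    print("aligned" if discriminators_match(pe, ipran) else "mismatch")
```

A mismatch in these values leaves both ends transmitting without ever reaching the Up state, so it is worth checking alongside the peer and source addresses.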
Suggestion / Recommendation :
1. Before any operation, check the BFD sessions and make sure both ends are aligned.
2. A patch should be applied to the old software (V600R008C10SPC300) to correct the behaviour whereby a down BFD session did not impact connectivity.
The following patch was running on the node :
#################################################
<IPRAN-ROUTER>dis patch-information
Patch Package Name :cfcard:/v600r008sph091-300.pat
Patch Package Version:V600R008SPH091
The state of the patch state file is: Running
The current state is: Running
################################################

