
Forwarding problem at 01:00 every day on the PE40MAIP01 device

Latest reply: Sep 30, 2018 11:32:49
【Problem Summary】Management and service problems occurred on the equipment at site Chiloe.
【Problem Details】The customer reported that, since December 13, RNC_CHILOE had a service interruption at around 1:02 a.m. every day, which then recovered by itself.
On the night of December 15, the customer ran tracert/ping tests from RNC_CHIO to RNC_PDV. The IP address of RNC_CHIO is 10.182.41.185, and the IP address of RNC_PDV is 10.182.40.169.
Before the problem appeared, the NE40E had been upgraded from V6R3 to V6R7.

Ravger
Created Mar 27, 2016 12:21:21

Handling Process
1.       Capture traffic on the sw2 device with port mirroring, starting at 00:55.

Run tracert and display the route from the sw2 device, using the interface next to the RNC as the source, and do the same from the RNC. Do this before and after the failure.

Run tracert and display the route from the PE of the fixed network, and again from the RNC. Do this before and after the failure.

Shut down each vlanif, test, and undo the shutdown, one vlanif at a time, until the faulty VLAN is found.

2.        For these tests, the customer replaced the NPGEP2 (RNC 2) with a laptop: GE3/0/18 -> UCHL2_NPGEP3_IFGE1 (VLANs 520 and 891). The IP address used is 10.178.199.46/30. At the moment the problem occurred, the ping to the GGSN continued to work (OK).
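As a side note on the laptop address in step 2, the following minimal Python sketch works out which subnet 10.178.199.46/30 belongs to and which other host address exists on that link. The function name `link_peers` is hypothetical and used only for illustration; the post does not say which address the far end actually used.

```python
import ipaddress
from typing import List, Tuple


def link_peers(cidr: str) -> Tuple[str, List[str]]:
    """Return the subnet of an interface address and the other usable hosts on it."""
    iface = ipaddress.ip_interface(cidr)
    network = iface.network
    # hosts() skips the network and broadcast addresses of the subnet
    peers = [str(host) for host in network.hosts() if host != iface.ip]
    return str(network), peers


if __name__ == "__main__":
    subnet, peers = link_peers("10.178.199.46/30")
    print("subnet:", subnet)
    print("other usable host(s):", peers)
```

A /30 has only two usable host addresses, so the laptop has exactly one possible on-link neighbour.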

After testing and analysis, the root cause was confirmed: when the device is configured with dot1q termination plus load balancing, the V6R7 software sometimes makes a mistake while updating the FIB table, which can affect data forwarding.
Root Cause

When the device tries to switch from the “single nexthop mapping table” to the “load-balancing multi-nexthop mapping table”, it mistakenly releases the index of the “single nexthop mapping table”.

If another route then applies for an index resource inside the device and is given this wrongly released index, the forwarding service is affected.



Scenario that triggers the problem:

a.       When route A switches to the “load-balancing multi-nexthop mapping table”, the device mistakenly releases the index of the “single nexthop mapping table”. (A no longer needs the index in the data forwarding plane, but although the index has been released, A still remembers it as its own.)

b.       A new route B is established and applies for an index resource inside the device. It is given the old index that belonged to A.

c.       When the routing table is refreshed, both A and B try to refresh the route entry, and the order decides the outcome:

·         If route A refreshes the route entry and then B refreshes it, route A causes a very short fault in the forwarding table, and the service is interrupted and recovers at once.

·         If route B refreshes the route entry and then A refreshes it, the correct information of route B is overwritten by route A, and the service is interrupted and does not recover by itself.
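Steps a to c can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the device's actual implementation: `IndexPool`, `Route`, `simulate`, and the next-hop labels `nh-A` and `nh-B` are hypothetical names, and the model assumes the released index is the first one handed out again.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


class IndexPool:
    """Hypothetical allocator for single-nexthop mapping-table indexes."""

    def __init__(self, size: int) -> None:
        self.free: List[int] = list(range(size))

    def alloc(self) -> int:
        return self.free.pop(0)

    def release(self, index: int) -> None:
        # Assumption: a released index is reused before any other index.
        self.free.insert(0, index)


@dataclass
class Route:
    name: str
    nexthop: str
    index: Optional[int] = None  # index the route believes it owns


def simulate(refresh_order: str) -> List[str]:
    """Return the content of B's table entry after each refresh write."""
    pool = IndexPool(4)
    table: Dict[int, str] = {}

    # Route A starts with a single-nexthop entry.
    a = Route("A", "nh-A", pool.alloc())
    table[a.index] = a.nexthop

    # Step a: A moves to load balancing. The index is released,
    # but A still remembers it as its own (the bug being modelled).
    pool.release(a.index)

    # Step b: new route B is given the index that A still remembers.
    b = Route("B", "nh-B", pool.alloc())
    table[b.index] = b.nexthop

    # Step c: both routes refresh "their" entry in the given order.
    routes = {"A": a, "B": b}
    history: List[str] = []
    for name in refresh_order:
        route = routes[name]
        table[route.index] = route.nexthop
        history.append(table[b.index])
    return history


if __name__ == "__main__":
    print("A then B:", simulate("AB"))
    print("B then A:", simulate("BA"))
```

In this model, the order "A then B" leaves B's entry wrong only until B writes it again, while "B then A" leaves A's stale next hop in B's entry, which matches the two outcomes described above.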
Solution

【Resolution Summary】

1.1 Temporary Solution

1.         Disable the periodic FIB refresh. This has been confirmed to have no impact on live network services.

2.         Because route flapping also triggers the problem, disabling the periodic FIB refresh cannot fully prevent it from recurring.

3.         The FIB error is triggered by dot1q termination combined with sub-interface load balancing, and in the live network each dot1q termination sub-interface currently terminates only one VLAN. The customer can therefore change the sub-interface mode from dot1q-termination to vlan-type dot1q to avoid the problem. After the interface mode is changed, the board must be reset for the configuration to take effect.

【Resolution Details】

Develop a new patch to fix the problem.

The plan and schedule for the fix are under internal discussion.


Mysterious.color
Created Sep 30, 2018 11:32:49

The handling process is very clear
