S5720EI Stack Split Troubleshooting

Latest reply: May 11, 2018 06:53:12

Issue: An S5720EI stack split occurred. As a result, the Eth-Trunks connected to other devices flapped, affecting services.

 

Handling Process

                              
Step 1
According to the device logs, a member switch was removed from the stack at 21:33, and the standby switch became an independent switch after the split.

Feb 8 2018 21:33:29+03:00 S5720.Pobedy2.Opt %FSP/4/NBR_LOST(l)[30200]:Neighbor has been lost on port 1 in slot 1.

Feb 8 2018 21:33:29+03:00 S5720.Pobedy2.Opt %FSP/4/NBR_LOST(l)[30201]:Neighbor has been lost on port 2 in slot 1.

Feb 8 2018 21:33:29+03:00 S5720.Pobedy2.Opt %LOAD/6/SLOTLEFT(l)[30202]:Slot 2 left the stack.

Feb 8 2018 21:33:29+03:00 S5720.Pobedy2.Opt %ALML/4/ENT_PULL_OUT(l)[30204]:Board[2] was pulled out.

Feb 8 2018 21:33:29+03:00 S5720.Pobedy2.Opt %ALML/4/PUBLISH_EVENT(l)[30205]:Publish event. (Slot=2, Event ID=BOARD_PLUGOUT).
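The entries above follow a regular layout (date, time with UTC offset, host name, then %MODULE/severity/MNEMONIC), so the split-related events can be pulled out of a longer log file with a short script. The following is a minimal Python sketch, not a Huawei tool. The names LOG_RE, SPLIT_MNEMONICS, and find_split_events are hypothetical, the choice of which mnemonics count as "split-related" is an assumption based only on the entries shown in this post, and each entry is assumed to sit on a single line (wrapped entries need to be joined first).

```python
import re

# Layout assumed from the entries in this post (one entry per line):
#   <Mon> <day> <year> <hh:mm:ss+tz> <host> %<MODULE>/<severity>/<MNEMONIC>(l)[<seq>]:<message>
LOG_RE = re.compile(
    r"^(?P<date>\w{3}\s+\d{1,2}\s+\d{4})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+"
    r"%(?P<module>[A-Z]+)/(?P<severity>\d)/(?P<mnemonic>\w+)"
    r"\(\w\)\[(?P<seq>\d+)\]:(?P<message>.*)$"
)

# Assumption: these two mnemonics are the ones that indicate the split itself.
SPLIT_MNEMONICS = {"NBR_LOST", "SLOTLEFT"}

SAMPLE_LOGS = [
    "Feb 8 2018 21:33:29+03:00 S5720.Pobedy2.Opt %FSP/4/NBR_LOST(l)[30200]:Neighbor has been lost on port 1 in slot 1.",
    "Feb 8 2018 21:33:29+03:00 S5720.Pobedy2.Opt %FSP/4/NBR_LOST(l)[30201]:Neighbor has been lost on port 2 in slot 1.",
    "Feb 8 2018 21:33:29+03:00 S5720.Pobedy2.Opt %LOAD/6/SLOTLEFT(l)[30202]:Slot 2 left the stack.",
    "Feb 8 2018 21:33:29+03:00 S5720.Pobedy2.Opt %ALML/4/ENT_PULL_OUT(l)[30204]:Board[2] was pulled out.",
]


def find_split_events(lines):
    """Return the parsed entries whose mnemonic is in SPLIT_MNEMONICS."""
    events = []
    for line in lines:
        match = LOG_RE.match(line.strip())
        if match and match.group("mnemonic") in SPLIT_MNEMONICS:
            event = match.groupdict()
            event["seq"] = int(event["seq"])
            events.append(event)
    return events


if __name__ == "__main__":
    for event in find_split_events(SAMPLE_LOGS):
        print(event["time"], event["mnemonic"], event["message"])
```

Sorting or grouping the result by time makes it easier to see that both stack ports lost their neighbor in the same second as slot 2 left the stack.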

                              
Step 2
According to the logs generated before the device restarted, at the time of the fault the master and standby switches in the stack could not communicate with each other, and heartbeat packets were lost.

18-Feb-08 21:33:05.175.1+03:00 S5720.Pobedy2.Opt 01FSP/6/NVSTACK_TRACE(D)[14210781]:Slot=2;Nvramstack trace 45: 2018-02-08 13:33.170:Stack port 1 does not receive any hello packet for 10 second(s).

18-Feb-08 21:33:05.175.2+03:00 S5720.Pobedy2.Opt 01FSP/6/NVSTACK_TRACE(D)[14210782]:Slot=2;Nvramstack trace 46: 2018-02-08 13:33.170:Stack port 2 does not receive any hello packet for 10 second(s).

18-Feb-08 21:33:19.735.2+03:00 S5720.Pobedy2.Opt 01FSP/6/NVSTACK_TRACE(D)[14210987]:Nvramstack trace 68: 2018-02-08 13:33.715:Stack port 1 does not receive any hello packet for 10 second(s).

18-Feb-08 21:33:19.735.3+03:00 S5720.Pobedy2.Opt 01FSP/6/NVSTACK_TRACE(D)[14210988]:Nvramstack trace 69: 2018-02-08 13:33.715:Stack port 2 does not receive any hello packet for 10 second(s).

18-Feb-08 21:33:19.735.4+03:00 S5720.Pobedy2.Opt 01FSP/6/NVSTACK_TRACE(D)[14210989]:Nvramstack trace 70: 2018-02-08 13:33.725:Stack port 1 does not receive any hello packet for 15 second(s).

18-Feb-08 21:33:19.735.5+03:00 S5720.Pobedy2.Opt 01FSP/6/NVSTACK_TRACE(D)[14210990]:Nvramstack trace 71: 2018-02-08 13:33.735:Stack port 2 does not receive any hello packet for 15 second(s).
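To see how long each stack port went without hello packets, the NVSTACK_TRACE entries can be summarized per slot and port. The sketch below is a rough Python illustration under assumptions: the helper name longest_hello_gap is hypothetical, entries are assumed to be on one line each, and entries without a "Slot=" prefix are recorded with slot None because the post does not say which member produced them.

```python
import re

# Matches the hello-timeout text exactly as it appears in the traces above.
HELLO_RE = re.compile(
    r"Stack port (?P<port>\d+) does not receive any hello packet "
    r"for (?P<seconds>\d+) second\(s\)"
)
# Optional "Slot=<n>;" prefix that some entries carry.
SLOT_RE = re.compile(r"Slot=(?P<slot>\d+);")

SAMPLE_TRACES = [
    "18-Feb-08 21:33:05.175.1+03:00 S5720.Pobedy2.Opt 01FSP/6/NVSTACK_TRACE(D)[14210781]:Slot=2;Nvramstack trace 45: 2018-02-08 13:33.170:Stack port 1 does not receive any hello packet for 10 second(s).",
    "18-Feb-08 21:33:05.175.2+03:00 S5720.Pobedy2.Opt 01FSP/6/NVSTACK_TRACE(D)[14210782]:Slot=2;Nvramstack trace 46: 2018-02-08 13:33.170:Stack port 2 does not receive any hello packet for 10 second(s).",
    "18-Feb-08 21:33:19.735.2+03:00 S5720.Pobedy2.Opt 01FSP/6/NVSTACK_TRACE(D)[14210987]:Nvramstack trace 68: 2018-02-08 13:33.715:Stack port 1 does not receive any hello packet for 10 second(s).",
    "18-Feb-08 21:33:19.735.3+03:00 S5720.Pobedy2.Opt 01FSP/6/NVSTACK_TRACE(D)[14210988]:Nvramstack trace 69: 2018-02-08 13:33.715:Stack port 2 does not receive any hello packet for 10 second(s).",
    "18-Feb-08 21:33:19.735.4+03:00 S5720.Pobedy2.Opt 01FSP/6/NVSTACK_TRACE(D)[14210989]:Nvramstack trace 70: 2018-02-08 13:33.725:Stack port 1 does not receive any hello packet for 15 second(s).",
    "18-Feb-08 21:33:19.735.5+03:00 S5720.Pobedy2.Opt 01FSP/6/NVSTACK_TRACE(D)[14210990]:Nvramstack trace 71: 2018-02-08 13:33.735:Stack port 2 does not receive any hello packet for 15 second(s).",
]


def longest_hello_gap(lines):
    """Return {(slot, port): longest reported gap in seconds}.

    slot is None when the entry has no "Slot=" prefix.
    """
    worst = {}
    for line in lines:
        hello = HELLO_RE.search(line)
        if not hello:
            continue
        slot_match = SLOT_RE.search(line)
        slot = int(slot_match.group("slot")) if slot_match else None
        key = (slot, int(hello.group("port")))
        worst[key] = max(worst.get(key, 0), int(hello.group("seconds")))
    return worst


if __name__ == "__main__":
    for (slot, port), seconds in sorted(
        longest_hello_gap(SAMPLE_TRACES).items(), key=str
    ):
        print(f"slot={slot} port={port} longest gap={seconds}s")
```

In the sample, both stack ports report the timeout on both sides, which is consistent with the whole member losing contact rather than a single stack cable failing.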

                              
Step 3
According to the diagnostic logs, a large number of IPC message errors were generated at the time of the fault. As a result, heartbeat packets were lost, causing the stack split.

18-Feb-08 21:32:58.175.2+03:00 S5720.Pobedy2.Opt 01IPC/4/RESENDNOTMATCH(D)[14210740]:The IPC sending sequence number was not the expected number because retransmitted. (DestinationChannelId=31014931, SourceNode=2/0, QueueId=1, SendNum=357, ExceptedNum=360, LostNum=65535,FragFlag=30, MessageId=313979825)

18-Feb-08 21:32:58.175.3+03:00 S5720.Pobedy2.Opt 01IPC/6/RECVFISTRESENDMSG(D)[14210741]:IPC received the first retransmitted message. (DestinationChannelId=31027278, SourceNode=2, QueueId=2, SendSequenceNumber=113, ExceptedSequenceNumber=118, LostSequenceNumber=65535, ExceptedResendSequenceNumber=65535, FragmentFlag=30, MessageId=186982377)

18-Feb-08 21:32:58.175.4+03:00 S5720.Pobedy2.Opt 01IPC/4/RESENDNOTMATCH(D)[14210742]:The IPC sending sequence number was not the expected number because retransmitted. (DestinationChannelId=31027278, SourceNode=2/0, QueueId=2, SendNum=113, ExceptedNum=118, LostNum=65535,FragFlag=30, MessageId=186982377)

18-Feb-08 21:32:58.255.1+03:00 S5720.Pobedy2.Opt 01IPC/4/RESENDINFO(D)[14210743]:Slot=2;The IPC message was retransmitted. (DestinationChannelId=19, DestinationNode=[9], QueueId=1, SendSequenceNumber=357, QueueLength=3, QueueTail=359, ResendTimes=1, FragmentFlag=30, MessageId=313979825)

18-Feb-08 21:32:58.255.2+03:00 S5720.Pobedy2.Opt 01IPC/4/RESENDINFO(D)[14210744]:Slot=2;The IPC message was retransmitted. (DestinationChannelId=78, DestinationNode=[9], QueueId=2, SendSequenceNumber=113, QueueLength=5, QueueTail=117, ResendTimes=1, FragmentFlag=30, MessageId=186982377)
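The IPC entries carry their details as key=value pairs, so the mismatch between the sequence number that was sent and the one the receiver expected can be computed directly. The following Python sketch is only an illustration: parse_fields and sequence_gap are hypothetical helper names, entries are assumed to be on one line each, and reading "expected minus sent" as the number of messages the retransmission lags behind is an interpretation, not something the logs state. The spelling "Excepted" is kept because that is how the field names appear in the logs.

```python
import re

# key=value pairs; values end at a comma, semicolon, parenthesis, or whitespace.
FIELD_RE = re.compile(r"(\w+)=([^,;()\s]+)")

SAMPLE_ENTRIES = [
    "18-Feb-08 21:32:58.175.2+03:00 S5720.Pobedy2.Opt 01IPC/4/RESENDNOTMATCH(D)[14210740]:"
    "The IPC sending sequence number was not the expected number because retransmitted. "
    "(DestinationChannelId=31014931, SourceNode=2/0, QueueId=1, SendNum=357, "
    "ExceptedNum=360, LostNum=65535,FragFlag=30, MessageId=313979825)",
    "18-Feb-08 21:32:58.175.3+03:00 S5720.Pobedy2.Opt 01IPC/6/RECVFISTRESENDMSG(D)[14210741]:"
    "IPC received the first retransmitted message. (DestinationChannelId=31027278, "
    "SourceNode=2, QueueId=2, SendSequenceNumber=113, ExceptedSequenceNumber=118, "
    "LostSequenceNumber=65535, ExceptedResendSequenceNumber=65535, FragmentFlag=30, "
    "MessageId=186982377)",
    "18-Feb-08 21:32:58.255.1+03:00 S5720.Pobedy2.Opt 01IPC/4/RESENDINFO(D)[14210743]:"
    "Slot=2;The IPC message was retransmitted. (DestinationChannelId=19, "
    "DestinationNode=[9], QueueId=1, SendSequenceNumber=357, QueueLength=3, "
    "QueueTail=359, ResendTimes=1, FragmentFlag=30, MessageId=313979825)",
]


def parse_fields(entry):
    """Return the key=value pairs of one log entry as a dict of strings."""
    return dict(FIELD_RE.findall(entry))


def sequence_gap(fields):
    """Return expected minus sent sequence number, or None if not reported.

    Handles both field spellings seen in the logs above.
    """
    for sent_key, expected_key in (
        ("SendNum", "ExceptedNum"),
        ("SendSequenceNumber", "ExceptedSequenceNumber"),
    ):
        if sent_key in fields and expected_key in fields:
            return int(fields[expected_key]) - int(fields[sent_key])
    return None


if __name__ == "__main__":
    for entry in SAMPLE_ENTRIES:
        fields = parse_fields(entry)
        print(fields.get("MessageId"), sequence_gap(fields), fields.get("QueueLength"))
```

For the entries in this post, the gaps for MessageId 313979825 and 186982377 come out as 3 and 5, the same values that the RESENDINFO entries report as QueueLength for those messages.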

----End



Root Cause

Based on the preceding analysis and the fault symptoms, the root cause is as follows:

IPC message errors on the S5720EI stack caused heartbeat packet loss between member devices, leading to a stack split.


Solution

The latest patch for V200R010 resolves this issue. However, the switches on the live network run V200R010SPC300, which is not the GA version. To resolve the issue, upgrade the switches on the live network to V200R010SPC600 and install the latest patch.
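When several stacks need to be checked, the version comparison can be scripted. The sketch below is a hedged Python example: parse_version and needs_upgrade are hypothetical names, the V/R/C/SPC pattern is an assumption generalized from the two version strings in this post, and a missing C or SPC part is treated as 0. It only compares version strings and does not tell whether the latest patch is installed.

```python
import re

# Assumed naming pattern: V<nnn>R<nnn>[C<nn>][SPC<nnn>], e.g. V200R010SPC300.
VERSION_RE = re.compile(r"^V(\d+)R(\d+)(?:C(\d+))?(?:SPC(\d+))?$")

# Target version named in the solution above.
TARGET_VERSION = "V200R010SPC600"


def parse_version(text):
    """Return (V, R, C, SPC) as integers; missing parts count as 0."""
    match = VERSION_RE.match(text.strip())
    if not match:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def needs_upgrade(running, target=TARGET_VERSION):
    """Return True if the running version is older than the target."""
    return parse_version(running) < parse_version(target)


if __name__ == "__main__":
    for version in ("V200R010SPC300", "V200R010SPC600"):
        print(version, "needs upgrade:", needs_upgrade(version))
```

The running version itself still has to be read from each switch (for example from the output of display version) and passed in as a string.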
 
 





 




 


This post was last edited by Cybertan at 2018-02-08 13:33.

MVE Created Apr 13, 2018 10:46:41

useful document, thanks

Telecommunications Engineer, currently senior project manager of the radio access network and partner of Huawei de Tunisia.
Created May 11, 2018 06:53:12

useful document, thanks