
1+1 Linear MS on an OptiX OSN 3500 on the Live Network Is in Starting State


 

Hello, everyone!

 

This post describes an issue in which 1+1 linear MS protection groups on an OptiX OSN 3500 on the live network remained in starting state.

Problem description:

At one site, 32 linear MSP protection groups were configured on OptiX OSN 3500 NEs. The 23rd to 32nd groups were 1+1 linear MSP protection groups configured in single-ended non-revertive mode, but their protocols stayed in starting state. After the protocols were restarted, the problem was resolved.

#0x90cbe:cfg-get-lmsstate:23;

                               LMS-SWITCH-STATE                                 

                    PG-ID  PU-ID  SWITCH-REQUEST  SWITCH-STATE                   

                    23     0      LPS_NR          Starting     . 
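With 32 protection groups to check, it can help to scan saved command output instead of reading it by eye. The following is a minimal sketch, assuming the output has been captured as plain text in the column layout shown above. The function names `parse_lms_state` and `groups_stuck_starting` are hypothetical helpers, not part of any Huawei tool.

```python
import re

# Sample text copied from the command output above.
SAMPLE = """\
                    PG-ID  PU-ID  SWITCH-REQUEST  SWITCH-STATE
                    23     0      LPS_NR          Starting     .
"""

# A data row starts with two integers (PG-ID, PU-ID) followed by two words.
# The header row does not match because it does not start with a digit.
ROW = re.compile(r"^\s*(\d+)\s+(\d+)\s+(\S+)\s+(\S+)")


def parse_lms_state(text):
    """Return one dict per data row of the LMS-SWITCH-STATE table."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            pg_id, pu_id, request, state = match.groups()
            rows.append(
                {
                    "pg_id": int(pg_id),
                    "pu_id": int(pu_id),
                    "request": request,
                    "state": state,
                }
            )
    return rows


def groups_stuck_starting(text):
    """Return the PG-IDs whose switch state is still 'Starting'."""
    return [
        row["pg_id"]
        for row in parse_lms_state(text)
        if row["state"].lower() == "starting"
    ]


print(groups_stuck_starting(SAMPLE))  # prints [23]
```

Concatenating the output for groups 1 to 32 and passing it to `groups_stuck_starting` would list every group that needs a protocol restart.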

 Handling procedure:

NE software version: 5.21.18.50P01

Cross-connect board version: SSN1UXCSA 8.13

            Step 1      Analyzed the black box of the board in slot 9, and found that the board in slot 9 had been switched to standby and that the board in slot 10 had been the active board since September 16. When the board in slot 10 was switched to active, the timer failed to start. As a result, the protocols of some linear MSP groups were in starting state.

3904 2010-09-15 20:14 0x77 0C 40 01     //The active and standby cross-connect boards were manually switched.

3905 2010-09-15 20:14 0x95 0C E2 09 01  //The board in slot 9 was switched to the standby board.

            Step 2      Analyzed the black box of the board in slot 10, and found that the board was warm reset for the last time at 4:11:42.

4 2010-09-15 20:14 10 0xF0000010 0x3

The timer failed to start because a message could not be sent. The failure was returned with error code 524299, indicating that the number of messages in the queue had reached the maximum value.

243 2010-09-15 20:14 0xAC Level:2, Apsadpt.cpp, Line:8230, dwRc[524299], SendMsg Err

244 2010-09-15 20:14 0xAC Level:2, LpsAdpt.cpp, Line:5736, dwRe[1], Timer ERR

245 2010-09-15 20:14 0xAC Level:2, Apsadpt.cpp, Line:8230, dwRc[524299], SendMsg Err
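Entries like these are easier to triage when grouped by error code. Below is a minimal sketch that parses the three lines above and counts occurrences per code. The field layout is inferred only from these sample lines, and `parse_entries` and `count_by_code` are hypothetical names.

```python
import re
from collections import Counter

# The three black-box entries quoted above.
LOG = """\
243 2010-09-15 20:14 0xAC Level:2, Apsadpt.cpp, Line:8230, dwRc[524299], SendMsg Err
244 2010-09-15 20:14 0xAC Level:2, LpsAdpt.cpp, Line:5736, dwRe[1], Timer ERR
245 2010-09-15 20:14 0xAC Level:2, Apsadpt.cpp, Line:8230, dwRc[524299], SendMsg Err
"""

# Assumed layout: sequence, date, time, a hex tag, then comma-separated fields.
ENTRY = re.compile(
    r"^(?P<seq>\d+)\s+(?P<date>\S+)\s+(?P<time>\S+)\s+\S+\s+"
    r"Level:(?P<level>\d+),\s*(?P<file>[^,]+),\s*Line:(?P<line>\d+),\s*"
    r"\w+\[(?P<code>\d+)\],\s*(?P<text>.+)$"
)


def parse_entries(log):
    """Return a list of dicts, one per log line that matches the layout."""
    entries = []
    for raw in log.splitlines():
        match = ENTRY.match(raw.strip())
        if match:
            entry = match.groupdict()
            entry["code"] = int(entry["code"])
            entries.append(entry)
    return entries


def count_by_code(entries):
    """Count how often each numeric error code appears."""
    return Counter(entry["code"] for entry in entries)


entries = parse_entries(LOG)
counts = count_by_code(entries)
print(counts[524299], hex(524299))  # prints: 2 0x8000b
```

The hexadecimal form is printed only for convenience when searching other logs. How the code breaks down into fields is not stated in this post.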

When the protocol was restarted, four timers were disabled and one timer was enabled for each linear MS. However, a queue of the timer could only contain 128 bytes. 28 bytes were configured in the MS. Therefore, the message queue of the timer module overflowed, and some timers in the linear MS could not be started. As a result, the linear MS remained in starting state.

10     1  EXTCMDTIMER_STOP     0x0000   2010-09-15 20:14  0x042f8faa 

10     2  EXTCMDTIMER_STOP     0x0001   2010-09-15 20:14  0x042f90ff 

10     3  T1_STOP              0x0000   2010-09-15 20:14  0x042f9249 

10     4  T2_STOP              0x0000   2010-09-15 20:14  0x042f9394 

10     5  TK12_STOP            0x0000   2010-09-15 20:14  0x042f94df 

10     6  K_ON_OFF             0x0000   2010-09-15 20:14  0x042f9694 

10     7  K_ON_OFF             0x0001   2010-09-15 20:14  0x042f9775 

10     8  T1_START             0x0000   2010-09-15 20:14  0x042f9dc5 

 

            Step 3      Analyzed the ocplog, and found that the NE was upgraded at 03:45:29 on September 16, 2010. Before the upgrade, the board in slot 10 was the active board, and the board in slot 9 was the standby board. At 04:14:45 on September 16, 2010, the standby board was upgraded first and then the active board. However, the versions of the active and standby boards were different. As a result, the protocol running on the board in slot 10 was restarted after the active/standby switchover.

   Init OCP Log OK 2010-09-15 20:14 18 1

   NESOFT_VER: 5.21.18.50P01 Feb 25 2010 11:49:45

The 32nd MSP group was in stop state. Why?

            Step 4      Analyzed the bb4.log, and found that the command for starting the protocol of the 32nd MSP group was not received.

4030 2010-09-15 20:14 0x46 25 00 1A 00 00 00 01   //The 26th MSP group was created.

4186 2010-09-15 20:14 0xAC 25 08 19               //The 25th MSP group was enabled.

4223 2010-09-15 20:14 0x77 0C 40 01    //The board with a larger slot number was switched to the active board.

4224 2010-09-15 20:14 0x77 0C 40 01    //The board with a larger slot number was switched to the active board.

4225 2010-09-15 20:14 0x95 0C E2 0A 00 //The board in slot 10 was switched to the active board.

            Step 5      Recovered the NE database from the live network, and reset the standby board in the lab. After the standby board came online, the problem was reproduced by switching the active and standby cross-connect boards. The commands were lost after the configurations of the 25th MSP group were received.

 bb4.log                               2010-09-15 20:14          25 a0 19 00 

bb4.log                               2010-09-15 20:14          25 04 19 02 58  

The receive queue of the board software had a length of 128. When the protocol of each MSP group was started, six commands were delivered. If many commands were delivered at the same time, the commands delivered later were discarded. As a result, the MSP protocol was in stop state because no command for starting the protocol was received.
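The arithmetic behind this overflow can be illustrated with a small simulation. This is only a sketch under stated assumptions: all commands arrive in one burst, nothing is read from the queue in the meantime, and a full queue discards new commands. The queue length (128) and the six commands per group come from the description above. The function `deliver_burst` is hypothetical and does not represent the board software.

```python
from collections import deque

QUEUE_LENGTH = 128       # receive queue length stated in the post
COMMANDS_PER_GROUP = 6   # commands delivered per MSP group, per the post


def deliver_burst(group_count, queue_length=QUEUE_LENGTH,
                  per_group=COMMANDS_PER_GROUP):
    """Enqueue every group's commands in one burst with no consumer running.

    Returns (queue, dropped), where dropped maps group id -> lost commands.
    Assumption: a full queue discards new commands (tail drop).
    """
    queue = deque()
    dropped = {}
    for group in range(1, group_count + 1):
        for cmd in range(per_group):
            if len(queue) < queue_length:
                queue.append((group, cmd))
            else:
                dropped[group] = dropped.get(group, 0) + 1
    return queue, dropped


queue, dropped = deliver_burst(32)
fully_delivered = [g for g in range(1, 33) if g not in dropped]

# 21 groups fit completely (21 * 6 = 126 <= 128); losses begin at group 22.
print(len(fully_delivered), sorted(dropped)[0], sum(dropped.values()))
# prints: 21 22 64
```

In this worst-case model, losses begin around the 22nd group, which is in the same range as the threshold reported for this case. On a real board the queue is drained while commands arrive, so the exact group at which commands start to be lost can differ.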

Root cause:

When more than 22 linear MSP groups were configured, the message queues of the linear MS timer module and the command receiving module were too short. As a result, the protocol state was abnormal.

Solution:

Workaround: Manually restart the linear MSP group protocol.

Solution: Upgrade the software of the cross-connect board.     

That's all. Everyone is welcome to leave a message and share experiences in the comment area!

 

Thank you
