Hello,
Today I would like to share a case in which service faults were caused by boards installed in the wrong slots on an OSN1500B.
Problem
In the SDH network of a project, one site (B55) is equipped with an OSN1500B, which forms an STM-1 ring with three other network elements. The topology is shown in the picture below. There are EPL Ethernet services between station B55 and station CS7. The Ethernet service processing boards are EFS0 and EFS0A, respectively. Station B55 uses an EPL configuration, CS7 uses an EPLAN configuration, and the Ethernet services are protected by SNCP on the line side.

Alarm message
After the optical cable between site B52 and site B55 is interrupted, a VCTRUNK on the EFS0A board of CS7 reports the VCAT_LOA alarm, and the customer reports that the Ethernet service is interrupted.
Process
The first time, the optical cable broke between B52 and B55:
1. When the optical cable was interrupted, the network management system showed no VCAT_LOA alarm and the client-side services were normal, so the engineer paid no further attention to the services and concentrated on the cable fault.
2. A few hours later (the optical cable had still not been repaired), a VCTRUNK on the EFS0A board of the CS7 network element reported the VCAT_LOA alarm, and the customer's Ethernet service was interrupted. The field engineer was still working on the cable fault at the site, so the local engineer, relying on his own experience, deleted the Ethernet service between B55 and CS7 and configured it again. The alarm then disappeared and the service resumed. The local engineer believed that the fault had been resolved and told the field engineer to finish repairing the optical cable and return.
3. The alarm did not reappear, and everything seemed normal.
The second time, a few weeks later, the optical cable was disconnected again between stations B52 and B55:
1- When the optical cable was interrupted, the behavior was the same as the first time: the network management system showed no alarm and the services were normal.
2- A few hours later, the fault reappeared: the CS7 network element reported the VCAT_LOA alarm, and the client-side Ethernet service was interrupted again. After carefully analyzing the cause of the VCAT_LOA alarm, the engineer queried the board manufacturing information of the B55 network element in the network management system and found that the CXL1 boards in slots 4 and 5 of the OSN1500B had not been installed according to the engineering design requirements.
According to the engineering design requirements:
Slot 4 should be CXLL112----STM-1 System Control, Cross-connect, Optical Interface Board (L1.2, LC);
Slot 5 should be CXLL111----STM-1 System Control, Cross-connect, Optical Interface Board (L1.1, LC);
The slot 4 circuit board should be connected to site B52;
The slot 5 circuit board should be connected to site B57.
When the engineer went to the site, he found that the boards in slots 4 and 5 had been swapped, which did not meet the design requirements.
After the boards in slots 4 and 5 were exchanged to match the design and the data was re-delivered from the network management system to the network element, the alarm disappeared and the service resumed.
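The slot check that the engineer did by hand can be pictured as a comparison between the design and the board information read from the network management system. The following is only a minimal sketch: the two dictionaries are hypothetical stand-ins for the design document and the queried board information, and `find_slot_mismatches` is an illustrative helper, not a function of any network management system.

```python
# Expected boards per slot, taken from the engineering design quoted above.
DESIGN = {4: "CXLL112", 5: "CXLL111"}
# Hypothetical inventory modelling the reversed boards found on site.
INSTALLED = {4: "CXLL111", 5: "CXLL112"}


def find_slot_mismatches(design, installed):
    """Return {slot: (expected, actual)} for every slot that deviates from the design."""
    mismatches = {}
    for slot, expected in sorted(design.items()):
        actual = installed.get(slot)  # None means no board was reported for the slot
        if actual != expected:
            mismatches[slot] = (expected, actual)
    return mismatches


if __name__ == "__main__":
    for slot, (expected, actual) in find_slot_mismatches(DESIGN, INSTALLED).items():
        print(f"slot {slot}: expected {expected}, found {actual}")
```

Running such a comparison right after installation, rather than after the second outage, would have flagged both slots before any fiber break exposed the problem.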
Causes
VCAT_LOA is an alarm indicating excessive virtual concatenation delay. It is reported when the delay of the time slots bound to a VCTRUNK exceeds the delay that virtual concatenation allows.
When service data is transmitted and the delay alignment time of the virtual concatenation is too long, the time slots cannot be assembled into a data frame, so the service loses packets.
The possible cause of the VCAT_LOA alarm is as follows:
The time slots configured for the VCTRUNK pass through physical links of different lengths.
Supplement:
On the EFS0/EFS0A board, the virtual concatenation delay time is 30 ms for VC-12 time slots and 15 ms for VC-3 time slots.
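To make the limits concrete, here is a rough sketch of the check, assuming that the limit applies to the spread between the fastest and the slowest time slot of one VCTRUNK. The function names and the sample delays are hypothetical; only the 30 ms and 15 ms figures come from the text above.

```python
# Delay limits quoted above for the EFS0/EFS0A board, in milliseconds.
VCAT_DELAY_LIMIT_MS = {"VC-12": 30.0, "VC-3": 15.0}


def differential_delay_ms(member_delays_ms):
    """Spread between the slowest and the fastest member of one VCTRUNK."""
    return max(member_delays_ms) - min(member_delays_ms)


def exceeds_vcat_limit(member_delays_ms, level):
    """True if the spread is larger than the limit for the given VC level."""
    return differential_delay_ms(member_delays_ms) > VCAT_DELAY_LIMIT_MS[level]


if __name__ == "__main__":
    # Hypothetical per-member delays in ms, not measurements from this network.
    same_path = [2.0, 2.0, 2.1]
    split_paths = [2.0, 2.0, 35.0]
    print("same path exceeds limit:", exceeds_vcat_limit(same_path, "VC-12"))
    print("split paths exceed limit:", exceeds_vcat_limit(split_paths, "VC-12"))
```

With these made-up numbers, members that share a path stay far below the VC-12 limit, while a single member on a much longer path pushes the spread past it.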
Summary
1- When the fiber is broken, SNCP switching occurs. The switching can create multipath: the services of one VCTRUNK reach the opposite end over different paths. If the two paths differ greatly, the difference in arrival time at the opposite end exceeds the virtual concatenation delay time, and the service is interrupted.
2- Likewise, if the boards are installed or connected incorrectly, a service multipath problem arises. When the service reaches the opposite end, the virtual concatenation delay is exceeded, the alarm is reported, and the service is interrupted.
3- Why doesn't the alarm appear immediately?
Even with multipath, the services may at first reach the opposite end within the virtual concatenation delay. If a link then adds extra delay during a certain period, or SNCP switching on the line side changes the path, the virtual concatenation delay is exceeded.
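One way to picture point 3 is to give each time slot of the VCTRUNK a working path and a protection path and to recompute the spread whenever a selector changes. The sketch below does this under loud assumptions: the `Member` class, the slot names, and all delay values are hypothetical, and the 30 ms figure is the VC-12 value quoted in the supplement.

```python
from dataclasses import dataclass

VC12_LIMIT_MS = 30.0  # VC-12 figure quoted in the supplement above


@dataclass
class Member:
    """One time slot of a VCTRUNK with two candidate paths (hypothetical model)."""
    name: str
    working_ms: float
    protection_ms: float
    on_protection: bool = False

    def delay_ms(self):
        # Delay of whichever path the selector currently uses.
        return self.protection_ms if self.on_protection else self.working_ms


def spread_ms(members):
    """Difference between the slowest and the fastest member in their current state."""
    delays = [m.delay_ms() for m in members]
    return max(delays) - min(delays)


if __name__ == "__main__":
    # Hypothetical delays in ms; none of these were measured on the real ring.
    members = [
        Member("slot-1", working_ms=2.0, protection_ms=36.0),
        Member("slot-2", working_ms=2.0, protection_ms=36.0),
    ]
    print("before switching:", spread_ms(members) > VC12_LIMIT_MS)
    members[0].on_protection = True  # only one selector switches
    print("after one member switches:", spread_ms(members) > VC12_LIMIT_MS)
```

In this toy model nothing is wrong while both time slots share a path; the limit is only exceeded once a later event moves one of them, which matches the delayed appearance of the alarm described above.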
You are welcome to leave a message and discuss in the comment area. Thank you!




