Hello everyone!
Today, I’d like to share a case.
Problem Description
MA5600 V300R003C05SPC300
U2000 V100R001C01SPC100
When a SELT test is run from the U2000 on ports of one particular DSLAM, it always fails with a timeout error.
Problem Analysis
1. We ran the SELT test from the DSLAM itself and it succeeded, so the DSLAM is not at fault.
2. We deleted the NE from the U2000 and added it again to rule out a problem in the U2000 database; the problem persisted.
3. We captured packets on the U2000 during the SELT operation for the problematic DSLAM and for a non-problematic one, then compared the two captures.
We found that some traps were missing in the capture for the problematic DSLAM, which prevents the U2000 from reading the result of the whole operation.
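The comparison in step 3 can be sketched in a few lines of Python. This is only an illustration: it assumes the trap names have already been exported from each capture as a simple list, and the trap names used here (seltStart, seltProgress, seltResult) are hypothetical placeholders, not the real trap names sent by the MA5600.

```python
from collections import Counter


def missing_traps(reference, suspect):
    """Return trap types seen more often in the reference capture than in
    the suspect capture, mapped to the number of missing occurrences."""
    # Counter subtraction keeps only positive differences.
    diff = Counter(reference) - Counter(suspect)
    return dict(diff)


# Hypothetical trap names, for illustration only.
good_dslam = ["seltStart", "seltProgress", "seltProgress", "seltResult"]
bad_dslam = ["seltStart", "seltProgress"]

print(missing_traps(good_dslam, bad_dslam))
```

Any trap type that shows up in the result was received for the healthy DSLAM but not (or not as often) for the problematic one.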
Root Cause
Network issue: SELT result traps sent by the DSLAM are lost in the network before they reach the U2000.
Solution Description
It was very hard to convince the customer that the problem was in his network, so we proposed two approaches to prove it:
1. Capture packets on the DSLAM uplink and on the U2000 at the same time, and compare the two captures to prove that packet loss exists in the network.
2. Back up the DSLAM database to another controller in the lab and run the SELT test in the lab. If it succeeds, take that card to the site and run the SELT operation from the site; also bring the controller from the site to the lab and run the SELT test there.
The customer agreed to the second proposal, as capturing from the uplink of the MA5600 was not an easy task.
1. The SELT test succeeded in the lab using the same database, which shows that the database has no errors.
2. After the lab controller was sent to the site, the problem also occurred with it, which supports our conclusion that packets are lost in the network.
3. Finally, after the main controller card from the network was brought to the lab, the test succeeded there, which removed any doubt.
SELT results are reported as traps, which are UDP packets. If any packet is lost, it is not sent again, which causes the whole test to fail.
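To see why even a small amount of loss can make the test fail almost every time, here is a rough sketch. It assumes that each trap is lost independently with the same probability and that the test succeeds only if every trap arrives; the number of traps per SELT test (20 below) is a made-up figure for illustration, not a value taken from the MA5600. Under these assumptions the success probability is (1 - loss_rate) ** trap_count.

```python
def selt_success_probability(loss_rate, trap_count):
    """Probability that all traps of one test arrive, assuming independent
    loss per packet and no retransmission (traps are plain UDP)."""
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if trap_count < 0:
        raise ValueError("trap_count must not be negative")
    return (1.0 - loss_rate) ** trap_count


# Hypothetical trap count per SELT test, for illustration only.
TRAPS_PER_TEST = 20

for loss in (0.01, 0.05, 0.10):
    p = selt_success_probability(loss, TRAPS_PER_TEST)
    print(f"loss rate {loss:.0%}: test succeeds with probability {p:.2f}")
```

The more traps a single test needs, the faster the success probability drops, which fits a test that fails consistently on one lossy path while working everywhere else.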
Suggestion
Sometimes substitution and comparison methods can solve a problem even without a full technical understanding of it.
You are welcome to leave a message below.
We’ll study together.
Thank you!
