PTN troubleshooting

Latest reply: Jun 15, 2021 08:04:57

Hello,


Today I would like to share some PTN troubleshooting methods with you.


I. Overview


Troubleshooting is an important part of daily PTN network maintenance. This PTN network troubleshooting manual was compiled to support that work.


The troubleshooting of PTN equipment in this manual involves three parts: fault analysis and location, emergency recovery from service interruption faults, and handling methods for common faults. This post covers the first part, fault analysis and location, in detail.


II. Network fault analysis and location methods


Based on experience with network elements dropping out of management and with service interruptions on the live network, faults are generally handled in this order: first analyze, then switch over or reset, and replace the board only as the third step. This keeps the PTN network running stably and minimizes unexpected incidents.


When handling a fault, start by analyzing the fault symptoms and locate the cause as quickly as possible. This section introduces several methods for analyzing and locating faults, along with their application scenarios and examples.


II.1 Alarm analysis method


The alarm analysis method is one of the most commonly used methods for locating faults. An equipment failure is usually accompanied by a large number of alarms. Analyzing these alarms gives a rough indication of the type and location of the fault.


Query alarms through the U2000: Right-click the NE icon in the U2000 main topology to query the following alarm information:


  • Current alarms

  • Historical alarms on the network element side

  • Historical alarms on the network management side


Analyze the alarms to locate their cause, then remove that cause so that the alarm clears and the fault is eliminated.


When obtaining alarm information through the U2000, make sure that the current time of every network element in the network is synchronized with the network management time. If a network element's time is not synchronized, the timestamps in the reported alarm information will be wrong. Pay special attention to this after reconfiguring a network element during maintenance: synchronize its time with the network management time again. Otherwise, the network element runs on its default time, which is not the current time.
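The time check described above can be sketched as a simple comparison. This is only an illustration: the function name, the 5-second tolerance, and the sample clock values are assumptions, and the times would have to be read from the U2000 and the NEs by whatever means is available to you.

```python
from datetime import datetime, timedelta

# Hypothetical tolerance; choose a value that suits your network.
MAX_SKEW = timedelta(seconds=5)

def find_unsynchronized_nes(nms_time, ne_times, max_skew=MAX_SKEW):
    """Return {ne_name: skew} for NEs whose clock differs from the NMS by more than max_skew."""
    return {
        name: abs(ne_time - nms_time)
        for name, ne_time in ne_times.items()
        if abs(ne_time - nms_time) > max_skew
    }

# Sample values only: NE2 stands for an NE that fell back to a default time.
nms_now = datetime(2021, 5, 9, 12, 0, 0)
ne_clocks = {
    "NE1": datetime(2021, 5, 9, 12, 0, 2),
    "NE2": datetime(2000, 1, 1, 0, 3, 10),
}

out_of_sync = find_unsynchronized_nes(nms_now, ne_clocks)
for name, skew in out_of_sync.items():
    print(f"{name}: clock differs from the NMS by {skew}")
```

Any NE reported here should be resynchronized before its alarm timestamps are used for analysis.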


Example 1: In a simple network, clearing the alarm generally eliminates the fault as well.


In the link diagram shown in the figure below, the network management computer is connected to NE2.


Symptom: The E-Line service between NE1 and NE2 is interrupted, and NE2 reports an ETH_LOS alarm.


Fault analysis and location: Check the possible causes of the ETH_LOS alarm one by one until the cause of the service interruption is located. After the alarm is cleared, the service returns to normal and the fault is rectified.


Example 2: In a complex network, analyzing both new alarms and cleared historical alarms can reveal the key to troubleshooting.

A broadcast storm suddenly occurred in a complex ring network. A large number of FLOW_OVER alarms appeared on every network element, and services were interrupted. The service was restored by measures such as disconnecting the ring fiber, and the FLOW_OVER alarms cleared, but the cause of the failure could not be located.

Analysis of the alarms across the entire network showed that when one UNI port reported the FLOW_OVER alarm, the historical ETH_LOS alarm on that port cleared automatically. Following this clue, it was found that a remote loopback had occurred on the third-party device connected to the UNI port, which created a loop on the network. After the loopback was released, the fault was completely eliminated.
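The correlation used in Example 2 (an ETH_LOS alarm clearing on the same port at about the time FLOW_OVER is raised) can be sketched as follows. The record layout, the 60-second window, and the sample events are assumptions for illustration; they are not a U2000 export format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class AlarmEvent:
    ne: str
    port: str
    name: str      # alarm name, e.g. "ETH_LOS" or "FLOW_OVER"
    action: str    # "raised" or "cleared"
    time: datetime

def correlate(events, raised_name="FLOW_OVER", cleared_name="ETH_LOS",
              window=timedelta(seconds=60)):
    """Return (cleared_event, raised_event) pairs on the same NE and port within the window."""
    cleared = [e for e in events if e.name == cleared_name and e.action == "cleared"]
    raised = [e for e in events if e.name == raised_name and e.action == "raised"]
    return [
        (c, r)
        for r in raised
        for c in cleared
        if (c.ne, c.port) == (r.ne, r.port) and abs(r.time - c.time) <= window
    ]

# Hypothetical sample events; NE and port names are placeholders.
t0 = datetime(2021, 5, 9, 10, 0, 0)
events = [
    AlarmEvent("NE3", "UNI-1", "ETH_LOS", "cleared", t0),
    AlarmEvent("NE3", "UNI-1", "FLOW_OVER", "raised", t0 + timedelta(seconds=4)),
    AlarmEvent("NE4", "NNI-2", "FLOW_OVER", "raised", t0 + timedelta(seconds=6)),
]

pairs = correlate(events)
for c, r in pairs:
    print(f"{r.ne} {r.port}: {c.name} cleared, then {r.name} raised")
```

A port that appears in the result is a candidate for the point where the loop entered the network, since only that port changed link state when the storm began.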


II.2 Performance statistical analysis method


The performance statistical analysis method analyzes and locates faults using "current performance" and "RMON performance" statistics. Checking whether the performance statistics of boards, ports, tunnels, and PWs are normal shows whether a fault exists.


Current performance: The following table lists the judgment criteria for current performance statistics by object.


Table: Current performance statistics

Object | Judgment criteria
Physical board/port | Board: optical power, operating temperature, and CPU/memory usage should be within the normal range. Port: there should be no errors.
MPLS Tunnel | There is no packet loss in the tunnel.
IP/GRE Tunnel | There is no packet loss in the tunnel.
Ethernet service OAM | There is no packet loss.


  • Turn on the current performance statistics function of the network element before querying.

  • The current performance items that are available, and their specific meanings, depend on what the network element supports.
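The judgment criteria for current performance can be sketched as a range check plus a packet-loss check. All metric names, limits, and readings below are hypothetical; real normal ranges depend on the board and optical module in use.

```python
# Hypothetical normal ranges as (low, high); not values from any product manual.
BOARD_LIMITS = {
    "rx_optical_power_dbm": (-14.0, -1.0),
    "temperature_c": (0.0, 70.0),
    "cpu_usage_pct": (0.0, 80.0),
    "memory_usage_pct": (0.0, 80.0),
}

def out_of_range(metrics, limits=BOARD_LIMITS):
    """Return {metric: value} for every reading outside its (low, high) range."""
    return {
        name: value
        for name, value in metrics.items()
        if name in limits and not (limits[name][0] <= value <= limits[name][1])
    }

# Sample readings for one board and lost-packet counts for two tunnels.
board = {"rx_optical_power_dbm": -9.5, "temperature_c": 83.0, "cpu_usage_pct": 35.0}
tunnel_lost_packets = {"MPLS Tunnel 1": 0, "IP/GRE Tunnel 2": 12}

board_problems = out_of_range(board)
lossy_tunnels = [name for name, lost in tunnel_lost_packets.items() if lost > 0]

print("Board metrics out of range:", board_problems)
print("Tunnels with packet loss:", lossy_tunnels)
```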


RMON performance: The following table lists the judgment criteria for RMON performance statistics by object.


Table: RMON performance statistics

Object | Judgment criteria
Physical board/port | Main control board: CPU usage should not be too high. Port: there are counts in the sending/receiving directions.
MPLS Tunnel | There are counts in the sending/receiving directions. No packet loss.
IP/GRE Tunnel | There are counts in the sending/receiving directions. No packet loss.
Service PW | There are counts in the sending/receiving directions of the PW. There are no abnormal statistics such as packet loss, out-of-sequence packets, or bit errors.


  • A tunnel is bidirectional, so you can select either the forward tunnel or the reverse tunnel as the "object".

  • The RMON performance items that are available, and their specific meanings, depend on what the network element supports.
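The RMON criteria above (traffic counted in both directions, no abnormal counters) can be sketched as a small check. The counter names and sample values are assumptions chosen for illustration, not actual RMON item names on the NE.

```python
# Hypothetical counter names for the abnormal statistics listed in the table.
ABNORMAL_COUNTERS = ("lost_packets", "out_of_order_packets", "bit_errors")

def rmon_findings(stats):
    """Return a list of human-readable findings for one object (port, tunnel, or PW)."""
    findings = []
    if stats.get("tx_packets", 0) == 0:
        findings.append("no packets counted in the sending direction")
    if stats.get("rx_packets", 0) == 0:
        findings.append("no packets counted in the receiving direction")
    for counter in ABNORMAL_COUNTERS:
        value = stats.get(counter, 0)
        if value > 0:
            findings.append(f"{counter} = {value}")
    return findings

# Sample statistics for two PWs.
healthy_pw = {"tx_packets": 120000, "rx_packets": 119950}
faulty_pw = {"tx_packets": 120000, "rx_packets": 0, "lost_packets": 37}

print(rmon_findings(healthy_pw))
print(rmon_findings(faulty_pw))
```

An empty list means the object meets the criteria; any finding is a starting point for further analysis on that object.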


Example: Two tunnels were configured between two network elements, but the APS protection group was configured at only one end, resulting in poor ATM service quality. Querying the RMON performance of the ATM service on the network management system showed out-of-order packets, with the out-of-order packet count at about 50% of the received cell count. From this, it was concluded that the service was being received twice, and the cause of the problem was found.
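The reasoning in this example can be sketched as a ratio check. The 50% target comes from the example itself; the tolerance, function names, and sample counts are assumptions, and a ratio near 50% is only a hint that still has to be confirmed against the APS configuration at both ends.

```python
def out_of_order_ratio(out_of_order, received):
    """Fraction of received cells that were counted as out of order."""
    if received == 0:
        return 0.0
    return out_of_order / received

def looks_double_received(out_of_order, received, target=0.5, tolerance=0.1):
    """True when the out-of-order ratio is close to the target (about 50% by default)."""
    if received == 0:
        return False
    return abs(out_of_order_ratio(out_of_order, received) - target) <= tolerance

# Sample counts only.
print(looks_double_received(out_of_order=5000, received=10000))
print(looks_double_received(out_of_order=10, received=10000))
```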


Feel free to leave a message and discuss in the comment area. Thank you!


Irina
Admin Created 2 days 08:04

Hello, @wissal
Because your article is very qualitative and valuable, we've decided to feature it on our Blog Collection: https://forum.huawei.com/enterprise/en/forum.php?mod=collection&action=view&ctid=431&orderby=views&order=desc

Congrats!

wissal
wissal Created 2 days 17:42 (0) (0)
Thank you so much  
GhaziAsad
Created May 9, 2021 17:59:11

Cool

Adriale
Adriale Created May 11, 2021 14:27:46 (0) (0)
 
GhaziAsad
GhaziAsad Reply Adriale  Created May 11, 2021 16:40:34 (0) (0)
 
GhaziAsad
GhaziAsad Reply Adriale  Created May 11, 2021 16:40:44 (0) (0)
 
GhaziAsad
Created May 9, 2021 17:59:22

Thanks for sharing Wissal

shakeela
shakeela Created May 9, 2021 18:08:02 (0) (0)
 
wissal
wissal Created May 9, 2021 19:18:08 (0) (0)
Thanks for your reading!  
andersoncf1
Moderator Created May 9, 2021 18:03:47

Well done

shakeela
shakeela Created May 9, 2021 18:08:10 (0) (0)
 
wissal
wissal Created May 9, 2021 19:18:20 (0) (0)
Thanks for your reading!  
Irshadhussain
Irshadhussain Created May 22, 2021 17:09:47 (0) (0)
 
shakeela
Created May 9, 2021 18:07:53

Thanks for sharing

wissal
wissal Created May 9, 2021 19:18:29 (0) (0)
Thanks for your reading!  
Irshadhussain
Irshadhussain Created May 22, 2021 17:09:53 (0) (0)
 
gabo.lr
Created May 9, 2021 18:23:06

Good sharing!

wissal
wissal Created May 9, 2021 19:18:38 (0) (0)
Thanks for your reading!  
Vlada85
MVE Author Created May 9, 2021 19:42:48

Very good post! Thank you for sharing!

wissal
wissal Created May 9, 2021 19:48:50 (0) (0)
Thanks for your reading!  
zaheernew
zaheernew Created May 21, 2021 10:14:03 (0) (0)
 
MahMush
Created May 9, 2021 20:33:06

Very good sharing

wissal
wissal Created May 9, 2021 20:41:53 (0) (0)
Thanks for your reading!  
user_4000619
user_4000619 Created May 13, 2021 19:23:03 (0) (0)
yes  
LilStylz237
Created May 9, 2021 22:56:28

That's great, thanks

wissal
wissal Created May 10, 2021 06:50:51 (0) (0)
Thanks for your reading!  
Nino_Chou
Admin Created May 10, 2021 00:46:46

Thanks for sharing.

wissal
wissal Created May 10, 2021 06:51:00 (1) (0)
Thanks for your reading!  
