
Impact on Voice Service on IPCORE due to queue-depth Issue

Latest reply: Feb 22, 2022 20:06:49

Hi there!


Today I'd like to share how we tackled an impact on the voice service on the IP Core caused by a queue-depth issue. Read on to learn more about the subject.


ISSUE DESCRIPTION


The IP Core is a critical part of the IP network, as it links different cities. In a network that has most of its core GSM nodes in one city, all the other cities must use the IP Core to reach this main city for charging or billing purposes. This makes the IP Core a major point of concern.


In this scenario, we received many customer complaints related to voice traffic:


  • difficulty in initiating voice calls;


  • where voice calls could be initiated, a long call setup time;


  • silent voice calls;


  • calls dropped abruptly mid-conversation.


HANDLING PROCESS


Step 1. The Network Operation Center received customer complaints about not being able to make voice calls and immediately logged an incident ticket in emergency mode, as this is a major issue that affects the Net Promoter Score (NPS), Customer Experience (CEX) and revenue. This ticket was assigned to the Back Office team.


Step 2. The Back Office team checked on the NCE and noticed that the fiber links between the major cities were all down, so traffic had automatically shifted to the backup microwave (MW) links, which have a capacity of 2.5 Gbps. The fiber link, by comparison, is 20 Gbps. With a QoS mechanism in place, it was expected that data traffic would be impacted by the lack of capacity (traffic moved from 20 Gbps of fiber to 2.5 Gbps of microwave). However, it was not expected that voice traffic would be impacted, as it is less than 1.5 Gbps and can be accommodated on the microwave link. Below is the configuration of the IP Core router interface connected to the microwave link:


<IPCORE-ROUTER>dis cur int Eth-Trunk 7

#

interface Eth-Trunk7

 mtu 9100

 description Backup-MW-link-to-Router:2.G:Eth-trunk7

 ipv6 enable

 ip address a.b.c.d/30

 ipv6 address xxxxxxxxxx/127

 trust upstream default

 isis enable 1

 isis ipv6 enable 1

 isis circuit-type p2p

 isis circuit-level level-2

 isis authentication-mode md5 cipher xxxxxxxxxxxx

 isis cost 10000

 isis ldp-sync

 isis bfd enable

 isis bfd min-tx-interval 100 min-rx-interval 100 detect-multiplier 5

 mpls

 mpls te

 mpls rsvp-te

 mpls rsvp-te hello

 mpls ldp

 port shaping 2200

 port-queue be wfq weight 15 port-wred be outbound

 port-queue af1 wfq weight 15 port-wred af1 outbound

 port-queue af2 wfq weight 30 port-wred af2 outbound

 port-queue af3 wfq weight 40 outbound

 port-queue ef pq port-wred ef outbound

 port-queue cs6 pq port-wred cs6 outbound

 statistic enable

#


[IPCORE-ROUTER]dis cur | b queue-depth

port-wred af1

 color green low-limit 70 high-limit 100 discard-percentage 100

 color yellow low-limit 60 high-limit 90 discard-percentage 100

 color red low-limit 40 high-limit 80 discard-percentage 100

#

port-wred be

 color green low-limit 70 high-limit 100 discard-percentage 100

#

port-wred af2

 color green low-limit 70 high-limit 100 discard-percentage 100

 color yellow low-limit 60 high-limit 90 discard-percentage 100

 color red low-limit 40 high-limit 80 discard-percentage 100

#

port-wred cs6

 queue-depth 2750

#

port-wred ef

 queue-depth 2750

#


The microwave capacity is 2.5 Gbps, but we set the port-shaping value to 2.2 Gbps for two reasons. First, the usable capacity of the microwave link often drops when its KPIs degrade due to poor atmospheric conditions or interference from external parties. Second, part of the microwave capacity along the axis between the cities is used to drop access traffic for 2G/3G/4G sites. Both factors led us to set the port-shaping value to 2.2 Gbps so that QoS can take effect.


Step 3. Since the voice traffic is less than 1.5 Gbps and the microwave link offers 2.5 Gbps, we did not expect voice to be affected, so we investigated further. We checked the port-queue statistics using the display port-queue statistics interface interface_name inbound|outbound command and saw the output below:


<IPCORE-ROUTER>dis port-queue statistics int eth  7 outbound

……

  [ef]

 Current usage percentage of queue: 0

    Total pass:  

                       5,783,052,890   packets,            686,727,448,968   bytes

    Total discard:

                         261,292,404 packets,               36,121,514,395 bytes

    Drop tail discard:

                                   0   packets,                          0   bytes

    Wred discard:

                         261,292,404   packets,             36,121,514,395   bytes

    Last 30 seconds pass rate:

                             315,066   pps,                    276,745,008 bps

    Last 30 seconds discard rate:

                                   0   pps,                              0 bps

    Drop tail discard rate:

                                   0   pps,                              0 bps

    Wred discard rate:

                                   0   pps,                              0 bps

    buffer size:                  2750 kbytes

    used buffer size:                 31 kbytes

[cs6]

 Current usage percentage of queue: 0

    Total pass:  

                      11,947,945,076   packets,         10,298,270,875,310   bytes

    Total discard:

                         147,158,938 packets,            138,442,612,638 bytes

    Drop tail discard:

                                   0   packets,                          0   bytes

    Wred discard:

                         147,158,938   packets,            138,442,612,638   bytes

    Last 30 seconds pass rate:

                             266,283 pps,                  1,683,300,768 bps

    Last 30 seconds discard rate:

                                   0   pps,                              0 bps

    Drop tail discard rate:

                                   0   pps,                              0 bps

    Wred discard rate:

                                   0   pps,                              0 bps

    buffer size:                  2750 kbytes

  used buffer size:                0 kbytes  


In the above output, we noticed discards in both the EF class and the CS6 class, all of them WRED discards rather than tail drops. The EF class carries the voice service and the CS6 class carries the charging/billing traffic. Kindly note that a call can only go through once charging is successful (the platform checks whether the user has airtime before granting access to the service).
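
To put the counters in proportion, the discard ratio per queue can be worked out from the figures above. The snippet below is only a sketch: it assumes that "Total pass" does not include the discarded packets, so the offered load is pass plus discard. The function name is made up for illustration.

```python
def discard_ratio(passed_packets: int, discarded_packets: int) -> float:
    """Fraction of offered packets that the queue discarded.

    Assumption: offered = passed + discarded (the pass counter excludes drops).
    """
    offered = passed_packets + discarded_packets
    return discarded_packets / offered if offered else 0.0


# Packet counters copied from the display output above: (total pass, total discard).
counters = {
    "ef": (5_783_052_890, 261_292_404),
    "cs6": (11_947_945_076, 147_158_938),
}

for queue, (passed, discarded) in counters.items():
    print(f"{queue}: {discard_ratio(passed, discarded):.2%} of offered packets discarded")
```

Under that assumption, roughly 4.3% of EF packets and 1.2% of CS6 packets were discarded since the counters were last cleared, which is a lot for voice and charging traffic.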


Step 4. Bearing in mind that the available capacity was adequate to carry both the EF and the CS6 classes, the next parameter to investigate was the buffer size, which was 2750 kbytes (2.75M). If a burst exceeds the buffer size, the buffer fills up and packets are dropped. The output below shows the peak rate:


<IPCORE-ROUTER>dis port-queue statistics int eth  7 outbound

[ef]

Current usage percentage of   queue: 0

  ……

  buffer   size:                  2750 kbytes

  used buffer   size:                  0 kbytes

  Peak rate:

                            2020-01-07   20:15:06                  340,976,120 bps

 

[cs6]

Current usage percentage of   queue: 0

  ……

  Peak rate:

                            2020-01-17    9:30:53                2,441,866,872 bps


For the CS6 class, the peak is 2.44 Gbps, which is more than the port-shaping value of 2.2 Gbps. The peak value is measured at one-second granularity, so bursts at the microsecond level will be even higher. This means the buffer will fill up, packets will be dropped and the service will be impacted.
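
As a rough illustration of how quickly a 2750-kbyte buffer is exhausted, the sketch below estimates the time to overflow when traffic arrives at the recorded CS6 peak while the port drains at the shaping rate. It is a simplified model with stated assumptions: 1 kbyte is taken as 1000 bytes, the whole shaped rate is assumed to be available to the queue, and the function name is hypothetical.

```python
def buffer_fill_time_ms(buffer_kbytes: float, arrival_bps: float, drain_bps: float) -> float:
    """Milliseconds until the buffer overflows.

    Returns infinity when arrivals do not exceed the drain rate.
    Assumption: 1 kbyte = 1000 bytes.
    """
    excess_bps = arrival_bps - drain_bps
    if excess_bps <= 0:
        return float("inf")
    buffer_bits = buffer_kbytes * 1000 * 8
    return buffer_bits / excess_bps * 1000


SHAPING_BPS = 2_200_000_000   # port shaping 2200 (Mbps) from the interface configuration
CS6_PEAK_BPS = 2_441_866_872  # second-level peak rate from the display output

print(f"{buffer_fill_time_ms(2750, CS6_PEAK_BPS, SHAPING_BPS):.0f} ms")
```

With these inputs the buffer lasts for about 91 ms at the second-level peak. Since real bursts are higher than the one-second figure, the actual time to overflow would be shorter.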


SOLUTION


Solution 1. Increase the buffer size by modifying the queue depth of the CS6 and EF classes.

The calculations below are needed to estimate the buffer size to use. They are based on the "Last 30 seconds pass rate" values (bps / pps / 8) from the output in Step 3:


  • average packet length of EF Queue = 276,745,008 / 315,066 / 8 ≈ 110 bytes;


  • average packet length of CS6 Queue = 1,683,300,768 / 266,283 / 8 ≈ 790 bytes (the calculations below use 791 bytes);


  • free list resource number for the EF class = 110.
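
The average packet lengths above come from dividing the bit rate by the packet rate and then by 8. A minimal sketch of that arithmetic, with a made-up function name, looks like this:

```python
def average_packet_bytes(rate_bps: int, rate_pps: int) -> float:
    """Average packet length in bytes from a bit rate and a packet rate."""
    return rate_bps / rate_pps / 8


# "Last 30 seconds pass rate" values from the display output in Step 3.
ef_avg = average_packet_bytes(276_745_008, 315_066)
cs6_avg = average_packet_bytes(1_683_300_768, 266_283)

print(f"EF:  {ef_avg:.1f} bytes")
print(f"CS6: {cs6_avg:.1f} bytes")
```

The EF figure rounds to 110 bytes. The CS6 figure evaluates to about 790 bytes, marginally below the 791 bytes quoted in the post; the difference does not change the conclusions.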

 

With the above calculations, it is recommended to use:


  • for EF Class, queue depth set at 25600;


  • for CS6 Class, queue depth set at 128000.


The queue depth can be set using the command below:


#

port-wred ef

  queue-depth 25600

#

port-wred cs6

 queue-depth 128000


The idea is that a bigger buffer can hold more packets during a burst instead of dropping them. However, each packet that enters the buffer occupies a free list resource. The free list is a resource shared by the whole chip. It has a value of 512000 in our case (LPU type LPU CR5DLPUF5070 and board type CR5D0L2XFA70). If the free list resource is also exhausted, the router will drop packets and the service will be affected.


The free list resource number for the EF class = 25600 kbytes / 110 bytes = 233K.


For the CS6 class, the free list resource number = 128000 kbytes / 791 bytes = 162K.


The condition to respect is: (EF queue-depth / EF queue average packet length) + (CS6 queue-depth / CS6 queue average packet length) < 512K. Here, 233K + 162K = 395K, which is below 512K.
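
The free list check can be expressed as a small helper. This is a sketch under stated assumptions: queue-depth is in kbytes (as the "buffer size" lines in the display output suggest), 1 kbyte is taken as 1000 bytes, and one packet occupies one free list entry as described above. The names are illustrative only.

```python
FREE_LIST_ENTRIES = 512_000  # value quoted for the board used in this case


def free_list_entries(queue_depth_kbytes: float, avg_packet_bytes: float) -> float:
    """Free list entries needed when the queue is full of average-sized packets.

    Assumptions: 1 kbyte = 1000 bytes, one packet occupies one entry.
    """
    return queue_depth_kbytes * 1000 / avg_packet_bytes


def fits_free_list(queues: dict, budget: int = FREE_LIST_ENTRIES) -> bool:
    """True when the summed demand of all listed queues stays under the budget."""
    demand = sum(free_list_entries(depth, length) for depth, length in queues.values())
    return demand < budget


# (queue-depth in kbytes, average packet length in bytes) as used in the post.
proposed = {"ef": (25_600, 110), "cs6": (128_000, 791)}

for name, (depth, length) in proposed.items():
    print(f"{name}: {free_list_entries(depth, length) / 1000:.0f}K entries")
print("fits:", fits_free_list(proposed))
```

Only the EF and CS6 queues are counted here, as in the post; any entries consumed by the other queues would reduce the remaining margin.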


This can be accommodated in our case, and it resolved the issue. Voice packets could be transmitted with no drops and no further customer complaints were received on voice.


Advantage of solution 1

It decreases the freelist resource used for the low priority queue and it provides more freelist resource for the EF and CS6 queues.


Risk of solution 1

The risk of this method is that the calculation is based on the current traffic model. If the traffic model changes, the free list calculation must be redone. For example, suppose the average packet length of the CS6 queue changes from 791 bytes to 200 bytes.


The free list resource number will then be 128000 / 200 = 640K, which exceeds the 512K available in our case, so packets will be dropped and the service will be affected. The free list resource will not be enough to hold the packets in the buffer when a traffic burst exceeds the buffer value.
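
To see how much the traffic model can shift before this risk materializes, one can solve the same formula for the smallest average CS6 packet length that still fits. The sketch below reuses the assumptions of this post (queue-depth in kbytes with 1 kbyte = 1000 bytes, one entry per packet, 512000 entries in total) and keeps the EF queue at the values used above; the function name is hypothetical.

```python
FREE_LIST_ENTRIES = 512_000


def min_avg_packet_bytes(queue_depth_kbytes: float, entries_available: float) -> float:
    """Smallest average packet length for which a full queue fits the given entries."""
    return queue_depth_kbytes * 1000 / entries_available


# Entries taken by the EF queue at queue-depth 25600 and 110-byte packets.
ef_entries = 25_600 * 1000 / 110
left_for_cs6 = FREE_LIST_ENTRIES - ef_entries

threshold = min_avg_packet_bytes(128_000, left_for_cs6)
print(f"CS6 average packet length must stay above about {threshold:.0f} bytes")

# The 200-byte example from the text needs more entries than the chip has.
print(128_000 * 1000 / 200 > FREE_LIST_ENTRIES)
```

Under these assumptions the CS6 average packet length would need to stay above roughly 458 bytes for a queue-depth of 128000 to remain safe, so the 200-byte example is well past the limit.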


Solution 2. Remove some packets from the CS6 class.

In this method, some services in the CS6 class are moved to lower-priority classes such as AF (AF13, AF33, AF12, AF28, AF10, AF26). However, this depends on the scenario, and it is important to make sure that these deprioritized services are less important and not needed for voice traffic. A simulation is given below:


  • the CS6 queue average speed is less than 2 Gbps (around 1.7 Gbps as per the above display);


  • the second-level peak is 2.44 Gbps;


  • the estimated real peak, allowing 30% above the second-level peak, is 2.44 Gbps + 2.44 Gbps x 0.3 ≈ 3.2 Gbps.


If we can keep the CS6 queue average speed below 1 Gbps, the peak value will be about 1.3 Gbps, which is less than the port-shaping value of 2.2 Gbps, and the traffic will not be dropped.
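
The simulation above applies a 30% allowance on top of a measured rate. A minimal sketch of that estimate follows; the 0.3 factor is taken from the figures in this post and is not a general rule, and the function names are made up for illustration.

```python
BURST_FACTOR = 0.3       # allowance used in this post, an assumption rather than a standard
SHAPING_GBPS = 2.2       # port-shaping value on the microwave interface


def estimated_real_peak_gbps(rate_gbps: float, burst_factor: float = BURST_FACTOR) -> float:
    """Measured rate plus the burst allowance."""
    return rate_gbps * (1 + burst_factor)


def within_shaping(rate_gbps: float, shaping_gbps: float = SHAPING_GBPS) -> bool:
    """True when the estimated real peak stays below the shaping value."""
    return estimated_real_peak_gbps(rate_gbps) < shaping_gbps


for rate in (2.44, 1.0):
    peak = estimated_real_peak_gbps(rate)
    print(f"{rate} Gbps -> estimated peak {peak:.2f} Gbps, within shaping: {within_shaping(rate)}")
```

The same helper can be used to test other shaping values, for example by passing a different `shaping_gbps` when evaluating a link upgrade.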


This method is applicable only if the services not related to voice in our scenario can be removed from the CS6 class, and only if no highly sensitive traffic is deprioritized along with them.


Note: this should be applied on the source and destination routers within the MPLS Domain.


Solution 3. Upgrade the Microwave link and increase the port shaping value.

In our scenario, we could see based on the display output that:


  • the second-level peak is 2.44 Gbps;


  • the estimated real peak, allowing 30% above the second-level peak, is 2.44 Gbps + 2.44 Gbps x 0.3 ≈ 3.2 Gbps.


If the microwave link is upgraded to 5 Gbps and the port-shaping value is set to 3.6 Gbps, the issue will be resolved.


However, this solution is CAPEX-related and takes time to implement, so it could not be used in our case: customers were unable to make voice calls, which is very critical, and we needed to resolve the complaints in less than 2 hours. This solution is classified as a long-term solution.


Solution 4. Replace the LPU Board to increase the free list resource.

The network is not static, as more and more services are deployed and more and more customers are onboarded. This might change the traffic model and, as seen in Solution 1, the free list resource might become insufficient to process the traffic. In this case, we opt for a solution that increases the free list resource. This is hardware-related: we upgrade to a board with more free list resources, which also gives us the possibility to increase the buffer size.


In our scenario we are using LPU type LPU CR5DLPUF5070 and board type CR5D0L2XFA70, which gives us a free list resource of 512 K.


We can upgrade the LPU board to LPUF240A main board + eTM subcard (such as CR57L5XF or CR57L5XFE), which gives a free list resource of 1M and so on if more free list resource is needed.


A free list resource can carry one or several packets. One free list resource is about 5000 bytes. If one packet is 1000 bytes, 5 packets will use one free list resource. So with the 512K free list, we can carry half as many packets as with the 1M free list.


In our scenario, we can then increase the buffer size further, from 128M to 256M. However, kindly note that a big buffer can induce a big delay for the voice service. Also, if the burst lasts for a long time, packets will still be dropped. For example, in our scenario, if the traffic stays at 2.4 Gbps for 60 seconds while our port-shaping value is 2.2 Gbps, packets will be dropped.
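
Both caveats can be checked with simple arithmetic. The sketch below is a simplified model, not a vendor formula: it reads "128M" and "256M" as queue-depth values of 128000 and 256000 kbytes, takes 1 kbyte as 1000 bytes, and assumes the queue drains at the port-shaping rate. The function names are hypothetical.

```python
def excess_kbytes(arrival_bps: int, drain_bps: int, seconds: float) -> float:
    """Data (in kbytes) that must be buffered when arrivals exceed the drain rate."""
    return max(arrival_bps - drain_bps, 0) * seconds / 8 / 1000


def max_queuing_delay_ms(buffer_kbytes: float, drain_bps: int) -> float:
    """Time (in ms) to drain a completely full buffer at the given rate."""
    return buffer_kbytes * 1000 * 8 / drain_bps * 1000


SHAPING_BPS = 2_200_000_000
SUSTAINED_BPS = 2_400_000_000

backlog = excess_kbytes(SUSTAINED_BPS, SHAPING_BPS, 60)
for depth in (128_000, 256_000):
    print(
        f"queue-depth {depth}: overflow after 60 s = {backlog > depth}, "
        f"worst-case delay = {max_queuing_delay_ms(depth, SHAPING_BPS):.0f} ms"
    )
```

In this model, 60 seconds at 2.4 Gbps produces a backlog of 1,500,000 kbytes, far more than either buffer can hold, while a full 128000-kbyte buffer already adds about 465 ms of queuing delay. This is why a larger buffer helps with short bursts but not with sustained overload.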


Considering the criticality of the voice service, the impact on customers and the loss of revenue involved, it is recommended to go with Solution 1 in the short term, while a combination of Solutions 2, 3 and 4 can be applied in the medium or long term.


Lucfabrice
MVE Author Created Jan 19, 2022 05:02:29

Your feedback on this post will be highly welcome. Thanks.

zaheernew
zaheernew Created Jan 19, 2022 08:48:42 (0) (0)
Useful info  
wissal
wissal Created Jan 20, 2022 18:26:52 (0) (0)
 
andersoncf1
andersoncf1 Created Feb 1, 2022 12:39:39 (0) (0)
 
good one

Lucfabrice
Lucfabrice Created Jan 19, 2022 05:34:22 (0) (0)
Thanks @MahMush  
Well done friend

Lucfabrice
Lucfabrice Created Jan 19, 2022 05:34:40 (0) (0)
Thanks @Unicef  
Very good information

Lucfabrice
Lucfabrice Created Jan 21, 2022 06:09:11 (0) (0)
Thanks @user_4358465  
good one. briefly described

Lucfabrice
Lucfabrice Created Jan 21, 2022 06:09:39 (0) (0)
Thanks @Zonger  
Good case to share along with the handling procedure and solutions

Lucfabrice
Lucfabrice Created Jan 21, 2022 06:10:09 (0) (0)
Thanks @faysalji  
Good share.

Lucfabrice
Lucfabrice Created Jan 21, 2022 06:10:37 (0) (0)
Thanks @4TECH  
Thanks for sharing this post!

Lucfabrice
Lucfabrice Created Jan 21, 2022 06:11:03 (0) (0)
Thanks @Zemo_Mccracken  
AL_93
Moderator Created Jan 19, 2022 11:14:16

Helpful post! Thank you for sharing!

Lucfabrice
Lucfabrice Created Jan 21, 2022 06:11:28 (0) (0)
Thanks @AL_93  