
[Introduction to 5G] EVS codec


The SA network has not yet been deployed on a large scale, and VoNR is still mainly in the testing phase and not yet widely commercialised by operators. Readers familiar with 4G VoLTE will notice that the key wireless-side technologies of VoNR, including the EVS codec and MAC CE-based rate adaptation, are modelled on 4G VoLTE. In doing so, VoNR has followed the principle of "keeping the best and discarding the rest", and the customer interface has been designed to be simpler and easier to deploy. As 5G commercial terminals continue to gain popularity, more distinctive and competitive features of VoNR will emerge.

Why is speech coding necessary?

More than 20% of the information that humans obtain from the outside world comes from hearing. The sound that the human ear can perceive is a mechanical vibration wave with a frequency between 20 Hz and 20,000 Hz. The sound waves emitted by the human articulatory organs can reach frequencies as high as 15,000 Hz. To facilitate research and processing, the mechanical vibration waves of sound are converted into electrical signals, i.e. analogue speech signals. The analogue speech signal is highly redundant: it contains components that the human ear cannot perceive (outside the 20 Hz to 20,000 Hz range) and components that are masked (e.g. if a strong tone and a weak tone are present at the same time, the weak tone is masked by the strong one and cannot be heard). Neither helps in determining the timbre and pitch of a sound.

Because analogue speech signals are so redundant, the redundant components should be removed with as little distortion as possible. At the same time, analogue speech signals are low-frequency signals, which are not well suited to long-distance transmission and storage, so they also need to be converted to digital form to make long-distance speech communication possible. This is the purpose of speech coding.

How is speech coding usually implemented?

As the purpose described above suggests, speech coding consists of two parts: analogue-to-digital conversion and compression. The detailed process is shown in the diagram.

[Figure: speech coding process]

The speech coding process is as follows:

  • Sampling: The analogue speech signal is sampled to make it discrete in time. According to the Nyquist theorem, the sampling frequency must be at least twice the highest frequency of the analogue speech signal to ensure that the original sound can be restored from the samples. Sampling, together with the quantization and encoding steps below, makes up the familiar PCM (pulse code modulation) coding.

  • Quantization: The time-discrete speech signal is then quantized so that its amplitude is also discrete, i.e. the speech signal is represented by a finite number of quantized level values.

  • Encoding: The quantized level is expressed as a binary code, thus digitising the speech signal.

  • Compression: The digitised speech signal still occupies a large bandwidth. Because the speech signal is highly redundant, the digitised signal can be compressed at a compression ratio of 10x to 20x to reduce the transmission and storage resources it occupies.
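The first three steps above (sampling, quantization, encoding) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1 kHz test tone, the uniform quantizer, and the helper names `sample_tone` and `pcm_encode` are hypothetical and not taken from the original post. Telephone-grade PCM typically uses non-uniform (companded) quantization instead.

```python
import math

def sample_tone(freq_hz, sample_rate_hz, n):
    """Sampling: evaluate a unit-amplitude sine tone at n discrete instants."""
    return [math.sin(2 * math.pi * freq_hz * k / sample_rate_hz) for k in range(n)]

def pcm_encode(samples, levels=256):
    """Quantization + encoding: map samples in [-1, 1] to fixed-width binary codes."""
    bits = int(math.log2(levels))  # encoding word length, 8 bits for 256 levels
    codes = []
    for s in samples:
        # Quantization: map the amplitude onto a level index 0 .. levels-1.
        index = min(levels - 1, int((s + 1.0) / 2.0 * levels))
        # Encoding: write the level index as a binary code word.
        codes.append(format(index, f"0{bits}b"))
    return codes

# Assumed example: a 1 kHz tone sampled at the narrowband rate of 8 kHz.
samples = sample_tone(freq_hz=1000, sample_rate_hz=8000, n=8)
codes = pcm_encode(samples)
print(codes)
```

Compression is deliberately left out here, since real speech codecs such as EVS use far more elaborate models than anything that fits in a short sketch.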

The purpose of speech coding is efficient long-distance voice communication. Other things being equal, the higher the speech coding rate, the better the listening experience during voice communication. So what determines the speech coding rate?

Take 256 quantization levels as an example: each quantization level value is represented by log2(256) = 8 bits, where 8 is the encoding word length. Speech encoding rate = sampling frequency x encoding word length x compression rate, where the compression rate is the fraction of the raw bit rate that remains after compression (for example, 1/10 for 10x compression). So, given the same encoding word length and compression rate, the higher the sampling frequency used (the wider the band), the higher the speech encoding rate.

Speech signals are usually classified into four categories according to the width of the frequency band used:

  • Narrowband (abbreviated as NB)

    The signal band is from 300Hz to 3400Hz and is used for all types of telephone communication. The sampling frequency for digitalisation is often 8000Hz, i.e. 8kHz.

  • Wideband (abbreviated as WB)

    The signal band is 50 Hz to 7000 Hz and is used for teleconferencing, video conferencing, etc. The sampling frequency is often 16 kHz when digitising.

  • Super-wideband (abbreviated as SWB)

    The signal band is from 20Hz to 15000Hz and is used for digital audio broadcasting, etc. The sampling frequency is often 32kHz when digitising.

  • Full band (abbreviated as FB)

    The signal band is from 20Hz to 20000Hz and is used for VCD, DVD, CD records, HDTV soundtracks etc. The sampling frequency is often 48kHz when digitising.
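As a rough worked example, the rate relationship and the four band categories above can be combined in Python. The band edges and sampling frequencies come from the list above. The 8-bit word length and the 10x compression ratio are illustrative assumptions, and the function names are hypothetical. Here 10x compression is taken to mean that the raw rate is divided by 10.

```python
# Upper band edge (Hz) and usual sampling frequency (Hz) for each category.
BANDS = {
    "NB": (3_400, 8_000),
    "WB": (7_000, 16_000),
    "SWB": (15_000, 32_000),
    "FB": (20_000, 48_000),
}

def satisfies_nyquist(max_freq_hz, sample_rate_hz):
    """True if the sampling frequency is at least twice the highest signal frequency."""
    return sample_rate_hz >= 2 * max_freq_hz

def raw_pcm_rate_kbps(sample_rate_hz, word_length_bits):
    """Uncompressed bit rate: sampling frequency x encoding word length."""
    return sample_rate_hz * word_length_bits / 1000

def coded_rate_kbps(sample_rate_hz, word_length_bits, compression_ratio):
    """Bit rate after compression, assuming the ratio divides the raw rate."""
    return raw_pcm_rate_kbps(sample_rate_hz, word_length_bits) / compression_ratio

for name, (max_freq_hz, sample_rate_hz) in BANDS.items():
    raw = raw_pcm_rate_kbps(sample_rate_hz, word_length_bits=8)
    coded = coded_rate_kbps(sample_rate_hz, word_length_bits=8, compression_ratio=10)
    ok = satisfies_nyquist(max_freq_hz, sample_rate_hz)
    print(f"{name}: raw {raw:.0f} kbit/s, coded {coded:.1f} kbit/s, Nyquist ok: {ok}")
```

With these assumptions, narrowband speech comes out at 64 kbit/s before compression, and every wider band scales in proportion to its sampling frequency.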

The listening experience corresponding to the different frequency bands of the speech signal is shown in the diagram.

[Figure: listening experience by speech frequency band]

The above figure shows that narrowband coding meets the basic requirements of voice calls, while wideband coding provides better sound quality and is closer to face-to-face conversation. As the processing capability of voice codec processors improves, voice coding is therefore gradually moving towards wider bands. The main voice codecs used in each generation of Huawei's communication networks are shown in the figure. With the increase in network bandwidth and people's increasingly high requirements for voice quality, narrowband and wideband voice codecs are gradually being replaced by super-wideband and full-band voice codecs.

[Figure: main voice codecs used in each generation of communication networks]

What are the advantages of EVS?

EVS was standardised and evaluated in 3GPP in September 2014. In 4G, however, only a few terminals supported EVS because of its late launch and the immaturity of the industry chain, so it was not commercially deployed on a large scale. Because EVS not only enables HD voice but can also effectively improve coverage, it was finally included in the 3GPP standard as a mandatory codec for 5G voice, thanks to the joint efforts of telecom operators and telecom equipment vendors.

As the table shows, EVS enhances coding flexibility and efficiency by supporting multiple voice coding rates, allowing operators to choose the rates they need to support based on factors such as the capabilities of the terminals in their existing networks. Adjacent coding rates map to different MOS scores. EVS also enhances the voice experience by supporting higher voice coding rates: for example, from 13.2 kbit/s onwards, EVS-SWB voice quality is already close to 'direct source' (original) voice quality.


Encoding scheme   Supported speech coding rates (kbit/s)                          Sampling frequency (kHz)
EVS-NB            5.9, 7.2, 8.0, 9.6, 13.2, 16.4, 24.4                            8
EVS-WB            5.9, 7.2, 8.0, 9.6, 13.2, 16.4, 24.4, 32, 48, 64, 96, 128       16
EVS-SWB           9.6, 13.2, 16.4, 24.4, 32, 48, 64, 96, 128                      32
EVS-FB            16.4, 24.4, 32, 48, 64, 96, 128                                 48
AMR-WB I/O        6.6, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85      16

Note: The length of the code word in the EVS coding scheme is 16 bits.
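To get a feel for the numbers in the table, the following sketch estimates how much each EVS rate compresses the raw signal. It assumes that the raw rate is simply sampling frequency x 16-bit code word, as in the note above. The dictionary layout and the function name `compression_ratio` are hypothetical, and the result is a back-of-the-envelope figure rather than anything defined by the EVS specification.

```python
WORD_LENGTH_BITS = 16  # code word length from the note above

# Sampling frequency (kHz) and supported rates (kbit/s), copied from the table.
EVS_MODES = {
    "EVS-NB": (8, [5.9, 7.2, 8.0, 9.6, 13.2, 16.4, 24.4]),
    "EVS-WB": (16, [5.9, 7.2, 8.0, 9.6, 13.2, 16.4, 24.4, 32, 48, 64, 96, 128]),
    "EVS-SWB": (32, [9.6, 13.2, 16.4, 24.4, 32, 48, 64, 96, 128]),
    "EVS-FB": (48, [16.4, 24.4, 32, 48, 64, 96, 128]),
}

def compression_ratio(mode, rate_kbps):
    """Raw PCM rate divided by the coded rate for one table entry."""
    sample_rate_khz, rates = EVS_MODES[mode]
    if rate_kbps not in rates:
        raise ValueError(f"{rate_kbps} kbit/s is not listed for {mode}")
    raw_kbps = sample_rate_khz * WORD_LENGTH_BITS  # kHz x bits = kbit/s
    return raw_kbps / rate_kbps

for mode, (_, rates) in EVS_MODES.items():
    lowest, highest = rates[0], rates[-1]
    print(f"{mode}: {compression_ratio(mode, lowest):.1f}x at {lowest} kbit/s, "
          f"{compression_ratio(mode, highest):.1f}x at {highest} kbit/s")
```

Under these assumptions, EVS-SWB at 13.2 kbit/s corresponds to a raw rate of 512 kbit/s, i.e. a compression ratio of roughly 39x.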

Finally, let's look at just how much the voice experience can be improved with the EVS voice codec.


[Figure: voice experience (MOS) with the EVS voice codec]

With EVS coding and decoding, MOS scores can reach 4.6, close to the full score of 5, so EVS coding is a big step up for 5G voice quality.

The post is synchronized to: Introduction to 5G

Great share

Thanks you

Rumana
Rumana Created Sep 10, 2021 19:30:58 (0) (0)
 
Good share

zaheernew
MVE Author Created Sep 10, 2021 19:07:40


Very interesting, good job.

thanks for sharing

Vlada85
MVE Author Created Sep 24, 2021 17:00:39

Good job!

amr_rashedy
MVE Author Created Sep 26, 2021 07:56:00

Excellent

Majdi.Chebil
Moderator Author Created Sep 26, 2021 15:43:22

Thanks for sharing. Very interesting topic.
Looking forward to using VoNR.
