Introducing GSM

The "Global System for Mobile communications" (GSM) is a digital mobile radio system that is used extensively throughout Europe and in many other parts of the world. GSM has used a variety of voice codecs to compress 3.1 kHz audio into between 5.6 and 13 kbps. Two codecs were designed originally, Half Rate and Full Rate, named after the types of data channel they were carried on.

The Half Rate (5.6 kbps) and Full Rate (13 kbps) codecs are both based on linear predictive coding (LPC). LPC represents the spectral envelope of a digital speech signal in compressed form, using the parameters of a linear predictive model.
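To make the idea of a linear predictive model concrete, here is a minimal sketch of how predictor coefficients can be estimated for one frame using the autocorrelation method and the Levinson-Durbin recursion. It is a textbook formulation, not the exact fixed-point procedure of any GSM codec; the function names and the synthetic test frame are assumptions made for illustration.

```python
import math


def autocorrelation(x, max_lag):
    """Return r[0..max_lag], the autocorrelation of the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i - k] for i in range(k, n)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations for the given autocorrelation values.

    Returns (a, err), where the prediction of sample n is
    sum(a[k] * x[n - 1 - k] for k in range(order)) and err is the
    remaining prediction-error energy.
    """
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        if err <= 0.0:
            break  # silent or perfectly predicted signal: nothing left to model
        # Reflection coefficient for this stage.
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err
        # Update the lower-order coefficients, then store the new one.
        prev = a[:i]
        for j in range(i):
            a[j] = prev[j] - k * prev[i - 1 - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


# Hypothetical 160-sample frame (20 ms at 8 kHz) and an order-8 predictor.
frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(160)]
coeffs, residual_energy = levinson_durbin(autocorrelation(frame, 8), 8)
```

The compression comes from sending a handful of coefficients per frame (plus a description of the prediction residual) instead of every sample.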

In addition to being efficient with bit rate, these codecs also made it easier to identify the more important parts of the audio, allowing the air interface layer to prioritize and better protect those parts of the signal. They were later improved upon by the Enhanced Full Rate (EFR) codec, which operates at 12.2 kbps and therefore uses a full-rate channel.
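As an illustration of that prioritization, the arithmetic below sketches how the 260 bits of a Full Rate speech frame are protected unequally on the full-rate traffic channel. The class sizes and the parity and tail bit counts come from the GSM channel coding specification (GSM 05.03), not from this page, so treat them as assumptions of the sketch.

```python
# Full Rate speech frame: 260 bits every 20 ms, split by perceptual importance.
CLASS_IA = 50     # most sensitive bits: parity-checked, then convolutionally coded
CLASS_IB = 132    # convolutionally coded only
CLASS_II = 78     # least sensitive bits: transmitted unprotected
PARITY_BITS = 3   # error-detection bits computed over class Ia
TAIL_BITS = 4     # flush bits for the convolutional encoder
FRAMES_PER_SECOND = 50  # one frame every 20 ms

source_bits = CLASS_IA + CLASS_IB + CLASS_II
coder_input = CLASS_IA + PARITY_BITS + CLASS_IB + TAIL_BITS
coder_output = 2 * coder_input            # rate-1/2 convolutional code
channel_bits = coder_output + CLASS_II    # protected bits plus unprotected bits

net_rate_kbps = source_bits * FRAMES_PER_SECOND / 1000
gross_rate_kbps = channel_bits * FRAMES_PER_SECOND / 1000

print(source_bits, channel_bits, net_rate_kbps, gross_rate_kbps)
```

Under these assumptions, 260 source bits become 456 channel bits per frame, which is how a 13 kbps codec fills a 22.8 kbps full-rate channel.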

The GSM Full Rate speech codec operates at 13 kbps and uses Regular Pulse Excitation (RPE) coding. The input speech is split into frames 20 ms long, and for each frame a set of 8 short-term predictor coefficients is found. Each frame is then further split into four 5 ms sub-frames, and for each sub-frame the encoder finds a delay and a gain for the codec's long-term predictor. Finally, the residual signal left after both short-term and long-term filtering is quantized for each sub-frame as follows.

The 40-sample residual signal is decimated by a factor of three into four candidate excitation sequences (one per grid offset), each 13 samples long. The sequence with the highest energy is chosen as the best representation of the excitation, and each pulse in the sequence has its amplitude quantized with three bits. At the decoder, the reconstructed excitation signal is fed through the long-term and then the short-term synthesis filters to give the reconstructed speech. A postfilter is used to improve the perceptual quality of this reconstructed speech.
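The grid selection step can be sketched in a few lines. The sketch follows GSM 06.10 in trying grid offsets 0 to 3, all of which fit 13 pulses spaced three samples apart inside a 40-sample sub-frame. The 3-bit quantizer shown is a simplified uniform one scaled by the block maximum; the real codec uses an adaptive PCM scheme with a separately coded block maximum, and the function names here are hypothetical.

```python
def select_rpe_grid(residual, spacing=3, pulses=13):
    """Return (offset, sequence) for the decimated candidate with most energy."""
    span = spacing * (pulses - 1)          # distance from first to last pulse
    best_offset, best_seq, best_energy = 0, [], -1.0
    for offset in range(len(residual) - span):
        seq = residual[offset::spacing][:pulses]
        energy = sum(v * v for v in seq)
        if energy > best_energy:
            best_offset, best_seq, best_energy = offset, seq, energy
    return best_offset, best_seq


def quantize_3bit(seq):
    """Simplified uniform 3-bit quantizer: returns (peak, codes in 0..7)."""
    peak = max((abs(v) for v in seq), default=0.0)
    if peak == 0:
        return 0.0, [0] * len(seq)
    # Map the range [-peak, +peak] onto the eight levels 0..7.
    codes = [min(7, int((v / peak + 1.0) * 4.0)) for v in seq]
    return peak, codes


# Hypothetical sub-frame: a strong pulse at sample 4 and a weaker one at sample 3.
residual = [0.0] * 40
residual[4] = 3.0
residual[3] = 1.0
offset, pulses = select_rpe_grid(residual)
peak, codes = quantize_3bit(pulses)
```

Only the chosen offset, the scale and the 13 pulse codes need to be transmitted for the sub-frame, rather than all 40 residual samples.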

The GSM codec provides good-quality speech. Its main advantage over other low-rate codecs is its relative simplicity: it runs easily in real time on a 66 MHz 486 PC, for example, whereas CELP codecs need a dedicated DSP to run in real time.
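Part of that simplicity shows in the long-term predictor search, which amounts to a cross-correlation maximization over a range of delays. Below is a minimal floating-point sketch for one 40-sample sub-frame. The lag range of 40 to 120 samples is the one used by GSM 06.10; the function name, the unquantized gain and the test signal are assumptions for illustration.

```python
def ltp_search(history, subframe, min_lag=40, max_lag=120):
    """Return the (lag, gain) that best predicts `subframe` from past residual.

    `history` holds at least `max_lag` earlier samples, oldest first, and
    `min_lag` must be at least len(subframe) so that every candidate segment
    lies entirely in the past.
    """
    n = len(subframe)
    best_lag, best_corr = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        start = len(history) - lag
        past = history[start:start + n]
        # Cross-correlation between the current sub-frame and the delayed signal.
        corr = sum(s * p for s, p in zip(subframe, past))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    start = len(history) - best_lag
    past = history[start:start + n]
    energy = sum(p * p for p in past)
    gain = best_corr / energy if energy > 0 else 0.0
    return best_lag, gain


# Hypothetical signal: a pulse that repeats 60 samples later at half amplitude.
history = [0.0] * 120
history[65] = 2.0
subframe = [0.0] * 40
subframe[5] = 1.0
lag, gain = ltp_search(history, subframe)
```

In the actual codec the gain is then quantized to a small number of levels, whereas this sketch leaves it as a floating-point value.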

AMR-NB (Adaptive Multi-Rate Narrowband) is a variable-rate codec that delivers high quality and is robust against interference on full-rate channels. On half-rate channels it is less robust but still provides fairly high quality.

GSM codecs:

  1. Half Rate (also called HR, GSM-HR or GSM 06.20). It is a speech coding system developed for GSM. The codec operates at 5.6 kbps, so it uses only half the bandwidth of the Full Rate codec; this doubles the network capacity available for voice transmission, at the cost of reduced audio quality. The input is sampled at 8 kHz with 13-bit resolution; the frame length is 160 samples (20 ms) and the sub-frame length is 40 samples (5 ms).

    Technology

    • Encoded bandwidth: ~ 200-3400 Hz
    • Standardized: ETSI 1994
    • Coding type: VSELP (Vector Sum Excited Linear Prediction)
    • Bit rate: 5.6 kbps
    • Delay (ms):
      • Frame size: 20
      • Lookahead: 5
    • Quality: < Toll
    • Complexity:
      • MIPS: 30
      • RAM (words): 4K

  2. Full Rate (also called FR, GSM-FR or GSM 06.10). It was the first digital speech coding standard for GSM and was developed in the early 1990s. For this reason its speech quality is not especially high, and it is gradually being replaced by the EFR and AMR codecs, which offer higher speech quality at a lower bit rate. The Full Rate codec is based on the RPE-LTP (Regular Pulse Excitation - Long Term Prediction) speech coding paradigm.

    Technology

    • Encoded bandwidth: ~ 200-3400 Hz
    • Standardized: ETSI 1987
    • Coding type: RPE-LTP (Regular Pulse Excitation with Long-Term Prediction)
    • Bit rate: 13 kbps
    • Delay (ms):
      • Frame size: 20
      • Lookahead: 0
    • Quality: < Toll
    • Complexity:
      • MIPS: 4.5
      • RAM (words): 1K

  3. Enhanced Full Rate (also called EFR, GSM-EFR or GSM 06.60). It is an enhanced development of GSM Full Rate that produces higher speech quality. The price of the higher speech and call quality is that this codec needs about 5% more energy. It is based on the Algebraic Code Excited Linear Prediction (ACELP) algorithm.

    Technology

    • Encoded bandwidth: ~ 200-3400 Hz
    • Standardized: ETSI 1997
    • Coding type: ACELP® (Algebraic Code Excited Linear Prediction)
    • Bit rate: 12.2 kbps
    • Delay (ms):
      • Frame size: 20
      • Lookahead: 0
    • Quality: Toll
    • Complexity:
      • MIPS: 15-20
      • RAM (words): 4K

  4. Adaptive Multi-Rate (also called AMR or AMR-NB). It is an audio data compression scheme optimized for speech coding. Link adaptation is used for selecting one of eight different bit rates based on link conditions. This codec uses different techniques like ACELP (Algebraic Code Excited Linear Prediction), DTX (Discontinuous Transmission), VAD (Voice Activity Detection) and CNG (Comfort Noise Generation).
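Link adaptation for AMR can be pictured as a threshold lookup: the better the radio link, the higher the speech bit rate that can be afforded, because fewer bits are needed for error protection. The eight rates below are the standard AMR-NB modes. The carrier-to-interference thresholds are purely hypothetical values chosen for illustration, and real networks restrict the choice to a smaller active codec set with hysteresis, which this sketch ignores.

```python
# The eight AMR-NB codec modes, in kbps.
AMR_NB_MODES_KBPS = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]

# HYPOTHETICAL minimum carrier-to-interference ratio (dB) for each mode.
# These numbers are illustrative only and are not taken from any specification.
HYPOTHETICAL_MIN_CI_DB = [float("-inf"), 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 17.0]


def select_mode(ci_db):
    """Return the highest bit rate whose (hypothetical) threshold is met."""
    chosen = AMR_NB_MODES_KBPS[0]
    for rate, threshold in zip(AMR_NB_MODES_KBPS, HYPOTHETICAL_MIN_CI_DB):
        if ci_db >= threshold:
            chosen = rate
    return chosen


# A poor link falls back to the lowest rate, a good link gets the highest.
poor_link_rate = select_mode(2.0)
good_link_rate = select_mode(20.0)
```

On a poor link the bits saved by dropping to a lower speech rate are spent on stronger channel coding, which is what makes the codec robust against interference.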

Summary for GSM codecs

Audio compression format  Algorithm     Sample rate  Bit rate   Bits per sample  Latency   CBR  VBR  Stereo  Multi-channel
GSM-HR                    Lossy         8 kHz        5.6 kbps   13               25 ms     Yes  No   No      No
GSM-FR                    Lossy         8 kHz        13 kbps    13               20-30 ms  Yes  No   No      No
GSM-EFR                   ACELP, Lossy  8 kHz        12.2 kbps  13               20-30 ms  Yes  No   No      No
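The figures in the summary can be cross-checked with a little arithmetic: with 20 ms frames, each codec's bit rate fixes the number of bits per frame, and the 8 kHz, 13-bit input fixes the uncompressed rate. The sketch below uses only values from the table plus the 20 ms frame size listed for each codec; the function name is hypothetical.

```python
SAMPLE_RATE_HZ = 8000
BITS_PER_SAMPLE = 13
FRAME_MS = 20

CODEC_RATES_KBPS = {"GSM-HR": 5.6, "GSM-FR": 13.0, "GSM-EFR": 12.2}


def bits_per_frame(rate_kbps, frame_ms=FRAME_MS):
    """kbps multiplied by milliseconds gives bits; round away float noise."""
    return round(rate_kbps * frame_ms)


samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000          # 160 samples
input_rate_kbps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE / 1000      # 104 kbps

for name, rate in CODEC_RATES_KBPS.items():
    ratio = input_rate_kbps / rate
    print(f"{name}: {bits_per_frame(rate)} bits per frame, "
          f"compression ratio {ratio:.1f}:1")
```

This gives 112, 260 and 244 bits per 20 ms frame for GSM-HR, GSM-FR and GSM-EFR respectively, against 2080 bits (160 samples of 13 bits) for the uncompressed input.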
