Code-excited Linear Prediction

Code-excited linear prediction (CELP) is a linear predictive speech coding algorithm originally proposed by Manfred R. Schroeder and Bishnu S. Atal in 1985. At the time, it provided significantly better quality than existing low bit-rate algorithms, such as residual-excited linear prediction (RELP) and linear predictive coding (LPC) vocoders (e.g., FS-1015). Along with its variants, such as algebraic CELP, relaxed CELP, low-delay CELP and vector sum excited linear prediction, it is currently the most widely used speech coding algorithm. It is also used in MPEG-4 Audio speech coding. CELP is commonly used as a generic term for a class of algorithms and not for a particular codec.


Background

The CELP algorithm is based on four main ideas:
* Using the source-filter model of speech production through linear prediction (LP) (see the textbook "speech coding algorithm");
* Using an adaptive and a fixed codebook as the input (excitation) of the LP model;
* Performing a closed-loop search in a "perceptually weighted domain";
* Applying vector quantization (VQ).

The original algorithm as simulated in 1983 by Schroeder and Atal required 150 seconds to encode 1 second of speech when run on a Cray-1 supercomputer. Since then, more efficient ways of implementing the codebooks and improvements in computing capabilities have made it possible to run the algorithm in embedded devices, such as mobile phones.
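
The linear prediction step in the list above can be illustrated with a small sketch. It assumes the common convention A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p and estimates the coefficients from a frame's autocorrelation with the Levinson–Durbin recursion. The function names `autocorrelation` and `levinson_durbin` are illustrative, not taken from any particular codec, and real codecs usually also apply windowing and lag windowing, which are omitted here.

```python
def autocorrelation(x, order):
    """Return r[0..order], the autocorrelation of frame x at lags 0..order."""
    n = len(x)
    return [sum(x[i] * x[i - lag] for i in range(lag, n))
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the LP normal equations from autocorrelation values r.

    Returns (a, err): a[0] == 1.0 and a[1..order] are the coefficients of
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order (assumed convention),
    and err is the remaining prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or perfectly predictable frame: stop early
        # Reflection coefficient for stage i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        # Update the coefficients from order i-1 to order i.
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err
```

With this convention each sample is predicted as minus the weighted sum of the previous samples, and 1/A(z) is the matching synthesis filter.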


CELP decoder

Before exploring the more complex encoding process of CELP, we first introduce the decoder. Figure 1 describes a generic CELP decoder. The excitation is produced by summing the contributions from the fixed (a.k.a. stochastic or innovation) and adaptive (a.k.a. pitch) codebooks:

:e[n] = e_f[n] + e_a[n]

where e_f[n] is the fixed codebook contribution and e_a[n] is the adaptive codebook contribution. The fixed codebook is a vector quantization dictionary that is (implicitly or explicitly) hard-coded into the codec. This codebook can be algebraic (ACELP) or be stored explicitly (e.g. Speex). The entries in the adaptive codebook consist of delayed versions of the excitation. This makes it possible to efficiently code periodic signals, such as voiced sounds.

The filter that shapes the excitation has an all-pole model of the form 1/A(z), where A(z) is called the prediction filter and is obtained using linear prediction (Levinson–Durbin algorithm). An all-pole filter is used because it is a good representation of the human vocal tract and because it is easy to compute.
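
As a rough illustration of the decoder described above, the following sketch builds the excitation for one subframe as the sum of a fixed and an adaptive contribution and passes it through 1/A(z). The function name `decode_subframe`, the explicit gain parameters, and the way filter memory is passed around are assumptions made for this example. Real codecs add gain quantization, interpolation of the filter coefficients, and post-filtering.

```python
def decode_subframe(fixed_vec, fixed_gain, pitch_lag, pitch_gain,
                    past_exc, a, filt_mem):
    """Decode one subframe (illustrative sketch, not a real codec).

    fixed_vec  : fixed-codebook vector selected by the received index
    pitch_lag  : adaptive-codebook delay in samples, 1 <= lag <= len(past_exc)
    past_exc   : previously decoded excitation samples (oldest first)
    a          : coefficients of A(z), with a[0] == 1.0
    filt_mem   : last len(a)-1 output samples, most recent first
    Returns (speech, updated_past_exc, updated_filt_mem).
    """
    order = len(a) - 1
    exc = list(past_exc)
    mem = list(filt_mem)
    start = len(exc)
    speech = []
    for i in range(len(fixed_vec)):
        # Adaptive contribution: the excitation delayed by pitch_lag samples.
        e_a = pitch_gain * exc[start + i - pitch_lag]
        # Fixed (innovation) contribution.
        e_f = fixed_gain * fixed_vec[i]
        e = e_f + e_a
        exc.append(e)
        # All-pole synthesis 1/A(z): s[n] = e[n] - sum_k a[k] * s[n-k].
        s = e - sum(a[k] * mem[k - 1] for k in range(1, order + 1))
        mem = ([s] + mem)[:order]
        speech.append(s)
    return speech, exc[-len(past_exc):], mem
```

Because new excitation samples are appended as they are computed, a lag shorter than the subframe simply repeats the most recent excitation, which is one common way of handling short pitch periods.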


CELP encoder

The main principle behind CELP is called analysis-by-synthesis (AbS) and means that the encoding (analysis) is performed by perceptually optimizing the decoded (synthesis) signal in a closed loop. In theory, the best CELP stream would be produced by trying all possible bit combinations and selecting the one that produces the best-sounding decoded signal. This is obviously not possible in practice for two reasons: the required complexity is beyond any currently available hardware and the “best sounding” selection criterion implies a human listener.

In order to achieve real-time encoding using limited computing resources, the CELP search is broken down into smaller, more manageable, sequential searches using a simple perceptual weighting function. Typically, the encoding is performed in the following order:
* Linear prediction coefficients (LPC) are computed and quantized, usually as line spectral pairs (LSPs).
* The adaptive (pitch) codebook is searched and its contribution removed.
* The fixed (innovation) codebook is searched.
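
The sequential search in the steps above can be sketched as follows. This is a simplified illustration, not the procedure of any specific codec: it assumes the target vector is already perceptually weighted with the filter's ringing removed, that `a` holds the coefficients of the (weighted) synthesis filter, and that gains are left unquantized and unconstrained. The names `synth`, `best_entry` and `search_subframe` are hypothetical.

```python
def synth(exc, a):
    """Zero-state all-pole filtering of exc through 1/A(z), with a[0] == 1.0."""
    out = []
    for n, e in enumerate(exc):
        s = e - sum(a[k] * out[n - k] for k in range(1, len(a)) if n - k >= 0)
        out.append(s)
    return out


def best_entry(target, candidates, a):
    """Analysis-by-synthesis over one codebook.

    Each candidate is synthesized, scaled by its optimal gain, and compared
    with the target. Returns (index, gain, scaled synthesized contribution).
    """
    best = (None, 0.0, [0.0] * len(target))
    best_score = 0.0
    for idx, cand in enumerate(candidates):
        y = synth(cand, a)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue
        corr = sum(t * v for t, v in zip(target, y))
        # With the optimal gain corr/energy, the squared error drops by
        # corr^2/energy, so maximizing this score minimizes the error.
        score = corr * corr / energy
        if score > best_score:
            best_score = score
            gain = corr / energy
            best = (idx, gain, [gain * v for v in y])
    return best


def search_subframe(target, adaptive_candidates, fixed_codebook, a):
    """Search the adaptive codebook first, remove it, then search the fixed one."""
    a_idx, a_gain, a_contrib = best_entry(target, adaptive_candidates, a)
    remaining = [t - c for t, c in zip(target, a_contrib)]
    f_idx, f_gain, _ = best_entry(remaining, fixed_codebook, a)
    return a_idx, a_gain, f_idx, f_gain
```

Searching the two codebooks one after the other is cheaper than a joint search but is not guaranteed to find the jointly optimal pair, which is the trade-off the text describes.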


Noise weighting

Most (if not all) modern audio codecs attempt to shape the coding noise so that it appears mostly in the frequency regions where the ear cannot detect it. For example, the ear is more tolerant to noise in parts of the spectrum that are louder and vice versa. That is why, instead of minimizing the simple quadratic error, CELP minimizes the error in the ''perceptually weighted'' domain. The weighting filter W(z) is typically derived from the LPC filter by the use of bandwidth expansion:

:W(z) = \frac{A(z/\gamma_1)}{A(z/\gamma_2)}

where \gamma_1 > \gamma_2.
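
A minimal sketch of this weighting filter is shown below. It relies on the fact that replacing z by z/\gamma in A(z) scales the k-th coefficient by \gamma^k. The function names and the default values 0.9 and 0.6 are illustrative assumptions only; actual codecs choose and sometimes adapt their own factors.

```python
def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): the k-th coefficient is a[k] * gamma**k."""
    return [c * gamma ** k for k, c in enumerate(a)]


def weighting_filter(x, a, gamma1=0.9, gamma2=0.6):
    """Filter x through W(z) = A(z/gamma1) / A(z/gamma2), starting from rest.

    a holds the coefficients of A(z) with a[0] == 1.0. The gamma defaults
    are example values, chosen only so that gamma1 > gamma2.
    """
    num = bandwidth_expand(a, gamma1)  # zeros of W(z)
    den = bandwidth_expand(a, gamma2)  # poles of W(z)
    y = []
    for n in range(len(x)):
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n - k >= 0)
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n - k >= 0)
        y.append(acc)
    return y
```

Applying `weighting_filter` to the difference between the original and the decoded signal gives the error that the codebook searches try to minimize.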


See also

* MPEG-4 Part 3 (CELP as an MPEG-4 Audio Object Type)
* G.728 – Coding of speech at 16 kbit/s using low-delay code excited linear prediction
* G.718 – uses CELP for the lower two layers for the band (50–6400 Hz) in a two-stage coding structure
* G.729.1 – uses CELP coding for the lower band (50–4000 Hz) in a three-stage coding structure
* Comparison of audio coding formats
* CELT is a related audio codec that borrows some ideas from CELP.


References

* B. S. Atal, "The History of Linear Prediction," ''IEEE Signal Processing Magazine'', vol. 23, no. 2, March 2006, pp. 154–161.
* M. R. Schroeder and B. S. Atal, "Code-excited linear prediction (CELP): high-quality speech at very low bit rates," in ''Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing'' (ICASSP), vol. 10, pp. 937–940, 1985.


External links

* This article is based on a paper presented at Linux.Conf.Au
* Some parts based on the Speex codec manual


* Reference implementations of CELP 1016A (CELP 3.2a) and LPC 10e.


Selected readings




Speech Processing: Theory of LPC Analysis and Synthesis