An audio coding format (or sometimes audio compression format) is a content representation format for storage or transmission of digital audio (such as in digital television, digital radio, and in audio and video files). Examples of audio coding formats include MP3, AAC, Vorbis, FLAC, and Opus. A specific software or hardware implementation capable of audio compression and decompression to and from a specific audio coding format is called an audio codec; an example is LAME, one of several codecs that implement encoding and decoding of audio in the MP3 audio coding format in software.

Some audio coding formats are documented by a detailed technical specification document known as an audio coding specification. Some such specifications are written and approved by standardization organizations as technical standards, and are thus known as audio coding standards. The term "standard" is also sometimes used for de facto standards as well as formal standards.

Audio content encoded in a particular audio coding format is normally encapsulated within a container format. As such, the user normally doesn't have a raw AAC file, but instead has a .m4a audio file, which is an MPEG-4 Part 14 container containing AAC-encoded audio. The container also contains metadata such as title and other tags, and perhaps an index for fast seeking. A notable exception is MP3 files, which are raw audio coding without a container format. De facto standards for adding metadata tags such as title and artist to MP3s, such as ID3, are hacks which work by attaching the tags to the MP3 data (ID3v1 at the end of the file, ID3v2 usually at the start) and then relying on the MP3 player to recognize the chunk as malformed audio coding and therefore skip it. In video files with audio, the encoded audio content is bundled with video (in a video coding format) inside a multimedia container format.

An audio coding format does not dictate all algorithms used by a codec implementing the format. An important part of how lossy audio compression works is removing data in ways humans cannot hear, according to a psychoacoustic model; the implementer of an encoder has some freedom of choice in which data to remove (according to their psychoacoustic model).
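The ID3 tagging approach mentioned above can be illustrated with a short Python sketch that locates and strips ID3 tags from the bytes of an MP3 file. It relies on two properties of the ID3 layouts: an ID3v1 tag is a fixed 128-byte block at the end of the file beginning with "TAG", and an ID3v2 tag begins with "ID3" followed by a 10-byte header whose last four bytes hold the tag size as a "synchsafe" integer (7 usable bits per byte). The function names are hypothetical, and the sketch ignores the optional ID3v2.4 footer and any other tag formats, so it is a simplification rather than a complete parser.

```python
def id3v2_tag_size(data: bytes) -> int:
    """Return the total size of a leading ID3v2 tag in bytes, or 0 if absent."""
    if len(data) < 10 or data[:3] != b"ID3":
        return 0
    size_bytes = data[6:10]
    # Each size byte must have its top bit clear (synchsafe encoding).
    if any(b & 0x80 for b in size_bytes):
        return 0
    size = 0
    for b in size_bytes:
        size = (size << 7) | b
    return 10 + size  # 10-byte header plus the tag body


def has_id3v1_tag(data: bytes) -> bool:
    """Return True if the last 128 bytes look like an ID3v1 tag."""
    return len(data) >= 128 and data[-128:-125] == b"TAG"


def strip_id3(data: bytes) -> bytes:
    """Return the data with a leading ID3v2 and trailing ID3v1 tag removed."""
    start = id3v2_tag_size(data)
    end = len(data) - 128 if has_id3v1_tag(data) else len(data)
    return data[start:end]
```

A decoder that knows nothing about ID3 would instead treat these regions as invalid audio frames and skip ahead to the next valid frame header, which is the behaviour the tagging scheme depends on.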


Lossless, lossy, and uncompressed audio coding formats

A lossless audio coding format reduces the total data needed to represent a sound but can be decoded to its original, uncompressed form. A lossy audio coding format additionally reduces the bit resolution of the sound on top of compression, which results in far less data at the cost of irretrievably lost information.

Consumer audio is most often compressed using lossy audio codecs, as the smaller size is far more convenient for distribution. The most widely used audio coding formats are MP3 and Advanced Audio Coding (AAC), both of which are lossy formats based on the modified discrete cosine transform (MDCT) and perceptual coding algorithms.

Lossless audio coding formats such as FLAC and Apple Lossless are sometimes available, though at the cost of larger files.

Uncompressed audio formats, such as pulse-code modulation (PCM, commonly stored in .wav files), are also sometimes used. PCM is the format used by Compact Disc Digital Audio (CDDA); lossy compression became the norm for consumer distribution after the introduction of MP3.
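The difference between lossless and lossy coding can be made concrete with a small Python sketch. It is not an audio codec: zlib (a general-purpose compressor) stands in for a lossless format, and simple bit-depth truncation stands in for a lossy one, whereas real lossy formats such as MP3 and AAC use the MDCT and a psychoacoustic model. The test tone, function names, and parameters are all assumptions chosen for illustration.

```python
import math
import struct
import zlib


def make_pcm(n_samples=4410, rate=44100, freq=440.0):
    """Generate 16-bit signed mono PCM samples of a sine tone."""
    return [round(20000 * math.sin(2 * math.pi * freq * i / rate))
            for i in range(n_samples)]


def pack(samples):
    """Serialize samples as little-endian 16-bit integers."""
    return struct.pack(f"<{len(samples)}h", *samples)


def unpack(data):
    """Inverse of pack()."""
    return list(struct.unpack(f"<{len(data) // 2}h", data))


def lossless_roundtrip(samples):
    """Compress and decompress; the restored samples equal the input."""
    compressed = zlib.compress(pack(samples), 9)
    restored = unpack(zlib.decompress(compressed))
    return compressed, restored


def lossy_roundtrip(samples, keep_bits=8):
    """Discard the low-order bits of each sample; they cannot be recovered."""
    shift = 16 - keep_bits
    reduced = [s >> shift for s in samples]    # each value fits in keep_bits bits
    restored = [r << shift for r in reduced]   # low bits come back as zeros
    return reduced, restored


if __name__ == "__main__":
    samples = make_pcm()
    compressed, exact = lossless_roundtrip(samples)
    reduced, approx = lossy_roundtrip(samples)
    print("raw bytes:", len(pack(samples)), "zlib bytes:", len(compressed))
    print("lossless identical:", exact == samples)
    print("lossy max error:", max(abs(a - b) for a, b in zip(samples, approx)))
```

The lossless path returns exactly the original samples, while the truncated samples differ from the originals by an error that no decoder can undo.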


History

In 1950, Bell Labs filed the patent on differential pulse-code modulation (DPCM). Adaptive DPCM (ADPCM) was introduced by P. Cummiskey, Nikil S. Jayant and James L. Flanagan at Bell Labs in 1973.

Perceptual coding was first used for speech coding, with linear predictive coding (LPC). Initial concepts for LPC date back to the work of Fumitada Itakura (Nagoya University) and Shuzo Saito (Nippon Telegraph and Telephone) in 1966. During the 1970s, Bishnu S. Atal and Manfred R. Schroeder at Bell Labs developed a form of LPC called adaptive predictive coding (APC), a perceptual coding algorithm that exploited the masking properties of the human ear, followed in the early 1980s by the code-excited linear prediction (CELP) algorithm, which achieved a significant compression ratio for its time. Perceptual coding is used by modern audio compression formats such as MP3 and AAC.

The discrete cosine transform (DCT), developed by Nasir Ahmed, T. Natarajan and K. R. Rao in 1974, provided the basis for the modified discrete cosine transform (MDCT). The MDCT was proposed by J. P. Princen, A. W. Johnson and A. B. Bradley in 1987, following earlier work by Princen and Bradley in 1986. The MDCT is used by modern audio compression formats such as Dolby Digital, MP3, and Advanced Audio Coding (AAC).


List of lossy formats


General


Speech

Further information: Speech coding

* Linear predictive coding (LPC)
** Adaptive predictive coding (APC)
** Code-excited linear prediction (CELP)
** Algebraic code-excited linear prediction (ACELP)
** Relaxed code-excited linear prediction (RCELP)
** Low-delay CELP (LD-CELP)
** Adaptive Multi-Rate (used in GSM and 3GPP)
** Codec2 (noted for its lack of patent restrictions)
** Speex (noted for its lack of patent restrictions)
* Modified discrete cosine transform (MDCT)
** AAC-LD
** Constrained Energy Lapped Transform (CELT)
** Opus (mostly for real-time applications)


List of lossless formats

* Apple Lossless (ALAC – Apple Lossless Audio Codec)
* ATRAC Advanced Lossless (AAL), the lossless variant of Adaptive Transform Acoustic Coding (ATRAC)
* Audio Lossless Coding (also known as MPEG-4 ALS)
* Direct Stream Transfer (DST)
* Dolby TrueHD
* DTS-HD Master Audio
* Free Lossless Audio Codec (FLAC)
* Lossless discrete cosine transform (LDCT)
* Meridian Lossless Packing (MLP)
* Monkey's Audio (Monkey's Audio APE)
* MPEG-4 SLS (also known as HD-AAC)
* OptimFROG
* Original Sound Quality (OSQ)
* RealPlayer (RealAudio Lossless)
* Shorten (SHN)
* TTA (True Audio Lossless)
* WavPack (WavPack lossless)
* WMA Lossless (Windows Media Lossless)


See also

* Comparison of audio coding formats
* Data compression § Audio
* Audio file format
* List of audio compression formats

