An audio coding format (or sometimes audio compression format) is a content representation format for storage or transmission of digital audio (such as in digital television, digital radio and in audio and video files). Examples of audio coding formats include MP3, AAC, Vorbis, FLAC, and Opus. A specific software or hardware implementation capable of audio compression and decompression to/from a specific audio coding format is called an audio codec; an example of an audio codec is LAME, which is one of several codecs that implement encoding and decoding of audio in the MP3 audio coding format in software.
Some audio coding formats are documented by a detailed technical specification document known as an audio coding specification. Some such specifications are written and approved by standardization organizations as technical standards, and are thus known as audio coding standards. The term "standard" is also sometimes used for ''de facto'' standards as well as formal standards.
Audio content encoded in a particular audio coding format is normally encapsulated within a container format. As such, the user normally doesn't have a raw AAC file, but instead has a .m4a audio file, which is an MPEG-4 Part 14 container containing AAC-encoded audio. The container also contains metadata such as title and other tags, and perhaps an index for fast seeking. A notable exception is MP3 files, which are raw audio coding without a container format. De facto standards for adding metadata tags such as title and artist to MP3s, such as ID3, are hacks which work by attaching the tags to the MP3 (ID3v1 at the end of the file, ID3v2 at the start), and then relying on the MP3 player to recognize the chunk as malformed audio coding and therefore skip it. In video files with audio, the encoded audio content is bundled with video (in a video coding format) inside a multimedia container format.
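As a rough illustration of how such tags sit around the audio data, the sketch below strips a leading ID3v2 block and a trailing ID3v1 block from a byte string. It assumes the commonly documented layouts: a 10-byte ID3v2 header that starts with `ID3` and ends with a 4-byte "syncsafe" size, and a fixed 128-byte ID3v1 block that starts with `TAG`. The function name `strip_id3` is hypothetical, and less common cases (for example, tags placed elsewhere in the file) are ignored.

```python
def strip_id3(data: bytes) -> bytes:
    """Return `data` without a leading ID3v2 tag or a trailing ID3v1 tag."""
    start, end = 0, len(data)

    # ID3v2 header: b"ID3", 2 version bytes, 1 flags byte, 4 size bytes.
    # The size is "syncsafe": only the low 7 bits of each byte are used.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for byte in data[6:10]:
            size = (size << 7) | (byte & 0x7F)
        # Flag bit 0x10 signals an extra 10-byte footer after the tag body.
        footer = 10 if data[5] & 0x10 else 0
        start = min(10 + size + footer, end)

    # ID3v1: a fixed 128-byte block at the very end, starting with b"TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128

    return data[start:end]


# Synthetic example: fake audio bytes wrapped in both kinds of tag.
audio = b"\xff\xfb\x90\x00" * 4
id3v2 = b"ID3\x03\x00\x00" + bytes([0, 0, 0, 5]) + b"x" * 5
id3v1 = b"TAG" + b"\x00" * 125
tagged = id3v2 + audio + id3v1
```

Calling `strip_id3(tagged)` on the synthetic data returns only the `audio` bytes, while untagged input passes through unchanged.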
An audio coding format does not dictate all algorithms used by a codec implementing the format. Lossy audio compression works in large part by removing data in ways humans can't hear, according to a psychoacoustic model; the implementer of an encoder has some freedom of choice in which data to remove (according to their psychoacoustic model).
Lossless, lossy, and uncompressed audio coding formats
A lossless audio coding format reduces the total data needed to represent a sound but can be decoded to its original, uncompressed form. A lossy audio coding format additionally reduces the bit resolution of the sound on top of compression, which results in far less data at the cost of irretrievably lost information.
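The difference can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: generic DEFLATE (`zlib`) stands in for a real lossless audio coder, and crude truncation of each 16-bit sample to its top 8 bits stands in for a lossy coder, which in practice would use a perceptual model rather than simply dropping bits. The test tone and all variable names are made up for the example.

```python
import math
import struct
import zlib

# A short 440 Hz test tone as 16-bit samples at an assumed 8 kHz sample rate.
samples = [round(12000 * math.sin(2 * math.pi * 440 * n / 8000))
           for n in range(800)]
pcm = struct.pack(f"<{len(samples)}h", *samples)

# Lossless: the decoded data is bit-for-bit identical to the input.
packed = zlib.compress(pcm)
restored = list(struct.unpack(f"<{len(samples)}h", zlib.decompress(packed)))
lossless_ok = restored == samples

# Lossy (crude): keep only the top 8 bits of every sample.
reduced = [s >> 8 for s in samples]
approx = [r << 8 for r in reduced]

# The discarded low bits cannot be recovered, so an error remains.
max_error = max(abs(a - s) for a, s in zip(approx, samples))
```

Here `lossless_ok` is true, whereas `max_error` is non-zero (bounded by 255, the largest value the dropped 8 bits can hold).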
Consumer audio is most often compressed using lossy audio codecs as the smaller size is far more convenient for distribution. The most widely used audio coding formats are MP3 and Advanced Audio Coding (AAC), both of which are lossy formats based on modified discrete cosine transform (MDCT) and perceptual coding algorithms.
Lossless audio coding formats such as FLAC and Apple Lossless are sometimes available, though at the cost of larger files.
Uncompressed audio formats, such as pulse-code modulation (PCM, commonly stored in .wav files), are also sometimes used. PCM is the standard format for Compact Disc Digital Audio (CDDA); for consumer audio more broadly, lossy compression eventually became the standard after the introduction of MP3.
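To give a sense of scale, the bit rate of uncompressed PCM is simply sample rate × bits per sample × channels. The sketch below applies this to CD audio, assuming the usual CD parameters (44.1 kHz, 16 bits, 2 channels), and compares it with 128 kbit/s, a bit rate chosen here purely as an illustrative MP3 setting. The function name is hypothetical.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Bit rate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


# Assumed CD audio parameters: 44.1 kHz, 16-bit, stereo.
cd_rate = pcm_bit_rate(44_100, 16, 2)   # 1_411_200 bit/s

# Illustrative lossy bit rate (an assumption, not a fixed property of MP3).
mp3_rate = 128_000
ratio = cd_rate / mp3_rate              # about 11 times smaller
```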
History
In 1950, Bell Labs filed the patent on differential pulse-code modulation (DPCM). Adaptive DPCM (ADPCM) was introduced by P. Cummiskey, Nikil S. Jayant and James L. Flanagan at Bell Labs in 1973.
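The core idea of DPCM can be sketched as follows: rather than storing each sample, the encoder stores the difference between the sample and a prediction of it. This minimal sketch assumes the simplest possible predictor (the previous sample) and omits the quantization step that a practical DPCM or ADPCM coder would apply to the differences; the function names are hypothetical.

```python
def dpcm_encode(samples):
    """Encode samples as differences from the previous sample."""
    residuals = []
    prediction = 0  # assumed initial predictor state
    for sample in samples:
        residuals.append(sample - prediction)
        prediction = sample  # predict the next sample as the current one
    return residuals


def dpcm_decode(residuals):
    """Rebuild samples by accumulating the stored differences."""
    samples = []
    prediction = 0
    for residual in residuals:
        prediction += residual
        samples.append(prediction)
    return samples


pcm = [100, 102, 105, 105, 103]
residuals = dpcm_encode(pcm)      # [100, 2, 3, 0, -2]
decoded = dpcm_decode(residuals)  # same values as pcm
```

For slowly varying signals the differences are much smaller than the samples themselves, so they can be represented with fewer bits.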
Perceptual coding was first used for speech coding compression, with linear predictive coding (LPC). Initial concepts for LPC date back to the work of Fumitada Itakura (Nagoya University) and Shuzo Saito (Nippon Telegraph and Telephone) in 1966. During the 1970s, Bishnu S. Atal and Manfred R. Schroeder at Bell Labs developed a form of LPC called adaptive predictive coding (APC), a perceptual coding algorithm that exploited the masking properties of the human ear, followed in the early 1980s with the code-excited linear prediction (CELP) algorithm, which achieved a significant compression ratio for its time. Perceptual coding is used by modern audio compression formats such as MP3 and AAC.
The discrete cosine transform (DCT), developed by Nasir Ahmed, T. Natarajan and K. R. Rao in 1974, provided the basis for the modified discrete cosine transform (MDCT). The MDCT was proposed by J. P. Princen, A. W. Johnson and A. B. Bradley in 1987, following earlier work by Princen and Bradley in 1986, and is used by modern audio compression formats such as Dolby Digital, MP3, and Advanced Audio Coding (AAC).
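For readers who want to see the transform itself, here is a minimal, unoptimized sketch of the MDCT with 50% overlapping blocks. It assumes a sine window applied at both analysis and synthesis and a 2/N scale factor in the inverse transform; real codecs differ in windowing, block switching and fast algorithms, and they quantize the coefficients, which is where the loss occurs. No quantization is done here, so the overlap-add output matches the input up to rounding. All function names are hypothetical.

```python
import math


def sine_window(length):
    """Sine window; satisfies w[n]^2 + w[n + length/2]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(block):
    """Transform 2N time samples into N coefficients."""
    n_half = len(block) // 2
    return [sum(x * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
                for n, x in enumerate(block))
            for k in range(n_half)]


def imdct(coeffs):
    """Transform N coefficients back into 2N (time-aliased) samples."""
    n_half = len(coeffs)
    return [2.0 / n_half
            * sum(c * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
                  for k, c in enumerate(coeffs))
            for n in range(2 * n_half)]


def encode(signal, n_half):
    """Split into windowed blocks of 2N samples with hop size N."""
    window = sine_window(2 * n_half)
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    padded += [0.0] * (-len(padded) % n_half)  # align to a multiple of N
    return [mdct([w * x for w, x in zip(window, padded[s:s + 2 * n_half])])
            for s in range(0, len(padded) - n_half, n_half)]


def decode(frames, n_half, length):
    """Window each inverse block and overlap-add; aliasing cancels."""
    window = sine_window(2 * n_half)
    out = [0.0] * ((len(frames) + 1) * n_half)
    for i, coeffs in enumerate(frames):
        for n, y in enumerate(imdct(coeffs)):
            out[i * n_half + n] += window[n] * y
    return out[n_half:n_half + length]


signal = [math.sin(0.3 * i) for i in range(50)]
frames = encode(signal, 8)
rebuilt = decode(frames, 8, len(signal))
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
```

Each block of 16 samples yields only 8 coefficients, yet overlapping neighbouring blocks lets the aliasing terms cancel so the signal is recovered.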
List of lossy formats
General
Speech
Further information: Speech coding
* Linear predictive coding (LPC)
** Adaptive predictive coding (APC)
** Code-excited linear prediction (CELP)
** Algebraic code-excited linear prediction (ACELP)
** Relaxed code-excited linear prediction (RCELP)
** Low-delay CELP (LD-CELP)
** Adaptive Multi-Rate (used in GSM and 3GPP)
** Codec2 (noted for its lack of patent restrictions)
** Speex (noted for its lack of patent restrictions)
* Modified discrete cosine transform (MDCT)
** AAC-LD
** Constrained Energy Lapped Transform (CELT)
** Opus (mostly for real-time applications)
List of lossless formats
* Apple Lossless (ALAC – Apple Lossless Audio Codec)
* Adaptive Transform Acoustic Coding (ATRAC)
* Audio Lossless Coding (also known as MPEG-4 ALS)
* Direct Stream Transfer (DST)
* Dolby TrueHD
* DTS-HD Master Audio
* Free Lossless Audio Codec (FLAC)
* Lossless discrete cosine transform (LDCT)
* Meridian Lossless Packing (MLP)
* Monkey's Audio (Monkey's Audio APE)
* MPEG-4 SLS (also known as HD-AAC)
* OptimFROG
* Original Sound Quality (OSQ)
* RealPlayer (RealAudio Lossless)
* Shorten (SHN)
* TTA (True Audio Lossless)
* WavPack (WavPack lossless)
* WMA Lossless (Windows Media Lossless)
See also
* Comparison of audio coding formats
* Data compression § Audio
* Audio file format
* List of audio compression formats