An audio coding format (or sometimes audio compression format) is a
content representation format for storage or transmission of
digital audio
(such as in
digital television,
digital radio
and in audio and video files). Examples of audio coding formats include
MP3,
AAC
,
Vorbis,
FLAC
, and
Opus. A specific software or hardware implementation capable of
audio compression and decompression to/from a specific audio coding format is called an
audio codec; an example of an audio codec is
LAME, which is one of several different codecs that implement encoding and decoding of audio in the
MP3 audio coding format in software.
Some audio coding formats are documented by a detailed
technical specification document known as an audio coding specification. Some such specifications are written and approved by
standardization organizations as
technical standards, and are thus known as audio coding standards. The term "standard" is also sometimes used for
de facto standards as well as formal standards.
Audio content encoded in a particular audio coding format is normally encapsulated within a
container format
. As such, the user normally doesn't have a raw
AAC
file, but instead has a .m4a
audio file, which is an
MPEG-4 Part 14 container containing AAC-encoded audio. The container also contains
metadata
such as title and other tags, and perhaps an index for fast seeking. A notable exception is
MP3 files, which are raw audio coding without a container format. De facto standards for adding metadata tags such as title and artist to MP3s, such as
ID3, are
hacks which work by attaching the tags to the MP3 (ID3v1 appends them to the end of the file, while ID3v2 normally places them at the start), and then relying on the MP3 player to recognize the chunk as malformed audio coding and therefore skip it. In video files with audio, the encoded audio content is bundled with video (in a
video coding format) inside a
multimedia container format.
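As a rough illustration of how such tags sit alongside raw MP3 data, the following Python sketch checks a byte string for the two common ID3 layouts: an ID3v2 header at the start (the bytes `ID3`, two version bytes, a flags byte, and a 4-byte "syncsafe" size) and a fixed 128-byte ID3v1 block at the end (starting with `TAG`). The function names are illustrative, and the sketch only checks signatures: it does not parse frames and ignores the optional ID3v2 footer.

```python
def syncsafe_to_int(b: bytes) -> int:
    """Decode an ID3v2 'syncsafe' integer (only 7 bits used per byte)."""
    value = 0
    for byte in b:
        value = (value << 7) | (byte & 0x7F)
    return value


def find_id3_tags(data: bytes) -> dict:
    """Report which ID3 tag blocks appear to be present in `data`."""
    result = {"id3v2_size": None, "has_id3v1": False}
    # ID3v2: 10-byte header = "ID3", 2 version bytes, 1 flags byte, 4 size bytes.
    if len(data) >= 10 and data[:3] == b"ID3":
        # The stored size excludes the 10-byte header itself.
        result["id3v2_size"] = 10 + syncsafe_to_int(data[6:10])
    # ID3v1: fixed 128-byte block at the very end, starting with "TAG".
    if len(data) >= 128 and data[-128:-125] == b"TAG":
        result["has_id3v1"] = True
    return result


# Synthetic stand-ins for real data (not a valid MP3 stream).
fake_audio = b"\xff\xfb" + bytes(100)
v2_tag = b"ID3" + b"\x03\x00" + b"\x00" + b"\x00\x00\x01\x00" + bytes(128)
v1_tag = b"TAG" + bytes(125)
print(find_id3_tags(v2_tag + fake_audio + v1_tag))
```

A player that does not understand these blocks simply fails to find valid audio frames in them and moves on, which is why the scheme works without a real container.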
An audio coding format does not dictate all
algorithms used by a
codec implementing the format. An important part of how lossy audio compression works is by removing data in ways humans can't hear, according to a
psychoacoustic model; the implementer of an encoder has some freedom of choice in which data to remove (according to their psychoacoustic model).
Lossless, lossy, and uncompressed audio coding formats
A
lossless audio coding format reduces the total data needed to represent a sound but can be decoded to its original, uncompressed form. A
lossy audio coding format additionally reduces the
bit resolution of the sound on top of compression, which results in far less data at the cost of irretrievably lost information.
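The distinction can be made concrete with a toy example. The sketch below is not how any real audio codec works: it uses general-purpose `zlib` compression as a stand-in for a lossless coder and crude bit-depth reduction as a stand-in for a lossy one, applied to a synthetic 16-bit sine wave. All parameters (8 kHz sample rate, 440 Hz tone, 8 discarded bits) are arbitrary assumptions chosen for illustration.

```python
import math
import struct
import zlib

# Synthetic signal: 1 second of a 440 Hz sine at 8 kHz, as signed 16-bit samples.
RATE = 8000
samples = [round(12000 * math.sin(2 * math.pi * 440 * n / RATE)) for n in range(RATE)]
pcm = struct.pack(f"<{len(samples)}h", *samples)

# "Lossless" stand-in: compress, then decompress back to the identical bytes.
packed = zlib.compress(pcm, 9)
restored = zlib.decompress(packed)

# "Lossy" stand-in: discard the low 8 bits of every sample before compressing.
coarse = [(s >> 8) << 8 for s in samples]
coarse_pcm = struct.pack(f"<{len(coarse)}h", *coarse)
lossy_packed = zlib.compress(coarse_pcm, 9)

# The discarded bits cannot be recovered from `coarse`.
max_error = max(abs(a - b) for a, b in zip(samples, coarse))

print("original bytes:     ", len(pcm))
print("lossless compressed:", len(packed), "exact round trip:", restored == pcm)
print("lossy compressed:   ", len(lossy_packed), "max sample error:", max_error)
```

The lossless path returns exactly the bytes that went in, whereas the reduced-resolution samples differ from the originals by up to 255 and can never be restored. Real lossy formats choose what to discard using a psychoacoustic model rather than simply truncating bits.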
Consumer audio is most often compressed using lossy audio codecs as the smaller size is far more convenient for distribution. The most widely used audio coding formats are
MP3 and
Advanced Audio Coding (AAC), both of which are lossy formats based on
modified discrete cosine transform
(MDCT) and
perceptual coding algorithms.
Lossless audio coding formats such as
FLAC
and
Apple Lossless are sometimes available, though at the cost of larger files.
Uncompressed audio formats, such as
pulse-code modulation (PCM, commonly stored in .wav files), are also sometimes used. PCM is the standard format for
Compact Disc Digital Audio (CDDA); lossy compression became the norm for distributed audio only later, after the introduction of MP3.
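For a sense of scale, the data rate of uncompressed PCM follows directly from its parameters. The sketch below uses the CD audio parameters (44,100 samples per second, 16 bits per sample, 2 channels). The 128 kbit/s figure used for comparison is only an assumed example of a lossy bitrate, not a value taken from the text above.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Bits per second of uncompressed linear PCM."""
    return sample_rate_hz * bits_per_sample * channels


cd_rate = pcm_bit_rate(44_100, 16, 2)      # bits per second
cd_bytes_per_minute = cd_rate * 60 // 8    # bytes for one minute of audio
lossy_rate = 128_000                       # assumed example lossy bitrate (bit/s)

print("CD PCM bit rate:", cd_rate, "bit/s")
print("CD PCM bytes per minute:", cd_bytes_per_minute)
print("ratio to assumed lossy rate:", round(cd_rate / lossy_rate, 1))
```

This works out to 1,411,200 bit/s, or roughly 10.6 MB per minute, which is about eleven times the assumed lossy bitrate.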
History
In 1950,
Bell Labs filed the patent on
differential pulse-code modulation (DPCM).
Adaptive DPCM (ADPCM) was introduced by P. Cummiskey,
Nikil S. Jayant and
James L. Flanagan at
Bell Labs in 1973.
Perceptual coding was first used for
speech coding compression, with
linear predictive coding (LPC).
Initial concepts for LPC date back to the work of
Fumitada Itakura (
Nagoya University
) and Shuzo Saito (
Nippon Telegraph and Telephone) in 1966. During the 1970s,
Bishnu S. Atal
and
Manfred R. Schroeder
at
Bell Labs developed a form of LPC called
adaptive predictive coding
(APC), a perceptual coding algorithm that exploited the masking properties of the human ear, followed in the early 1980s with the
code-excited linear prediction (CELP) algorithm which achieved a significant compression ratio for its time.
Perceptual coding is used by modern audio compression formats such as
MP3 and
AAC
.
Discrete cosine transform (DCT), developed by
Nasir Ahmed, T. Natarajan and
K. R. Rao
in 1974,
provided the basis for the
modified discrete cosine transform
(MDCT). The MDCT was proposed by J. P. Princen, A. W. Johnson and A. B. Bradley in 1987, following earlier work by Princen and Bradley in 1986, and is used by modern audio compression formats such as
Dolby Digital,
MP3,
and
Advanced Audio Coding (AAC).
List of lossy formats
General
Speech
Further information: Speech coding
*
Linear predictive coding (LPC)
**
Adaptive predictive coding
(APC)
**
Code-excited linear prediction (CELP)
**
Algebraic code-excited linear prediction (ACELP)
**
Relaxed code-excited linear prediction (RCELP)
**
Low-delay CELP (LD-CELP)
**
Adaptive Multi-Rate (used in
GSM and
3GPP)
**
Codec2
(noted for its lack of patent restrictions)
**
Speex (noted for its lack of patent restrictions)
*
Modified discrete cosine transform
(MDCT)
**
AAC-LD
**
Constrained Energy Lapped Transform
(CELT)
**
Opus (mostly for real-time applications)
List of lossless formats
*
Apple Lossless (ALAC – Apple Lossless Audio Codec)
*
Adaptive Transform Acoustic Coding (ATRAC Advanced Lossless)
*
Audio Lossless Coding (also known as MPEG-4 ALS)
*
Direct Stream Transfer (DST)
*
Dolby TrueHD
*
DTS-HD Master Audio
*
Free Lossless Audio Codec (FLAC)
*
Lossless discrete cosine transform (LDCT)
*
Meridian Lossless Packing (MLP)
*
Monkey's Audio (Monkey's Audio APE)
*
MPEG-4 SLS (also known as HD-AAC)
*
OptimFROG
*
Original Sound Quality
(OSQ)
*
RealPlayer (RealAudio Lossless)
*
Shorten (SHN)
*
TTA
(True Audio Lossless)
*
WavPack (WavPack lossless)
*
WMA Lossless (Windows Media Lossless)
See also
*
Comparison of audio coding formats
*
Data compression (Audio section)
*
Audio file format
*
List of audio compression formats