The Adaptive Multi-Rate (AMR, AMR-NB or GSM-AMR) audio codec is an audio compression format optimized for speech coding. The AMR speech codec consists of a multi-rate narrowband speech codec that encodes narrowband (200–3400 Hz) signals at variable bit rates ranging from 4.75 to 12.2 kbit/s, with toll-quality speech starting at 7.4 kbit/s.
AMR was adopted as the standard speech codec by 3GPP in October 1999 and is now widely used in GSM and UMTS. It uses link adaptation to select one of eight different bit rates based on link conditions.
AMR is also a file format for storing spoken audio using the AMR codec. Many modern mobile telephone handsets can store short audio recordings in the AMR format, and both free and proprietary programs exist (see Software support) to convert between this and other formats, although AMR is a speech format and is unlikely to give ideal results for other audio. The common filename extension is .amr. Another storage format for AMR suits applications with more advanced demands on the storage format, such as random access or synchronization with video: the 3GPP-specified 3GP container format, which is based on the ISO base media file format.
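As a rough illustration of the .amr storage format, the sketch below walks the frames of a single-channel AMR file in Python. It assumes the layout described in the RTP payload and file storage format specification listed under External links: a `#!AMR\n` header string followed by frames that each begin with a one-byte header whose bits 3 to 6 carry the frame type. That layout, the 39-bit size of the comfort-noise (SID) frame, and the helper names `payload_bytes` and `scan_amr` are assumptions that are not stated in the text above.

```python
AMR_MAGIC = b"#!AMR\n"  # assumed header string of a single-channel AMR file

# Speech bits per 20 ms frame for frame types 0..7 (4.75 ... 12.2 kbit/s).
SPEECH_BITS = [95, 103, 118, 134, 148, 159, 204, 244]
SID_FRAME_TYPE = 8        # comfort-noise descriptor (assumed to carry 39 bits)
SID_BITS = 39
NO_DATA_FRAME_TYPE = 15   # frame with no payload


def payload_bytes(frame_type):
    """Return the payload size in bytes, with the bits padded to a whole byte."""
    if 0 <= frame_type < len(SPEECH_BITS):
        return (SPEECH_BITS[frame_type] + 7) // 8
    if frame_type == SID_FRAME_TYPE:
        return (SID_BITS + 7) // 8
    if frame_type == NO_DATA_FRAME_TYPE:
        return 0
    # Other frame types are not handled by this sketch.
    raise ValueError("unsupported frame type: %d" % frame_type)


def scan_amr(data):
    """Return the list of frame types found in an in-memory .amr file."""
    if not data.startswith(AMR_MAGIC):
        raise ValueError("missing AMR header string")
    pos = len(AMR_MAGIC)
    frame_types = []
    while pos < len(data):
        frame_type = (data[pos] >> 3) & 0x0F  # bits 3..6 of the frame header
        size = payload_bytes(frame_type)
        if pos + 1 + size > len(data):
            raise ValueError("truncated frame at offset %d" % pos)
        frame_types.append(frame_type)
        pos += 1 + size  # skip the header byte and the payload
    return frame_types


def duration_ms(frame_types):
    """Each frame covers 20 ms of audio."""
    return 20 * len(frame_types)
```

Under these assumptions, a file holding fifty 12.2 kbit/s frames would produce fifty entries of frame type 7 and a duration of one second.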
Usage
The frames contain 160 samples and are 20 milliseconds long.
AMR uses various techniques, such as ACELP, DTX, VAD and CNG. The usage of AMR requires optimized link adaptation that selects the best codec mode to meet the local radio channel and capacity requirements. If the radio conditions are bad, source coding is reduced and channel coding is increased. This improves the quality and robustness of the network connection while sacrificing some voice clarity. In the particular case of AMR this improvement is somewhere around S/N = 4–6 dB for usable communication. This adaptive scheme allows the network operator to prioritize capacity or quality per base station.
There are a total of 14 modes of the AMR codec: eight are available in a full rate channel (FR) and six in a half rate channel (HR).
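The trade-off between source coding and channel coding can be sketched with a little arithmetic. The Python example below assumes gross channel rates of 22.8 kbit/s for a full rate channel and 11.4 kbit/s for a half rate channel, and assumes that the six half rate modes are the six lowest bit rates; neither figure is stated in the text above. The function `pick_mode` and its threshold parameter are hypothetical: real networks choose the mode from measured channel quality using operator-configured thresholds.

```python
# Assumed gross channel rates in bit/s (not given in the text above).
GROSS_RATE_BPS = {"FR": 22800, "HR": 11400}

# The eight AMR source bit rates in bit/s.
AMR_MODES_BPS = [4750, 5150, 5900, 6700, 7400, 7950, 10200, 12200]

# Assumption: the half rate channel carries the six lowest modes.
MODES_BPS = {"FR": AMR_MODES_BPS, "HR": AMR_MODES_BPS[:6]}


def channel_coding_bps(channel, mode_bps):
    """Bits per second left for error protection once speech is coded."""
    return GROSS_RATE_BPS[channel] - mode_bps


def pick_mode(channel, min_channel_coding_bps):
    """Hypothetical selector: the highest speech bit rate that still leaves
    at least min_channel_coding_bps for channel coding."""
    candidates = [
        mode for mode in MODES_BPS[channel]
        if channel_coding_bps(channel, mode) >= min_channel_coding_bps
    ]
    if not candidates:
        raise ValueError("no mode leaves enough room for channel coding")
    return max(candidates)
```

With these assumed rates, the 12.2 kbit/s mode leaves 10.6 kbit/s for channel coding on a full rate channel, while the 4.75 kbit/s mode leaves 18.05 kbit/s, which is why lower modes are more robust on a poor link.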
Features
* Sampling frequency 8 kHz/13-bit (160 samples for 20 ms frames), filtered to 200–3400 Hz.
* The AMR codec uses eight source codecs with bit-rates of 12.2, 10.2, 7.95, 7.40, 6.70, 5.90, 5.15 and 4.75 kbit/s.
* Generates frame lengths of 95, 103, 118, 134, 148, 159, 204, or 244 bits for AMR FR bit rates of 4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, or 12.2 kbit/s, respectively. AMR HR frame lengths are different.
* AMR utilizes discontinuous transmission (DTX), with voice activity detection (VAD) and comfort noise generation (CNG), to reduce bandwidth usage during silence periods.
* Algorithmic delay is 20 ms per frame. For the 12.2 kbit/s mode, there is no algorithmic look-ahead delay; for the other rates, the look-ahead delay is 5 ms. A 5 ms "dummy" look-ahead delay is nevertheless added in the 12.2 kbit/s mode to allow seamless frame-wise mode switching with the other rates.
* AMR is a hybrid speech coder, and as such transmits both speech parameters and a waveform signal
** Linear predictive coding (LPC) is used to synthesize the speech from a residual waveform. The LPC parameters are encoded as line spectral pairs (LSP).
** The residual waveform is coded using algebraic code-excited linear prediction (ACELP).
* The complexity of the algorithm is rated at 5, using a relative scale where G.711 is 1 and G.729a is 15.
* PSQM testing under ideal conditions yields mean opinion scores of 4.14 for AMR (12.2 kbit/s), compared to 4.45 for G.711 (μ-law).
* PSQM testing under network stress yields mean opinion scores of 3.79 for AMR (12.2 kbit/s), compared to 4.13 for G.711 (μ-law).
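The frame figures in the list above follow directly from the sampling parameters and the bit rates. The short Python sketch below reproduces that arithmetic; the function names are illustrative only, and the compression ratio is computed against the raw 8 kHz, 13-bit input described in the list.

```python
SAMPLE_RATE_HZ = 8000
BITS_PER_SAMPLE = 13
FRAME_MS = 20
LOOKAHEAD_MS = 5

# The eight AMR source bit rates in bit/s.
AMR_MODES_BPS = [4750, 5150, 5900, 6700, 7400, 7950, 10200, 12200]


def samples_per_frame():
    """Number of input samples covered by one 20 ms frame."""
    return SAMPLE_RATE_HZ * FRAME_MS // 1000


def frame_bits(mode_bps):
    """Coded bits produced for one 20 ms frame at the given bit rate."""
    return mode_bps * FRAME_MS // 1000


def compression_ratio(mode_bps):
    """Raw input bit rate divided by the coded bit rate."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE / mode_bps


def buffering_delay_ms():
    """One frame of samples plus the look-ahead."""
    return FRAME_MS + LOOKAHEAD_MS
```

For instance, `[frame_bits(m) for m in AMR_MODES_BPS]` gives the eight frame lengths listed above, and the 12.2 kbit/s mode compresses the 104 kbit/s raw input by a factor of roughly 8.5.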
Licensing and patent issues
AMR codecs incorporate several patents of Nokia, Ericsson, NTT and VoiceAge, the last one being the License Administrator for the AMR patent pools. VoiceAge also accepts submission of patents for determination of their possible essentiality to these standards. However, it is very difficult to determine which patents, if any, actually cover the AMR/AMR-WB codecs, because the patents related to this technology are not readily identifiable. This also makes it hard for other researchers and the general public to check the claimed inventions for prior art.
The initial fee for professional content creation tools and "real-time channel" products is US$6,500. The minimum annual royalty is $10,000, which, in the first year, excludes the initial fee. Per-channel license fees fall from $0.99 to $0.50 with volume, up to a maximum of $2 million annually.
In the category of personal computer products, e.g., media players, the AMR decoder is licensed for free. The license fee for a sold encoder falls from $0.40 to $0.30 with volume, up to a maximum of $300,000 annually. The minimum annual royalty is not applied to licensed products that fall under the category of personal computer products and use only the free decoder.
More information:
* VoiceAge licensing information, including pricing to license the AMR codecs
* AMR Codecs as Shared Libraries: legal notices for usage of the amrnb and amrwb libraries based on the reference implementation
Software support
* 3GPP TS 26.073: AMR speech codec (C source code), the reference implementation
* Audacity (beta version 1.3) via the FFmpeg integration libraries [Retrieved on 2010-02-28] (both input and output format)
* FFmpeg with OpenCORE AMR libraries [FFmpeg General Documentation - AMR external library, retrieved on 2009-07-08]
* Android [Android AMR codecs, retrieved on 2009-07-08]; used for the voice recorder
* AMR Codecs as Shared Libraries: amrnb and amrwb libraries development site. These libraries are based on the reference implementation and were created to prevent embedding of possibly patented source code into many open source projects.
* Open source software to convert the .amr format: RetroCode and Amr2Wav; both are in an early developmental stage
* AMR Player is freeware to play AMR audio files, and can convert AMR from/to MP3/WAV audio format.
* Nokia's conversion tool can create both .amr and .awb files. It also works in Windows 7 if the setup is run in XP compatibility mode.
* MPlayer (SMPlayer, KMPlayer) [KMPlayer Internal Audio Decoder Preferences, retrieved 2014-10-22]
* Parole Media Player 0.8.1 (in Ubuntu 16.04)
* QuickTime Player and multimedia framework
* RealPlayer version 11 and later
* VLC media player version 1.1.0 and later (input format only, not output format)
* ffdshow
* Apple iPhone (can play back AMR files)
* iOS & macOS (iMessage)
* BlackBerry smartphones (used for voice recorder file format, while BlackBerry 10 cannot play AMR format)
* K-Lite Codec Pack
* Media Player Classic Home Cinema, around 1.7.1
* foobar2000 with the component foo_input_amr
See also
* Adaptive Multi-Rate Wideband (AMR-WB)
* Extended Adaptive Multi-Rate – Wideband (AMR-WB+)
* Half Rate
* Full Rate
* Enhanced Full Rate (EFR)
* Sampling rate
* IS-641
* 3GP
* Comparison of audio coding formats
* RTP audio video profile
External links
* 3GPP TS 26.090: Mandatory Speech Codec speech processing functions; Adaptive Multi-Rate (AMR) speech codec; Transcoding functions
* http://www.3gpp.org/ftp/Specs/html-info/26-series.htm 3GPP codecs specifications; 3G and beyond / GSM, 26 series
* RTP Payload Format and File Storage Format for the Adaptive Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB) Audio Codecs
* The Codecs Parameter for "Bucket" Media Types