G.711.0

G.711 is a narrowband audio codec originally designed for use in telephony that provides toll-quality audio at 64 kbit/s. It passes audio signals in the range of 300–3400 Hz and samples them at 8,000 samples per second, with a tolerance of 50 parts per million (ppm) on that rate. Each sample is represented with 8 bits using non-uniform (logarithmic) quantization, resulting in a 64 kbit/s bit rate. There are two slightly different versions: μ-law, which is used primarily in North America and Japan, and A-law, which is used in most other countries.

G.711 is an ITU-T standard (Recommendation) for audio companding, titled "Pulse code modulation (PCM) of voice frequencies" and released for use in 1972. It is a required standard in many technologies, such as the H.320 and H.323 standards. It can also be used for fax communication over IP networks (as defined in the T.38 specification).

Two enhancements to G.711 have been published: G.711.0 uses lossless data compression to reduce bandwidth usage, and G.711.1 increases audio quality by increasing bandwidth.


Features

* 8 kHz sampling frequency
* 64 kbit/s bitrate (8 kHz sampling frequency × 8 bits per sample)
* Typical algorithmic delay is 0.125 ms, with no look-ahead delay
* G.711 is a waveform speech coder
* G.711 Appendix I defines a packet loss concealment (PLC) algorithm to help hide transmission losses in a packetized network
* G.711 Appendix II defines a discontinuous transmission (DTX) algorithm which uses voice activity detection (VAD) and comfort noise generation (CNG) to reduce bandwidth usage during silence periods
* PSQM testing under ideal conditions yields mean opinion scores of 4.45 for G.711 μ-law and 4.45 for G.711 A-law
* PSQM testing under network stress yields mean opinion scores of 4.13 for G.711 μ-law and 4.11 for G.711 A-law
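
The bitrate and delay figures in the list follow directly from the sampling parameters. The arithmetic can be sketched in C as below; the constant and function names are illustrative assumptions, not part of the Recommendation.

```c
/* Sampling parameters stated in the feature list. */
#define G711_SAMPLE_RATE_HZ  8000u
#define G711_BITS_PER_SAMPLE 8u

/* Bitrate in bit/s: samples per second times bits per sample. */
unsigned g711_bitrate_bps(void)
{
    return G711_SAMPLE_RATE_HZ * G711_BITS_PER_SAMPLE;
}

/* Duration of one sample in microseconds (the 0.125 ms algorithmic delay). */
unsigned g711_sample_period_us(void)
{
    return 1000000u / G711_SAMPLE_RATE_HZ;
}

/* Octets needed to carry frame_ms milliseconds of audio.
 * Each sample occupies exactly one octet. */
unsigned g711_payload_bytes(unsigned frame_ms)
{
    return G711_SAMPLE_RATE_HZ * frame_ms / 1000u;
}
```

For example, a frame of 20 ms (an arbitrary duration chosen here for illustration) would carry 160 octets of G.711 audio.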


Types

G.711 defines two main companding algorithms, the μ-law algorithm and the A-law algorithm. Both are logarithmic, but A-law was specifically designed to be simpler for a computer to process. The standard also defines a sequence of repeating code values which defines the power level of 0 dB.

The μ-law and A-law algorithms encode 14-bit and 13-bit signed linear PCM samples (respectively) to logarithmic 8-bit samples. Thus, the G.711 encoder creates a 64 kbit/s bitstream for a signal sampled at 8 kHz. G.711 μ-law tends to give more resolution to higher range signals while G.711 A-law provides more quantization levels at lower signal levels.

G.711 μ-law is also referred to as PCMU, G711u, or G711MU, and G.711 A-law as PCMA or G711A.


A-law

A-law encoding takes a 13-bit signed linear audio sample as input and converts it to an 8-bit value as follows:

Linear input code    Compressed code (before even-bit inversion)    Linear output code
s0000000abcdx        s̄000abcd                                       s0000000abcd1
s0000001abcdx        s̄001abcd                                       s0000001abcd1
s000001abcdxx        s̄010abcd                                       s000001abcd10
s00001abcdxxx        s̄011abcd                                       s00001abcd100
s0001abcdxxxx        s̄100abcd                                       s0001abcd1000
s001abcdxxxxx        s̄101abcd                                       s001abcd10000
s01abcdxxxxxx        s̄110abcd                                       s01abcd100000
s1abcdxxxxxxx        s̄111abcd                                       s1abcd1000000

Where s is the sign bit, s̄ is its inverse (i.e. positive values are encoded with MSB = s̄ = 1), abcd are the four mantissa bits, and bits marked x are discarded. Note that the first column of the table uses a different representation of negative values than the third column. For example, the input decimal value −21 is represented in binary after bit inversion as 1000000010100, which maps to 00001010 (according to the first row of the table). When decoding, this maps back to 1000000010101, which is interpreted as the output value −21 in decimal. The input value +52 (0000000110100 in binary) maps to 10011010 (according to the second row), which maps back to 0000000110101 (+53 in decimal).

This can be seen as a floating-point number with 4 bits of mantissa m (equivalent to a 5-bit precision), 3 bits of exponent e and 1 sign bit s, formatted as seeemmmm, with the decoded linear value y given by the formula

    y = (-1)^s \cdot (16 \cdot \min\{e, 1\} + m + 0.5) \cdot 2^{\max\{e, 1\}}

which is a 13-bit signed integer in the range ±1 to ±(2^12 − 2^6). Note that no compressed code decodes to zero, due to the addition of 0.5 (half of a quantization step).

In addition, the standard specifies that all resulting even bits (the LSB is even) are inverted before the octet is transmitted. This provides plenty of 0/1 transitions to facilitate the clock recovery process in the PCM receivers. Thus, a silent A-law encoded PCM channel has its 8-bit samples coded as 0xD5 instead of 0x80 in the octets. When data is sent over E0 (G.703), the MSB (sign) is sent first and the LSB is sent last.

The ITU-T STL declares its decoding routine as follows; the routine puts the decoded values in the 13 most significant bits of the 16-bit output data type:

    void alaw_expand(lseg, logbuf, linbuf)
      long   lseg;
      short *linbuf;
      short *logbuf;

See also the "ITU-T Software Tool Library 2009 User's manual".
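
The A-law mapping described in this section can be sketched in C as follows. This is a minimal illustration written from the prose above, not the ITU-T STL source: the function names `alaw_compress13`, `alaw_expand13`, and `alaw_to_wire` are hypothetical, the input is assumed to be a 13-bit signed value held in an `int`, and out-of-range magnitudes are simply clamped.

```c
#include <stdint.h>

/* Compress a 13-bit signed linear sample (-4096..4095) to the 8-bit A-law
 * code, before the even-bit inversion applied for transmission. */
uint8_t alaw_compress13(int x)
{
    uint8_t sign = (x >= 0) ? 0x80 : 0x00; /* inverted sign: MSB = 1 if positive */
    int mag = (x >= 0) ? x : ~x;           /* negative: invert bits after the sign */
    int exp = 0;
    int shift, mant, t;

    if (mag > 4095)
        mag = 4095;                        /* assumption: clamp to 12 magnitude bits */
    if (mag >= 32) {
        exp = 1;                           /* segment 1 covers 32..63 */
        for (t = mag >> 6; t != 0; t >>= 1)
            exp++;                         /* one segment per extra leading bit */
    }
    shift = (exp == 0) ? 1 : exp;          /* segments 0 and 1 both drop one bit */
    mant = (mag >> shift) & 0x0F;          /* four mantissa bits */
    return (uint8_t)(sign | (exp << 4) | mant);
}

/* Expand an 8-bit A-law code (before even-bit inversion) to a signed
 * linear value in the range +/-1 .. +/-4032. */
int alaw_expand13(uint8_t code)
{
    int exp = (code >> 4) & 0x07;
    int mant = code & 0x0F;
    int mag = (mant << 1) | 1;             /* append the half-step bit */

    if (exp > 0)
        mag = (mag | 0x20) << (exp - 1);   /* restore the implicit leading 1 */
    return (code & 0x80) ? mag : -mag;
}

/* Invert the even bits (bit 0 is the LSB) for transmission. The same
 * operation converts a received octet back to the table code. */
uint8_t alaw_to_wire(uint8_t code)
{
    return (uint8_t)(code ^ 0x55);
}
```

With these definitions, the worked examples from the text hold: −21 compresses to 0x0A and expands back to −21, +52 compresses to 0x9A and expands to +53, and a zero sample is sent as 0xD5. Multiplying the expanded value by 8 left-justifies it in a 16-bit word, which is the output convention the text attributes to the STL routine.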


μ-law

The μ-law (sometimes referred to as ulaw, G.711Mu, or G.711μ) encoding takes a 14-bit signed linear audio sample in two's complement representation as input, inverts all bits after the sign bit if the value is negative, adds 33 (binary 100001), and converts it to an 8-bit value as follows:

Linear input code (biased)    Compressed code (before inversion)    Linear output code (biased)
s00000001abcdx                s000abcd                              s00000001abcd1
s0000001abcdxx                s001abcd                              s0000001abcd10
s000001abcdxxx                s010abcd                              s000001abcd100
s00001abcdxxxx                s011abcd                              s00001abcd1000
s0001abcdxxxxx                s100abcd                              s0001abcd10000
s001abcdxxxxxx                s101abcd                              s001abcd100000
s01abcdxxxxxxx                s110abcd                              s01abcd1000000
s1abcdxxxxxxxx                s111abcd                              s1abcd10000000

Where s is the sign bit, abcd are the four mantissa bits, and bits marked x are discarded. Adding 33 is necessary so that all values fall into a compression group; it is subtracted again when decoding.

In addition, the standard specifies that the encoded bits are inverted before the octet is transmitted. Thus, a silent μ-law encoded PCM channel has its 8-bit samples transmitted as 0xFF instead of 0x00 in the octets.

Breaking the encoded value, formatted as seeemmmm, into 4 bits of mantissa m, 3 bits of exponent e and 1 sign bit s, the decoded linear value y is given by the formula

    y = (-1)^s \cdot ((33 + 2m) \cdot 2^e - 33)

which is a 14-bit signed integer in the range ±0 to ±8031. Note that 0 is transmitted as 0xFF, and −1 is transmitted as 0x7F, but when received the result is 0 in both cases.
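
The μ-law steps described above (invert negative values, add the bias of 33, pick the segment, then invert all bits for transmission) can be sketched in C as follows. This is a minimal illustration, not the reference implementation: the function names are hypothetical, the input is assumed to be a 14-bit signed value held in an `int`, and the clamp that keeps the biased magnitude within 13 bits is an assumption added here so that every input falls into a segment.

```c
#include <stdint.h>

#define ULAW_BIAS 33 /* added before compression, subtracted after expansion */

/* Compress a 14-bit signed linear sample (-8192..8191) to the 8-bit
 * mu-law code, before the bit inversion applied for transmission. */
uint8_t ulaw_compress14(int x)
{
    uint8_t sign = (x < 0) ? 0x80 : 0x00;
    int mag = (x < 0) ? ~x : x;            /* negative: invert bits after the sign */
    int exp = 0;
    int mant, t;

    mag += ULAW_BIAS;
    if (mag > 8191)
        mag = 8191;                        /* assumption: clamp to 13 magnitude bits */
    for (t = mag >> 6; t != 0; t >>= 1)
        exp++;                             /* segment 0 covers 33..63 */
    mant = (mag >> (exp + 1)) & 0x0F;      /* four bits below the leading 1 */
    return (uint8_t)(sign | (exp << 4) | mant);
}

/* Expand an 8-bit mu-law code (before inversion) to a signed linear
 * value in the range -8031..8031, following y = (33 + 2m) * 2^e - 33. */
int ulaw_expand14(uint8_t code)
{
    int exp = (code >> 4) & 0x07;
    int mant = code & 0x0F;
    int mag = ((ULAW_BIAS + 2 * mant) << exp) - ULAW_BIAS;

    return (code & 0x80) ? -mag : mag;
}

/* Invert all bits for transmission. The same operation converts a
 * received octet back to the table code. */
uint8_t ulaw_to_wire(uint8_t code)
{
    return (uint8_t)(code ^ 0xFF);
}
```

Under these assumptions, a zero sample is sent as 0xFF and −1 as 0x7F, and both expand to 0, matching the behaviour noted in the text.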


G.711.0

G.711.0, also known as G.711 LLC, uses lossless data compression to reduce bandwidth usage by as much as 50 percent. The "Lossless compression of G.711 pulse code modulation" standard was approved by ITU-T in September 2009.


G.711.1

G.711.1, "Wideband embedded extension for G.711 pulse code modulation", is a higher-fidelity extension to G.711, ratified in 2008 and further extended in 2012.

G.711.1 allows a series of enhancement layers on top of a raw G.711 core stream (Layer 0): Layer 1 codes 16-bit audio in the same 4 kHz narrowband, and Layer 2 allows 8 kHz wideband using MDCT; each uses a fixed 16 kbit/s in addition to the 64 kbit/s core. They may be used together or singly, and each encodes the differences from the previous layer. Ratified in 2012, Layer 3 extends Layer 2 to 16 kHz "superwideband", allowing another 16 kbit/s for the highest frequencies while retaining layer independence. The peak bitrate is 96 kbit/s in the original G.711.1, or 112 kbit/s with superwideband.

No internal method of identifying or separating the layers is defined, leaving it to the implementation to packetize or signal them. A decoder that does not understand a given set of fidelity layers may ignore or drop the non-core packets without affecting the core stream, enabling graceful degradation across any G.711 (or original G.711.1) telephony system with no changes.

Also ratified in 2012 was an extension of G.711.0 lossless compression to the new fidelity layers. As with G.711.0, full G.711 backward compatibility is sacrificed for efficiency, though a G.711.0-aware node may still ignore or drop layer packets it does not understand.
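
The layer arithmetic described above (a 64 kbit/s core plus 16 kbit/s for each enhancement layer present) can be sketched in C as below. The flag and function names are hypothetical, and the sketch only adds up rates; it does not check which layer combinations the Recommendation actually permits.

```c
/* Hypothetical bit flags for the enhancement layers named in the text. */
enum {
    G7111_LAYER1 = 1, /* narrowband enhancement */
    G7111_LAYER2 = 2, /* wideband enhancement */
    G7111_LAYER3 = 4  /* superwideband enhancement (2012 extension) */
};

/* Total bitrate in kbit/s for a core stream plus the selected layers. */
unsigned g7111_bitrate_kbps(unsigned layers)
{
    unsigned rate = 64;               /* Layer 0: the raw G.711 core */

    if (layers & G7111_LAYER1)
        rate += 16;
    if (layers & G7111_LAYER2)
        rate += 16;
    if (layers & G7111_LAYER3)
        rate += 16;
    return rate;
}
```

Selecting Layers 1 and 2 gives the 96 kbit/s peak of the original G.711.1, and adding Layer 3 gives 112 kbit/s.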


Licensing

The patents for G.711, released in 1972, have expired, so it may be used without the need for a licence.


See also

* List of codecs
* Comparison of audio coding formats
* RTP audio video profile
* Au file format


External links


* ITU-T Recommendation G.711
* ITU-T G.191 software tools for speech and audio coding, including G.711 C code
* Code Project C# implementation of G.711 with source code
* RFC 3551 - RTP Profile for Audio and Video Conferences with Minimal Control (G.711 PCMA and PCMU definition)
* RFC 4856 - Registration of Media Type audio/PCMA and audio/PCMU
* RTP Payload Format for ITU-T Recommendation G.711.1 (PCMA-WB and PCMU-WB)