Unified Speech and Audio Coding

Unified Speech and Audio Coding (USAC) is an audio compression format and codec for music, speech, or any mix of the two at very low bit rates between 12 and 64 kbit/s. It was developed by the Moving Picture Experts Group (MPEG) and was published as the international standard ISO/IEC 23003-3 (a.k.a. MPEG-D Part 3) and also as an MPEG-4 Audio Object Type in ISO/IEC 14496-3:2009/Amd 3 in 2012. It uses time-domain linear prediction and residual coding tools (ACELP-like techniques) for speech signal segments and transform coding tools (MDCT-based techniques) for music signal segments, and it can switch between the two tool sets dynamically in a signal-responsive manner. It was designed as a single, unified coder whose performance equals or surpasses that of dedicated speech coders and dedicated music coders over a broad range of bitrates. Enhanced variations of the MPEG-4 Spectral Band Replication (SBR) and MPEG-D MPEG Surround parametric coding tools are integrated into the USAC codec.
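
The signal-responsive switching between the two tool sets can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the mode names, the per-frame "speech-likeness" score (which would come from some classifier not shown here), the threshold, and the hold rule that delays a switch until several consecutive frames request it. It is not the classification method of any real USAC encoder; it only shows the general shape of a per-frame mode decision.

```python
from typing import Iterable, List, Optional

# Hypothetical mode names chosen for this sketch.
LINEAR_PREDICTION = "linear-prediction"  # ACELP-like path for speech-like frames
TRANSFORM = "transform"                  # MDCT-based path for music-like frames


def choose_modes(speech_scores: Iterable[float],
                 threshold: float = 0.5,
                 hold: int = 2) -> List[str]:
    """Pick a coding mode per frame from a speech-likeness score in [0, 1].

    A mode change is accepted only after `hold` consecutive frames ask for
    it, so a single outlier frame does not cause a switch (an assumption of
    this sketch, not a rule taken from the standard).
    """
    modes: List[str] = []
    current: Optional[str] = None
    pending: Optional[str] = None
    run = 0
    for score in speech_scores:
        wanted = LINEAR_PREDICTION if score >= threshold else TRANSFORM
        if current is None:
            current = wanted              # first frame: take its mode directly
        elif wanted == current:
            pending, run = None, 0        # request withdrawn, reset the counter
        else:
            if wanted == pending:
                run += 1
            else:
                pending, run = wanted, 1
            if run >= hold:
                current, pending, run = wanted, None, 0
        modes.append(current)
    return modes


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.2, 0.9, 0.1, 0.1, 0.1, 0.95]
    for index, mode in enumerate(choose_modes(scores)):
        print(index, mode)
```

With the example scores, the isolated low score in the third frame does not trigger a switch, while the run of low scores later in the sequence does.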


Extended HE-AAC

The MPEG-D USAC standard (ISO/IEC 23003-3) defines the Extended High Efficiency AAC profile, which contains all of the tools of the HE-AAC v2 profile plus the mono/stereo capabilities of the Baseline USAC profile. As a result, a decoder built according to the Extended High Efficiency AAC profile can also decode the bit streams created for the previous AAC family profiles. The profile was designed for applications that rely on consistent performance at low data rates while being able to decode all existing AAC-LC, HE-AAC and HE-AAC v2 content.


xHE-AAC

Fraunhofer has defined the xHE-AAC codec as the combination of the Extended High Efficiency AAC profile and appropriate parts of the MPEG-D DRC Loudness Control Profile or Dynamic Range Control Profile. xHE-AAC extends the operating range of the codec so that it spans 12 to 300 kbit/s for stereo signals and allows seamless switching between bitrates over this range for adaptive bitrate delivery (using standards such as MPEG-DASH or HLS). xHE-AAC also includes mandatory MPEG-D DRC loudness control to play back content at a consistent volume and offers new dynamic range control profiles for listening in noisy situations. While xHE-AAC decoders can decode the bit streams created for the previous AAC family profiles, xHE-AAC encoders are typically intended for encoding the MPEG-D USAC audio object type (AOT 42) with MPEG-D DRC loudness metadata, though some may also support encoding legacy AAC object types.

xHE-AAC is a mandatory audio codec in the Digital Radio Mondiale standard and is a trademark of Fraunhofer. In April 2016, Via Licensing announced the launch of an xHE-AAC patent pool licensing program for 2016. In 2018, xHE-AAC was included in Via Licensing's AAC patent pool at no additional cost.

In January 2021, Fraunhofer announced a test service and trademark program for xHE-AAC and announced that the codec was being used by Netflix. Netflix reported that users switched from speakers to headphones 16% less often (due to poor sound quality or inadequate volume) on high dynamic range content when using xHE-AAC instead of HE-AAC. Netflix also explained that xHE-AAC allowed it to begin streaming with adaptive audio bitrates to Android devices. Fraunhofer also announced xHE-AAC licenses to MainConcept, Poikosoft, and LG. xHE-AAC is supported by the Bento4 DASH/HLS packager. In January 2022, MainConcept established a web encoding service for testing xHE-AAC. In October 2022, xHE-AAC decoding was added to Windows 11 and Xbox devices.
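
As an illustration of how a player might recognise USAC content by its audio object type, the sketch below reads the object type from the first bits of an MPEG-4 AudioSpecificConfig. It relies on the escape coding defined in ISO/IEC 14496-3: the type is a 5-bit field, and the value 31 signals that the real type is 32 plus the following 6-bit field, so AOT 42 is always written in escaped form. The function name, the small name table, and the example byte strings are choices made for this sketch; the second example contains only the leading object-type bits followed by padding and is not a complete USAC configuration.

```python
# Object type numbers from ISO/IEC 14496-3; the selection here is illustrative.
AOT_NAMES = {
    2: "AAC LC",
    5: "SBR (HE-AAC)",
    29: "PS (HE-AAC v2)",
    42: "USAC",
}


def read_audio_object_type(config: bytes) -> int:
    """Return the audio object type at the start of an AudioSpecificConfig."""
    if len(config) < 2:
        raise ValueError("need at least two bytes to read the object type")
    bits = int.from_bytes(config[:2], "big")  # first 16 bits of the config
    aot = bits >> 11                           # top 5 bits
    if aot == 31:                              # escape value: type is 32 or above
        aot = 32 + ((bits >> 5) & 0x3F)        # add the next 6 bits
    return aot


if __name__ == "__main__":
    examples = {
        "AAC-LC, 44.1 kHz stereo": bytes.fromhex("1210"),
        "escaped object type (padding after it)": bytes.fromhex("f940"),
    }
    for label, config in examples.items():
        aot = read_audio_object_type(config)
        print(label, "->", aot, AOT_NAMES.get(aot, "other"))
```

A decoder conforming to the Extended High Efficiency AAC profile would be expected to accept every object type in the table, whereas a legacy AAC decoder would reject type 42.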


Compatibility

xHE-AAC has been supported in Android since Android Pie and in iOS since iOS 13. Support was announced for watchOS 7, and the codec has been licensed to Microsoft. foobar2000 can play xHE-AAC audio files with an add-on AAC decoder.


See also

* Opus (codec), a royalty-free, low-latency alternative codec for similar uses



External links


* Fraunhofer xHE-AAC Website
* Fraunhofer AAC Audio Playback Test Site
* xHE-AAC encoder for Windows 7/8/10/11
* Fraunhofer xHE-AAC Codec Test Service
* Netflix Tech Blog: Optimizing the Aural Experience on Android Devices with xHE-AAC