Opus (audio format)

Opus is a lossy audio coding format developed by the Xiph.Org Foundation and standardized by the Internet Engineering Task Force, designed to efficiently code speech and general audio in a single format, while remaining low-latency enough for real-time interactive communication and low-complexity enough for low-end embedded processors. Opus replaces both Vorbis and Speex for new applications, and several blind listening tests have ranked it higher-quality than any other standard audio format at any given bitrate until transparency is reached, including MP3, AAC, and HE-AAC.

Opus combines the speech-oriented LPC-based SILK algorithm and the lower-latency MDCT-based CELT algorithm, switching between or combining them as needed for maximal efficiency. Bitrate, audio bandwidth, complexity, and algorithm can all be adjusted seamlessly in each frame. Opus has the low algorithmic delay (26.5 ms by default) necessary for real-time communication links, networked music performances, and live lip sync; by trading off quality or bitrate, the delay can be reduced down to 5 ms. Its delay is exceptionally low compared to competing codecs, which require well over 100 ms, yet Opus performs very competitively with these formats in terms of quality per bitrate.

Opus is an open format standardized through RFC 6716, and a reference implementation called libopus is available under the New BSD License. The reference has both fixed-point and floating-point optimizations for low- and high-end devices, with SIMD optimizations on platforms that support them. All known software patents that cover Opus are licensed under royalty-free terms. Opus is widely used as the voice-over-IP (VoIP) codec in applications such as WhatsApp and the PlayStation 4.


Features

Opus supports constant and variable bitrate encoding from 6 kbit/s to 510 kbit/s (or up to 256 kbit/s per channel for multi-channel tracks), frame sizes from 2.5 ms to 60 ms, and five sampling rates from 8 kHz (with 4 kHz bandwidth) to 48 kHz (with 20 kHz bandwidth, the human hearing range). An Opus stream can support up to 255 audio channels, and it allows channel coupling between channels in groups of two using mid-side coding.

Opus has very short latency (26.5 ms using the default 20 ms frames and default application setting), which makes it suitable for real-time applications such as telephony, Voice over IP and videoconferencing; research by Xiph led to the CELT codec, which allows the highest quality while maintaining low delay. In any Opus stream, the bitrate, bandwidth, and delay can be continually varied without introducing any distortion or discontinuity; even mixing packets from different streams will cause a smooth change, rather than the distortion common in other codecs. Unlike Vorbis, Opus does not require large codebooks for each individual file, making it more efficient for short clips of audio and more resilient.

Because Opus is an open standard, its algorithms are openly documented, and a reference implementation (including the source code) is published.
Broadcom and the Xiph.Org Foundation own software patents on some of the CELT algorithms, and Skype Technologies/Microsoft own some on the SILK algorithms; each offers a royalty-free perpetual license for use with Opus, reserving only the right to make use of their patents to defend against infringement suits of third parties. Qualcomm, Huawei, France Telecom, and Ericsson have claimed that their patents may apply, which Xiph's legal counsel denies, and none have pursued any legal action. The Opus license automatically and retroactively terminates for any entity that attempts to file a patent suit.

The Opus format is based on a combination of the full-bandwidth
CELT format and the speech-oriented SILK format, both heavily modified: CELT is based on the modified discrete cosine transform (MDCT) that most music codecs use, using CELP techniques in the frequency domain for better prediction, while SILK uses linear predictive coding (LPC) and an optional Long-Term Prediction filter to model speech. In Opus, both were modified to support more frame sizes, as well as further algorithmic improvements and integration, such as using CELT's range encoder for both types. To minimize overhead at low bitrates, if latency is not as pressing, SILK has support for packing multiple 20 ms frames together, sharing context and headers; SILK also allows Low Bit-Rate Redundancy (LBRR) frames, allowing low-quality packet loss recovery. CELT includes both spectral replication and noise generation, similar to AAC's SBR and PNS, and can further save bits by filtering out all harmonics of tonal sounds entirely, then replicating them in the decoder. Better tone detection is an ongoing project to improve quality.

The format has three different modes: speech, hybrid, and CELT. When compressing speech, SILK is used for audio frequencies up to 8 kHz. If wider bandwidth is desired, a hybrid mode uses CELT to encode the frequency range above 8 kHz. The third mode is pure-CELT, designed for general audio. SILK is inherently VBR and cannot hit a bitrate target, while CELT can always be encoded to any specific number of bytes, enabling hybrid and CELT mode when CBR is required.

SILK supports frame sizes of 10, 20, 40 and 60 ms. CELT supports frame sizes of 2.5, 5, 10 and 20 ms. Thus, hybrid mode only supports frame sizes of 10 and 20 ms; frames shorter than 10 ms will always use CELT mode. A typical Opus packet contains a single frame, but packets of up to 120 ms are produced by combining multiple frames per packet. Opus can transparently switch between modes, frame sizes, bandwidths, and channel counts on a per-packet basis, although specific applications may choose to limit this.

The reference implementation is written in C and compiles on hardware architectures with or without a floating-point unit, although floating-point is currently required for audio bandwidth detection (dynamic switching between SILK, CELT, and hybrid encoding) and most speed optimizations.
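
The frame-size rules above can be written down as a small lookup. This is only an illustrative sketch in Python built from the numbers quoted in this section; the names (`modes_for_frame`, `supports_cbr`, `max_frames_per_packet`) are hypothetical and are not part of libopus.

```python
# Frame sizes (in ms) quoted in the text for each layer.
SILK_FRAME_MS = {10, 20, 40, 60}
CELT_FRAME_MS = {2.5, 5, 10, 20}
# Hybrid mode needs both layers, so only the common sizes remain.
HYBRID_FRAME_MS = SILK_FRAME_MS & CELT_FRAME_MS

MAX_PACKET_MS = 120  # longest packet duration mentioned in the text


def modes_for_frame(frame_ms):
    """Return the modes (speech, hybrid, celt) usable at this frame size."""
    modes = []
    if frame_ms in SILK_FRAME_MS:
        modes.append("speech")   # SILK only
    if frame_ms in HYBRID_FRAME_MS:
        modes.append("hybrid")   # SILK up to 8 kHz, CELT above
    if frame_ms in CELT_FRAME_MS:
        modes.append("celt")     # pure CELT
    return modes


def supports_cbr(mode):
    """SILK alone is inherently VBR; modes that include CELT can hit a byte target."""
    return mode in ("hybrid", "celt")


def max_frames_per_packet(frame_ms):
    """How many frames of this size fit into the 120 ms packet limit."""
    return int(MAX_PACKET_MS // frame_ms)


if __name__ == "__main__":
    for size in (2.5, 5, 10, 20, 40, 60):
        print(size, modes_for_frame(size), max_frames_per_packet(size))
```

For example, `modes_for_frame(2.5)` yields only the CELT mode, which matches the statement that frames shorter than 10 ms always use CELT.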


Containers

Opus packets are not self-delimiting, but are designed to be used inside a container of some sort which supplies the decoder with each packet's length. Opus was originally specified for encapsulation in Ogg containers, specified as audio/ogg; codecs=opus, and for Ogg Opus files the .opus filename extension is recommended. Opus streams are also supported in Matroska, WebM, MPEG-TS, and MP4.

Alternatively, each Opus packet may be wrapped in a network packet which supplies the packet length. Opus packets may be sent over an ordered datagram protocol such as RTP.

An optional self-delimited packet format is defined in an appendix to the specification. This uses one or two additional bytes per packet to encode the packet length, allowing packets to be concatenated without encapsulation.
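
The idea of a one-or-two-byte length field can be sketched as follows. The sketch assumes the frame-length coding described in RFC 6716, where a first byte of 0 to 251 is the length itself and a first byte of 252 to 255 is followed by a second byte, giving a length of second*4 + first. The simple length-prefixed stream built around it is a hypothetical layout chosen for illustration, not the RFC's exact self-delimiting framing, and all function names are invented here.

```python
from typing import List, Tuple

MAX_LENGTH = 1275  # largest value the two-byte form can express (255*4 + 255)


def encode_length(n: int) -> bytes:
    """Encode a length in one byte (0..251) or two bytes (252..1275)."""
    if not 0 <= n <= MAX_LENGTH:
        raise ValueError("length out of range")
    if n < 252:
        return bytes([n])
    first = 252 + (n % 4)        # 252 is a multiple of 4, so this keeps n % 4
    second = (n - first) // 4
    return bytes([first, second])


def decode_length(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (length, number of length bytes consumed) starting at pos."""
    first = data[pos]
    if first < 252:
        return first, 1
    if pos + 1 >= len(data):
        raise ValueError("truncated length field")
    return data[pos + 1] * 4 + first, 2


def join_packets(packets: List[bytes]) -> bytes:
    """Hypothetical framing: prefix each packet with its encoded length."""
    return b"".join(encode_length(len(p)) + p for p in packets)


def split_stream(stream: bytes) -> List[bytes]:
    """Recover the packets from a stream produced by join_packets."""
    packets, pos = [], 0
    while pos < len(stream):
        length, used = decode_length(stream, pos)
        pos += used
        if pos + length > len(stream):
            raise ValueError("truncated packet")
        packets.append(stream[pos:pos + length])
        pos += length
    return packets
```

Short packets cost one extra byte and longer ones two, which is the overhead the paragraph above refers to.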


Bandwidth and sampling rate

Opus allows five audio bandwidths during encoding, ranging from narrowband (4 kHz) to fullband (20 kHz). Opus compression does not depend on the input sample rate; timestamps are measured in 48 kHz units even if the full bandwidth is not used. Likewise, the output sample rate may be freely chosen. For example, audio can be input at 16 kHz yet be set to encode only narrowband audio.
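
As a rough illustration of how the fixed 48 kHz timestamp clock relates to frame durations and input rates, the arithmetic can be sketched in Python. The bandwidth names and effective sample rates in the table are assumed to be those listed in RFC 6716 and should be checked against the specification; the function names are hypothetical.

```python
OPUS_CLOCK_HZ = 48_000  # timestamps always tick at 48 kHz

# name: (audio bandwidth in Hz, effective sample rate in Hz)
# Assumed from RFC 6716; only the 4 kHz and 20 kHz endpoints appear in this article.
BANDWIDTHS = {
    "narrowband": (4_000, 8_000),
    "mediumband": (6_000, 12_000),
    "wideband": (8_000, 16_000),
    "super-wideband": (12_000, 24_000),
    "fullband": (20_000, 48_000),
}


def timestamp_ticks(frame_ms: float) -> int:
    """Duration of one frame expressed in 48 kHz timestamp units."""
    return round(frame_ms * OPUS_CLOCK_HZ / 1000)


def input_samples(frame_ms: float, input_rate_hz: int) -> int:
    """Number of input samples per channel that make up one frame."""
    return round(frame_ms * input_rate_hz / 1000)


if __name__ == "__main__":
    # A 20 ms frame fed at 16 kHz and coded as narrowband still
    # advances the timestamp by the same number of 48 kHz ticks.
    print(input_samples(20, 16_000), timestamp_ticks(20))
```

The point of the sketch is that the timestamp step depends only on the frame duration, not on the input rate or the coded bandwidth.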


History

Opus was proposed to the IETF for standardization as a new audio format, and the proposal was eventually accepted by the codec working group. It is based on two initially separate standard proposals from the Xiph.Org Foundation and Skype Technologies S.A. (now Microsoft). Its main developers are Jean-Marc Valin (Xiph.Org, Octasic, Mozilla Corporation), Koen Vos (Skype), and Timothy B. Terriberry (Xiph.Org, Mozilla Corporation). Among others, Juin-Hwey (Raymond) Chen (Broadcom), Gregory Maxwell (Xiph.Org, Wikimedia), and Christopher Montgomery (Xiph.Org) were also involved.

The development of the CELT part of the format goes back to thoughts on a successor for Vorbis under the working name Ghost. As a newer speech codec from the Xiph.Org Foundation, Opus replaces Xiph's older speech codec Speex, an earlier project of Jean-Marc Valin. CELT has been worked on since November 2007.

The SILK part has been under development at Skype since January 2007 as the successor of their SVOPC, an internal project to make the company independent from third-party codecs like iSAC and iLBC and respective license payments.

In March 2009, Skype suggested the development and standardization of a wideband audio format within the IETF. Nearly a year passed with much debate on the formation of an appropriate working group. Representatives of several companies which were taking part in the standardization of patent-encumbered competing formats, including Polycom and Ericsson (the creators and licensors of G.719) as well as France Télécom, Huawei and the Orange Labs (department of France Télécom), which were involved in the creation of G.718, stated objections against the start of the standardization process for a royalty-free format. (Some of the opponents would later claim patent rights that Xiph dismissed; see above.) The working group finally formed in February 2010, and even the corresponding Study Group 16 from the ITU-T pledged to support its work.

In July 2010, a prototype of a hybrid format was presented that combined the two proposed format candidates SILK and CELT. In September 2010, Opus was submitted to the IETF as a proposal for standardization. For a short time the format went under the name of Harmony before it got its present name in October 2010. At the beginning of February 2011, the bitstream format was tentatively frozen, subject to last changes. Near the end of July 2011, Jean-Marc Valin was hired by the Mozilla Corporation to continue working on Opus.


Finalization (1.0)

In November 2011, the working group issued the last call for changes on the bitstream format. The bitstream has been frozen since January 8, 2012. On July 2, 2012, Opus was approved by the IETF for standardization. The reference software entered release candidate state on August 8, 2012. The final specification was released as RFC 6716 on September 10, 2012, and versions 1.0 and 1.0.1 of the reference implementation libopus were released the day after. On July 11, 2013, libopus 1.0.3 brought bug fixes and a new surround sound API that improves channel allocation and quality, especially for low-frequency effects (LFE).


1.1

On December 5, 2013, libopus 1.1 was released, incorporating overall speed improvements and significant encoder quality improvements: tonality estimation boosts bitrate and quality for previously problematic samples like harpsichords; automated speech/music detection improves quality in mixed audio; mid-side stereo reduces the bitrate needs of many songs; band precision boosting improves transients; and DC below 3 Hz is rejected. Two new VBR modes were added: unconstrained for more consistent quality, and temporal VBR that boosts louder frames and generally improves quality. libopus 1.1.1 was released on November 26, 2015, and 1.1.2 on January 12, 2016, both adding speed optimizations and bug fixes. Version 1.1.3, released on July 15, 2016, includes bug fixes, optimizations, documentation updates and experimental Ambisonics work.


1.2

libopus 1.2 Beta was released on May 24, 2017. libopus 1.2 was released on June 20, 2017. Improvements brought in 1.2 allow it to create fullband music at bit rates as low as 32 kbit/s, and wideband speech at just 12 kbit/s. libopus 1.2 includes optional support for the decoder specification changes made in drafts of RFC 8251, improving the quality of output from such low-rate streams.


1.3

libopus 1.3 was released on October 18, 2018. The Opus 1.3 major release again brings quality improvements, new features, and bug fixes. Changes since 1.2.x include:
* Improvements to voice activity detection (VAD) and speech/music classification using a recurrent neural network (RNN)
* Support for ambisonics coding using channel mapping families 2 and 3
* Improvements to stereo speech coding at low bitrate
* Using wideband speech encoding down to 9 kb/s (mediumband is no longer used)
* Making it possible to use SILK down to bitrates around 5 kb/s
* Minor quality improvement on tones
* Enabling the spec fixes in RFC 8251 by default
* Security/hardening improvements

Notable bug fixes include:
* Fixes to the CELT packet loss concealment (PLC)
* Bandwidth detection fixes


1.3.1

libopus 1.3.1 was released on April 12, 2019. This Opus 1.3.1 minor release fixes an issue with the analysis on files with digital silence (all zeros), especially on x87 builds (mostly affects 32-bit builds). It also includes two new features:
* A new OPUS_GET_IN_DTX query to know if the encoder is in DTX mode (last frame was either a comfort noise frame or not encoded at all)
* A new (and still experimental) CMake-based build system that is eventually meant to replace the VS2015 build system (the autotools one will stay)


Quality comparison and low-latency performance

Opus performs well at both low and high bit rates. In listening tests around 64 kbit/s, Opus shows superior quality compared to HE-AAC codecs, which were previously dominant due to their use of the patented spectral band replication (SBR) technology (Next-Gen Low-Latency Open Codec Beats HE-AAC, Slashdot story of April 14, 2011). In listening tests around 96 kbit/s, Opus shows slightly superior quality compared to AAC and significantly better quality compared to Vorbis and MP3.

Opus has very low algorithmic delay, a necessity for use as part of a low-audio-latency communication link, which can permit natural conversation, networked music performances, or lip sync
at live events. Total algorithmic delay for an audio format is the sum of delays that must be incurred in the encoder and the decoder of a live audio stream regardless of processing speed and transmission speed, such as buffering audio samples into blocks or frames, allowing for window overlap and possibly allowing for noise-shaping look-ahead in a decoder and any other forms of look-ahead, or, for an MP3 encoder, the use of a bit reservoir.

Total one-way latency below 150 ms is the preferred target of most VoIP systems, to enable natural conversation with turn-taking little affected by delay. Musicians typically feel in-time with up to around 30 ms audio latency, roughly in accord with the fusion time of the Haas effect, though matching playback delay of each user's own instrument to the round-trip latency can also help. For lip sync, it is suggested that around 45–100 ms audio latency may be acceptable.

Opus permits trading off reduced quality or increased bitrate to achieve an even smaller algorithmic delay (5.0 ms minimum). While the reference implementation's default Opus frame is 20.0 ms long, the SILK layer requires a further 5.0 ms lookahead plus 1.5 ms for resampling, giving a default delay of 26.5 ms. When the CELT layer is active, it requires 2.5 ms lookahead for window overlap, to which a matching delay of 4.0 ms is added by default to synchronize with the SILK layer. If the encoder is instantiated in the special ''restricted low delay'' mode, the 4.0 ms matching delay is removed and the SILK layer is disabled, permitting the minimal algorithmic delay of 5.0 ms.
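The delay figures above can be checked with a little arithmetic. The following Python sketch is only an illustration, not part of libopus: the function name `algorithmic_delay_ms` and the constant names are hypothetical, and it assumes that the 5.0 ms minimum corresponds to the shortest Opus frame size of 2.5 ms.

```python
# Delay components in milliseconds, as quoted in the text above.
SILK_LOOKAHEAD_MS = 5.0        # SILK layer lookahead
RESAMPLER_DELAY_MS = 1.5       # resampling delay
CELT_LOOKAHEAD_MS = 2.5        # CELT window overlap
CELT_MATCHING_DELAY_MS = 4.0   # added by default to line CELT up with SILK


def algorithmic_delay_ms(frame_ms: float = 20.0,
                         restricted_low_delay: bool = False) -> float:
    """Return the total algorithmic delay for a given frame length.

    Hypothetical helper for illustration only; it mirrors the
    arithmetic in the text, not any libopus API.
    """
    if restricted_low_delay:
        # SILK is disabled and the matching delay is removed,
        # so only the CELT window overlap remains.
        return frame_ms + CELT_LOOKAHEAD_MS
    # Default mode: the SILK lookahead plus resampling delay (6.5 ms),
    # which the CELT path matches with 2.5 ms + 4.0 ms.
    return frame_ms + SILK_LOOKAHEAD_MS + RESAMPLER_DELAY_MS


if __name__ == "__main__":
    print(algorithmic_delay_ms())                 # default 20 ms frame
    print(algorithmic_delay_ms(2.5, True))        # assumed 2.5 ms frame
```

Under these assumptions the default case gives 20.0 + 5.0 + 1.5 ms, and the restricted low-delay case with a 2.5 ms frame gives 2.5 + 2.5 ms. The CELT path in default mode (2.5 + 4.0 ms) adds up to the same 6.5 ms as the SILK path, which is why the two layers stay synchronized.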


Support

The format and algorithms are openly documented and the reference implementation is published as free software. Xiph's reference implementation is called ''libopus'' and a package called ''opus-tools'' provides command-line encoder and decoder utilities. It is published under the terms of a BSD-like license. It is written in C and can be compiled for hardware architectures with or without a floating-point unit. The accompanying diagnostic tool ''opusinfo'' reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ''ogginfo'' from the ''vorbis-tools'' and therefore, unlike the encoder and decoder, is available under the terms of version 2 of the GNU General Public License (GPL).


Implementations

The Opus specification RFC contains complete source code for the reference implementation, written in C. A follow-up RFC contains errata. The FFmpeg project has encoder and decoder implementations not derived from the reference library. The libopus reference library has been ported to both C# and Java as part of a project called Concentus. These ports sacrifice performance for the sake of being easily integrated into cross-platform applications.


Software

Digital Radio Mondiale, a digital radio format for AM frequencies, can broadcast and receive Opus audio (although this is not recognised in the official standard) using the Dream software-defined radio. The Wikimedia Foundation sponsored a free and open source online JavaScript Opus encoder for browsers supporting the required HTML5 features. Since 2016, WhatsApp has been using Opus as its audio file format. Signal switched from Speex to the Opus audio codec for better audio quality at the beginning of 2017.


Operating system support

Most end-user software relies on multimedia frameworks provided by the operating system. Native Opus codec support is implemented in most major multimedia frameworks for Unix-like operating systems, including GStreamer, FFmpeg, and Libav libraries.

Google added native support for Opus audio playback in Android 5.0 "Lollipop". However, it was limited to Opus audio encapsulated in Matroska containers, such as .mkv and .webm files. Android 7.0 "Nougat" introduced support for Opus audio encapsulated in .ogg containers. Android 10 finally added native support for .opus extensions (Support Opus in the MediaScanner (37054258), Google Issue Tracker).

Due to the addition of WebRTC support in Apple's WebKit rendering engine, macOS High Sierra and iOS 11 come with native playback support for Opus audio encapsulated in Core Audio Format containers.

On Windows 10 version 1607, Microsoft provided native support for Opus audio encapsulated in Matroska and WebM files. In version 1709, support for Opus audio encapsulated in .ogg containers was made available through a pre-installed add-on called Web Media Extensions. In Windows 10 version 1903, native support for the .opus container was added. On Windows 8.1 and older, third-party decoders, such as LAV Filters, are available to provide support for the format.
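Because platform support depends on the container rather than on the Opus bitstream itself, it can help to check which container a file actually uses before blaming the codec. The following Python sketch is an illustration only and is not taken from any of the platforms mentioned; the function name `sniff_container` and its return labels are hypothetical. It assumes the commonly documented file signatures: `OggS` for Ogg pages, the EBML header bytes `1A 45 DF A3` shared by Matroska and WebM, `caff` for Core Audio Format, and `OpusHead` at the start of the first packet of an Ogg Opus stream.

```python
def sniff_container(data: bytes) -> str:
    """Guess the container of an audio file from its first bytes.

    Hypothetical helper for illustration; real players do far more
    thorough parsing than this signature check.
    """
    if data[:4] == b"OggS":
        # An Ogg page header is 27 bytes, followed by a segment table
        # whose length is stored in byte 26; the payload comes next.
        if len(data) >= 27:
            n_segments = data[26]
            payload = data[27 + n_segments:]
            if payload[:8] == b"OpusHead":
                return "ogg-opus"   # typically .opus or .ogg
        return "ogg"                # Ogg, but first packet is not Opus
    if data[:4] == b"\x1a\x45\xdf\xa3":
        return "matroska-or-webm"   # .mkv / .webm share the EBML header
    if data[:4] == b"caff":
        return "caf"                # Core Audio Format
    return "unknown"


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        print(sniff_container(f.read(512)))
```

The sketch only distinguishes containers; a Matroska, WebM or CAF file may carry a codec other than Opus, so a positive match there says nothing about the audio track inside.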


Media player support

While support in multimedia frameworks automatically enables Opus support in software built on top of such frameworks, several application developers made additional efforts to support the Opus audio format in their software. Such support was added to the AIMP, Amarok, cmus, Music Player Daemon, foobar2000, Mpxplay, MusicBee, SMplayer, VLC media player, Winamp and Xmplay audio players; the Icecast and Airtime audio streaming software; and the Asunder audio CD ripper, CDBurnerXP CD burner, and FFmpeg, Libav and MediaCoder media encoding tools. Streaming Icecast radio trials have been live since September 2012 and January 2013. SteamOS uses Opus or Vorbis for streaming audio.


Browser support

Opus support is mandatory for WebRTC implementations. Opus is supported in Mozilla Firefox, Chromium and Google Chrome, Blink-based Opera, as well as all browsers for Unix-like systems relying on GStreamer for multimedia format support. Although Internet Explorer will not provide Opus playback natively, support for the format is built into the Edge browser, along with VP9, for full WebM support. Safari supports Opus as of iOS 11 and macOS High Sierra.


VoIP support

Due to its abilities, Opus gained early interest from voice-over-IP (VoIP) software vendors. Several SIP clients, including Acrobits Softphone, CSipSimple (via additional plug-in), Empathy (via GStreamer), Jitsi, Tuenti, Line2 (currently only on iOS), Linphone, Phoner and PhonerLite, SFLphone, and Telephone, as well as the Mumble, Discord and TeamSpeak 3 voice chat software, also support Opus. TrueConf supports Opus in its VoIP products. Asterisk lacked built-in Opus support for legal reasons, but a third-party patch was available for download and official support via a binary blob was added in September 2016. The Tox P2P videoconferencing software uses Opus exclusively. The Classified-ads distributed messaging app sends raw Opus frames inside a TLS socket in its VoIP implementation.

Opus is widely used as the voice codec in WhatsApp, which has over 1.5 billion users worldwide. WhatsApp uses Opus at 8–16 kHz sampling rates, with the Real-time Transport Protocol (RTP). The PlayStation 4 video game console also uses the CELT/Opus codec for its PlayStation Network system party chat. Opus is also used in the Zoom videoconferencing app.


Hardware

Since version 3.13, Rockbox enables Opus playback on supported portable media players, including some products from the iPod series by Apple, devices made by iriver, Archos and Sandisk, and on Android devices using "Rockbox as an Application". All recent Grandstream IP phones support Opus audio for both encoding and decoding. The OBihai OBi1062, OBi1032 and OBi1022 IP phones all support Opus. Recent BlueSound wireless speakers support Opus playback. Devices running Hiby OS, like the Hiby R3, are capable of decoding Opus files natively. Many broadcast IP codecs include Opus, such as those manufactured by Comrex, GatesAir and Tieline.


References


Sources

* This article contains quotations from the Opus Codec website, which is available under the Creative Commons Attribution 3.0 (CC BY 3.0) license.


External links

* Opus on Hydrogenaudio Knowledgebase


See also

* Comparison of audio coding formats
* Streaming media
* xHE-AAC