Opus is a lossy audio coding format developed by the Xiph.Org Foundation and standardized by the Internet Engineering Task Force, designed to efficiently code speech and general audio in a single format, while remaining low-latency enough for real-time interactive communication and low-complexity enough for low-end embedded processors.
Opus replaces both Vorbis and Speex for new applications, and several blind listening tests have ranked it higher-quality than any other standard audio format at any given bitrate until transparency is reached, including MP3, AAC, and HE-AAC.
Opus combines the speech-oriented LPC-based SILK algorithm and the lower-latency MDCT-based CELT algorithm, switching between or combining them as needed for maximal efficiency.
Bitrate, audio bandwidth, complexity, and algorithm can all be adjusted seamlessly in each frame. Opus has the low algorithmic delay (26.5 ms by default) necessary for use as part of a real-time communication link, networked music performances, and live lip sync; by trading off quality or bitrate, the delay can be reduced to 5 ms. Its delay is exceptionally low compared to competing codecs, which require well over 100 ms, yet Opus performs very competitively with these formats in terms of quality per bitrate.
Opus is an open format standardized through RFC 6716, and a reference implementation called libopus is available under the New BSD License. The reference has both fixed-point and floating-point optimizations for low- and high-end devices, with SIMD optimizations on platforms that support them. All known software patents that cover Opus are licensed under royalty-free terms.
Opus is widely used as the voice over IP (VoIP) codec in applications such as Discord, WhatsApp, and the PlayStation 4.
Features
Opus supports constant and variable bitrate encoding from 6 kbit/s to 510 kbit/s (or up to 256 kbit/s per channel for multi-channel tracks), frame sizes from 2.5 ms to 60 ms, and five sampling rates from 8 kHz (with 4 kHz bandwidth) to 48 kHz (with 20 kHz bandwidth, the human hearing range). An Opus stream can support up to 255 audio channels, and it allows channel coupling between channels in groups of two using mid-side coding.
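To make the bitrate and frame-size ranges above concrete, the following minimal sketch computes how many payload bytes one frame may occupy at a constant bitrate. The function name `cbr_frame_bytes` is a hypothetical helper for illustration, not part of libopus, and the set of frame sizes is taken from the figures quoted in this article.

```python
# Frame durations (ms) quoted in this article for SILK and CELT combined.
FRAME_SIZES_MS = (2.5, 5, 10, 20, 40, 60)


def cbr_frame_bytes(bitrate_bps: int, frame_ms: float) -> int:
    """Return the whole number of payload bytes per frame at a constant bitrate.

    bits per frame = bitrate (bit/s) * frame duration (s); divide by 8 for bytes.
    """
    if not 6_000 <= bitrate_bps <= 510_000:
        raise ValueError("bitrate outside the 6-510 kbit/s range")
    if frame_ms not in FRAME_SIZES_MS:
        raise ValueError("unsupported frame size")
    return int(bitrate_bps * frame_ms / 8000)


if __name__ == "__main__":
    for kbps in (6, 64, 510):
        print(kbps, "kbit/s, 20 ms ->", cbr_frame_bytes(kbps * 1000, 20), "bytes")
```

At the default 20 ms frame size this gives 15 bytes per frame at 6 kbit/s, 160 bytes at 64 kbit/s, and 1275 bytes at 510 kbit/s.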
Opus has very short latency (26.5 ms using the default 20 ms frames and default application setting), which makes it suitable for real-time applications such as telephony, Voice over IP and videoconferencing; research by Xiph led to the CELT codec, which allows the highest quality while maintaining low delay. In any Opus stream, the bitrate, bandwidth, and delay can be continually varied without introducing any distortion or discontinuity; even mixing packets from different streams will cause a smooth change, rather than the distortion common in other codecs. Unlike Vorbis, Opus does not require large codebooks for each individual file, making it more efficient for short clips of audio and more resilient.
The Opus format is based on a combination of the full-bandwidth CELT format and the speech-oriented SILK format, both heavily modified: CELT is based on the modified discrete cosine transform (MDCT) that most music codecs use, using CELP techniques in the frequency domain for better prediction, while SILK uses linear predictive coding (LPC) and an optional Long-Term Prediction filter to model speech. In Opus, both were modified to support more frame sizes, as well as further algorithmic improvements and integration, such as using CELT's range encoder for both types. To minimize overhead at low bitrates, if latency is not as pressing, SILK has support for packing multiple 20 ms frames together, sharing context and headers; SILK also allows Low Bit-Rate Redundancy (LBRR) frames, allowing low-quality packet loss recovery. CELT includes both spectral replication and noise generation, similar to AAC's SBR and PNS, and can further save bits by filtering out all harmonics of tonal sounds entirely, then replicating them in the decoder. Better tone detection is an ongoing project to improve quality.
The format has three different modes: speech, hybrid, and CELT. When compressing speech, SILK is used for audio frequencies up to 8 kHz. If wider bandwidth is desired, a hybrid mode uses CELT to encode the frequency range above 8 kHz. The third mode is pure-CELT, designed for general audio. SILK is inherently VBR and cannot hit a bitrate target, while CELT can always be encoded to any specific number of bytes, enabling hybrid and CELT mode when CBR is required.
SILK supports frame sizes of 10, 20, 40 and 60 ms. CELT supports frame sizes of 2.5, 5, 10 and 20 ms. Thus, hybrid mode only supports frame sizes of 10 and 20 ms; frames shorter than 10 ms will always use CELT mode. A typical Opus packet contains a single frame, but packets of up to 120 ms are produced by combining multiple frames per packet. Opus can transparently switch between modes, frame sizes, bandwidths, and channel counts on a per-packet basis, although specific applications may choose to limit this.
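The frame-size rules above can be sketched as simple set operations. This is an illustrative model only: the names `modes_for_frame` and `max_frames_per_packet` are hypothetical, and a real encoder also weighs bitrate, bandwidth and signal type when choosing a mode.

```python
# Frame sizes in milliseconds, as listed in the paragraph above.
SILK_FRAME_MS = {10, 20, 40, 60}
CELT_FRAME_MS = {2.5, 5, 10, 20}
# Hybrid mode needs both layers, so only sizes common to both are usable.
HYBRID_FRAME_MS = SILK_FRAME_MS & CELT_FRAME_MS
MAX_PACKET_MS = 120  # longest packet duration mentioned above


def modes_for_frame(frame_ms):
    """List the modes that can carry a frame of the given duration."""
    modes = []
    if frame_ms in SILK_FRAME_MS:
        modes.append("silk")
    if frame_ms in HYBRID_FRAME_MS:
        modes.append("hybrid")
    if frame_ms in CELT_FRAME_MS:
        modes.append("celt")
    return modes


def max_frames_per_packet(frame_ms):
    """How many equal-length frames fit into a 120 ms packet."""
    return int(MAX_PACKET_MS // frame_ms)


if __name__ == "__main__":
    for size in (2.5, 5, 10, 20, 40, 60):
        print(size, modes_for_frame(size), max_frames_per_packet(size))
```

Frames of 2.5 ms and 5 ms map only to CELT, 40 ms and 60 ms only to SILK, and 10 ms and 20 ms are available in all three modes.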
The reference implementation is written in C and compiles on hardware architectures with or without a floating-point unit, although floating-point is currently required for audio bandwidth detection (dynamic switching between SILK, CELT, and hybrid encoding) and most speed optimizations.
Containers
Opus packets are not self-delimiting, but are designed to be used inside a container of some sort which supplies the decoder with each packet's length. Opus was originally specified for encapsulation in Ogg containers, specified as audio/ogg; codecs=opus, and for Ogg Opus files the .opus filename extension is recommended.
Opus streams are also supported in Matroska, WebM, MPEG-TS, and MP4.
Alternatively, each Opus packet may be wrapped in a network packet which supplies the packet length. Opus packets may be sent over an ordered datagram protocol such as RTP.
An optional self-delimited packet format is defined in an appendix to the specification. This uses one or two additional bytes per packet to encode the packet length, allowing packets to be concatenated without encapsulation.
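The "one or two additional bytes" can be illustrated with the frame length coding that RFC 6716 defines: lengths up to 251 bytes fit in a single byte, while a first byte of 252 to 255 signals a second byte, with the length equal to four times the second byte plus the first. The sketch below covers only this length field; where the field sits inside a self-delimited packet depends on the packet's frame-count code, which is not modeled here. The function names are hypothetical.

```python
MAX_FRAME_BYTES = 1275  # largest length the two-byte form can express (255 * 4 + 255)


def encode_frame_length(n: int) -> bytes:
    """Encode a frame length in one or two bytes."""
    if not 0 <= n <= MAX_FRAME_BYTES:
        raise ValueError("length out of range")
    if n < 252:
        return bytes([n])
    first = 252 + (n % 4)        # 252..255, carries the remainder modulo 4
    second = (n - first) // 4    # remaining length in units of 4 bytes
    return bytes([first, second])


def decode_frame_length(data: bytes) -> tuple[int, int]:
    """Return (length, number of bytes consumed by the length field)."""
    first = data[0]
    if first < 252:
        return first, 1
    return data[1] * 4 + first, 2


if __name__ == "__main__":
    for n in (0, 100, 251, 252, 1275):
        enc = encode_frame_length(n)
        print(n, list(enc), decode_frame_length(enc))
```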
Bandwidth and sampling rate
Opus allows the following bandwidths during encoding. Opus compression does not depend on the input sample rate; timestamps are measured in 48 kHz units even if the full bandwidth is not used. Likewise, the output sample rate may be freely chosen. For example, audio can be input at 16 kHz yet be set to encode only narrowband audio.
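As a rough illustration of the 48 kHz timestamp convention, the sketch below converts frame durations into 48 kHz ticks and into input sample counts at an arbitrary input rate. The bandwidth table uses the five bandwidths defined in RFC 6716; the names `BANDWIDTHS`, `frame_ticks` and `input_samples` are hypothetical helpers, not part of any Opus API.

```python
OPUS_CLOCK_HZ = 48_000  # timestamps are always counted in 48 kHz units

# name: (audio bandwidth in Hz, effective sample rate in Hz), per RFC 6716
BANDWIDTHS = {
    "narrowband": (4_000, 8_000),
    "mediumband": (6_000, 12_000),
    "wideband": (8_000, 16_000),
    "super-wideband": (12_000, 24_000),
    "fullband": (20_000, 48_000),
}


def frame_ticks(frame_ms: float) -> int:
    """Duration of one frame in 48 kHz timestamp units."""
    return int(frame_ms * OPUS_CLOCK_HZ / 1000)


def input_samples(frame_ms: float, input_rate_hz: int) -> int:
    """Number of PCM samples per channel the caller supplies for one frame."""
    return int(frame_ms * input_rate_hz / 1000)


if __name__ == "__main__":
    # 16 kHz input encoded as narrowband: the timestamp still advances by 960.
    print(input_samples(20, 16_000), "samples in,", frame_ticks(20), "ticks")
    print("narrowband keeps up to", BANDWIDTHS["narrowband"][0], "Hz")
```

In this model a 20 ms frame always advances the timestamp by 960 units, whether the input was sampled at 16 kHz (320 samples) or 48 kHz (960 samples).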
History
Opus was proposed for the standardization of a new audio format at the IETF, which was eventually accepted and granted by the codec working group. It is based on two initially separate standard proposals from the Xiph.Org Foundation and Skype Technologies S.A. (now Microsoft). Its main developers are Jean-Marc Valin (Xiph.Org, Octasic, Mozilla Corporation), Koen Vos (Skype), and Timothy B. Terriberry (Xiph.Org, Mozilla Corporation). Among others, Juin-Hwey (Raymond) Chen (Broadcom), Gregory Maxwell (Xiph.Org, Wikimedia), and Christopher Montgomery (Xiph.Org) were also involved.
The development of the CELT part of the format goes back to thoughts on a successor for Vorbis under the working name Ghost. As a newer speech codec from the Xiph.Org Foundation, Opus replaces Xiph's older speech codec Speex, an earlier project of Jean-Marc Valin. CELT has been worked on since November 2007.
The SILK part has been under development at Skype since January 2007 as the successor of their SVOPC, an internal project to make the company independent from third-party codecs like iSAC and iLBC and respective license payments.
In March 2009, Skype suggested the development and standardization of a wideband audio format within the IETF. Nearly a year passed with much debate on the formation of an appropriate working group.
Representatives of several companies which were taking part in the standardization of patent-encumbered competing formats stated objections against the start of the standardization process for a royalty-free format. These included Polycom and Ericsson, the creators and licensors of G.719, as well as France Télécom, Huawei and Orange Labs (a department of France Télécom), which were involved in the creation of G.718. (Some of the opponents would later claim patent rights that Xiph dismissed; see below.) The working group finally formed in February 2010, and even the corresponding Study Group 16 from the ITU-T pledged to support its work.
In July 2010, a prototype of a hybrid format was presented that combined the two proposed format candidates SILK and CELT. In September 2010, Opus was submitted to the IETF as a proposal for standardization. For a short time the format went under the name of Harmony before it got its present name in October 2010.
At the beginning of February 2011, the bitstream format was tentatively frozen, subject to last changes.
Near the end of July 2011, Jean-Marc Valin was hired by the Mozilla Corporation to continue working on Opus.
Finalization (1.0)
In November 2011, the working group issued the last call for changes on the bitstream format. The bitstream has been frozen since January 8, 2012.
On July 2, 2012, Opus was approved by the IETF for standardization.
The reference software entered release candidate state on August 8, 2012.
The final specification was released as RFC 6716 on September 10, 2012, and versions 1.0 and 1.0.1 of the reference implementation libopus were released the day after.
On July 11, 2013, libopus 1.0.3 brought bug fixes and a new surround sound API that improves channel allocation and quality, especially for LFE.
1.1
On December 5, 2013, libopus 1.1 was released, incorporating overall speed improvements and significant encoder quality improvements: tonality estimation boosts bitrate and quality for previously problematic samples like harpsichords; automated speech/music detection improves quality in mixed audio; mid-side stereo reduces the bitrate needs of many songs; band precision boosting improves transients; and DC rejection is applied below 3 Hz. Two new VBR modes were added: unconstrained for more consistent quality, and temporal VBR that boosts louder frames and generally improves quality.
libopus 1.1.1 was released on November 26, 2015, and 1.1.2 on January 12, 2016, both adding speed optimizations and bug fixes. Version 1.1.3, released on July 15, 2016, includes bug fixes, optimizations, documentation updates and experimental Ambisonics work.
1.2
libopus 1.2 Beta was released on May 24, 2017. libopus 1.2 was released on June 20, 2017. Improvements brought in 1.2 allow it to create fullband music at bit rates as low as 32 kbit/s, and wideband speech at just 12 kbit/s.
libopus 1.2 includes optional support for the decoder specification changes made in drafts of RFC 8251, improving the quality of output from such low-rate streams.
1.3
libopus 1.3 was released on October 18, 2018. The Opus 1.3 major release again brings quality improvements, new features, and bug fixes. Changes since 1.2.x include:
* Improvements to voice activity detection (VAD) and speech/music classification using a recurrent neural network (RNN)
* Support for ambisonics coding using channel mapping families 2 and 3
* Improvements to stereo speech coding at low bitrate
* Using wideband speech encoding down to 9 kbit/s (mediumband is no longer used)
* Making it possible to use SILK down to bitrates around 5 kbit/s
* Minor quality improvement on tones
* Enabling the spec fixes in RFC 8251 by default
* Security/hardening improvements
Notable bug fixes include:
* Fixes to the CELT PLC
* Bandwidth detection fixes
1.3.1
libopus 1.3.1 was released on April 12, 2019. This Opus 1.3.1 minor release fixes an issue with the analysis on files with digital silence (all zeros), especially on x87 builds (mostly affects 32-bit builds). It also includes two new features:
* A new OPUS_GET_IN_DTX query to know if the encoder is in DTX mode (last frame was either a comfort noise frame or not encoded at all)
* A new (and still experimental) CMake-based build system that is eventually meant to replace the VS2015 build system (the autotools build system will stay)
Quality comparison and low-latency performance
Opus performs well at both low and high bit rates.
In listening tests around 64 kbit/s, Opus shows superior quality compared to HE-AAC codecs, which were previously dominant due to their use of the patented spectral band replication (SBR) technology ("Next-Gen Low-Latency Open Codec Beats HE-AAC", Slashdot report of April 14, 2011). In listening tests around 96 kbit/s, Opus shows slightly superior quality compared to AAC and significantly better quality compared to Vorbis and MP3.
Opus has very low algorithmic delay, a necessity for use as part of a low-audio-latency communication link, which can permit natural conversation, networked music performances, or lip sync at live events. Total algorithmic delay for an audio format is the sum of delays that must be incurred in the encoder and the decoder of a live audio stream regardless of processing speed and transmission speed, such as buffering audio samples into blocks or frames, allowing for window overlap and possibly allowing for noise-shaping look-ahead in a decoder and any other forms of look-ahead, or for an MP3 encoder, the use of a bit reservoir.
Total one-way latency below 150 ms is the preferred target of most VoIP systems, to enable natural conversation with turn-taking little affected by delay. Musicians typically feel in-time with up to around 30 ms audio latency, roughly in accord with the fusion time of the Haas effect, though matching playback delay of each user's own instrument to the round-trip latency can also help. It is suggested for lip sync that around 45–100 ms audio latency may be acceptable.
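As a rough illustration of the targets above, the following Python sketch checks a one-way latency figure against each threshold. The function name and the use-case labels are hypothetical; the thresholds are the ones quoted in the text, with the upper end of the lip-sync range used as its limit.

```python
# Latency targets quoted in the text, in milliseconds of one-way audio latency.
TARGETS_MS = {
    "voip_conversation": 150.0,  # preferred upper bound for most VoIP systems
    "music_performance": 30.0,   # musicians typically still feel in time
    "lip_sync": 100.0,           # upper end of the suggested 45-100 ms range
}


def acceptable_uses(one_way_latency_ms: float) -> list[str]:
    """Return the use cases whose target the given latency stays within."""
    return sorted(
        use for use, limit in TARGETS_MS.items() if one_way_latency_ms <= limit
    )


if __name__ == "__main__":
    for latency in (20.0, 60.0, 140.0, 200.0):
        print(f"{latency:6.1f} ms -> {acceptable_uses(latency)}")
```

The figure passed in is the total one-way latency, so it has to include network and buffering delay on top of the codec's own algorithmic delay.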
Opus permits trading off reduced quality or increased bitrate to achieve an even smaller algorithmic delay (5.0 ms minimum). While the reference implementation's default Opus frame is 20.0 ms long, the SILK layer requires a further 5.0 ms lookahead plus 1.5 ms for resampling, giving a default delay of 26.5 ms. When the CELT layer is active, it requires 2.5 ms lookahead for window overlap, to which a matching delay of 4.0 ms is added by default to synchronize with the SILK layer. If the encoder is instantiated in the special ''restricted low delay'' mode, the 4.0 ms matching delay is removed and the SILK layer is disabled, permitting the minimal algorithmic delay of 5.0 ms.
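The delay figures in this paragraph can be reproduced with simple addition. The sketch below is only a worked check of that arithmetic, not part of any Opus API; the function names are hypothetical, and the 2.5 ms minimum frame length is an assumption not stated in the text (it is the shortest frame size Opus defines).

```python
FRAME_MS = 20.0            # default frame length of the reference implementation
SILK_LOOKAHEAD_MS = 5.0    # SILK layer lookahead
SILK_RESAMPLING_MS = 1.5   # SILK resampling delay
CELT_LOOKAHEAD_MS = 2.5    # CELT window overlap
CELT_MATCHING_MS = 4.0     # added by default to line CELT up with SILK
MIN_FRAME_MS = 2.5         # assumption: shortest Opus frame length


def silk_delay_ms(frame_ms: float = FRAME_MS) -> float:
    """Algorithmic delay when the SILK layer is in use."""
    return frame_ms + SILK_LOOKAHEAD_MS + SILK_RESAMPLING_MS


def celt_delay_ms(frame_ms: float = FRAME_MS,
                  restricted_low_delay: bool = False) -> float:
    """Algorithmic delay of the CELT layer, with or without the matching delay."""
    matching = 0.0 if restricted_low_delay else CELT_MATCHING_MS
    return frame_ms + CELT_LOOKAHEAD_MS + matching


if __name__ == "__main__":
    print(silk_delay_ms())                                        # 26.5
    print(celt_delay_ms())                                        # 26.5
    print(celt_delay_ms(MIN_FRAME_MS, restricted_low_delay=True))  # 5.0
```

The 4.0 ms matching delay is exactly the difference between the SILK overhead (5.0 + 1.5 ms) and the CELT lookahead (2.5 ms), which is why both layers end up at 26.5 ms by default.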
Support
The format and algorithms are openly documented and the reference implementation is published as free software. Xiph's reference implementation is called ''libopus'' and a package called ''opus-tools'' provides command-line encoder and decoder utilities. It is published under the terms of a BSD-like license. It is written in C and can be compiled for hardware architectures with or without a floating-point unit. The accompanying diagnostic tool ''opusinfo'' reports detailed technical information about Opus files, including information on the standard compliance of the bitstream format. It is based on ''ogginfo'' from the ''vorbis-tools'' and therefore, unlike the encoder and decoder, is available under the terms of version 2 of the GPL.
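The command-line utilities mentioned above can be driven from a script. The following Python sketch only assembles the argument lists for ''opusenc'', ''opusdec'' and ''opusinfo'' and runs one if the tool is installed; the helper names and file names are hypothetical, and the only option used is ''--bitrate'' (in kbit/s) of ''opusenc''.

```python
import shutil
import subprocess


def encode_command(wav_path: str, opus_path: str, bitrate_kbps=None) -> list[str]:
    """Build an opusenc invocation: opusenc [options] input output."""
    cmd = ["opusenc"]
    if bitrate_kbps is not None:
        cmd += ["--bitrate", str(bitrate_kbps)]
    return cmd + [wav_path, opus_path]


def decode_command(opus_path: str, wav_path: str) -> list[str]:
    """Build an opusdec invocation that decodes back to a WAV file."""
    return ["opusdec", opus_path, wav_path]


def info_command(opus_path: str) -> list[str]:
    """Build an opusinfo invocation that reports details about a file."""
    return ["opusinfo", opus_path]


def run_if_available(cmd: list[str]):
    """Run the command if its executable is on PATH; return None otherwise."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Hypothetical file names; the commands are printed, not executed.
    print(encode_command("speech.wav", "speech.opus", bitrate_kbps=24))
    print(decode_command("speech.opus", "roundtrip.wav"))
    print(info_command("speech.opus"))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.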
Implementations
The Opus RFC contains complete source code for an older version of the reference implementation, written in C; a later RFC contains errata. Libopus is the more up-to-date but non-normative branch of the reference implementation.
The FFmpeg project has encoder and decoder implementations not derived from the reference library.
The libopus reference library has been ported to both C# and Java as part of a project called Concentus. These ports sacrifice performance for the sake of being easily integrated into cross-platform applications.
Software and content providers
Digital Radio Mondiale, a digital radio format for AM frequencies, can broadcast and receive Opus audio (albeit not recognised in the official standard) using the Dream software-defined radio.
The Wikimedia Foundation sponsored a free and open source online JavaScript Opus encoder for browsers supporting the required HTML5 features.
A list of radio stations that stream using the Opus audio codec can be found in the Xiph.Org Foundation Icecast directory.
In late 2014 and 2015, Google's video platform YouTube started offering Opus audio along with VP9 video in the WebM file format, through DASH streaming.
Since 2016, WhatsApp has been using Opus as its audio file format.
Signal switched from Speex to the Opus audio codec for better audio quality at the beginning of 2017.
In 2018, SoundCloud switched from MP3 to Opus, halving the bandwidth required for music streaming.
In January 2021, Vimeo introduced Opus to its video platform.
In 2021, the Danish journalism website Zetland switched from MP3 to Opus for its articles' audio recordings, achieving a 35 percent reduction in bandwidth and a smaller climate footprint.
As of version 7.0.0, VirtualBox no longer uses Opus.
Operating system support
Most end-user software relies on multimedia frameworks provided by the operating system. Native Opus codec support is implemented in most major multimedia frameworks for Unix-like operating systems, including the GStreamer, FFmpeg, and Libav libraries.
The WebM container (.webm) is mostly used on online video platforms (e.g. YouTube), and is usually treated as a video file by operating systems and media players. Even if a WebM file contains only Opus audio and no video, some music players do not recognize WebM files as audio files and do not support reading of file metadata.
The Ogg container (.opus) is preferred for audio-only files, and most media players support audio file metadata tagged in the Vorbis comment format.
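To make the Vorbis comment layout concrete, here is a minimal Python sketch that reads the tags from an Ogg Opus comment header packet. It assumes the packet has already been extracted from its Ogg pages (page demultiplexing is not shown) and follows the layout used by the Ogg encapsulation for Opus: the magic string "OpusTags", a length-prefixed vendor string, then a count of length-prefixed "KEY=value" entries, all lengths being 32-bit little-endian. The function names and sample values are hypothetical.

```python
import struct


def build_opus_tags(vendor: str, comments: list[str]) -> bytes:
    """Serialise a comment header packet (used here only to create sample data)."""
    vendor_bytes = vendor.encode("utf-8")
    out = b"OpusTags" + struct.pack("<I", len(vendor_bytes)) + vendor_bytes
    out += struct.pack("<I", len(comments))
    for comment in comments:
        data = comment.encode("utf-8")
        out += struct.pack("<I", len(data)) + data
    return out


def parse_opus_tags(packet: bytes) -> tuple[str, dict[str, list[str]]]:
    """Return (vendor string, tags) from a comment header packet."""
    if packet[:8] != b"OpusTags":
        raise ValueError("not an Opus comment header")
    pos = 8
    (vendor_len,) = struct.unpack_from("<I", packet, pos)
    pos += 4
    vendor = packet[pos:pos + vendor_len].decode("utf-8")
    pos += vendor_len
    (count,) = struct.unpack_from("<I", packet, pos)
    pos += 4
    tags: dict[str, list[str]] = {}
    for _ in range(count):
        (length,) = struct.unpack_from("<I", packet, pos)
        pos += 4
        entry = packet[pos:pos + length].decode("utf-8")
        pos += length
        key, sep, value = entry.partition("=")
        # Field names are case-insensitive, so normalise them to upper case.
        tags.setdefault(key.upper(), []).append(value)
    return vendor, tags


if __name__ == "__main__":
    sample = build_opus_tags("example-encoder", ["TITLE=Test tone", "artist=Nobody"])
    print(parse_opus_tags(sample))
```

A field name may appear more than once (for example several ARTIST entries), which is why each key maps to a list of values.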
Google added native support for Opus audio playback in Android 5.0 "Lollipop". However, it was limited to Opus audio encapsulated in Matroska and WebM containers, such as .mkv, .mka and .webm files. Android 7.0 "Nougat" introduced support for Opus audio encapsulated in Ogg containers. Android 10 finally added native support for .opus extensions.
Due to the addition of WebRTC support in Apple's WebKit rendering engine, macOS High Sierra and iOS 11 come with native playback support for Opus audio encapsulated in Core Audio Format containers.
On Windows 10, version 1607, Microsoft provided native support for Opus audio encapsulated in Matroska and WebM containers. On version 1709, support for Opus audio encapsulated in Ogg containers was made available through a pre-installed add-on called Web Media Extensions. On Windows 10 version 1903, native support for the .opus extension was added. On Windows 8.1 and older, third-party decoders, such as LAV Filters, are available to provide support for the format.
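Because the operating-system support described above depends on which container wraps the Opus stream, it can help to identify the container from the first bytes of a file rather than from its extension. The Python sketch below checks the well-known signatures of the three container families named in this section; the function name and return labels are hypothetical, and it does not confirm that the stream inside is actually Opus.

```python
def sniff_container(header: bytes) -> str:
    """Guess the container family from the first bytes of a file."""
    if header.startswith(b"OggS"):
        return "ogg"                 # Ogg page capture pattern
    if header.startswith(b"\x1a\x45\xdf\xa3"):
        # EBML header, shared by Matroska (.mkv, .mka) and WebM (.webm);
        # telling the two apart requires reading the DocType element.
        return "matroska-or-webm"
    if header.startswith(b"caff"):
        return "caf"                 # Core Audio Format file type
    return "unknown"


def sniff_file(path: str) -> str:
    """Read just enough of a file to classify its container."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(4))


if __name__ == "__main__":
    print(sniff_container(b"OggS\x00\x02"))
    print(sniff_container(b"\x1a\x45\xdf\xa3\x9f"))
    print(sniff_container(b"caff\x00\x01"))
    print(sniff_container(b"RIFF"))
```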
Media player support
While support in multimedia frameworks automatically enables Opus support in software which is built on top of such frameworks, several application developers made additional efforts to support the Opus audio format in their software. Such support was added to the AIMP, Amarok, cmus, Music Player Daemon, foobar2000, Mpxplay, MusicBee, SMplayer, VLC media player, Winamp and Xmplay audio players; the Icecast and Airtime audio streaming software; and the Asunder audio CD ripper, the CDBurnerXP CD burner, and the FFmpeg, Libav and MediaCoder media encoding tools. Streaming Icecast radio trials have been live since September 2012 and January 2013. SteamOS uses Opus or Vorbis for streaming audio.
Browser support
Opus support is mandatory for WebRTC implementations. Opus is supported in Mozilla Firefox, Chromium and Google Chrome, Blink-based Opera, as well as all browsers for Unix-like systems relying on GStreamer for multimedia formats support. Although Internet Explorer will not provide Opus playback natively, support for the format is built into the Edge browser, along with VP9, for full WebM support. Safari supports Opus as of iOS 11 and macOS High Sierra.
VoIP support
Due to its abilities, Opus gained early interest from voice over IP (VoIP) software vendors. Several SIP clients, including Acrobits Softphone, CSipSimple (via additional plug-in), Empathy (via GStreamer), Jitsi, Tuenti, Line2 (currently only on iOS), Linphone, Phoner and PhonerLite, SFLphone and Telephone, as well as the Mumble, Discord and TeamSpeak 3 voice chat software, support Opus. TrueConf supports Opus in its VoIP products. Asterisk lacked built-in Opus support for legal reasons, but a third-party patch was available for download and official support via a binary blob was added in September 2016. The Tox P2P videoconferencing software uses Opus exclusively. The Classified-ads distributed messaging app sends raw Opus frames inside a TLS socket in its VoIP implementation.
Opus is widely used as the voice codec in WhatsApp, which has over 1.5 billion users worldwide. WhatsApp uses Opus at 8–16 kHz sampling rates, with the Real-time Transport Protocol (RTP). The PlayStation 4 video game console also uses the CELT/Opus codec for its PlayStation Network system party chat. It is also used in the Zoom videoconferencing app.
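When Opus is carried over RTP, the number of audio samples in a frame depends on the sampling rate, but the RTP timestamp does not. The sketch below illustrates this under two assumptions that are not from the text above: the RTP payload format for Opus (RFC 7587) uses a 48 kHz timestamp clock regardless of the audio sampling rate, and the sender uses 20 ms frames. The function names are hypothetical.

```python
RTP_CLOCK_HZ = 48_000  # assumption: RTP clock rate for Opus per RFC 7587


def samples_per_frame(sample_rate_hz: int, frame_ms: float) -> int:
    """Number of audio samples per channel in one frame."""
    return round(sample_rate_hz * frame_ms / 1000)


def rtp_timestamp_step(frame_ms: float) -> int:
    """Amount the RTP timestamp advances for each frame sent."""
    return round(RTP_CLOCK_HZ * frame_ms / 1000)


if __name__ == "__main__":
    frame_ms = 20.0  # assumption: 20 ms frames
    for rate in (8_000, 16_000):
        print(rate, samples_per_frame(rate, frame_ms), rtp_timestamp_step(frame_ms))
```

With these assumptions a 20 ms frame holds 160 samples at 8 kHz or 320 samples at 16 kHz, while the timestamp advances by 960 in both cases.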
Hardware
Since version 3.13, Rockbox enables Opus playback on supported portable media players, including some products from the iPod series by Apple, devices made by iriver, Archos and Sandisk, and on Android devices using "Rockbox as an Application". All recent Grandstream IP phones support Opus audio both for encoding and decoding. OBihai OBi1062, OBi1032 and OBi1022 IP phones all support Opus. Recent BlueSound wireless speakers support Opus playback. Devices running Hiby OS, like the Hiby R3, are capable of decoding Opus files natively.
Many broadcast IP codecs include Opus, such as those manufactured by Comrex, GatesAir and Tieline.
The Sony PlayStation 5 supports capturing 1080p and 2160p footage using VP9 video and Opus audio in a WebM container.
Patent claims
As an open standard, the algorithms are openly documented, and a reference implementation (including the source code) is published. Broadcom and the Xiph.Org Foundation own software patents on some of the CELT algorithms, and Skype Technologies/Microsoft own some on the SILK algorithms; each offers a royalty-free perpetual license for use with Opus, reserving only the right to make use of their patents to defend against infringement suits of third parties. Qualcomm, Huawei, France Telecom, and Ericsson have claimed that their patents may apply, which Xiph's legal counsel denies, and none have pursued any legal action. The Opus license automatically and retroactively terminates for any entity that attempts to file a patent suit.
In September 2022, UK-based Vectis IP Ltd announced the formation of a patent pool for Opus. Members of the pool included Dolby International AB, Dolby Laboratories Licensing Corporation, and Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. A list of patents will be published soon after an evaluation. However, the program makes an exception for open-source software, applications, or content that developers or providers distribute separate from hardware devices (i.e. PCs, smartphones, IP phones, smart TVs).
Sources
* This article contains quotations from the Opus Codec website, which is available under the Creative Commons Attribution 3.0 (CC BY 3.0) license.
External links
* Opus on Hydrogenaudio Knowledgebase
See also
* Comparison of audio coding formats
* Streaming media
* xHE-AAC