HTML5 Audio is a subject of the HTML5 specification, incorporating audio input, playback, and synthesis, as well as speech to text, in the browser.
<audio> element
The <audio> element represents a sound or an audio stream. It is commonly used to play back a single audio file within a web page, showing a GUI widget with play/pause/volume controls.
The <audio> element has these attributes:
* global attributes (accesskey; class; contenteditable; contextmenu; dir; draggable; dropzone; hidden; id; lang; spellcheck; style; tabindex; title; translate)
* autoplay = "autoplay" or "" (empty string) or empty: Instructs the User-Agent to automatically begin playback of the audio stream as soon as it can do so without stopping.
* preload = "none" or "metadata" or "auto" or "" (empty string) or empty: Represents a hint to the User-Agent about whether optimistic downloading of the audio stream itself or its metadata is considered worthwhile.
** "none": Hints to the User-Agent that the user is not expected to need the audio stream, or that minimizing unnecessary traffic is desirable.
** "metadata": Hints to the User-Agent that the user is not expected to need the audio stream, but that fetching its metadata (duration and so on) is desirable.
** "auto": Hints to the User-Agent that optimistically downloading the entire audio stream is considered desirable.
* controls = "controls" or "" (empty string) or empty: Instructs the User-Agent to expose a user interface for controlling playback of the audio stream.
* loop = "loop" or "" (empty string) or empty: Instructs the User-Agent to seek back to the start of the audio stream upon reaching the end.
* mediagroup = string: Instructs the User-Agent to link multiple videos and/or audio streams together.
* muted = "muted" or "" (empty string) or empty: Represents the default state of the audio stream, potentially overriding user preferences.
* src = non-empty URL potentially surrounded by spaces: The URL for the audio stream.
Example:
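A minimal sketch in JavaScript of how the attributes listed above combine into markup. The helper `audioMarkup` and the file name `song.ogg` are hypothetical and used only for illustration; they do not come from the specification.

```javascript
// Hypothetical helper (not part of any specification): builds <audio> markup
// from a subset of the attributes described above.
function audioMarkup(options = {}) {
  const { src, preload, controls = false, autoplay = false, loop = false, muted = false } = options;
  const escapeAttr = (value) =>
    String(value).replace(/&/g, "&amp;").replace(/"/g, "&quot;");

  const parts = ["<audio"];
  if (src) parts.push(`src="${escapeAttr(src)}"`);
  if (preload) parts.push(`preload="${escapeAttr(preload)}"`);

  // Boolean attributes are written in their empty form, e.g. `controls`.
  const flags = { controls, autoplay, loop, muted };
  for (const [name, enabled] of Object.entries(flags)) {
    if (enabled) parts.push(name);
  }
  return parts.join(" ") + "></audio>";
}

// A player with visible controls that fetches only metadata up front.
// "song.ogg" is an assumed file name.
const markup = audioMarkup({ src: "song.ogg", controls: true, preload: "metadata" });
console.log(markup); // <audio src="song.ogg" preload="metadata" controls></audio>
```

Here `preload="metadata"` asks the User-Agent to fetch the duration without downloading the whole stream, and `controls` exposes the playback widget.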
Supporting browsers
On PC:
* Google Chrome
* Opera
On mobile devices:
* Android Browser 2.3
* Blackberry Browser
* Google Chrome
* Internet Explorer Mobile
* Firefox
* Opera Mobile 11
Supported audio coding formats
The adoption of HTML5 audio, as with HTML5 video, has become polarized between proponents of free and patent-encumbered formats. In 2007, the recommendation to use Vorbis was retracted from the HTML5 specification by the W3C, together with that to use Ogg Theora, citing the lack of a format accepted by all the major browser vendors.
Apple and Microsoft support the ISO/IEC-defined MP3 format. Mozilla and Opera support the free and open Vorbis format in Ogg and WebM containers, and criticize the patent-encumbered nature of MP3 and AAC, which are guaranteed to be "non-free". Google has so far provided support for all common formats.
Most AAC files with finite length are wrapped in an MPEG-4 container (.mp4, .m4a), which is supported natively in Internet Explorer, Safari, and Chrome, and supported by the OS in Firefox and Opera. Most AAC live streams with infinite length are wrapped in an Audio Data Transport Stream container (.aac, .adts), which is supported by Chrome, Safari, Firefox and Edge.
Many browsers also support uncompressed PCM audio in a WAVE container.
In 2012, the free and open, royalty-free Opus format was released and standardized by the IETF. It is supported by Mozilla, Google, Opera and Edge.
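Because format support differs between browsers, a page can ask the element which formats it is able to play before choosing a file. The following is a minimal sketch built on the standard `canPlayType()` method of media elements, which returns `""`, `"maybe"` or `"probably"`. The helper `pickSource`, the file names and the choice of MIME strings are assumptions made for illustration.

```javascript
// Hypothetical helper: returns the first candidate the element reports as
// "probably" playable, falling back to the first "maybe". `audio` is any
// object exposing canPlayType(), such as an HTMLAudioElement.
function pickSource(audio, candidates) {
  let fallback = null;
  for (const candidate of candidates) {
    const answer = audio.canPlayType(candidate.type); // "", "maybe" or "probably"
    if (answer === "probably") return candidate;
    if (answer === "maybe" && fallback === null) fallback = candidate;
  }
  return fallback; // null when nothing is playable
}

// Assumed file names and MIME types, listed in order of preference.
const candidates = [
  { url: "clip.opus", type: 'audio/ogg; codecs="opus"' },
  { url: "clip.ogg", type: 'audio/ogg; codecs="vorbis"' },
  { url: "clip.mp3", type: "audio/mpeg" },
  { url: "clip.wav", type: "audio/wav" },
];

// In a browser:
//   const chosen = pickSource(new Audio(), candidates);
//   if (chosen) { const player = new Audio(chosen.url); player.controls = true; }
```

Passing the element in as a parameter keeps the selection logic independent of any particular browser, so it can be exercised with a stand-in object.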
This table documents the current support for audio coding formats by the <audio> element.
Web Audio API and MediaStream Processing API
The Web Audio API specification developed by W3C describes a high-level JavaScript API for processing and synthesizing audio in web applications. The primary paradigm is of an audio routing graph, where a number of AudioNode objects are connected together to define the overall audio rendering. The actual processing will primarily take place in the underlying implementation (typically optimized Assembly / C / C++ code), but direct JavaScript processing and synthesis is also supported.
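The routing-graph idea can be sketched as follows: an oscillator node (synthesis) feeds a gain node (processing), which feeds the context's destination. The function name `buildToneGraph` and its default values are assumptions; the method and property names are the standard ones exposed by an `AudioContext`.

```javascript
// Builds a two-node routing graph: oscillator -> gain -> destination.
// `ctx` is expected to be an AudioContext (or an object with the same shape).
function buildToneGraph(ctx, frequencyHz = 440, volume = 0.2) {
  const oscillator = ctx.createOscillator(); // source node that synthesizes a tone
  const gain = ctx.createGain();             // processing node that scales the signal

  oscillator.frequency.value = frequencyHz;
  gain.gain.value = volume;

  // Each connect() call adds one edge to the routing graph.
  oscillator.connect(gain);
  gain.connect(ctx.destination);
  return { oscillator, gain };
}

// In a browser, typically from a click handler so that playback is allowed:
//   const ctx = new AudioContext();
//   const { oscillator } = buildToneGraph(ctx, 440, 0.2);
//   oscillator.start();
//   oscillator.stop(ctx.currentTime + 1);
```

The JavaScript only describes the graph; generating and mixing the samples is left to the underlying implementation.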
Mozilla's Firefox browser has implemented a similar Audio Data API extension since version 4 (implemented in 2010 and released in 2011), but Mozilla warns that it is non-standard and deprecated, and recommends the Web Audio API instead.
Some JavaScript audio processing and synthesis libraries, such as Audiolet, support both APIs.
The W3C Audio Working Group is also considering the MediaStream Processing API specification developed by Mozilla.
In addition to audio mixing and processing, it covers more general media streaming, including synchronization with HTML elements, capture of audio and video streams, and peer-to-peer routing of such media streams.
Supporting browsers
On PC:
* Google Chrome 10 (Enabled by default since 14)
* Firefox 23 (Enabled by default since 25)
* Opera
* Microsoft Edge 12
On mobile devices:
* Google Chrome for Android 28 (Enabled by default since 29) and Apple iPads
* Safari 6 (with restrictions on use: muted unless triggered by a user action)
* Firefox
Web Speech API
The Web Speech API aims to provide an alternative input method for web applications (without using a keyboard). With this API, developers can give web apps the ability to transcribe voice to text from the computer's microphone. The recorded audio is sent to speech servers for transcription, after which the text is typed out for the user. The API itself is agnostic of the underlying speech recognition implementation and can support both server-based and embedded recognizers.
The HTML Speech Incubator group has proposed the implementation of audio-speech technology in browsers in the form of uniform, cross-platform APIs. The API contains both:
* Speech Input API
* Text to Speech API
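For the text-to-speech half, a minimal sketch assuming the `speechSynthesis` object and `SpeechSynthesisUtterance` constructor that browsers expose for the Web Speech API. The helper `speak` and its default language are hypothetical.

```javascript
// Hypothetical helper: speaks `text` through a SpeechSynthesis-like object.
// `synth` and `Utterance` are passed in so that the caller decides which
// implementation is used.
function speak(text, synth, Utterance, lang = "en-US") {
  const utterance = new Utterance(text);
  utterance.lang = lang;   // language used to pick a voice
  synth.speak(utterance);  // queues the utterance for playback
  return utterance;
}

// In a browser:
//   speak("Hello", window.speechSynthesis, window.SpeechSynthesisUtterance);
```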
Google integrated this feature into Google Chrome in March 2011, letting its users search the web with their voice with code like:
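A minimal sketch of voice input in JavaScript. It assumes the `SpeechRecognition` interface (prefixed as `webkitSpeechRecognition` in Chrome) rather than whatever markup Chrome shipped in 2011, and the helper name `listenOnce` is hypothetical.

```javascript
// Hypothetical helper: starts one recognition session and passes the best
// transcript to `onText`. `Recognition` is the recognizer constructor, e.g.
// window.SpeechRecognition || window.webkitSpeechRecognition in a browser.
function listenOnce(Recognition, onText, lang = "en-US") {
  const recognizer = new Recognition();
  recognizer.lang = lang;
  recognizer.interimResults = false; // deliver final results only
  recognizer.maxAlternatives = 1;

  recognizer.onresult = (event) => {
    // results[0] is the first result; [0] is its top-ranked alternative.
    onText(event.results[0][0].transcript);
  };
  recognizer.start();
  return recognizer;
}

// In a browser:
//   const Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
//   listenOnce(Recognition, (text) => { console.log("Heard:", text); });
```

The callback receives plain text, so the page can use it as a search query or place it in a form field.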
Supporting browsers
* Google Chrome 25 and up
* Firefox
See also
* HTML5 video