VoiceXML
   HOME

TheInfoList



OR:

VoiceXML (VXML) is a digital document standard for specifying interactive media and voice dialogs between humans and computers. It is used for developing audio and voice response applications, such as banking systems and automated customer service portals. VoiceXML applications are developed and deployed in a manner analogous to how a
web browser A web browser is application software for accessing websites. When a user requests a web page from a particular website, the browser retrieves its files from a web server and then displays the page on the user's screen. Browsers are used o ...
interprets and visually renders the Hypertext Markup Language (HTML) it receives from a
web server A web server is computer software and underlying hardware that accepts requests via HTTP (the network protocol created to distribute web content) or its secure variant HTTPS. A user agent, commonly a web browser or web crawler, initia ...
. VoiceXML documents are interpreted by a
voice browser A voice browser is a software application that presents an interactive voice user interface to the user in a manner analogous to the functioning of a web browser interpreting Hypertext Markup Language (HTML). Dialog documents interpreted by voic ...
and in common deployment architectures, users interact with voice browsers via the
public switched telephone network The public switched telephone network (PSTN) provides infrastructure and services for public telecommunication. The PSTN is the aggregate of the world's circuit-switched telephone networks that are operated by national, regional, or local telep ...
(PSTN). The VoiceXML document format is based on
Extensible Markup Language Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. T ...
(XML). It is a standard developed by the
World Wide Web Consortium The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web. Founded in 1994 and led by Tim Berners-Lee, the consortium is made up of member organizations that maintain full-time staff working ...
(W3C).


Usage

VoiceXML applications are commonly used in many industries and segments of commerce. These applications include order inquiry, package tracking, driving directions, emergency notification, wake-up, flight tracking, voice access to email, customer relationship management, prescription refilling, audio news magazines, voice dialing, real-estate information and national
directory assistance In telecommunications, directory assistance or directory inquiries is a phone service used to find out a specific telephone number and/or address of a residence, business, or government entity. Technology Directory assistance systems incorporate ...
applications. VoiceXML has tags that instruct the
voice browser A voice browser is a software application that presents an interactive voice user interface to the user in a manner analogous to the functioning of a web browser interpreting Hypertext Markup Language (HTML). Dialog documents interpreted by voic ...
to provide
speech synthesis Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal langua ...
, automatic
speech recognition Speech recognition is an interdisciplinary subfield of computer science and computational linguistics that develops methodologies and technologies that enable the recognition and translation of spoken language into text by computers with the ...
, dialog management, and audio playback. The following is an example of a VoiceXML document:
Hello world!
When interpreted by a VoiceXML interpreter this will output "Hello world" with synthesized speech. Typically,
HTTP The Hypertext Transfer Protocol (HTTP) is an application layer protocol in the Internet protocol suite model for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide We ...
is used as the transport protocol for fetching VoiceXML pages. Some applications may use static VoiceXML pages, while others rely on dynamic VoiceXML page generation using an
application server An application server is a server that hosts applications or software that delivers a business application through a communication protocol. An application server framework is a service layer model. It includes software components available to a ...
like Tomcat, Weblogic,
IIS IIS may refer to: Organizations * Indian Information Service, of the Government of India * Institute of Information Scientists, a professional association now merged into the Chartered Institute of Library and Information Professionals, UK * Inst ...
, or
WebSphere IBM WebSphere refers to a brand of proprietary computer software products in the genre of enterprise software known as "application and integration middleware". These software products are used by end-users to create and integrate applications ...
. Historically, VoiceXML platform vendors have implemented the standard in different ways, and added proprietary features. But the VoiceXML 2.0 standard, adopted as a W3C Recommendation on 16 March 2004, clarified most areas of difference. The VoiceXML Forum, an industry group promoting the use of the standard, provides a
conformance testing Conformance testing — an element of conformity assessment, and also known as compliance testing, or type testing — is testing or other activities that determine whether a process, product, or service complies with the requirements of a specifi ...
process that certifies vendors' implementations as conformant.


History

AT&T Corporation AT&T Corporation, originally the American Telephone and Telegraph Company, is the subsidiary of AT&T Inc. that provides voice, video, data, and Internet telecommunications and professional services to businesses, consumers, and government agen ...
, IBM,
Lucent Lucent Technologies, Inc. was an American multinational telecommunications equipment company headquartered in Murray Hill, New Jersey. It was established on September 30, 1996, through the divestiture of the former AT&T Technologies business u ...
, and
Motorola Motorola, Inc. () was an American multinational telecommunications company based in Schaumburg, Illinois, United States. After having lost $4.3 billion from 2007 to 2009, the company split into two independent public companies, Motorola ...
formed the VoiceXML Forum in March 1999, in order to develop a standard markup language for specifying voice dialogs. By September 1999 the Forum released VoiceXML 0.9 for member comment, and in March 2000 they published VoiceXML 1.0. Soon afterwards, the Forum turned over the control of the standard to the W3C. The W3C produced several intermediate versions of VoiceXML 2.0, which reached the final "Recommendation" stage in March 2004. VoiceXML 2.1 added a relatively small set of additional features to VoiceXML 2.0, based on feedback from implementations of the 2.0 standard. It is backward compatible with VoiceXML 2.0 and reached W3C Recommendation status in June 2007.


Future versions of the standard

VoiceXML 3.0 will be the next major release of VoiceXML, with new major features. It includes a new XML statechart description language called
SCXML SCXML stands for State Chart XML: State Machine Notation for Control Abstraction. It is an XML-based markup language that provides a generic state-machine-based execution environment based on Harel statecharts. SCXML is able to describe compl ...
.


Related standards

The W3C's Speech Interface Framework also defines these other standards closely associated with VoiceXML.


SRGS and SISR

The
Speech Recognition Grammar Specification Speech Recognition Grammar Specification (SRGS) is a W3C standard for how ''speech recognition grammars'' are specified. A speech recognition grammar is a set of word patterns, and tells a speech recognition system what to expect a human to say. ...
(SRGS) is used to tell the speech recognizer what sentence patterns it should expect to hear: these patterns are called grammars. Once the speech recognizer determines the most likely sentence it heard, it needs to extract the semantic meaning from that sentence and return it to the VoiceXML interpreter. This semantic interpretation is specified via the
Semantic Interpretation for Speech Recognition Semantic Interpretation for Speech Recognition (SISR) defines the syntax and semantics of annotations to grammar rules in the Speech Recognition Grammar Specification (SRGS). Since 5 April 2007, it is a World Wide Web Consortium recommendation. ...
(SISR) standard. SISR is used inside SRGS to specify the semantic results associated with the grammars, i.e., the set of ECMAScript assignments that create the semantic structure returned by the speech recognizer.


SSML

The
Speech Synthesis Markup Language Speech Synthesis Markup Language (SSML) is an XML-based markup language for speech synthesis applications. It is a recommendation of the W3C's Voice Browser Working Group. SSML is often embedded in VoiceXML scripts to drive interactive telephon ...
(SSML) is used to decorate textual prompts with information on how best to render them in synthetic speech, for example which speech synthesizer voice to use or when to speak louder or softer.


PLS

The
Pronunciation Lexicon Specification The Pronunciation Lexicon Specification (PLS) is a W3C Recommendation, which is designed to enable interoperable specification of pronunciation information for both speech recognition and speech synthesis engines within voice browsing applicati ...
(PLS) is used to define how words are pronounced. The generated pronunciation information is meant to be used by both speech recognizers and speech synthesizers in voice browsing applications.


CCXML

The
Call Control eXtensible Markup Language Call Control eXtensible Markup Language (CCXML) is an XML standard designed to provide asynchronous event-based telephony support to VoiceXML. Its current status is a W3C recommendation, adopted May 10, 2011. Whereas VoiceXML is designed to provi ...
(CCXML) is a complementary W3C standard. A CCXML interpreter is used on some VoiceXML platforms to handle the initial call setup between the caller and the voice browser, and to provide telephony services like call transfer and disconnect to the voice browser. CCXML can also be used in non-VoiceXML contexts.


MSML, MSCML, MediaCTRL

In media server applications, it is often necessary for several call legs to interact with each other, for example in a multi-party conference. Some deficiencies were identified in VoiceXML for this application and so companies designed specific scripting languages to deal with this environment. The Media Server Markup Language (MSML) was Convedia's solution, and Media Server Control Markup Language (MSCML) was Snowshore's solution. Snowshore is now owned by Dialogic and Convedia is now owned by Radisys. These languages also contain 'hooks' so that external scripts (like VoiceXML) can run on call legs where IVR functionality is required. There was an IETF working group called ''mediactrl'' ("media control") that was working on a successor for these scripting systems, which it is hoped will progress to an open and widely adopted standard. The mediactrl working group concluded in 2013.


See also

*
ECMAScript ECMAScript (; ES) is a JavaScript standard intended to ensure the interoperability of web pages across different browsers. It is standardized by Ecma International in the documenECMA-262 ECMAScript is commonly used for client-side scripti ...
 – the scripting language used in VoiceXML * OpenVXI – an open source VoiceXML interpreter library *
SCXML SCXML stands for State Chart XML: State Machine Notation for Control Abstraction. It is an XML-based markup language that provides a generic state-machine-based execution environment based on Harel statecharts. SCXML is able to describe compl ...
 – State Chart XML


References


External links


W3C's Voice Browser Working Group
Official VoiceXML Standards
VoiceXML Forum
VoiceXML Trademark Holder *
VoiceXML tutorials
{{DEFAULTSORT:Voicexml World Wide Web Consortium standards XML-based standards Markup languages Speech synthesis XML-based programming languages VoIP protocols 2000 software