Speech Recognition Grammar Specification

Speech Recognition Grammar Specification (SRGS) is a W3C standard for how speech recognition grammars are specified. A speech recognition grammar is a set of word patterns that tells a speech recognition system what to expect a human to say. For instance, if you call an auto-attendant application, it will prompt you for the name of a person (with the expectation that your call will be transferred to that person's phone). It will then start up a speech recognizer, giving it a speech recognition grammar. This grammar contains the names of the people in the auto attendant's directory and a collection of sentence patterns that are the typical responses from callers to the prompt.

SRGS specifies two alternative but equivalent syntaxes, one based on XML and one using augmented BNF (ABNF) format. In practice, the XML syntax is used more frequently. Both the ABNF and XML forms have the expressive power of a context-free grammar. A grammar processor that does not support recursive grammars has the expressive power of a finite-state machine or regular expression language.

If the speech recognizer returned just a string containing the actual words spoken by the user, the voice application would have to do the tedious job of extracting the semantic meaning from those words. For this reason, SRGS grammars can be decorated with tag elements which, when executed, build up the semantic result. SRGS does not specify the contents of the tag elements: this is done in a companion W3C standard, Semantic Interpretation for Speech Recognition (SISR). SISR is based on ECMAScript, and ECMAScript statements inside the SRGS tags build up an ECMAScript semantic result object that is easy for the voice application to process.

Both SRGS and SISR are W3C Recommendations, the final stage of the W3C standards track. The W3C VoiceXML standard, which defines how voice dialogs are specified, depends heavily on SRGS and SISR.
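The idea of tags that build up a semantic result can be illustrated with a small Python sketch. This is not SISR: real tags contain ECMAScript that the grammar processor executes, whereas here a plain Python dictionary attached to each alternative stands in for the tag. The carrier phrases, the `Item` class, the `interpret` function, and the extension numbers are all hypothetical, chosen only to mirror the auto-attendant scenario described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    """One grammar alternative plus the data its (simulated) tag contributes."""
    words: str
    tag: dict


# Hypothetical sentence patterns callers might use before a name.
CARRIER_PHRASES = ["", "may i speak to", "please connect me to"]

# Hypothetical directory; the extension numbers are made up for illustration.
PEOPLE = [
    Item("michel tremblay", {"extension": "1001"}),
    Item("andré roy", {"extension": "1002"}),
    Item("jose", {"extension": "1003"}),
]


def interpret(utterance: str) -> Optional[dict]:
    """Return a semantic result for an in-grammar utterance, else None."""
    text = " ".join(utterance.lower().split())  # normalize case and spacing
    for carrier in CARRIER_PHRASES:
        for person in PEOPLE:
            if text == f"{carrier} {person.words}".strip():
                result = {"action": "transfer"}
                result.update(person.tag)  # the matched item's tag fills in the result
                return result
    return None
```

With this sketch the application receives a structured result such as `{"action": "transfer", "extension": "1003"}` for "May I speak to Jose" and never has to parse the recognized words itself, which is the division of labour the tag mechanism is meant to provide.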


Examples

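Here is an example of the augmented BNF form of SRGS, as it could be used in an auto attendant application:

    #ABNF 1.0 ISO-8859-1;

    // Default grammar language is US English
    language en-US;

    // Single language attachment to tokens
    // Note that "fr-CA" (Canadian French) is applied to only
    // the word "oui" because of precedence rules
    $yes = yes | oui!fr-CA;

    // Single language attachment to an expansion
    $people1 = (Michel Tremblay | André Roy)!fr-CA;

    // Handling language-specific pronunciations of the same word
    // A capable speech recognizer will listen for Mexican Spanish and
    // US English pronunciations.
    $people2 = Jose!en-US | Jose!es-MX;

    /**
     * Multi-lingual input possible
     * @example may I speak to André Roy
     * @example may I speak to Jose
     */
    public $request = may I speak to ($people1 | $people2);

Here is the same SRGS example, using the XML form:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!-- Default grammar language is US English -->
    <grammar xmlns="http://www.w3.org/2001/06/grammar"
             xml:lang="en-US" version="1.0">

      <!-- Single language attachment to tokens -->
      <rule id="yes">
        <one-of>
          <item>yes</item>
          <item xml:lang="fr-CA">oui</item>
        </one-of>
      </rule>

      <!-- Single language attachment to an expansion -->
      <rule id="people1">
        <one-of xml:lang="fr-CA">
          <item>Michel Tremblay</item>
          <item>André Roy</item>
        </one-of>
      </rule>

      <!-- Handling language-specific pronunciations of the same word -->
      <rule id="people2">
        <one-of>
          <item xml:lang="en-US">Jose</item>
          <item xml:lang="es-MX">Jose</item>
        </one-of>
      </rule>

      <!-- Multi-lingual input possible -->
      <rule id="request" scope="public">
        <example>may I speak to André Roy</example>
        <example>may I speak to Jose</example>
        may I speak to
        <one-of>
          <item><ruleref uri="#people1"/></item>
          <item><ruleref uri="#people2"/></item>
        </one-of>
      </rule>
    </grammar>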
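As a rough illustration of why a non-recursive grammar is no more powerful than a regular expression, the following Python sketch loads a cut-down XML grammar with the standard library's `xml.etree.ElementTree`, enumerates every sentence the public rule accepts, and joins them into a single regular expression. It is a toy under stated assumptions, not a conforming grammar processor: it handles only `rule`, `one-of`, `item`, and local `ruleref` elements, it ignores `xml:lang`, weights, and repeats, and it would loop forever on a recursive grammar. The helper names (`norm`, `join`, `sequence`, `expand`) are invented for this example.

```python
import re
import xml.etree.ElementTree as ET

NS = "{http://www.w3.org/2001/06/grammar}"  # SRGS XML namespace

GRAMMAR = """
<grammar xmlns="http://www.w3.org/2001/06/grammar" xml:lang="en-US" version="1.0">
  <rule id="people1">
    <one-of xml:lang="fr-CA">
      <item>Michel Tremblay</item>
      <item>André Roy</item>
    </one-of>
  </rule>
  <rule id="people2">
    <one-of>
      <item xml:lang="en-US">Jose</item>
      <item xml:lang="es-MX">Jose</item>
    </one-of>
  </rule>
  <rule id="request" scope="public">
    <example>may I speak to Jose</example>
    may I speak to
    <one-of>
      <item><ruleref uri="#people1"/></item>
      <item><ruleref uri="#people2"/></item>
    </one-of>
  </rule>
</grammar>
"""


def norm(text):
    """Collapse whitespace; None (no text) becomes the empty string."""
    return " ".join((text or "").split())


def join(left, right):
    """Join two word sequences with a single space, skipping empty parts."""
    return " ".join(part for part in (left, right) if part)


def sequence(elem, rules):
    """Expand a <rule> or <item>: its text, each child, and each child's tail, in order."""
    results = [norm(elem.text)]
    for child in elem:
        alternatives = expand(child, rules)
        results = [join(r, alt) for r in results for alt in alternatives]
        results = [join(r, norm(child.tail)) for r in results]
    return results


def expand(elem, rules):
    """Return every word sequence that one child element can match."""
    name = elem.tag.replace(NS, "")
    if name in ("example", "tag"):
        return [""]  # documentation and semantic tags match no words
    if name == "ruleref":
        return sequence(rules[elem.get("uri").lstrip("#")], rules)
    if name == "one-of":
        return [s for item in elem for s in sequence(item, rules)]
    return sequence(elem, rules)  # treat anything else as an <item>


root = ET.fromstring(GRAMMAR)
rules = {rule.get("id"): rule for rule in root.iter(NS + "rule")}
sentences = sorted(set(sequence(rules["request"], rules)))

# A finite list of sentences is trivially a regular language.
pattern = re.compile("|".join(re.escape(s) for s in sentences))
```

Because the two "Jose" items differ only in language, they collapse into one sentence here; a real recognizer would keep them apart, since they select different pronunciations.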


See also

* SISR
* VoiceXML
* Pronunciation Lexicon Specification (PLS)
* Natural Language Semantics Markup Language
* JSGF


External links


SRGS Specification (W3C Recommendation)

SISR Specification (W3C Recommendation)

VoiceXML Forum