The Pronunciation Lexicon Specification (PLS) is a W3C Recommendation designed to enable interoperable specification of pronunciation information for both speech recognition and speech synthesis engines within voice browsing applications. The language is intended to be easy for developers to use while supporting the accurate specification of pronunciation information for international use.
The language allows one or more pronunciations for a word or phrase to be specified using a standard pronunciation alphabet or, if necessary, vendor-specific alphabets. Pronunciations are grouped together into a PLS document, which may be referenced from other markup languages such as the Speech Recognition Grammar Specification (SRGS) and the Speech Synthesis Markup Language (SSML).
Usage
Here is an example PLS document:
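<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>judgment</grapheme>
    <grapheme>judgement</grapheme>
    <phoneme>ˈdʒʌdʒ.mənt</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>fiancé</grapheme>
    <grapheme>fiance</grapheme>
    <phoneme>fiˈɒns.eɪ</phoneme>
    <phoneme>ˌfiː.ɑːnˈseɪ</phoneme>
  </lexeme>
</lexicon>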
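This lexicon could be used to improve text-to-speech (TTS) output, as in the following SSML 1.0 document (the lexicon URI is a placeholder):
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.0"
      xmlns="http://www.w3.org/2001/10/synthesis"
      xml:lang="en-US">
  <lexicon uri="http://www.example.com/lexicon.xml"/>
  <p>In the judgement of my fiancé, Las Vegas is the best place for a honeymoon.
    I replied that I preferred Venice and didn't think the Venetian casino was an
    acceptable compromise.</p>
</speak>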
It could also be used to improve automatic speech recognition (ASR), as in the following SRGS 1.0 grammar (the lexicon URI and the rule name are placeholders):
<?xml version="1.0" encoding="UTF-8"?>
<grammar version="1.0"
      xmlns="http://www.w3.org/2001/06/grammar"
      xml:lang="en-US" root="movies" mode="voice">
  <lexicon uri="http://www.example.com/lexicon.xml"/>
  <rule id="movies" scope="public">
    <one-of>
      <item>Terminator 2: Judgment Day</item>
      <item>My Big Fat Obnoxious Fiance</item>
      <item>Pluto's Judgement Day</item>
    </one-of>
  </rule>
</grammar>
Common use cases
Multiple pronunciations for the same orthography
For automatic speech recognition (ASR) systems it is common to rely on multiple pronunciations of the same word or phrase in order to cope with variations of pronunciation within a language. In the Pronunciation Lexicon language, multiple pronunciations are represented by more than one <phoneme> (or <alias>) element within the same <lexeme> element.
In the following example the word "Newton" has two possible pronunciations.
<lexeme>
  <grapheme>Newton</grapheme>
  <phoneme>ˈnjuːtən</phoneme>
  <phoneme>ˈnuːtən</phoneme>
</lexeme>
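To illustrate how a consumer might read such entries, the following is a minimal sketch in Python using the standard library's ElementTree parser. The sample lexicon, the constant names and the function `pronunciations_by_grapheme` are assumptions made for this illustration; they are not part of PLS or of any particular speech engine.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace of PLS 1.0 documents, bound to a prefix for ElementTree queries.
NS = {"pls": "http://www.w3.org/2005/01/pronunciation-lexicon"}

# Hypothetical sample lexicon mirroring the "Newton" example.
SAMPLE_PLS = """\
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         alphabet="ipa" xml:lang="en">
  <lexeme>
    <grapheme>Newton</grapheme>
    <phoneme>ˈnjuːtən</phoneme>
    <phoneme>ˈnuːtən</phoneme>
  </lexeme>
</lexicon>
"""


def pronunciations_by_grapheme(pls_text):
    """Map each grapheme to every phoneme string listed in its lexeme."""
    root = ET.fromstring(pls_text)
    table = defaultdict(list)
    for lexeme in root.findall("pls:lexeme", NS):
        phonemes = [p.text for p in lexeme.findall("pls:phoneme", NS)]
        for grapheme in lexeme.findall("pls:grapheme", NS):
            table[grapheme.text].extend(phonemes)
    return dict(table)


if __name__ == "__main__":
    print(pronunciations_by_grapheme(SAMPLE_PLS))
```

A recognizer built along these lines could accept any of the listed pronunciations for "Newton", which is the behaviour the section describes.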
Multiple orthographies
In some situations there are alternative textual representations for the same word or phrase. This can arise for a number of reasons; see Section 4.5 of PLS for details. Because these are representations that have the same meaning (as opposed to homophones), it is recommended that they be represented using a single <lexeme> element that contains multiple <grapheme> elements.
Here are two simple examples of multiple orthographies: alternative spelling of an English word and multiple writings of a Japanese word.
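<!-- in an English lexicon -->
<lexeme>
  <grapheme>colour</grapheme>
  <grapheme>color</grapheme>
  <phoneme>ˈkʌlər</phoneme>
</lexeme>

<!-- in a Japanese lexicon -->
<lexeme>
  <grapheme>nihongo</grapheme>
  <grapheme>日本語</grapheme>
  <grapheme>にほんご</grapheme>
  <phoneme>ɲihoŋɡo</phoneme>
</lexeme>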
Homophones
Most languages have homophones, words with the same pronunciation but different meanings (and possibly different spellings), for instance "seed" and "cede". It is recommended that these be represented as different lexemes.
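<lexeme>
  <grapheme>cede</grapheme>
  <phoneme>siːd</phoneme>
</lexeme>
<lexeme>
  <grapheme>seed</grapheme>
  <phoneme>siːd</phoneme>
</lexeme>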
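Keeping homophones in separate lexemes makes them distinguishable from mere spelling variants, which share one lexeme. The sketch below, in Python with the standard ElementTree parser, groups spellings by phoneme string and reports only those phonemes that occur in more than one lexeme. The sample lexicon and the function name `homophones` are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace of PLS 1.0 documents, bound to a prefix for ElementTree queries.
NS = {"pls": "http://www.w3.org/2005/01/pronunciation-lexicon"}

# Hypothetical sample: two homophones plus one lexeme with spelling variants.
SAMPLE_PLS = """\
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         alphabet="ipa" xml:lang="en">
  <lexeme>
    <grapheme>cede</grapheme>
    <phoneme>siːd</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>seed</grapheme>
    <phoneme>siːd</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>colour</grapheme>
    <grapheme>color</grapheme>
    <phoneme>ˈkʌlər</phoneme>
  </lexeme>
</lexicon>
"""


def homophones(pls_text):
    """Return {phoneme: [graphemes of each lexeme]} for phonemes that
    appear in more than one lexeme."""
    root = ET.fromstring(pls_text)
    spellings = defaultdict(list)
    for lexeme in root.findall("pls:lexeme", NS):
        graphemes = [g.text for g in lexeme.findall("pls:grapheme", NS)]
        # A set, so a phoneme repeated inside one lexeme is counted once.
        for phoneme in {p.text for p in lexeme.findall("pls:phoneme", NS)}:
            spellings[phoneme].append(graphemes)
    return {ph: groups for ph, groups in spellings.items() if len(groups) > 1}


if __name__ == "__main__":
    print(homophones(SAMPLE_PLS))
```

In this sample, "cede" and "seed" are reported together, while "colour" and "color" are not, because they belong to a single lexeme.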
Homographs
Most languages have words with different meanings but the same spelling (and sometimes different pronunciations), called homographs. For example, in English the word bass (fish) and the word bass (in music) have identical spellings but different meanings and pronunciations. Although it is recommended that these words be represented using separate <lexeme> elements that are distinguished by different values of the role attribute (see Section 4.4 of PLS 1.0), a pronunciation lexicon author who does not want to distinguish between the two words can simply represent them as alternative pronunciations within the same <lexeme> element. In the latter case the TTS processor will not be able to determine when to apply the first or the second transcription.
In this example the pronunciations of the homograph "bass" are shown.
<lexeme>
  <grapheme>bass</grapheme>
  <phoneme>bæs</phoneme>
  <phoneme>beɪs</phoneme>
</lexeme>
Note that English contains numerous examples of noun-verb pairs that can be treated either as homographs or as alternative pronunciations, depending on author preference. Two examples are the noun/verb "refuse" and the noun/verb "address".
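<!-- "mypos" is an illustrative prefix for an author-defined
     part-of-speech namespace declared on the root element -->
<lexeme role="mypos:verb">
  <grapheme>refuse</grapheme>
  <phoneme>rɪˈfjuːz</phoneme>
</lexeme>
<lexeme role="mypos:noun">
  <grapheme>refuse</grapheme>
  <phoneme>ˈrɛfjuːs</phoneme>
</lexeme>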
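When homographs are kept in separate lexemes, a processor that knows the part of speech can pick the matching entry through the role attribute. The following Python sketch shows one way this selection might work. The `mypos` prefix, its namespace URI, the sample lexicon and the function `pronounce` are all hypothetical, and for brevity the role values are compared as plain strings rather than being resolved as namespace-qualified names.

```python
import xml.etree.ElementTree as ET

# Namespace of PLS 1.0 documents, bound to a prefix for ElementTree queries.
NS = {"pls": "http://www.w3.org/2005/01/pronunciation-lexicon"}

# Hypothetical sample; the "mypos" namespace is an assumption for illustration.
SAMPLE_PLS = """\
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         xmlns:mypos="http://www.example.com/my_pos_namespace"
         alphabet="ipa" xml:lang="en">
  <lexeme role="mypos:verb">
    <grapheme>refuse</grapheme>
    <phoneme>rɪˈfjuːz</phoneme>
  </lexeme>
  <lexeme role="mypos:noun">
    <grapheme>refuse</grapheme>
    <phoneme>ˈrɛfjuːs</phoneme>
  </lexeme>
</lexicon>
"""


def pronounce(pls_text, word, role):
    """Return the phonemes of the first lexeme that lists `word` as a
    grapheme and `role` among its space-separated role values."""
    root = ET.fromstring(pls_text)
    for lexeme in root.findall("pls:lexeme", NS):
        roles = lexeme.get("role", "").split()
        graphemes = [g.text for g in lexeme.findall("pls:grapheme", NS)]
        if word in graphemes and role in roles:
            return [p.text for p in lexeme.findall("pls:phoneme", NS)]
    return []  # no lexeme matches both the word and the role


if __name__ == "__main__":
    print(pronounce(SAMPLE_PLS, "refuse", "mypos:verb"))
    print(pronounce(SAMPLE_PLS, "refuse", "mypos:noun"))
```

If both pronunciations were instead placed in one lexeme without roles, a lookup like this would have no basis for choosing between them, which is the limitation noted above.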
Pronunciation by orthography
For some words and phrases, pronunciation can be expressed quickly and conveniently as a sequence of other orthographies. The developer is not required to have linguistic knowledge, but instead makes use of the pronunciations that are already expected to be available. To express pronunciations using other orthographies, the <alias> element may be used.
This feature is particularly useful for acronym expansion.
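<lexeme>
  <grapheme>W3C</grapheme>
  <alias>World Wide Web Consortium</alias>
</lexeme>
<lexeme>
  <grapheme>101</grapheme>
  <alias>one hundred and one</alias>
</lexeme>
<lexeme>
  <grapheme>Thailand</grapheme>
  <alias>tie land</alias>
</lexeme>
<lexeme>
  <grapheme>BBC 1</grapheme>
  <alias>be be sea one</alias>
</lexeme>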
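As a rough illustration of alias handling, the Python sketch below builds a table from graphemes to alias text and substitutes the alias for a token before it would be passed on for ordinary pronunciation. The sample lexicon and the functions `alias_table` and `spoken_form` are assumptions for this example. It matches whole tokens only and uses the first alias of each lexeme, which is simpler than what a real processor would need.

```python
import xml.etree.ElementTree as ET

# Namespace of PLS 1.0 documents, bound to a prefix for ElementTree queries.
NS = {"pls": "http://www.w3.org/2005/01/pronunciation-lexicon"}

# Hypothetical sample lexicon mirroring the alias examples.
SAMPLE_PLS = """\
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         alphabet="ipa" xml:lang="en">
  <lexeme>
    <grapheme>W3C</grapheme>
    <alias>World Wide Web Consortium</alias>
  </lexeme>
  <lexeme>
    <grapheme>Thailand</grapheme>
    <alias>tie land</alias>
  </lexeme>
  <lexeme>
    <grapheme>BBC 1</grapheme>
    <alias>be be sea one</alias>
  </lexeme>
</lexicon>
"""


def alias_table(pls_text):
    """Map each grapheme to the first alias given in its lexeme."""
    root = ET.fromstring(pls_text)
    table = {}
    for lexeme in root.findall("pls:lexeme", NS):
        alias = lexeme.find("pls:alias", NS)
        if alias is None:
            continue  # this lexeme uses phonemes only
        for grapheme in lexeme.findall("pls:grapheme", NS):
            table[grapheme.text] = alias.text
    return table


def spoken_form(token, table):
    """Return the alias for `token`, or the token itself if none exists."""
    return table.get(token, token)


if __name__ == "__main__":
    table = alias_table(SAMPLE_PLS)
    for token in ("W3C", "BBC 1", "Venice"):
        print(token, "->", spoken_form(token, table))
```

The substituted text is ordinary orthography, so the engine pronounces it with whatever pronunciations it already has for those words.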
Status and future
*PLS 1.0 reached the status of W3C Recommendation on 14 October 2008.
See also
* VoiceXML
* SRGS
* SSML
* SISR
References
PLS Specification (W3C Recommendation)
External links
PLS Specification (W3C Recommendation)
SRGS Specification (W3C Recommendation)
SSML Specification (W3C Recommendation)
VoiceXML Forum
France Telecom Orange Labs implementation of PLS 1.0 under the GNU General Public License version 3
SourceForge project for Java-based implementation of PLS 1.0