ARPABET

ARPABET (also spelled ARPAbet) is a set of phonetic transcription codes developed by the Advanced Research Projects Agency (ARPA) as part of its Speech Understanding Research project in the 1970s. It represents phonemes and allophones of General American English with distinct sequences of ASCII characters. Two systems were devised: one represents each segment with a single character (alternating upper- and lower-case letters), and the other uses one or two case-insensitive characters. The latter has been far more widely adopted.

ARPABET has been used in several speech synthesizers, including Computalker for the S-100 system, SAM for the Commodore 64, SAY for the Amiga, TextAssist for the PC, and Speakeasy from Intelligent Artefacts, which used the Votrax SC-01 speech synthesizer IC. It is also used in the CMU Pronouncing Dictionary. A revised version of ARPABET is used in the TIMIT corpus.


Symbols

Stress is indicated by a digit immediately following a vowel. Auxiliary symbols are identical in 1- and 2-letter codes. In 2-letter notation, segments are separated by a space.
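The conventions above (space-separated segments, case-insensitive codes of one or two letters, and an optional stress digit right after a vowel) can be sketched in a few lines of Python. This is only an illustrative parser under those stated assumptions: the function name `parse_arpabet` and the sample string are hypothetical, the code does not check that a symbol is a real ARPABET code or that a digit follows a vowel specifically, and symbol sets with longer codes would need a looser pattern.

```python
import re

# One or two ASCII letters, optionally followed by a single stress digit.
_SEGMENT = re.compile(r"([A-Za-z]{1,2})([0-9])?")


def parse_arpabet(transcription):
    """Split a 2-letter ARPABET string into (symbol, stress) pairs.

    `stress` is an int when a digit follows the symbol, otherwise None.
    Symbols are upper-cased because the 2-letter codes are case-insensitive.
    """
    segments = []
    for token in transcription.split():  # segments are separated by spaces
        match = _SEGMENT.fullmatch(token)
        if match is None:
            raise ValueError(f"not a 1- or 2-letter segment: {token!r}")
        symbol, digit = match.groups()
        stress = int(digit) if digit is not None else None
        segments.append((symbol.upper(), stress))
    return segments


if __name__ == "__main__":
    # Illustrative input only; any space-separated 2-letter string works.
    print(parse_arpabet("HH AH0 L OW1"))
```

Returning the stress as a separate field, rather than leaving the digit attached to the symbol, makes it easier to compare segments while ignoring stress.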


TIMIT

In TIMIT, the following symbols are used in addition to the ones listed above:


See also

* Comparison of ASCII encodings of the International Phonetic Alphabet
* SAMPA, language-specific
* X-SAMPA, encoding the whole International Phonetic Alphabet
* Pronunciation respelling for English




External links


The CMU Pronouncing Dictionary