A phonemic orthography is an orthography (system for writing a language) in which the graphemes (written symbols) correspond to the phonemes (significant spoken sounds) of the language. Natural languages rarely have perfectly phonemic orthographies; a high degree of grapheme-phoneme correspondence can be expected in orthographies based on alphabetic writing systems, but they differ in how complete this correspondence is.
English orthography, for example, is alphabetic but highly nonphonemic; it was once mostly phonemic during the Middle English stage, when the modern spellings originated, but spoken English changed rapidly while the orthography was much more stable, resulting in the modern nonphonemic situation. However, because of their relatively recent modernizations compared to English, the Serbian/Croatian/Bosnian/Montenegrin, Romanian, Italian, Turkish, Spanish, Finnish, Czech, Latvian, Esperanto, Korean and Swahili orthographic systems come much closer to being consistent phonemic representations.
In less formal terms, a language with a highly phonemic orthography may be described as having regular spelling. Another terminology is that of deep and shallow orthographies, in which the depth of an orthography is the degree to which it diverges from being truly phonemic. The concept can also be applied to nonalphabetic writing systems like syllabaries.
Ideal phonemic orthography
In an ideal phonemic orthography, there would be a complete one-to-one correspondence (bijection) between the graphemes (letters) and the phonemes of the language, and each phoneme would invariably be represented by its corresponding grapheme. So the spelling of a word would unambiguously and transparently indicate its pronunciation, and conversely, a speaker knowing the pronunciation of a word would be able to infer its spelling without any doubt. That ideal situation is rare but exists in a few languages.
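The bijection condition described above can be checked mechanically. The following is a minimal sketch, assuming the orthography is given as a list of (grapheme, phoneme) pairs; the function name `correspondence_report` and both toy inventories are invented for illustration and do not describe any real language in full.

```python
from collections import defaultdict


def correspondence_report(pairs):
    """Check whether grapheme-phoneme pairs form a one-to-one correspondence."""
    by_grapheme = defaultdict(set)
    by_phoneme = defaultdict(set)
    for grapheme, phoneme in pairs:
        by_grapheme[grapheme].add(phoneme)
        by_phoneme[phoneme].add(grapheme)
    # A grapheme with several readings makes pronunciation ambiguous.
    ambiguous_graphemes = {g: sorted(p) for g, p in by_grapheme.items() if len(p) > 1}
    # A phoneme with several spellings makes spelling ambiguous.
    ambiguous_phonemes = {p: sorted(g) for p, g in by_phoneme.items() if len(g) > 1}
    return {
        "bijective": not ambiguous_graphemes and not ambiguous_phonemes,
        "ambiguous_graphemes": ambiguous_graphemes,
        "ambiguous_phonemes": ambiguous_phonemes,
    }


# Toy inventories (hypothetical, for illustration only).
ideal = [("a", "a"), ("b", "b"), ("š", "ʃ")]
two_spellings = [("u", "u"), ("ó", "u"), ("a", "a")]

print(correspondence_report(ideal)["bijective"])          # True
print(correspondence_report(two_spellings)["bijective"])  # False
```

The report separates the two directions because they fail independently: a phoneme with two spellings harms spelling prediction, while a grapheme with two readings harms pronunciation prediction.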
A disputed example of an ideally phonemic orthography is the Serbo-Croatian language. In each of its alphabets (Latin as well as Serbian Cyrillic), there are 30 graphemes, each uniquely corresponding to one of the phonemes. This seemingly perfect yet simple phonemic orthography was achieved in the 19th century: the Cyrillic alphabet first, in 1814, by Serbian linguist Vuk Karadžić, and the Latin alphabet in 1830 by Croatian linguist Ljudevit Gaj. However, neither Gaj's Latin alphabet nor Serbian Cyrillic distinguishes short from long vowels, or the non-tonic, rising, and falling tones that Serbo-Croatian has; only the plain short vowel letter is written. Tones and vowel lengths may optionally be written as (in Latin) ⟨e⟩, ⟨ē⟩, ⟨è⟩, ⟨é⟩, ⟨ȅ⟩, and ⟨ȇ⟩, especially in dictionaries.
Another such ideal phonemic orthography is native to Esperanto, which follows the principle proclaimed by the language's creator, L. L. Zamenhof: “one letter, one sound”.
There are two distinct types of deviation from this phonemic ideal. In the first case, the exact one-to-one correspondence may be lost (for example, some phoneme may be represented by a digraph instead of a single letter), but the "regularity" is retained: there is still an algorithm (but a more complex one) for predicting the spelling from the pronunciation and vice versa. In the second case, true irregularity is introduced, as certain words come to be spelled and pronounced according to different rules from others, and prediction of spelling from pronunciation and vice versa is no longer possible. Common cases of both types of deviation from the ideal are discussed in the following section.
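The first, "regular" kind of deviation can be illustrated with a greedy longest-match transcriber: the correspondence is no longer letter-by-letter, but a fixed procedure still recovers the pronunciation. This is only a sketch; the function `to_phonemes` and the table `TOY_TABLE` are hypothetical and are not the rules of any real orthography.

```python
def to_phonemes(word, table):
    """Transcribe `word` by greedy longest-match lookup in `table`."""
    longest = max(len(grapheme) for grapheme in table)
    phonemes = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, so that a digraph or trigraph
        # wins over the single letters it is made of.
        for size in range(min(longest, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in table:
                phonemes.append(table[chunk])
                i += size
                break
        else:
            raise ValueError(f"no rule for {word[i]!r} at position {i}")
    return phonemes


# Hypothetical rule table mixing single letters, a digraph and a trigraph.
TOY_TABLE = {"sch": "ʃ", "ch": "x", "s": "s", "c": "k", "h": "h", "a": "a", "l": "l"}

print(to_phonemes("schal", TOY_TABLE))  # ['ʃ', 'a', 'l']
print(to_phonemes("lach", TOY_TABLE))   # ['l', 'a', 'x']
```

Greedy matching cannot tell a true multigraph from two adjacent letters that merely look like one, so a procedure of this kind only works when the orthography never uses the same letter sequence in both ways.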
Deviations from phonemic orthography
Some ways in which orthographies may deviate from the ideal of one-to-one grapheme-phoneme correspondence are listed below. The first list contains deviations that tend only to make the relation between spelling and pronunciation more complex, without affecting its predictability (see above paragraph).
Case 1: Regular
''Pronunciation and spelling still correspond in a predictable way''
*A phoneme may be represented by a sequence of letters, called a multigraph, rather than by a single letter (as in the case of the digraph ''ch'' in French and the trigraph ''sch'' in German). That only retains predictability if the multigraph cannot be broken down into smaller units. Some languages use diacritics to distinguish between a digraph and a sequence of individual letters, and others require knowledge of the language to distinguish them; compare ''goatherd'' and ''loather'' in English.
Examples:
''sch'' versus ''s-ch'' in Romansch
''ng'' versus ''n'' + ''g'' in Welsh
''ch'' versus ''çh'' in Manx Gaelic: this is a slightly different case where the same digraph is used for two different single phonemes.
''ai'' versus ''aï'' in French
This is often due to the use of an alphabet that was originally used for a different language (the Latin alphabet in these examples) and so does not have single letters available for all the phonemes used in the current language (although some orthographies use devices such as diacritics to increase the number of available letters).
*Sometimes, conversely, a single letter may represent a sequence of more than one phoneme (as ''x'' can represent the sequence /ks/ in English and other languages).
*Sometimes, the rules of correspondence are more complex and depend on adjacent letters, often as a result of historical sound changes (as with the rules for the pronunciation of ''ca'' and ''ci'' in Italian and the silent ''e'' in English).
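A context-dependent rule of the kind mentioned in the last item can still be written down as a deterministic procedure. The sketch below gives a deliberately simplified reading rule for Italian ''c''; it ignores geminates, ''sc'', stress and other details, and the function name `read_c` is an illustrative assumption.

```python
FRONT_VOWELS = set("ei")
BACK_VOWELS = set("aou")


def read_c(word, i):
    """Return (phoneme, letters_consumed) for the letter 'c' at word[i].

    Simplified: 'c' is /k/ by default, /tʃ/ before e or i; 'ch' is /k/;
    in 'ci' + a/o/u the 'i' only marks the soft reading and is consumed.
    """
    nxt = word[i + 1:i + 2]    # next letter, or "" at the end of the word
    after = word[i + 2:i + 3]  # letter after that, or ""
    if nxt == "h":
        return "k", 2
    if nxt == "i" and after in BACK_VOWELS:
        return "tʃ", 2
    if nxt in FRONT_VOWELS:
        return "tʃ", 1
    return "k", 1


for example in ("casa", "cena", "chi", "ciao"):
    print(example, read_c(example, 0))
```

The rule looks ahead by at most two letters, which is why the spelling remains predictable even though no single letter maps to a single phoneme.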
Case 2: Irregular
''Pronunciation and spelling do not always correspond in a predictable way''
* Sometimes, different letters correspond to the same phoneme (for instance ''u'' and ''ó'' in Polish are both pronounced as the phoneme /u/). That is often for historical reasons (the Polish letters originally stood for different phonemes, which later merged phonologically). That affects the predictability of spelling from pronunciation but not necessarily vice versa. Another example is found in Modern Greek, whose phoneme /i/ can be written in six different ways: ι, η, υ, ει, οι and υι.
* Conversely, a letter or group of letters can correspond to different phonemes in different contexts. For example, ''th'' in English can be pronounced as /ð/ (as in ''this'') or /θ/ (as in ''thin''), as well as the sequence /th/ (as in ''goatherd'').
*Spelling may otherwise represent a historical pronunciation; orthography does not necessarily keep up with sound changes in the spoken language. For example, both the ''k'' and the digraph ''gh'' of English ''knight'' were once pronounced (the latter is still pronounced in some Scots varieties), but after the loss of their sounds, they no longer represent the word's phonemic structure or its pronunciation.
*Spelling may represent the pronunciation of a different dialect from the one being considered.
*Spellings of loanwords often adhere to or are influenced by the orthography of the source language (as with the English words ''ballet'' and ''fajita'', from French and Spanish respectively). With some loanwords, though, regularity is retained either by
** nativizing the pronunciation to match the spelling (as with the Russian word шофёр, from French ''chauffeur'' but pronounced in accordance with the normal rules of Russian vowel reduction; see also spelling pronunciation) or by
** nativizing the spelling (for example, ''football'' is spelt ''fútbol'' in Spanish and ''futebol'' in Portuguese).
*Spelling may reflect a folk etymology (as in the English words ''hiccough'' and ''island'', so spelt because of an imagined connection with the words ''cough'' and ''isle''), or distant etymology (as in the English word ''debt'' in which the silent ''b'' was added under the influence of Latin).
* Spelling may reflect morphophonemic structure rather than the purely phonemic (see next section) although it is often also a reflection of historical pronunciation.
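The loss of predictability in the first irregular case (several letters for one phoneme) can be made concrete by enumerating every spelling that is consistent with a given pronunciation. This is a minimal sketch: the table `SPELLINGS` covers only the four phonemes needed for one Polish example, uses plain ASCII symbols as phoneme labels, and the function name `candidate_spellings` is hypothetical.

```python
from itertools import product

# Each phoneme maps to every grapheme that can spell it.
# Polish /u/ has two spellings, as noted in the list above.
SPELLINGS = {"g": ["g"], "u": ["u", "ó"], "r": ["r"], "a": ["a"]}


def candidate_spellings(phonemes, spellings=SPELLINGS):
    """List all spellings compatible with a sequence of phonemes."""
    options = [spellings[p] for p in phonemes]
    return ["".join(choice) for choice in product(*options)]


# The word for "mountain" is pronounced with /u/ but spelt with ó.
print(candidate_spellings(["g", "u", "r", "a"]))  # ['gura', 'góra']
```

More than one candidate means the correct spelling has to be learned word by word, whereas reading either candidate aloud still gives a single pronunciation.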
Most orthographies do not reflect the changes in pronunciation known as sandhi in which pronunciation is affected by adjacent sounds in neighboring words (written Sanskrit and other Indian languages, however, reflect such changes). A language may also use different sets of symbols or different rules for distinct sets of vocabulary items such as the Japanese hiragana and katakana syllabaries (and the different treatment in English orthography of words derived from Latin and Greek).
Morphophonemic features
Alphabetic orthographies often have features that are morphophonemic rather than purely phonemic. This means that the spelling reflects to some extent the underlying morphological structure of the words, not only their pronunciation. Hence different forms of a morpheme (minimum meaningful unit of language) are often spelt identically or similarly in spite of differences in their pronunciation. That is often for historical reasons; the morphophonemic spelling reflects a previous pronunciation from before historical sound changes that caused the variation in pronunciation of a given morpheme. Such spellings can assist in the recognition of words when reading.
Some examples of morphophonemic features in orthography are described below.
*The English plural morpheme is written ''-s'' regardless of whether it is pronounced as /s/ or /z/, e.g. ''cats and dogs'', not ''cats and dogz''. This is because the /s/ and /z/ sounds are forms of the same underlying morphophoneme, automatically pronounced differently depending on its environment. (However, when this morpheme takes the form /ɪz/, the addition of the vowel ''is'' reflected in the spelling: ''churches'', ''masses''.)
*Similarly, the English past tense morpheme is written ''-ed'' regardless of whether it is pronounced as /d/, /t/ or /ɪd/.
*Many English words retain spellings that reflect their etymology and morphology rather than their present-day pronunciation. For example, ''sign'' and ''signature'' include the spelling ⟨sign⟩, which means the same but is pronounced differently in the two words. Other examples are ''science'' vs. ''conscience'', ''prejudice'' vs. ''prequel'', ''nation'' vs. ''nationalism'', and ''special'' vs. ''species''.
*Phonological assimilation is often not reflected in spelling even in otherwise phonemic orthographies such as Spanish, in which ''obtener'' "obtain" and ''optimista'' "optimist" are written with ''b'' and ''p'', but are commonly neutralized with regard to voicing and pronounced in various ways, such as both [β] in neutral style or both [p] in emphatic pronunciation. On the other hand, Serbo-Croatian (Serbian, Croatian, Bosnian and Montenegrin) spelling reflects assimilation so one writes ''Србија/Srbija'' "Serbia" but ''српски/srpski'' "Serbian".
*The final-obstruent devoicing that occurs in many languages (such as German, Polish and Russian) is not normally reflected in the spelling. For example, in German, ''Bad'' "bath" is spelt with a final ⟨d⟩ even though it is pronounced /t/, thus corresponding to other morphologically related forms such as the verb ''baden'' (bathe) in which the ''d'' is pronounced /d/. (Compare ''Rat'', ''raten'' ("advice", "advise") in which the ''t'' is pronounced /t/ in both positions.)
Turkish orthography, however, is more strictly phonemic: for example, the imperative of ''eder'' "does" is spelled ''et'', as it is pronounced (and the same as the word for "meat"), not ''*ed'', as it would be if German spelling were used.
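Two of the features listed above share one idea: the spelling records a single underlying form, and the pronunciation follows from it by an automatic rule. The sketch below models the English plural (/s/, /z/ or /ɪz/ depending on the preceding sound) and word-final devoicing of the German kind. The phoneme sets are simplified and incomplete, and the names `plural_allomorph` and `devoice_final` are hypothetical.

```python
# Simplified, incomplete phoneme classes (assumptions for illustration).
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ"}
DEVOICED = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s"}


def plural_allomorph(final_phoneme):
    """Pick the pronunciation of the English plural written -s / -es."""
    if final_phoneme in SIBILANTS:
        return "ɪz"  # churches, masses: the extra vowel is spelt
    if final_phoneme in VOICELESS:
        return "s"   # cats
    return "z"       # dogs


def devoice_final(phonemes):
    """Devoice a word-final voiced obstruent; leave other words unchanged."""
    if phonemes and phonemes[-1] in DEVOICED:
        return phonemes[:-1] + [DEVOICED[phonemes[-1]]]
    return list(phonemes)


print(plural_allomorph("t"), plural_allomorph("g"), plural_allomorph("tʃ"))
print(devoice_final(["b", "aː", "d"]))            # Bad: final d becomes t
print(devoice_final(["b", "aː", "d", "ə", "n"]))  # baden: d is not final
```

Because both rules are automatic, a reader who knows them recovers the pronunciation from the morphophonemic spelling, while a writer has to know the underlying form rather than just the sound.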
Korean ''hangul'' has changed over the centuries from a highly phonemic to a largely morphophonemic orthography. Japanese kana are almost completely phonemic but have a few morphophonemic aspects, notably in the use of ぢ ''di'' and づ ''du'' (rather than じ ''ji'' and ず ''zu'', their pronunciation in standard Tokyo dialect), when the character is a voicing of an underlying ち or つ. That is from the rendaku sound change combined with the yotsugana merger of formerly different morae. The Russian orthography is also mostly morphophonemic, because it does not reflect vowel reduction, consonant assimilation and final-obstruent devoicing. Also, some consonant combinations have silent consonants.
Defective orthographies
A defective orthography is one that is not capable of representing all the phonemes or phonemic distinctions in a language. An example of such a deficiency in English orthography is the lack of distinction between the voiced and voiceless "th" phonemes (/ð/ and /θ/, respectively), occurring in words like ''this'' (voiced) and ''thin'' (voiceless), with both written ⟨th⟩.
Comparison between languages
Languages whose current orthographies have a high grapheme-to-phoneme and phoneme-to-grapheme correspondence (excluding exceptions due to loan words and assimilation) include:
* Afrikaans
* Kurdish
* Maltese
* Estonian (apart from palatalization or long and "over-long" phoneme length distinction)
* Finnish
* Albanian
* Georgian
* Hindi (apart from schwa deletion)
* Sanskrit
* Kannada
* Telugu
* Malayalam
* Dhivehi
* Turkish (apart from ''ğ'' and various palatal and vowel allophones)
* Serbo-Croatian (Serbian, Croatian, Bosnian and Montenegrin; written in either Cyrillic or Latin script)
* Slovenian
* Bulgarian
* Macedonian (if the apostrophe denoting schwa is counted, though slight inconsistencies may be found)
* Eastern Armenian (apart from ''o'', ''v'')
* Basque (apart from palatalized ''l'', ''n'')
* Haitian Creole
* Spanish (apart from ''h'', ''x'', ''b''/''v'', and sometimes ''k'', ''c'', ''g'', ''j'', ''z'')
* Czech (apart from ''ě'', ''ů'', ''y'', ''ý'')
* Polish (apart from ''ó'', ''ch'', ''rz'' and nasal vowels ''ą'' and ''ę'')
* Romanian (apart from ''â'' or ''î''; see Î versus Â)
* Ukrainian (mainly phonemic with some other historical/morphological rules, as well as palatalization)
* Belarusian (phonemic for vowels but mostly morphophonemic for consonants except ''ў'' written phonetically)
* Swahili (missing aspirated consonants, which do not occur in all varieties and anyway are sparsely used)
* Mongolian (Cyrillic) (apart from letters representing multiple sounds depending on front or back vowels, the soft and hard sign, silent letters to distinguish /ŋ/ from /n/, and voiced versus voiceless consonants)
* Azerbaijani (apart from ''k'')
* Hungarian (apart from ''j'' and ''ly'')
* Oromo
Many otherwise phonemic orthographies are slightly
defective
Defective may refer to::
*Defective matrix, in algebra
*Defective verb, in linguistics
*Defective, or ''haser'', in Hebrew orthography, a spelling variant that does not include mater lectionis
*Something presenting an anomaly, such as a product de ...
:
Malay (incl.
Malaysian
Malaysian may refer to:
* Something from or related to Malaysia, a country in Southeast Asia
* Malaysian Malay, a dialect of Malay language spoken mainly in Malaysia
* Malaysian people, people who are identified with the country of Malaysia regard ...
and
Indonesian
Indonesian is anything of, from, or related to Indonesia, an archipelagic country in Southeast Asia. It may refer to:
* Indonesians, citizens of Indonesia
** Native Indonesians, diverse groups of local inhabitants of the archipelago
** Indonesian ...
),
Italian
,
Maltese,
Welsh
, and
Kazakh do not fully distinguish their vowels;
Lithuanian,
Latvian, and
Serbo-Croatian
do not distinguish
tone and vowel length (also additional vowels for Lithuanian and Latvian);
Somali does not distinguish vowel
phonation; and the graphemes ''b'' and ''v'' represent the same phoneme in all varieties of Spanish (except in Valencia), while in the Spanish of the Americas, /s/ can be represented by the graphemes ''s'', ''c'', or ''z''. Modern Indo-Aryan languages like
Hindi
,
Punjabi,
Gujarati
,
Maithili and several others feature
schwa deletion, where the implicit default vowel is suppressed without being explicitly marked as such. Others, like
Marathi
, do not have a high grapheme-to-phoneme correspondence for vowel lengths.
French, with its
silent letter
s and its heavy use of
nasal vowels and
elision, may seem to lack much correspondence between spelling and pronunciation, but its rules on pronunciation, though complex, are consistent and predictable with a fair degree of accuracy. The phoneme-to-letter correspondence, on the other hand, is often low and a sequence of sounds may have multiple ways of being spelt, often with different meanings.
Orthographies such as those of
German
,
Hungarian (mainly phonemic, except that ''ly'' and ''j'' represent the same sound; consonant and vowel length are not always marked accurately, and various spellings reflect etymology rather than pronunciation),
Portuguese
, and modern
Greek
(written with the
Greek alphabet
), as well as Korean
hangul
, are sometimes considered to be of intermediate depth (for example they include many morphophonemic features, as described above).
Similarly to French, it is much easier to infer the pronunciation of a German word from its spelling than vice versa. For example, for speakers who merge /eː/ and /ɛː/, the phoneme /eː/ may be spelt ''e'', ''ee'', ''eh'', ''ä'' or ''äh''.
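The asymmetry between reading and writing can be illustrated with a minimal Python sketch. It assumes a toy table containing only the five spellings of /eː/ listed above (for speakers who merge /eː/ and /ɛː/). The names `PAIRS` and `fan_out` are hypothetical, and the table is not a full description of German, where ''e'', for instance, has other values in other contexts.

```python
from collections import defaultdict

# Toy correspondence table (an assumption for illustration): only the five
# spellings of /eː/ named in the text, for speakers who merge /eː/ and /ɛː/.
PAIRS = [("e", "eː"), ("ee", "eː"), ("eh", "eː"), ("ä", "eː"), ("äh", "eː")]


def fan_out(pairs):
    """Map each left-hand item to the set of right-hand items paired with it."""
    table = defaultdict(set)
    for left, right in pairs:
        table[left].add(right)
    return dict(table)


# Reading direction: grapheme -> possible phonemes.
reading = fan_out(PAIRS)
# Writing direction: phoneme -> possible graphemes.
writing = fan_out((phoneme, grapheme) for grapheme, phoneme in PAIRS)

max_reading = max(len(options) for options in reading.values())
max_writing = max(len(options) for options in writing.values())

print("most options when reading:", max_reading)
print("most options when writing:", max_writing)
```

Within this toy table, each spelling leads to a single pronunciation, while the single phoneme fans out to five spellings, which is the sense in which writing is harder to predict than reading.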
English orthography
is highly non-phonemic. The irregularity of English spelling arises partly because the
Great Vowel Shift
occurred after the orthography was established; partly because English has acquired a large number of loanwords at different times, retaining their original spelling to varying degrees; and partly because the regularisation of spelling (moving away from the situation in which many different spellings were acceptable for the same word) happened arbitrarily over a period without any central plan. However, even English has general, albeit complex, rules that predict pronunciation from spelling, and several of these rules are successful most of the time; rules to predict spelling from pronunciation have a higher failure rate.
Most
constructed language
s such as
Esperanto and
Lojban
have mostly phonemic orthographies.
The
syllabary systems of
Japanese
(
hiragana
and
katakana
) are examples of almost perfectly shallow orthography – exceptions include the use of ぢ and づ (
discussed above) and the use of は, を, and へ to represent the sounds わ, お, and え, as relics of
historical kana usage
. There is also no indication of pitch accent, which results in homography of words like 箸 and 橋 (はし in hiragana), which are distinguished in speech.
Xavier Marjou uses an
artificial neural network
to rank 17 orthographies according to their level of
Orthographic depth
. Among the tested orthographies, Chinese and French, followed by English and Russian, are the most opaque for writing (the phoneme-to-grapheme direction), and English, followed by Dutch, is the most opaque for reading (the grapheme-to-phoneme direction). Esperanto, Arabic, Finnish, Korean, Serbo-Croatian and Turkish are very shallow both to read and to write; Italian is shallow to read and very shallow to write; Breton, German, Portuguese and Spanish are shallow to read and to write.
Realignment of orthography
With time,
pronunciations change and spellings become out of date, as has happened to English and
French. In order to maintain a phonemic orthography such a system would need periodic updating, as has been attempted by various
language regulators and proposed by other
spelling reform
ers.
Sometimes the pronunciation of a word changes to match its spelling; this is called a
spelling pronunciation
. This is most common with loanwords, but occasionally occurs in the case of established native words too.
In some English personal names and place names, the relationship between the spelling of the name and its pronunciation is so distant that associations between phonemes and graphemes cannot be readily identified. Moreover, in many other words, the pronunciation has evolved from a fixed spelling, so that the phonemes can be said to represent the graphemes rather than vice versa. In much technical jargon, the primary medium of communication is the written language rather than the spoken language, so the phonemes again represent the graphemes, and it is unimportant how the word is pronounced. Finally, the sounds that literate people perceive in a word are significantly influenced by the word's spelling.
Sometimes, countries have the written language undergo a
spelling reform
to realign the writing with the contemporary spoken language. These can range from simple spelling changes and word forms to switching the entire writing system itself, as when
Turkey
switched from the Arabic alphabet to a
Turkish alphabet
of Latin origin.
Phonetic transcription
Methods for phonetic transcription such as the
International Phonetic Alphabet
(IPA) aim to describe pronunciation in a standard form. They are often used to solve ambiguities in the spelling of written language. They may also be used to write languages with no previous written form. Systems like IPA can be used for phonemic representation or for showing more detailed phonetic information (see
Narrow vs. broad transcription).
Phonemic orthographies are different from phonetic transcription; whereas in a phonemic orthography,
allophone
s will usually be represented by the same grapheme, a purely phonetic script would demand that phonetically distinct allophones be distinguished. To take an example from American English: the /t/ sound in the words "table" and "cat" would, in a phonemic orthography, be written with the same character; however, a strictly phonetic script would make a distinction between the
aspirated "t" in "table", the
flap
in "butter", the
unaspirated
"t" in "stop" and the
glottalized
"t" in "cat" (not all these allophones exist in all English
dialect
s). In other words, the sound that most English speakers think of as /t/ is really a group of sounds, all pronounced slightly differently depending on where they occur in a word. A perfect phonemic orthography has one letter per group of sounds (phoneme), with different letters only where the sounds distinguish words (so "bed" is spelled differently from "bet").
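The relationship between a phonetic script and a phonemic orthography can be sketched in Python as a many-to-one mapping from phones to phonemes. The table `ALLOPHONE_TO_PHONEME`, the function `to_phonemic`, and the rough narrow transcriptions below are assumptions made for illustration, not an authoritative analysis of American English; in particular, the flap is treated here only as a variant of /t/.

```python
# Hypothetical allophone table for the /t/ example in the text.
# Symbols are illustrative; real transcriptions vary by dialect and analyst.
ALLOPHONE_TO_PHONEME = {
    "tʰ": "t",  # aspirated, as in "table"
    "ɾ": "t",   # flap, as in "butter" (simplified: treated only as /t/)
    "t": "t",   # unaspirated, as in "stop"
    "ʔt": "t",  # glottalized, as in "cat"
}


def to_phonemic(phones):
    """Collapse a list of phones into phonemes; unlisted phones pass through."""
    return [ALLOPHONE_TO_PHONEME.get(phone, phone) for phone in phones]


# Rough narrow transcriptions, segmented by hand (assumed for illustration).
narrow = {
    "table": ["tʰ", "eɪ", "b", "ə", "l"],
    "butter": ["b", "ʌ", "ɾ", "ɚ"],
    "stop": ["s", "t", "ɑ", "p"],
    "cat": ["k", "æ", "ʔt"],
    "bed": ["b", "ɛ", "d"],
    "bet": ["b", "ɛ", "t"],
}

phonemic = {word: to_phonemic(phones) for word, phones in narrow.items()}

for word, phonemes in phonemic.items():
    print(word, "".join(narrow[word]), "->", "".join(phonemes))
```

In the phonemic output, the four variants of /t/ collapse into one symbol, while "bed" and "bet" remain distinct because /d/ and /t/ distinguish words.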
A narrow phonetic transcription represents
phones
, the sounds humans are capable of producing, many of which will often be grouped together as a single phoneme in any given natural language, though the groupings vary across languages. English, for example, does not distinguish between aspirated and unaspirated consonants, but other languages, like
Korean
,
Bengali
and
Hindi
do.
The sounds of speech of all languages of the world can be written by a rather small universal phonetic alphabet. A standard for this is the
International Phonetic Alphabet
.
See also
*
Alphabetic principle
*
English-language spelling reform
*
Spelling
*
Morphophonology
*
Orthographic depth
*
Orthographic transcription