In linguistics, romanization is the conversion of text from a different writing system to the Roman (Latin) script, or a system for doing so. Methods of romanization include transliteration, for representing written text, and transcription, for representing the spoken word, and combinations of both. Transcription methods can be subdivided into ''phonemic transcription'', which records the phonemes or units of semantic meaning in speech, and stricter ''phonetic transcription'', which records speech sounds with precision.

Methods

There are many consistent or standardized romanization systems. They can be classified by their characteristics. A particular system's characteristics may make it better suited for various, sometimes contradictory applications, including document retrieval, linguistic analysis, easy readability, and faithful representation of pronunciation.
* Source, or donor language – A system may be tailored to romanize text from a particular language, or a series of languages, or for any language in a particular writing system. A language-specific system typically preserves language features like pronunciation, while the general one may be better for cataloguing international texts.
* Target, or receiver language – Most systems are intended for an audience that speaks or reads a particular language. (So-called ''international'' romanization systems for Cyrillic text are based on central-European alphabets like the Czech and Croatian alphabet.)
* Simplicity – Since the basic Latin alphabet has a smaller number of letters than many other writing systems, digraphs, diacritics, or special characters must be used to represent them all in Latin script. This affects the ease of creation, digital storage and transmission, reproduction, and reading of the romanized text.
* Reversibility – Whether or not the original can be restored from the converted text. Some reversible systems allow for an irreversible simplified version.

Transliteration

If the romanization attempts to transliterate the original script, the guiding principle is a one-to-one mapping of characters in the source language into the target script, with less emphasis on how the result sounds when pronounced according to the reader's language. For example, the Nihon-shiki romanization of Japanese allows the informed reader to reconstruct the original Japanese kana syllables with 100% accuracy, but requires additional knowledge for correct pronunciation.
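The one-to-one principle and the reversibility criterion can be illustrated with a small sketch. The two tables are hand-picked subsets chosen for this example, not complete romanization tables, and the function names are hypothetical. Nihon-shiki gives each of these kana its own spelling, whereas Hepburn writes じ and ぢ both as ''ji'' and ず and づ both as ''zu'', so only the first table can be inverted.

```python
# Hand-picked subset of kana for illustration only (not a full table).
NIHON_SHIKI = {"し": "si", "ち": "ti", "つ": "tu",
               "じ": "zi", "ぢ": "di", "ず": "zu", "づ": "du"}
HEPBURN = {"し": "shi", "ち": "chi", "つ": "tsu",
           "じ": "ji", "ぢ": "ji", "ず": "zu", "づ": "zu"}


def romanize(kana, table):
    """Replace each kana with its Latin spelling from the given table."""
    return "".join(table[ch] for ch in kana)


def invert(table):
    """Return the Latin-to-kana table, or None if two kana share a spelling."""
    inverse = {}
    for kana, latin in table.items():
        if latin in inverse:
            return None  # ambiguous: the original cannot be restored
        inverse[latin] = kana
    return inverse


print(romanize("つづ", NIHON_SHIKI))   # tudu
print(invert(HEPBURN) is None)         # True: じ/ぢ and ず/づ collide
```

The inverse table exists only when every source character has a distinct output, which is the property the text calls reversibility.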

Transcription

Phonemic

Most romanizations are intended to enable the casual reader who is unfamiliar with the original script to pronounce the source language reasonably accurately. Such romanizations follow the principle of phonemic transcription and attempt to render the significant sounds (phonemes) of the original as faithfully as possible in the target language. The popular Hepburn romanization of Japanese is an example of a transcriptive romanization designed for English speakers.

Phonetic

Phonetic conversion goes one step further and attempts to depict all phones in the source language, sacrificing legibility if necessary by using characters or conventions not found in the target script. In practice such a representation almost never tries to represent ''every'' possible allophone, especially those that occur naturally due to coarticulation effects, and instead limits itself to the most significant allophonic distinctions. The International Phonetic Alphabet is the most common system of phonetic transcription.

Compromise

For most language pairs, building a usable romanization involves a trade-off between the two extremes. Pure transcriptions are generally not possible, as the source language usually contains sounds and distinctions not found in the target language, but which must be shown for the romanized form to be comprehensible. Furthermore, due to diachronic and synchronic variance no written language represents any spoken language with perfect accuracy and the vocal interpretation of a script may vary by a great degree among languages. In modern times the chain of transcription is usually spoken foreign language, written foreign language, written native language, spoken (read) native language. Reducing the number of those processes, i.e. removing one or both steps of writing, usually leads to more accurate oral articulations. In general, outside a limited audience of scholars, romanizations tend to lean more towards transcription. As an example, consider the Japanese martial art 柔術: the Nihon-shiki romanization ''zyûzyutu'' may allow someone who knows Japanese to reconstruct the kana syllables じゅうじゅつ, but most native English speakers, or rather readers, would find it easier to guess the pronunciation from the Hepburn version, ''jūjutsu''.
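The 柔術 example can be reproduced with a short sketch. It assumes the kana spelling じゅうじゅつ, uses tables that cover only the kana in that word, and treats every "uu" as a long vowel, which is a deliberate simplification; the function name and parameters are hypothetical.

```python
# Minimal tables covering only the kana in じゅうじゅつ (illustrative subset).
NIHON_SHIKI = {"じゅ": "zyu", "う": "u", "つ": "tu"}
HEPBURN = {"じゅ": "ju", "う": "u", "つ": "tsu"}


def romanize(kana, table, long_u):
    """Greedy longest-match romanization with a simplified long-vowel rule."""
    pieces = []
    i = 0
    while i < len(kana):
        # Try the two-character digraph (e.g. じゅ) before a single kana.
        for size in (2, 1):
            chunk = kana[i:i + size]
            if chunk in table:
                pieces.append(table[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"no mapping for {kana[i]!r}")
    # Simplification: every "uu" is written as one long vowel.
    return "".join(pieces).replace("uu", long_u)


print(romanize("じゅうじゅつ", NIHON_SHIKI, "û"))  # zyûzyutu
print(romanize("じゅうじゅつ", HEPBURN, "ū"))      # jūjutsu
```

Both outputs come from the same kana input; only the table and the long-vowel mark differ, which is the trade-off between reconstructing the script and guiding an English-speaking reader's pronunciation.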

Romanization of specific writing systems

Arabic

The Arabic script is used to write Arabic, Persian, Urdu, Pashto and Sindhi as well as numerous other languages in the Muslim world, particularly African and Asian languages without alphabets of their own. Romanization standards include the following:

Arabic

* (1936): Adopted by the International Convention of Orientalist Scholars in Rome. It is the basis for the very influential Hans Wehr dictionary.
* BS 4280 (1968): Developed by the British Standards Institution
* SATTS (1970s): A one-for-one substitution system, a legacy from the Morse code era
* UNGEGN (1972)
* DIN 31635 (1982): Developed by the German Institute for Standardization
* ISO 233 (1984): Transliteration.
* Qalam (1985): A system that focuses upon preserving the spelling, rather than the pronunciation, and uses mixed case
* ISO 233-2 (1993): Simplified transliteration.
* Buckwalter transliteration (1990s): Developed at Xerox by Tim Buckwalter; does not require unusual diacritics
* ALA-LC (1997)
* Arabic chat alphabet

Persian


Armenian

Georgian


Greek

There are romanization systems for both Modern and Ancient Greek.
* Beta Code
* Greeklish
* ISO 843 (1997)

Hebrew

The Hebrew alphabet is romanized using several standards:
* ANSI Z39.25 (1975)
* (1977)
* ISO 259 (1984): Transliteration.
* ISO 259-2 (1994): Simplified transliteration.
* ISO/DIS 259-3: Phonemic transcription.

Indic (Brahmic) scripts

The Brahmic family of abugidas is used for languages of the Indian subcontinent and south-east Asia. There is a long tradition in the west of studying Sanskrit and other Indic texts in Latin transliteration. Various transliteration conventions have been used for Indic scripts since the time of Sir William Jones.
* ISO 15919 (2001): A standard convention was codified in the ISO 15919 standard. It uses diacritics to map the much larger set of Brahmic consonants and vowels to the Latin script. The Devanagari-specific portion is very similar to the academic standard, IAST ("International Alphabet of Sanskrit Transliteration"), and to the United States Library of Congress standard, ALA-LC, although there are a few differences.
* The National Library at Kolkata romanization, intended for the romanization of all Indic scripts, is an extension of
* Harvard-Kyoto: Uses upper and lower case and doubling of letters, to avoid the use of diacritics, and to restrict the range to 7-bit ASCII.
* ITRANS: a transliteration scheme into 7-bit ASCII created by Avinash Chopde that used to be prevalent on Usenet.
* ISCII (1988)
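The idea behind the ASCII-only schemes in the list above can be sketched as a character substitution. The table below is a partial Harvard-Kyoto to IAST mapping written out for this example (an assumption of the sketch, not taken from the text above); multi-letter codes such as "RR" are left out, and the function name is hypothetical.

```python
# Subset of the Harvard-Kyoto convention (an assumption of this sketch):
# upper-case ASCII letters stand in for IAST letters that need diacritics.
# Multi-letter codes such as "RR" are deliberately left out.
HK_TO_IAST = str.maketrans({
    "A": "ā", "I": "ī", "U": "ū", "R": "ṛ", "M": "ṃ", "H": "ḥ",
    "G": "ṅ", "J": "ñ", "T": "ṭ", "D": "ḍ", "N": "ṇ",
    "z": "ś", "S": "ṣ",
})


def hk_to_iast(text):
    """Convert a Harvard-Kyoto string to IAST using the subset above."""
    return text.translate(HK_TO_IAST)


print("saMskRta".isascii())    # True: the input needs only 7-bit ASCII
print(hk_to_iast("saMskRta"))  # saṃskṛta
```

Because case carries information in such a scheme, the input cannot be lower-cased or capitalized freely without changing the letters it stands for.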

Devanagari–nastaʿlīq (Hindustani)

Hindustani is an Indo-Aryan language with extreme digraphia and diglossia resulting from the Hindi–Urdu controversy starting in the 1800s. Technically, Hindustani itself is recognized by neither the language community nor any governments. Two registers, Standard Hindi and Standard Urdu, are recognized as official languages in India and Pakistan. However, in practice the situation is as follows:
* In Pakistan: Standard (Saaf or Khaalis) Urdu is the "high" variety, whereas Hindustani is the "low" variety used by the masses (called Urdu, written in nastaʿlīq script).
* In India: both Standard (Shuddh) Hindi and Standard (Saaf or Khaalis) Urdu are the "H" varieties (written in devanagari and nastaʿlīq respectively), whereas Hindustani is the "L" variety used by the masses and written in either devanagari or nastaʿlīq (and called 'Hindi' or 'Urdu' respectively).

The digraphia renders any work in either script largely inaccessible to users of the other script, though otherwise Hindustani is a perfectly mutually intelligible language, essentially meaning that any kind of text-based open source collaboration is impossible among devanagari and nastaʿlīq readers. Initiated in 2011, the Hamari Boli Initiative is a full-scale open-source language planning initiative aimed at Hindustani script, style, status and lexical reform and modernization. One of the primary stated objectives of Hamari Boli is to relieve Hindustani of the crippling devanagari–nastaʿlīq digraphia by way of romanization.

Chinese

Romanization of the Sinitic languages, particularly Mandarin, has proved a very difficult problem, and the issue is further complicated by political considerations. Because of this, many romanization tables contain Chinese characters plus one or more romanizations or Zhuyin.

Mandarin

* : Used to be similar to Wade–Giles, but converted to Hanyu Pinyin in 2000
* EFEO: Developed by École française d'Extrême-Orient in the 19th century, used mainly in France.
* Latinxua Sin Wenz (1926): Does not indicate tones. Used mainly in the Soviet Union and Xinjiang in the 1930s. Predecessor of .
* Lessing-Othmer: Used mainly in Germany.
* Postal romanization (1906): Early standard for international addresses
* Wade–Giles (1892): Transliteration. Very popular from the 19th century until recently and continues to be used by some Western academics.
* Yale (1942): Created by the U.S. for battlefield communication and used in the influential Yale textbooks.
* Legge romanization: Created by James Legge, a Scottish missionary.

Mainland China

* Hanyu Pinyin (1958): In mainland China, Hanyu Pinyin has been used officially to romanize Mandarin for decades, primarily as a linguistic tool for teaching the standardized language. The system is also used in other Chinese-speaking areas such as Singapore and parts of Taiwan, and has been adopted by much of the international community as a standard for writing Chinese words and names in the Latin script. The value of Hanyu Pinyin in education in China lies in the fact that China, like any other populated area with comparable area and population, has numerous distinct dialects, though there is just one common written language and one common standardized spoken form. (These comments apply to romanization in general.)
* ISO 7098 (1991): Based on Hanyu Pinyin.

Taiwan

# Gwoyeu Romatzyh (GR, 1928–1986, in Taiwan 1945–1986; Taiwan used Japanese Romaji before 1945),
# Mandarin Phonetic Symbols II (MPS II, 1986–2002),
# Tongyong Pinyin (2002–2008), and
# (since January 1, 2009).

Singapore

Cantonese

* Barnett–Chao
* Guangdong (1960)
* Hong Kong Government
* Jyutping
* Macau Government
* Meyer–Wempe
* Sidney Lau
* (1942)
* ILE romanization of Cantonese

Wu

Min Nan or Hokkien

* Pe̍h-ōe-jī (POJ), once the ''de facto'' official script of the Presbyterian Church in Taiwan (since the late 19th century). Technically this represented a largely phonemic transcription system, as Min Nan was not commonly written in Chinese.
* Tâi-uân Lô-má-jī Phing-im Hong-àn

Teochew

* (1960), for the distinct Teochew variety.

Min Dong

* Foochow Romanized

Min Bei

* Kienning Colloquial Romanized

Japanese

Romanization (or, more generally, Roman letters) is called "rōmaji" in Japanese. The most common systems are:
* Hepburn (1867): phonetic transcription following Anglo-American practices, used in geographical names
* (1885): transliteration. Also adopted as (ISO 3602 Strict) in 1989.
* Kunrei-shiki (1937): phonemic transcription. Also adopted as (ISO 3602).
* JSL (1987): phonemic transcription. Named after the book ''Japanese: The Spoken Language'' by Eleanor Jorden.
* : Similar to Modified Hepburn
* Wāpuro ("word processor romanization"): transliteration. Not strictly a system, but a collection of common practices that enables input of Japanese text.
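The contrast between a phonetic scheme (Hepburn) and a phonemic one (Kunrei-shiki) can be illustrated with a few hiragana whose spellings differ. The following Python sketch is only an illustration: the table covers six kana, and the names `KANA` and `romanize` are assumptions made for this example, not part of either standard.

```python
import unicodedata

# (Hepburn, Kunrei-shiki) spellings for a handful of hiragana.
# Deliberately partial: just enough to show where the two systems diverge.
KANA = {
    "し": ("shi", "si"),
    "ち": ("chi", "ti"),
    "つ": ("tsu", "tu"),
    "ふ": ("fu", "hu"),
    "\u3058": ("ji", "zi"),  # じ
    "ま": ("ma", "ma"),
}

SYSTEMS = {"hepburn": 0, "kunrei": 1}


def romanize(word: str, system: str = "hepburn") -> str:
    """Romanize a string of the kana listed in KANA, one kana at a time."""
    column = SYSTEMS[system]
    # NFC keeps voiced kana such as じ as a single code point.
    word = unicodedata.normalize("NFC", word)
    return "".join(KANA[kana][column] for kana in word)
```

With this table, `romanize("ふじ")` returns `fuji`, while `romanize("ふじ", "kunrei")` returns `huzi`. A real converter would also have to handle long vowels, the moraic nasal and small kana, which this sketch ignores.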

Korean

The following systems are currently the most widely used:
* McCune–Reischauer ("MR"; 1939): Basis for various romanization systems. Almost universally used by international academic journals on Korean studies.
** Romanization of Korean (1992): The official romanization in North Korea, with some differences from the original MR.
** The

system is based on but deviates from MR.
** South Korea formerly used yet another modified version of MR as its official system from 1984 to 2000.
* Revised Romanization of Korean (2000): South Korea's official romanization system.
* Yale romanization of Korean (1942): A standard used almost exclusively by international linguists.

Thai

Thai, spoken in Thailand and some areas of Laos, Burma and China, is written with its own script, probably descended from a mixture of Tai–Laotian and Old Khmer, in the

.
* Royal Thai General System of Transcription
* ISO 11940 (1998): transliteration
* ISO 11940-2 (2007): transcription
*

Nuosu

The Nuosu language, spoken in southern China, is written with its own script, the Yi script. The only existing romanisation system is YYPY (Yi Yu Pin Yin), which represents tone with letters attached to the end of syllables, as Nuosu forbids codas. It does not use diacritics, so, given the large phonemic inventory of Nuosu, it requires frequent use of digraphs, including for monophthong vowels.
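Because no Nuosu syllable ends in a consonant, a trailing consonant letter in YYPY can only be a tone mark. The Python sketch below shows how a tone could be split off a syllable under that rule. It assumes the commonly described tone letters `t`, `x` and `p`, with an unmarked syllable carrying the mid tone; the tone labels and the function name `split_tone` are illustrative assumptions and should be checked against a YYPY reference.

```python
# Assumed YYPY tone letters (not given in the text above); an unmarked
# syllable is treated as mid tone.
TONE_LETTERS = {
    "t": "high",
    "x": "mid-high",
    "p": "low-falling",
}


def split_tone(syllable: str) -> tuple[str, str]:
    """Split a YYPY syllable into (segments, tone label)."""
    # The length check keeps a bare one-letter string from being read
    # as a tone mark with no syllable in front of it.
    if len(syllable) > 1 and syllable[-1] in TONE_LETTERS:
        return syllable[:-1], TONE_LETTERS[syllable[-1]]
    return syllable, "mid"
```

Under these assumptions, `split_tone("hxop")` separates the digraph-initial syllable `hxo` from its final tone letter `p`, whereas `split_tone("su")` leaves the syllable intact and reports the unmarked tone.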

Tibetan

The Tibetan script has two official romanization systems: Tibetan Pinyin (for Lhasa Tibetan) and Roman Dzongkha (for Dzongkha).

Cyrillic

In English-language library catalogues, bibliographies, and most academic publications, the Library of Congress transliteration method is used worldwide. In linguistics, scientific transliteration is used for both Cyrillic and Glagolitic alphabets. This applies to Old Church Slavonic, as well as modern Slavic languages that use these alphabets.

Belarusian

* BGN/PCGN romanization of Belarusian, 1979 (United States Board on Geographic Names and Permanent Committee on Geographical Names for British Official Use)
* Scientific transliteration, or the ''International Scholarly System'' for
* ALA-LC romanization, 1997 (American Library Association and Library of Congress)
* ISO 9:1995
* ''Instruction on transliteration of Belarusian geographical names with letters of Latin script'', 2000

Bulgarian

A system based on scientific transliteration and ISO/R 9:1968 was considered official in Bulgaria from the 1970s. Since the late 1990s, Bulgarian authorities have switched to the so-called Streamlined System, which avoids diacritics and is optimized for compatibility with English. This system became mandatory for public use with a law passed in 2009. Where the old system uses <č,š,ž,št,c,j,ă>, the new system uses <ch,sh,zh,sht,ts,y,a>. The new Bulgarian system was also endorsed for official use by the UN in 2012, and by BGN and PCGN in 2013.
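The switch from the diacritic-based spelling to the Streamlined System amounts to a small letter-for-letter substitution, sketched below in Python. The replacement values (ch, sh, zh, ts, y, a) are supplied here as an assumption about the Streamlined System and should be verified against the official table; the function name `to_streamlined` is hypothetical. The substitution runs in a single pass so that the `c` produced by `č` → `ch` is not itself rewritten to `ts`.

```python
import re
import unicodedata

# Old-system letter -> assumed Streamlined System spelling.
# "št" needs no entry of its own: š -> sh already yields "sht".
OLD_TO_STREAMLINED = {
    "\u010d": "ch",  # č
    "\u0161": "sh",  # š
    "\u017e": "zh",  # ž
    "c": "ts",
    "j": "y",
    "\u0103": "a",   # ă
}

# One character class, matched case-insensitively, so each source letter
# is visited exactly once.
_PATTERN = re.compile("[" + "".join(OLD_TO_STREAMLINED) + "]", re.IGNORECASE)


def to_streamlined(text: str) -> str:
    """Rewrite old-system Latin spelling without diacritics."""
    text = unicodedata.normalize("NFC", text)  # fold combining marks first

    def replace(match: re.Match) -> str:
        letter = match.group(0)
        spelling = OLD_TO_STREAMLINED[letter.lower()]
        # Preserve an initial capital: "Š" becomes "Sh", not "sh".
        return spelling.capitalize() if letter.isupper() else spelling

    return _PATTERN.sub(replace, text)
```

For instance, `to_streamlined("Šumen")` yields `Shumen` and `to_streamlined("Vraca")` yields `Vratsa`. The sketch converts between two Latin spellings only; it does not transliterate from Cyrillic.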

Kyrgyz

Macedonian

Russian

There is no single universally accepted system for writing Russian in the Latin script; in fact there are a great many such systems: some are adjusted for a particular target language (e.g. German or French), some are designed as a librarian's transliteration, some are prescribed for Russian travellers' passports; the transcription of some names is purely traditional. All this has resulted in a proliferation of spellings for the same name. For example, the name of the Russian composer Tchaikovsky may also be written as ''Tchaykovsky'', ''Tchajkovskij'', ''Tchaikowski'', ''Tschaikowski'', ''Czajkowski'', ''Čajkovskij'', ''Čajkovski'', ''Chajkovskij'', ''Çaykovski'', ''Chaykovsky'', ''Chaykovskiy'', ''Chaikovski'', ''Tshaikovski'', ''Tšaikovski'', ''Tsjajkovskij'' etc.

Systems include:
* BGN/PCGN (1947): Transliteration system (United States Board on Geographic Names & Permanent Committee on Geographical Names for British Official Use).
* GOST 16876-71 (1971): A now defunct Soviet transliteration standard. Replaced by GOST 7.79, which is an

equivalent.
* United Nations romanization system for geographical names (1987): Based on GOST 16876-71.
* (1995): Transliteration. From the International Organization for Standardization.
* (1997)
* "Volapuk" encoding (1990s): Slang term (it is not really Volapük) for a writing method that is not truly a transliteration, but used for similar goals (see article).
* Conventional English transliteration is based on BGN/PCGN, but does not follow a particular standard. Described in detail at Romanization of Russian.
* Streamlined System for the romanization of Russian.
* Comparative transliteration of Russian in different languages (Western European, Arabic, Georgian, Braille, Morse)
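Two of the spellings of the composer's name quoted above, ''Čajkovskij'' and ''Chaykovskiy'', fall out of applying different letter tables to the same Cyrillic string. The Python sketch below illustrates this with partial tables for scientific transliteration and BGN/PCGN; only the letters needed for this one name are included, and the names `SCHEMES` and `transliterate` are assumptions made for the example rather than any library's API.

```python
import unicodedata

# Partial letter tables: just the letters that occur in "Чайковский".
SCHEMES = {
    "scientific": {
        "ч": "\u010d",  # č
        "а": "a", "\u0439": "j",  # й
        "к": "k", "о": "o", "в": "v", "с": "s", "и": "i",
    },
    "bgn_pcgn": {
        "ч": "ch",
        "а": "a", "\u0439": "y",  # й
        "к": "k", "о": "o", "в": "v", "с": "s", "и": "i",
    },
}


def transliterate(word: str, scheme: str) -> str:
    """Transliterate letter by letter using one of the partial tables."""
    table = SCHEMES[scheme]
    pieces = []
    # NFC keeps й as one code point rather than и plus a combining breve.
    for letter in unicodedata.normalize("NFC", word):
        latin = table[letter.lower()]
        # Carry capitalisation over: "Ч" becomes "Ch" or "Č".
        pieces.append(latin.capitalize() if letter.isupper() else latin)
    return "".join(pieces)
```

Calling `transliterate("Чайковский", "scientific")` gives `Čajkovskij`, and the same call with `"bgn_pcgn"` gives `Chaykovskiy`. Complete systems also contain context-dependent rules that a plain per-letter table like this one cannot express.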

Syriac

The Latin script for Syriac was developed in the 1930s, following the state policy for minority languages of the

, with some material published.

Ukrainian

The 2010 Ukrainian National system was adopted by UNGEGN in 2012 and by BGN/PCGN in 2020. It is also very close to the modified (simplified) ALA-LC system, which has remained unchanged since 1941.
*
* Ukrainian National transliteration
* Ukrainian National and BGN/PCGN systems, at the UN Working Group on Romanization Systems
* Thomas T. Pedersen's comparison of five systems

Overview and summary

The chart below shows the most common phonemic transcription romanization used for several different alphabets. While it is sufficient for many casual users, there are multiple alternatives used for each alphabet, and many exceptions. For details, consult each of the language sections above. (Hangul characters are broken down into jamo components.)

References

External links

; About romanization
* IPA for Urdu and Roman Urdu for Mobile and Internet Users (Download)
* Microsoft Transliteration Utility – A tool for creating, debugging and using transliteration modules from any script to any other script.
* Randall Barry (ed.) ''ALA-LC Romanization Tables'' U.S. Library of Congress, 1997. (One of the few printed books with lists of romanizations)

in PDF format
* UNGEGN Working Group on Romanization Systems

; Romanization online
* Chinese Phonetic Conversion Tool – Converts between Pinyin and other formats
* Cyrillic Transliteration and Transcription ONLINE (Cyrillic -> Latin)
* eiktub – An Arabic Transliteration Pad
* Lingua::Translit – Perl module covering a variety of writing systems, e.g. Cyrillic or Greek. Provides a lot of standards as well as common transliteration schemes.
* Arabeasy – Arabic Transliteration (free Chrome extension exists, also works for Persian, Urdu)
* – Russian Transliteration (free Chrome extension exists)

For Persian Romanization