HOME

TheInfoList



OR:

The Cork (also known as T1 or EC) encoding is a
character encoding Character encoding is the process of assigning numbers to Graphics, graphical character (computing), characters, especially the written characters of Language, human language, allowing them to be Data storage, stored, Data communication, transmi ...
used for encoding glyphs in fonts. It is named after the city of
Cork Cork or CORK may refer to: Materials * Cork (material), an impermeable buoyant plant product ** Cork (plug), a cylindrical or conical object used to seal a container ***Wine cork Places Ireland * Cork (city) ** Metropolitan Cork, also known as G ...
in
Ireland Ireland ( ; ga, Éire ; Ulster Scots dialect, Ulster-Scots: ) is an island in the Atlantic Ocean, North Atlantic Ocean, in Northwestern Europe, north-western Europe. It is separated from Great Britain to its east by the North Channel (Grea ...
, where during a
TeX Users Group Tex may refer to: People and fictional characters * Tex (nickname), a list of people and fictional characters with the nickname * Joe Tex (1933–1982), stage name of American soul singer Joseph Arrington Jr. Entertainment * ''Tex'', the Italian ...
(TUG) conference in 1990 a new encoding was introduced for
LaTeX Latex is an emulsion (stable dispersion) of polymer microparticles in water. Latexes are found in nature, but synthetic latexes are common as well. In nature, latex is found as a milky fluid found in 10% of all flowering plants (angiosperms ...
. It contains 256 characters supporting most west and east-European languages with the
Latin alphabet The Latin alphabet or Roman alphabet is the collection of letters originally used by the ancient Romans to write the Latin language. Largely unaltered with the exception of extensions (such as diacritics), it used to write English and the o ...
.


Details

In 8-bit
TeX Tex may refer to: People and fictional characters * Tex (nickname), a list of people and fictional characters with the nickname * Joe Tex (1933–1982), stage name of American soul singer Joseph Arrington Jr. Entertainment * ''Tex'', the Italian ...
engines the font encoding has to match the encoding of hyphenation patterns where this encoding is most commonly used. In
LaTeX Latex is an emulsion (stable dispersion) of polymer microparticles in water. Latexes are found in nature, but synthetic latexes are common as well. In nature, latex is found as a milky fluid found in 10% of all flowering plants (angiosperms ...
one can switch to this encoding with \usepackage 1/code>, while in
ConTeXt Context may refer to: * Context (language use), the relevant constraints of the communicative situation that influence language use, language variation, and discourse summary Computing * Context (computing), the virtual environment required to su ...
MkII this is the default encoding already. In modern engines such as
XeTeX XeTeX ( or ; see also Pronouncing and writing "TeX") is a TeX typesetting engine using Unicode and supporting modern font technologies such as OpenType, Graphite and Apple Advanced Typography (AAT). It was originally written by Jonathan Ke ...
and
LuaTeX LuaTeX is a TeX-based computer typesetting system which started as a version of pdfTeX with a Lua scripting engine embedded. After some experiments it was adopted by the TeX Live distribution as a successor to pdfTeX (itself an extension of ε- ...
Unicode is fully supported and the 8-bit font encodings are obsolete.


Character set


Notes

* Hexadecimal values under the characters in the table are the Unicode character codes. * The first 12 characters are often used as
combining characters In digital typography, combining characters are characters that are intended to modify other characters. The most common combining characters in the Latin script are the combining diacritical marks (including combining accents). Unicode also ...
.


Supported languages

The encoding supports most European languages written in Latin alphabet. Notable exceptions are: *
Esperanto Esperanto ( or ) is the world's most widely spoken constructed international auxiliary language. Created by the Warsaw-based ophthalmologist L. L. Zamenhof in 1887, it was intended to be a universal second language for international communi ...
(using IL3) *
Latvian language Latvian ( ), also known as Lettish, is an Eastern Baltic language belonging to the Baltic branch of the Indo-European language family, spoken in the Baltic region. It is the language of Latvians and the official language of Latvia as well as ...
and
Lithuanian language Lithuanian ( ) is an Eastern Baltic language belonging to the Baltic branch of the Indo-European language family. It is the official language of Lithuania and one of the official languages of the European Union. There are about 2.8 millio ...
(using L7X) *
Welsh language Welsh ( or ) is a Celtic language family, Celtic language of the Brittonic languages, Brittonic subgroup that is native to the Welsh people. Welsh is spoken natively in Wales, by some in England, and in Y Wladfa (the Welsh colony in Chubut P ...
Languages with slightly suboptimal support include: * Galician language,
Portuguese language Portuguese ( or, in full, ) is a western Romance language of the Indo-European language family, originating in the Iberian Peninsula of Europe. It is an official language of Portugal, Brazil, Cape Verde, Angola, Mozambique, Guinea-Bissau and ...
and
Spanish language Spanish ( or , Castilian) is a Romance languages, Romance language of the Indo-European language family that evolved from colloquial Latin spoken on the Iberian peninsula. Today, it is a world language, global language with more than 500 millio ...
– due to the lack of characters ª and º, which are not superscript versions of lowercase "a" and "o" (superscripts are thinner) and they are often underlined *
Croatian language Croatian (; ' ) is the standardized variety of the Serbo-Croatian pluricentric language used by Croats, principally in Croatia, Bosnia and Herzegovina, the Serbian province of Vojvodina, and other neighboring countries. It is the official ...
, Bosnian language,
Serbian language Serbian (, ) is the standardized variety of the Serbo-Croatian language mainly used by Serbs. It is the official and national language of Serbia, one of the three official languages of Bosnia and Herzegovina and co-official in Montenegro and Kos ...
– due to the shared use of the slot for Đ *
Turkish language Turkish ( , ), also referred to as Turkish of Turkey (''Türkiye Türkçesi''), is the most widely spoken of the Turkic languages, with around 80 to 90 million speakers. It is the national language of Turkey and Northern Cyprus. Significant sma ...
– due to
dotless i I, or ı, called dotless I, is a letter used in the Latin-script alphabets of Azerbaijani, Crimean Tatar, Gagauz, Kazakh, Tatar, Kyrgyz, and Turkish. It commonly represents the close back unrounded vowel , except in Kazakh where it represen ...
having different uppercase and lowercase combinations than in other languages


References


External links


Encoding fileOverview of encodings in LaTeX
{{character encoding Character encoding Character sets TeX