HOME

TheInfoList



OR:

ISO/IEC 8859-2:1999, ''Information technology — 8-bit single-byte coded graphic character sets — Part 2: Latin alphabet No. 2'', is part of the
ISO/IEC 8859 ISO/IEC 8859 is a joint ISO and IEC series of standards for 8-bit character encodings. The series of standards consists of numbered parts, such as ISO/IEC 8859-1, ISO/IEC 8859-2, etc. There are 15 parts, excluding the abandoned ISO/IEC 8859-12. ...
series of ASCII-based standard
character encoding Character encoding is the process of assigning numbers to graphical characters, especially the written characters of human language, allowing them to be stored, transmitted, and transformed using digital computers. The numerical values tha ...
s, first edition published in 1987. It is informally referred to as "Latin-2". It is generally intended for Central or "Eastern European" languages that are written in the Latin script. Note that ISO/IEC 8859-2 is very different from
code page 852 Code page 852 (CCSID 852) (also known as CP 852, IBM 00852, OEM 852 (Latin II), MS-DOS Latin 2) is a code page used under DOS to write Central European languages that use Latin script (such as Bosnian, Croatian, Czech, Hungarian, Polish, Roma ...
(MS-DOS Latin 2, PC Latin 2) which is also referred to as "Latin-2" in Czech and Slovak regions. Code page 912 is an extension. Almost half the use of the encoding is for Polish, and it's the main legacy encoding for Polish, while virtually all use of it has been replaced by UTF-8 (on the web). ISO-8859-2 is the
IANA The Internet Assigned Numbers Authority (IANA) is a standards organization that oversees global IP address allocation, autonomous system number allocation, root zone management in the Domain Name System (DNS), media types, and other Interne ...
preferred charset name for this standard when supplemented with the
C0 and C1 control codes The C0 and C1 control code or control character sets define control codes for use in text by computer systems that use ASCII and derivatives of ASCII. The codes represent additional information about the text, such as the position of a curso ...
from
ISO/IEC 6429 ISO/IEC JTC 1, entitled "Information technology", is a joint technical committee (JTC) of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). Its purpose is to develop, maintain and ...
. Less than 0.04% of all web pages use ISO-8859-2 as of October 2022. Microsoft has assigned code page 28592 a.k.a. Windows-28592 to ISO-8859-2 in Windows. IBM assigned Code page 1111 to ISO 8859-2.
Windows-1250 Windows-1250 is a code page used under Microsoft Windows to represent texts in Central European and Eastern European languages that use Latin script, such as Czech (which is its main user with half its use, though Czech has 96.6% use of UTF-8, an ...
is similar to ISO-8859-2 and has all the printable characters it has and more. However a few of them are rearranged (unlike
Windows-1252 Windows-1252 or CP-1252 ( code page 1252) is a single-byte character encoding of the Latin alphabet, used by default in the legacy components of Microsoft Windows for English and many European languages including Spanish, French, and German. ...
, which keeps all printable characters from
ISO-8859-1 ISO/IEC 8859-1:1998, ''Information technology — 8-bit single-byte coded graphic character sets — Part 1: Latin alphabet No. 1'', is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in ...
in the same place).


Language coverage

These code values can be used for the following languages: It can also be used for
Romanian Romanian may refer to: *anything of, from, or related to the country and nation of Romania ** Romanians, an ethnic group **Romanian language, a Romance language ***Romanian dialects, variants of the Romanian language **Romanian cuisine, traditiona ...
, but it is not well suited for that language, due to lacking letters s and t with commas below, although it provides s and t with similar-looking
cedilla A cedilla ( ; from Spanish) or cedille (from French , ) is a hook or tail ( ¸ ) added under certain letters as a diacritical mark to modify their pronunciation. In Catalan language, Catalan, French language, French, and Portuguese language, ...
s. These letters were unified in the first versions of the
Unicode Unicode, formally The Unicode Standard,The formal version reference is is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, ...
standard, meaning that the appearance with cedilla or with a comma was treated as a glyph choice rather than as separate characters; fonts intended for use with Romanian should therefore, in theory, have characters with a comma below at those code points. Microsoft did not really provide such fonts for computers sold in Romania. Still, ISO 8859-2 and
Windows-1250 Windows-1250 is a code page used under Microsoft Windows to represent texts in Central European and Eastern European languages that use Latin script, such as Czech (which is its main user with half its use, though Czech has 96.6% use of UTF-8, an ...
(with the same problem) have been heavily used for Romanian. Unicode subsequently disunified the comma variants from the cedilla variants, and has since taken the lead for web pages, which however often have s and t with cedilla anyway. Unicode notes as of 2014 that disunifying the letters with comma below was a mistake, causing corruptions of Romanian data: pre-existing data and input methods would still contain the older cedilla codepoints, complicating text searching.


Code page layout

Differences from
ISO-8859-1 ISO/IEC 8859-1:1998, ''Information technology — 8-bit single-byte coded graphic character sets — Part 1: Latin alphabet No. 1'', is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in ...
have the Unicode code point number underneath.


See also

*
Character encoding Character encoding is the process of assigning numbers to graphical characters, especially the written characters of human language, allowing them to be stored, transmitted, and transformed using digital computers. The numerical values tha ...
* Polish code pages


References


External links


ISO/IEC 8859-2:1999
8-Bit Single Byte Coded Graphic Character Sets - Latin Alphabets No. 1 to No. 4 ''2nd edition (June 1986)''
ISO-IR 101
Right-Hand Part of Latin Alphabet No.2 ''(February 1, 1986)''

{{DEFAULTSORT:ISO IEC 8859-2 ISO/IEC 8859 Computer-related introductions in 1987