HOME

TheInfoList



OR:

KOI-8 (КОИ-8) is an 8-bit character set standardized in GOST 19768-74. Маркелова Л. Н. Эксплуатация программоуправляемой вычислительной машины «Искра 226». — М.: Машиностроение, 1987. — С. 41—42. It is an extension of KOI-7 which allows the use of the
Latin alphabet The Latin alphabet or Roman alphabet is the collection of letters originally used by the ancient Romans to write the Latin language. Largely unaltered with the exception of extensions (such as diacritics), it used to write English and the o ...
along with the
Russian alphabet The Russian alphabet (russian: ру́сский алфави́т, russkiy alfavit, , label=none, or russian: ру́сская а́збука, russkaya azbuka, label=none, more traditionally) is the script used to write the Russian language. I ...
, both the upper and lower case letters; however, the letter Ёё and the uppercase Ъ are missed, the latter to avoid conflicts with the
delete character The delete control character (also called DEL or rubout) is the last character in the ASCII repertoire, with the code 127. It is supposed to do nothing and was designed to erase incorrect characters on paper tape. It is denoted as in caret notati ...
(both are added in most extensions, see
KOI8-B KOI8-B is the informal name for an 8-bit Roman / Cyrillic character set constituting the common subset of the major KOI-8 variants (KOI8-R, KOI8-U, KOI8-RU, KOI8-E, KOI8-F). Accordingly, it is closely related to KOI8-R, but defines only the l ...
). The first 127 code points are identical to
ASCII ASCII ( ), abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices. Because of ...
with the exception of the
dollar sign The dollar sign, also known as peso sign, is a symbol consisting of a capital " S" crossed with one or two vertical strokes ($ or ), used to indicate the unit of various currencies around the world, including most currencies denominated "pes ...
$ (code point 24hex) replaced by the
universal currency sign The currency sign is a character (symbol), character used to denote an unspecified currency. It can be described as a circle the size of a lowercase character with four short radiating arms at 45° (NE), 135° (SE), 225° (SW) and 315° (NW). I ...
¤. The rows x8_ and x9_ (code points 128–159) might be filled with the additional control characters from
EBCDIC Extended Binary Coded Decimal Interchange Code (EBCDIC; ) is an eight-bit character encoding used mainly on IBM mainframe and IBM midrange computer operating systems. It descended from the code used with punched cards and the corresponding six- ...
(code points 32–63). This standard has become the base for the later
Internet standard In computer network engineering, an Internet Standard is a normative specification of a technology or methodology applicable to the Internet. Internet Standards are created and published by the Internet Engineering Task Force (IETF). They allow ...
s such as
KOI8-R KOI8-R (RFC 1489) is an 8-bit character encoding, derived from the KOI-8 encoding by the programmer Andrei Chernov in 1993 and designed to cover Russian, which uses a Cyrillic alphabet. KOI8-R was based on Russian Morse code, which was created fr ...
,
KOI8-U KOI8-U (RFC 2319) is an 8-bit character encoding, designed to cover Ukrainian language, Ukrainian, which uses a Cyrillic alphabet. It is based on KOI8-R, which covers Russian language, Russian and Bulgarian language, Bulgarian, but replaces eight b ...
,
KOI8-RU KOI8-RU is an 8-bit character encoding, designed to cover Russian, Ukrainian, and Belarusian which use a Cyrillic alphabet. It is closely related to KOI8-R, which covers Russian and Bulgarian, but replaces ten box drawing characters with five ...
and all the other
derivatives The derivative of a function is the rate of change of the function's output relative to its input value. Derivative may also refer to: In mathematics and economics * Brzozowski derivative in the theory of formal languages * Formal derivative, an ...
.
Unicode Unicode, formally The Unicode Standard,The formal version reference is is an information technology Technical standard, standard for the consistent character encoding, encoding, representation, and handling of Character (computing), text expre ...
is preferred to KOI-8 and its variants or other Cyrillic encodings in modern applications, especially on the Internet, making
UTF-8 UTF-8 is a variable-width encoding, variable-length character encoding used for electronic communication. Defined by the Unicode Standard, the name is derived from ''Unicode'' (or ''Universal Coded Character Set'') ''Transformation Format 8-bit'' ...
the dominant encoding for web pages. (For further discussion of Unicode's complete coverage, of 436 Cyrillic letters/code points, including for
Old Cyrillic The Early Cyrillic alphabet, also called classical Cyrillic or paleo-Cyrillic, is a writing system that was developed in the First Bulgarian Empire during the late 9th century on the basis of the Greek alphabet for the Slavic people livin ...
, and how single-byte character encodings, such as
Windows-1251 Windows-1251 is an 8-bit character encoding, designed to cover languages that use the Cyrillic script such as Russian, Ukrainian, Belarusian, Bulgarian, Serbian Cyrillic, Macedonian and other languages. On the web, it is the second most-used si ...
and KOI8 variants, cannot provide this, see
Cyrillic script in Unicode As of Unicode version 15.0 Cyrillic script is encoded across several blocks: * CyrillicU+0400–U+04FF 256 characters * Cyrillic SupplementU+0500–U+052F 48 characters * Cyrillic Extended-AU+2DE0–U+2DFF 32 characters * Cyrillic Extended-BU+A ...
.)


Character set

The following table shows the KOI-8 encoding. Each character is shown with its equivalent Unicode code point.


See also

*
KOI character encodings KOI (''КОИ'') is a family of several code pages for the Cyrillic script. The name stands for ''Kod obmena informatsiey'' (russian: Код обмена информацией) which means "Code for Information Interchange". A particular feature ...


Footnotes


References

{{Character encoding Character sets Computing in the Soviet Union