
KOI-8 (КОИ-8) is an 8-bit character set standardized in GOST 19768-74 (Маркелова Л. Н. Эксплуатация программоуправляемой вычислительной машины «Искра 226». М.: Машиностроение, 1987. С. 41-42). It is an extension of KOI-7 that allows the Latin alphabet to be used along with the Russian alphabet, in both upper and lower case. However, the letter Ёё and the uppercase Ъ are missing, the latter to avoid a conflict with the delete character; both are added in most extensions (see KOI8-B). The first 128 code points (0–127) are identical to ASCII, except that the dollar sign $ (code point 24 hex) is replaced by the universal currency sign ¤. The rows 8x and 9x (code points 128–159) may be filled with the additional control characters from EBCDIC (code points 32–63). This standard became the basis for later Internet standards such as KOI8-R, KOI8-U, KOI8-RU and all the other derivatives.

Unicode is preferred to KOI-8 and its variants, and to other Cyrillic encodings, in modern applications, especially on the Internet, where UTF-8 is the dominant encoding for web pages. (For further discussion of Unicode's complete coverage of 436 Cyrillic letters/code points, including Old Cyrillic, and of why single-byte character encodings such as Windows-1251 and the KOI8 variants cannot provide this, see Cyrillic script in Unicode.)
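
The relationship between the single-byte KOI encodings, ASCII and UTF-8 can be illustrated with a minimal Python sketch. It rests on one assumption: Python's standard library has no codec for GOST KOI-8 itself, so the KOI8-R derivative (`koi8_r`) stands in for it. The helper names `koi8r_to_utf8` and `gost_lower_half` are hypothetical, and `gost_lower_half` only models the single difference in the lower half described above (¤ in place of $).

```python
# Sketch only: "koi8_r" is the KOI8-R derivative, used here as a stand-in
# because Python ships no codec for the original GOST 19768-74 KOI-8.

def koi8r_to_utf8(data: bytes) -> bytes:
    """Transcode KOI8-R bytes to UTF-8 (hypothetical helper name)."""
    return data.decode("koi8_r").encode("utf-8")


def gost_lower_half(data: bytes) -> str:
    """Decode bytes 0x00-0x7F as the text above describes GOST KOI-8:
    ASCII, except that 0x24 is the currency sign instead of the dollar sign.
    Raises UnicodeDecodeError for any byte above 0x7F."""
    return data.decode("ascii").replace("$", "\u00a4")


text = "Привет, world"
raw = text.encode("koi8_r")

# Single-byte encoding: one byte per character, Latin and Cyrillic alike.
print(len(text), len(raw))

# In KOI8-R the lower half (0x00-0x7F) decodes exactly like ASCII.
lower = bytes(range(128))
print(lower.decode("koi8_r") == lower.decode("ascii"))

# In UTF-8 each of these Cyrillic letters takes two bytes.
print(len(koi8r_to_utf8(raw)))

# The one lower-half difference of GOST KOI-8: 0x24 shown as the currency sign.
print(gost_lower_half(b"5$"))
```

Decoding to `str` first and re-encoding keeps the conversion lossless for any text both encodings can represent; characters outside the KOI8-R repertoire would need an explicit `errors=` policy when encoding in the other direction.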


Character set

The following table shows the KOI-8 encoding. Each character is shown with its equivalent Unicode code point.
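
As a rough way to inspect the letter rows and their Unicode code points, the sketch below decodes rows Cx to Fx with Python's `koi8_r` codec. This assumes that the KOI8-R derivative keeps the KOI-8 letter positions; per the text above, KOI8-R additionally defines the uppercase Ъ, which the original standard omits. The function name `letter_rows` is hypothetical.

```python
# Sketch only: uses the KOI8-R derivative as an approximation of the
# KOI-8 letter rows; the original standard has no uppercase Ъ.

def letter_rows(codec: str = "koi8_r") -> dict[str, str]:
    """Return the 16 characters of each row Cx..Fx, keyed by row label."""
    rows = {}
    for high in range(0xC, 0x10):
        row_bytes = bytes(range(high * 16, high * 16 + 16))
        rows[f"{high:X}x"] = row_bytes.decode(codec)
    return rows


rows = letter_rows()
for label, chars in rows.items():
    # Show each character next to its Unicode code point.
    print(label, " ".join(f"{c} U+{ord(c):04X}" for c in chars))
```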


See also

* KOI character encodings


