Windows-1251
   HOME
*





Windows-1251
Windows-1251 is an 8-bit character encoding, designed to cover languages that use the Cyrillic script such as Russian, Ukrainian, Belarusian, Bulgarian, Serbian Cyrillic, Macedonian and other languages. On the web, it is the second most-used single-byte character encoding (or third most-used character encoding overall), and most used of the single-byte encodings supporting Cyrillic. , 0.4% of all websites use Windows-1251. It's by far mostly used for Russian, while a small minority of Russian websites use it, with 93.7% of Russian (.ru) websites using UTF-8, and the legacy 8-bit encoding is distant second. In Linux, the encoding is known as cp1251. IBM uses code page 1251 (CCSID 1251 and euro sign extended CCSID 5347) for Windows-1251. Windows-1251 and KOI8-R (or its Ukrainian variant KOI8-U) are much more commonly used than ISO 8859-5 (which is used by less than 0.0004% of websites). In contrast to Windows-1252 and ISO 8859-1, Windows-1251 is not closely related to ISO 8859 ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


ISO-IR-111
ISO-IR-111 or KOI8-E is an 8-bit character set. It is a multinational extension of KOI-8 for Belarusian, Macedonian, Serbian, and Ukrainian (except Ґґ which is added to KOI8-F). The name "ISO-IR-111" refers to its registration number in the ISO-IR registry, and denotes it as a set usable with ISO/IEC 2022. It was defined by the first (1986) edition of ECMA-113, which is the Ecma International standard corresponding to , and as such also corresponds to a 1987 draft version of ISO-8859-5. The published editions of instead correspond to subsequent editions of ECMA-113, which defines a different encoding. Naming confusion ISO-IR-111, the 1985 edition of ECMA-113 (also called "ECMA-Cyrillic" or "KOI8-E"), was based on the 1974 edition of GOST 19768 (i.e. KOI-8). In 1987 ECMA-113 was redesigned. These newer editions of ECMA-113 are equivalent to ISO-8859-5, and do not follow the KOI layout. This confusion has led to a common misconception that ISO-8859-5 was defined in or based ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


ISO 8859-5
ISO/IEC 8859-5:1999, ''Information technology — 8-bit single-byte coded graphic character sets — Part 5: Latin/Cyrillic alphabet'', is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1988. It is informally referred to as Latin/Cyrillic. It was designed to cover languages using a Cyrillic alphabet such as Bulgarian, Belarusian, Russian, Serbian and Macedonian but was never widely used. It would also have been usable for Ukrainian in the Soviet Union from 1933 to 1990, but it is missing the Ukrainian letter ''ge'', ґ, which is required in Ukrainian orthography before and since, and during that period outside Soviet Ukraine. As a result, IBM created Code page 1124. ISO-8859-5 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. The 8-bit encodings KOI8-R and KOI8-U, CP866, and also Windows-1251 are far more commonly used. In contrast to Windows ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


KOI8-U
KOI8-U (RFC 2319) is an 8-bit character encoding, designed to cover Ukrainian language, Ukrainian, which uses a Cyrillic alphabet. It is based on KOI8-R, which covers Russian language, Russian and Bulgarian language, Bulgarian, but replaces eight box drawing characters with four Ukrainian letters Ghe with upturn, Ґ, Ukrainian Ye, Є, Soft-dotted i (Cyrillic), І, and Yi (Cyrillic), Ї in both upper case and lower case. KOI8-RU is closely related, but adds Ў for Belarusian language, Belarusian. In both, the letter allocations match those in KOI8-E, except for Ґ which is added to KOI8-F. In Microsoft Windows, KOI8-U is assigned the code page number 21866. In IBM, KOI8-U is assigned code page/CCSID 1168. KOI8 remains much more commonly used than ISO 8859-5, which never really caught on. Another common Cyrillic character encoding is Windows-1251. In the future, both may eventually give way to Unicode. KOI8 stands for ''Kod Obmena Informatsiey, 8 bit'' (russian: Код Обмен ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




Windows-125x
Windows code pages are sets of characters or code pages (known as character encodings in other operating systems) used in Microsoft Windows from the 1980s and 1990s. Windows code pages were gradually superseded when Unicode was implemented in Windows, although they are still supported both within Windows and other platforms, and still apply when Alt code shortcuts are used. There are two groups of system code pages in Windows systems: OEM and Windows-native ("ANSI") code pages. (ANSI is the American National Standards Institute.) Code pages in both of these groups are extended ASCII code pages. Additional code pages are supported by standard Windows conversion routines, but not used as either type of system code page. ANSI code page ANSI code pages (officially called "Windows code pages" after Microsoft accepted the former term being a misnomer ) are used for native non-Unicode (say, byte oriented) applications using a graphical user interf ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


KOI8-R
KOI8-R (RFC 1489) is an 8-bit character encoding, derived from the KOI-8 encoding by the programmer Andrei Chernov in 1993 and designed to cover Russian, which uses a Cyrillic alphabet. KOI8-R was based on Russian Morse code, which was created from a phonetic version of Latin Morse code. As a result, Russian Cyrillic letters are in pseudo-Roman order rather than the normal Cyrillic alphabetical order. Although this may seem unnatural, if the 8th bit is stripped, the text is partially readable in ASCII and may convert to syntactically correct KOI-7. For example, "Русский Текст" in KOI8-R becomes ''rUSSKIJ tEKST'' ("Russian Text"). KOI8 stands for ''Kod Obmena Informatsiey, 8 bit'' (russian: Код Обмена Информацией, 8 бит) which means "Code for Information Exchange, 8 bit". In Microsoft Windows, KOI8-R is assigned the code page number 20866. In IBM, KOI8-R is assigned code page 878. KOI8-R also happens to cover Bulgarian, but has not been use ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Cyrillic Script
The Cyrillic script ( ), Slavonic script or the Slavic script, is a writing system used for various languages across Eurasia. It is the designated national script in various Slavic languages, Slavic, Turkic languages, Turkic, Mongolic languages, Mongolic, Uralic languages, Uralic, Caucasian languages, Caucasian and Iranian languages, Iranic-speaking countries in Southeastern Europe, Eastern Europe, the Caucasus, Central Asia, North Asia, and East Asia. , around 250 million people in Eurasia use Cyrillic as the official script for their national languages, with Russia accounting for about half of them. With the accession of Bulgaria to the European Union on 1 January 2007, Cyrillic became the third official script of the European Union, following the Latin script, Latin and Greek alphabet, Greek alphabets. The Early Cyrillic alphabet was developed during the 9th century AD at the Preslav Literary School in the First Bulgarian Empire during the reign of tsar Simeon I of Bulgar ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Euro Sign
The euro sign () is the currency sign used for the euro, the official currency of the eurozone and unilaterally adopted by Kosovo and Montenegro. The design was presented to the public by the European Commission on 12 December 1996. It consists of a stylized letter E (or epsilon), crossed by two lines instead of one. In English, the sign immediately precedes the value (for instance, €10); in most other European languages, it follows the value, usually but not always with an intervening space (for instance, 10€, 10€). Design There were originally 32 proposed designs for a symbol for Europe's new common currency; the Commission short-listed these to ten candidates. These ten were put to a public survey. After the survey had narrowed the original ten proposals down to two, it was up to the Commission to choose the final design. The other designs that were considered are not available for the public to view, nor is any information regarding the designers available for pub ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Character Encoding
Character encoding is the process of assigning numbers to Graphics, graphical character (computing), characters, especially the written characters of Language, human language, allowing them to be Data storage, stored, Data communication, transmitted, and Computing, transformed using Digital electronics, digital computers. The numerical values that make up a character encoding are known as "code points" and collectively comprise a "code space", a "code page", or a "Character Map (Windows), character map". Early character codes associated with the optical or electrical Telegraphy, telegraph could only represent a subset of the characters used in written languages, sometimes restricted to Letter case, upper case letters, Numeral system, numerals and some punctuation only. The low cost of digital representation of data in modern computer systems allows more elaborate character codes (such as Unicode) which represent most of the characters used in many written languages. Character enc ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Russian Language
Russian (russian: русский язык, russkij jazyk, link=no, ) is an East Slavic languages, East Slavic language mainly spoken in Russia. It is the First language, native language of the Russians, and belongs to the Indo-European languages, Indo-European language family. It is one of four living East Slavic languages, and is also a part of the larger Balto-Slavic languages. Besides Russia itself, Russian is an official language in Belarus, Kazakhstan, and Kyrgyzstan, and is used widely as a lingua franca throughout Ukraine, the Caucasus, Central Asia, and to some extent in the Baltic states. It was the De facto#National languages, ''de facto'' language of the former Soviet Union,1977 Soviet Constitution, Constitution and Fundamental Law of the Union of Soviet Socialist Republics, 1977: Section II, Chapter 6, Article 36 and continues to be used in public life with varying proficiency in all of the post-Soviet states. Russian has over 258 million total speakers worldwide. ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Windows-1252
Windows-1252 or CP-1252 ( code page 1252) is a single-byte character encoding of the Latin alphabet, used by default in the legacy components of Microsoft Windows for English and many European languages including Spanish, French, and German. It is the most-used single-byte character encoding in the world (on websites at least). , 0.3% of all websites declared use of Windows-1252, but at the same time 1.3% used ISO 8859-1 (while only 8 of the top 1000 websites), which by HTML5 standards should be considered the same encoding, so that 1.6% of websites effectively use Windows-1252. Pages declared as US-ASCII would also count as this character set. An unknown (but probably large) subset of other pages use only the ASCII portion of UTF-8, or only the codes matching Windows-1252 from their declared character set, and could also be counted. Depending on the country, use can be much higher than the global average, e.g., for Brazil according to website use (including ISO-8859-1), use ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


ISO 8859-1
ISO/IEC 8859-1:1998, ''Information technology — 8-bit single-byte coded graphic character sets — Part 1: Latin alphabet No. 1'', is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1987. ISO/IEC 8859-1 encodes what it refers to as "Latin alphabet no. 1", consisting of 191 characters from the Latin script. This character-encoding scheme is used throughout the Americas, Western Europe, Oceania, and much of Africa. It is the basis for some popular 8-bit character sets and the first two blocks of characters in Unicode. ISO-8859-1 was (according to the standard, at least) the default encoding of documents delivered via HTTP with a MIME type beginning with "text/" (HTML5 changed this to Windows-1252). , 1.3% of all (but only 8 of the top 1000) web sites use . It is the most ''declared'' single-byte character encoding in the world on the Web, but as Web browsers interpret it as the superset Windows-1252, the documents m ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Unicode
Unicode, formally The Unicode Standard,The formal version reference is is an information technology Technical standard, standard for the consistent character encoding, encoding, representation, and handling of Character (computing), text expressed in most of the world's writing systems. The standard, which is maintained by the Unicode Consortium, defines as of the current version (15.0) 149,186 characters covering 161 modern and historic script (Unicode), scripts, as well as symbols, emoji (including in colors), and non-visual control and formatting codes. Unicode's success at unifying character sets has led to its widespread and predominant use in the internationalization and localization of computer software. The standard has been implemented in many recent technologies, including modern operating systems, XML, and most modern programming languages. The Unicode character repertoire is synchronized with Universal Coded Character Set, ISO/IEC 10646, each being code-for-code id ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]