Johab
KS X 1001, "''Code for Information Interchange (Hangul and Hanja)''", formerly called KS C 5601, is a South Korean coded character set standard to represent Hangul The Korean alphabet is the modern writing system for the Korean language. In North Korea, the alphabet is known as (), and in South Korea, it is known as (). The letters for the five basic consonants reflect the shape of the speech organs ... and Hanja characters on a computer. KS X 1001 is encoded by the most common legacy (pre-Unicode) character encodings for Korean language, Korean, including EUC-KR and Microsoft's Unified Hangul Code (UHC). It contains Korean Hangul syllables, CJK ideographs (Hanja), Greek alphabet, Greek, Cyrillic script, Cyrillic, Japanese (Hiragana and Katakana) and some other characters. KS X 1001 is arranged as a 94×94 table, following the structure of 2-byte code words in ISO 2022 and Extended Unix Code, EUC. Therefore, its code points are ordered pair, pairs of integers 1–94. ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Johab Encoding
KS X 1001, "''Code for Information Interchange (Hangul and Hanja)''", formerly called KS C 5601, is a South Korean coded character set standard to represent Hangul and Hanja characters on a computer. KS X 1001 is encoded by the most common legacy (pre-Unicode) character encodings for Korean, including EUC-KR and Microsoft's Unified Hangul Code (UHC). It contains Korean Hangul syllables, CJK ideographs (Hanja), Greek, Cyrillic, Japanese (Hiragana and Katakana) and some other characters. KS X 1001 is arranged as a 94×94 table, following the structure of 2-byte code words in ISO 2022 and EUC. Therefore, its code points are pairs of integers 1–94. However, some encodings (UHC and Johab), in addition to providing codes for every code point, provide additional codes for characters otherwise representable only as code point sequences. History This standard was previously known as KS C 5601. There have been several revisions of this standard. For example, there were revisions i ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
KPS 9566
KPS 9566 ("''DPRK Standard Korean Graphic Character Set for Information Interchange''") is a North Korean standard specifying a character encoding for the Chosŏn'gŭl (Hangul) writing system used for the Korean language. The edition of 1997 specified an ISO 2022-compliant 94×94 two-byte coded character set. Subsequent editions have added additional encoded characters outside of the 94×94 plane, in a manner comparable to UHC or GBK. KPS 9566 differs in approach from KS X 1001, its South Korean counterpart, in using a different ordering of Chosŏn'gŭl, in encoding explicit vertical presentation forms of punctuation, in not encoding duplicate Hanja for multiple readings, and in including several characters specific to the North Korean political system, including special encodings for the names of the country's past and present leaders (Kim Il Sung, Kim Jong Il and Kim Jong Un). Although KPS 9566 was the original source of several characters added to Unicode, not al ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Unified Hangul Code
Unified Hangul Code (UHC), or Extended Wansung, also known under Microsoft Windows as Code Page 949 (Windows-949, MS949 or ambiguously CP949), is the Microsoft Windows code page for the Korean language. It is an extension of Wansung Code ( KS C 5601:1987, encoded as EUC-KR) to include all 11172 non-partial Hangul syllables present in Johab (KS C 5601:1992 annex 3). This corresponds to the pre-composed syllables available in Unicode 2.0 and later. Wansung Code has the drawback that it only assigns codes for the 2350 precomposed Hangul syllables which have their own KS X 1001 (KS C 5601) codepoints (out of 11172 in total, not counting those using obsolete jamo), and requires others to use eight-byte composition sequences, which are not supported by some partial implementations of the standard. UHC resolves this by assigning single codes for all possible syllables constructed using modern jamo, by making assignments outside of the encoding space used for KS X 1001. The lead byte ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
ISO 2022
ISO/IEC 2022 ''Information technology—Character code structure and extension techniques'', is an ISO/IEC standard in the field of character encoding. It is equivalent to the ECMA standard ECMA-35, the ANSI standard ANSI X3.41 and the Japanese Industrial Standard JIS X 0202. Originating in 1971, it was most recently revised in 1994. ISO 2022 specifies a general structure which character encodings can conform to, dedicating particular ranges of bytes ( 0x00–1F and 0x7F–9F) to be used for non-printing control codes for formatting and in-band instructions (such as line breaks or formatting instructions for text terminals), rather than graphical characters. It also specifies a syntax for escape sequences, multiple-byte sequences beginning with the control code, which can likewise be used for in-band instructions. Specific sets of control codes and escape sequences designed to be used with ISO 2022 include ISO/IEC 6429, portions of which are implemented by ANSI.SYS and te ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
ISO-2022
ISO/IEC 2022 ''Information technology—Character code structure and extension techniques'', is an International Organization for Standardization, ISO/International Electrotechnical Commission, IEC standard in the field of character encoding. It is equivalent to the Ecma International, ECMA standard ECMA-35, the ANSI standard ANSI X3.41 and the Japanese Industrial Standard JIS X 0202. Originating in 1971, it was most recently revised in 1994. ISO 2022 specifies a general structure which character encodings can conform to, dedicating particular ranges of bytes (Hexadecimal, 0x00–1F and 0x7F–9F) to be used for non-printing C0 and C1 control codes, control codes for formatting and in-band instructions (such as newline, line breaks or formatting instructions for text terminals), rather than graphic character, graphical characters. It also specifies a syntax for escape sequences, multiple-byte sequences beginning with the control code, which can likewise be used for in-band instr ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
EUC-KR
Extended Unix Code (EUC) is a multibyte character encoding system used primarily for Japanese language, Japanese, Korean language, Korean, and simplified Chinese characters, simplified Chinese (characters). The most commonly used EUC codes are variable-width encoding, variable-length encodings with a character belonging to an compliant coded character set (such as ASCII) taking one byte, and a character belonging to a 94×94 coded character set (such as ) represented in two bytes. The EUC-CN form of and EUC-KR are examples of such two-byte EUC codes. EUC-JP includes characters represented by up to three bytes, including an initial , whereas a single character in EUC-TW can take up to four bytes. Modern applications are more likely to use UTF-8, which supports all of the glyphs of the EUC codes, and more, and is generally more portable with fewer vendor deviations and errors. EUC is however still very popular, especially EUC-KR for South Korea. Encoding structure The structure ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Hanja
Hanja (; ), alternatively spelled Hancha, are Chinese characters used to write the Korean language. After characters were introduced to Korea to write Literary Chinese, they were adapted to write Korean as early as the Gojoseon period. () refers to Sino-Korean vocabulary, which can be written with Hanja, and () refers to Classical Chinese writing, although ''Hanja'' is also sometimes used to encompass both concepts. Because Hanja characters have never undergone any major reforms, they more closely resemble traditional Chinese and kyūjitai, traditional Japanese characters, although the stroke orders for certain characters are slightly different. Such examples are the characters and , as well as and . Only a small number of Hanja characters were modified or are unique to Korean, with the rest being identical to the traditional Chinese characters. By contrast, many of the Chinese characters currently in use in mainland China, Malaysia and Singapore have been simplified Chin ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Hangul
The Korean alphabet is the modern writing system for the Korean language. In North Korea, the alphabet is known as (), and in South Korea, it is known as (). The letters for the five basic consonants reflect the shape of the speech organs used to pronounce them. They are systematically modified to indicate Phonetics, phonetic features. The vowel letters are systematically modified for related sounds, making Hangul a featural writing system. It has been described as a syllabic alphabet as it combines the features of Alphabet, alphabetic and Syllabary, syllabic writing systems. Hangul was created in 1443 by Sejong the Great, the fourth king of the Joseon dynasty. The alphabet was made as an attempt to increase literacy by serving as a complement to Hanja, which were Chinese characters used to write Literary Chinese in Korea by the 2nd century BCE, and had been adapted to write Korean by the 6th century CE. Modern Hangul orthography uses 24 basic letters: 14 consona ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Character Encoding
Character encoding is the process of assigning numbers to graphical character (computing), characters, especially the written characters of human language, allowing them to be stored, transmitted, and transformed using computers. The numerical values that make up a character encoding are known as code points and collectively comprise a code space or a code page. Early character encodings that originated with optical or electrical telegraphy and in early computers could only represent a subset of the characters used in written languages, sometimes restricted to Letter case, upper case letters, Numeral system, numerals and some punctuation only. Over time, character encodings capable of representing more characters were created, such as ASCII, the ISO/IEC 8859 encodings, various computer vendor encodings, and Unicode encodings such as UTF-8 and UTF-16. The Popularity of text encodings, most popular character encoding on the World Wide Web is UTF-8, which is used in 98.2% of surve ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
GB 2312
is a key official character set of the People's Republic of China, used for Simplified Chinese characters. GB2312 is the registered internet name for EUC-CN, which is its usual encoded form. ''GB'' refers to the Guobiao standards (国家标准), whereas the ''T'' suffix ( zh, c= 推荐, p=tuījiàn, l=recommendation, labels=no) denotes a non-mandatory standard. was originally a mandatory national standard designated . However, following a National Standard Bulletin of the People's Republic of China in 2017, GB 2312 is no longer mandatory, and its standard code is modified to . has been superseded by GBK and GB 18030, which include additional characters, but remains in widespread use as a subset of those encodings. , GB2312 is the second-most popular encoding served from China and territories (after UTF-8), with 5.5% of web servers serving a page declaring it. Globally, GB2312 is declared on 0.1% of all web pages. However, all major web browsers decode GB2312-marked document ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
JIS X 0208
JIS X 0208 is a 2-byte character set specified as a Japanese Industrial Standards, Japanese Industrial Standard, containing 6879 graphic characters suitable for writing text, place names, personal names, and so forth in the Japanese language. The official title of the current standard is . It was originally established as JIS C 6226 in 1978, and has been revised in 1983, 1990, and 1997. It is also called Code page 952 by IBM. The 1978 version is also called Code page 955 by IBM. Scope of use and compatibility The character set JIS X 0208 establishes is primarily for the purpose of between data processing systems and the devices connected to them, or mutually between data communication systems. This character set can be used for data processing and text processing. Partial implementations of the character set are not considered compatible. Because there are places where such things have happened as the original drafting committee of the first standard taking care to separate c ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
GB 12052
GB 12052-89, entitled ''Korean character coded character set for information interchange'' ( zh, s=信息交换用朝鲜文字编码字符集), is a character set standard established by China for the Korean language in China. It consists of a total of 5,979 characters, and has no relationship nor compatibility with South Korea's KS X 1001 and North Korea's KPS 9566. Characters Characters in GB 12052 are arranged in a 94×94 grid (as in ISO/IEC 2022), and the two-byte code point of each character is expressed in the ''qu''-''wei'' form, which specifies a row (''qu'' ) and the position of the character within the row (cell, ''wei'' ). The rows (numbered from 1 to 94) contain characters as follows: * 01–09: identical to GB 2312, except 03-04 ( in GB 2312, in GB 12052) * 16–37: modern Hangul syllables and ''jamo'', level 1 (2,017 syllables and 51 ''jamo'') * 38–52: modern Hangul syllables, level 2 (1,356 characters) * 53–72: archaic Hangul syllables and ''jamo'' (1,683 sy ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |