JOHAB
KS X 1001, "Code for Information Interchange (Hangul and Hanja)", formerly called KS C 5601, is a South Korean coded character set standard for representing hangul and hanja characters on a computer. KS X 1001 is encoded by the most common legacy (pre-Unicode) character encodings for Korean, including EUC-KR and Microsoft's Unified Hangul Code (UHC). It contains Korean Hangul syllables, CJK ideographs (Hanja), Greek, Cyrillic, Japanese (Hiragana and Katakana) and some other characters. KS X 1001 is arranged as a 94×94 table, following the structure of 2-byte code words in ISO 2022 and EUC. Its code points are therefore pairs of integers in the range 1–94. However, some encodings (UHC and Johab), in addition to providing codes for every code point, provide additional codes for characters otherwise representable only as code point sequences. History: This standard was previously known as KS C 5601. There have been several revisions of this standard. For example, there were revisions i ...

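The 94×94 arrangement described above can be sketched in a few lines. This is a minimal illustration, not the standard's own wording: it assumes the usual EUC convention that each index (1–94) is offset by 0xA0 to form a byte, and the function names `ksx1001_to_euc_kr` and `euc_kr_to_ksx1001` are hypothetical helpers introduced only for this example.

```python
def ksx1001_to_euc_kr(row: int, cell: int) -> bytes:
    """Map a KS X 1001 code point (row, cell), each in 1..94, to two EUC-KR bytes."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must both be in 1..94")
    # Assumption: EUC stores the 94x94 set in the high half of the byte range,
    # so each index is offset by 0xA0 (giving bytes 0xA1..0xFE).
    return bytes([0xA0 + row, 0xA0 + cell])


def euc_kr_to_ksx1001(pair: bytes) -> tuple:
    """Inverse mapping: two EUC-KR bytes back to a (row, cell) pair."""
    if len(pair) != 2 or not all(0xA1 <= b <= 0xFE for b in pair):
        raise ValueError("expected two bytes in 0xA1..0xFE")
    return pair[0] - 0xA0, pair[1] - 0xA0


if __name__ == "__main__":
    encoded = ksx1001_to_euc_kr(16, 1)          # row 16, cell 1
    print(encoded.hex(), encoded.decode("euc_kr"))
    print(euc_kr_to_ksx1001("가".encode("euc_kr")))
```

Python's built-in `euc_kr` codec is used here only to check the arithmetic against a real decoder.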
KPS 9566
KPS 9566 ("DPRK Standard Korean Graphic Character Set for Information Interchange") is a North Korean standard specifying a character encoding for the Chosŏn'gŭl (Hangul) writing system used for the Korean language. The edition of 1997 specified an ISO 2022-compliant 94×94 two-byte coded character set. Subsequent editions have added further encoded characters outside of the 94×94 plane, in a manner comparable to UHC or GBK. KPS 9566 differs in approach from KS X 1001, its South Korean counterpart, in using a different ordering of chosŏn'gŭl, in encoding explicit vertical presentation forms of punctuation, in not encoding duplicate hanja for multiple readings, and in including several characters specific to the North Korean political system, including special encodings for the names of the country's past and present leaders (Kim Il-sung, Kim Jong-il and Kim Jong-un). Although KPS 9566 was the original source of several characters added to Unicode, not all ...

Unified Hangul Code
Unified Hangul Code (UHC), or Extended Wansung, also known under Microsoft Windows as Code Page 949 (Windows-949, MS949 or ambiguously CP949), is the Microsoft Windows code page for the Korean language. It is an extension of Wansung Code (KS C 5601:1987, encoded as EUC-KR) to include all 11172 non-partial Hangul syllables present in Johab (KS C 5601:1992 annex 3). This corresponds to the pre-composed syllables available in Unicode 2.0 and later. Wansung Code has the drawback that it only assigns codes for the 2350 precomposed Hangul syllables which have their own KS X 1001 (KS C 5601) codepoints (out of 11172 in total, not counting those using obsolete jamo), and requires others to use eight-byte composition sequences, which are not supported by some partial implementations of the standard. UHC resolves this by assigning single codes for all possible syllables constructed using modern jamo, by making assignments outside of the encoding space used for KS X 1001. The lead byte r ...

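The figure of 11172 syllables can be checked with a short sketch. It relies on facts not stated in the summary above and therefore labelled as assumptions: that modern Hangul uses 19 initial consonants, 21 vowels and 28 final positions (27 consonants plus "none"), and that Unicode lays the precomposed syllables out contiguously from U+AC00 in that order. The names `compose` and `decompose` are hypothetical helpers; `cp949` is Python's codec name for UHC.

```python
HANGUL_BASE = 0xAC00                  # assumed start of the precomposed block
LEADS, VOWELS, TAILS = 19, 21, 28     # assumed modern jamo counts (tail 0 = none)
SYLLABLE_COUNT = LEADS * VOWELS * TAILS


def compose(lead: int, vowel: int, tail: int = 0) -> str:
    """Build a precomposed syllable from zero-based jamo indices."""
    if not (0 <= lead < LEADS and 0 <= vowel < VOWELS and 0 <= tail < TAILS):
        raise ValueError("jamo index out of range")
    return chr(HANGUL_BASE + (lead * VOWELS + vowel) * TAILS + tail)


def decompose(syllable: str) -> tuple:
    """Split a precomposed syllable into (lead, vowel, tail) indices."""
    index = ord(syllable) - HANGUL_BASE
    if not 0 <= index < SYLLABLE_COUNT:
        raise ValueError("not a precomposed modern Hangul syllable")
    lead, rest = divmod(index, VOWELS * TAILS)
    vowel, tail = divmod(rest, TAILS)
    return lead, vowel, tail


if __name__ == "__main__":
    print(SYLLABLE_COUNT)
    print(decompose("한"))
    # Every syllable should survive a round trip through UHC (cp949).
    syllables = [chr(HANGUL_BASE + i) for i in range(SYLLABLE_COUNT)]
    print(all(s.encode("cp949").decode("cp949") == s for s in syllables))
```

The round-trip loop is a sanity check on the claim that UHC assigns a code to every syllable built from modern jamo.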
Extended Unix Code
Extended Unix Code (EUC) is a multibyte character encoding system used primarily for Japanese, Korean, and simplified Chinese. The most commonly used EUC codes are variable-length encodings, with a character belonging to an ISO/IEC 646-compliant coded character set (such as ASCII) taking one byte, and a character belonging to a 94×94 coded character set (such as GB 2312) represented in two bytes. The EUC-CN form of GB 2312 and EUC-KR are examples of such two-byte EUC codes. EUC-JP includes characters represented by up to three bytes, including an initial shift code, whereas a single character in EUC-TW can take up to four bytes. Modern applications are more likely to use UTF-8, which supports all of the glyphs of the EUC codes, and more, and is generally more portable with fewer vendor deviations and errors. EUC is however still very popular, especially EUC-KR for South Korea. Encoding structure: The structure of EUC is based on the ISO/IEC 2022 standard, which specifies a system of graphical character sets which can be repres ...

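The one-byte/two-byte split described above can be illustrated with a small scanner for EUC-KR text. This is a sketch under stated assumptions: bytes below 0x80 are treated as single-byte ASCII, bytes 0xA1–0xFE as the lead byte of a two-byte character, and anything else (including the shift codes used by EUC-JP and EUC-TW) is rejected. `split_euc_kr` is a hypothetical name.

```python
def split_euc_kr(data: bytes) -> list:
    """Split an EUC-KR byte string into per-character byte sequences."""
    chars = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            width = 1                      # ASCII range: one byte
        elif 0xA1 <= lead <= 0xFE:
            width = 2                      # 94x94 set: lead byte plus trail byte
        else:
            raise ValueError(f"unexpected byte 0x{lead:02X} at offset {i}")
        if i + width > len(data):
            raise ValueError("truncated two-byte character at end of input")
        chars.append(data[i:i + width])
        i += width
    return chars


if __name__ == "__main__":
    raw = "A한B".encode("euc_kr")
    for piece in split_euc_kr(raw):
        print(piece.hex(), piece.decode("euc_kr"))
```

The scanner deliberately does not validate the trail byte; a stricter version would also check that it falls in 0xA1–0xFE.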
ISO-2022
ISO/IEC 2022, "Information technology – Character code structure and extension techniques", is an ISO/IEC standard (equivalent to the ECMA standard ECMA-35, the ANSI standard ANSI X3.41 and the Japanese Industrial Standard JIS X 0202) in the field of character encoding. Originating in 1971, it was most recently revised in 1994. ISO 2022 specifies a general structure which character encodings can conform to, dedicating particular ranges of bytes (0x00–1F and 0x7F–9F) to non-printing control codes for formatting and in-band instructions (such as line breaks or formatting instructions for text terminals), rather than graphical characters. It also specifies a syntax for escape sequences, multiple-byte sequences beginning with the ESC control code, which can likewise be used for in-band instructions. Specific sets of control codes and escape sequences designed to be used with ISO 2022 include ISO/IEC 6429, portions of which are implemented by ANSI.SYS and terminal emu ...

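The byte ranges quoted above translate directly into a small classifier. The sketch below takes the ranges 0x00–1F and 0x7F–9F from the text; that escape sequences begin with ESC (0x1B) is an assumption added here, and the function names are hypothetical.

```python
ESC = 0x1B  # assumed value of the control code that introduces an escape sequence


def is_control_byte(byte: int) -> bool:
    """True if the byte falls in the ranges reserved for control codes."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in 0..255")
    return byte <= 0x1F or 0x7F <= byte <= 0x9F


def starts_escape_sequence(data: bytes) -> bool:
    """True if the data begins with the ESC control code."""
    return len(data) > 0 and data[0] == ESC


if __name__ == "__main__":
    for value in (0x0A, 0x1B, 0x41, 0x7F, 0x9F, 0xA1):
        kind = "control" if is_control_byte(value) else "graphic"
        print(f"0x{value:02X}: {kind}")
    print(starts_escape_sequence(b"\x1b$)C"))
```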
EUC-KR
Extended Unix Code (EUC) is a multibyte character encoding system used primarily for Japanese, Korean, and simplified Chinese. The most commonly used EUC codes are variable-length encodings, with a character belonging to an ISO/IEC 646-compliant coded character set (such as ASCII) taking one byte, and a character belonging to a 94×94 coded character set (such as GB 2312) represented in two bytes. The EUC-CN form of GB 2312 and EUC-KR are examples of such two-byte EUC codes. EUC-JP includes characters represented by up to three bytes, including an initial shift code, whereas a single character in EUC-TW can take up to four bytes. Modern applications are more likely to use UTF-8, which supports all of the glyp ...

GB 12052
GB 12052-89, entitled "Korean character coded character set for information interchange" (Chinese: 信息交换用朝鲜文字编码字符集), is a Korean-language character set standard established by China. It consists of a total of 5,979 characters, and has no relationship to, or compatibility with, South Korea's KS X 1001 or North Korea's KPS 9566. Characters: Characters in GB 12052 are arranged in a 94×94 grid (as in ISO/IEC 2022), and the two-byte code point of each character is expressed in the qu-wei form, which specifies a row (qu 区) and the position of the character within the row (cell, wei 位). The rows (numbered from 1 to 94) contain characters as follows:
* 01–09: identical to GB 2312, except 03-04 (¥ in GB 2312, $ in GB 12052)
* 16–37: modern hangul syllables and jamo, level 1 (2,017 syllables and 51 jamo)
* 38–52: modern hangul syllables, level 2 (1,356 characters)
* 53–72: archaic hangul syllables and jamo (1,683 syllables ...

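The qu-wei form and the row ranges listed above can be sketched as a lookup. Only the four ranges that survive in the truncated list are included; rows outside them return `None` rather than a guess. The names `format_quwei` and `describe_row` are hypothetical.

```python
# Row ranges taken from the (truncated) list above; later rows are not covered.
ROW_RANGES = [
    (1, 9, "identical to GB 2312, except 03-04"),
    (16, 37, "modern hangul syllables and jamo, level 1"),
    (38, 52, "modern hangul syllables, level 2"),
    (53, 72, "archaic hangul syllables and jamo"),
]


def format_quwei(qu: int, wei: int) -> str:
    """Render a code point as a zero-padded qu-wei string such as '03-04'."""
    if not (1 <= qu <= 94 and 1 <= wei <= 94):
        raise ValueError("qu and wei must both be in 1..94")
    return f"{qu:02d}-{wei:02d}"


def describe_row(qu: int):
    """Return the description for a row, or None if the row is not listed above."""
    if not 1 <= qu <= 94:
        raise ValueError("qu must be in 1..94")
    for first, last, description in ROW_RANGES:
        if first <= qu <= last:
            return description
    return None


if __name__ == "__main__":
    print(format_quwei(3, 4), describe_row(3))
    print(format_quwei(40, 1), describe_row(40))
    print(format_quwei(80, 1), describe_row(80))
```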
Character Encoding
Character encoding is the process of assigning numbers to graphical characters, especially the written characters of human language, allowing them to be stored, transmitted, and transformed using digital computers. The numerical values that make up a character encoding are known as "code points" and collectively comprise a "code space", a "code page", or a "character map". Early character codes associated with the optical or electrical telegraph could only represent a subset of the characters used in written languages, sometimes restricted to upper case letters, numerals and some punctuation only. The low cost of digital representation of data in modern computer systems allows more elaborate character codes (such as Unicode) which represent most of the characters used in many written languages. Character enc ...

Hangul
The Korean alphabet, known as Hangul (한글) in South Korea and Chosŏn'gŭl in North Korea, is the modern official writing system for the Korean language. (Hangul may also be written as Hangeul, following South Korea's standard romanization.) The letters for the five basic consonants reflect the shape of the speech organs used to pronounce them, and they are systematically modified to indicate phonetic features; similarly, the vowel letters are systematically modified for related sounds, making Hangul a featural writing system. It has been described as a syllabic alphabet as it combines the features of alphabetic and syllabic writing systems, although it is not necessarily an abugida. Hangul was created in 1443 CE by King Sejong the Great in an attempt to increase literacy by serving as a complement (or alternative) to the logographic Sino-Korean Hanja, which had been used by Koreans as their primary script to write the Korean language since as early as the Gojoseon period (spanni ...

Hanja
Hanja (Hangul: 한자; Hanja: 漢字), alternatively known as Hancha, are Chinese characters used in the writing of Korean. Hanja was used as early as the Gojoseon period, the first ever Korean kingdom. Hanja-eo refers to Sino-Korean vocabulary, which can be written with Hanja, and hanmun refers to Classical Chinese writing, although "Hanja" is also sometimes used to encompass both concepts. Because Hanja never underwent any major reforms, they mostly resemble kyūjitai and traditional Chinese characters, although the stroke orders for some characters are slightly different. Only a small number of Hanja characters were modified or are unique to Korean, with the rest being identical to the traditional Chinese characters. By contrast, many of the Chinese characters currently in use in mainland China, Malaysia and Singapore have been simplified, and contain fewer strokes than the corresponding Hanja characters. In Japan, s ...

Unicode
Unicode, formally The Unicode Standard, is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, which is maintained by the Unicode Consortium, defines as of the current version (15.0) 149,186 characters covering 161 modern and historic scripts, as well as symbols, emoji (including in colors), and non-visual control and formatting codes. Unicode's success at unifying character sets has led to its widespread and predominant use in the internationalization and localization of computer software. The standard has been implemented in many recent technologies, including modern operating systems, XML, and most modern programming languages. The Unicode character repertoire is synchronized with ISO/IEC 10646, each being code-for-code id ...