Code Page 936 (IBM)

IBM code page 936 is a character encoding for Simplified Chinese that includes 1880 user-defined characters (UDC); it was superseded in 1993. It is a combination of the single-byte Code page 903 and the double-byte Code page 928. Code page 946 uses the same double-byte component, but an extended single-byte component (Code page 1042). IBM code page 936 should not be confused with the identically numbered Windows code page, which is a variant of the GBK encoding; GBK is called Code page 1386 by IBM. While GBK is a superset of the EUC-CN encoding of GB 2312, IBM-936 uses a different coded form of GB 2312, one whose relationship to GB 2312 more closely resembles that of Shift JIS to JIS X 0208.


History

The encoding was in use mainly during the 1980s and early 1990s. While the original IBM PC (IBM 5150) lacked functionality for processing data in CJK languages, the IBM 5550 possessed such functionality, and was available in models supporting Japanese, Korean, Traditional Chinese or Simplified Chinese. Code page 936 for Simplified Chinese accompanied code page 932 (Shift JIS) for Japanese, code page 934 for Korean and code page 938 for Traditional Chinese. The last revision of IBM-928/936/946 was documented in 1992, and it was superseded in 1993 by the EUC-CN-based code pages 1380 through 1383; code page 1380 encodes the same characters as code page 928, but in a different layout. As of 1998, "some older Chinese packages" still included an algorithm for converting between IBM-936 and other encodings of GB 2312.


Status

Although chart definitions for Code page 1380 (the document C-H 3-3220-130 1993-11) are provided online by IBM, IBM does not similarly provide the chart definition for the older Code page 928 (the document C-H 3-3220-130 1992-11, i.e. an earlier revision of the same specification).
International Components for Unicode (ICU) does not include an IBM-936 or IBM-946 codec, and uses the Windows code page for the "cp936" label. The ICU project does possess mapping data for IBM-946, which it makes publicly available, but does not ship it with ICU.
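The same label ambiguity can be observed outside ICU. As an illustration only (this uses Python's standard `codecs` module, not ICU), the sketch below checks which codec the "cp936" label resolves to and whether the resulting bytes could be IBM-936 double-byte sequences. The set `IBM_928_LEAD_BYTES` is a hypothetical helper built from the lead byte ranges given under Structure.

```python
import codecs

# Lead byte ranges of Code page 928 as stated in this article
# (0x81-0xAC and 0xF0-0xFA); the name is hypothetical.
IBM_928_LEAD_BYTES = set(range(0x81, 0xAD)) | set(range(0xF0, 0xFB))

# Python resolves the "cp936" label to its GBK codec (the Windows code page).
codec_name = codecs.lookup("cp936").name
print(codec_name)  # gbk

# Encode a common GB 2312 level 1 hanzi with that codec.
encoded = "中".encode("cp936")
print(encoded.hex())  # d6d0

# The GBK lead byte 0xD6 falls outside the IBM lead byte ranges,
# so these bytes cannot be an IBM-936 double-byte sequence.
lead_is_ibm_style = encoded[0] in IBM_928_LEAD_BYTES
print(lead_is_ibm_style)  # False
```

Data labelled "cp936" by such tools should therefore be assumed to be GBK unless its origin indicates an IBM system of the relevant era.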


Structure

Code page 928, the double-byte component, includes 9,355 characters as double-byte sequences with lead bytes 0x81 through 0xAC and 0xF0 through 0xFA. The 0x81–AC lead byte range is used for GB 2312 characters: lead bytes 0x81–87 are used for non-hanzi, 0x88–9C for level 1 hanzi and 0x9C–AC for level 2 hanzi. As in Shift JIS, trail (second) bytes are in the range 0x40–FC excluding 0x7F (188 values, or 2 × 94), allowing two GB 2312 rows to be encoded per lead byte; unlike Shift JIS, the bytes 0xA0–AC are not excluded from the lead byte range, since JIS X 0201 compatibility was not required. The 0xF0–FA lead byte range is used for IBM extensions: 0xF0 through 0xF9 are used for user-defined characters, and 0xFA is used for additional non-hanzi.
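The byte ranges above can be sketched as a small classifier. This is a minimal illustration based only on the ranges stated in this article, not on an IBM mapping table; the names `TRAIL_BYTES`, `is_trail` and `classify_lead` are hypothetical. It also checks two pieces of arithmetic implied by the text: 188 trail bytes cover two 94-cell GB 2312 rows, and ten user-defined lead bytes give 1880 user-defined code points.

```python
# Trail bytes: 0x40-0xFC excluding 0x7F, as stated in the text.
TRAIL_BYTES = [b for b in range(0x40, 0xFD) if b != 0x7F]


def is_trail(byte: int) -> bool:
    """Return True if byte is a valid second byte of a double-byte sequence."""
    return 0x40 <= byte <= 0xFC and byte != 0x7F


def classify_lead(byte: int) -> str:
    """Describe a lead byte using the ranges given in the article."""
    if 0x81 <= byte <= 0x87:
        return "GB 2312 non-hanzi"
    if 0x88 <= byte <= 0x9B:
        return "GB 2312 level 1 hanzi"
    if byte == 0x9C:
        # The article lists 0x9C in both hanzi ranges.
        return "GB 2312 level 1 and level 2 hanzi"
    if 0x9D <= byte <= 0xAC:
        return "GB 2312 level 2 hanzi"
    if 0xF0 <= byte <= 0xF9:
        return "user-defined characters"
    if byte == 0xFA:
        return "IBM additional non-hanzi"
    return "not a lead byte"


# Two 94-cell GB 2312 rows fit under each lead byte.
rows_per_lead = len(TRAIL_BYTES) // 94

# Ten lead bytes (0xF0-0xF9) times 188 trail bytes.
udc_capacity = len(range(0xF0, 0xFA)) * len(TRAIL_BYTES)

print(len(TRAIL_BYTES), rows_per_lead, udc_capacity)  # 188 2 1880
print(classify_lead(0x9C))
print(classify_lead(0xD6))  # a typical GBK lead byte: "not a lead byte"
```

Because the classifier reports 0x9C under both hanzi levels and does not map individual code points, it is only suitable for rough validation of byte streams, not for conversion.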

