IBM code page 936 is a character encoding for
Simplified Chinese
including 1,880 user-defined characters (UDC); it was superseded in 1993. It combines the single-byte
Code page 903 with the double-byte Code page 928.
Code page 946 uses the same double-byte component, but with an extended single-byte component (Code page 1042).
IBM code page 936 should not be confused with
the identically numbered Windows code page, which is a variant of the
GBK encoding;
GBK is called
Code page 1386
by IBM. While GBK is a superset of the
EUC-CN encoding of
GB 2312
, IBM-936 uses a different coded form of GB 2312, whose relationship to GB 2312 more closely resembles that of
Shift JIS
to
JIS X 0208
.
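The shared number is easy to trip over in software. As one illustration (Python is used here only as an example and is not discussed in the sources for this article), Python's standard codec registry resolves the label "cp936" to GBK, that is, to the Windows code page rather than IBM-936. The sketch below also shows the superset relationship: a character covered by GB 2312 gets the same bytes in EUC-CN and GBK, while a character outside GB 2312 can only be encoded in GBK. The helper name `encodable` is hypothetical.

```python
import codecs

# In Python's codec registry, "cp936" is an alias of the GBK codec,
# i.e. the Windows code page, not IBM code page 936.
cp936_codec_name = codecs.lookup("cp936").name

# A character covered by GB 2312 has identical bytes in EUC-CN
# (Python's "gb2312" codec) and in GBK.
sample = "中"
euc_cn_bytes = sample.encode("gb2312")
gbk_bytes = sample.encode("gbk")


def encodable(text: str, encoding: str) -> bool:
    """Return True if every character of text can be encoded."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(cp936_codec_name)                      # codec behind "cp936"
print(euc_cn_bytes == gbk_bytes)             # same bytes in both encodings
# A traditional-form character: outside GB 2312, inside GBK.
print(encodable("國", "gb2312"), encodable("國", "gbk"))
```

Bytes produced by IBM-936 for the same characters differ from both of these encodings, so decoding IBM-936 data with a "cp936" codec of this kind would yield incorrect text.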
History

The encoding was in use mainly during the 1980s and early 1990s. While the original IBM PC (
IBM 5150) lacked functionality for processing data in
CJK languages, the
IBM 5550 possessed such functionality, and was available in models supporting Japanese,
Korean,
Traditional Chinese
or
Simplified Chinese
. Code page 936 for Simplified Chinese accompanied
code page 932 (
Shift JIS
) for Japanese,
code page 934 for Korean and
code page 938 for Traditional Chinese.
The last revision of IBM-928/936/946 was documented in 1992, and it was superseded in 1993 by the
EUC-CN-based
code pages 1380 through 1383; code page 1380 encodes the same characters as code page 928, but in a different layout.
As of 1998, "some older Chinese packages" still included an algorithm for converting between IBM-936 and other encodings of GB 2312.
Status
Although chart definitions for Code page 1380 (the document C-H 3-3220-130 1993-11) are provided online by IBM, IBM does not similarly provide the chart definition for the older Code page 928 (the document C-H 3-3220-130 1992-11, i.e. an earlier revision of the same specification).
International Components for Unicode
(ICU) does not include an IBM-936 or IBM-946 codec, and uses the Windows code page for the "cp936" label. The ICU project does possess mapping data for IBM-946, which it makes publicly available,
but does not ship it with ICU.
Structure
Code page 928, the double-byte component, includes 9,355 characters as double-byte sequences starting with 0x81 through 0xAC and 0xF0 through 0xFA.
The 0x81–AC lead byte range is used for GB 2312 characters: lead bytes 0x81–87 are used for non-hanzi, 0x88–9C are used for level 1 hanzi and 0x9C–AC are used for level 2 hanzi.
Like
Shift JIS
, trail (second) bytes are in the range 0x40–FC excluding 0x7F, allowing two GB 2312 rows to be encoded per lead byte;
unlike Shift JIS, the bytes 0xA0–AC are not excluded from the lead byte range,
since
JIS X 0201
compatibility was not required. The 0xF0–FA lead byte range is used for IBM extensions: 0xF0 through 0xF9 are used for user-defined characters, and 0xFA is used for additional non-hanzi.
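The byte ranges described above can be checked with a short sketch. The function and constant names are illustrative and not taken from any IBM specification, and the code only classifies bytes by range; it does not map them to characters, since the mapping table itself is not reproduced here.

```python
from typing import Optional

# Trail (second) byte range stated above: 0x40-0xFC, excluding 0x7F.
TRAIL_MIN, TRAIL_MAX, TRAIL_EXCLUDED = 0x40, 0xFC, 0x7F


def classify_lead(byte: int) -> Optional[str]:
    """Name the Code page 928 region a lead byte falls in, or None."""
    if 0x81 <= byte <= 0xAC:
        return "gb2312"          # GB 2312 characters
    if 0xF0 <= byte <= 0xF9:
        return "user-defined"    # IBM extension: user-defined characters
    if byte == 0xFA:
        return "ibm-non-hanzi"   # IBM extension: additional non-hanzi
    return None                  # not a double-byte lead byte


def is_valid_trail(byte: int) -> bool:
    """Check a byte against the stated trail byte range."""
    return TRAIL_MIN <= byte <= TRAIL_MAX and byte != TRAIL_EXCLUDED


# Count the code points available per lead byte and in the UDC area.
trail_count = sum(1 for b in range(256) if is_valid_trail(b))
udc_leads = [b for b in range(256) if classify_lead(b) == "user-defined"]
udc_capacity = len(udc_leads) * trail_count

print(trail_count, udc_capacity)
```

Each lead byte has 188 valid trail bytes, which is room for two 94-cell GB 2312 rows, and the ten user-defined lead bytes give 10 × 188 = 1,880 positions, matching the user-defined character count given for the encoding.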