Code page 942 (abbreviated as CP942 or IBM-942) is one of IBM's extensions of Shift JIS. The coded character sets are JIS X 0201, JIS X 0208, IBM extensions for IBM 1880 UDC and IBM extensions. It is the combination of the single-byte Code page 1041 and the double-byte Code page 301.
It is a superset of IBM-932, differing in its use of Code page 1041 in place of Code page 897 for its single-byte codes. Code page 1041 is an extension of Code page 897 and adds five single-byte characters: 0x80 is mapped to the cent sign (¢), 0xA0 is mapped to the pound sign (£), 0xFD is mapped to the not sign (¬), 0xFE is mapped to the backslash (\) and 0xFF is mapped to the tilde (~). These are all unassigned in Code page 897 and therefore in IBM-932.
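The five additions can be written out as a small lookup table. The following is a minimal Python sketch, not an official mapping file: the names `CP1041_ADDITIONS` and `decode_cp1041_addition` are hypothetical, and the Unicode code points are simply the standard ones for the characters named above.

```python
# Single-byte code points that Code page 1041 adds to Code page 897,
# as listed in the text. Names here are illustrative only.
CP1041_ADDITIONS = {
    0x80: "\u00A2",  # cent sign
    0xA0: "\u00A3",  # pound sign
    0xFD: "\u00AC",  # not sign
    0xFE: "\\",      # backslash
    0xFF: "~",       # tilde
}


def decode_cp1041_addition(byte):
    """Return the character for one of the five added bytes, or None."""
    return CP1041_ADDITIONS.get(byte)
```

A byte outside these five, such as 0x41, returns `None` here because the sketch covers only the additions and not the rest of the code page.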
Code page 942 contains standard 7-bit ISO 646 codes, and Japanese characters are indicated by the high bit of the first byte being set to 1. Some code points in this page require a second byte, so characters use either 8 or 16 bits for encoding.
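The mixed one- and two-byte layout can be illustrated by splitting a byte string into character units. This is a rough sketch rather than a decoder: it assumes the lead-byte ranges customarily used by Shift JIS (0x81 to 0x9F and 0xE0 to 0xFC), which the text above does not state, and the names `is_lead_byte` and `split_units` are hypothetical.

```python
def is_lead_byte(b):
    # Assumption: the customary Shift JIS lead-byte ranges apply here.
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def split_units(data):
    """Split bytes into 1-byte and 2-byte units without decoding them."""
    units = []
    i = 0
    while i < len(data):
        # A lead byte consumes the following byte as well; a lead byte
        # at the very end of the input is emitted alone as a 1-byte unit.
        if is_lead_byte(data[i]) and i + 1 < len(data):
            units.append(data[i:i + 2])
            i += 2
        else:
            units.append(data[i:i + 1])
            i += 1
    return units
```

Under that assumption, `split_units(b"A\x82\xa0B")` yields three units (one byte, two bytes, one byte), while bytes such as 0x80 or 0xFF are treated as single-byte characters.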
Code page 1041, and therefore Code page 942, uses 0x5C for the yen sign (¥) and 0x7E for the overline (‾), matching the lower half of JIS X 0201 rather than US-ASCII. However, the version of Code page 942 used in International Components for Unicode (called "ibm-942_P12A-1999" or "x-IBM942C") uses US-ASCII mappings for single-byte characters between 0x20 and 0x7E. This results in duplicate mappings for the tilde (0x7E and 0xFF) and the backslash (0x5C and 0xFE).
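The difference between the two variants, and the duplicates it produces, can be sketched with two small tables. Only the code points discussed in the text are included; the names `EXTRAS`, `IBM_STYLE`, `ICU_STYLE` and `duplicates` are hypothetical, and this is not ICU's actual conversion table.

```python
# Code points discussed in the text only; not a full conversion table.
EXTRAS = {0xFE: "\\", 0xFF: "~"}  # two of the bytes added by Code page 1041

# Lower half following JIS X 0201: yen sign and overline.
IBM_STYLE = {0x5C: "\u00A5", 0x7E: "\u203E", **EXTRAS}

# Lower half following US-ASCII: backslash and tilde.
ICU_STYLE = {0x5C: "\\", 0x7E: "~", **EXTRAS}


def duplicates(table):
    """Map each character reachable from more than one byte to those bytes."""
    by_char = {}
    for byte, char in sorted(table.items()):
        by_char.setdefault(char, []).append(byte)
    return {c: bs for c, bs in by_char.items() if len(bs) > 1}
```

With these tables, `duplicates(IBM_STYLE)` is empty, while `duplicates(ICU_STYLE)` reports the backslash at 0x5C and 0xFE and the tilde at 0x7E and 0xFF.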
Layout
See also
* Code page 943
References
External links
IBM Code Page 942