Big5
Big-5, or Big5, is a Chinese character encoding used in Taiwan, Hong Kong, and Macau for traditional Chinese characters. The People's Republic of China (PRC), which uses simplified Chinese characters, uses the GB 18030 character set instead. Big5 takes its name from the consortium of five companies in Taiwan that developed it.

Organization

The original Big5 character set is sorted first by usage frequency, then by stroke count, and finally by Kangxi radical. It lacked many commonly used characters, so each vendor developed its own extension. The ETen extension became part of the current Big5 standard through popularity. The structure of Big5 does not conform to the ISO 2022 standard, but rather bears a certain similarity to the Shift JIS encoding. It is a double-byte character set (DBCS) with the following structure (the prefix 0x signifies hexadecimal numbers). Standard assignments (excluding vendor or user-defined extensions) ...
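
The double-byte structure mentioned above can be illustrated with a minimal sketch. The structure table itself is not part of this excerpt, so the byte ranges below are an assumption taken from the commonly documented Big5 layout (lead byte 0x81 to 0xFE, trail byte 0x40 to 0x7E or 0xA1 to 0xFE), and `split_big5` is a hypothetical helper name, not part of any library.

```python
# Sketch only: these ranges are the commonly documented Big5 byte ranges,
# assumed here because the structure table is missing from the excerpt.
LEAD_RANGE = range(0x81, 0xFF)                          # first byte: 0x81-0xFE
TRAIL_RANGES = (range(0x40, 0x7F), range(0xA1, 0xFF))   # second byte: 0x40-0x7E, 0xA1-0xFE


def is_trail_byte(b: int) -> bool:
    """Return True if b is a valid second byte of a double-byte character."""
    return any(b in r for r in TRAIL_RANGES)


def split_big5(data: bytes) -> list[bytes]:
    """Split a byte string into single-byte (ASCII) and double-byte units."""
    units = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # ASCII range: one byte per character
            units.append(data[i:i + 1])
            i += 1
        elif b in LEAD_RANGE and i + 1 < len(data) and is_trail_byte(data[i + 1]):
            # Lead byte followed by a valid trail byte: one two-byte character
            units.append(data[i:i + 2])
            i += 2
        else:
            raise ValueError(f"invalid Big5 sequence at offset {i}")
    return units


encoded = "A中B".encode("big5")   # uses Python's built-in big5 codec
units = split_big5(encoded)       # three units: one, two, and one byte long
```

Because a trail byte in 0x40 to 0x7E overlaps the ASCII range, a decoder has to walk the bytes from the start, as the loop does, rather than inspect a byte in isolation.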


Hong Kong Supplementary Character Set
The Hong Kong Supplementary Character Set (commonly abbreviated to HKSCS) is a set of Chinese characters, 4,702 in total in the initial release, used in Cantonese and in writing the names of some places in Hong Kong (whether in written Cantonese or standard written Chinese sentences). It evolved from the earlier Government Chinese Character Set (GCCS), a set of supplementary Chinese characters coded in the user-defined areas of the Big5 character set. GCCS was originally used within the Hong Kong Government and was later adopted by the public. It became the Hong Kong Supplementary Character Set when its characters were submitted to ISO 10646 for coding.

Development history

Due to the inherent differences between standard written Chinese and written Cantonese, the Government of Hong Kong recognised the need for a standardised set of proprietary characters that would allow for the streamlining of electronic communication; at the time ...


CNS 11643
The CNS 11643 character set (Chinese National Standard 11643), also officially known as the Chinese Standard Interchange Code or CSIC (中文標準交換碼), is officially the standard character set of Taiwan (Republic of China). In practice, variants of the related Big5 character set are the de facto standard. CNS 11643 is designed to conform to ISO 2022. It contains 16 planes of 94×94 code points each, so the maximum possible number of encodable characters is 16×94×94 = 141,376. Planes 1 through 7 are defined by the standard; since 2007, planes 10 through 15 have also been defined. Before then, planes 12 to 15 (35,344 code points) were specifically designated for user-defined characters. Unlike in CCCII, the encodings of variant characters in CNS 11643 are not related to one another. EUC-TW is an encoded representation of CNS 11643 and ASCII in Extended Unix Code (EUC) form. Other encodings capable of representing certain CSIC planes include ISO-2022-CN (planes 1 and 2) and ISO-2022-CN-EXT (p ...
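
The capacity figures quoted above follow from simple arithmetic, which the short sketch below reproduces. The 1-based plane, row, and cell numbering and the `linear_index` helper are assumptions made for illustration; they are not taken from the standard's text.

```python
PLANES = 16   # planes in CNS 11643
ROWS = 94     # rows per plane (ISO 2022 style 94x94 grid)
CELLS = 94    # cells per row

per_plane = ROWS * CELLS                   # code points in one plane
total = PLANES * per_plane                 # maximum encodable characters
user_defined_before_2007 = 4 * per_plane   # planes 12 to 15


def linear_index(plane: int, row: int, cell: int) -> int:
    """Hypothetical helper: map 1-based (plane, row, cell) to a 0-based index."""
    if not (1 <= plane <= PLANES and 1 <= row <= ROWS and 1 <= cell <= CELLS):
        raise ValueError("plane, row, or cell out of range")
    return (plane - 1) * per_plane + (row - 1) * CELLS + (cell - 1)
```

With these definitions, `total` evaluates to 141376 and `user_defined_before_2007` to 35344, matching the figures in the text.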




Chinese Character Encoding
In computing, Chinese character encodings can be used to represent text written in the CJK languages (Chinese, Japanese, and Korean) and, rarely, obsolete Vietnamese, all of which use Chinese characters. Several general-purpose character encodings accommodate Chinese characters, and some were developed specifically for Chinese. In addition to Unicode (with its set of CJK Unified Ideographs), local encoding systems exist. The two primary "legacy" local encoding systems are the Chinese Guobiao (or GB, "national standard") system, used in Mainland China and Singapore, and the (mainly) Taiwanese Big5 system, used in Taiwan, Hong Kong and Macau. Guobiao is usually displayed using simplified characters and Big5 using traditional characters. There is, however, no mandated connection between the encoding system and the font used to display the characters; font and encoding are usually tied together for practical reasons. The issue of which encoding to ...


Windows-950
Code page 950 is the code page used on Microsoft Windows for Traditional Chinese. It is Microsoft's implementation of the de facto standard Big5 character encoding. The code page is not registered with IANA and hence is not a standard for communicating information over the internet, although it is usually labelled simply as big5, including by Microsoft library functions.

Terminology and variants

The major difference between Windows code page 950 and "common" (non-vendor-specific) Big5 is the incorporation of a subset of the ETEN extensions to Big5 at 0xF9D6 through 0xF9FE (comprising the seven Chinese characters 碁, 銹, 裏, 墻, 恒, 粧, and 嫺, followed by 34 box drawing characters and block elements). The ranges used by some of the other ETEN extended characters are instead defined as end-user defined (private use) characters. IBM's CCSID 950 comprises single byte code page 1114 (CCSID 1114) and double byte code page 947 (CCSID 947), and, while also a Big5 variant ...
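
The ETEN range described above can be checked with a small sketch. It assumes that Python's built-in `cp950` codec mirrors Microsoft's mapping for this range; `cp950_code` is a hypothetical helper introduced only for this example.

```python
ETEN_START, ETEN_END = 0xF9D6, 0xF9FE   # range quoted in the text
HANZI = "碁銹裏墻恒粧嫺"                  # the seven Chinese characters listed above


def cp950_code(ch: str) -> int:
    """Hypothetical helper: encode one character with cp950, return its code as an int."""
    return int.from_bytes(ch.encode("cp950"), "big")


codes = [cp950_code(ch) for ch in HANZI]
in_range = all(ETEN_START <= c <= ETEN_END for c in codes)

# The lead byte is 0xF9 throughout and the trail bytes 0xD6-0xFE are contiguous,
# so the size of the range is a plain subtraction: 7 hanzi + 34 symbols.
range_size = ETEN_END - ETEN_START + 1
```

Whether a given "plain" Big5 codec accepts these characters depends on the implementation, so the sketch checks only the code page 950 side.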


ISO 2022
ISO/IEC 2022, Information technology – Character code structure and extension techniques, is an ISO/IEC standard (equivalent to the ECMA standard ECMA-35, the ANSI standard ANSI X3.41 and the Japanese Industrial Standard JIS X 0202) in the field of character encoding. Originating in 1971, it was most recently revised in 1994. ISO 2022 specifies a general structure to which character encodings can conform, dedicating particular ranges of bytes (0x00–0x1F and 0x7F–0x9F) to non-printing control codes for formatting and in-band instructions (such as line breaks or formatting instructions for text terminals), rather than graphical characters. It also specifies a syntax for escape sequences, multiple-byte sequences beginning with the ESC control code, which can likewise be used for in-band instructions. Specific sets of control codes and escape sequences designed to be used with ISO 2022 include ISO/IEC 6429, portions of which are implemented by ANSI.SYS and terminal em ...
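
As a rough illustration of the reserved byte ranges quoted above, the sketch below classifies bytes as control codes or graphical characters. The function name and the sample byte string are assumptions made for this example; the ranges are the ones given in the text.

```python
ESC = 0x1B  # the escape control code that introduces escape sequences


def is_control_byte(b: int) -> bool:
    """True if b lies in the ranges reserved for control codes (0x00-0x1F, 0x7F-0x9F)."""
    return 0x00 <= b <= 0x1F or 0x7F <= b <= 0x9F


# Sample terminal output: an escape sequence, two letters, and a line break.
sample = b"\x1b[0mHi\n"
controls = [b for b in sample if is_control_byte(b)]        # ESC and line feed
graphics = [b for b in sample if not is_control_byte(b)]    # "[0mHi"
```

Note that the bytes following ESC in an escape sequence are themselves graphical bytes; only the position after ESC gives them their in-band meaning.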


Simplified Chinese Characters
Simplified Chinese characters are standardized Chinese characters used in mainland China, Malaysia and Singapore, as prescribed by the Table of General Standard Chinese Characters. Along with traditional Chinese characters, they are one of the two standard character sets of the contemporary Chinese written language. The government of the People's Republic of China has promoted them for use in printing in mainland China since the 1950s and 1960s to encourage literacy. They are officially used in the People's Republic of China, Malaysia and Singapore, while traditional Chinese characters remain in common use in Hong Kong, Macau, Taiwan and, to a certain extent, Japan. Simplified Chinese characters may be referred to by their official name or by a colloquial term. In its broadest sense, the colloquial term refers to all characters that have undergone simplifications of character "structure" or "body", some of which have existed for mille ...


Variable-width Encoding
A variable-width encoding is a type of character encoding scheme in which codes of differing lengths are used to encode a character set (a repertoire of symbols) for representation, usually in a computer. The most common variable-width encodings are multibyte encodings, which use varying numbers of bytes (octets) to encode different characters. (Some authors, notably in Microsoft documentation, use the term "multibyte character set", which is a misnomer, because representation size is an attribute of the encoding, not of the character set.) Early variable-width encodings using less than a byte per character were sometimes used to pack English text into fewer bytes in adventure games for early microcomputers. However, disks (which, unlike tapes, allow random access so that text can be loaded on demand), increases in computer memory, and general-purpose compression algorithms have rendered such tricks largely obsolete. Multibyte encodings are usually the result of a need to increase ...
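
A quick way to see variable width in practice is to count the bytes each character occupies once encoded. The sketch below does this with Python's built-in codecs; UTF-8 is used as a familiar example even though the excerpt does not name it, and `byte_lengths` is a hypothetical helper.

```python
def byte_lengths(text: str, encoding: str) -> list[int]:
    """Return the number of bytes each character of text occupies in the given encoding."""
    return [len(ch.encode(encoding)) for ch in text]


# Latin letter, accented letter, Han character, emoji (written as an escape)
utf8_lengths = byte_lengths("A\u00e9\u4e2d\U0001F600", "utf-8")

# ASCII letter and Han character in Big5
big5_lengths = byte_lengths("A\u4e2d", "big5")
```

The character set (the repertoire) is the same Python string in both calls; only the chosen encoding determines the byte lengths, which is the distinction the text draws.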


Traditional Chinese Characters
Traditional Chinese characters are one of the standard Chinese character sets of contemporary written Chinese. The traditional characters have taken shape since the clerical change and have mostly retained the structure they had at the introduction of the regular script in the 2nd century. Over the following centuries, traditional characters were regarded as the standard form of printed Chinese characters or literary Chinese throughout the Sinosphere until the middle of the 20th century, before the script reforms initiated by countries using Chinese characters as a writing system. Traditional Chinese characters remain in common use in Taiwan, Hong Kong and Macau, as well as in most overseas Chinese communities outside Southeast Asia. In addition, Hanja in the Korean language, which is still used to a certain extent in South Korea, remains virtually identical to traditional characters, despite differing standards used among these countries over some variant Chine ...


DBCS
A double-byte character set (DBCS) is a character encoding in which either all characters (including control characters) are encoded in two bytes, or only those graphic characters not representable by an accompanying single-byte character set (SBCS) are encoded in two bytes (Han characters generally make up most of these two-byte characters). A DBCS supports national languages that contain many unique characters or symbols: one byte can represent at most 256 characters, while two bytes can represent up to 65,536. Examples of such languages include Japanese and Chinese. Korean Hangul does not contain as many characters, but KS X 1001 supports both Hangul and Hanja, and uses two bytes per character.

In CJK (Chinese/Japanese/Korean) computing

The term DBCS traditionally refers to a character encoding where each graphic character is encoded in two bytes. In an 8-bit code, such as Big-5 or Shift JIS, a character from ...
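
The 256 versus 65,536 figures above are powers of 256, as the sketch below shows. It also encodes a mixed string with Python's built-in `shift_jis` codec to illustrate the single-byte plus double-byte combination; `code_space` is a hypothetical helper and the sample string is an arbitrary choice.

```python
def code_space(n_bytes: int) -> int:
    """Upper bound on distinct values representable in n_bytes bytes."""
    return 256 ** n_bytes


one_byte = code_space(1)    # single-byte character set limit
two_bytes = code_space(2)   # double-byte limit

# Mixed text: an ASCII letter (one byte) and a kanji (two bytes)
mixed = "A\u65e5".encode("shift_jis")
mixed_length = len(mixed)
```

Real double-byte encodings use far fewer than the theoretical 65,536 values, because byte ranges are reserved for single-byte characters and control codes.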


CJK Characters
In internationalization, CJK characters is a collective term for the Chinese, Japanese, and Korean languages, all of which include Chinese characters and derivatives in their writing systems, sometimes paired with other scripts. Collectively, the CJK characters often include Hànzì in Chinese, Kanji and Kana in Japanese, and Hanja and Hangul in Korean. Vietnamese can be included, making the abbreviation CJKV, as Vietnamese historically used Chinese characters, known in Vietnamese as Chữ Hán and Chữ Nôm (together, Hán-Nôm).

Character repertoire

Standard Mandarin Chinese and Standard Cantonese are written almost exclusively in Chinese characters. Over 3,000 characters are required for general literacy, with up to 40,000 characters for reasonably complete coverage. Japanese uses fewer characters: general literacy in Japanese can be expected with 2,136 characters. The use of Chinese characters in Korea is increasingly rare, a ...