CJK Character
In internationalization, CJK characters is a collective term for the characters used to write the Chinese, Japanese, and Korean languages, all of which include Chinese characters and their derivatives in their writing systems, sometimes paired with other scripts. Collectively, the CJK characters often include Hànzì in Chinese, Kanji and Kana in Japanese, and Hanja and Hangul in Korean. Vietnamese can be included, making the abbreviation CJKV, as Vietnamese was historically written with Chinese characters, known in Vietnamese as Chữ Hán and Chữ Nôm (together, Hán-Nôm). Character repertoire: Standard Mandarin Chinese and Standard Cantonese are written almost exclusively in Chinese characters. Over 3,000 characters are required for general literacy, with up to 40,000 characters for reasonably complete coverage. Japanese uses fewer characters; general literacy in Japanese can be expected with 2,136 characters. The use of Chinese characters in Korea is increasingly rare, a ...
Internationalization And Localization
In computing, internationalization and localization (American English) or internationalisation and localisation (British English), often abbreviated i18n and L10n, are means of adapting computer software to different languages, regional peculiarities and technical requirements of a target locale. Internationalization is the process of designing a software application so that it can be adapted to various languages and regions without engineering changes. Localization is the process of adapting internationalized software for a specific region or language by translating text and adding locale-specific components. Localization (which is potentially performed multiple times, for different locales) uses the infrastructure or flexibility provided by internationalization (which is ideally performed only once before localization, or as an integral part of ongoing development). Naming: The terms are frequently abbreviated to the numeronyms i18n (where 18 stands for the number of letters ...
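The numeronym convention behind i18n and L10n can be sketched in a few lines. This is an illustrative helper only; the function name `numeronym` is hypothetical and not part of any library, and it assumes the usual rule of keeping the first and last letters and counting the letters between them.

```python
def numeronym(word: str) -> str:
    """Abbreviate a word as first letter + count of inner letters + last letter.

    Hypothetical helper for illustration; not a standard-library function.
    """
    if len(word) < 3:
        return word  # too short to have any inner letters worth counting
    return f"{word[0]}{len(word) - 2}{word[-1]}"


print(numeronym("internationalization"))
print(numeronym("localization"))
```

The helper preserves the case of the input, so it yields a lowercase `l10n`; the capital L in L10n is a typographic convention.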
Pinyin
Hanyu Pinyin, often shortened to just pinyin, is the official romanization system for Standard Mandarin Chinese in China, and to some extent, in Singapore and Malaysia. It is often used to teach Mandarin, normally written in Chinese characters, to learners already familiar with the Latin alphabet. The system includes four diacritics denoting tones, but pinyin without tone marks is used to spell Chinese names and words in languages written in the Latin script, and is also used in certain computer input methods to enter Chinese characters. The word Hànyǔ (汉语) literally means "Han language" (i.e. Chinese language), while Pīnyīn (拼音) means "spelled sounds". The pinyin system was developed in the 1950s by a group of Chinese linguists including Zhou Youguang and was based on earlier forms of romanization of Chinese. It was published by the Chinese Government in 1958 and revised several times. The International Organization for Standardization (ISO) adopted pinyin as an international standard ...
Chinese Character Code For Information Interchange
The Chinese Character Code for Information Interchange, or CCCII, is a character set developed by the Chinese Character Analysis Group in Taiwan. It was first published in 1980, and significantly expanded in 1982 and 1987. It is used mostly by library systems. It is one of the earliest established and most sophisticated encodings for traditional Chinese (predating the establishment of Big5 in 1984 and CNS 11643 in 1986). It is distinguished by its unique system for encoding simplified versions and other variants of its main set of hanzi characters. A variant of an earlier version of CCCII is used by the Library of Congress as part of MARC-8, under the name East Asian Character Code (EACC, ANSI/NISO Z39.64), where it comprises part of MARC 21's JACKPHY support. However, EACC contains fewer characters than the most recent versions of CCCII. Design (byte ranges): CCCII is designed as a 94^n set, as defined by ISO/IEC 2022. Each Chinese character is represented by a 3-byte code in ...
Big5
Big-5 or Big5 is a Chinese character encoding method used in Taiwan, Hong Kong, and Macau for traditional Chinese characters. The People's Republic of China (PRC), which uses simplified Chinese characters, uses the GB 18030 character set instead. Big5 gets its name from the consortium of five companies in Taiwan that developed it. Organization: The original Big5 character set is sorted first by usage frequency, second by stroke count, and lastly by Kangxi radical. The original Big5 character set lacked many commonly used characters. To solve this problem, each vendor developed its own extension. The ETen extension became part of the current Big5 standard through popularity. The structure of Big5 does not conform to the ISO 2022 standard, but rather bears a certain similarity to the encoding. It is a double-byte character set (DBCS) with the following structure: (the prefix 0x signifying hexadecimal numbers). Standard assignments (excluding vendor or user-defined extensions) ...
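The double-byte nature of Big5 can be observed with a short sketch. It assumes Python's built-in `big5` codec, which may differ in detail from vendor extensions such as ETen; the sample character U+500B is chosen only for illustration.

```python
# Minimal sketch: compare the encoded length of an ASCII letter and a
# traditional Chinese character under Python's built-in "big5" codec.
text = "A\u500b"  # "A" followed by the traditional character U+500B

for ch in text:
    encoded = ch.encode("big5")
    print(f"U+{ord(ch):04X} -> {encoded.hex()} ({len(encoded)} byte(s))")

big5_bytes = "\u500b".encode("big5")
lead_byte = big5_bytes[0]  # first byte of the two-byte sequence
print("lead byte has its high bit set:", lead_byte >= 0x80)
```

The ASCII letter occupies one byte while the Chinese character occupies two, with a lead byte outside the ASCII range, which is how a decoder can tell the two cases apart.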
Han Unification
Han unification is an effort by the authors of Unicode and the Universal Character Set to map multiple character sets of the Han characters of the so-called CJK languages into a single set of unified characters. Han characters are a feature shared by written Chinese (hanzi), Japanese (kanji), Korean (hanja) and Vietnamese (chữ Hán). Modern Chinese, Japanese and Korean typefaces typically use regional or historical variants of a given Han character. In the formulation of Unicode, an attempt was made to unify these variants by considering them different glyphs representing the same "grapheme", or orthographic unit, hence, "Han unification", with the resulting character repertoire sometimes contracted to Unihan. Nevertheless, many characters have regional variants assigned to different code points, such as Traditional 個 (U+500B) versus Simplified 个 (U+4E2A). Unihan can also refer to the Unihan Database maintained by the Unicode Consortium, which provides informati ...
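The two code points cited above, U+500B and U+4E2A, can be inspected directly. The following is a minimal sketch using Python's standard `unicodedata` module; it shows that the traditional and simplified forms are separate characters and that Unicode normalization does not fold one into the other.

```python
import unicodedata

traditional = "\u500b"  # traditional form cited in the text
simplified = "\u4e2a"   # simplified form cited in the text

for ch in (traditional, simplified):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Compatibility normalization leaves both characters unchanged, so a plain
# string comparison treats the two variants as different text.
same_after_nfkc = (
    unicodedata.normalize("NFKC", traditional)
    == unicodedata.normalize("NFKC", simplified)
)
print("equal after NFKC:", same_after_nfkc)
```

Matching such variants in search or collation therefore needs an explicit mapping, for example one derived from the Unihan Database, rather than normalization alone.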
GB 18030
GB 18030 is a Chinese government standard, described as "Information Technology — Chinese coded character set", that defines the language and character support required for software in China. GB18030 is the registered Internet name for the official character set of the People's Republic of China (PRC), superseding GB2312. As a Unicode Transformation Format (i.e. an encoding of all Unicode code points), GB18030 supports both simplified and traditional Chinese characters. It is also compatible with legacy encodings including GB2312, CP936, and GBK 1.0. In addition to the "GB18030 character encoding", this standard contains requirements about which scripts must be supported, font support, etc. As of 2022, in terms of font implementations, "only the Simplified Chinese fonts of the Noto Sans CJK (Google), Source Han Mono (Adobe), and Source Han Sans (Adobe) typeface families are already compliant with GB 18030-2022 Implementation Level 2." Microsoft ...
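The claim that GB18030 can encode simplified characters, traditional characters, and the rest of Unicode can be checked with a round trip. This sketch assumes Python's built-in `gb18030` codec, which may track a different edition of the standard than GB 18030-2022; the sample characters are arbitrary.

```python
# Minimal sketch: encode a few code points with Python's "gb18030" codec,
# confirm they round-trip, and record how many bytes each one takes.
samples = {
    "ASCII letter": "A",
    "simplified U+4E2A": "\u4e2a",
    "traditional U+500B": "\u500b",
    "emoji U+1F600": "\U0001F600",
}

lengths = {}
for label, ch in samples.items():
    encoded = ch.encode("gb18030")
    assert encoded.decode("gb18030") == ch  # lossless round trip
    lengths[label] = len(encoded)
    print(f"{label}: {encoded.hex()} ({len(encoded)} byte(s))")
```

The variable width (one, two, or four bytes) is what lets the encoding stay byte-compatible with GB2312 and GBK for the characters they already covered while still reaching code points outside them.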
Unicode
Unicode, formally The Unicode Standard, is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, which is maintained by the Unicode Consortium, defines as of the current version (15.0) 149,186 characters covering 161 modern and historic scripts, as well as symbols, emoji (including in colors), and non-visual control and formatting codes. Unicode's success at unifying character sets has led to its widespread and predominant use in the internationalization and localization of computer software. The standard has been implemented in many recent technologies, including modern operating systems, XML, and most modern programming languages. The Unicode character repertoire is synchronized with ISO/IEC 10646, each being code-for-code id ...
Character Encoding
Character encoding is the process of assigning numbers to graphical characters, especially the written characters of human language, allowing them to be stored, transmitted, and transformed using digital computers. The numerical values that make up a character encoding are known as "code points" and collectively comprise a "code space", a "code page", or a "character map". Early character codes associated with the optical or electrical telegraph could only represent a subset of the characters used in written languages, sometimes restricted to upper case letters, numerals and some punctuation only. The low cost of digital representation of data in modern computer systems allows more elaborate character codes (such as Unicode) which represent most of the characters used in many written languages. Character enc ...
Chữ Quốc Ngữ
The Vietnamese alphabet (Vietnamese: chữ Quốc ngữ, literally "script of the national language") is the modern Latin writing script or writing system for Vietnamese. It uses the Latin script based on Romance languages originally developed by Portuguese missionary Francisco de Pina (1585–1625). The Vietnamese alphabet contains 29 letters, including seven letters using four diacritics: ă, â/ê/ô, ơ/ư, đ. There are an additional five diacritics used to designate tone (as in à, á, ả, ã, and ạ). The complex vowel system and the large number of letters with diacritics, which can stack twice on the same letter (e.g. nhất meaning "first"), make it easy to distinguish the Vietnamese orthography from other writing systems that use the Latin script. The Vietnamese system's use of diacritics produces an accurate transcription for tones despite the limitations of the Roman alphabet. On the other hand, sound changes in the spoken ...
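The stacked diacritics in nhất can be made visible by decomposing the word. This is a minimal sketch using Python's standard `unicodedata` module; it assumes canonical decomposition (NFD) is an acceptable way to separate the vowel-quality mark from the tone mark.

```python
import unicodedata

word = "nh\u1ea5t"  # "nhất", with the vowel stored as one precomposed code point
decomposed = unicodedata.normalize("NFD", word)

# Collect the combining marks that NFD splits off from the base letters.
marks = [unicodedata.name(c) for c in decomposed if unicodedata.combining(c)]

print(len(word), "code points composed,", len(decomposed), "decomposed")
for name in marks:
    print(name)
```

The single vowel letter carries two marks: a circumflex for vowel quality and an acute accent for tone. Text-processing code that compares or truncates Vietnamese strings generally needs to normalize to one form first so that the composed and decomposed spellings are treated alike.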
Chữ Nôm
Chữ Nôm is a logographic writing system formerly used to write the Vietnamese language. It uses Chinese characters (Chữ Hán) to represent Sino-Vietnamese vocabulary and some native Vietnamese words, with other words represented by new characters created using a variety of methods, including phono-semantic compounds. This composite script was therefore highly complex, and was accessible only to the small proportion of the Vietnamese population who had mastered written Chinese. Although formal writing in Vietnam was done in classical Chinese until the early 20th century (except for two brief interludes), chữ Nôm was widely used between the 15th and 19th centuries by the Vietnamese cultured elite for popular works in the vernacular, many in verse. One of the best-known pieces of Vietnamese literature, The Tale of Kiều, was written in chữ Nôm by Nguyễn Du. The Vietnamese alphabet created by Portuguese Jesuit missionaries, with the earliest known usage ...
Classical Chinese
Classical Chinese, also known as Literary Chinese (古文 gǔwén "ancient text", or 文言 wényán "text speak", meaning "literary language/speech"; modern vernacular: 文言文 wényánwén "text speak text", meaning "literary language writing"), is the language of the classic literature from the end of the Spring and Autumn period through to either the start of the Qin dynasty or the end of the Han dynasty, a written form of Old Chinese (上古漢語, Shànggǔ Hànyǔ). Classical Chinese is a traditional style of written Chinese that evolved from the classical language, making it different from any modern spoken form of Chinese. Literary Chinese was used for almost all formal writing in China until the early 20th century, and also, during various periods, in Japan, Ryukyu, Korea and Vietnam. Among Chinese speakers, Literary Chinese has been largely replaced by written vernacular Chinese, a style of writing that is similar to modern spoken Mandarin ...
Sinology
Sinology, or Chinese studies, is an academic discipline that focuses on the study of China primarily through Chinese philosophy, language, literature, culture and history; the term often refers to Western scholarship. Its origin "may be traced to the examination which Chinese scholars made of their own civilization." The field of sinology was historically seen to be equivalent to the application of philology to China and until the 20th century was generally seen as meaning "Chinese philology" (language and literature). Sinology has broadened in modern times to include Chinese history, epigraphy and other subjects. Terminology: The terms "sinology" and "sinologist" were coined around 1838 and use "sino-", derived from Late Latin Sinae from the Greek Sinae, from the Arabic Sin which in turn may derive from Qin, as in the Qin dynasty. In the context of area studies, the European and the American usages may differ. In Europe, Sinology is usually known as Chinese S ...