Chinese Character Collation
Chinese character order, also called ''Chinese character indexing'', ''Chinese character collation'' or ''Chinese character sorting'', is the way in which a Chinese character set is sorted into a sequence for the convenience of information retrieval. It may also refer to the sequence so produced. English dictionaries and indexes are normally arranged in alphabetical order for quick lookup, but Chinese is written with tens of thousands of different characters rather than the few dozen letters of an alphabet, which makes sorting much more challenging. The orders or sorting methods of Chinese dictionaries are traditionally divided into three categories:
* Form-based orders, including stroke-based orders and component-based orders (the latter include radical-based orders, among others)
* Sound-based orders, including Pinyin-based order and Bopomofo-based order
* Meaning-based orders
In modern Chinese, people also use frequency orders, where words or characters are sorted by their frequenci ...
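The difference between a sound-based and a form-based order can be sketched in a few lines of Python. This is only an illustration: the table `CHAR_INFO` and the helpers `sound_key` and `form_key` are hypothetical names, the six characters are hand-picked, and breaking stroke-count ties by pinyin is an assumption made for brevity (real stroke-based orders usually break ties by stroke shape).

```python
# Hypothetical hand-made table: character -> (stroke count, toneless pinyin, tone).
CHAR_INFO = {
    "中": (4, "zhong", 1),
    "水": (4, "shui", 3),
    "山": (3, "shan", 1),
    "大": (3, "da", 4),
    "人": (2, "ren", 2),
    "一": (1, "yi", 1),
}


def sound_key(ch):
    """Sound-based key: pinyin syllable first, then tone number."""
    _strokes, syllable, tone = CHAR_INFO[ch]
    return (syllable, tone)


def form_key(ch):
    """Form-based key: stroke count first; pinyin is only an assumed tie-breaker."""
    strokes, syllable, tone = CHAR_INFO[ch]
    return (strokes, syllable, tone)


chars = list(CHAR_INFO)
by_sound = sorted(chars, key=sound_key)
by_form = sorted(chars, key=form_key)

print("".join(by_sound))  # pinyin order
print("".join(by_form))   # stroke-count order
```

The same six characters come out in two different sequences, which is why a dictionary has to state which order it follows.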


Alphabetical Order
Alphabetical order is a system whereby character strings are placed in order based on the position of the characters in the conventional ordering of an alphabet. It is one of the methods of collation. In mathematics, a lexicographical order is the generalization of the alphabetical order to other data types, such as sequences of numbers or other ordered mathematical objects. When applied to strings or sequences that may contain digits, numbers or more elaborate types of elements, in addition to alphabetical characters, the alphabetical order is generally called a lexicographical order. To determine which of two strings of characters comes first when arranging in alphabetical order, their first letters are compared. If they differ, then the string whose first letter comes earlier in the alphabet comes before the other string. If the first letters are the same, then the second letters are compared, and so on. If a position is reached where one string has no more letters to compare ...
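The letter-by-letter comparison described above can be sketched as follows. This is a minimal illustration, assuming plain ASCII letters and case-insensitive comparison; the function name `alphabetical_compare` is hypothetical, and the rule that a string which runs out of letters first sorts earlier is the usual convention rather than something taken from the truncated text.

```python
from functools import cmp_to_key


def alphabetical_compare(a: str, b: str) -> int:
    """Return -1 if a sorts before b, 1 if after, 0 if they are equal."""
    fa, fb = a.lower(), b.lower()
    # Compare letters position by position until a difference is found.
    for ca, cb in zip(fa, fb):
        if ca != cb:
            return -1 if ca < cb else 1
    # All shared positions match: the shorter string (a prefix) comes first.
    return (len(fa) > len(fb)) - (len(fa) < len(fb))


words = ["carpet", "car", "apple", "Apricot"]
ordered = sorted(words, key=cmp_to_key(alphabetical_compare))
print(ordered)
```

Python's built-in string comparison already works this way on code points; the explicit loop is shown only to mirror the prose.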


Pinyin
Hanyu Pinyin, often shortened to just pinyin, is the official romanization system for Standard Mandarin Chinese in China and, to some extent, in Singapore and Malaysia. It is often used to teach Mandarin, which is normally written in Chinese characters, to learners already familiar with the Latin alphabet. The system includes four diacritics denoting tones, but pinyin without tone marks is used to spell Chinese names and words in languages written in the Latin script, and is also used in certain computer input methods to enter Chinese characters. The word ''Hànyǔ'' literally means "Han language" (i.e. Chinese language), while ''pīnyīn'' means "spelled sounds". The pinyin system was developed in the 1950s by a group of Chinese linguists including Zhou Youguang and was based on earlier forms of romanization of Chinese. It was published by the Chinese Government in 1958 and revised several times. The International Organization for Standardization (ISO) adopted pinyin as an international standard ...


University Of Hawaiʻi Press
The University of Hawaii Press is a university press that is part of the University of Hawaii. The University of Hawaii Press was founded in 1947, publishing research in all disciplines of the humanities and natural and social sciences in the regions of Asia and the Pacific. In addition to scholarly monographs, the press publishes educational materials and reference works such as dictionaries, language texts, classroom readers, atlases, and encyclopedias.
History
The press was established in 1947 at the initiative of University of Hawaii president Gregg M. Sinclair. Its first publications included a reprint of ''The Hawaiian Kingdom'' by Ralph Kuykendall and ''Insects of Hawaii'' by Elwood C. Zimmerman, both of which have become classics. Other enduring classics from its early years include the ''Hawaiian-English Dictionary'' by Mary Kawena Pukui and Samuel Elbert, first published in 1957, last revised and enlarged in 1986, then reprinted 16 times; and ''Shoal of Time: ...


Chinese Characters
Chinese characters are logograms developed for the writing of Chinese. In addition, they have been adapted to write other East Asian languages, and remain a key component of the Japanese writing system, where they are known as ''kanji''. Chinese characters in South Korea, which are known as ''hanja'', retain significant use in Korean academia to study its documents, history, literature and records. Vietnam once used ''chữ Hán'' and developed ''chữ Nôm'' to write Vietnamese before turning to a romanized alphabet. Chinese characters are the oldest continuously used system of writing in the world. By virtue of their widespread current use throughout East Asia and Southeast Asia, as well as their profound historic use throughout the Sinosphere, Chinese characters are among the most widely adopted writing systems in the world by number of users. The total number of Chinese characters ever to appear in a dictionary is in the tens of thousands, though most are graphic ...


Modern Chinese Characters
Modern Chinese characters are the Chinese characters used in modern languages, including Chinese, Japanese, Korean and Vietnamese. Chinese characters are composed of components, which are in turn composed of strokes. The 100 most frequently used characters cover (i.e., have an accumulated frequency of) over 40% of modern Chinese texts. The 1000 most frequently used characters cover approximately 90% of the texts. Modern Chinese characters have a variety of novel aspects, including orthography, phonology, and semantics, as well as collation and organization, statistical analysis, computer processing, and pedagogy.
Background
Historical development
Since maturing as a complete writing system, Chinese characters have had an uninterrupted history of development over more than 3,000 years, with stages including
* Oracle bone script,
* Bronze script,
* Seal script,
* Clerical script, and
* Regular script,
leading to the modern written forms, as il ...
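The "accumulated frequency" behind the coverage figures quoted in this entry can be computed as sketched below. The function name `coverage` and the ten-character sample string are hypothetical; a real estimate would use a large corpus, and the 40% and 90% figures come from the text, not from this toy data.

```python
from collections import Counter


def coverage(text: str, top_n: int) -> float:
    """Fraction of the characters in `text` accounted for by its top_n most frequent characters."""
    counts = Counter(text)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(n for _ch, n in counts.most_common(top_n))
    return covered / total


# Toy sample (an assumption, not corpus data): four distinct characters, ten in total.
sample = "的的的的一一一是是不"
for n in (1, 2, 4):
    print(n, coverage(sample, n))
```

Sorting characters by these counts in descending order gives a frequency order of the kind used in frequency-ranked character lists.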


Stroke Order
Stroke order is the order in which the strokes of a Chinese character (or Chinese derivative character) are written. A stroke is a movement of a writing instrument on a writing surface. Chinese characters are used in various forms in Chinese, Japanese, and Korean. They are known as ''Hanzi'' in (Mandarin) Chinese, ''kanji'' in Japanese, and ''Hanja'' in Korean.
Basic principles
Chinese characters are basically logograms constructed with strokes. Over the millennia, a set of generally agreed rules has been developed by custom. Minor variations exist between countries, but the basic principles remain the same, namely that writing characters should be economical, with the fewest hand movements to write the most strokes possible. This promotes writing speed, accuracy, and readability. This idea is particularly important because characters often get more complex as learners progress. Since stroke order also aids learning and memorizati ...




Collation
Collation is the assembly of written information into a standard order. Many systems of collation are based on numerical order or alphabetical order, or extensions and combinations thereof. Collation is a fundamental element of most office filing systems, library catalogs, and reference books. Collation differs from ''classification'' in that the classes themselves are not necessarily ordered. However, even if the order of the classes is irrelevant, the identifiers of the classes may be members of an ordered set, allowing a sorting algorithm to arrange the items by class. Formally speaking, a collation method typically defines a total order on a set of possible identifiers, called sort keys, which consequently produces a total preorder on the set of items of information (items with the same identifier are not placed in any defined order). A collation algorithm such as the Unicode collation algorithm defines an order through the process of comparing two given character string ...
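The role of sort keys can be illustrated with a short sketch. The key function `sort_key` and its rules (case folding, ignoring a leading "the") are assumptions chosen for the example, not part of any standard. Items that share a key are left in no defined order by the collation itself; here they simply keep their input order because Python's sort is stable.

```python
def sort_key(title: str) -> str:
    """Hypothetical sort key: case-insensitive, ignoring a leading 'the '."""
    folded = title.casefold()
    if folded.startswith("the "):
        folded = folded[len("the "):]
    return folded


titles = ["The Hobbit", "dune", "Emma", "the hobbit"]

# The keys are totally ordered strings; the titles inherit only a preorder,
# because "The Hobbit" and "the hobbit" map to the same key.
ordered = sorted(titles, key=sort_key)
print(ordered)
```

Replacing `sort_key` with a function that returns pinyin or stroke data for Chinese text would follow the same pattern.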


Chinese Dictionary
Chinese dictionaries date back over two millennia to the Han dynasty, a significantly longer lexicographical history than that of any other language. There are hundreds of dictionaries for the Chinese language, and this article discusses some of the most important.
Terminology
The general term ''císhū'' ("lexicographic books") semantically encompasses "dictionary; lexicon; encyclopedia; glossary". The Chinese language has two words for dictionary: ''zidian'' (character/logograph dictionary) for written forms, that is, Chinese characters, and ''cidian'' (word/phrase dictionary) for spoken forms. For character dictionaries, ''zidian'' combines ''zi'' ("character, graph; letter, script, writing; word") and ''dian'' ("dictionary, encyclopedia; standard, rule; statute, canon; classical allusion"). For word dictionaries, ''cidian'' (''cídiǎn''; ''tzʻŭ²-tien³''; "word dictionary") can be written in either of two interchangeable forms; ...


Corpus Linguistics
Corpus linguistics is the study of a language as that language is expressed in its text corpus (plural ''corpora''), its body of "real world" text. Corpus linguistics proposes that a reliable analysis of a language is more feasible with corpora collected in the field, that is, in the natural context ("realia") of that language, with minimal experimental interference. The text-corpus method uses the body of texts written in any natural language to derive the set of abstract rules which govern that language. Those results can be used to explore the relationships between that subject language and other languages which have undergone a similar analysis. The first such corpora were manually derived from source texts, but now that work is automated. Corpora have been used not only for linguistics research but also to compile dictionaries (starting with ''The American Heritage Dictionary of the English Language'' in 1969) and grammar guides, such as ''A Compreh ...


Erya
The ''Erya'' or ''Erh-ya'' is the first surviving Chinese dictionary. Bernhard Karlgren (1931:49) concluded that "the major part of its glosses must reasonably date from the 3rd century BC."
Title
Chinese scholars interpret the first title character ''ěr'' ("you, your; adverbial suffix") as a phonetic loan character for the homophonous ''ěr'' ("near; close; approach"), and believe the second, ''yǎ'' ("proper; correct; refined; elegant"), refers to words or language (''Shiming'' (Explanations of Names), "Explaining the Classics").


Jyutping
Jyutping is a romanisation system for Cantonese developed by the Linguistic Society of Hong Kong (LSHK), an academic group, in 1993. Its formal name is the Linguistic Society of Hong Kong Cantonese Romanization Scheme. The LSHK advocates for and promotes the use of this romanisation system. The name ''Jyutping'' (itself the Jyutping romanisation of its Chinese name) is a contraction consisting of the first Chinese characters of the terms ''Jyut6jyu5'' (meaning "Yue language") and ''ping3jam1'' ("phonetic alphabet", also pronounced as "pinyin" in Mandarin). Despite being intended as a romanisation system to indicate pronunciation, it has also been employed to write Cantonese as an alphabetic language, in effect elevating it from an assistive role to a written language.
History
The Jyutping system marks a departure from all previous Cantonese romanisation systems (approximately 12, including Robert Morrison's pioneering work of 1828, and the widely used Standard ...




Commission On The Unification Of Pronunciation
The Commission on the Unification of Pronunciation was the organization established by the Beiyang government in 1912 to select ancillary phonetic symbols for Mandarin (resulting in the creation of Zhuyin) and to set the standard Guoyu pronunciation of basic Chinese characters.
History
It was decided in a draft on 7 August 1912, a month after a conference led by Cai Yuanpei on 10 July, that a set of phonetic symbols was to be used for education purposes. The Commission was set up in December, led by Wu Zhihui (Woo Tsin-hang). The Commission ended on 22 May 1913. A later, similar organization that still exists, the Mandarin Promotion Council, was headed by Wu Zhihui for a time.
Members
The first meeting took place on 15 February 1913 in Beijing, with 44 delegates. The chairman was Wu; the vice-chairman was Wang Zhao. There were two representatives per each of the 26 Chinese province, ...