Function
When the script was first used in the late 2nd millennium BC, words of Old Chinese were generally monosyllabic, and each character denoted a single word. Increasing numbers of polysyllabic words have entered the language from the Western Zhou period to the present day. It is estimated that about 25–30% of the vocabulary of classic texts from the Warring States period was polysyllabic, though these words were used far less commonly than monosyllables, which accounted for 80–90% of occurrences in these texts. The process has accelerated over the centuries as phonetic change has increased the number of homophones. It has been estimated that over two thirds of the 3,000 most common words in modern Standard Chinese are polysyllables, the vast majority of those being disyllables.
The most common process has been to form compounds of existing words, written with the characters of the constituent words. Words have also been created by adding affixes, and by borrowing from other languages. Polysyllabic words are generally written with one character per syllable. In most cases the character denotes a morpheme descended from an Old Chinese word.
Many characters have multiple readings, with instances denoting different morphemes, sometimes with different pronunciations. In modern Standard Chinese, one fifth of the 2,400 most common characters have multiple pronunciations. For the 500 most common characters, the proportion rises to 30%. Often these readings are similar in sound and related in meaning. In the Old Chinese period, affixes could be added to a word to form a new word, which was often written with the same character. In many cases the pronunciations diverged due to subsequent sound change. For example, many additional readings have the departing tone, the major source of the 4th tone in modern Standard Chinese. Scholars now believe that this tone is the reflex of an Old Chinese *-s suffix, with a range of semantic functions. For example (where reconstructions are given, each chain runs Old Chinese > Middle Chinese > modern Standard Chinese; Middle Chinese forms are given in Baxter's transcription, in which ''H'' denotes the departing tone):
* 傳 / 传 has readings ''chuán'' 'to transmit' and ''zhuàn'' 'a record'.
* 磨 has readings *maj > ''ma'' > ''mó'' 'to grind' and *majs > ''maH'' > ''mò'' 'grindstone'.
* 宿 has readings ''sù'' 'to stay overnight' and ''xiù'' 'celestial "mansion"'.
* 說 / 说 has readings ''shuō'' 'speak' and ''shuì'' 'exhort'.
Another common alternation is between voiced and voiceless initials (though the voicing distinction has disappeared in most modern varieties). This is believed to reflect an ancient prefix, but scholars disagree on whether the voiced or voiceless form is the original root. For example:
* 見 / 见 has readings *kens > ''kenH'' > ''jiàn'' 'to see' and *gens > ''henH'' > ''xiàn'' 'to appear'.
* 敗 / 败 has readings *prats > ''pæjH'' > ''bài'' 'to defeat' and *brats > ''bæjH'' > ''bài'' 'to be defeated'. (In this case the pronunciations have converged in Standard Chinese, but not in some other varieties.)
* 折 has readings ''zhé'' 'to bend' and ''shé'' 'to break by bending'.
Principles of formation
Chinese characters represent words of the language using several strategies. A few characters, including some of the most commonly used, were originally pictograms, which depicted the objects denoted, or ideograms, in which meaning was expressed iconically. The vast majority were written using the rebus principle, in which a character for a similarly sounding word was either simply borrowed or (more commonly) extended with a disambiguating semantic marker to form a phono-semantic compound character.
The traditional six-fold classification (''liùshū'' 六书 / 六書, "six writings") was first described by the scholar Xu Shen in the postface of his dictionary ''Shuowen Jiezi'' in 100 AD. While this analysis is sometimes problematic and arguably fails to reflect the complete nature of the Chinese writing system, it has been perpetuated by its long history and pervasive use.
Pictograms
Pictograms (象形字 ''xiàngxíngzì'') are highly stylized and simplified pictures of material objects. Examples of pictograms include 日 for "sun", 月 for "moon", and 木 for "tree" or "wood". Xu Shen placed approximately 4% of characters in this category. Though few in number and expressing literal objects, pictograms, along with ideograms (indicative characters, i.e. symbols), are primary characters: they are the building blocks from which all the more complex characters, such as associative compound characters (会意字/會意字) and phono-semantic characters (形声字/形聲字), are formed.
Over time pictograms were increasingly standardized, simplified, and stylized to make them easier to write. Furthermore, the same character element can be used to depict different objects. Thus, the image depicted by most pictograms is often not immediately evident. For example, 口 may indicate the mouth, a window as in 高 (which depicts a tall building as a symbol of the idea of "tall"), or the lip of a vessel as in 富 (a wine jar under a roof as a symbol of wealth). That is, pictograms extended from literal objects to take on symbolic or metaphoric meanings, sometimes even displacing the use of the character as a literal term, or creating ambiguity, which was resolved through character determinants, more commonly but less accurately known as "radicals", i.e. concept keys in the phono-semantic characters.
Simple ideograms
Also called ''simple indicatives'' (指事字 ''zhǐshìzì''), this small category contains characters that are direct illustrations. Examples include 上 "up" and 下 "down", originally a dot above and below a line. Indicative characters are symbols for abstract concepts which could not be depicted literally but can nonetheless be expressed as a visual symbol, e.g. convex 凸, concave 凹, flat-and-level 平.
Compound ideographs
Also translated as logical aggregates or associative idea characters (会意字 / 會意字 ''huìyìzì''), these characters have been interpreted as combining two or more pictographic or ideographic characters to suggest a third meaning. The canonical example is 明 "bright", the association of the two brightest objects in the sky, the sun 日 and the moon 月, brought together to express the idea of "bright". It is canonical because the term 明白 in Chinese (lit. "bright white") means "to understand". Adding the abbreviated radical for grass, ''cao'', above the character ''ming'' changes it to ''meng'' 萌, which means to sprout or bud, alluding to the heliotropic behavior of plant life. Other commonly cited examples include 休 "rest" (composed of the pictograms 人 "person" and 木 "tree") and 好 "good" (composed of 女 "woman" and 子 "child").
Xu Shen placed approximately 13% of characters in this category, but many of his examples are now believed to be phono-semantic compounds whose origin has been obscured by subsequent changes in their form. Peter Boodberg and William Boltz go so far as to deny that any of the compound characters devised in ancient times were of this type, maintaining that now-lost "secondary readings" are responsible for the apparent absence of phonetic indicators, but their arguments have been rejected by other scholars.
In contrast, associative compound characters are common among characters coined in Japan. Also, a few characters coined in China in modern times, such as 鉑 / 铂 "platinum" ("white metal"), belong to this category.
Rebus
Also called ''borrowings'' or ''phonetic loan characters'' (假借字 ''jiǎjièzì''), this category covers cases where an existing character is used to represent an unrelated word with similar or identical pronunciation. Sometimes the old meaning is then lost completely, as with 自 ''zì'', which has lost its original meaning of "nose" and now exclusively means "oneself", or 萬 / 万 ''wàn'', which originally meant "scorpion" but is now used only in the sense of "ten thousand".
Rebus was pivotal in the history of writing in China insofar as it represented the stage at which logographic writing could become purely phonetic (phonographic). Chinese characters used purely for their sound values are attested in manuscripts of the Spring and Autumn and Warring States periods, in which ''zhi'' 氏 was used to write ''shi'' 是 and vice versa, just lines apart; the same happened with ''shao'' 勺 for ''Zhao'' 趙, with the characters in question being homophonous or nearly homophonous at the time.
Phonetical usage for foreign words
Chinese characters are used rebus-like and exclusively for their phonetic value when transcribing words of foreign origin, such as ancient Buddhist terms or modern foreign names. For example, the word for the country "Romania" is 罗马尼亚/羅馬尼亞 (Luó Mǎ Ní Yà), in which the Chinese characters are used only for their sounds and do not provide any meaning. This usage is similar to that of the Japanese katakana and hiragana, although the kana use a special set of simplified forms of Chinese characters in order to advertise their value as purely phonetic symbols. The same rebus principle, for names in particular, has also been used in Egyptian hieroglyphs and Maya script. In Chinese usage, in a few instances, the characters used for pronunciation may be carefully chosen to connote a specific meaning, as regularly happens for brand names: Coca-Cola is translated phonetically as 可口可乐/可口可樂 (Kěkǒu Kělè), but the characters were carefully selected so as to have the additional meaning of "Delicious and Enjoyable".
Phono-semantic compounds
''Semantic-phonetic compounds'' or ''pictophonetic compounds'' (形声字 / 形聲字) are by far the most numerous characters. These characters are composed of at least two parts. The semantic component suggests the general meaning of the compound character. The phonetic component suggests the pronunciation of the compound character. In most cases the semantic indicator is also the radical (部首) under which the character is listed in dictionaries. Because Chinese is replete with homophones, phonetic elements may also carry semantic content. In some rare examples phono-semantic characters may also convey pictorial content. Each Chinese character is an attempt to combine sound, image, and idea in a mutually reinforcing fashion.
Examples of phono-semantic characters include 河 ''hé'' "river", 湖 ''hú'' "lake", 流 ''liú'' "stream", 沖 ''chōng'' "surge", and 滑 ''huá'' "slippery". All these characters have on the left a radical of three short strokes (氵), which is a reduced form of the character 水 ''shuǐ'' meaning "water", indicating that the character has a semantic connection with water. The right-hand side in each case is a phonetic indicator: for instance, 胡 ''hú'' has a very similar pronunciation to 湖, and 可 ''kě'' has a similar (though somewhat different) pronunciation to 河. In the case of 沖 ''chōng'' "surge", the phonetic indicator is 中 ''zhōng'', which by itself means "middle". Here the pronunciation of the character is slightly different from that of its phonetic indicator; because of historical sound change, the composition of such characters can sometimes seem arbitrary today.
In general, phonetic components do not determine the exact pronunciation of a character, but only give a clue as to its pronunciation. While some characters take the exact pronunciation of their phonetic component, others take only the initial or final sounds.
In fact, some characters' pronunciations may not correspond to the pronunciations of their phonetic parts at all, which is sometimes the case with characters that have undergone simplification. The 8 characters in the following table all take 也 for their phonetic part; however, as is readily apparent, none of them takes the pronunciation of 也, which is yě (Old Chinese *lajʔ). As the table below shows, the sound changes that have taken place since the Shang/Zhou period, when most of these characters were created, can be dramatic, to the point of not providing any useful hint of the modern pronunciation.
Xu Shen (c. 100 AD) placed approximately 82% of characters into this category, while in the Kangxi Dictionary (1716 AD) the number is closer to 90%, due to the extremely productive use of this technique to extend the Chinese vocabulary. The chữ Nôm characters of Vietnam were created using this principle.
This method is still used to form new characters. For example, 鈈 / 钚 ''bù'' ("plutonium") is the metal radical 金 ''jīn'' plus the phonetic component 不 ''bù'', described in Chinese as "不 gives sound, 金 gives meaning". Many Chinese names of elements in the periodic table and many other chemistry-related characters were formed this way. In fact, it is possible to tell from a Chinese periodic table at a glance which elements are metal (金), solid nonmetal (石, "stone"), liquid (氵), or gas (气) at standard temperature and pressure.
Occasionally a bisyllabic word is written with two characters that contain the same radical, as in 蝴蝶 ''húdié'' "butterfly", where both characters have the insect radical 虫. A notable example is ''pipa'' (a Chinese lute, also a fruit, the loquat, of similar shape), originally written as 批把 with the hand radical (扌), referring to the down and up strokes when playing this instrument. This was then changed to 枇杷 (tree radical 木), which is still used for the fruit, while the characters were changed to 琵琶 when referring to the instrument (radical 玨). In other cases a compound word may coincidentally share a radical without this being meaningful.
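The semantic-plus-phonetic structure described in this section can be sketched as a small lookup table. The following Python sketch is illustrative only: the table `COMPOUNDS` and the helper names `group_by_semantic` and `phonetic_of` are hypothetical, and the four decompositions are entered by hand for water-radical characters rather than drawn from any dictionary data set.

```python
from collections import defaultdict

# Hand-written, illustrative table (an assumption, not dictionary data):
# character -> (semantic component, phonetic component)
COMPOUNDS = {
    "河": ("氵", "可"),  # "river"
    "湖": ("氵", "胡"),  # "lake"
    "沖": ("氵", "中"),  # "surge"
    "滑": ("氵", "骨"),  # "slippery"
}


def group_by_semantic(table):
    """Group characters by their semantic component (usually the radical)."""
    groups = defaultdict(list)
    for char, (semantic, _phonetic) in table.items():
        groups[semantic].append(char)
    return dict(groups)


def phonetic_of(char, table=COMPOUNDS):
    """Return the component that hints at the pronunciation of `char`."""
    return table[char][1]


if __name__ == "__main__":
    print(group_by_semantic(COMPOUNDS))
    print(phonetic_of("沖"))
```

Grouping by the semantic component mirrors how dictionaries list characters under a radical, while the phonetic component is kept separate because it gives only a clue to the pronunciation, not the exact reading.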
Derivative cognates
The smallest category of characters (轉注字 / 转注字 ''zhuǎnzhùzì'') is also the least understood. In the postface to the ''Shuowen Jiezi'', Xu Shen gave as an example the characters 考 ''kǎo'' "to verify" and 老 ''lǎo'' "old", which had similar Old Chinese pronunciations (*khuʔ and *C-ruʔ respectively) and may once have been the same word, meaning "elderly person", but became lexicalized into two separate words. The term does not appear in the body of the dictionary, and is often omitted from modern systems.
Legendary origins
According to legend, Chinese characters were invented by Cangjie, a bureaucrat under the legendary Yellow Emperor. Inspired by his study of the animals of the world, the landscape of the earth and the stars in the sky, Cangjie is said to have invented symbols called ''zì'' (字), the first Chinese characters. The legend relates that on the day the characters were created, grain rained down from the sky, and that night the people heard ghosts wailing and demons crying because human beings could no longer be cheated.
Early sign use
In recent decades, a series of inscribed graphs and pictures have been found at Neolithic sites in China, including Jiahu (c. 6500 BC), Dadiwan and Damaidi from the 6th millennium BC, and Banpo (5th millennium BC). Often these finds are accompanied by media reports that push back the purported beginnings of Chinese writing by thousands of years. However, because these marks occur singly, without any implied context, and are made crudely and simply, Qiu Xigui concluded that "we do not have any basis for stating that these constituted writing nor is there reason to conclude that they were ancestral to Chinese characters." They do, however, demonstrate a history of sign use in the Yellow River valley during the Neolithic through to the Shang period.
Oracle bone script
The earliest confirmed evidence of the Chinese script yet discovered is the body of inscriptions carved on bronze vessels and oracle bones from the late Shang dynasty (c. 1250–1050 BC). The earliest of these is dated to around 1200 BC. In 1899, pieces of these bones were being sold as "dragon bones" for medicinal purposes, when scholars identified the symbols on them as Chinese writing. By 1928, the source of the bones had been traced to a village near Anyang in Henan Province, which was excavated by the Academia Sinica between 1928 and 1937. Over 150,000 fragments have been found.
Oracle bone inscriptions are records of divinations performed in communication with royal ancestral spirits. The shortest are only a few characters long, while the longest are thirty to forty characters in length. The Shang king would communicate with his ancestors on topics relating to the royal family, military success, weather forecasting, ritual sacrifices, and related topics by means of scapulimancy, and the answers would be recorded on the divination material itself.
The oracle-bone script is a well-developed writing system, suggesting that the Chinese script's origins may lie earlier than the late second millennium BC. Although these divinatory inscriptions are the earliest surviving evidence of ancient Chinese writing, it is widely believed that writing was used for many other non-official purposes, but that the materials upon which non-divinatory writing was done, likely wood and bamboo, were less durable than bone and shell and have since decayed away.
Bronze Age: parallel script forms and gradual evolution
The traditional picture of an orderly series of scripts, each one invented suddenly and then completely displacing the previous one, has been conclusively demonstrated to be fiction by the archaeological finds and scholarly research of the later 20th and early 21st centuries. Gradual evolution and the coexistence of two or more scripts was more often the case. As early as the Shang dynasty, oracle-bone script coexisted as a simplified form alongside the normal script of books (preserved in typical bronze inscriptions), as well as the extra-elaborate pictorial forms (often clan emblems) found on many bronzes.
Based on studies of these bronze inscriptions, it is clear that, from the Shang dynasty writing to that of the Western Zhou and early Eastern Zhou, the mainstream script evolved in a slow, unbroken fashion, until assuming the form that is now known as seal script in the late Eastern Zhou in the state of Qin, without any clear line of division. Meanwhile, other scripts had evolved, especially in the eastern and southern areas during the late Zhou dynasty, including regional forms, such as the ''gǔwén'' ("ancient forms") of the eastern Warring States preserved as variant forms in the character dictionary ''Shuowen Jiezi'', as well as decorative forms such as bird and insect scripts.
Unification: seal script, vulgar writing and proto-clerical
Seal script, which had evolved slowly in the state of Qin during the Eastern Zhou dynasty, became standardized and adopted as the formal script for all of China in the Qin dynasty (leading to a popular misconception that it was invented at that time), and was still widely used for decorative engraving and seals (name chops, or signets) in the Han dynasty period. However, despite the Qin script standardization, more than one script remained in use at the time. For example, a little-known, rectilinear and roughly executed kind of common (vulgar) writing had for centuries coexisted with the more formal seal script in the Qin state, and the popularity of this vulgar writing grew as the use of writing itself became more widespread. By the Warring States period, an immature form of clerical script called "early clerical" or "proto-clerical" had already developed in the state of Qin based upon this vulgar writing, and with influence from seal script as well. The coexistence of the three scripts (small seal, vulgar and proto-clerical, with the latter evolving gradually in the Qin to early Han dynasties into clerical script) runs counter to the traditional belief that the Qin dynasty had one script only, and that clerical script was suddenly invented in the early Han dynasty from the small seal script.
Proto-clerical evolving to clerical
Proto-clerical script, which had emerged by the time of the Warring States period from vulgar Qin writing, matured gradually, and by the early Western Han period it was little different from that of the Qin. Recently discovered bamboo slips show the script becoming mature clerical script by the middle-to-late reign of Emperor Wu of Han, who ruled from 141 to 87 BC.
Clerical and clerical cursive
Contrary to the popular belief that there was only one script per period, there were in fact multiple scripts in use during the Han period. Although mature clerical script, also called ''bāfēn'' (八分) script, was dominant at that time, an early type of cursive script was also in use in the Han by at least as early as 24 BC (during the very late Western Han period), incorporating cursive forms popular at the time, as well as many elements from the vulgar writing of the Warring States-era state of Qin. By around the time of the Eastern Jin dynasty, this Han cursive became known as ''zhāngcǎo'' (also known as 隶草 / 隸草 ''lìcǎo'' today), or in English sometimes clerical cursive, ancient cursive, or draft cursive. Some believe that the name, based on ''zhāng'' meaning "orderly", arose because the script was a more orderly form of cursive than the modern form, which emerged during the Eastern Jin dynasty and is still in use today, called ''jīncǎo'' or "modern cursive".
Neo-clerical
Around the mid-Eastern Han period, a simplified and easier-to-write form of clerical script appeared, which Qiu terms "neo-clerical" (新隶体 / 新隸體, ''xīnlìtǐ''). By the late Eastern Han, this had become the dominant daily script, although the formal, mature ''bāfēn'' (八分) clerical script remained in use for formal works such as engraved stelae. Qiu describes this neo-clerical script as a transition between clerical and regular script, and it remained in use through the Cao Wei and Jin dynasties.
Semi-cursive
By the late Eastern Han period, an early form of semi-cursive script appeared, developing out of a cursively written form of neo-clerical script and simple cursive. This semi-cursive script was traditionally attributed to Liu Desheng (c. 147–188 AD), although such attributions refer to early masters of a script rather than to their actual inventors, since the scripts generally evolved into being over time. Qiu gives examples of early semi-cursive script, showing that it had popular origins rather than being purely Liu's invention.
Wei to Jin period
Regular script
Regular script has been attributed to Zhong Yao (c. 151–230 AD), during the period at the end of the Han dynasty in the state of Cao Wei. Zhong Yao has been called the "father of regular script". However, some scholars postulate that one person alone could not have developed a new script which was universally adopted, but could only have been a contributor to its gradual formation. The earliest surviving pieces written in regular script are copies of Zhong Yao's works, including at least one copied by Wang Xizhi. This new script, which is the dominant modern Chinese script, developed out of a neatly written form of early semi-cursive, with the addition of the pause (頓 / 顿 ''dùn'') technique to end horizontal strokes, plus heavy tails on strokes which are written to the downward-right diagonal. Thus, early regular script emerged from a neat, formal form of semi-cursive, which had itself emerged from neo-clerical (a simplified, convenient form of clerical script). It then matured further in the Eastern Jin dynasty in the hands of the "Sage of Calligraphy", Wang Xizhi, and his son Wang Xianzhi. It was not, however, in widespread use at that time, and most writers continued using neo-clerical, or a somewhat semi-cursive form of it, for daily writing, while the conservative clerical script remained in use on some stelae, alongside some semi-cursive, but primarily neo-clerical.
Modern cursive
Meanwhile, modern cursive script slowly emerged from the clerical cursive (''zhāngcǎo'') script during the Cao Wei to Jin period, under the influence of both semi-cursive and the newly emerged regular script. Cursive was formalized in the hands of a few master calligraphers, the most famous and influential of whom was Wang Xizhi.
Dominance and maturation of regular script
It was not until the Northern and Southern dynasties that regular script rose to dominant status. During that period, regular script continued evolving stylistically, reaching full maturity in the early Tang dynasty. Some call the writing of the early Tang calligrapher Ouyang Xun (557–641) the first mature regular script. After this point, although developments in the art of calligraphy and in character simplification still lay ahead, there were no more major stages of evolution for the mainstream script.
Modern history
Although most simplified Chinese characters in use today are the result of work moderated by the government of the People's Republic of China (PRC) in the 1950s and 60s, the use of some of these forms predates the PRC's formation in 1949. Caoshu, cursive written text, was the inspiration for some simplified characters, while others are attested as early as the Qin dynasty (221–206 BC) as either vulgar variants or original characters.
One of the earliest proponents of character simplification was Lufei Kui, who proposed in 1909 that simplified characters should be used in education. In the years following the May Fourth Movement in 1919, many Chinese intellectuals sought ways to modernise China as quickly as possible. Traditional culture and values such as Confucianism were challenged and subsequently blamed for China's problems. Soon, people in the Movement started to cite the traditional Chinese writing system as an obstacle to modernising China and therefore proposed that a reform be initiated. It was suggested that the Chinese writing system should be either simplified or completely abolished. Lu Xun, a renowned Chinese author in the 20th century, stated that, "If Chinese characters are not destroyed, then China will die" (漢字不滅,中國必亡). Recent commentators have claimed that Chinese characters were blamed for the economic problems in China during that time.
In the 1930s and 1940s, discussions on character simplification took place within the government, and a large number of the intelligentsia maintained that character simplification would help boost literacy in China. In 1935, 324 simplified characters collected by Qian Xuantong were officially introduced as the table of the first batch of simplified characters, but they were suspended in 1936 due to fierce opposition within the party.
The People's Republic of China issued its first round of official character simplifications in two documents, the first in 1956 and the second in 1964.
In the 1950s and 1960s, while confusion about simplified characters was still rampant, transitional characters that mixed simplified parts with yet-to-be-simplified parts of characters appeared briefly, then disappeared.
"Han unification" was an effort by the authors of Unicode and the Universal Character Set to map multiple character sets of the so-called CJK languages (Chinese/Japanese/Korean) into a single set of unified characters, and was completed for the purposes of Unicode in 1991 (Unicode 1.0).
Apart from Chinese, the normative media of record-keeping, written historical narratives and official communication in Korea, Japan and Vietnam are in adaptations and variations of the Chinese script.
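As a rough illustration of the unified repertoire that came out of Han unification, Python's standard `unicodedata` module can report the code point and Unicode name of a character. This is only a sketch: the helper names `describe` and `in_basic_cjk_block` are hypothetical, and the range check covers only the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), not the later extension blocks.

```python
import unicodedata


def describe(char):
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


def in_basic_cjk_block(char):
    """True if `char` lies in the basic CJK Unified Ideographs block.

    Assumption: only U+4E00..U+9FFF is checked; extension blocks are ignored.
    """
    return 0x4E00 <= ord(char) <= 0x9FFF


if __name__ == "__main__":
    # 水 ("water") has one code point, whichever language the text is in.
    print(describe("水"))
    print(in_basic_cjk_block("水"), in_basic_cjk_block("a"))
```

The same code point is used whether the character occurs in Chinese, Japanese or Korean text; language-specific glyph differences are left to fonts rather than encoded separately.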
Adaptation to other languages
The Chinese script spread to Korea together with Buddhism from the 2nd century BC to the 5th century AD (hanja). It was adopted for recording the Japanese language from the 5th century AD.
Chinese characters were first used in Vietnam during the millennium of Chinese rule starting in 111 BC. They were used to write Classical Chinese and adapted around the 13th century to create the chữ Nôm script to write Vietnamese.
Currently, the only non-Chinese language outside of China that regularly uses the Chinese script is Japanese. Vietnam abandoned its use in the early 20th century in favour of a Latin-based script, and Korea in the late 20th century in favour of its homegrown hangul script, although, as Korea switched much more recently, many Koreans still learn the characters to read texts written before then, or in some cases to disambiguate homophones.
Japanese
Chinese characters adapted to write Japanese words are known as kanji. Chinese words borrowed into Japanese could be written with Chinese characters, while native Japanese words could also be written using the character(s) for a Chinese word of similar meaning. Most kanji have both the native (and often multi-syllabic) Japanese pronunciation, known as kun'yomi, and the (mono-syllabic) Chinese-based pronunciation, known as on'yomi. For example, the native Japanese word ''katana'' is written as 刀 in kanji, which uses the native pronunciation since the word is native to Japanese, while the Chinese loanword ''nihontō'' (meaning "Japanese sword") is written as 日本刀, which uses the Chinese-based pronunciation. While nowadays loanwords from non-Sinosphere languages are usually just written in katakana, one of the two syllabary systems of Japanese, loanwords that were borrowed into Japanese before the Meiji period were typically written with Chinese characters whose on'yomi had the same pronunciation as the loanword itself: words like ''Amerika'' (kanji: 亜米利加, katakana: アメリカ, meaning: America), ''karuta'' (kanji: 歌留多, 加留多, katakana: カルタ, meaning: card, letter), and ''tenpura'' (kanji: 天婦羅, 天麩羅, katakana: テンプラ, meaning: tempura), although the meanings of the characters used often had no relation to the words themselves. Only some of the old kanji spellings are in common use, like ''kan'' (缶, meaning: can). Kanji that are used only to represent the sounds of a word are called ateji (当て字). Because Chinese words have been borrowed from varying dialects at different times, a single character may have several on'yomi in Japanese.
Written Japanese also includes a pair of syllabaries known as kana, derived by simplifying Chinese characters selected to represent syllables of Japanese.
The syllabaries differ because they sometimes selected different characters for a syllable, and because they used different strategies to reduce these characters for easy writing: the angular katakana were obtained by selecting a part of each character, while hiragana were derived from the cursive forms of whole characters. Modern Japanese writing uses a composite system, using kanji for word stems, hiragana for inflectional endings and grammatical words, and katakana to transcribe non-Chinese loanwords as well as to emphasize native words (similar to how italics are used in Latin-script languages).
Korean
In Korea, Literary Chinese was the dominant form of written communication until the creation of hangul, the Korean alphabet, in the 15th century. Much of the vocabulary, especially in the realms of science and sociology, comes directly from Chinese, comparable to Latin or Greek root words in European languages. However, due to the lack of tones in Modern Standard Korean, as the words were imported from Chinese, many dissimilar characters and syllables took on identical pronunciations, and subsequently identical spellings in hangul. Chinese characters are sometimes used to this day either for practical clarification or to give a distinguished appearance, as knowledge of Chinese characters is considered by many Koreans a high-class attribute and an indispensable part of a classical education. It is also observed that the preference for Chinese characters is treated as being conservative and Confucian.
In South Korea, ''hanja'' have become a politically contentious issue, with some urging a "purification" of the national language and culture by abandoning their use. Efforts to re-extend hanja education to elementary schools in 2015 were met with a generally negative reaction from the public and from teachers' organizations.
In South Korea, educational policy on characters has swung back and forth, often swayed by education ministers' personal opinions. At present, middle and high school students (grades 7 to 12) are taught 1,800 characters, albeit with the principal focus on recognition, with the aim of achieving newspaper literacy.
There is a clear trend toward the exclusive use of hangul in day-to-day South Korean society. Hanja are still used to some extent, particularly in newspapers, weddings, place names and calligraphy (although nowhere near the extent of kanji use in day-to-day Japanese society).
Hanja are also extensively used in situations where ambiguity must be avoided, such as academic papers, high-level corporate reports, government documents, and newspapers; this is due to the large number of homonyms that have resulted from the extensive borrowing of Chinese words into Sino-Korean vocabulary.
The issue of ambiguity is the main hurdle in any effort to "cleanse" the Korean language of Chinese characters. Characters convey meaning visually, while alphabets convey guidance to pronunciation, which in turn hints at meaning. As an example, in Korean dictionaries, the phonetic entry for ''gisa'' yields more than 30 different entries. In the past, this ambiguity had been efficiently resolved by parenthetically displaying the associated hanja. While hanja are sometimes used for Sino-Korean vocabulary, native Korean words are rarely, if ever, written in hanja.
When learning how to write hanja, students are taught to memorize both the native Korean word for each hanja's meaning and its Sino-Korean pronunciation (the pronunciation based on the Chinese pronunciation of the character), so that they know both the meaning and the syllable for a particular hanja. For example, the name for the hanja 水 is 물 수 (mul-su), in which 물 (mul) is the native Korean word for "water", while 수 (su) is the Sino-Korean pronunciation of the character. The naming of hanja is similar to naming "water" as "water-aqua", "horse" as "horse-equus", or "gold" as "gold-aurum", hybridizing the English and the Latin names. Other examples include 人 사람 인 (saram-in) for "person/people", 大 큰 대 (keun-dae) for "big/large/great", 小 작을 소 (jakeul-so) for "small/little", 下 아래 하 (arae-ha) for "underneath/below/low", 父 아비 부 (abi-bu) for "father", and 韓 나라이름 한 (naraireum-han) for "Han/Korea".
In North Korea, hanja were completely banned in June 1949 due to fears of a collapse in the country's containment; during the 1950s, Kim Il Sung condemned all sorts of foreign languages (even the newly proposed New Korean Orthography). The ban continued into the 21st century. However, a textbook for university history departments containing 3,323 distinct characters was published in 1971. In the 1990s, school children were still expected to learn 2,000 characters (more than in South Korea or Japan). After Kim Jong Il, the second ruler of North Korea, died in December 2011, Kim Jong Un succeeded him and began mandating the use of hanja as a source of definition for the Korean language. Currently, it is said that North Korea teaches around 3,000 hanja to its students, and in some cases the characters appear in advertisements and newspapers. However, it is also said that the authorities implore students not to use the characters in public. Due to North Korea's strict isolationism, accurate reports about hanja use in North Korea are hard to obtain.
OkinawanChinese characters are thought to have been first introduced to the Ryukyu Islands in 1265 by a Japanese Buddhist monk. After the Okinawan kingdoms, especially the Ryukyu Kingdom, became tributaries of Ming China, Classical Chinese was used in court documents, but hiragana was mostly used for popular writing and poetry. After Ryukyu became a vassal of Japan's Satsuma Domain, Chinese characters became more popular, as did the use of Kanbun. Modern Okinawan, which is labeled a Japanese dialect by the Japanese government, is mostly written in katakana and hiragana, but Chinese characters are still used.
VietnameseIn Vietnam, Chinese characters (called or in Vietnamese) are now limited to ceremonial uses, but they were once in widespread use. Until the early 20th century, Literary Chinese was used in Vietnam for all official and scholarly writing. The oldest written Chinese material found in Vietnam is an epigraph dated 618, erected by local Sui dynasty officials in Thanh Hoa. Around the 13th century, a script called chữ Nôm was developed to record folk literature in the Vietnamese language. As in Zhuang Sawndip, the characters of the Nôm script (a demotic script) were formed by fusing the phonetic and semantic values of Chinese characters that resemble Vietnamese syllables. This process resulted in a highly complex system that was never mastered by more than 5% of the population. The oldest Vietnamese chữ Nôm writing, set alongside Chinese, is a Buddhist inscription dated 1209. In total, about 20,000 rubbings of Chinese and Vietnamese epigraphs from throughout Indochina were collected by the library of the École française d'Extrême-Orient (EFEO, the French School of the Far East) in Hanoi before 1945. The oldest extant manuscript in Vietnam is a late 15th-century bilingual Buddhist sutra, ''Phật thuyết đại báo phụ mẫu ân trọng kinh'', which is currently kept by the EFEO. The manuscript features Chinese text in larger characters and an Old Vietnamese translation in smaller characters. All Sino-Vietnamese books in Vietnam later than the ''Phật thuyết'' date from the 17th to the 20th century; most are hand-written or copied works, and only a few are printed texts. The library of the Institute of Hán-Nôm Studies in Hanoi had collected and kept 4,808 Sino-Vietnamese manuscripts in total by 1987. During French colonization in the late 19th and early 20th centuries, Literary Chinese fell out of use and was gradually replaced with the Latin-based Vietnamese alphabet.
Currently this alphabet is the main script in Vietnam, but Chinese characters and chữ Nôm are still used in some activities connected with Vietnamese traditional culture (e.g. calligraphy).
Other languagesSeveral minority languages of south and southwest China were formerly written with scripts based on Hanzi but also including many locally created characters. The most extensive is the ''sawndip'' script for the Zhuang language of Guangxi, which is still used to this day. Other languages written with such scripts include Miao, Yao, Bouyei, Mulam, Kam, Bai and Hani. All these languages are now officially written using Latin-based scripts, while Chinese characters are still used for the Mulam language. Even today, according to one survey, the traditional sawndip script has twice as many users among Zhuang speakers as the official Latin script. The foreign dynasties that ruled northern China between the 10th and 13th centuries developed scripts that were inspired by Hanzi but did not use them directly: the Khitan large script, Khitan small script, Tangut script and Jurchen script. Other scripts in China that borrowed or adapted a few Chinese characters but are otherwise distinct include the Geba script, Sui script, Yi script and the Lisu syllabary.
Transcription of foreign languagesAlong with Persian and Arabic, Chinese characters were also used as a foreign script to write the Mongolian language, where characters were used to phonetically transcribe Mongolian sounds. Most notably, the only surviving copies of ''The Secret History of the Mongols'' were written in such a manner; the Chinese characters of its title (nowadays pronounced "Mánghuōlún niǔchá tuō[bo]cháān" in Chinese) are the rendering of ''Mongγol-un niγuca tobčiyan'', the title in Mongolian. Hanzi were also used to phonetically transcribe the Manchu language in the Qing dynasty. According to the Rev. John Gulick: "The inhabitants of other Asiatic nations, who have had occasion to represent the words of their several languages by Chinese characters, have as a rule used unaspirated characters for the sounds, g, d, b. The Muslims from Arabia and Persia have followed this method ... The Mongols, Manchu, and Japanese also constantly select unaspirated characters to represent the sounds g, d, b, and j of their languages. These surrounding Asiatic nations, in writing Chinese words in their own alphabets, have uniformly used g, d, b, etc., to represent the unaspirated sounds."
SimplificationChinese character simplification is the overall reduction of the number of strokes in the regular script of a set of Chinese characters.
Simplification in ChinaThe use of traditional Chinese characters versus simplified Chinese characters varies greatly, and can depend on both the local customs and the medium. Before the official reform, character simplifications were not officially sanctioned and generally adopted vulgar variants and idiosyncratic substitutions. Orthodox variants were mandatory in printed works, while the (unofficial) simplified characters would be used in everyday writing or quick notes. Since the 1950s, and especially with the publication of the 1964 list, the People's Republic of China has officially adopted simplified Chinese characters for use in mainland China, while Hong Kong, Macau, and the Republic of China (Taiwan) were not affected by the reform. There is no absolute rule for using either system, and the choice is often determined by what the target audience understands, as well as by the upbringing of the writer. Although most often associated with the People's Republic of China, character simplification predates the 1949 communist victory. Some simplified characters were inspired by cursive written text, while others were already in use in print, albeit not in most formal works. In Republican China (1912–1949), discussions on character simplification took place within the government and the intelligentsia, in an effort to greatly reduce functional illiteracy among adults, which was a major concern at the time. Indeed, this desire by the Kuomintang to simplify the Chinese writing system (inherited and implemented by the Communist Party of China after its subsequent abandonment) also nursed the aspirations of some for the adoption of a phonetic script based on the Latin script, and spawned such inventions as Gwoyeu Romatzyh. The People's Republic of China issued its first round of official character simplifications in two documents, the first in 1956 and the second in 1964.
A second round of character simplifications (the "second round simplified characters") was promulgated in 1977. It was poorly received, and in 1986 the authorities rescinded the second round completely, while making six revisions to the 1964 list, including the restoration of three traditional characters that had been simplified: 叠 ''dié'', 覆 ''fù'', 像 ''xiàng''. In contrast to the second round, a majority of the simplified characters in the first round were drawn from conventional abbreviated forms or ancient forms. For example, the orthodox character 來 ''lái'' ("come") was written as 来 in the clerical script (隶书 / 隸書, ''lìshū'') of the Han dynasty. This clerical form uses one fewer stroke, and was thus adopted as a simplified form. The character 雲 ''yún'' ("cloud") was written with the structure 云 in the oracle bone script of the Shang dynasty, and had remained in use later as a phonetic loan in the meaning of "to say", while the radical 雨 was added as a semantic indicator to disambiguate the two. Simplified Chinese simply merges them.
Simplification in JapanIn the years after World War II, the Japanese government also instituted a series of orthographic reforms. Some characters were given simplified forms called shinjitai; the older forms were then labelled kyūjitai. The number of characters in common use was restricted, and formal lists of characters to be learned during each grade of school were established: first the 1850-character list in 1946, then the 1945-character list in 1981, and a 2136-character reformed version of the ''jōyō kanji'' in 2010. Many variant forms of characters and obscure alternatives for common characters were officially discouraged. This was done with the goal of facilitating learning for children and simplifying kanji use in literature and periodicals. These are simply guidelines, hence many characters outside these standards are still widely known and commonly used, especially those used for personal and place names (for the former, see jinmeiyō kanji), as well as in some common words for which both old and new forms of the character are acceptable and widely known among native Japanese speakers.
Southeast Asian Chinese communitiesSingapore underwent three successive rounds of character simplification. These resulted in some simplifications that differed from those used in mainland China. It ultimately adopted the reforms of the People's Republic of China in their entirety as official, and has implemented them in the educational system. However, unlike in China, personal names may still be registered in traditional characters. Malaysia started teaching a set of simplified characters at schools in 1981, which were also completely identical to the mainland China simplifications. Chinese newspapers in Malaysia are published in either set of characters, typically with the headlines in traditional Chinese while the body is in simplified Chinese. Although in both countries the use of simplified characters is universal among the younger Chinese generation, a large majority of the older literate Chinese generation still use the traditional characters. Chinese shop signs are also generally written in traditional characters. In the Philippines, most Chinese schools and businesses still use the traditional characters and bopomofo, owing to influence from the Republic of China (Taiwan) due to the shared Hokkien heritage. Recently, however, more Chinese schools have begun to use both simplified characters and pinyin. Since most readers of Chinese newspapers in the Philippines belong to the older generation, the newspapers are still published largely in traditional characters.
North AmericaPublic and private Chinese signage in the United States and Canada most often uses traditional characters. There is some effort to get municipal governments to implement more simplified character signage due to recent immigration from mainland China. Most community newspapers printed in North America are also printed in traditional characters.
Comparisons of traditional Chinese, simplified Chinese, and JapaneseThe following is a comparison of Chinese characters in the ''Standard Form of National Characters'', a common traditional Chinese standard used in Taiwan; the ''Table of General Standard Chinese Characters'', the standard for Mainland Chinese ''jiantizi'' (simplified); and the ''jōyō kanji'', the standard for Japanese kanji. Generally, the ''jōyō kanji'' are more similar to ''fantizi'' (traditional) than ''jiantizi'' are to ''fantizi''. "Simplified" refers to having significant differences from the Taiwan standard, not necessarily being a newly created character or a newly performed substitution. The characters in the Hong Kong standard (the ''List of Forms of Frequently Used Characters'') are also known as "Traditional", but are not shown.
Written stylesThere are numerous styles, or scripts, in which Chinese characters can be written, deriving from various calligraphic and historical models. Most of these originated in China and are now common, with minor variations, in all countries where Chinese characters are used. The oracle bone script and the scripts found on Chinese bronze inscriptions are no longer used; the oldest script that is still in use today is the Seal Script (''zhuànshū''). It evolved organically out of the Spring and Autumn period Zhou script, and was adopted in a standardized form under the first Emperor of China, Qin Shi Huang. The seal script, as the name suggests, is now used only in artistic seals. Few people are still able to read it effortlessly today, although the art of carving a traditional seal in the script remains alive; some calligraphers also work in this style. Scripts that are still used regularly are the "Clerical Script" (''lìshū'') of the Qin dynasty to the Han dynasty, the Weibei (''wèibēi''), the "Regular Script" (''kǎishū''), which is used mostly for printing, and the "Semi-cursive Script" (''xíngshū''), used mostly for handwriting. The cursive script (''cǎoshū'', literally "grass script") is used informally. The basic character shapes are suggested, rather than explicitly realized, and the abbreviations are sometimes extreme. Despite being cursive to the point where individual strokes are no longer differentiable and the characters are often illegible to the untrained eye, this script (also known as ''draft'') is highly revered for the beauty and freedom that it embodies. Some of the simplified Chinese characters adopted by the People's Republic of China, and some simplified characters used in Japan, are derived from the cursive script. The Japanese ''hiragana'' script is also derived from this script.
There also exist scripts created outside China, such as the Japanese ''Edomoji'' styles; these have tended to remain restricted to their countries of origin, rather than spreading to other countries like the Chinese scripts.
CalligraphyThe art of writing Chinese characters is called Chinese calligraphy. It is usually done with ink brushes. In ancient China, calligraphy was one of the Four Arts of the Chinese Scholars. Chinese calligraphy has a minimal set of rules. Every character from the Chinese scripts is built into a uniform shape by assigning it a geometric area in which the character must occur. Each character has a set number of brushstrokes; none may be added to or taken away from the character to enhance it visually, lest the meaning be lost. Finally, strict regularity is not required, meaning the strokes may be accentuated for dramatic effect or individual style. Calligraphy was the means by which scholars could mark their thoughts and teachings for immortality, and as such, calligraphic works represent some of the most precious treasures that survive from ancient China.
Typography and designThere are three major families of typefaces used in Chinese typography:
* Song/Ming
* Sans-serif
* Regular script
Ming and sans-serif are the most popular in body text and are based on regular script for Chinese characters, akin to Western serif and sans-serif typefaces, respectively. Regular script typefaces emulate regular script. The ''Song'' typeface (''sòngtǐ'') is known as the ''Ming'' typeface (''minchō'') in Japan, and it is also somewhat more commonly known as the ''Ming'' typeface (''míngtǐ'') than the ''Song'' typeface in Taiwan and Hong Kong. The names of these styles come from the Song and Ming dynasties, when block printing flourished in China. Sans-serif typefaces, called black typeface (''hēitǐ'') in Chinese and Gothic typeface in Japanese, are characterized by simple lines of even thickness for each stroke, akin to sans-serif styles such as Arial and Helvetica in Western typography. Regular script typefaces are also commonly used, but are not as common as Ming or sans-serif typefaces for body text. Regular script typefaces are often used to teach students Chinese characters, and often aim to match the standard forms of the region where they are meant to be used. Most typefaces in the Song dynasty were regular script typefaces which resembled a particular person's handwriting (e.g. the handwriting of Yan Zhenqing or Liu Gongquan), while most modern regular script typefaces tend toward anonymity and regularity.
VariantsJust as Roman letters have a characteristic shape (lower-case letters mostly occupying the x-height, with ascenders or descenders on some letters), Chinese characters occupy a more or less square area in which the components of every character are written to fit in order to maintain a uniform size and shape, especially with small printed characters in Ming and sans-serif styles. Because of this, beginners often practise writing on squared graph paper, and the Chinese sometimes use the term "Square-Block Characters" (''fāngkuàizì''), sometimes translated as ''tetragraph'', in reference to Chinese characters. Despite standardization, some nonstandard forms are commonly used, especially in handwriting. In older sources, even authoritative ones, variant characters are commonplace. For example, in the preface to the ''Kangxi Dictionary'' (''Imperial Dictionary''), there are 30 variant characters which are not found in the dictionary itself.
Regional standardsThe nature of Chinese characters makes it very easy to produce allographs (variants) for many characters, and there have been many efforts at orthographical standardization throughout history. In recent times, the widespread usage of the characters in several nations has prevented any particular system from becoming universally adopted, and the standard form of many Chinese characters thus varies between regions. Mainland China adopted simplified Chinese characters in 1956. They are also used in Singapore and Malaysia. Traditional Chinese characters are used in Hong Kong, Macau and Taiwan. Postwar Japan has used its own less drastically simplified characters, Shinjitai, since 1946, while South Korea has limited its use of Chinese characters, and Vietnam and North Korea have completely abolished their use in favour of the Vietnamese alphabet and Hangul, respectively. The standard character forms of each region are described in:
* The List of Frequently Used Characters in Modern Chinese for Mainland China.
* The List of Forms of Frequently Used Characters for Hong Kong.
* The Standard Form of National Characters for Taiwan.
* The list of jōyō kanji for Japan.
* The Han-Han Dae Sajeon (''de facto'') for Korea.
In addition to strictness in character size and shape, Chinese characters are written with very precise rules. The most important rules regard the strokes employed, stroke placement, and stroke order. Just as each region that uses Chinese characters has standardized character forms, each also has standardized stroke orders, and these standards differ from one another. Most characters can be written with just one correct stroke order, though some characters have several valid stroke orders, which may occasionally result in different stroke counts. Some characters are also written with different stroke orders due to character simplification.
Polysyllabic morphemesChinese characters are primarily morphosyllabic, meaning that most Chinese morphemes are monosyllabic and are written with a single character, though in modern Chinese most ''words'' are disyllabic and dimorphemic, consisting of two syllables, each of which is a morpheme. In modern Chinese 10% of morphemes only occur as part of a given compound. However, a few morphemes are disyllabic, some of them dating back to Classical Chinese. Excluding foreign loan words, these are typically words for plants and small animals. They are usually written with a pair of phono-semantic compound characters sharing a common radical. Examples are 蝴蝶 ''húdié'' "butterfly" and 珊瑚 ''shānhú'' "coral". Note that the 蝴 ''hú'' of ''húdié'' and the 瑚 ''hú'' of ''shānhú'' have the same phonetic, 胡, but different radicals ("insect" and "jade", respectively). Neither exists as an independent morpheme except as a poetic abbreviation of the disyllabic word.
Polysyllabic charactersIn certain cases compound words and set phrases may be contracted into single characters. Some of these can be considered logograms, where characters represent whole words rather than syllable-morphemes, though they are generally instead considered ligatures or abbreviations (similar to scribal abbreviations, such as & for "et"), and non-standard. They do see use, particularly in handwriting or decoration, but also in some cases in print. In Chinese, these ligatures are called ''héwén'', ''héshū'' or ''hétǐzì'', and in the special case of combining two characters, they are known as "two-syllable Chinese characters". A commonly seen example is the double happiness symbol 囍, formed as a ligature of 喜喜 and referred to by its disyllabic name. In handwriting, numbers are very frequently squeezed into one space or combined; common ligatures include 廿 ''niàn'', "twenty", normally read as ''èrshí''; 卅 ''sà'', "thirty", normally read as ''sānshí''; and 卌 ''xì'', "forty", normally read as ''sìshí''. Calendars often use numeral ligatures in order to save space; for example, the "21st of March" can be written as 三月廿一. Modern examples particularly include Chinese characters for SI units. In Chinese these units are disyllabic and standardly written with two characters, as 厘米 ''límǐ'' "centimeter" (厘 centi-, 米 meter) or 千瓦 ''qiānwǎ'' "kilowatt". However, in the 19th century these were often written via compound characters, pronounced disyllabically, such as 瓩 for 千瓦 or 糎 for 厘米; some of these characters were also used in Japan, where they were pronounced with borrowed European readings instead. These have now fallen out of general use, but are occasionally seen. Less systematic examples include 圕 ''túshūguǎn'' "library", a contraction of 圖書館 (simplified: 图书馆). Since polysyllabic characters are often non-standard, they are often excluded from character dictionaries.
The use of such contractions is as old as Chinese characters themselves, and they have frequently been found in religious or ritual use. In the oracle bone script, personal names, ritual items, and even phrases such as ''shòu yòu'' "receive blessings" are commonly contracted into single characters. A dramatic example is that in medieval manuscripts 菩薩 ''púsà'' "bodhisattva" (simplified: 菩萨) is sometimes written with a single character formed of a 2×2 grid of four 十 (derived from the grass radical over two 十). However, for the sake of consistency and standardization, the CPC seeks to limit the use of such polysyllabic characters in public writing to ensure that every character has only one syllable. Conversely, with the fusion of the diminutive ''-er'' suffix in Mandarin, some monosyllabic words may even be written with two characters, as in 花儿 ''huār'' "flower", which was formerly disyllabic. In most other languages that use the Chinese family of scripts, notably Korean, Vietnamese, and Zhuang, Chinese characters are typically monosyllabic, but in Japanese a single character is generally used to represent a borrowed monosyllabic Chinese morpheme (the ''on'yomi''), a polysyllabic native Japanese morpheme (the ''kun'yomi''), or even (in rare cases) a foreign loanword. These uses are completely standard and unexceptional.
Rare and complex charactersOften a character not commonly used (a "rare" or "variant" character) will appear in a personal or place name in Chinese, Japanese, Korean, and Vietnamese (see Chinese name, Japanese name, Korean name, and Vietnamese name, respectively). This has caused problems, as many computer encoding systems include only the most common characters and exclude the less often used ones. This is especially a problem for personal names, which often contain rare or classical, antiquated characters. One man who has encountered this problem is Taiwanese politician Yu Shyi-kun, due to the rarity of the last character (堃; pinyin: kūn) in his name. Newspapers have dealt with this problem in varying ways, such as using software to combine two existing, similar characters, printing a picture of the person, or, especially in the case of Yu Shyi-kun, simply substituting a homophone for the rare character in the hope that the reader will be able to make the correct inference. Taiwanese political posters, movie posters etc. will often add the bopomofo phonetic symbols next to such a character. Japanese newspapers may render such names and words in katakana instead, and it is accepted practice for people to write names for which they are unsure of the correct kanji in katakana. There are also some extremely complex characters which have understandably become rather rare. According to Joël Bellassen (1989), the most complex Chinese character is 𪚥 (U+2A6A5) ''zhé'', meaning "verbose" and containing sixty-four strokes; this character fell from use around the 5th century. It might be argued, however, that while it contains the most strokes, it is not necessarily the most complex character (in terms of difficulty), as it simply requires writing the same sixteen-stroke character 龍 ''lóng'' (lit. "dragon") four times in the space for one. Another 64-stroke character is 𠔻 (U+2053B) ''zhèng'', composed of 興 ''xīng/xìng'' (lit. "flourish") four times.
One of the most complex characters found in modern Chinese dictionaries is 齉 (U+9F49) (''nàng''), meaning "snuffle" (that is, a pronunciation marred by a blocked nose), with "just" thirty-six strokes. Other stroke-rich characters include 靐 (bìng), with 39 strokes, and 䨻 (bèng), with 52 strokes, meaning the loud noise of thunder. However, these are not in common use. The most complex character that can be input using the Microsoft New Phonetic IME 2002a for traditional Chinese is 龘 (''dá'', "the appearance of a dragon flying"). It is composed of the dragon radical represented three times, for a total of 16 × 3 = 48 strokes. Among the most complex characters in modern dictionaries and also in ''frequent modern use'' are 籲 (''yù'', "to implore"), with 32 strokes; 鬱 (''yù'', "luxuriant, lush; gloomy"), with 29 strokes, as in 憂鬱 (''yōuyù'', "depressed"); 豔 (''yàn'', "colorful"), with 28 strokes; and 釁 (''xìn'', "quarrel"), with 25 strokes, as in 挑釁 (''tiǎoxìn'', "to pick a fight"). Also in occasional modern use is 鱻 (''xiān'' "fresh"; variant of 鮮 ''xiān'') with 33 strokes. In Japanese, an 84-stroke ''kokuji'' exists, normally read ''taito''. It is composed of the triple "cloud" character (䨺) on top of the abovementioned triple "dragon" character (龘). Also meaning "the appearance of a dragon in flight", it has been pronounced ''otodo'', ''taito'', and ''daito''. The most elaborate character in the jōyō kanji list is the 29-stroke 鬱, meaning "depression" or "melancholy". The most complex Chinese character still in use may be 𰻞 (U+30EDE) (''biáng''), with 58 strokes, which refers to biangbiang noodles, a type of noodle from China's Shaanxi province. This character, along with the syllable ''biáng'', cannot be found in dictionaries. The fact that it represents a syllable that does not exist in any other word means that it could be classified as a dialectal character.
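The encoding problem described in this section can be illustrated with a short Python sketch. It uses only the standard library; the helper names `can_encode`, `plane` and `utf16_units` are introduced here for illustration and do not come from any library. The sketch assumes the codecs bundled with CPython (`gb2312`, `big5`, `gb18030`) and shows that a character outside the Basic Multilingual Plane, such as the 64-stroke U+2A6A5 or the ''biáng'' character U+30EDE, cannot be represented in the older national codecs, while GB18030 and the Unicode encodings can carry it.

```python
def can_encode(ch: str, codec: str) -> bool:
    """Return True if the given codec can represent the character."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def plane(ch: str) -> int:
    """Unicode plane of a character; plane 0 is the Basic Multilingual Plane."""
    return ord(ch) >> 16


def utf16_units(ch: str) -> int:
    """Number of 16-bit code units in UTF-16 (2 means a surrogate pair)."""
    return len(ch.encode("utf-16-le")) // 2


if __name__ == "__main__":
    # Code points taken from the text above, plus one everyday character.
    samples = [0x4E2D, 0x9F49, 0x2A6A5, 0x2053B, 0x30EDE]
    codecs_to_try = ["gb2312", "big5", "gb18030", "utf-8"]
    for cp in samples:
        ch = chr(cp)
        support = ", ".join(
            f"{name}={'yes' if can_encode(ch, name) else 'no'}"
            for name in codecs_to_try
        )
        print(f"U+{cp:04X} plane={plane(ch)} utf16_units={utf16_units(ch)} {support}")
```

Whether a particular rare name character such as 堃 is available depends on the exact codec and its extensions, so the sketch reports support rather than assuming it.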
Number of charactersThe total number of Chinese characters from past to present remains unknowable because new ones are being developed all the time (for instance, brands may create new characters when none of the existing ones allow for the intended meaning), or because they have been invented by whoever wrote them and have never been adopted as official characters. Chinese characters are theoretically an open set and anyone can create new characters, though such inventions are rarely included in official character sets. The number of entries in major Chinese dictionaries is the best means of estimating the historical growth of the character inventory. Even the ''Zhonghua Zihai'' does not include characters in the Chinese family of scripts created to represent non-Chinese languages, except the unique characters in use in Japan and Korea. Characters formed by Chinese principles in other languages include the roughly 1,500 Japanese-made ''kokuji'' given in the ''Kokuji no Jiten'', the Korean-made gukja, the over 10,000 Sawndip characters still in use in Guangxi, and the almost 20,000 Nôm characters formerly used in Vietnam. More divergent descendants of Chinese script include the Tangut script, which created over 5,000 characters with similar strokes but different formation principles to Chinese characters. Modified radicals and new variants are two common reasons for the ever-increasing number of characters. There are about 300 radicals, of which 100 are in common use. Creating a new character by modifying the radical is an easy way to disambiguate homographs among ''xíngshēngzì'' pictophonetic compounds. This practice began long before the standardization of Chinese script by Qin Shi Huang and continues to the present day. The traditional 3rd-person pronoun ''tā'' (他 "he, she, it"), which is written with the "person radical", illustrates modifying significs to form new characters.
In modern usage, there is a graphic distinction between ''tā'' (她 "she") with the "woman radical", ''tā'' (牠 "it") with the "animal radical", ''tā'' (它 "it") with the "roof radical", and ''tā'' (祂 "He") with the "deity radical". One consequence of modifying radicals is the fossilization of rare and obscure variant logographs, some of which are not even used in Classical Chinese. For instance, 和 ''hé'' "harmony, peace", which combines the "grain radical" with the "mouth radical", has infrequent variants 咊 with the radicals reversed and 龢 with the "flute radical".
ChineseChinese characters (字 ''zì'', meaning the semiotic sign, symbol, or glyph part) should not be confused with Chinese words (词 ''cí'', meaning phrases or vocabulary words, consisting of a group of characters or possibly a single character), as the majority of modern Chinese words, unlike their Old Chinese and Middle Chinese counterparts, are written with two or more characters, each character representing one syllable and/or morpheme. Knowing the meanings of the individual characters of a word will often allow the general meaning of the word to be inferred, but this is not always the case. Studies in China have shown that literate individuals know and use between 3,000 and 4,000 characters. Specialists in classical literature or history, who would often encounter characters no longer in use, are estimated to have a working vocabulary of between 5,000 and 6,000 characters. In China, which uses simplified Chinese characters, the ''Xiàndài Hànyǔ Chángyòng Zìbiǎo'' (Chart of Common Characters of Modern Chinese) lists 2,500 common characters and 1,000 less-than-common characters, while the ''Xiàndài Hànyǔ Tōngyòng Zìbiǎo'' (Chart of Generally Utilized Characters of Modern Chinese) lists 7,000 characters, including the 3,500 characters already listed above. In June 2013, the ''Tōngyòng Guīfàn Hànzì Biǎo'' (Table of General Standard Chinese Characters) became the current standard, replacing the previous two lists. It includes 8,105 characters: 3,500 as primary, 3,000 as secondary, and 1,605 as tertiary. GB2312, an early version of the national encoding standard used in the People's Republic of China, encodes 6,763 Chinese characters. GB18030, the modern, mandatory standard, has a much higher number. Since July 2021, the Hànyǔ Shuǐpíng Kǎoshì (Chinese Proficiency Test) has covered 3,000 characters and 11,092 words at its highest level (level nine).
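The GB2312 figure quoted above can be checked with a small Python sketch. It relies on two assumptions that are not stated in the text: that GB2312 places its Chinese characters in rows 16 to 87 of a 94 × 94 grid, with five unassigned cells at the end of row 55, and that the `gb2312` codec bundled with CPython follows that table. The function names are chosen here for illustration only.

```python
def gb2312_hanzi_by_layout() -> int:
    """Count hanzi from the assumed GB2312 row layout."""
    level1 = 40 * 94 - 5   # rows 16-55; last five cells of row 55 unassigned
    level2 = 32 * 94       # rows 56-87, fully assigned
    return level1 + level2


def gb2312_hanzi_by_decoding() -> int:
    """Count the cells in rows 16-87 that Python's gb2312 codec can decode."""
    count = 0
    for row in range(16, 88):
        for cell in range(1, 95):
            # In the EUC-CN byte form, each coordinate is offset by 0xA0.
            raw = bytes([0xA0 + row, 0xA0 + cell])
            try:
                raw.decode("gb2312")
            except UnicodeDecodeError:
                continue
            count += 1
    return count


if __name__ == "__main__":
    print(gb2312_hanzi_by_layout())
    print(gb2312_hanzi_by_decoding())
```

Both counts concern only the Chinese-character region; GB2312 also defines symbols, kana and other non-Chinese letters in its lower rows, which the sketch deliberately leaves out.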
In Taiwan, which uses traditional Chinese characters, the Ministry of Education's ''Chángyòng Guózì Biāozhǔn Zìtǐ Biǎo'' (Chart of Standard Forms of Common National Characters) lists 4,808 characters; the ''Cì Chángyòng Guózì Biāozhǔn Zìtǐ Biǎo'' (Chart of Standard Forms of Less-Than-Common National Characters) lists another 6,341 characters. The ''Chinese Standard Interchange Code'' (CNS11643), the official national encoding standard, supports 48,027 characters in its 1992 version (currently over 100,500 characters), while the most widely used encoding scheme, Big5, supports only 13,053. The Test of Chinese as a Foreign Language (TOCFL) covers 8,000 words at its highest level (level six). The Taiwan Benchmarks for the Chinese Language (TBCL), a guideline developed to describe levels of Chinese language proficiency, covers 3,100 characters and 14,470 words at its highest level (level seven).
In Hong Kong, which uses traditional Chinese characters, the Education and Manpower Bureau's ''Soengjung Zi Zijing Biu'' (List of Graphemes of Commonly-Used Chinese Characters), intended for use in elementary and junior secondary education, lists a total of 4,759 characters.
In addition, there are a number of ''dialect characters'' that are not generally used in formal written Chinese but represent colloquial terms in nonstandard varieties of Chinese. In general, it is common practice to use standard characters to transcribe Chinese dialects when obvious cognates with words in Standard Mandarin exist. However, when no obvious cognate can be found for a word, whether because of irregular sound change or semantic drift in the meanings of characters, or because the word originates from a non-Chinese source such as a substratum from an earlier displaced language or a later borrowing from another language family, characters are borrowed and used according to the rebus principle or invented in an ''ad hoc'' manner to transcribe it. 
These new characters are generally phonosemantic compounds (e.g., 侬, 'person' in Min), although a few are compound ideographs (e.g., 孬, 'bad', in Northeast Mandarin). Except in the case of Written Cantonese, there is no official orthography, and there may be several ways to write a dialectal word, often one that is etymologically correct and one or several that are based on the current pronunciation (e.g., 觸祭 (etymological) vs. 戳鸡/戳雞 (phonetic), 'eat' (low-register) in Shanghainese). Speakers of a dialect will generally recognize a dialectal word if it is transcribed according to phonetic considerations, while the etymologically correct form may be more difficult or impossible to recognize. For example, few Gan speakers would recognize the character meaning 'to lean' in their dialect, because this character (隑) has become archaic in Standard Mandarin. The historically "correct" transcription is often so obscure that it is uncovered only after considerable scholarly research into philology and historical phonology, and it may be disputed by other researchers.
As an exception, Written Cantonese is in widespread use in Hong Kong, even for certain formal documents, due to the former British colonial administration's recognition of Cantonese for official purposes. In Taiwan, there is also a body of semi-official characters used to represent Taiwanese Hokkien and Hakka. For example, the vernacular character pronounced ''cii11'' in Hakka means "to kill". Other varieties of Chinese with a significant number of speakers, such as Shanghainese Wu, Gan Chinese, and Sichuanese, also have their own series of characters. These are not often seen except on advertising billboards directed toward locals, and they are not used in formal settings except to give precise transcriptions of witness statements in legal proceedings. Written Standard Mandarin is the preference for all mainland regions.
Japanese
In Japanese there are 2,136 ''jōyō kanji'' (lit. "frequently used Chinese characters") designated by the Japanese Ministry of Education; these are taught during primary and secondary school. The list is a recommendation, not a restriction, and many characters missing from it are still in common use. One area where character usage is officially restricted is in names, which may contain only government-approved characters. Since the ''jōyō kanji'' list excludes many characters that have been used in personal and place names for generations, an additional list, referred to as the ''jinmeiyō kanji'' (lit. "kanji for use in personal names"), is published. It currently contains 863 characters. Today, a well-educated Japanese person may know upwards of 3,500 characters. The ''kanji kentei'' (''Nihon Kanji Nōryoku Kentei Shiken'' or ''Test of Japanese Kanji Aptitude'') tests a speaker's ability to read and write kanji. The highest level of the ''kanji kentei'' tests approximately 6,000 kanji (corresponding to the kanji character list of JIS X 0208), though in practice few people attain (or need to attain) this level.
Korean
''Basic Hanja for educational use'' are a subset of 1,800 ''hanja'' defined in 1972 by a South Korean education standard. 900 characters are expected to be learnt by middle school students and a further 900 at high school. In March 1991, the Supreme Court of South Korea published the ''Table of Hanja for Personal Name Use'', which allowed a total of 2,854 ''hanja'' in South Korean given names. The list was expanded gradually, and as of 2015 there were 8,142 ''hanja'' (including the set of basic ''hanja'') permitted for use in Korean names.
Modern creation
New characters can in principle be coined at any time, just as new words can be, but they may not be adopted. Significant historically recent coinages date to scientific terms of the 19th century. Specifically, Chinese coined new characters for chemical elements, which continue to be used and taught in schools in China and Taiwan. In Japan, in the Meiji era (specifically, the late 19th century), new characters were coined for some (but not all) SI units, such as 粁 (米 "meter" + 千 "thousand, kilo-") for kilometer. These kokuji (Japanese coinages) have found use in China as well; see Chinese characters for SI units for details. While new characters can be easily coined by writing on paper, they are difficult to represent on a computer, where they must generally be represented as a picture rather than as text. This presents a significant barrier to their use or widespread adoption. Compare this with the use of symbols as names in 20th-century musical albums such as ''Led Zeppelin IV'' (1971) and ''Love Symbol Album'' (1993); an album cover may contain any graphics, but in writing and computing these symbols are difficult to use.
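The computer-representation barrier can be made concrete: a character can be stored as text only if the encoding in use assigns it a code. The following is a minimal sketch under stated assumptions, using Python's built-in codecs for some of the encodings named earlier in this article. The helper name `can_encode` and the two sample characters are illustrative choices, not part of any standard.

```python
def can_encode(char: str, encoding: str) -> bool:
    """Return True if `char` can be written as text in `encoding`."""
    try:
        char.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Illustrative samples: the simplified and traditional forms of "Han".
SAMPLES = ["汉", "漢"]
# Codec names as registered in Python's standard library.
ENCODINGS = ["gb2312", "gb18030", "big5", "utf-8"]


def coverage_table(chars=SAMPLES, encodings=ENCODINGS):
    """Map each character to a dict of encoding -> encodable?"""
    return {c: {e: can_encode(c, e) for e in encodings} for c in chars}


if __name__ == "__main__":
    for char, row in coverage_table().items():
        print(char, row)
```

A character that no encoding covers fails this check everywhere, which is why a newly coined character usually has to be embedded as an image until a standard assigns it a code.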
Indexing
Dozens of indexing schemes have been created for arranging Chinese characters in Chinese dictionaries. The great majority of these schemes have appeared in only a single dictionary; only one such system has achieved truly widespread use. This is the system of radicals (see, for example, the 214 so-called Kangxi radicals). Chinese character dictionaries often allow users to locate entries in several ways. Many Chinese, Japanese, and Korean dictionaries of Chinese characters list characters in radical order: characters are grouped together by radical, and radicals containing fewer strokes come before radicals containing more strokes (radical-and-stroke sorting). Under each radical, characters are listed by their total number of strokes. It is often also possible to search for characters by sound, using pinyin (in Chinese dictionaries), zhuyin (in Taiwanese dictionaries), kana (in Japanese dictionaries) or hangul (in Korean dictionaries). Most dictionaries also allow searches by total number of strokes, and individual dictionaries often allow other search methods as well.
For instance, to look up a character whose sound is not known, e.g., 松 (pine tree), the user first determines which part of the character is the radical (here 木), then counts the number of strokes in the radical (four), and turns to the radical index (usually located on the inside front or back cover of the dictionary). Under the number "4" for radical stroke count, the user locates 木, then turns to the page number listed, which is the start of the listing of all the characters containing this radical. This page will have a sub-index giving remainder stroke numbers (for the non-radical portions of characters) and page numbers. The right half of the character also contains four strokes, so the user locates the number 4, and turns to the page number given. 
From there, the user must scan the entries to locate the character he or she is seeking. Some dictionaries have a sub-index which lists every character containing each radical, and if the user knows the number of strokes in the non-radical portion of the character, he or she can locate the correct page directly. Another dictionary system is the four corner method, where characters are classified according to the shape of each of the four corners. Most modern Chinese dictionaries and Chinese dictionaries sold to English speakers use the traditional radical-based character index in a section at the front, while the main body of the dictionary arranges the main character entries alphabetically according to their pinyin spelling. To find a character with unknown sound using one of these dictionaries, the reader finds the radical and stroke number of the character, as before, and locates the character in the radical index. The character's entry will have the character's pronunciation in pinyin written down; the reader then turns to the main dictionary section and looks up the pinyin spelling alphabetically.
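The radical-and-stroke lookup described above, followed by the pinyin lookup in the main body, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the layout of any real dictionary. The five-entry table, the function names, the use of tone-numbered pinyin for plain alphabetical sorting, and the code-point tie-breaking are all choices made for this sketch.

```python
# Hypothetical mini-dictionary; a real one would hold thousands of entries.
# character -> (radical, strokes in radical, strokes outside radical, pinyin)
ENTRIES = {
    "松": ("木", 4, 4, "song1"),
    "林": ("木", 4, 4, "lin2"),
    "好": ("女", 3, 3, "hao3"),
    "他": ("人", 2, 3, "ta1"),
    "明": ("日", 4, 4, "ming2"),
}


def radical_stroke_key(item):
    """Sort key: radical stroke count, then radical, then residual strokes."""
    char, (radical, radical_strokes, residual_strokes, _pinyin) = item
    # Ties are broken by code point, which is an assumption of this sketch.
    return (radical_strokes, radical, residual_strokes, char)


def build_radical_index(entries):
    """Map radical -> residual stroke count -> list of characters."""
    index = {}
    for char, data in sorted(entries.items(), key=radical_stroke_key):
        radical, _, residual_strokes, _ = data
        index.setdefault(radical, {}).setdefault(residual_strokes, []).append(char)
    return index


def find_pinyin(index, entries, radical, residual_strokes, char):
    """Scan the radical index for the character, then read off its pinyin."""
    candidates = index.get(radical, {}).get(residual_strokes, [])
    if char not in candidates:
        return None
    return entries[char][3]


def main_body_order(entries):
    """Order of the main dictionary body: alphabetical by pinyin spelling."""
    return sorted(entries, key=lambda c: entries[c][3])


if __name__ == "__main__":
    index = build_radical_index(ENTRIES)
    # Radical 木 (four strokes), four residual strokes: scan the candidates.
    print(index["木"][4])
    print(find_pinyin(index, ENTRIES, "木", 4, "松"))
    print(main_body_order(ENTRIES))
```

The nested mapping mirrors the two printed indexes: the outer key is the radical index and the inner key is the sub-index of remainder stroke counts. The final scan of the candidate list corresponds to the reader scanning the entries on the page.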
See also
* Adoption of Chinese literary culture
* Character amnesia
* Chinese character encoding
* Chinese family of scripts
* Chinese input methods for computers
* Chinese numerals, or how to write numbers with Chinese characters
* Chinese punctuation
* Eight Principles of Yong
* Horizontal and vertical writing in East Asian scripts
* List of languages written in Chinese characters and derivatives of Chinese characters
* Romanization of Chinese
* Stroke order
* Transcription into Chinese characters
External links;History and construction of Chinese characters * Excerpt fro