HZ-GB-2312
The HZ character encoding is an encoding of GB 2312 that was formerly commonly used in email and USENET postings. It was designed in 1989 by Fung Fung Lee of Stanford University, and subsequently codified in 1995 into RFC 1843. The HZ encoding, short for ''Hanzi'', was invented to facilitate the use of Chinese characters through e-mail, which at that time only allowed 7-bit characters. Therefore, in lieu of standard ISO 2022 escape sequences (as in the case of ISO-2022-JP) or 8-bit characters (as in the case of EUC), the HZ code uses only printable, 7-bit characters to represent Chinese characters. It was also popular in USENET networks, which in the late 1980s and early 1990s generally did not allow transmission of 8-bit characters or escape characters.

History
HZ superseded the earlier "zW" encoding, which marked entire lines as being GB 2312 text by beginning them with the characters zW.

Structure and use
In the HZ encoding system, the character sequences "~" act as e ...

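The 7-bit property described in the HZ entry can be checked with a minimal sketch like the one below. It assumes Python's standard-library `hz` and `gb2312` codecs; the sample string is an arbitrary choice, and the exact escape sequences are left to the codec because the excerpt is cut off before describing them.

```python
# Sketch: compare the HZ form of some text with its 8-bit GB 2312 (EUC-CN) form.
# The sample string is an arbitrary illustration, not taken from the source.
text = "HZ: 汉字"

hz_bytes = text.encode("hz")            # HZ form, expected to be 7-bit clean
euc_bytes = text.encode("gb2312")       # EUC-CN form, uses 8-bit bytes

hz_is_seven_bit = all(b < 0x80 for b in hz_bytes)
euc_uses_high_bit = any(b >= 0x80 for b in euc_bytes)
round_trip = hz_bytes.decode("hz")      # decode back to a str

print("HZ bytes:     ", hz_bytes)
print("EUC-CN bytes: ", euc_bytes.hex())
print("HZ is 7-bit only:", hz_is_seven_bit)
print("EUC-CN uses the high bit:", euc_uses_high_bit)
print("round-trip ok:", round_trip == text)
```

The HZ form is longer than the EUC-CN form because the switch into and out of Chinese text has to be signalled with extra printable characters.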
GB 2312
GB 2312 is a key official character set of the People's Republic of China, used for Simplified Chinese characters. GB2312 is the registered internet name for EUC-CN, which is its usual encoded form. ''GB'' refers to the Guobiao standards (国家标准), whereas the ''T'' suffix (推荐, tuījiàn, "recommendation") denotes a non-mandatory standard. GB 2312 was originally a mandatory national standard. However, following a National Standard Bulletin of the People's Republic of China in 2017, GB 2312 is no longer mandatory, and its standard code was modified to reflect this. It has been superseded by GBK and GB 18030, which include additional characters, but remains in widespread use as a subset of those encodings. GB2312 is the second-most popular encoding served from China and territories (after UTF-8), with 5.5% of web servers serving a page declaring it. Globally, GB2312 is declared on 0.1% of all web pages. However, all major web browsers decode GB2312-marked docume ...

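The subset relationship between GB 2312, GBK and GB 18030 can be illustrated with a short sketch. It assumes Python's standard-library `gb2312`, `gbk` and `gb18030` codecs; the sample characters are arbitrary choices, with the traditional form 漢 standing in for a character that lies outside GB 2312.

```python
# Sketch: characters in GB 2312 keep the same bytes under GBK and GB 18030,
# while a character outside GB 2312 needs one of the larger encodings.
simplified = "汉字"      # arbitrary sample; both characters are in GB 2312
traditional = "漢"       # traditional form, used here as an out-of-set example

euc_cn = simplified.encode("gb2312")    # Python's "gb2312" codec emits EUC-CN
same_in_gbk = simplified.encode("gbk") == euc_cn
same_in_gb18030 = simplified.encode("gb18030") == euc_cn

try:
    traditional.encode("gb2312")
    fits_gb2312 = True
except UnicodeEncodeError:
    fits_gb2312 = False

gbk_bytes = traditional.encode("gbk")   # GBK adds further characters

print("EUC-CN bytes:", euc_cn.hex())
print("same bytes in GBK:", same_in_gbk)
print("same bytes in GB 18030:", same_in_gb18030)
print("traditional form fits GB 2312:", fits_gb2312)
print("traditional form in GBK:", gbk_bytes.hex())
```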
UTF-7
UTF-7 (7-bit Unicode Transformation Format) is an obsolete variable-length character encoding for representing Unicode text using a stream of ASCII characters. It was originally intended to provide a means of encoding Unicode text for use in Internet e-mail messages that was more efficient than the combination of UTF-8 with quoted-printable. UTF-7 (according to its RFC) isn't a "Unicode Transformation Format", as the definition can only encode code points in the BMP (the first 65536 Unicode code points, which does not include emojis and many other characters). However, if a UTF-7 translator converts to or from UTF-16, then it can (and probably does) encode each surrogate half as though it were a 16-bit code point, and thus can encode all code points. It is unclear whether other UTF-7 software (such as translators to UTF-32 or UTF-8) supports this. UTF-7 has never been an official standard of the Unicode Consortium. It is known to have security issues, which is why software has been changed to ...

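As a rough illustration of the UTF-7 entry, the sketch below encodes BMP text and a non-BMP character and checks that the output is pure ASCII. It assumes Python's standard-library `utf-7` codec, which is only one implementation; as noted above, other UTF-7 software may treat characters outside the BMP differently. The sample strings are arbitrary.

```python
# Sketch: UTF-7 output is an ASCII-only byte stream, for BMP and non-BMP text.
bmp_text = "Hi 汉字"          # arbitrary sample inside the BMP
non_bmp = "\U0001F600"        # an emoji, outside the BMP

bmp_encoded = bmp_text.encode("utf-7")
non_bmp_encoded = non_bmp.encode("utf-7")

def is_ascii_only(data: bytes) -> bool:
    """Return True if every byte fits in 7 bits."""
    return all(b < 0x80 for b in data)

print(bmp_encoded, is_ascii_only(bmp_encoded))
print(non_bmp_encoded, is_ascii_only(non_bmp_encoded))
print("BMP round-trip ok:", bmp_encoded.decode("utf-7") == bmp_text)
print("non-BMP round-trip ok:", non_bmp_encoded.decode("utf-7") == non_bmp)
```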
ISO-2022-JP
ISO/IEC 2022, ''Information technology—Character code structure and extension techniques'', is an ISO/IEC standard (equivalent to the ECMA standard ECMA-35, the ANSI standard ANSI X3.41 and the Japanese Industrial Standard JIS X 0202) in the field of character encoding. Originating in 1971, it was most recently revised in 1994. ISO 2022 specifies a general structure which character encodings can conform to, dedicating particular ranges of bytes (0x00–1F and 0x7F–9F) to be used for non-printing control codes for formatting and in-band instructions (such as line breaks or formatting instructions for text terminals), rather than graphical characters. It also specifies a syntax for escape sequences, multiple-byte sequences beginning with the ESC control code, which can likewise be used for in-band instructions. Specific sets of control codes and escape sequences designed to be used with ISO 2022 include ISO/IEC 6429, portions of which are implemented by ANSI.SYS and terminal emu ...

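The byte ranges reserved for control codes in the ISO/IEC 2022 entry can be expressed as a small classifier. This is only a sketch: the function name is hypothetical, and it models just the two ranges quoted above plus the ESC byte (0x1B) that starts an escape sequence, not the full structure of the standard.

```python
# Sketch: classify a byte value according to the control-code ranges quoted
# above (0x00-0x1F and 0x7F-0x9F). The function name is hypothetical.
ESC = 0x1B  # the control code that introduces an escape sequence

def is_control_byte(value: int) -> bool:
    """Return True if the byte falls in one of the reserved control ranges."""
    return 0x00 <= value <= 0x1F or 0x7F <= value <= 0x9F

for value in (0x0A, ESC, 0x41, 0x7F, 0x9F, 0xA0):
    kind = "control" if is_control_byte(value) else "graphic or other"
    print(f"0x{value:02X}: {kind}")
```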
Simplified Chinese
Simplification, Simplify, or Simplified may refer to:

Mathematics
Simplification is the process of replacing a mathematical expression by an equivalent one that is simpler (usually shorter), for example:
* Simplification of algebraic expressions, in computer algebra
* Simplification of boolean expressions, i.e. logic optimization
* Simplification by conjunction elimination in inference in logic yields a simpler, but generally non-equivalent formula
* Simplification of fractions

Science
* Approximations simplify a more detailed or difficult to use process or model

Linguistics
* Simplification of Chinese characters
* Simplified English
* Text simplification

Music
* Simplified (band), a 2002 rock band from Charlotte, North Carolina
* ''Simplified'' (album), a 2005 album by Simply Red
* "Simplify", a 2008 song by Sanguine
* "Simplify", a 2018 song by Young the Giant from ''Mirror Master''

See also
* Muntzing (simplification of electric circuits)
* Reduction (math ...

Hanzi
Chinese characters are logograms developed for the writing of Chinese. In addition, they have been adapted to write other East Asian languages, and remain a key component of the Japanese writing system where they are known as ''kanji''. Chinese characters in South Korea, which are known as ''hanja'', retain significant use in Korean academia to study its documents, history, literature and records. Vietnam once used the ''chữ Hán'' and developed chữ Nôm to write Vietnamese before turning to a romanized alphabet. Chinese characters are the oldest continuously used system of writing in the world. By virtue of their widespread current use throughout East Asia and Southeast Asia, as well as their profound historic use throughout the Sinosphere, Chinese characters are among the most widely adopted writing systems in the world by number of users. The total number of Chinese characters ever to appear in a dictionary is in the tens of thousands, though most are graphic v ...

Character Sets
Character encoding is the process of assigning numbers to graphical characters, especially the written characters of human language, allowing them to be stored, transmitted, and transformed using digital computers. The numerical values that make up a character encoding are known as "code points" and collectively comprise a "code space", a "code page", or a "character map". Early character codes associated with the optical or electrical telegraph could only represent a subset of the characters used in written languages, sometimes restricted to upper case letters, numerals and some punctuation only. The low cost of digital representation of data in modern computer systems allows more elaborate character codes (such as Unicode) which represent most of the characters used in many written languages. Character encoding using internationally accepted standards permits worldwide interchange of text in electronic form.

History
The history of character codes illustrates the evolvi ...

Microsoft Windows
Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry: for example, Windows NT for consumers, Windows Server for servers, and Windows IoT for embedded systems. Defunct Windows families include Windows 9x, Windows Mobile, and Windows Phone. The first version of Windows was released on November 20, 1985, as a graphical operating system shell for MS-DOS in response to the growing interest in graphical user interfaces (GUIs). Windows is the most popular desktop operating system in the world, with 75% market share, according to StatCounter. However, Windows is not the most used operating system when including both mobile and desktop OSes, due to Android's massive growth. The most recent version of Windows is Windows 11 for consumer PCs and tablets, Windows 11 Enterprise for corporations, and Windows Server 2022 for servers.

Genealogy
By marketing ...

Unix
Unix (trademarked as UNIX) is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, and others. Unix was initially intended for use inside the Bell System; AT&T licensed it to outside parties in the late 1970s, leading to a variety of both academic and commercial Unix variants from vendors including University of California, Berkeley (BSD), Microsoft (Xenix), Sun Microsystems (SunOS/Solaris), HP/HPE (HP-UX), and IBM (AIX). In the early 1990s, AT&T sold its rights in Unix to Novell, which then sold the UNIX trademark to The Open Group, an industry consortium founded in 1996. The Open Group allows the use of the mark for certified operating systems that comply with the Single UNIX Specification (SUS). Unix systems are chara ...

Kuten
JIS X 0208 is a 2-byte character set specified as a Japanese Industrial Standard, containing 6879 graphic characters suitable for writing text, place names, personal names, and so forth in the Japanese language. It was originally established as JIS C 6226 in 1978, and has been revised in 1983, 1990, and 1997. It is also called Code page 952 by IBM. The 1978 version is also called Code page 955 by IBM.

Scope of use and compatibility
The character set JIS X 0208 establishes is primarily for the purpose of information interchange between data processing systems and the devices connected to them, or mutually between data communication systems. This character set can be used for data processing and text processing. Partial implementations of the character set are not considered compatible. Because there are places where such things have happened as the original drafting committee of the first standard taking care to separate characters between level 1 and l ...

Extended Unix Code
Extended Unix Code (EUC) is a multibyte character encoding system used primarily for Japanese, Korean, and simplified Chinese. The most commonly used EUC codes are variable-length encodings with a character belonging to an ISO/IEC 646 compliant coded character set (such as ASCII) taking one byte, and a character belonging to a 94x94 coded character set (such as GB 2312) represented in two bytes. The EUC-CN form of GB 2312 and EUC-KR are examples of such two-byte EUC codes. EUC-JP includes characters represented by up to three bytes, including an initial shift code, whereas a single character in EUC-TW can take up to four bytes. Modern applications are more likely to use UTF-8, which supports all of the glyphs of the EUC codes, and more, and is generally more portable with fewer vendor deviations and errors. EUC is however still very popular, especially EUC-KR for South Korea.

Encoding structure
The structure of EUC is based on the ISO/IEC 2022 standard, which specifies a system of graphical character sets which can be repres ...

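The one-byte versus two-byte behaviour described in the EUC entry can be observed directly. The sketch below assumes Python's standard-library `gb2312` (EUC-CN) and `euc_kr` codecs; the helper name and the sample characters are hypothetical choices made for illustration.

```python
# Sketch: per-character byte lengths under two two-byte EUC codes.
# `byte_lengths` is a hypothetical helper, not part of any library.
def byte_lengths(text: str, codec: str) -> list[int]:
    """Return the encoded length in bytes of each character in `text`."""
    return [len(ch.encode(codec)) for ch in text]

samples = {
    "gb2312": "A汉",   # EUC-CN: one ASCII letter, one GB 2312 character
    "euc_kr": "A한",   # EUC-KR: one ASCII letter, one Hangul syllable
}

for codec, text in samples.items():
    print(codec, byte_lengths(text, codec), text.encode(codec).hex())
```

In both cases the ASCII letter takes a single byte and the other character takes two bytes with the high bit set, which is what lets a decoder tell the two kinds apart.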
ASCII
ASCII, abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices. Because of technical limitations of computer systems at the time it was invented, ASCII has just 128 code points, of which only 95 are printable characters, which severely limited its scope. All modern computer systems instead use Unicode, which has over a million code points, but the first 128 of these are the same as the ASCII set. The Internet Assigned Numbers Authority (IANA) prefers the name US-ASCII for this character encoding. ASCII is one of the IEEE milestones.

Overview
ASCII was developed from telegraph code. Its first commercial use was as a seven-bit teleprinter code promoted by Bell data services. Work on the ASCII standard began in May 1961, with the first meeting of the American Standards Association's (ASA) (now the American Nat ...

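The figures in the ASCII entry (128 code points, 95 of them printable, shared with the start of Unicode) can be checked with a few lines. This sketch relies on Python's `str.isprintable`, which treats the space character as printable; the variable names are illustrative only.

```python
# Sketch: count the printable ASCII characters and confirm that ASCII bytes
# decode to the same text as ASCII and as UTF-8.
ascii_chars = [chr(code) for code in range(128)]
printable = [ch for ch in ascii_chars if ch.isprintable()]

all_bytes = bytes(range(128))
same_as_utf8 = all_bytes.decode("ascii") == all_bytes.decode("utf-8")

print("code points:", len(ascii_chars))
print("printable:", len(printable))        # space (0x20) through tilde (0x7E)
print("first and last printable:", repr(printable[0]), repr(printable[-1]))
print("ASCII bytes decode identically as UTF-8:", same_as_utf8)
```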