Unicode Alias Names And Abbreviations
In Unicode, characters can have a unique name. A character can also have one or more alias names. An alias name can be an abbreviation, a C0 or C1 control name, a correction, an alternate name or a figment. An alias too is unique over all names and aliases, and therefore identifying. Background The formal, primary Unicode name is unique over all names, only uses certain characters & format, and is guaranteed never to change. The formal name consists of characters A–Z (uppercase), 0–9, " " (space), and "-" (hyphen). Next to this name, a character can have one or more formal (normative) alias names. Such an alias name also follows the rules of a name: characters used (A-Z, -, 0-9, <space>) and not used (a-z, %, $, etc.). Alias names are also unique in the full name set (that is, all names and alias names are all unique in their combined set). Alias names are formally described in the Unicode Standard. In this sense, an abbreviation is also considered a Unicode '' ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Unicode
Unicode or ''The Unicode Standard'' or TUS is a character encoding standard maintained by the Unicode Consortium designed to support the use of text in all of the world's writing systems that can be digitized. Version 16.0 defines 154,998 Character (computing), characters and 168 script (Unicode), scripts used in various ordinary, literary, academic, and technical contexts. Unicode has largely supplanted the previous environment of a myriad of incompatible character sets used within different locales and on different computer architectures. The entire repertoire of these sets, plus many additional characters, were merged into the single Unicode set. Unicode is used to encode the vast majority of text on the Internet, including most web pages, and relevant Unicode support has become a common consideration in contemporary software development. Unicode is ultimately capable of encoding more than 1.1 million characters. The Unicode character repertoire is synchronized with Univers ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Unicode Character Property
The Unicode Standard assigns various properties to each Unicode character and code point. The properties can be used to handle characters (code points) in processes, like in line-breaking, script direction right-to-left or applying controls. Some "character properties" are also defined for code points that have no character assigned and code points that are labelled like "<not a character>". The character properties are described in Standard Annex #44. Properties have levels of forcefulness: normative, informative, contributory, or provisional. For simplicity of specification, a character property can be assigned by specifying a continuous range of code points that have the same property. Semantic elements Properties are displayed in the following order: ode ame c c c ecomposition v-dec v-dig v-num m lias; pper case ower case itle case *alias = corrected name. Obsolete. Now tracked with a separate database, but remains for Unicode 1.0 names. *bc = bidi (bidirectional) cat ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
ISO 6429
ANSI escape sequences are a standard for in-band signaling to control cursor location, color, font styling, and other options on video text terminals and terminal emulators. Certain sequences of bytes, most starting with an ASCII escape character and a bracket character, are embedded into text. The terminal interprets these sequences as commands, rather than text to display verbatim. ANSI sequences were introduced in the 1970s to replace vendor-specific sequences and became widespread in the computer equipment market by the early 1980s. Although hardware text terminals have become increasingly rare in the 21st century, the relevance of the ANSI standard persists because a great majority of terminal emulators and command consoles interpret at least a portion of the ANSI standard. History Almost all manufacturers of video terminals added vendor-specific escape sequences to perform operations such as placing the cursor at arbitrary positions on the screen. One example is the V ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Reference Mark
The reference mark or reference symbol "※" is a typographic mark or word used in Chinese, Japanese and Korean (CJK) writing. The symbol was used historically to call attention to an important sentence or idea, such as a prologue or footnote. As an indicator of a note, the mark serves the same purpose as the asterisk in English. However, in Japanese usage, the note text is placed directly into the main text immediately after the reference mark, rather than at the bottom of the page or end of chapter as is the case in English writing. Names The Japanese name, (; , , ), refers to the symbol's visual similarity to the for "rice" (). In Korean, the symbol's name, (), simply means "reference mark". Informally, the symbol is often called (; ), as it is often used to indicate the presence of pool halls, due to its visual similarity to two crossed cue sticks and four billiard balls. In Chinese, the symbol is called () or ( due to its visual similarity to "rice"). It is ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Control Pictures
Control Pictures is a Unicode block containing characters for graphically representing the C0 control codes, and other control characters. Its block name in Unicode 1.0 was Pictures for Control Codes. Block History The following Unicode-related documents record the purpose and process of defining specific characters in the Control Pictures block: See also * ISO 2047 ISO 2047 (Information processing – Graphical representations for the control characters of the 7-bit coded character set) is a standard for graphical representation of the control characters for debugging purposes, such as may be found in t ... References {{reflist Unicode blocks ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Enclosed Alphanumeric Supplement
Enclosed Alphanumeric Supplement is a Unicode block consisting of Latin alphabet characters and Arabic numerals enclosed in circles, ovals or boxes, used for a variety of purposes. It is encoded in the range U+1F100–U+1F1FF in the Supplementary Multilingual Plane. The block is mostly an extension of the Enclosed Alphanumerics block, containing further enclosed alphanumeric characters which are not included in that block or Enclosed CJK Letters and Months. Most of the characters are single alphanumerics in boxes or circles, or with trailing commas. Two of the symbols are identified as dingbats. A number of multiple-letter enclosed abbreviations are also included, mostly to provide compatibility with Broadcast Markup Language standards (see ARIB STD B24 character set) and Japanese telecommunications networks' emoji sets. The block also includes the regional indicator symbols to be used for emoji country flag support. Block Emoji The Enclosed Alphanumeric Supplement ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Tags (Unicode Block)
Tags is a Unicode block containing formatting tag characters. The block is designed to mirror ASCII. It was originally intended for language tags, but has now been repurposed as emoji modifiers, specifically for region flags. Legacy use U+E0001, U+E0020–U+E007F were originally intended for invisibly tagging texts by language but that use is no longer recommended. All of those characters were deprecated in Unicode 5.1. With the release of Unicode 8.0, U+E0020–U+E007E are no longer deprecated characters. The change was made "to clear the way for the potential future use of tag characters for a purpose other than to represent language tags". Unicode states that "the use of tag characters to represent language tags in a plain text stream is still a deprecated mechanism for conveying language information about text". Current use With the release of Unicode 9.0, U+E007F is no longer a deprecated character. (U+E0001 LANGUAGE TAG remains deprecated.) The release of Emoji 5.0 in ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |