ARIB STD B24 Character Set

	Volume 1 of the Association of Radio Industries and Businesses (ARIB) STD-B24 standard for Broadcast Markup Language specifies, amongst other details, a character encoding for use in Japanese-language broadcasting. It was introduced on . The latest revision is version 6.3 as of . It includes a number of characters not found in the base standards (JIS X 0208 and JIS X 0201). It was the source standard for many symbol characters which were added to Unicode, including portions of the Miscellaneous Symbols, Enclosed Alphanumeric Supplement and Enclosed Ideographic Supplement blocks. Its contributions partially overlap the Unicode emoji, but were added a year earlier, in Unicode 5.2. Fascicle 1 of the ARIB STD-B62 standard, published in 2014, defines Unicode mappings for a selection of the B24 extended characters (excluding, for example, those duplicated by JIS X 0213), as well as a few extended Kanji. It also includes a mapping of utilised characters outside the Basic Multilingual Plane to ...

JIS X 0201

	JIS X 0201, a Japanese Industrial Standard developed in 1969 (then called JIS C 6220 until the JIS category reform), was the first Japanese electronic character set to become widely used. It is either a 7-bit encoding or an 8-bit encoding, although the 8-bit form is dominant for modern use (or was until Unicode encodings such as UTF-8 took over). The full name of this standard is "7-bit and 8-bit coded character sets for information interchange". The first 96 codes comprise an ISO 646 variant, mostly following ASCII with some differences, while the second 96 character codes represent the phonetic Japanese katakana signs. Since the encoding does not provide any way to express hiragana or kanji, it is only capable of expressing simplified written Japanese. Nevertheless, it is possible to express, at least phonetically, the full range of sounds in the language. In the 1970s, this was acceptable for media such as text mode computer terminals, telegrams, receipts or other electronically hand ...
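
	The two-half layout described above can be illustrated with a minimal decoder sketch. The function name `decode_jis_x0201` is hypothetical, and the specific mappings are assumptions taken from the usual Unicode correspondence rather than from the text above: 0x5C as the yen sign, 0x7E as the overline, and bytes 0xA1–0xDF as the half-width katakana at U+FF61–U+FF9F. Bytes outside those ranges are simply rejected.

```python
def decode_jis_x0201(data: bytes) -> str:
    """Decode 8-bit JIS X 0201 bytes into a Unicode string (illustrative sketch)."""
    out = []
    for b in data:
        if b == 0x5C:
            out.append("\u00a5")         # assumed: yen sign in place of backslash
        elif b == 0x7E:
            out.append("\u203e")         # assumed: overline in place of tilde
        elif b < 0x80:
            out.append(chr(b))           # remaining 7-bit codes follow ASCII
        elif 0xA1 <= b <= 0xDF:
            out.append(chr(b + 0xFEC0))  # half-width katakana, U+FF61..U+FF9F
        else:
            raise ValueError(f"byte 0x{b:02X} is not assigned in this sketch")
    return "".join(out)


if __name__ == "__main__":
    # Two katakana bytes followed by a price written with the 0x5C code.
    print(decode_jis_x0201(b"\xb1\xb2\x5c100"))
```

	A table-free offset works here only because the katakana half maps to one contiguous Unicode range; a production decoder would more likely use a lookup table.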

Enclosed Alphanumeric Supplement

	Enclosed Alphanumeric Supplement is a Unicode block consisting of Latin alphabet characters and Arabic numerals enclosed in circles, ovals or boxes, used for a variety of purposes. It is encoded in the range U+1F100–U+1F1FF in the Supplementary Multilingual Plane. The block is mostly an extension of the Enclosed Alphanumerics block, containing further enclosed alphanumeric characters which are not included in that block or Enclosed CJK Letters and Months. Most of the characters are single alphanumerics in boxes or circles, or with trailing commas. Two of the symbols are identified as dingbats. A number of multiple-letter enclosed abbreviations are also included, mostly to provide compatibility with Broadcast Markup Language standards (see ARIB STD B24 character set) and Japanese telecommunications networks' emoji sets. The block also includes the regional indicator symbols to be used for emoji country flag support.

Emoji

	The Enclosed Alphanumeric Supplement block conta ...
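
	As a small illustration of the range given above, the following sketch tests whether a character falls in U+1F100–U+1F1FF and builds a flag sequence from regional indicator symbols. Both function names are hypothetical, and the regional indicator range U+1F1E6–U+1F1FF (letters A to Z) is an assumption not stated in the text above.

```python
# Block range as given in the text: U+1F100 through U+1F1FF inclusive.
ENCLOSED_ALNUM_SUPP = range(0x1F100, 0x1F200)

# Assumed start of the regional indicator symbols (letter A).
REGIONAL_INDICATOR_A = 0x1F1E6


def in_enclosed_alphanumeric_supplement(ch: str) -> bool:
    """Return True if the single character ch lies inside the block."""
    return ord(ch) in ENCLOSED_ALNUM_SUPP


def regional_indicator_pair(country_code: str) -> str:
    """Map a two-letter code such as 'JP' to its pair of regional indicators."""
    code = country_code.upper()
    if len(code) != 2 or not all("A" <= c <= "Z" for c in code):
        raise ValueError("expected two ASCII letters")
    return "".join(chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in code)


if __name__ == "__main__":
    flag = regional_indicator_pair("JP")
    print([f"U+{ord(c):04X}" for c in flag])
    print(all(in_enclosed_alphanumeric_supplement(c) for c in flag))
```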

Space Character

	In computer programming, whitespace is any character or series of characters that represents horizontal or vertical space in typography. When rendered, a whitespace character does not correspond to a visible mark, but typically does occupy an area on a page. For example, the common whitespace symbol U+0020 SPACE (also ASCII 32) represents a blank space punctuation character in text, used as a word divider in Western scripts.

Overview

	With many keyboard layouts, a whitespace character may be entered by pressing the space bar. Horizontal whitespace may also be entered on many keyboards with the Tab key, although the length of the space may vary. Vertical whitespace may be input by typing Enter, which creates a 'newline' code sequence in most programs. On older keyboards, this key may instead be labeled Return, a holdover from typewriter keyboards' carriage return keys, which generated an electromechanical return to the left stop (Unicode character U+000D) and a move to the next line (U+000A). Many early computer games used whitesp ...

Pseudographics

	Text-based semigraphics or pseudographics is a primitive method used in early text mode video hardware to emulate raster graphics without having to implement the logic for such a display mode. There are two different ways to accomplish the emulation of raster graphics. The first one is to create a low-resolution all points addressable mode using a set of special characters with all binary combinations of a certain subdivision matrix of the text mode character size; this method is referred to as block graphics, or sometimes mosaic graphics. The second one is to use special shapes instead of glyphs (letters and figures) that appear as if drawn in raster graphics mode, sometimes referred to as semi- or pseudo-graphics; an important example of this is box-drawing characters. Semigraphical characters (including some block elements) are still incorporated into the BIOS of any VGA compatible video card, so any PC can display these characters from the moment it is turned on, even whe ...

JISCII

	Code page 895 (CCSID 895) is a 7-bit character set and is Japan's national ISO 646 variant. It is the Roman set (first or left half) of the JIS X 0201 (formerly JIS C 6220) Japanese Standard and is variously called Japan 7-Bit Latin, JISCII, JIS Roman, JIS C6220-1969-ro, ISO646-JP or Japanese-Roman. Its ISO-IR registration number is 14. Amongst IBM's code pages, it accompanies code page 896 (half-width katakana), which encodes the Kana set of JIS X 0201 with extensions, and code page 897, which encodes the 8-bit form of JIS X 0201. It is used in Unix-like systems and, when combined with code page 896 and the 2-byte IBM code pages 952 and 953, makes up the four code sets of code page 954, one of IBM's versions of EUC-JP. See also: Shift JIS.

ISO-2022-JP

	ISO/IEC 2022, "Information technology - Character code structure and extension techniques", is an ISO/IEC standard (equivalent to the ECMA standard ECMA-35, the ANSI standard ANSI X3.41 and the Japanese Industrial Standard JIS X 0202) in the field of character encoding. Originating in 1971, it was most recently revised in 1994. ISO 2022 specifies a general structure which character encodings can conform to, dedicating particular ranges of bytes (0x00–1F and 0x7F–9F) to be used for non-printing control codes for formatting and in-band instructions (such as line breaks or formatting instructions for text terminals), rather than graphical characters. It also specifies a syntax for escape sequences, multiple-byte sequences beginning with the ESC control code, which can likewise be used for in-band instructions. Specific sets of control codes and escape sequences designed to be used with ISO 2022 include ISO/IEC 6429, portions of which are implemented by ANSI.SYS and terminal emu ...
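
	The reserved control ranges and the role of escape sequences can be observed with a short sketch. The helper `is_control_byte` is hypothetical; the byte ranges come from the text above, and the example relies on Python's built-in `iso2022_jp` codec, which is assumed here to be a reasonable stand-in for an ISO 2022 conforming encoding.

```python
ESC = 0x1B  # the control code that introduces an escape sequence


def is_control_byte(b: int) -> bool:
    """True for bytes in the ranges reserved for control codes: 0x00-1F, 0x7F-9F."""
    return 0x00 <= b <= 0x1F or 0x7F <= b <= 0x9F


def escape_positions(data: bytes) -> list[int]:
    """Return the offsets at which an ESC byte occurs in the encoded data."""
    return [i for i, b in enumerate(data) if b == ESC]


if __name__ == "__main__":
    encoded = "\u3042".encode("iso2022_jp")  # HIRAGANA LETTER A
    print(encoded)
    print(escape_positions(encoded))
```

	Since ESC lies inside the reserved control range, a decoder can recognise the start of an in-band instruction without confusing it with graphical data.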

Private Use Areas

	In Unicode, a Private Use Area (PUA) is a range of code points that, by definition, will not be assigned characters by the Unicode Consortium. Three private use areas are defined: one in the Basic Multilingual Plane (U+E000–U+F8FF), and one each in, and nearly covering, planes 15 and 16 (U+F0000–U+FFFFD, U+100000–U+10FFFD). The code points in these areas cannot be considered as standardized characters in Unicode itself. They are intentionally left undefined so that third parties may define their own characters without conflicting with Unicode Consortium assignments. Under the Unicode Stability Policy, the Private Use Areas will remain allocated for that purpose in all future Unicode versions. Assignments to Private Use Area characters need not be private in the sense of strictly internal to an organisation; a number of assignment schemes have been published by several organisations. Such publication may include a font that supports the definition (showing the glyphs), and software making use of the private-use characters (e ...
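
	A membership test for the three areas might look like the sketch below. The ranges U+E000–U+F8FF, U+F0000–U+FFFFD and U+100000–U+10FFFD are taken from the Unicode Standard, and the function name `is_private_use` is hypothetical. The standard library's `unicodedata.category` reports `Co` for private-use code points, which gives an independent cross-check.

```python
import unicodedata

# The three Private Use Areas as inclusive (first, last) code point pairs.
PRIVATE_USE_RANGES = (
    (0xE000, 0xF8FF),      # Basic Multilingual Plane
    (0xF0000, 0xFFFFD),    # plane 15
    (0x100000, 0x10FFFD),  # plane 16
)


def is_private_use(code_point: int) -> bool:
    """True if code_point falls inside any of the three Private Use Areas."""
    return any(first <= code_point <= last for first, last in PRIVATE_USE_RANGES)


if __name__ == "__main__":
    for cp in (0xE000, 0xF8FF, 0xF900, 0xFFFFD, 0xFFFFE, 0x100000):
        # Compare the range test with the general category from unicodedata.
        print(f"U+{cp:04X}", is_private_use(cp), unicodedata.category(chr(cp)))
```

	The last two code points of planes 15 and 16 are excluded because they are noncharacters, which is why the areas only "nearly" cover those planes.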

Basic Multilingual Plane

	In the Unicode standard, a plane is a continuous group of 65,536 (2^16) code points. There are 17 planes, identified by the numbers 0 to 16, which correspond to the possible values 00–10 (hexadecimal) of the first two positions in the six-position hexadecimal format (U+hhhhhh). Plane 0 is the Basic Multilingual Plane (BMP), which contains most commonly used characters. The higher planes 1 through 16 are called "supplementary planes". The last code point in Unicode is the last code point in plane 16, U+10FFFF. As of Unicode version , five of the planes have assigned code points (characters), and seven are named. The limit of 17 planes is due to UTF-16, which can encode 2^20 code points (16 planes) as pairs of words, plus the BMP as a single word. UTF-8 was designed with a much larger limit of 2^31 (2,147,483,648) code points (32,768 planes), and would still be able to encode 2^21 (2,097,152) code points (32 planes) even under the current limit of 4 bytes. The 17 planes can accommodate 1,114,1 ...
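
	The plane arithmetic above can be checked with a few lines. This is a minimal sketch; the names `plane_of` and `utf16_code_space` are hypothetical, and the plane number is derived by dropping the low 16 bits of the code point.

```python
PLANE_SIZE = 1 << 16       # 65,536 code points per plane
PLANE_COUNT = 17           # planes 0 through 16
MAX_CODE_POINT = 0x10FFFF  # last code point of plane 16


def plane_of(code_point: int) -> int:
    """Return the plane number (0-16) containing code_point."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError("code point outside the Unicode code space")
    return code_point >> 16


def utf16_code_space() -> int:
    """Code points reachable by UTF-16: 2^20 via word pairs plus the 2^16 of the BMP."""
    return (1 << 20) + PLANE_SIZE


if __name__ == "__main__":
    for cp in (0x0041, 0x1F100, 0xF0000, 0x10FFFF):
        print(f"U+{cp:04X} is in plane {plane_of(cp)}")
    print(utf16_code_space() == PLANE_COUNT * PLANE_SIZE)
```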

JIS X 0213

	JIS X 0213 is a Japanese Industrial Standard defining coded character sets for encoding the characters used in Japan. This standard extends JIS X 0208. The first version was published in 2000 and revised in 2004 (JIS2004) and 2012. As well as adding a number of special characters, characters with diacritic marks, etc., it included an additional 3,625 kanji. The full name of the standard is . JIS X 0213 has two "planes" (94×94 character tables). Plane 1 is a superset of JIS X 0208 containing kanji sets level 1 to 3 and non-kanji characters such as Hiragana, Katakana (including letters used to write the Ainu language), Latin, Greek and Cyrillic alphabets, digits, symbols and so on. Plane 2 contains only the level 4 kanji set. The total number of defined characters is 11,233. Each character is capable of being encoded in two bytes. This standard largely replaced the rarely used JIS X 0212-1990 "supplementary" standard, which included 5,801 kanji and 266 non-kanji. Of the additional 3,6 ...
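
	To make the 94×94 table and the two-byte claim concrete, the sketch below converts a row and cell position into a byte pair. It assumes the conventional ISO 2022 layout, in which each coordinate from 1 to 94 is offset by 0x20 (giving bytes 0x21–0x7E), and an EUC-style form that sets the high bit of each byte; neither detail is stated in the text above, and the function names are hypothetical. The example position (row 4, cell 2) is the hiragana letter A in JIS X 0208, of which plane 1 is a superset.

```python
TABLE_SIDE = 94  # each plane is a 94 x 94 table


def position_to_bytes(row: int, cell: int) -> bytes:
    """Map a 1-based (row, cell) position to its 7-bit two-byte form (assumed layout)."""
    if not (1 <= row <= TABLE_SIDE and 1 <= cell <= TABLE_SIDE):
        raise ValueError("row and cell must each be in 1..94")
    return bytes([0x20 + row, 0x20 + cell])


def to_euc_form(pair: bytes) -> bytes:
    """Set the high bit of each byte, as done for two-byte codes in EUC-style encodings."""
    return bytes(b | 0x80 for b in pair)


if __name__ == "__main__":
    pair = position_to_bytes(4, 2)
    print(pair.hex(), to_euc_form(pair).hex())
    # Python's euc_jp codec covers JIS X 0208, which plane 1 extends.
    print(to_euc_form(pair).decode("euc_jp"))
    print("positions available in two planes:", 2 * TABLE_SIDE * TABLE_SIDE)
```

	Two planes offer 17,672 positions in total, so the 11,233 defined characters leave a substantial part of the tables unassigned.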