Videotex character set

The character sets used by Videotex are based, to greater or lesser extents, on ISO/IEC 2022. Three Data Syntax systems are defined by ITU T.101, corresponding to the Videotex systems of different countries.


Data Syntax 1

Data Syntax 1 is defined in Annex B of T.101:1994. It is based on the CAPTAIN system used in Japan. Its graphical sets include JIS X 0201 and JIS X 0208. The G-sets are made available through ISO/IEC 2022-based designation escapes.
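As a rough illustration of how such designation escapes are structured, the following Python sketch classifies an escape sequence by the intermediate bytes that ISO/IEC 2022 assigns to G0 through G3. The function name `parse_designation` and the shape of its return value are assumptions made for this example. The final bytes in the usage note (J and B) are the ones registered for JIS X 0201 Roman and an edition of JIS X 0208; the exact escapes used by Data Syntax 1 should be taken from T.101 itself.

```python
ESC = 0x1B

# Intermediate byte -> (target G-set, size of the set), following the
# ISO/IEC 2022 designation scheme for 94- and 96-character sets.
DESIGNATORS = {
    0x28: ("G0", 94), 0x29: ("G1", 94), 0x2A: ("G2", 94), 0x2B: ("G3", 94),
    0x2D: ("G1", 96), 0x2E: ("G2", 96), 0x2F: ("G3", 96),
}


def parse_designation(seq: bytes):
    """Return (target, set_size, multibyte, identifier) for one escape sequence.

    `identifier` is everything after the designator: any extra intermediate
    bytes followed by the final byte that names the set.
    """
    if len(seq) < 3 or seq[0] != ESC:
        raise ValueError("not an escape sequence long enough to designate a set")
    pos = 1
    multibyte = seq[pos] == 0x24  # '$' marks a multiple-byte set
    if multibyte:
        pos += 1
    if seq[pos] in DESIGNATORS:
        target, size = DESIGNATORS[seq[pos]]
        pos += 1
    elif multibyte:
        target, size = "G0", 94  # short form such as ESC $ B
    else:
        raise ValueError("not a G-set designation")
    identifier = seq[pos:]
    if not identifier or not 0x30 <= identifier[-1] <= 0x7E:
        raise ValueError("missing or invalid final byte")
    return target, size, multibyte, identifier
```

For instance, `parse_designation(b"\x1b(J")` reports a single-byte 94-character set designated to G0, while `parse_designation(b"\x1b$B")` reports a multiple-byte set designated to G0.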


Mosaic sets for Data Syntax 1

The mosaic sets supply characters for use in semigraphics.
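Videotex-style mosaics are commonly built from a 2×3 grid of cells. One way to experiment with such patterns on a modern system is to map a six-bit cell pattern to the block sextant characters that Unicode 13 added in the Symbols for Legacy Computing block. The sketch below is an illustration only: the bit numbering (bit 0 is the top-left cell, bit 5 the bottom-right) and the function name `sextant` are assumptions of this example, and it does not reproduce the code positions of any T.101 mosaic set.

```python
def sextant(pattern: int) -> str:
    """Return a Unicode character for a 2x3 cell pattern given as six bits.

    Assumed bit layout (not taken from T.101):
        bit 0  bit 1
        bit 2  bit 3
        bit 4  bit 5
    """
    if not 0 <= pattern <= 0x3F:
        raise ValueError("pattern must fit in six bits")
    # Four patterns already existed elsewhere in Unicode and are not
    # repeated in the sextant range.
    special = {
        0x00: " ",       # empty cell
        0x15: "\u258c",  # left half block
        0x2A: "\u2590",  # right half block
        0x3F: "\u2588",  # full block
    }
    if pattern in special:
        return special[pattern]
    # The remaining 60 patterns occupy U+1FB00..U+1FB3B in pattern order,
    # skipping the two half-block patterns.
    offset = pattern - 1
    if pattern > 0x15:
        offset -= 1
    if pattern > 0x2A:
        offset -= 1
    return chr(0x1FB00 + offset)
```

Rendering the result requires a font that covers the Symbols for Legacy Computing block.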


Data Syntax 2

Data Syntax 2 is defined in Annex C of T.101:1994. It corresponds to some European Videotex systems such as CEPT T/CD 06-01. The graphical character coding of Data Syntax 2 is based on T.51.

The default G2 set of Data Syntax 2 is based on an older version of T.51. It lacks the non-breaking space, soft hyphen, not sign (¬) and broken bar (¦) present in the current version, but adds a dialytika tonos (΅, whose combining form is U+0344) at the beginning of the row of diacritical marks, for combination with codes from a Greek primary set. An umlaut diacritic code distinct from the diaeresis code, as included in some versions of T.61, is also sometimes included.

The default G1 set is the second mosaic set, corresponding roughly to the second mosaic set of Data Syntax 1. The default G3 set is the third mosaic set, which matches the first mosaic set of Data Syntax 1 for 0x60 through 0x6D and 0x70 through 0x7D and differs elsewhere. The first mosaic set matches the second except for 0x40 through 0x5E: 0x40 through 0x5A follow ASCII (supplying uppercase letters), whereas the remainder are national variant characters; the displaced full block is placed at 0x7F.

* Representation of 0x5B through 0x5E is not guaranteed in international communication and may be replaced by national application-oriented variants.
* 0x5F may be displayed either as ⌗ (square) or _ (lower bar) to represent the terminator function required by Videotex services.


Data Syntax 3

Data Syntax 3 is defined in Annex D of T.101:1994. The graphical character coding of Data Syntax 3 is based on T.51. The supplementary set for Data Syntax 3 is based on an older version of T.51. It lacks the non-breaking space, soft hyphen, not sign (¬) and broken bar (¦) present in the current version, and allocates non-spacing marks for a "vector overbar" and solidus, along with several semigraphic characters, to space that is unallocated in that set. See the comments in the T.51 article for caveats about the Unicode mappings of the combining marks. Unlike Unicode combining characters, T.51 diacritic codes precede the base character.
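The ordering difference matters when converting to Unicode: a diacritic code has to be held back until its base character arrives and is then emitted after it. The Python sketch below shows one possible way to do this. The table `DIACRITIC_TO_COMBINING` is a small, hypothetical subset chosen for illustration (its code positions and mappings should be checked against T.51 before any real use), and the token format is likewise an assumption of this example.

```python
import unicodedata

# Hypothetical partial table for illustration only: 7-bit positions of a few
# non-spacing diacritics and the Unicode combining marks assumed for them.
DIACRITIC_TO_COMBINING = {
    0x41: "\u0300",  # grave accent
    0x42: "\u0301",  # acute accent
    0x43: "\u0302",  # circumflex accent
    0x48: "\u0308",  # diaeresis
}


def reorder_diacritics(tokens):
    """Convert a token stream to a Unicode string.

    Each token is either a str (an already decoded base character) or an int
    (a diacritic code that applies to the NEXT base character).
    """
    out = []
    pending = []  # combining marks waiting for their base character
    for token in tokens:
        if isinstance(token, int):
            pending.append(DIACRITIC_TO_COMBINING[token])
        else:
            out.append(token)
            out.extend(pending)  # Unicode order: base first, then marks
            pending.clear()
    if pending:
        raise ValueError("diacritic code without a following base character")
    # Compose to precomposed characters where Unicode provides them.
    return unicodedata.normalize("NFC", "".join(out))
```

With this sketch, the stream `[0x42, "e"]` (acute, then e) yields the single precomposed character é.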


C0 control codes

The C0 control codes for Videotex differ in part from those of ASCII. Several further codes, including SO (LS1) and SI (LS0), are also available in some or all data syntaxes, but without change in name or semantics from ASCII.
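The locking shifts LS1 and LS0 are the ISO/IEC 2022 names for the SO (0x0E) and SI (0x0F) codes in a 7-bit environment: LS1 invokes the G1 set and LS0 returns to G0. A minimal Python sketch of that behaviour is shown here. The function name `tag_with_gset` is hypothetical, and the sketch deliberately ignores every other control code, single shifts and designation escapes.

```python
SO = 0x0E  # Shift Out, acts as LS1: invoke G1
SI = 0x0F  # Shift In, acts as LS0: invoke G0


def tag_with_gset(data: bytes):
    """Return (gset, code) pairs for the bytes 0x20..0x7F in `data`.

    Only the two locking shifts are tracked; all other control codes are
    skipped in this simplified sketch.
    """
    active = "G0"  # assumed initial state for this example
    tagged = []
    for byte in data:
        if byte == SO:
            active = "G1"
        elif byte == SI:
            active = "G0"
        elif 0x20 <= byte <= 0x7F:
            tagged.append((active, byte))
    return tagged
```

A real decoder would then look each tagged code up in whichever set is currently designated as G0 or G1.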


C1 control codes

Specialised C1 control codes are used in Videotex. There are four registered sets, with some differences between them.

