
This is a list of some binary codes that are (or have been) used to represent text as a sequence of binary digits "0" and "1". Fixed-width binary codes use a set number of bits to represent each character in the text, while in variable-width binary codes, the number of bits may vary from character to character. Computers use such codes to store and exchange text.


Five-bit binary codes

Several different five-bit codes were used for early punched tape systems. Five bits per character allows only 32 different characters, so many of the five-bit codes used two sets of characters per value, referred to as FIGS (figures) and LTRS (letters), and reserved two characters to switch between these sets. This effectively allowed the use of 60 characters.

Standard five-bit codes are:
* International Telegraph Alphabet No. 1 (ITA1) – Also commonly referred to as Baudot code
* International Telegraph Alphabet No. 2 (ITA2) – Also commonly referred to as Murray code
* American Teletypewriter code (USTTY) – A variant of ITA2 used in the USA
* DIN 66006 – Developed for the presentation of ALGOL/ALCOR programs on paper tape and punch cards

The following early computer systems each used its own five-bit code:
* J. Lyons and Co. LEO (Lyons Electronic Office)
* English Electric DEUCE
* University of Illinois at Urbana-Champaign ILLIAC
* ZEBRA
* EMI 1100
* Ferranti Mercury, Pegasus, and Orion systems

The steganographic code commonly known as Bacon's cipher uses groups of 5 binary-valued elements to represent letters of the alphabet.
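The FIGS/LTRS shift mechanism described in this section can be sketched as a small state machine. The two code tables below are hypothetical toy tables invented for illustration, not the real ITA2 assignments, and the numeric values picked for the two shift codes are likewise an assumption of this sketch.

```python
# Shift codes; the values are chosen for illustration only.
LTRS, FIGS = 0b11111, 0b11011

# Hypothetical toy tables (NOT the real ITA2 assignments).
LETTERS = {0b00001: "A", 0b00010: "B", 0b00011: "C", 0b00100: " "}
FIGURES = {0b00001: "1", 0b00010: "2", 0b00011: "3", 0b00100: " "}


def decode(codes):
    """Decode five-bit values, switching tables whenever a shift code arrives."""
    table = LETTERS  # assume the receiver starts in letters mode
    out = []
    for code in codes:
        if code == LTRS:
            table = LETTERS
        elif code == FIGS:
            table = FIGURES
        else:
            out.append(table[code])
    return "".join(out)


def usable_characters(bits=5, shift_codes=2, sets=2):
    """Count printable slots: each set loses the positions taken by shift codes."""
    return sets * (2 ** bits - shift_codes)


print(decode([0b00001, 0b00010, FIGS, 0b00001, 0b00010, LTRS, 0b00011]))
print(usable_characters())
```

With two shift codes reserved in each of the two sets, the count works out to 2 × (32 − 2) = 60, which matches the figure quoted above.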


Six-bit binary codes

Six bits per character allows 64 distinct characters to be represented.

Examples of six-bit binary codes are:
* International Telegraph Alphabet No. 4 (ITA4)
* Six-bit BCD (Binary Coded Decimal), used by early mainframe computers
* Six-bit ASCII – A subset of the original seven-bit ASCII
* Braille – Braille characters are represented using six dot positions, arranged in a rectangle. Each position may contain a raised dot or not, so Braille can be considered to be a six-bit binary code.

See also: Six-bit character codes
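To illustrate how a Braille cell can be read as a six-bit value, the sketch below maps dot positions 1 to 6 onto bits 0 to 5. That bit order is the one used by the Unicode Braille Patterns block (starting at U+2800); treating it as "the" six-bit code for Braille is an assumption of this example, and the function names are made up for illustration.

```python
def braille_cell(dots):
    """Return the six-bit value for a cell, given its raised dot numbers (1-6)."""
    mask = 0
    for dot in dots:
        if not 1 <= dot <= 6:
            raise ValueError(f"dot number out of range: {dot}")
        mask |= 1 << (dot - 1)  # dot n sets bit n-1
    return mask


def braille_char(dots):
    """Return the Unicode Braille pattern character for the given dots."""
    return chr(0x2800 + braille_cell(dots))


# The letter "c" uses dots 1 and 4.
print(format(braille_cell([1, 4]), "06b"))
print(braille_char([1, 4]))
```

All six dots raised gives the value 63, so every cell fits in the 64 combinations that six bits allow.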


Seven-bit binary codes

Examples of seven-bit binary codes are:
* International Telegraph Alphabet No. 3 (ITA3) – Derived from the Moore ARQ code, and also known as the RCA
* ASCII – The ubiquitous ASCII code was originally defined as a seven-bit character set. The ASCII article provides a detailed set of equivalent standards and variants. In addition, there are various extensions of ASCII to eight bits (see Eight-bit binary codes)
* CCIR 476 – Extends ITA2 from 5 to 7 bits, using the extra 2 bits as check digits
* International Telegraph Alphabet No. 4 (ITA4)
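As a small illustration of ASCII as a seven-bit code, the following sketch prints each character of a string as a seven-digit bit pattern. The helper name `to_seven_bit` is made up for this example; it relies on Python's built-in `ascii` codec, which rejects anything outside the seven-bit range.

```python
def to_seven_bit(text):
    """Return one seven-digit bit string per ASCII character in the text."""
    # str.encode("ascii") raises UnicodeEncodeError for non-ASCII input.
    return [format(byte, "07b") for byte in text.encode("ascii")]


print(to_seven_bit("Hi"))
```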


Eight-bit binary codes

* Extended ASCII – A number of standards extend ASCII to eight bits by adding a further 128 characters, such as:
** HP Roman
** ISO/IEC 8859
** Mac OS Roman
** Windows-1252
* EBCDIC – Used in early IBM computers and current IBM i and System z systems.
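The sketch below shows how the same character can map to different byte values under different eight-bit codes. It uses codec names from Python's standard library: `latin-1` (ISO/IEC 8859-1), `cp1252` (Windows-1252), `mac_roman` (Mac OS Roman) and `cp037` (one EBCDIC code page). The choice of these four codecs, and of `cp037` as the EBCDIC representative, is an assumption made for the example.

```python
CODECS = ("latin-1", "cp1252", "mac_roman", "cp037")


def byte_values(ch):
    """Return the single byte each codec assigns to the character."""
    return {name: ch.encode(name)[0] for name in CODECS}


for ch in ("A", "\u00e9"):  # "A" and "é"
    values = byte_values(ch)
    print(ch, {name: hex(value) for name, value in values.items()})
```

The ASCII-derived codes agree on "A", while EBCDIC places it elsewhere. The extensions of ASCII differ from one another only in the upper 128 positions, which is where "é" lands.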


10-bit binary codes

* AUTOSPEC – Also known as Bauer code. AUTOSPEC sends each five-bit character twice, but if the character has odd parity, the repetition is inverted.
* Decabit – A datagram of electronic pulses which are commonly transmitted through power lines. Decabit is mainly used in Germany and other European countries.
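The AUTOSPEC rule stated above (repeat the five-bit character, inverting the repetition when the character has odd parity) can be sketched as follows. Placing the original character in the upper five bits of the ten-bit word is an assumption of this example, as is the function name; the source does not specify the transmission order.

```python
def autospec_encode(code):
    """Build a ten-bit word from a five-bit character using the repeat/invert rule."""
    if not 0 <= code < 32:
        raise ValueError("expected a five-bit value")
    ones = bin(code).count("1")
    # Odd parity: the repetition is the bitwise inverse of the character.
    repetition = code ^ 0b11111 if ones % 2 else code
    # Assumed layout: original in the high five bits, repetition in the low five.
    return (code << 5) | repetition


print(format(autospec_encode(0b00001), "010b"))  # odd parity, repetition inverted
print(format(autospec_encode(0b00011), "010b"))  # even parity, plain repetition
```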


16-bit binary codes

* UCS-2 – An obsolete encoding capable of representing the basic multilingual plane of Unicode
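A minimal sketch of what "capable of representing the basic multilingual plane" means in practice: every code point up to U+FFFF fits in one 16-bit unit, and anything above that cannot be encoded. The function name and the big-endian byte order are assumptions made for this example.

```python
def ucs2_encode(text):
    """Encode text as one big-endian 16-bit unit per code point (BMP only)."""
    out = bytearray()
    for ch in text:
        code_point = ord(ch)
        if code_point > 0xFFFF:
            raise ValueError(f"U+{code_point:X} is outside the basic multilingual plane")
        out += code_point.to_bytes(2, "big")
    return bytes(out)


print(ucs2_encode("A\u20ac").hex())  # "A" followed by the euro sign
```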


32-bit binary codes

* UTF-32/UCS-4 – A four-bytes-per-character representation of Unicode
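The fixed four-byte width can be checked with Python's built-in `utf-32-be` codec, as in the sketch below. The helper name is made up for this example, and the big-endian variant is chosen only because it does not prepend a byte-order mark.

```python
def utf32_units(text):
    """Return the four-byte unit for each character, as hexadecimal strings."""
    data = text.encode("utf-32-be")  # big-endian variant: no byte-order mark
    return [data[i:i + 4].hex() for i in range(0, len(data), 4)]


print(utf32_units("A\u20ac\U0001F600"))
```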


Variable-length binary codes

* UTF-8 – Encodes characters in a way that is mostly compatible with ASCII but can also encode the full repertoire of Unicode characters with sequences of up to four 8-bit bytes.
* UTF-16 – Extends UCS-2 to cover the whole of Unicode with sequences of one or two 16-bit elements
* GB 18030 – A full-Unicode variable-length code designed for compatibility with older Chinese multibyte encodings
* Huffman coding – A technique for expressing more common characters using shorter bit strings than are used for less common characters
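To make the variable widths of UTF-8 and UTF-16 in the list above concrete, the sketch below measures how many bytes each encoding needs for a few sample characters, using Python's built-in codecs. The helper name and the choice of sample characters are assumptions made for illustration.

```python
def encoded_widths(ch):
    """Return the number of bytes UTF-8 and UTF-16 use for one character."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),  # BE variant: no byte-order mark
    }


for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    print(f"U+{ord(ch):04X}", encoded_widths(ch))
```

UTF-8 grows from one to four bytes across these samples, while UTF-16 uses one 16-bit element for the first three and two elements for the last.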
Data compression systems such as Lempel–Ziv–Welch can compress arbitrary binary data. They are therefore not binary codes themselves but may be applied to binary codes to reduce storage needs.
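Huffman coding, listed above, assigns shorter bit strings to more common characters. The following is a minimal sketch of one common way to build such a code with a priority queue; it is an illustration under simple assumptions (character frequencies taken from the input text itself, ties broken by insertion order), not a reference implementation.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Return a {character: bit string} table built from character frequencies."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}  # a lone symbol still needs one bit
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(weight, i, {sym: ""}) for i, (sym, weight) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # two lightest subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in left.items()}
        merged.update({sym: "1" + code for sym, code in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


codes = huffman_code("aaaabbc")
print(codes)
```

In this input "a" is the most frequent character, so it receives a one-bit code while "b" and "c" each receive two bits.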


Other

* Morse code is a variable-length telegraphy code, which traditionally uses a series of long and short pulses to encode characters. It relies on gaps between the pulses to provide separation between letters and words, as the letter codes do not have the "prefix property". This means that Morse code is not necessarily a binary system, but in a sense may be a ternary system, with a 10 for a "dit" or "dot", a 1110 for a "dah" or "dash", and a 00 for a single unit of separation. Morse code can be represented as a binary stream by allowing each bit to represent one unit of time. Thus a "dit" is represented as a 1 bit, while a "dah" is represented as three consecutive 1 bits. Spaces between symbols, letters, and words are represented as one, three, or seven consecutive 0 bits. For example, in "NO U" the letter N is dash-dot, O is dash-dash-dash, and U is dot-dot-dash, which could be represented in binary as "1110100011101110111000000010101110". If, however, Morse code is represented as a ternary system, "NO U" would be represented as "1110, 10, 00, 1110, 1110, 1110, 00, 00, 00, 10, 10, 1110".
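The ternary scheme described above (10 for a dit, 1110 for a dah, 00 for a unit of separation) can be sketched in a few lines. The table below holds only the three letters needed for the "NO U" example, and the function name is made up for illustration; one 00 is inserted between letters and three between words, which together with the trailing 0 of the preceding symbol yields the three-unit and seven-unit gaps.

```python
# Only the letters needed for the example; "." is a dit and "-" is a dah.
MORSE = {"N": "-.", "O": "---", "U": "..-"}


def morse_to_bits(message):
    """Encode a message as a bit stream where each bit is one unit of time."""
    words = []
    for word in message.split(" "):
        letters = [
            "".join("10" if symbol == "." else "1110" for symbol in MORSE[ch])
            for ch in word
        ]
        words.append("00".join(letters))  # one extra separation unit pair between letters
    return "000000".join(words)           # three separation pairs between words


print(morse_to_bits("NO U"))
```

The result matches the 34-bit string given in the text, including the final 0 that closes the last dah.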


See also

* List of computer character sets

