Chen–Ho Encoding
   HOME

TheInfoList



OR:

Chen–Ho encoding is a memory-efficient alternate system of
binary Binary may refer to: Science and technology Mathematics * Binary number, a representation of numbers using only two digits (0 and 1) * Binary function, a function that takes two arguments * Binary operation, a mathematical operation that t ...
encoding for
decimal The decimal numeral system (also called the base-ten positional numeral system and denary or decanary) is the standard system for denoting integer and non-integer numbers. It is the extension to non-integer numbers of the Hindu–Arabic numeral ...
digits. The traditional system of binary encoding for decimal digits, known as
binary-coded decimal In computing and electronic systems, binary-coded decimal (BCD) is a class of binary encodings of decimal numbers where each digit is represented by a fixed number of bits, usually four or eight. Sometimes, special bit patterns are used for ...
(BCD), uses four bits to encode each digit, resulting in significant wastage of binary data bandwidth (since four bits can store 16 states and are being used to store only 10), even when using
packed BCD In computing and electronic systems, binary-coded decimal (BCD) is a class of binary encodings of decimal numbers where each digit is represented by a fixed number of bits, usually four or eight. Sometimes, special bit patterns are used for ...
. The encoding reduces the storage requirements of two decimal digits (100 states) from 8 to 7 bits, and those of three decimal digits (1000 states) from 12 to 10 bits using only simple Boolean transformations avoiding any complex arithmetic operations like a
base conversion Positional notation (or place-value notation, or positional numeral system) usually denotes the extension to any base of the Hindu–Arabic numeral system (or decimal system). More generally, a positional system is a numeral system in which th ...
.


History

In what appears to have been a
multiple discovery Multiple may refer to: Economics *Multiple finance, a method used to analyze stock prices *Multiples of the P/E, price-to-earnings ratio *Chain stores, are also referred to as 'Multiples' *Box office multiple, the ratio of a film's total gross to ...
, some of the concepts behind what later became known as Chen–Ho encoding were independently developed by Theodore M. Hertz in 1969 and by
Tien Chi Chen Tien may refer to: *Tian, also known as Tien or T'ien, the Chinese religious idea of God or heaven *Tian (surname), also romanized as Tien *Tien (TV channel), a Dutch television channel *Tiền, currency used in Vietnam during the 19th and 20th cen ...
() (1928–) in 1971. Hertz of Rockwell filed a patent for his encoding in 1969, which was granted in 1971. Chen first discussed his ideas with Irving Tze Ho () (1921–2003) in 1971. Chen and Ho were both working for IBM at the time, albeit in different locations. Chen also consulted with Frank Chin Tung to verify the results of his theories independently. IBM filed a patent in their name in 1973, which was granted in 1974. At least by 1973, Hertz's earlier work must have been known to them, as the patent cites his patent as
prior art Prior art (also known as state of the art or background art) is a concept in patent law used to determine the patentability of an invention, in particular whether an invention meets the novelty and the inventive step or non-obviousness criteria f ...
. With input from Joseph D. Rutledge and John C. McPherson, the final version of the Chen–Ho encoding was circulated inside IBM in 1974 and published in 1975 in the journal ''
Communications of the ACM ''Communications of the ACM'' is the monthly journal of the Association for Computing Machinery (ACM). It was established in 1958, with Saul Rosen as its first managing editor. It is sent to all ACM members. Articles are intended for readers with ...
''. This version included several refinements, primarily related to the application of the encoding system. It constitutes a Huffman-like
prefix code A prefix code is a type of code system distinguished by its possession of the "prefix property", which requires that there is no whole code word in the system that is a prefix (initial segment) of any other code word in the system. It is trivially t ...
. The encoding was referred to as ''Chen and Ho's scheme'' in 1975, ''Chen's encoding'' in 1982 and became known as ''Chen–Ho encoding'' or ''Chen–Ho algorithm'' since 2000. After having filed a patent for it in 2001, Michael F. Cowlishaw published a further refinement of Chen–Ho encoding known as
densely packed decimal Densely packed decimal (DPD) is an efficient method for binary encoding decimal digits. The traditional system of binary encoding for decimal digits, known as binary-coded decimal (BCD), uses four bits to encode each digit, resulting in signif ...
(DPD) encoding in ''IEE Proceedings – Computers and Digital Techniques'' in 2002. DPD has subsequently been adopted as the ''decimal encoding'' used in the
IEEE 754-2008 The Institute of Electrical and Electronics Engineers (IEEE) is a 501(c)(3) professional association for electronic engineering and electrical engineering (and associated disciplines) with its corporate office in New York City and its operation ...
and ISO/IEC/IEEE 60559:2011
floating-point In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. For example, 12.345 can b ...
standards.


Application

Chen noted that the digits zero through seven were simply encoded using three binary digits of the corresponding
octal The octal numeral system, or oct for short, is the radix, base-8 number system, and uses the Numerical digit, digits 0 to 7. This is to say that 10octal represents eight and 100octal represents sixty-four. However, English, like most languages, ...
group. He also postulated that one could use a
flag A flag is a piece of fabric (most often rectangular or quadrilateral) with a distinctive design and colours. It is used as a symbol, a signalling device, or for decoration. The term ''flag'' is also used to refer to the graphic design empl ...
to identify a different encoding for the digits eight and nine, which would be encoded using a single bit. In practice, a series of Boolean transformations are applied to the stream of input bits, compressing BCD encoded digits from 12 bits per three digits to 10 bits per three digits. Reversed transformations are used to decode the resulting coded stream to BCD. Equivalent results can also be achieved by the use of a look-up table. Chen–Ho encoding is limited to encoding sets of three decimal digits into groups of 10 bits (so called '' declets''). Of the 1024 states possible by using 10 bits, it leaves only 24 states unused (with
don't care In digital logic, a don't-care term (abbreviated DC, historically also known as ''redundancies'', ''irrelevancies'', ''optional entries'', ''invalid combinations'', ''vacuous combinations'', ''forbidden combinations'', ''unused states'' or ''l ...
bits typically set to 0 on write and ignored on read). With only 0.34% wastage it gives a 20% more efficient encoding than BCD with one digit in 4 bits. Both, Hertz and Chen also proposed similar, but less efficient, encoding schemes to compress sets of two decimal digits (requiring 8 bits in BCD) into groups of 7 bits. Larger sets of decimal digits could by divided into three- and two-digit groups. The patents also discuss the possibility to adapt the scheme to digits encoded in any other decimal codes than 8-4-2-1 BCD, like f.e.
Excess-3 Excess-3, 3-excess or 10-excess-3 binary code (often abbreviated as XS-3, 3XS or X3), shifted binary or Stibitz code (after George Stibitz, who built a relay-based adding machine in 1937) is a self-complementary binary-coded decimal (BCD) cod ...
, Excess-6, Jump-at-2, Jump-at-8,
Gray Grey (more common in British English) or gray (more common in American English) is an intermediate color between black and white. It is a neutral or achromatic color, meaning literally that it is "without color", because it can be composed o ...
, Glixon, O'Brien type-I and Gray–Stibitz code. The same principles could also be applied to other bases. In 1973, some form of Chen–Ho encoding appears to have been utilized in the address conversion hardware of the optional
IBM 7070 IBM 7070 was a decimal-architecture intermediate data-processing system that was introduced by IBM in 1958. It was part of the IBM 700/7000 series, and was based on discrete transistors rather than the vacuum tubes of the 1950s. It was the compa ...
/ 7074 emulation feature for the
IBM System/370 Model 165 The IBM System/370 Model 165 (and the Model 155) were jointly announced June 30, 1970 as "designed for ... the Seventies." That same day IBM announced the 370/195. They were the first three models of the IBM System/370 line of computers. Th ...
and 370 Model 168 computers. One prominent application uses a 128-bit register to store 33 decimal digits with a three digit exponent, effectively not less than what could be achieved using binary encoding (whereas BCD encoding would need 144 bits to store the same number of digits).


Encodings for two decimal digits


Hertz encoding

* This encoding is not parity-preserving.


Early Chen–Ho encoding, method A

* This encoding is not parity-preserving.


Early Chen–Ho encoding, method B

* This encoding is not parity-preserving.


Patented and final Chen–Ho encoding

* Assuming certain values for the don't-care bits (f.e. 0), this encoding is parity-preserving.


Encodings for three decimal digits


Hertz encoding

* This encoding is not parity-preserving.


Early Chen–Ho encoding

* This encoding is not parity-preserving.


Patented Chen–Ho encoding

* This encoding is not parity-preserving.


Final Chen–Ho encoding

* This encoding is not parity-preserving.


Storage efficiency


See also

*
Binary-coded decimal In computing and electronic systems, binary-coded decimal (BCD) is a class of binary encodings of decimal numbers where each digit is represented by a fixed number of bits, usually four or eight. Sometimes, special bit patterns are used for ...
(BCD) *
Densely packed decimal Densely packed decimal (DPD) is an efficient method for binary encoding decimal digits. The traditional system of binary encoding for decimal digits, known as binary-coded decimal (BCD), uses four bits to encode each digit, resulting in signif ...
(DPD) *
DEC RADIX 50 RADIX 50 or RAD50 (also referred to as RADIX50, RADIX-50 or RAD-50), is an uppercase-only character encoding created by Digital Equipment Corporation (DEC) for use on their DECSYSTEM-20, DECsystem, Programmed Data Processor, PDP, and VAX com ...
/ MOD40 *
IBM SQUOZE SQUOZE (abbreviated as SQZ) is a memory-efficient representation of a combined source and relocatable object program file with a symbol table on punched cards which was introduced in 1958 with the SCAT assembler on the SHARE Operating System ( ...
*
Packed BCD In computing and electronic systems, binary-coded decimal (BCD) is a class of binary encodings of decimal numbers where each digit is represented by a fixed number of bits, usually four or eight. Sometimes, special bit patterns are used for ...
*
Unicode transformation format Unicode, formally The Unicode Standard,The formal version reference is is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, whic ...
(UTF) (similar encoding scheme) *
Length-limited Huffman code In computer science and information theory, a Huffman code is a particular type of optimal prefix code that is commonly used for lossless data compression. The process of finding or using such a code proceeds by means of Huffman coding, an algor ...


Notes


References


Further reading

* * * (60 pages

(40 pages

and (11 pages

(NB. Three expired patents cited in both, the and s.) * * * {{DEFAULTSORT:Chen-Ho encoding Binary arithmetic