Verhoeff algorithm
   HOME

TheInfoList



OR:

The Verhoeff algorithm is a
checksum A checksum is a small-sized block of data derived from another block of digital data for the purpose of detecting errors that may have been introduced during its transmission or storage. By themselves, checksums are often used to verify dat ...
for
error detection In information theory and coding theory with applications in computer science and telecommunications, error detection and correction (EDAC) or error control are techniques that enable reliable delivery of digital data over unreliable communi ...
first published by Dutch mathematician Jacobus Verhoeff in 1969. It was the first decimal
check digit A check digit is a form of redundancy check used for Error detection and correction, error detection on identification numbers, such as bank account numbers, which are used in an application where they will at least sometimes be input manually. It ...
algorithm which detects all single-digit errors, and all transposition errors involving two adjacent digits, which was at the time thought impossible with such a code. The method was independently discovered by H. Peter Gumm in 1985, this time including a
formal proof In logic and mathematics, a formal proof or derivation is a finite sequence of sentences (known as well-formed formulas when relating to formal language), each of which is an axiom, an assumption, or follows from the preceding sentences in the s ...
and an extension to any base.


Goals

Verhoeff had the goal of finding a decimal code—one where the check digit is a single decimal digit—which detected all single-digit errors and all transpositions of adjacent digits. At the time, supposed proofs of the nonexistence of these codes made base-11 codes popular, for example in the ISBN check digit. His goals were also practical, and he based the evaluation of different codes on live data from the Dutch postal system, using a weighted points system for different kinds of error. The analysis broke the errors down into a number of categories: first, by how many digits are in error; for those with two digits in error, there are ''transpositions'' (''ab'' → ''ba''), ''twins'' (''aa'' → 'bb'), ''jump transpositions'' (''abc'' → ''cba''), ''phonetic'' (''1a'' → ''a0''), and ''jump twins'' (''aba'' → ''cbc''). Additionally there are omitted and added digits. Although the frequencies of some of these kinds of errors might be small, some codes might be immune to them in addition to the primary goals of detecting all singles and transpositions. The phonetic errors in particular showed linguistic effects, because in Dutch, numbers are typically read in pairs; and also while 50 sounds similar to 15 in Dutch, 80 doesn't sound like 18. Taking six-digit numbers as an example, Verhoeff reported the following classification of the errors:.


Description

The general idea of the algorithm is to represent each of the digits (0 through 9) as elements of the
dihedral group In mathematics, a dihedral group is the group (mathematics), group of symmetry, symmetries of a regular polygon, which includes rotational symmetry, rotations and reflection symmetry, reflections. Dihedral groups are among the simplest example ...
D_5. That is, map digits to D_5, manipulate these, then map back into digits. Let this mapping be m: , 9\to D_5 m = \begin 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9\\ e & r & r^2 & r^3 & r^4 & s & rs & r^2s & r^3s & r^4s\end Let the nth digit be a_n and let the number of digits be k . For example given the code 248 then k is 3 and a_3 = m(8) = r^s . Now define the permutation f: D_5 \to D_5 f = \begin e & r & r^2 & r^3 & r^4 & s & rs & r^2s & r^3s & r^4s\\ r & s & r^2s & rs & r^2 & r^3s & r^3 & e & r^4s & r^4\end For example f(r^3) = rs . Another example is f^2(r^3) = r^3 since f(f(r^3)) = f(rs) = r^3 Using multiplicative notation for the group operation of D_5, the check digit is then simply a value c such that f(a_1) \cdot f^2(a_2) \cdot \ldots \cdot f^k(a_k) \cdot f^(c) = e c is explicitly given by inverse permutation c = f^\left(\prod_^k f^n(a_n)^\right) For example the check digit for 248 is 5. To verify this, use the mapping to D_5 and insert into the LHS of the previous equation f(r^2) \cdot f^2(r^4) \cdot f^3(r^3s) \cdot f^4(s) = e To evaluate this permutation quickly use that f^4(s) = f^3(r^3s) = f^2(r^4) = f(r^2) = r^2 s to get that r^2 s \cdot r^2 s \cdot r^2 s \cdot r^2 s = e This is the same reflection being iteratively multiplied. Use that reflections are their own inverse. (r^2 s \cdot r^2 s) \cdot (r^2 s \cdot r^2 s) = e^2 = e In practice the algorithm is implemented using simple
lookup table In computer science, a lookup table (LUT) is an array data structure, array that replaces runtime (program lifecycle phase), runtime computation of a mathematical function (mathematics), function with a simpler array indexing operation, in a proc ...
s without needing to understand how to generate those tables from the underlying group and permutation theory. This is more properly considered a family of algorithms, as other permutations work too. Verhoeff's notes that the particular permutation, given above, is special as it has the property of detecting 95.3% of the phonetic errors. The strengths of the algorithm are that it detects all transliteration and transposition errors, and additionally most twin, twin jump, jump transposition and phonetic errors. The main weakness of the Verhoeff algorithm is its complexity. The calculations required cannot easily be expressed as a formula in say . Lookup tables are required for easy calculation. A similar code is the
Damm algorithm In error detection, the Damm algorithm is a check digit algorithm that detects all single-digit errors and all adjacent transposition errors. It was presented by H. Michael Damm in 2004, as a part of his PhD dissertation entitled ''Totally Antisy ...
, which has similar qualities.


Table-based algorithm

The Verhoeff algorithm can be implemented using three tables: a multiplication table ''d'', an inverse table ''inv'', and a permutation table ''p''. The first table, ''d'', is based on multiplication in the dihedral group D5. and is simply the
Cayley table Named after the 19th-century United Kingdom, British mathematician Arthur Cayley, a Cayley table describes the structure of a finite group by arranging all the possible products of all the group's elements in a square table reminiscent of an additi ...
of the group. Note that this group is not
commutative In mathematics, a binary operation is commutative if changing the order of the operands does not change the result. It is a fundamental property of many binary operations, and many mathematical proofs depend on it. Perhaps most familiar as a pr ...
, that is, for some values of ''j'' and ''k'', ''d''(''j'',''k'') ≠ ''d''(''k'', ''j''). The inverse table ''inv'' represents the
multiplicative inverse In mathematics, a multiplicative inverse or reciprocal for a number ''x'', denoted by 1/''x'' or ''x''−1, is a number which when Multiplication, multiplied by ''x'' yields the multiplicative identity, 1. The multiplicative inverse of a ra ...
of a digit, that is, the value that satisfies ''d''(''j'', ''inv''(''j'')) = 0. The permutation table ''p'' applies a
permutation In mathematics, a permutation of a set can mean one of two different things: * an arrangement of its members in a sequence or linear order, or * the act or process of changing the linear order of an ordered set. An example of the first mean ...
to each digit based on its position in the number. This is actually a single permutation applied iteratively; i.e. ''p''(''i''+''j'',''n'') = ''p''(''i'', ''p''(''j'',''n'')). The Verhoeff checksum calculation is performed as follows: # Create an array ''n'' out of the individual digits of the number, taken from right to left (rightmost digit is ''n''0, etc.). # Initialize the checksum ''c'' to zero. # For each index ''i'' of the array ''n,'' starting at zero, replace ''c'' with . The original number is valid if and only if . To generate a check digit, append a , perform the calculation: the correct check digit is ..


Examples

Generate a check digit for ''236'': ''c'' is 2, so the check digit is ''inv''(2), which is 3. Validate the check digit ''2363'': ''c'' is zero, so the check is correct.


See also

*
Luhn algorithm The Luhn algorithm or Luhn formula (creator: IBM scientist Hans Peter Luhn), also known as the " modulus 10" or "mod 10" algorithm, is a simple check digit formula used to validate a variety of identification numbers. The algorithm is in the pub ...
, earlier (1960) check digit algorithm


References


External links


Detailed description of the Verhoeff algorithm
{{DEFAULTSORT:Verhoeff Algorithm Modular arithmetic Checksum algorithms Error detection and correction 1969 introductions