A check digit is a form of redundancy check used for error detection on identification numbers, such as bank account numbers, which are used in an application where they will at least sometimes be input manually. It is analogous to a binary parity bit used to check for errors in computer-generated data. It consists of one or more digits (or letters) computed by an algorithm from the other digits (or letters) in the sequence input.
With a check digit, one can detect simple errors in the input of a series of characters (usually digits) such as a single mistyped digit or some permutations of two successive digits.
Design
Check digit algorithms are generally designed to capture ''human'' transcription errors. In order of complexity, these include the following:
* letter/digit errors, such as l → 1 or O → 0
* single-digit errors, such as 1 → 2
* transposition errors, such as 12 → 21
* twin errors, such as 11 → 22
* jump transposition errors, such as 132 → 231
* jump twin errors, such as 131 → 232
* phonetic errors, such as 60 → 16 ("sixty" to "sixteen")
In choosing a system, a high probability of catching errors is traded off against implementation difficulty; simple check digit systems are easily understood and implemented by humans but do not catch as many errors as complex ones, which require sophisticated programs to implement.
A desirable feature is that left-padding with zeros should not change the check digit. This allows variable length numbers to be used and the length to be changed.
If there is a single check digit added to the original number, the system will not always capture ''multiple'' errors, such as two replacement errors (12 → 34) though, typically, double errors will be caught 90% of the time (both changes would need to change the output by offsetting amounts).
A very simple check digit method would be to take the sum of all digits (digital sum) modulo 10. This would catch any single-digit error, as such an error would always change the sum, but does not catch any transposition errors (switching two digits) as re-ordering does not change the sum.
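The behaviour of the digit-sum method can be sketched in a few lines of Python. The function name `digit_sum_check` and the sample numbers are illustrative assumptions, not part of any standard.

```python
def digit_sum_check(number: str) -> int:
    """Return the sum of the decimal digits of `number`, modulo 10."""
    return sum(int(ch) for ch in number) % 10


original = "4871"
check = digit_sum_check(original)  # 4 + 8 + 7 + 1 = 20, so the check value is 0

# A single mistyped digit (7 -> 9) changes the sum, so the error is detected.
single_error_detected = digit_sum_check("4891") != check

# Swapping two digits (87 -> 78) leaves the sum unchanged, so the error is missed.
transposition_detected = digit_sum_check("4781") != check

print(check, single_error_detected, transposition_detected)  # 0 True False
```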
A slightly more complex method is to take the weighted sum of the digits, modulo 10, with different weights for each number position.
For example, if the weights for a four-digit number were 5, 3, 2, 7 and the number to be coded was 4871, then one would take 5×4 + 3×8 + 2×7 + 7×1 = 65. Since 65 modulo 10 is 5, the check digit would be 5, giving 48715.
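The weighted-sum calculation for 4871 can be reproduced with a short Python sketch. The helper name `weighted_check_digit` is an assumption made for illustration.

```python
def weighted_check_digit(digits: str, weights: list[int]) -> int:
    """Weighted sum of the digits, reduced modulo 10."""
    if len(digits) != len(weights):
        raise ValueError("need exactly one weight per digit")
    total = sum(w * int(d) for w, d in zip(weights, digits))
    return total % 10


number = "4871"
check = weighted_check_digit(number, [5, 3, 2, 7])  # 65 % 10
coded = number + str(check)
print(coded)  # 48715
```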
Systems with weights of 1, 3, 7, or 9, with the weights on neighboring numbers being different, are widely used: for example, 31 31 weights in UPC codes, 13 13 weights in EAN numbers (GS1 algorithm), and the 371 371 371 weights used in United States bank routing transit numbers. This system detects all single-digit errors and around 90% of transposition errors. 1, 3, 7, and 9 are used because they are coprime with 10, so changing any digit changes the check digit; using a coefficient that is divisible by 2 or 5 would lose information (because 5×0 = 5×2 = 5×4 = 5×6 = 5×8 = 0 modulo 10) and thus not catch some single-digit errors. Using different weights on neighboring numbers means that most transpositions change the check digit; however, because all weights differ by an even number, this does not catch transpositions of two digits that differ by 5 (0 and 5, 1 and 6, 2 and 7, 3 and 8, 4 and 9), since the 2 and 5 multiply to yield 10.
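The "around 90%" figure for adjacent transpositions can be checked by brute force. The sketch below assumes a pair of neighbouring positions weighted 3 and 1 and enumerates every ordered pair of distinct digits; the names `WEIGHTS`, `checksum` and `missed` are illustrative.

```python
from itertools import permutations

WEIGHTS = (3, 1)  # weights of two neighbouring positions, as in a 31 31 pattern


def checksum(pair):
    """Contribution of two adjacent digits to the weighted sum, modulo 10."""
    a, b = pair
    return (WEIGHTS[0] * a + WEIGHTS[1] * b) % 10


# A transposition goes unnoticed when swapping the digits leaves the sum unchanged.
missed = [(a, b) for a, b in permutations(range(10), 2)
          if checksum((a, b)) == checksum((b, a))]

total = 10 * 9  # ordered pairs of distinct digits
print(missed)   # exactly the pairs whose digits differ by 5
print(f"detected: {total - len(missed)} of {total}")  # detected: 80 of 90
```

Swapping the digits changes the sum by 2 × (a − b), which is a multiple of 10 only when the digits differ by 5, so 80 of the 90 possible transpositions (about 89%) are caught.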
The ISBN-10 code instead uses modulo 11, which is prime, and all the number positions have different weights 1, 2, ... 10. This system thus detects all single-digit substitution and transposition errors (including jump transpositions), but at the cost of the check digit possibly being 10, represented by "X". (An alternative is simply to avoid using the serial numbers which result in an "X" check digit.) ISBN-13 instead uses the GS1 algorithm used in EAN numbers.
More complicated algorithms include the Luhn algorithm (1954), which captures 98% of single-digit transposition errors (it does not detect 90 ↔ 09) and the still more sophisticated Verhoeff algorithm (1969), which catches all single-digit substitution and transposition errors, and many (but not all) more complex errors. Similar is another abstract algebra-based method, the Damm algorithm (2004), that too detects all single-digit errors and all adjacent transposition errors. These three methods use a single check digit and will therefore fail to capture around 10% of more complex errors. To reduce this failure rate, it is necessary to use more than one check digit (for example, the modulo 97 check referred to below, which uses two check digits; for the algorithm, see International Bank Account Number) and/or to use a wider range of characters in the check digit, for example letters plus numbers.
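As an illustration of the first of these, here is a minimal Python sketch of the Luhn algorithm as it is commonly described: every second digit, counting from the right, is doubled, and the digits of the products are summed. The function names and the sample payload `7992739871` are assumptions chosen for the example.

```python
def luhn_checksum(number: str) -> int:
    """Luhn sum (modulo 10) of a digit string that includes its check digit."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:   # every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9   # same as adding the two digits of the product
        total += d
    return total % 10


def luhn_check_digit(payload: str) -> int:
    """Check digit that makes payload + check digit pass the Luhn test."""
    return (10 - luhn_checksum(payload + "0")) % 10


def luhn_is_valid(number: str) -> bool:
    return luhn_checksum(number) == 0


print(luhn_check_digit("7992739871"))  # 3
print(luhn_is_valid("79927398713"))    # True

# The one adjacent transposition Luhn cannot see: 90 <-> 09.
print(luhn_check_digit("90") == luhn_check_digit("09"))  # True
print(luhn_check_digit("12") == luhn_check_digit("21"))  # False
```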
Examples
UPC, EAN, GLN, GTIN, numbers administered by GS1
The final digit of a Universal Product Code, International Article Number, Global Location Number or Global Trade Item Number is a check digit computed as follows:
# Add the digits in the odd-numbered positions from the left (first, third, fifth, etc.—not including the check digit) together and multiply by three.
# Add the digits (up to but not including the check digit) in the even-numbered positions (second, fourth, sixth, etc.) to the result.
# Take the remainder of the result divided by 10 (i.e. the modulo 10 operation). If the remainder is equal to 0 then use 0 as the check digit, and if not 0 subtract the remainder from 10 to derive the check digit.
A GS1 check digit calculator and detailed documentation are online at GS1's website. Another official calculator page shows that the mechanism for GTIN-13 is the same for Global Location Number (GLN).
For instance, the UPC-A barcode for a box of tissues is "036000241457". The last digit is the check digit "7", and if the other numbers are correct then the check digit calculation must produce 7.
# Add the odd number digits: 0+6+0+2+1+5 = 14.
# Multiply the result by 3: 14 × 3 = 42.
# Add the even number digits: 3+0+0+4+4 = 11.
# Add the two results together: 42 + 11 = 53.
# To calculate the check digit, take the remainder of (53 / 10), which is also known as (53 modulo 10), and if not 0, subtract from 10. Therefore, the check digit value is 7. i.e. (53 / 10) = 5 remainder 3; 10 - 3 = 7.
Another example: to calculate the check digit for the following food item "01010101010''x''".
# Add the odd number digits: 0+0+0+0+0+0 = 0.
# Multiply the result by 3: 0 × 3 = 0.
# Add the even number digits: 1+1+1+1+1 = 5.
# Add the two results together: 0 + 5 = 5.
# To calculate the check digit, take the remainder of (5 / 10), which is also known as (5 modulo 10), and if not 0, subtract from 10: i.e. (5 / 10) = 0 remainder 5; (10 - 5) = 5. Therefore, the check digit ''x'' value is 5.
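Both worked examples can be reproduced with a short Python sketch. It assumes the weights are applied from the right, with the rightmost data digit receiving weight 3 and the weights then alternating 1, 3, 1, and so on; for the 11 data digits of a UPC-A code this coincides with multiplying the odd-numbered positions from the left by three. The function name `gs1_check_digit` is illustrative.

```python
def gs1_check_digit(data: str) -> int:
    """Check digit for a string of data digits (check digit not included)."""
    total = 0
    for i, ch in enumerate(reversed(data)):
        weight = 3 if i % 2 == 0 else 1  # rightmost data digit gets weight 3
        total += weight * int(ch)
    return (10 - total % 10) % 10        # a remainder of 0 yields check digit 0


print(gs1_check_digit("03600024145"))  # 7  (weighted sum 53)
print(gs1_check_digit("01010101010"))  # 5  (weighted sum 5)
```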
ISBN 10
The final character of a ten-digit International Standard Book Number is a check digit computed so that multiplying each digit by its position in the number (counting from the right) and taking the sum of these products modulo 11 is 0. The digit the farthest to the right (which is multiplied by 1) is the check digit, chosen to make the sum correct. It may need to have the value 10, which is represented as the letter X. For example, take the ISBN 0201530821: the sum of products is 0×10 + 2×9 + 0×8 + 1×7 + 5×6 + 3×5 + 0×4 + 8×3 + 2×2 + 1×1 = 99 ≡ 0 (mod 11). So the ISBN is valid. Positions can also be counted from the left, in which case the check digit is multiplied by 10, to check validity: 0×1 + 2×2 + 0×3 + 1×4 + 5×5 + 3×6 + 0×7 + 8×8 + 2×9 + 1×10 = 143 ≡ 0 (mod 11).
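The ISBN-10 rule can be sketched in Python as follows, using the digits 0201530821 that appear in the sum of products above. The function names are assumptions made for illustration, and the second sample, 080442957X, is included only to exercise the "X" case.

```python
def isbn10_is_valid(isbn: str) -> bool:
    """True if the weighted sum (weights 10 down to 1) is divisible by 11."""
    chars = isbn.replace("-", "").upper()
    if len(chars) != 10:
        return False
    total = 0
    for position, ch in zip(range(10, 0, -1), chars):
        if ch == "X" and position == 1:  # X (value 10) is only allowed as check character
            value = 10
        elif ch.isdigit():
            value = int(ch)
        else:
            return False
        total += position * value
    return total % 11 == 0


def isbn10_check_char(first_nine: str) -> str:
    """Check character for the first nine digits of an ISBN-10."""
    total = sum(w * int(d) for w, d in zip(range(10, 1, -1), first_nine))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


print(isbn10_is_valid("0201530821"))   # True  (sum of products is 99)
print(isbn10_check_char("020153082"))  # 1
print(isbn10_is_valid("080442957X"))   # True  (sum of products is 209)
```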
ISBN 13
ISBN 13 (in use since January 2007) is equal to the EAN-13 code found underneath a book's barcode. Its check digit is generated in a similar way to the UPC.
The check digit is computed as follows:
# Add the digits in the odd-numbered positions from the left (first, third, fifth, etc.—not including the check digit) together.
# Add the digits (up to but not including the check digit) in the even-numbered positions (second, fourth, sixth, etc.) together, and multiply by three, and add this to the result.
# Take the remainder of the result divided by 10 (i.e. the modulo 10 operation). If the remainder is equal to 0 then use 0 as the check digit, and if not 0 subtract the remainder from 10 to derive the check digit.
For example, take the ISBN 9780747532699, belonging to ''Harry Potter and the Philosopher's Stone.'' 9 is the check digit here, so the calculations must yield 9 at the end.
# Add the odd number digits: 9+8+7+7+3+6 = 40.
# Add the even number digits: 7+0+4+5+2+9 = 27.
# Multiply the result by 3: 27 × 3 = 81.
# Add the two results together: 40 + 81 = 121.
# To calculate the check digit, take the remainder of (121 / 10), which is also known as (121 modulo 10), and if not 0, subtract from 10. Therefore, the check digit value is 9, i.e. (121 / 10) = 12 remainder 1; 10 - 1 = 9.
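A validity test for a complete 13-digit ISBN can be sketched in Python. It uses the digits from the worked example (9 7 8 0 7 4 7 5 3 2 6 9, followed by the check digit 9) and weights them 1, 3, 1, 3, ... from the left, so a valid number gives a total divisible by 10. The function name `isbn13_is_valid` is an illustrative assumption.

```python
def isbn13_is_valid(isbn: str) -> bool:
    """True if the 1-3 weighted sum of all 13 digits is divisible by 10."""
    # Non-digit characters such as hyphens or spaces are ignored.
    digits = [int(ch) for ch in isbn if ch.isdigit()]
    if len(digits) != 13:
        return False
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0


print(isbn13_is_valid("9780747532699"))  # True  (121 + 9 = 130)
print(isbn13_is_valid("9780477532699"))  # False (adjacent 7 and 4 swapped)
```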
NCDA
The NOID Check Digit Algorithm (NCDA), in use since 2004, is designed for application in persistent identifiers and works with variable length strings of letters and digits, called extended digits. It is widely used with the ARK identifier scheme and somewhat used with schemes, such as the Handle System and DOI. An extended digit is constrained to betanumeric characters, which are alphanumerics minus vowels and the letter 'l' (ell). This restriction helps when generating opaque strings that are unlikely to form words by accident and will not contain both O and 0, or l and 1. Having a prime radix of R=29, the betanumeric repertoire permits the algorithm to guarantee detection of single-character and transposition errors for strings less than R=29 characters in length (beyond which it provides a slightly weaker check). The algorithm generalizes to any character repertoire with a prime radix R and strings less than R characters in length.
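The text above does not spell out the computation, so the following Python sketch rests on assumptions: it follows the commonly described form of NCDA, in which each character's ordinal in the betanumeric repertoire is multiplied by its 1-based position, characters outside the repertoire (such as "/") count as zero, and the sum is reduced modulo 29. The function name and the sample string are illustrative.

```python
BETANUMERIC = "0123456789bcdfghjkmnpqrstvwxz"  # 10 digits + 19 letters = 29 characters
R = len(BETANUMERIC)                           # prime radix, 29


def ncda_check_char(identifier: str) -> str:
    """Check character under the assumed position-weighted sum modulo 29."""
    total = 0
    for position, ch in enumerate(identifier, start=1):
        value = BETANUMERIC.find(ch)
        if value < 0:
            value = 0  # assumption: characters outside the repertoire count as zero
        total += position * value
    return BETANUMERIC[total % R]


print(ncda_check_char("13030/xf93gt2"))  # q  (weighted sum 891, and 891 % 29 = 21)
```

Because the radix is prime and each position weight below 29 is nonzero modulo 29, changing one character or swapping two different characters always changes the sum, which is the guarantee described above for strings shorter than 29 characters.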
Other examples of check digits
International
* The International SEDOL number.
* The final digit of an ISSN code or IMO Number.
* The International Securities Identifying Number (ISIN).
* Object Management Group FIGI standard final digit.
* The International CAS registry number's final digit.
* Modulo 10 check digits in credit card account numbers, calculated by the Luhn algorithm.
**Also used in the Norwegian KID (customer identification number) numbers used in bank giros (credit transfer).
**Used in IMEI of mobile phones.
* Last check digit in EAN/UPC serialisation of Global Trade Identification Number (GTIN). It applies to GTIN-8, GTIN-12, GTIN-13 and GTIN-14.
* The final digit of a DUNS number (though this is scheduled to change, such that the final digit will be chosen freely in new allocations, rather than being a check digit).
* The third and fourth digits in an International Bank Account Number (Modulo 97 check).
* The final digit in an International Standard Text Code.
* The final character encoded in a magnetic stripe card is a computed longitudinal redundancy check.
In the US
* The tenth digit of the National Provider Identifier for the US healthcare industry.
* The final digit of a POSTNET code.
* The North American CUSIP number.
* The final (ninth) digit of the ABA routing transit number, a bank code used in the United States.
* The ninth digit of a Vehicle Identification Number (VIN).
* Mayo Clinic patient identification numbers used in Arizona and Florida include a trailing check digit.
* The eleventh digit of a Customs & Border Protection entry number.
In Central America
* The Guatemalan Tax Number (NIT – Número de Identificación Tributaria) is based on modulo 11.
In Africa
* The South African identity (ID) number uses the Luhn algorithm (modulus 10) to calculate its 13th and final digit.
In Eurasia
* The UK NHS Number uses the modulo 11 algorithm.
* The Spanish fiscal identification number (número de identificación fiscal, NIF) (based on modulo 23).
* The Dutch Burgerservicenummer (BSN) (national identifier) uses the modulo 11 algorithm.
* The ninth digit of an Israeli Teudat Zehut (Identity Card) number.
* The 13th digit of the Serbian and former Yugoslav Unique Master Citizen Number (JMBG), though not all of them, due to errors or non-residency.
* The last two digits of the 11-digit Turkish Identification Number.
* The ninth character in the 14-character EU cattle passport number (cycles from 1 to 7: see British Cattle Movement Service).
* The ninth digit in an Icelandic Kennitala (national ID number).
* Modulo 97 check digits in Belgian and Serbian bank account numbers. Serbia sometimes also uses modulo 11, for reference numbers.
* The ninth digit in a Hungarian TAJ number (social insurance number).
* For the residents of India, the unique identity number named Aadhaar has a trailing 12th digit that is calculated with the Verhoeff algorithm.
* The Intellectual Property Office of Singapore (IPOS) has confirmed a new format for application numbers of registrable intellectual property (IP, e.g., trademarks, patents, registered designs). It will include a check character calculated with the Damm algorithm.
* The last digit of Chinese citizen ID number (second generation) is calculated by modulo 11-2 as specified in Chinese GuoBiao (aka national standard) GB11643-1999 which adopts ISO 7064:1983. 'X' is used if the calculated checking digit is 10.
* The 11th digit of Estonian Isikukood (Personal Identification Code).
* The last letter on vehicle registration plates of Singapore.
In Oceania
* The Australian tax file number (based on modulo 11).
* The seventh character of a New Zealand NHI Number.
* The last digit in a New Zealand locomotive's Traffic Monitoring System (TMS) number.
Algorithms
Notable algorithms include:
* Luhn algorithm (1954)
* Verhoeff algorithm (1969)
* Damm algorithm (2004)
See also
* Checksum
* Casting out nines – similar modular sum check
* Check bit – binary equivalent
External links
* Identification numbers and check digit schemes (a mathematical explanation of various check digit schemes)
* UPC, EAN, and SCC-14 check digit calculator
* GS1 check digit calculator
* Apache Commons Validator CheckDigit, a Java library to validate check digits