ISO 2033
   HOME

TheInfoList



OR:

The ISO 2033:1983 standard (''"Coding of machine readable characters (MICR and OCR)"'') defines
character set Character encoding is the process of assigning numbers to graphical characters, especially the written characters of human language, allowing them to be stored, transmitted, and transformed using digital computers. The numerical values that ...
s for use with
Optical Character Recognition Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scen ...
or
Magnetic Ink Character Recognition Magnetic ink character recognition code, known in short as MICR code, is a character recognition technology used mainly by the banking industry to streamline the processing and clearance of cheques and other documents. MICR encoding, called the '' ...
systems. The Japanese standard JIS X 9010:1984 (''"Coding of machine readable characters (OCR and MICR)"'', originally designated JIS C 6229-1984) is closely related.


Character set for OCR-A

The version of the encoding for the OCR-A font registered with the
ISO-IR ISO/IEC 2022 ''Information technology—Character code structure and extension techniques'', is an ISO/ IEC standard (equivalent to the ECMA standard ECMA-35, the ANSI standard ANSI X3.41 and the Japanese Industrial Standard JIS X 0202) in the ...
registry as ISO-IR-91 is the Japanese (JIS X 9010 / JIS C 6229) version, which differs from the encoding defined by ISO 2033 only in the addition of a
Yen sign The yen and yuan sign, ¥, is a currency sign used for the Japanese yen and the Renminbi, Chinese yuan currency, currencies when writing in Latin scripts. This monetary symbol resembles a Latin letter Y with a single or double horizontal stroke. ...
at 5C.


Character set for OCR-B

The version of the G0 set for the OCR-B font registered with the
ISO-IR ISO/IEC 2022 ''Information technology—Character code structure and extension techniques'', is an ISO/ IEC standard (equivalent to the ECMA standard ECMA-35, the ANSI standard ANSI X3.41 and the Japanese Industrial Standard JIS X 0202) in the ...
registry as ISO-IR-92 is the Japanese (JIS X 9010 / JIS C 6229) version, which differs from the encoding defined by ISO 2033 only in being based on
JIS-Roman Code page 895 (CCSID 895) is a 7-bit character set and is Japan's national ISO 646 variant. It is the Roman set (first or left half) of the JIS X 0201 (formerly JIS C 6220) Japanese Standard and is variously called Japan 7-Bit Latin, JISCII, JIS ...
(with a
dollar sign The dollar sign, also known as peso sign, is a symbol consisting of a capital " S" crossed with one or two vertical strokes ($ or ), used to indicate the unit of various currencies around the world, including most currencies denominated "pes ...
at 0x24 and a
Yen sign The yen and yuan sign, ¥, is a currency sign used for the Japanese yen and the Renminbi, Chinese yuan currency, currencies when writing in Latin scripts. This monetary symbol resembles a Latin letter Y with a single or double horizontal stroke. ...
at 0x5C) rather than on the
ISO 646 ISO/IEC 646 is a set of ISO/IEC standards, described as ''Information technology — ISO 7-bit coded character set for information interchange'' and developed in cooperation with ASCII at least since 1964. Since its first edition in 1 ...
IRV (with a
backslash The backslash is a typographical mark used mainly in computing and mathematics. It is the mirror image of the common slash . It is a relatively recent mark, first documented in the 1930s. History , efforts to identify either the origin o ...
at 0x5C and, at the time, a
universal currency sign The currency sign is a character (symbol), character used to denote an unspecified currency. It can be described as a circle the size of a lowercase character with four short radiating arms at 45° (NE), 135° (SE), 225° (SW) and 315° (NW). I ...
(¤) at 0x24). Besides those code points, it differs from
ASCII ASCII ( ), abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices. Because of ...
only in omitting the backtick (`) and
tilde The tilde () or , is a grapheme with several uses. The name of the character came into English from Spanish, which in turn came from the Latin '' titulus'', meaning "title" or "superscription". Its primary use is as a diacritic (accent) in ...
(~). An additional supplementary set registered as ISO-IR-93 assigns the
pound sign The pound sign is the symbol for the pound unit of sterling – the currency of the United Kingdom and previously of Great Britain and of the Kingdom of England. The same symbol is used for other currencies called pound, such as the Gibralta ...
(£), universal currency sign (¤) and
section sign The section sign, §, is a typographical character for referencing individually numbered sections of a document; it is frequently used when citing sections of a legal code. It is also known as the section symbol, section mark, double-s, or ...
(§) to their
ISO-8859-1 ISO/IEC 8859-1:1998, ''Information technology — 8-bit single-byte coded graphic character sets — Part 1: Latin alphabet No. 1'', is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1 ...
codepoints, and the backslash to the ISO-8859-1 codepoint for the Yen sign.


Character set for JIS X 9008 (JIS C 6257)

JIS X 9010 (JIS C 6229) also defines character sets for the JIS X 9008:1981 (formerly JIS C 6257-1981) "hand-printed" OCR font. These include subsets of the
JIS X 0201 JIS X 0201, a Japanese Industrial Standard developed in 1969 (then called JIS C 6220 until the JIS category reform), was the first Japanese electronic character set to become widely used. It is either a 7-bit encoding or an 8-bit encoding, altho ...
Roman set (registered as ISO-IR-94 and omitting the backtick (`), lowercase letters,
curly braces A bracket is either of two tall fore- or back-facing punctuation marks commonly used to isolate a segment of text or data from its surroundings. Typically deployed in symmetric pairs, an individual bracket may be identified as a 'left' or 'r ...
() and overline (‾)), and kana set (registered as ISO-IR-96 and omitting the East Asian style comma (、) and full stop (。), the
interpunct An interpunct , also known as an interpoint, middle dot, middot and centered dot or centred dot, is a punctuation mark consisting of a vertically centered dot used for interword separation in ancient Latin script. (Word-separating spaces did no ...
(・) and the small kana), in addition to a set (registered as ISO-IR-95) containing only the backslash, which is assigned to the same code point as in ISO-IR-93. The JIS C 6527 font stylises the slash and backslash characters with a doubled appearance. The character names given are "Solidus" and "Reverse Solidus", matching the Unicode character names for the ASCII slash and backslash. However, the Unicode
Optical Character Recognition Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scen ...
block includes an additional code point for an "OCR Double Backslash" (⑊), although not for a double (forward) slash, although a double slash is available elsewhere, as .


Character set for E-13B

The ISO-IR-98 encoding defined by ISO 2033 encodes the character repertoire of the E13B font, as used with
magnetic ink character recognition Magnetic ink character recognition code, known in short as MICR code, is a character recognition technology used mainly by the banking industry to streamline the processing and clearance of cheques and other documents. MICR encoding, called the '' ...
. Although ISO 2033 also specifies other encodings, the encoding for E-13B is the encoding referred to as ISO_2033_1983 by
Perl Perl is a family of two high-level, general-purpose, interpreted, dynamic programming languages. "Perl" refers to Perl 5, but from 2000 to 2019 it also referred to its redesigned "sister language", Perl 6, before the latter's name was offici ...
libintl, and as or by the
IANA The Internet Assigned Numbers Authority (IANA) is a standards organization that oversees global IP address allocation, autonomous system number allocation, root zone management in the Domain Name System (DNS), media types, and other Interne ...
. Other registered labels include , its
ISO-IR ISO/IEC 2022 ''Information technology—Character code structure and extension techniques'', is an ISO/ IEC standard (equivalent to the ECMA standard ECMA-35, the ANSI standard ANSI X3.41 and the Japanese Industrial Standard JIS X 0202) in the ...
registration number, and simply . The digits are preserved in their
ASCII ASCII ( ), abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices. Because of ...
locations. Letters and symbols unavailable in the E13B font are omitted, while specialised punctuation for
bank cheque A cashier's check (or cashier's cheque, cashier's order) is a check guaranteed by a bank, drawn on the bank's own funds and signed by a cashier. Cashier's checks are treated as guaranteed funds because the bank, rather than the purchaser, is respo ...
s included in the E13B font is added. The same symbols are available in
Unicode Unicode, formally The Unicode Standard,The formal version reference is is an information technology Technical standard, standard for the consistent character encoding, encoding, representation, and handling of Character (computing), text expre ...
in the Optical Character Recognition block.


References


External links


ISO 2033
distributed by ISO
JIS X 9010
distributed by
AFNOR Association Française de Normalisation (AFNOR, English: French Standardization Association) is a Paris-based standards organization and a member body for France at the International Organization for Standardization (ISO). The AFNOR Group develop ...
{{ISO standards Character sets #2033 Optical character recognition OCR typefaces