A numeral (often called ''number'' in Unicode) is a character that denotes a number. The decimal number digits are used widely in various writing systems throughout the world; however, the graphemes representing the decimal digits differ widely. Unicode therefore includes 22 different sets of graphemes for the decimal digits, as well as various decimal points, thousands separators, negative signs, etc. Unicode also includes several non-decimal numerals such as Aegean numerals, Roman numerals, counting rod numerals, Mayan numerals, Cuneiform numerals and ancient Greek numerals. There are also many typographical variations of the Western Arabic numerals, provided for specialized mathematical use and for compatibility with earlier character sets, such as ² or ②, and composite characters such as ½.
Numerals by numeric property
Unicode classifies characters by their numerical use in text through the Numeric Type property, which has four values. The first is the "not a number" type (None). The second (Decimal) covers decimal-radix digits, as commonly used in Western-style decimals (plain 0–9). The third (Numeric) covers numbers that are not part of a decimal system, such as Roman numerals. The fourth (Digit) covers decimal digits in a typographic context, such as encircled numbers. Lettered numbering such as "A. B. C." for chapters is not covered by this property.
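The four Numeric Type values can be approximated with Python's standard `unicodedata` module, which exposes separate lookups for decimal, digit and general numeric values. The following is a minimal sketch; the helper name `numeric_type` is hypothetical, and the result depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata


def numeric_type(ch: str) -> str:
    """Approximate the Unicode Numeric Type of a single character.

    Hypothetical helper: it infers the type from which unicodedata
    lookups succeed, checking the most specific category first.
    """
    if unicodedata.decimal(ch, None) is not None:
        return "Decimal"   # positional decimal digit, e.g. "7" or an Arabic-Indic digit
    if unicodedata.digit(ch, None) is not None:
        return "Digit"     # digit in a typographic context, e.g. a circled 2
    if unicodedata.numeric(ch, None) is not None:
        return "Numeric"   # any other numeric value, e.g. a Roman numeral or fraction
    return "None"          # not a number


if __name__ == "__main__":
    for ch in ["7", "\u0663", "\u2461", "\u216B", "\u00BD", "A"]:
        print(f"U+{ord(ch):04X}", numeric_type(ch))
```

Checking in the order Decimal, Digit, Numeric matters because every decimal digit also has a digit value and a numeric value.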
Hexadecimal digits
Unicode has no separate characters for hexadecimal digits; the existing letters and digits are used. These characters carry the character property Hex_Digit=Yes and, where appropriate, ASCII_Hex_Digit=Yes.
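Python's `unicodedata` module does not expose these two properties, so the sketch below hard-codes the sets. It assumes that Hex_Digit covers the ASCII hexadecimal digits plus their fullwidth counterparts, and that ASCII_Hex_Digit covers the ASCII ones only; the function names are hypothetical.

```python
# ASCII characters with ASCII_Hex_Digit=Yes.
ASCII_HEX_DIGITS = set("0123456789ABCDEFabcdef")

# Fullwidth forms sit at a fixed offset (0xFEE0) from their ASCII counterparts.
FULLWIDTH_OFFSET = 0xFEE0
FULLWIDTH_HEX_DIGITS = {chr(ord(c) + FULLWIDTH_OFFSET) for c in ASCII_HEX_DIGITS}


def is_ascii_hex_digit(ch: str) -> bool:
    """True if ch is assumed to have ASCII_Hex_Digit=Yes."""
    return ch in ASCII_HEX_DIGITS


def is_hex_digit(ch: str) -> bool:
    """True if ch is assumed to have Hex_Digit=Yes (ASCII or fullwidth)."""
    return ch in ASCII_HEX_DIGITS or ch in FULLWIDTH_HEX_DIGITS


if __name__ == "__main__":
    for ch in ["F", "\uFF26", "G", "\u0663"]:
        print(f"U+{ord(ch):04X}", is_hex_digit(ch), is_ascii_hex_digit(ch))
```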
Numerals by script
Hindu–Arabic numerals
The Hindu–Arabic numeral system involves ten digits representing 0–9. Unicode includes the Western Arabic numerals in the Basic Latin (or ASCII-derived) block. The digits are repeated in several other scripts: Eastern Arabic, Balinese, Bengali, Devanagari, Ethiopic, Gujarati, Gurmukhi, Telugu, Khmer, Lao, Limbu, Malayalam, Mongolian, Myanmar, New Tai Lue, Nko, Oriya, Thai, Tibetan, Osmanya. Unicode includes a numeric value property for each digit to assist in collation and other text processing operations. However, there is no mapping between the various related digits.
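Because there is no direct mapping between scripts, software usually goes through the numeric value instead. A minimal sketch using Python's `unicodedata.decimal` follows; the helper name `to_ascii_digits` is hypothetical, and only characters that the bundled Unicode database treats as decimal digits are converted.

```python
import unicodedata


def to_ascii_digits(text: str) -> str:
    """Replace every decimal digit, from any script, with its ASCII equivalent.

    Characters without a decimal value are copied through unchanged.
    """
    out = []
    for ch in text:
        value = unicodedata.decimal(ch, None)
        out.append(ch if value is None else str(value))
    return "".join(out)


if __name__ == "__main__":
    print(to_ascii_digits("\u0662\u0660\u0662\u0664"))   # Arabic-Indic digits
    print(to_ascii_digits("\u0967\u0968\u0969"))         # Devanagari digits
    print(to_ascii_digits("Room \u0E54\u0E52"))          # Thai digits in mixed text
```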
Although Arabic is written from right to left, while English is written left to right, in both languages numbers are written with the most significant digit on the left and the least significant on the right.
Fractions
The fraction slash character (U+2044) allows authors using Unicode to compose any arbitrary fraction from the decimal digits. It was intended to instruct font rendering to shrink the surrounding digits, raising those on the left and lowering those on the right, but this is rarely implemented. (A workaround is to use the superscript and subscript characters described below, but only Arabic numerals are available.) Unicode also includes a handful of vulgar fractions as compatibility characters, but discourages their use.
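Both approaches to composing a fraction can be sketched in a few lines of Python. The function names are hypothetical, and how the result is displayed depends entirely on the font and rendering engine.

```python
FRACTION_SLASH = "\u2044"

# Translation tables from ASCII digits to superscript and subscript digits.
SUPERSCRIPTS = str.maketrans("0123456789", "\u2070\u00B9\u00B2\u00B3\u2074\u2075\u2076\u2077\u2078\u2079")
SUBSCRIPTS = str.maketrans("0123456789", "\u2080\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089")


def fraction(numerator: int, denominator: int) -> str:
    """Plain digits around U+2044; relies on the renderer to shape the fraction."""
    return f"{numerator}{FRACTION_SLASH}{denominator}"


def fraction_fallback(numerator: int, denominator: int) -> str:
    """Workaround: superscript numerator and subscript denominator."""
    top = str(numerator).translate(SUPERSCRIPTS)
    bottom = str(denominator).translate(SUBSCRIPTS)
    return f"{top}{FRACTION_SLASH}{bottom}"


if __name__ == "__main__":
    print(fraction(3, 4))
    print(fraction_fallback(3, 4))
```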
Decimal fractions
Decimal fractions are represented in text as a sequence of decimal digit numerals with a decimal separator dividing the whole-number portion from the fractional portion. For example, the decimal fraction for ¼ is expressed as zero-point-two-five ("0.25"). Unicode has no dedicated general decimal separator; it unifies the decimal separator function with other punctuation characters, so several characters can serve as a decimal separator depending on the locale. The "." used in "0.25" is the same period character (U+002E) used to end a sentence. Cultures vary, however, in the glyph or grapheme used for a decimal separator: in some locales the comma (U+002C) is used instead ("0,25"), and still other locales use a space or non-breaking space ("0 25"). The Arabic writing system includes a dedicated decimal separator character that looks much like a comma, "٫" (U+066B); combined with the Arabic digits, one-quarter appears as "٠٫٢٥".
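Since the separator is ordinary punctuation, a parser has to be told which character plays that role. The sketch below is one possible approach, not a locale-aware implementation: the function `parse_decimal` is hypothetical, and the caller is assumed to know the separator for the text's locale.

```python
import unicodedata
from decimal import Decimal


def parse_decimal(text: str, separator: str) -> Decimal:
    """Parse a decimal fraction written with any script's decimal digits.

    `separator` is the single character acting as decimal separator in the
    source locale, for example ".", "," or U+066B.
    """
    out = []
    for ch in text:
        if ch == separator:
            out.append(".")
        else:
            # Raises ValueError for anything that is not a decimal digit.
            out.append(str(unicodedata.decimal(ch)))
    return Decimal("".join(out))


if __name__ == "__main__":
    print(parse_decimal("0.25", "."))
    print(parse_decimal("0,25", ","))
    print(parse_decimal("\u0660\u066B\u0662\u0665", "\u066B"))
```

`Decimal` is used instead of `float` so that the parsed value keeps the exact digits that were written.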
Characters for mathematical constants
Currently, three Unicode characters semantically represent mathematical constants: ℎ (U+210E PLANCK CONSTANT), ℏ (U+210F PLANCK CONSTANT OVER TWO PI), and ℇ (U+2107 EULER CONSTANT). Other mathematical constants can be represented using characters that have multiple semantic uses. For example, although Unicode includes a character for the ''natural exponent'' ℯ (U+212F), its UCS canonical name derives from its glyph: SCRIPT SMALL E; and the mathematical constant π, 3.141592..., is represented by U+03C0 GREEK SMALL LETTER PI.
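The difference between a name that states a meaning and a name that describes a glyph can be checked against the character database. This is a small sketch with Python's `unicodedata.name`; the list of code points and the helper `describe` are chosen here for illustration.

```python
import unicodedata

# Code points selected for illustration: three named after constants,
# two named after their glyph or letter.
CONSTANT_CHARS = ["\u210E", "\u210F", "\u2107", "\u212F", "\u03C0"]


def describe(ch: str) -> str:
    """Return the code point and formal character name."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for ch in CONSTANT_CHARS:
        print(describe(ch))
```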
Rich text and other compatibility numerals
The Western Arabic numerals also appear among the compatibility characters as rich text variant forms, including bold, double-struck, monospace, sans-serif and sans-serif bold, along with fullwidth variants for legacy vertical text support.
Parenthesized, circled and other rich text variants are also included in the blocks Enclosed CJK Letters and Months; Enclosed Alphanumerics; Superscripts and Subscripts; Number Forms; and Dingbats.
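Many of these variants carry a compatibility decomposition, so compatibility normalization (NFKC) folds them back to the plain digits. The sketch below uses Python's `unicodedata.normalize`; the helper name `fold_compat` is hypothetical, and characters without a compatibility decomposition are left unchanged.

```python
import unicodedata


def fold_compat(text: str) -> str:
    """Apply NFKC normalization, which replaces compatibility variants
    with their plain equivalents where a decomposition is defined."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # Mathematical bold 2, circled 2, fullwidth 2, superscript 2.
    samples = ["\U0001D7D0", "\u2461", "\uFF12", "\u00B2"]
    for ch in samples:
        print(f"U+{ord(ch):04X}", "->", fold_compat(ch))
    # The vulgar fraction one half decomposes to digits around U+2044.
    print(fold_compat("\u00BD"))
```

Folding discards the styling, so it suits searching and parsing rather than display.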
Suzhou (huāmǎ/Sūzhōu mǎzi) numerals
The ''huāmǎ''/''Sūzhōu mǎzi'' system is a variation of the rod numeral system. Rod numerals are closely related to the counting rods and the abacus, which is why the numeric symbols for 1, 2, 3, 6, 7 and 8 in the ''huāmǎ'' system are represented in a similar way as on the abacus. Nowadays, the ''huāmǎ'' system is only used for displaying prices in Chinese markets or on traditional handwritten invoices.
The digits of the Suzhou numerals are in the CJK Symbols and Punctuation block at U+3021–U+3029, U+3007, U+5341, U+5344, and U+5345. In Unicode 3.0 these characters are incorrectly called Hangzhou style numerals. In Unicode 4.0, an erratum was added which stated:
All references to "Hangzhou" in the Unicode standard have been corrected to "Suzhou" except for the character names themselves, which cannot be changed once assigned, according to the Unicode Stability Policy. (This policy allows software to use the names as unique identifiers.)
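The effect of the stability policy is visible in the character database: the formal names still say "Hangzhou". The following sketch lists the range U+3021 to U+3029 with Python's `unicodedata`; the helper name `suzhou_digits` is hypothetical.

```python
import unicodedata


def suzhou_digits():
    """Return (code point, formal name, numeric value) for U+3021..U+3029."""
    rows = []
    for cp in range(0x3021, 0x302A):
        ch = chr(cp)
        rows.append((cp, unicodedata.name(ch), unicodedata.numeric(ch)))
    return rows


if __name__ == "__main__":
    for cp, name, value in suzhou_digits():
        print(f"U+{cp:04X} {name} = {value}")
```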
Japanese and Korean numerals
Ancient Greek numerals
Unicode provides support for several variants of Greek numerals, assigned to the Supplementary Multilingual Plane from U+10140 through U+1018F.[Unicode Charts: Ancient Greek Numbers]
Attic numerals were used by ancient Greeks, possibly from the 7th century BC. They were also known as Herodianic numerals because they were first described in a 2nd-century manuscript by Herodian. They are also known as acrophonic numerals because all of the symbols used derive from the first letters of the words that the symbols represent: 'one', 'five', 'ten', 'hundred', 'thousand' and 'ten thousand'. See Greek numerals and acrophony.
Roman numerals
Roman numerals originated in ancient Rome, adapted from Etruscan numerals. The system used in classical antiquity was slightly modified in the Middle Ages to produce the system we use today. It is based on certain letters which are given values as numerals.
Roman numerals are commonly used today in numbered lists (in outline format), clockfaces, pages preceding the main body of a book, chord triads in music analysis (Roman numeral analysis), the numbering of movie and video game sequels, book publication dates, successive political leaders or children with identical names, and the numbering of some sport events, such as the Olympic Games or the Super Bowl.
Unicode has a number of characters specifically designated as Roman numerals, as part of the ''Number Forms''[Unicode Number Forms] range from U+2160 to U+2188. This range includes both upper- and lowercase numerals, as well as pre-combined characters for numbers up to 12 (Ⅻ or XII). One reason for the existence of pre-combined numbers is to facilitate the setting of multiple-letter numbers (such as VIII) on a single horizontal line in Asian vertical text. The Unicode standard, however, includes special Roman numeral code points for compatibility only, stating that "[f]or most purposes, it is preferable to compose the Roman numerals from sequences of the appropriate Latin letters".
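Composing Roman numerals from Latin letters, as the standard recommends, needs no special code points. The sketch below shows one conventional subtractive-notation converter (the name `to_roman` and the 1 to 3999 limit are choices made here) and uses `unicodedata` to relate the pre-combined character Ⅻ to its composed form.

```python
import unicodedata

# Value/letter pairs in descending order, including subtractive forms.
ROMAN_VALUES = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n: int) -> str:
    """Write n (1..3999) as a Roman numeral using ordinary Latin letters."""
    if not 0 < n < 4000:
        raise ValueError("n must be between 1 and 3999")
    parts = []
    for value, letters in ROMAN_VALUES:
        count, n = divmod(n, value)
        parts.append(letters * count)
    return "".join(parts)


if __name__ == "__main__":
    twelve = "\u216B"  # pre-combined ROMAN NUMERAL TWELVE
    print(to_roman(12))
    print(unicodedata.normalize("NFKC", twelve))  # compatibility form in Latin letters
    print(unicodedata.numeric(twelve))            # numeric value of the single character
```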
Additionally, characters exist for archaic forms of 1000, 5000, 10,000, large reversed C (Ɔ), late 6 (ↅ, similar to Greek Stigma: Ϛ), early 50 (ↆ, similar to down arrow ↓⫝⊥),[David J. Perry: Proposal to Add Additional Ancient Roman Characters to UCS] 50,000, and 100,000. The small reversed c, ↄ, is not intended to be used in Roman numerals, but as the lowercase form of the Claudian letter Ↄ.
If using blackletter or script typefaces, Roman numerals are set in Roman type. Such typefaces may contain Roman numerals matching the style of the typeface in the Unicode range U+2160–217F; if they don't exist, a matching Antiqua typeface is used for Roman numerals.
Unicode has characters for Roman fractions in the ''Ancient Symbols''[Unicode Ancient Symbols] block: sextans, uncia, semuncia, sextula, dimidia sextula, siliqua, and as.
Counting rod numerals
Counting rod numerals are included in their own block in the Supplementary Multilingual Plane (SMP) as of Unicode 5.0. There are nine "horizontal" digits (U+1D360 to U+1D368) and nine "vertical" digits (U+1D369 to U+1D371); the horizontal digits are used for odd powers of ten and the vertical digits for even powers of ten. Zero should be represented by U+3007 (〇, ideographic number zero) and the negative sign should be represented by U+20E5 (combining reverse solidus overlay). This block also contains other counting-rod-like symbols, such as the well-known tally mark for 5. As these were recently added to the character set and are not in the BMP, font support may still be limited.
See also
* Number Forms (Unicode block)