In
computer science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
, an integer literal is a kind of
literal for an
integer
An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign (−1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the language ...
whose
value
Value or values may refer to:
Ethics and social
* Value (ethics) wherein said concept may be construed as treating actions themselves as abstract objects, associating value to them
** Values (Western philosophy) expands the notion of value beyo ...
is directly represented in
source code
In computing, source code, or simply code, is any collection of code, with or without comments, written using a human-readable programming language, usually as plain text. The source code of a program is specially designed to facilitate the wo ...
. For example, in the assignment statement
x = 1
, the string
1
is an integer literal indicating the value 1, while in the statement
x = 0x10
the string
0x10
is an integer literal indicating the value 16, which is represented by
10
in hexadecimal (indicated by the
0x
prefix).
By contrast, in
x = cos(0)
, the expression
cos(0)
evaluates to 1 (as the
cosine of 0), but the value 1 is not ''literally'' included in the source code. More simply, in
x = 2 + 2,
the expression
2 + 2
evaluates to 4, but the value 4 is not literally included. Further, in
x = "1"
the
"1"
is a
string literal
A string literal (computer programming), literal or anonymous string is a String (computer science), string value in the source code of a computer program. Modern Computer programming, programming languages commonly use a quoted sequence of charact ...
, not an integer literal, because it is in quotes. The value of the string is
1
, which happens to be an integer string, but this is semantic analysis of the string literal – at the syntactic level
"1"
is simply a string, no different from
"foo"
.
Parsing
Recognizing a string (sequence of characters in the source code) as an integer literal is part of the
lexical analysis
In computer science, lexical analysis, lexing or tokenization is the process of converting a sequence of characters (such as in a computer program or web page) into a sequence of ''lexical tokens'' ( strings with an assigned and thus identified ...
(lexing) phase, while evaluating the literal to its value is part of the
semantic analysis phase. Within the lexer and phrase grammar, the token class is often denoted
integer
, with the lowercase indicating a lexical-level token class, as opposed to phrase-level production rule (such as
ListOfIntegers
). Once a string has been lexed (tokenized) as an integer literal, its value cannot be determined syntactically (it is ''just'' an integer), and evaluation of its value becomes a semantic question.
Integer literals are generally lexed with
regular expression
A regular expression (shortened as regex or regexp; sometimes referred to as rational expression) is a sequence of characters that specifies a search pattern in text. Usually such patterns are used by string-searching algorithms for "find" or ...
s, as in
Python
Python may refer to:
Snakes
* Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia
** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia
* Python (mythology), a mythical serpent
Computing
* Python (pro ...
.
[2.4.4. Integer and long integer literals]
Evaluation
As with other literals, integer literals are generally evaluated at compile time, as part of the semantic analysis phase. In some cases this semantic analysis is done in the lexer, immediately on recognition of an integer literal, while in other cases this is deferred until the parsing stage, or until after the
parse tree
A parse tree or parsing tree or derivation tree or concrete syntax tree is an ordered, rooted tree that represents the syntactic structure of a string according to some context-free grammar. The term ''parse tree'' itself is used primarily in co ...
has been completely constructed. For example, on recognizing the string
0x10
the lexer could immediately evaluate this to 16 and store that (a token of type
integer
and value 16), or defer evaluation and instead record a token of type
integer
and value
0x10
.
Once literals have been evaluated, further semantic analysis in the form of
constant folding is possible, meaning that literal expressions involving literal values can be evaluated at the compile phase. For example, in the statement
x = 2 + 2
after the literals have been evaluated and the expression
2 + 2
has been parsed, it can then be evaluated to 4, though the value 4 does not itself appear as a literal.
Affixes
Integer literals frequently have prefixes indicating base, and less frequently suffixes indicating type.
For example, in
C++
C++ (pronounced "C plus plus") is a high-level general-purpose programming language created by Danish computer scientist Bjarne Stroustrup as an extension of the C programming language, or "C with Classes". The language has expanded significan ...
0x10ULL
indicates the value 16 (because hexadecimal) as an unsigned long long integer.
Common prefixes include:
*
0x
or
0X
for
hexadecimal
In mathematics and computing, the hexadecimal (also base-16 or simply hex) numeral system is a positional numeral system that represents numbers using a radix (base) of 16. Unlike the decimal system representing numbers using 10 symbols, hexa ...
(base 16);
*
0
,
0o
or
0O
for
octal
The octal numeral system, or oct for short, is the radix, base-8 number system, and uses the Numerical digit, digits 0 to 7. This is to say that 10octal represents eight and 100octal represents sixty-four. However, English, like most languages, ...
(base 8);
*
0b
or
0B
for
binary
Binary may refer to:
Science and technology Mathematics
* Binary number, a representation of numbers using only two digits (0 and 1)
* Binary function, a function that takes two arguments
* Binary operation, a mathematical operation that t ...
(base 2).
Common suffixes include:
*
l
or
L
for long integer;
*
ll
or
LL
for long long integer;
*
u
or
U
for unsigned integer.
These affixes are somewhat similar to
sigils
A sigil () is a type of symbol used in magic. The term has usually referred to a pictorial signature of a deity or spirit. In modern usage, especially in the context of chaos magic, sigil refers to a symbolic representation of the practitione ...
, though sigils attach to identifiers (names), not literals.
Digit separators
In some languages, integer literals may contain digit separators to allow
digit grouping
A decimal separator is a symbol used to separate the integer part from the fractional part of a number written in decimal form (e.g., "." in 12.45). Different countries officially designate different symbols for use as the separator. The choi ...
into more legible forms. If this is available, it can usually be done for floating point literals as well. This is particularly useful for
bit field
A bit field is a data structure that consists of one or more adjacent bits which have been allocated for specific purposes, so that any single bit or group of bits within the structure can be set or inspected. A bit field is most commonly used to ...
s and makes it easier to see the size of large numbers (such as a million) at a glance by
subitizing rather than counting digits. It is also useful for numbers that are typically grouped, such as
credit card number
A payment card number, primary account number (PAN), or simply a card number, is the card identifier found on payment cards, such as credit cards and debit cards, as well as stored-value cards, gift cards and other similar cards. In some situati ...
or
social security number
In the United States, a Social Security number (SSN) is a nine-digit number issued to U.S. citizens, permanent residents, and temporary (working) residents under section 205(c)(2) of the Social Security Act, codified as . The number is issued to ...
s. Very long numbers can be further grouped by doubling up separators.
Typically decimal numbers (base-10) are grouped in three digit groups (representing one of 1000 possible values), binary numbers (base-2) in four digit groups (one
nibble
In computing, a nibble (occasionally nybble, nyble, or nybl to match the spelling of byte) is a four-bit aggregation, or half an octet. It is also known as half-byte or tetrade. In a networking or telecommunication context, the nibble is oft ...
, representing one of 16 possible values), and hexadecimal numbers (base-16) in two digit groups (each digit is one nibble, so two digits are one
byte
The byte is a unit of digital information that most commonly consists of eight bits. Historically, the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable unit ...
, representing one of 256 possible values). Numbers from other systems (such as id numbers) are grouped following whatever convention is in use.
Examples
In
Ada
Ada may refer to:
Places
Africa
* Ada Foah, a town in Ghana
* Ada (Ghana parliament constituency)
* Ada, Osun, a town in Nigeria
Asia
* Ada, Urmia, a village in West Azerbaijan Province, Iran
* Ada, Karaman, a village in Karaman Province, ...
,
C# (from version 7.0),
D,
Eiffel
Eiffel may refer to:
Places
* Eiffel Peak, a summit in Alberta, Canada
* Champ de Mars – Tour Eiffel station, Paris, France; a transit station
Structures
* Eiffel Tower, in Paris, France, designed by Gustave Eiffel
* Eiffel Bridge, Ungheni, M ...
,
Go (from version 1.13),
Haskell
Haskell () is a general-purpose, statically-typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research and industrial applications, Haskell has pioneered a number of programming lan ...
(from GHC version 8.6.1),
Java
Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's List ...
(from version 7),
Julia
Julia is usually a feminine given name. It is a Latinate feminine form of the name Julio and Julius. (For further details on etymology, see the Wiktionary entry "Julius".) The given name ''Julia'' had been in use throughout Late Antiquity (e.g ...
,
Perl
Perl is a family of two high-level, general-purpose, interpreted, dynamic programming languages. "Perl" refers to Perl 5, but from 2000 to 2019 it also referred to its redesigned "sister language", Perl 6, before the latter's name was offici ...
,
Python
Python may refer to:
Snakes
* Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia
** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia
* Python (mythology), a mythical serpent
Computing
* Python (pro ...
(from version 3.6),
Ruby
A ruby is a pinkish red to blood-red colored gemstone, a variety of the mineral corundum ( aluminium oxide). Ruby is one of the most popular traditional jewelry gems and is very durable. Other varieties of gem-quality corundum are called sa ...
,
Rust
Rust is an iron oxide, a usually reddish-brown oxide formed by the reaction of iron and oxygen in the catalytic presence of water or air moisture. Rust consists of hydrous iron(III) oxides (Fe2O3·nH2O) and iron(III) oxide-hydroxide (FeO(OH ...
and
Swift
Swift or SWIFT most commonly refers to:
* SWIFT, an international organization facilitating transactions between banks
** SWIFT code
* Swift (programming language)
* Swift (bird), a family of birds
It may also refer to:
Organizations
* SWIFT, ...
, integer literals and float literals can be separated with an
underscore
An underscore, ; also called an underline, low line, or low dash; is a line drawn under a segment of text. In proofreading, underscoring is a convention that says "set this text in italic type", traditionally used on Manuscript (publishing), man ...
(
_
). There can be some restrictions on placement; for example, in Java they cannot appear at the start or end of the literal, nor next to a decimal point. Note that while the period, comma, and (thin) spaces are used in normal writing for digit separation, these conflict with their existing use in programming languages as
radix point
A decimal separator is a symbol used to separate the integer part from the fractional part of a number written in decimal form (e.g., "." in 12.45). Different countries officially designate different symbols for use as the separator. The choi ...
, list separator (and in C/C++, the
comma operator
In the C and C++ programming languages, the comma operator (represented by the token ,) is a binary operator that evaluates its first operand and discards the result, and then evaluates the second operand and returns this value (and type); there i ...
), and token separator.
Examples include:
int oneMillion = 1_000_000;
int creditCardNumber = 1234_5678_9012_3456;
int socialSecurityNumber = 123_45_6789;
In
C++14
C++14 is a version of the ISO/IEC 14882 standard for the C++ programming language. It is intended to be a small extension over C++11, featuring mainly bug fixes and small improvements, and was replaced by C++17. Its approval was announced on August ...
(2014) and the next version of
C ,
C23, the apostrophe character may be used to separate digits arbitrarily in numeric literals. The underscore was initially proposed, with an initial proposal in 1993, and again for
C++11
C++11 is a version of the ISO/IEC 14882 standard for the C++ programming language. C++11 replaced the prior version of the C++ standard, called C++03, and was later replaced by C++14. The name follows the tradition of naming language versions by ...
, following other languages. However, this caused conflict with
user-defined literals, so the apostrophe was proposed instead, as an "
upper comma" (which is used in some other contexts).
auto integer_literal = 1'000'000;
auto binary_literal = 0b0100'1100'0110;
auto very_long_binary_literal =
0b0000'0001'0010'0011''0100'0101'0110'0111;
Notes
References
{{reflist
Literal
Source code