In
computer science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includi ...
, arbitrary-precision arithmetic, also called bignum arithmetic, multiple-precision arithmetic, or sometimes infinite-precision arithmetic, indicates that
calculations are performed on numbers whose
digits of
precision
Precision, precise or precisely may refer to:
Science, and technology, and mathematics Mathematics and computing (general)
* Accuracy and precision, measurement deviation from true value and its scatter
* Significant figures, the number of digit ...
are limited only by the available
memory
Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembered ...
of the host system. This contrasts with the faster
fixed-precision arithmetic
In computing, fixed-point is a method of representing fractional (non-integer) numbers by storing a fixed number of digits of their fractional part. Dollar amounts, for example, are often stored with exactly two fractional digits, representi ...
found in most
arithmetic logic unit (ALU) hardware, which typically offers between 8 and 64
bit
The bit is the most basic unit of information in computing and digital communications. The name is a portmanteau of binary digit. The bit represents a logical state with one of two possible values. These values are most commonly represente ...
s of precision.
Several modern
programming language
A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language.
The description of a programming ...
s have built-in support for bignums, and others have libraries available for arbitrary-precision
integer
An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the languag ...
and
floating-point math. Rather than storing values as a fixed number of bits related to the size of the
processor register, these implementations typically use variable-length
arrays
An array is a systematic arrangement of similar objects, usually in rows and columns.
Things called an array include:
{{TOC right
Music
* In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
of digits.
Arbitrary precision is used in applications where the speed of
arithmetic is not a limiting factor, or where
precise results with very large numbers are required. It should not be confused with the
symbolic computation provided by many
computer algebra system
A computer algebra system (CAS) or symbolic algebra system (SAS) is any mathematical software with the ability to manipulate mathematical expressions in a way similar to the traditional manual computations of mathematicians and scientists. The d ...
s, which represent numbers by expressions such as , and can thus ''represent'' any
computable number
In mathematics, computable numbers are the real numbers that can be computed to within any desired precision by a finite, terminating algorithm. They are also known as the recursive numbers, effective numbers or the computable reals or recursive ...
with infinite precision.
Applications
A common application is
public-key cryptography
Public-key cryptography, or asymmetric cryptography, is the field of cryptographic systems that use pairs of related keys. Each key pair consists of a public key and a corresponding private key. Key pairs are generated with cryptographic alg ...
, whose algorithms commonly employ arithmetic with integers having hundreds of digits. Another is in situations where artificial limits and
overflows would be inappropriate. It is also useful for checking the results of fixed-precision calculations, and for determining optimal or near-optimal values for coefficients needed in formulae, for example the
that appears in
Gaussian integration
In numerical analysis, a quadrature rule is an approximation of the definite integral of a function, usually stated as a weighted sum of function values at specified points within the domain of integration. (See numerical integration for mor ...
.
Arbitrary precision arithmetic is also used to compute fundamental
mathematical constants such as
π to millions or more digits and to analyze the properties of the digit strings
[ A quote example from this article: "Such an extreme pattern is dangerous even if diluted by one of its neighbouring blocks"; this was the occurrence of the sequence 77 twenty-eight times in one block of a thousand digits.] or more generally to investigate the precise behaviour of functions such as the
Riemann zeta function where certain questions are difficult to explore via analytical methods. Another example is in rendering
fractal images with an extremely high magnification, such as those found in the
Mandelbrot set
The Mandelbrot set () is the set of complex numbers c for which the function f_c(z)=z^2+c does not diverge to infinity when iterated from z=0, i.e., for which the sequence f_c(0), f_c(f_c(0)), etc., remains bounded in absolute value.
This ...
.
Arbitrary-precision arithmetic can also be used to avoid
overflow, which is an inherent limitation of fixed-precision arithmetic. Similar to a 5-digit
odometer's display which changes from 99999 to 00000, a fixed-precision integer may exhibit ''
wraparound
Wraparound, wrap around, or wrap-around is anything that wraps around something.
It may more specifically refer to:
Apparel
* Wraparound sunglasses or goggles
* Wraparound baby sling, or wrap, a piece of cloth that supports a baby
* Wraparound ...
'' if numbers grow too large to represent at the fixed level of precision. Some processors can instead deal with overflow by ''
saturation
Saturation, saturated, unsaturation or unsaturated may refer to:
Chemistry
* Saturation, a property of organic compounds referring to carbon-carbon bonds
**Saturated and unsaturated compounds
** Degree of unsaturation
**Saturated fat or fatty aci ...
,'' which means that if a result would be unrepresentable, it is replaced with the nearest representable value. (With 16-bit unsigned saturation, adding any positive amount to 65535 would yield 65535.) Some processors can generate an
exception if an arithmetic result exceeds the available precision. Where necessary, the exception can be caught and recovered from—for instance, the operation could be restarted in software using arbitrary-precision arithmetic.
In many cases, the task or the programmer can guarantee that the integer values in a specific application will not grow large enough to cause an overflow. Such guarantees may be based on pragmatic limits: a school attendance program may have a task limit of 4,000 students. A programmer may design the computation so that intermediate results stay within specified precision boundaries.
Some programming languages such as
Lisp,
Python
Python may refer to:
Snakes
* Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia
** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia
* Python (mythology), a mythical serpent
Computing
* Python (pro ...
,
Perl
Perl is a family of two high-level, general-purpose, interpreted, dynamic programming languages. "Perl" refers to Perl 5, but from 2000 to 2019 it also referred to its redesigned "sister language", Perl 6, before the latter's name was offic ...
,
Haskell
Haskell () is a general-purpose, statically-typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research and industrial applications, Haskell has pioneered a number of programming lan ...
,
Ruby
A ruby is a pinkish red to blood-red colored gemstone, a variety of the mineral corundum ( aluminium oxide). Ruby is one of the most popular traditional jewelry gems and is very durable. Other varieties of gem-quality corundum are called ...
and
Raku use, or have an option to use, arbitrary-precision numbers for ''all'' integer arithmetic. Although this reduces performance, it eliminates the possibility of incorrect results (or exceptions) due to simple overflow. It also makes it possible to guarantee that arithmetic results will be the same on all machines, regardless of any particular machine's
word size
In computing, a word is the natural unit of data used by a particular processor design. A word is a fixed-sized datum handled as a unit by the instruction set or the hardware of the processor. The number of bits or digits in a word (the ''word s ...
. The exclusive use of arbitrary-precision numbers in a programming language also simplifies the language, because ''a number is a number'' and there is no need for multiple types to represent different levels of precision.
Implementation issues
Arbitrary-precision arithmetic is considerably slower than arithmetic using numbers that fit entirely within processor registers, since the latter are usually implemented in
hardware arithmetic whereas the former must be implemented in software. Even if the
computer lacks hardware for certain operations (such as integer division, or all floating-point operations) and software is provided instead, it will use number sizes closely related to the available hardware registers: one or two words only and definitely not N words. There are exceptions, as certain ''
variable word length'' machines of the 1950s and 1960s, notably the
IBM 1620
The IBM 1620 was announced by IBM on October 21, 1959, and marketed as an inexpensive scientific computer. After a total production of about two thousand machines, it was withdrawn on November 19, 1970. Modified versions of the 1620 were used as ...
,
IBM 1401
The IBM 1401 is a variable-wordlength decimal computer that was announced by IBM on October 5, 1959. The first member of the highly successful IBM 1400 series, it was aimed at replacing unit record equipment for processing data stored on pu ...
and the Honeywell ''Liberator'' series, could manipulate numbers bound only by available storage, with an extra bit that delimited the value.
Numbers can be stored in a
fixed-point format, or in a
floating-point format as a
significand multiplied by an arbitrary exponent. However, since division almost immediately introduces infinitely repeating sequences of digits (such as 4/7 in decimal, or 1/10 in binary), should this possibility arise then either the representation would be truncated at some satisfactory size or else rational numbers would be used: a large integer for the
numerator
A fraction (from la, fractus, "broken") represents a part of a whole or, more generally, any number of equal parts. When spoken in everyday English, a fraction describes how many parts of a certain size there are, for example, one-half, eight ...
and for the
denominator
A fraction (from la, fractus, "broken") represents a part of a whole or, more generally, any number of equal parts. When spoken in everyday English, a fraction describes how many parts of a certain size there are, for example, one-half, eight ...
. But even with the
greatest common divisor
In mathematics, the greatest common divisor (GCD) of two or more integers, which are not all zero, is the largest positive integer that divides each of the integers. For two integers ''x'', ''y'', the greatest common divisor of ''x'' and ''y'' is ...
divided out, arithmetic with rational numbers can become unwieldy very quickly: 1/99 − 1/100 = 1/9900, and if 1/101 is then added, the result is 10001/999900.
The size of arbitrary-precision numbers is limited in practice by the total storage available, and computation time.
Numerous
algorithms
In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...
have been developed to efficiently perform arithmetic operations on numbers stored with arbitrary precision. In particular, supposing that digits are employed, algorithms have been designed to minimize the asymptotic
complexity for large .
The simplest algorithms are for
addition and
subtraction, where one simply adds or subtracts the digits in sequence, carrying as necessary, which yields an algorithm (see
big O notation).
Comparison
Comparison or comparing is the act of evaluating two or more things by determining the relevant, comparable characteristics of each thing, and then determining which characteristics of each are similar to the other, which are different, and t ...
is also very simple. Compare the high-order digits (or machine words) until a difference is found. Comparing the rest of the digits/words is not necessary. The worst case is , but usually it will go much faster.
For
multiplication, the most straightforward algorithms used for multiplying numbers by hand (as taught in primary school) require operations, but
multiplication algorithm
A multiplication algorithm is an algorithm (or method) to multiply two numbers. Depending on the size of the numbers, different algorithms are more efficient than others. Efficient multiplication algorithms have existed since the advent of the d ...
s that achieve complexity have been devised, such as the
Schönhage–Strassen algorithm
The Schönhage–Strassen algorithm is an asymptotically fast multiplication algorithm for large integers. It was developed by Arnold Schönhage and Volker Strassen in 1971.A. Schönhage and V. Strassen,Schnelle Multiplikation großer Zahlen, '' ...
, based on
fast Fourier transforms, and there are also algorithms with slightly worse complexity but with sometimes superior real-world performance for smaller . The
Karatsuba multiplication is such an algorithm.
For
division
Division or divider may refer to:
Mathematics
*Division (mathematics), the inverse of multiplication
*Division algorithm, a method for computing the result of mathematical division
Military
*Division (military), a formation typically consisting ...
, see
division algorithm
A division algorithm is an algorithm which, given two integers N and D, computes their quotient and/or remainder, the result of Euclidean division. Some are applied by hand, while others are employed by digital circuit designs and software.
Div ...
.
For a list of algorithms along with complexity estimates, see
computational complexity of mathematical operations
The following tables list the computational complexity of various algorithms for common mathematical operations.
Here, complexity refers to the time complexity of performing computations on a multitape Turing machine. See big O notation for an e ...
.
For examples in
x86
x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was intr ...
assembly, see
external links
An internal link is a type of hyperlink on a web page to another page or resource, such as an image or document, on the same website or domain.
Hyperlinks are considered either "external" or "internal" depending on their target or destinatio ...
.
Pre-set precision
In some languages such as
REXX, the precision of all calculations must be set before doing a calculation. Other languages, such as
Python
Python may refer to:
Snakes
* Pythonidae, a family of nonvenomous snakes found in Africa, Asia, and Australia
** ''Python'' (genus), a genus of Pythonidae found in Africa and Asia
* Python (mythology), a mythical serpent
Computing
* Python (pro ...
and
Ruby
A ruby is a pinkish red to blood-red colored gemstone, a variety of the mineral corundum ( aluminium oxide). Ruby is one of the most popular traditional jewelry gems and is very durable. Other varieties of gem-quality corundum are called ...
, extend the precision automatically to prevent overflow.
Example
The calculation of
factorials can easily produce very large numbers. This is not a problem for their usage in many formulas (such as
Taylor series
In mathematics, the Taylor series or Taylor expansion of a function is an infinite sum of terms that are expressed in terms of the function's derivatives at a single point. For most common functions, the function and the sum of its Taylor ser ...
) because they appear along with other terms, so that—given careful attention to the order of evaluation—intermediate calculation values are not troublesome. If approximate values of factorial numbers are desired,
Stirling's approximation
In mathematics, Stirling's approximation (or Stirling's formula) is an approximation for factorials. It is a good approximation, leading to accurate results even for small values of n. It is named after James Stirling, though a related but less p ...
gives good results using floating-point arithmetic. The largest representable value for a fixed-size integer variable may be exceeded even for relatively small arguments as shown in the table below. Even floating-point numbers are soon outranged, so it may help to recast the calculations in terms of the
logarithm
In mathematics, the logarithm is the inverse function to exponentiation. That means the logarithm of a number to the base is the exponent to which must be raised, to produce . For example, since , the ''logarithm base'' 10 of ...
of the number.
But if exact values for large factorials are desired, then special software is required, as in the pseudocode that follows, which implements the classic algorithm to calculate 1, 1×2, 1×2×3, 1×2×3×4, etc. the successive factorial numbers.
Constant Limit = 1000; ''% Sufficient digits.''
Constant Base = 10; ''% The base of the simulated arithmetic.''
Constant FactorialLimit = 365; ''% Target number to solve, 365!''
Array digit
:Limitof integer; ''% The big number.''
Integer carry,d; ''% Assistants during multiplication.''
Integer last,i; ''% Indices to the big number's digits.''
Array text
:Limitof character; ''% Scratchpad for the output.''
Constant tdigit
:9of character =
0","1","2","3","4","5","6","7","8","9"
BEGIN
digit:=0; ''% Clear the whole array.''
digit
=1; ''% The big number starts with 1,''
last:=1; ''% Its highest-order digit is number 1.''
for n:=1 to FactorialLimit do ''% Step through producing 1!, 2!, 3!, 4!, etc. ''
carry:=0; ''% Start a multiply by n.''
for i:=1 to last do ''% Step along every digit.''
d:=digit
n + carry; ''% The classic multiply.''
digit
=d mod Base; ''% The low-order digit of the result.''
carry:=d div Base; ''% The carry to the next digit.''
next i;
while carry > 0 ''% Store the carry in the big number. ''
if last >= Limit then croak("Overflow!"); ''% If possible!''
last:=last + 1; ''% One more digit.''
digit
ast=carry mod Base; ''% Placed.''
carry:=carry div Base; ''% The carry reduced.''
Wend ''% With n > Base, maybe > 1 digit extra.''
text:=" "; ''% Now prepare the output.''
for i:=1 to last do ''% Translate from binary to text.''
text
imit - i + 1=tdigit
igit[i; ''% Reversing the order.''
next i; ''% Arabic numerals put the low order last.''
Print text," = ",n,"!"; ''% Print the result!''
next n; ''% On to the next factorial up.''
END;
With the example in view, a number of details can be discussed. The most important is the choice of the representation of the big number. In this case, only integer values are required for digits, so an array of fixed-width integers is adequate. It is convenient to have successive elements of the array represent higher powers of the base.
The second most important decision is in the choice of the base of arithmetic, here ten. There are many considerations. The scratchpad variable must be able to hold the result of a single-digit multiply ''plus the carry'' from the prior digit's multiply. In base ten, a sixteen-bit integer is certainly adequate as it allows up to 32767. However, this example cheats, in that the value of is not itself limited to a single digit. This has the consequence that the method will fail for or so. In a more general implementation, would also use a multi-digit representation. A second consequence of the shortcut is that after the multi-digit multiply has been completed, the last value of ''carry'' may need to be carried into multiple higher-order digits, not just one.
There is also the issue of printing the result in base ten, for human consideration. Because the base is already ten, the result could be shown simply by printing the successive digits of array ''digit'', but they would appear with the highest-order digit last (so that 123 would appear as "321"). The whole array could be printed in reverse order, but that would present the number with leading zeroes ("00000...000123") which may not be appreciated, so this implementation builds the representation in a space-padded text variable and then prints that. The first few results (with spacing every fifth digit and annotation added here) are:
This implementation could make more effective use of the computer's built in arithmetic. A simple escalation would be to use base 100 (with corresponding changes to the translation process for output), or, with sufficiently wide computer variables (such as 32-bit integers) we could use larger bases, such as 10,000. Working in a power-of-2 base closer to the computer's built-in integer operations offers advantages, although conversion to a decimal base for output becomes more difficult. On typical modern computers, additions and multiplications take constant time independent of the values of the operands (so long as the operands fit in single machine words), so there are large gains in packing as much of a bignumber as possible into each element of the digit array. The computer may also offer facilities for splitting a product into a digit and carry without requiring the two operations of ''mod'' and ''div'' as in the example, and nearly all arithmetic units provide a ''carry flag'' which can be exploited in multiple-precision addition and subtraction. This sort of detail is the grist of machine-code programmers, and a suitable assembly-language bignumber routine can run much faster than the result of the compilation of a high-level language, which does not provide access to such facilities.
For a single-digit multiply the working variables must be able to hold the value (base−1) + carry, where the maximum value of the carry is (base−1). Similarly, the variables used to index the digit array are themselves limited in width. A simple way to extend the indices would be to deal with the bignumber's digits in blocks of some convenient size so that the addressing would be via (block ''i'', digit ''j'') where ''i'' and ''j'' would be small integers, or, one could escalate to employing bignumber techniques for the indexing variables. Ultimately, machine storage capacity and execution time impose limits on the problem size.
History
IBM's first business computer, the
IBM 702
The IBM 702 was an early generation tube-based digital computer produced by IBM in the early to mid-1950s. It was the company's response to Remington Rand's UNIVAC—the first mainframe computer to use magnetic tapes. As these machines ...
(a
vacuum-tube
A vacuum tube, electron tube, valve (British usage), or tube (North America), is a device that controls electric current flow in a high vacuum between electrodes to which an electric potential difference has been applied.
The type known as a ...
machine) of the mid-1950s, implemented integer arithmetic ''entirely in hardware'' on digit strings of any length from 1 to 511 digits. The earliest widespread software implementation of arbitrary-precision arithmetic was probably that in
Maclisp. Later, around 1980, the
operating system
An operating system (OS) is system software that manages computer hardware, software resources, and provides common services for computer programs.
Time-sharing operating systems schedule tasks for efficient use of the system and may also i ...
s
VAX/VMS
OpenVMS, often referred to as just VMS, is a multi-user, multiprocessing and virtual memory-based operating system. It is designed to support time-sharing, batch processing, transaction processing and workstation applications. Customers using Ope ...
and
VM/CMS
VM (often: VM/CMS) is a family of IBM virtual machine operating systems used on IBM mainframes System/370, System/390, zSeries, System z and compatible systems, including the Hercules emulator for personal computers.
The following versions ...
offered bignum facilities as a collection of
string functions in the one case and in the languages
EXEC 2
EXEC 2 is an interpreted, command procedure control, computer scripting language used by the EXEC 2 Processor originally supplied with the CMS component of the IBM Virtual Machine/System Product ( VM/SP) operating system.
Relation to EXEC
EXEC 2 ...
and
REXX in the other.
An early widespread implementation was available via the
IBM 1620
The IBM 1620 was announced by IBM on October 21, 1959, and marketed as an inexpensive scientific computer. After a total production of about two thousand machines, it was withdrawn on November 19, 1970. Modified versions of the 1620 were used as ...
of 1959–1970. The 1620 was a decimal-digit machine which used discrete transistors, yet it had hardware (that used
lookup tables) to perform integer arithmetic on digit strings of a length that could be from two to whatever memory was available. For floating-point arithmetic, the mantissa was restricted to a hundred digits or fewer, and the exponent was restricted to two digits only. The largest memory supplied offered 60 000 digits, however
Fortran compilers for the 1620 settled on fixed sizes such as 10, though it could be specified on a control card if the default was not satisfactory.
Software libraries
Arbitrary-precision arithmetic in most computer software is implemented by calling an external
library
A library is a collection of materials, books or media that are accessible for use and not just for display purposes. A library provides physical (hard copies) or digital access (soft copies) materials, and may be a physical location or a vir ...
that provides
data types and
subroutines to store numbers with the requested precision and to perform computations.
Different libraries have different ways of representing arbitrary-precision numbers, some libraries work only with integer numbers, others store
floating point
In computing, floating-point arithmetic (FP) is arithmetic that represents real numbers approximately, using an integer with a fixed precision, called the significand, scaled by an integer exponent of a fixed base. For example, 12.345 can b ...
numbers in a variety of bases (decimal or binary powers). Rather than representing a number as single value, some store numbers as a numerator/denominator pair (
rationals
In mathematics, a rational number is a number that can be expressed as the quotient or fraction of two integers, a numerator and a non-zero denominator . For example, is a rational number, as is every integer (e.g. ). The set of all rationa ...
) and some can fully represent
computable number
In mathematics, computable numbers are the real numbers that can be computed to within any desired precision by a finite, terminating algorithm. They are also known as the recursive numbers, effective numbers or the computable reals or recursive ...
s, though only up to some storage limit. Fundamentally,
Turing machine
A Turing machine is a mathematical model of computation describing an abstract machine that manipulates symbols on a strip of tape according to a table of rules. Despite the model's simplicity, it is capable of implementing any computer algori ...
s cannot represent all
real number
In mathematics, a real number is a number that can be used to measure a ''continuous'' one-dimensional quantity such as a distance, duration or temperature. Here, ''continuous'' means that values can have arbitrarily small variations. Every ...
s, as the
cardinality of
exceeds the cardinality of
.
See also
*
Fürer's algorithm
*
Karatsuba algorithm
The Karatsuba algorithm is a fast multiplication algorithm. It was discovered by Anatoly Karatsuba in 1960 and published in 1962.
Knuth D.E. (1969) ''The Art of Computer Programming. v.2.'' Addison-Wesley Publ.Co., 724 pp.
It is a div ...
*
Mixed-precision arithmetic
*
Schönhage–Strassen algorithm
The Schönhage–Strassen algorithm is an asymptotically fast multiplication algorithm for large integers. It was developed by Arnold Schönhage and Volker Strassen in 1971.A. Schönhage and V. Strassen,Schnelle Multiplikation großer Zahlen, '' ...
*
Toom–Cook multiplication Toom–Cook, sometimes known as Toom-3, named after Andrei Toom, who introduced the new algorithm with its low complexity, and Stephen Cook, who cleaned the description of it, is a multiplication algorithm for large integers.
Given two large integ ...
References
Further reading
* , Section 4.3.1: The Classical Algorithms
*
*, Chapter 9: Fast Algorithms for Large-Integer Arithmetic
External links
Chapter 9.3 of ''The Art of Assembly''by
Randall Hyde discusses multiprecision arithmetic, with examples in
x86
x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the Intel 8086 microprocessor and its 8088 variant. The 8086 was intr ...
-assembly.
* Rosetta Code tas
Arbitrary-precision integersCase studies in the style in which over 95 programming languages compute the value of 5**4**3**2 using arbitrary precision arithmetic.
{{data types
Computer arithmetic
Computer arithmetic algorithms