arbitrary-precision arithmetic
   HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
, arbitrary-precision arithmetic, also called bignum arithmetic, multiple-precision arithmetic, or sometimes infinite-precision arithmetic, indicates that
calculation A calculation is a deliberate mathematical process that transforms a plurality of inputs into a singular or plurality of outputs, known also as a result or results. The term is used in a variety of senses, from the very definite arithmetical ...
s are performed on numbers whose digits of precision are potentially limited only by the available
memory Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembe ...
of the host system. This contrasts with the faster fixed-precision arithmetic found in most
arithmetic logic unit In computing, an arithmetic logic unit (ALU) is a Combinational logic, combinational digital circuit that performs arithmetic and bitwise operations on integer binary numbers. This is in contrast to a floating-point unit (FPU), which operates on ...
(ALU) hardware, which typically offers between 8 and 64 bits of precision. Several modern
programming language A programming language is a system of notation for writing computer programs. Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...
s have built-in support for bignums, and others have libraries available for arbitrary-precision
integer An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...
and
floating-point In computing, floating-point arithmetic (FP) is arithmetic on subsets of real numbers formed by a ''significand'' (a Sign (mathematics), signed sequence of a fixed number of digits in some Radix, base) multiplied by an integer power of that ba ...
math. Rather than storing values as a fixed number of bits related to the size of the
processor register A processor register is a quickly accessible location available to a computer's processor. Registers usually consist of a small amount of fast storage, although some registers have specific hardware functions, and may be read-only or write-onl ...
, these implementations typically use variable-length
arrays An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
of digits. Arbitrary precision is used in applications where the speed of
arithmetic Arithmetic is an elementary branch of mathematics that deals with numerical operations like addition, subtraction, multiplication, and division. In a wider sense, it also includes exponentiation, extraction of roots, and taking logarithms. ...
is not a limiting factor, or where precise results with very large numbers are required. It should not be confused with the
symbolic computation In mathematics and computer science, computer algebra, also called symbolic computation or algebraic computation, is a scientific area that refers to the study and development of algorithms and software for manipulating mathematical expressions ...
provided by many
computer algebra system A computer algebra system (CAS) or symbolic algebra system (SAS) is any mathematical software with the ability to manipulate mathematical expressions in a way similar to the traditional manual computations of mathematicians and scientists. The de ...
s, which represent numbers by expressions such as , and can thus ''represent'' any computable number with infinite precision.


Applications

A common application is
public-key cryptography Public-key cryptography, or asymmetric cryptography, is the field of cryptographic systems that use pairs of related keys. Each key pair consists of a public key and a corresponding private key. Key pairs are generated with cryptographic alg ...
, whose algorithms commonly employ arithmetic with integers having hundreds of digits. Another is in situations where artificial limits and overflows would be inappropriate. It is also useful for checking the results of fixed-precision calculations, and for determining optimal or near-optimal values for coefficients needed in formulae, for example the \sqrt that appears in Gaussian integration. Arbitrary precision arithmetic is also used to compute fundamental
mathematical constant A mathematical constant is a number whose value is fixed by an unambiguous definition, often referred to by a special symbol (e.g., an Letter (alphabet), alphabet letter), or by mathematicians' names to facilitate using it across multiple mathem ...
s such as π to millions or more digits and to analyze the properties of the digit strings A quote example from this article: "Such an extreme pattern is dangerous even if diluted by one of its neighbouring blocks"; this was the occurrence of the sequence 77 twenty-eight times in one block of a thousand digits. or more generally to investigate the precise behaviour of functions such as the Riemann zeta function where certain questions are difficult to explore via analytical methods. Another example is in rendering
fractal In mathematics, a fractal is a Shape, geometric shape containing detailed structure at arbitrarily small scales, usually having a fractal dimension strictly exceeding the topological dimension. Many fractals appear similar at various scale ...
images with an extremely high magnification, such as those found in the
Mandelbrot set The Mandelbrot set () is a two-dimensional set (mathematics), set that is defined in the complex plane as the complex numbers c for which the function f_c(z)=z^2+c does not Stability theory, diverge to infinity when Iteration, iterated starting ...
. Arbitrary-precision arithmetic can also be used to avoid overflow, which is an inherent limitation of fixed-precision arithmetic. Similar to an automobile's
odometer An odometer or odograph is an instrument used for measuring the distance traveled by a vehicle, such as a bicycle or car. The device may be electronic, mechanical, or a combination of the two (electromechanical). The noun derives from ancient Gr ...
display which may change from 99999 to 00000, a fixed-precision integer may exhibit '' wraparound'' if numbers grow too large to represent at the fixed level of precision. Some processors can instead deal with overflow by '' saturation,'' which means that if a result would be unrepresentable, it is replaced with the nearest representable value. (With 16-bit unsigned saturation, adding any positive amount to 65535 would yield 65535.) Some processors can generate an exception if an arithmetic result exceeds the available precision. Where necessary, the exception can be caught and recovered from—for instance, the operation could be restarted in software using arbitrary-precision arithmetic. In many cases, the task or the programmer can guarantee that the integer values in a specific application will not grow large enough to cause an overflow. Such guarantees may be based on pragmatic limits: a school attendance program may have a task limit of 4,000 students. A programmer may design the computation so that intermediate results stay within specified precision boundaries. Some programming languages such as
Lisp Lisp (historically LISP, an abbreviation of "list processing") is a family of programming languages with a long history and a distinctive, fully parenthesized Polish notation#Explanation, prefix notation. Originally specified in the late 1950s, ...
, Python,
Perl Perl is a high-level, general-purpose, interpreted, dynamic programming language. Though Perl is not officially an acronym, there are various backronyms in use, including "Practical Extraction and Reporting Language". Perl was developed ...
,
Haskell Haskell () is a general-purpose, statically typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell pioneered several programming language ...
,
Ruby Ruby is a pinkish-red-to-blood-red-colored gemstone, a variety of the mineral corundum ( aluminium oxide). Ruby is one of the most popular traditional jewelry gems and is very durable. Other varieties of gem-quality corundum are called sapph ...
and Raku use, or have an option to use, arbitrary-precision numbers for ''all'' integer arithmetic. Although this reduces performance, it eliminates the possibility of incorrect results (or exceptions) due to simple overflow. It also makes it possible to guarantee that arithmetic results will be the same on all machines, regardless of any particular machine's word size. The exclusive use of arbitrary-precision numbers in a programming language also simplifies the language, because ''a number is a number'' and there is no need for multiple types to represent different levels of precision.


Implementation issues

Arbitrary-precision arithmetic is considerably slower than arithmetic using numbers that fit entirely within processor registers, since the latter are usually implemented in hardware arithmetic whereas the former must be implemented in software. Even if the
computer A computer is a machine that can be Computer programming, programmed to automatically Execution (computing), carry out sequences of arithmetic or logical operations (''computation''). Modern digital electronic computers can perform generic set ...
lacks hardware for certain operations (such as integer division, or all floating-point operations) and software is provided instead, it will use number sizes closely related to the available hardware registers: one or two words only. There are exceptions, as certain '' variable word length'' machines of the 1950s and 1960s, notably the
IBM 1620 The IBM 1620 was a model of scientific minicomputer produced by IBM. It was announced on October 21, 1959, and was then marketed as an inexpensive scientific computer. After a total production of about two thousand machines, it was withdrawn on N ...
,
IBM 1401 The IBM 1401 is a variable word length computer, variable-wordlength decimal computer that was announced by IBM on October 5, 1959. The first member of the highly successful IBM 1400 series, it was aimed at replacing unit record equipment for pr ...
and the Honeywell 200 series, could manipulate numbers bound only by available storage, with an extra bit that delimited the value. Numbers can be stored in a fixed-point format, or in a
floating-point In computing, floating-point arithmetic (FP) is arithmetic on subsets of real numbers formed by a ''significand'' (a Sign (mathematics), signed sequence of a fixed number of digits in some Radix, base) multiplied by an integer power of that ba ...
format as a significand multiplied by an arbitrary exponent. However, since division almost immediately introduces infinitely repeating sequences of digits (such as 4/7 in decimal, or 1/10 in binary), should this possibility arise then either the representation would be truncated at some satisfactory size or else rational numbers would be used: a large integer for the numerator and for the denominator. But even with the
greatest common divisor In mathematics, the greatest common divisor (GCD), also known as greatest common factor (GCF), of two or more integers, which are not all zero, is the largest positive integer that divides each of the integers. For two integers , , the greatest co ...
divided out, arithmetic with rational numbers can become unwieldy very quickly: 1/99 − 1/100 = 1/9900, and if 1/101 is then added, the result is 10001/999900. The size of arbitrary-precision numbers is limited in practice by the total storage available, and computation time. Numerous
algorithms In mathematics and computer science, an algorithm () is a finite sequence of mathematically rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for per ...
have been developed to efficiently perform arithmetic operations on numbers stored with arbitrary precision. In particular, supposing that digits are employed, algorithms have been designed to minimize the asymptotic
complexity Complexity characterizes the behavior of a system or model whose components interact in multiple ways and follow local rules, leading to non-linearity, randomness, collective dynamics, hierarchy, and emergence. The term is generally used to c ...
for large . The simplest algorithms are for
addition Addition (usually signified by the Plus and minus signs#Plus sign, plus symbol, +) is one of the four basic Operation (mathematics), operations of arithmetic, the other three being subtraction, multiplication, and Division (mathematics), divis ...
and
subtraction Subtraction (which is signified by the minus sign, –) is one of the four Arithmetic#Arithmetic operations, arithmetic operations along with addition, multiplication and Division (mathematics), division. Subtraction is an operation that repre ...
, where one simply adds or subtracts the digits in sequence, carrying as necessary, which yields an algorithm (see
big O notation Big ''O'' notation is a mathematical notation that describes the asymptotic analysis, limiting behavior of a function (mathematics), function when the Argument of a function, argument tends towards a particular value or infinity. Big O is a memb ...
).
Comparison Comparison or comparing is the act of evaluating two or more things by determining the relevant, comparable characteristics of each thing, and then determining which characteristics of each are similar to the other, which are different, and t ...
is also very simple. Compare the high-order digits (or machine words) until a difference is found. Comparing the rest of the digits/words is not necessary. The worst case is , but it may complete much faster with operands of similar magnitude. For
multiplication Multiplication is one of the four elementary mathematical operations of arithmetic, with the other ones being addition, subtraction, and division (mathematics), division. The result of a multiplication operation is called a ''Product (mathem ...
, the most straightforward algorithms used for multiplying numbers by hand (as taught in primary school) require operations, but multiplication algorithms that achieve complexity have been devised, such as the Schönhage–Strassen algorithm, based on
fast Fourier transform A fast Fourier transform (FFT) is an algorithm that computes the discrete Fourier transform (DFT) of a sequence, or its inverse (IDFT). A Fourier transform converts a signal from its original domain (often time or space) to a representation in ...
s, and there are also algorithms with slightly worse complexity but with sometimes superior real-world performance for smaller . The Karatsuba multiplication is such an algorithm. For division, see division algorithm. For a list of algorithms along with complexity estimates, see computational complexity of mathematical operations. For examples in
x86 x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel, based on the 8086 microprocessor and its 8-bit-external-bus variant, the 8088. Th ...
assembly, see
external links An internal link is a type of hyperlink on a web page to another page or resource, such as an image or document, on the same website or domain. It is the opposite of an external link, a link that directs a user to content that is outside its d ...
.


Pre-set precision

In some languages such as
REXX Rexx (restructured extended executor) is a high-level programming language developed at IBM by Mike Cowlishaw. Both proprietary and open-source software, open source Rexx interpreter (computing), interpreters exist for a wide range of comput ...
and
ooRexx Object REXX is a High-level programming language, high-level, General-purpose programming language, general-purpose, Interpreter (computing), interpreted, Object-oriented programming, object-oriented (Class-based programming, class-based) program ...
, the precision of all calculations must be set before doing a calculation. Other languages, such as Python and
Ruby Ruby is a pinkish-red-to-blood-red-colored gemstone, a variety of the mineral corundum ( aluminium oxide). Ruby is one of the most popular traditional jewelry gems and is very durable. Other varieties of gem-quality corundum are called sapph ...
, extend the precision automatically to prevent overflow.


Example

The calculation of
factorial In mathematics, the factorial of a non-negative denoted is the Product (mathematics), product of all positive integers less than or equal The factorial also equals the product of n with the next smaller factorial: \begin n! &= n \times ...
s can easily produce very large numbers. This is not a problem for their usage in many formulas (such as
Taylor series In mathematics, the Taylor series or Taylor expansion of a function is an infinite sum of terms that are expressed in terms of the function's derivatives at a single point. For most common functions, the function and the sum of its Taylor ser ...
) because they appear along with other terms, so that—given careful attention to the order of evaluation—intermediate calculation values are not troublesome. If approximate values of factorial numbers are desired, Stirling's approximation gives good results using floating-point arithmetic. The largest representable value for a fixed-size integer variable may be exceeded even for relatively small arguments as shown in the table below. Even floating-point numbers are soon outranged, so it may help to recast the calculations in terms of the
logarithm In mathematics, the logarithm of a number is the exponent by which another fixed value, the base, must be raised to produce that number. For example, the logarithm of to base is , because is to the rd power: . More generally, if , the ...
of the number. But if exact values for large factorials are desired, then special software is required, as in the pseudocode that follows, which implements the classic algorithm to calculate 1, 1×2, 1×2×3, 1×2×3×4, etc. the successive factorial numbers. constants: Limit = 1000 ''% Sufficient digits.'' Base = 10 ''% The base of the simulated arithmetic.'' FactorialLimit = 365 ''% Target number to solve, 365!'' tdigit: Array :9of character = 0","1","2","3","4","5","6","7","8","9" variables: digit: Array :Limitof 0..9 ''% The big number.'' carry, d: Integer ''% Assistants during multiplication.'' last: Integer ''% Index into the big number's digits.'' text: Array :Limitof character ''% Scratchpad for the output.'' digit := 0 ''% Clear the whole array.'' last := 1 ''% The big number starts as a single-digit,'' digit := 1 ''% its only digit is 1.'' for n := 1 to FactorialLimit: ''% Step through producing 1!, 2!, 3!, 4!, etc. '' carry := 0 ''% Start a multiply by n.'' for i := 1 to last: ''% Step along every digit.'' d := digit * n + carry ''% Multiply a single digit.'' digit := d mod Base ''% Keep the low-order digit of the result.'' carry := d div Base ''% Carry over to the next digit.'' while carry > 0: ''% Store the remaining carry in the big number.'' if last >= Limit: error("overflow") last := last + 1 ''% One more digit.'' digit ast:= carry mod Base carry := carry div Base ''% Strip the last digit off the carry.'' text := " " ''% Now prepare the output.'' for i := 1 to last: ''% Translate from binary to text.'' text imit - i + 1:= tdigit igit[i ''% Reversing the order.'' print text[Limit - last + 1:Limit">.html" ;"title="igit[i">igit[i ''% Reversing the order.'' print text[Limit - last + 1:Limit " = ", n, "!" With the example in view, a number of details can be discussed. The most important is the choice of the representation of the big number. In this case, only integer values are required for digits, so an array of fixed-width integers is adequate. It is convenient to have successive elements of the array represent higher powers of the base. The second most important decision is in the choice of the base of arithmetic, here ten. There are many considerations. The scratchpad variable must be able to hold the result of a single-digit multiply ''plus the carry'' from the prior digit's multiply. In base ten, a sixteen-bit integer is certainly adequate as it allows up to 32767. However, this example cheats, in that the value of is not itself limited to a single digit. This has the consequence that the method will fail for or so. In a more general implementation, would also use a multi-digit representation. A second consequence of the shortcut is that after the multi-digit multiply has been completed, the last value of ''carry'' may need to be carried into multiple higher-order digits, not just one. There is also the issue of printing the result in base ten, for human consideration. Because the base is already ten, the result could be shown simply by printing the successive digits of array ''digit'', but they would appear with the highest-order digit last (so that 123 would appear as "321"). The whole array could be printed in reverse order, but that would present the number with leading zeroes ("00000...000123") which may not be appreciated, so this implementation builds the representation in a space-padded text variable and then prints that. The first few results (with spacing every fifth digit and annotation added here) are: This implementation could make more effective use of the computer's built in arithmetic. A simple escalation would be to use base 100 (with corresponding changes to the translation process for output), or, with sufficiently wide computer variables (such as 32-bit integers) we could use larger bases, such as 10,000. Working in a power-of-2 base closer to the computer's built-in integer operations offers advantages, although conversion to a decimal base for output becomes more difficult. On typical modern computers, additions and multiplications take constant time independent of the values of the operands (so long as the operands fit in single machine words), so there are large gains in packing as much of a bignumber as possible into each element of the digit array. The computer may also offer facilities for splitting a product into a digit and carry without requiring the two operations of ''mod'' and ''div'' as in the example, and nearly all arithmetic units provide a ''carry flag'' which can be exploited in multiple-precision addition and subtraction. This sort of detail is the grist of machine-code programmers, and a suitable assembly-language bignumber routine can run faster than the result of the compilation of a high-level language, which does not provide direct access to such facilities but instead maps the high-level statements to its model of the target machine using an optimizing compiler. For a single-digit multiply the working variables must be able to hold the value (base−1) + carry, where the maximum value of the carry is (base−1). Similarly, the variables used to index the digit array are themselves limited in width. A simple way to extend the indices would be to deal with the bignumber's digits in blocks of some convenient size so that the addressing would be via (block ''i'', digit ''j'') where ''i'' and ''j'' would be small integers, or, one could escalate to employing bignumber techniques for the indexing variables. Ultimately, machine storage capacity and execution time impose limits on the problem size.


History

IBM's first business computer, the IBM 702 (a vacuum-tube machine) of the mid-1950s, implemented integer arithmetic ''entirely in hardware'' on digit strings of any length from 1 to 511 digits. The earliest widespread software implementation of arbitrary-precision arithmetic was probably that in
Maclisp Maclisp (or MACLISP, sometimes styled MacLisp or MacLISP) is a programming language, a dialect of the language Lisp. It originated at the Massachusetts Institute of Technology's (MIT) Project MAC (from which it derived its prefix) in the late 19 ...
. Later, around 1980, the
operating system An operating system (OS) is system software that manages computer hardware and software resources, and provides common daemon (computing), services for computer programs. Time-sharing operating systems scheduler (computing), schedule tasks for ...
s VAX/VMS and VM/CMS offered bignum facilities as a collection of string functions in the one case and in the languages EXEC 2 and
REXX Rexx (restructured extended executor) is a high-level programming language developed at IBM by Mike Cowlishaw. Both proprietary and open-source software, open source Rexx interpreter (computing), interpreters exist for a wide range of comput ...
in the other. An early widespread implementation was available via the
IBM 1620 The IBM 1620 was a model of scientific minicomputer produced by IBM. It was announced on October 21, 1959, and was then marketed as an inexpensive scientific computer. After a total production of about two thousand machines, it was withdrawn on N ...
of 1959–1970. The 1620 was a decimal-digit machine which used discrete transistors, yet it had hardware (that used
lookup table In computer science, a lookup table (LUT) is an array data structure, array that replaces runtime (program lifecycle phase), runtime computation of a mathematical function (mathematics), function with a simpler array indexing operation, in a proc ...
s) to perform integer arithmetic on digit strings of a length that could be from two to whatever memory was available. For floating-point arithmetic, the mantissa was restricted to a hundred digits or fewer, and the exponent was restricted to two digits only. The largest memory supplied offered 60 000 digits, however Fortran compilers for the 1620 settled on fixed sizes such as 10, though it could be specified on a control card if the default was not satisfactory.


Software libraries

Arbitrary-precision arithmetic in most computer software is implemented by calling an external
library A library is a collection of Book, books, and possibly other Document, materials and Media (communication), media, that is accessible for use by its members and members of allied institutions. Libraries provide physical (hard copies) or electron ...
that provides
data type In computer science and computer programming, a data type (or simply type) is a collection or grouping of data values, usually specified by a set of possible values, a set of allowed operations on these values, and/or a representation of these ...
s and
subroutine In computer programming, a function (also procedure, method, subroutine, routine, or subprogram) is a callable unit of software logic that has a well-defined interface and behavior and can be invoked multiple times. Callable units provide a ...
s to store numbers with the requested precision and to perform computations. Different libraries have different ways of representing arbitrary-precision numbers, some libraries work only with integer numbers, others store floating point numbers in a variety of bases (decimal or binary powers). Rather than representing a number as single value, some store numbers as a numerator/denominator pair ( rationals) and some can fully represent computable numbers, though only up to some storage limit. Fundamentally,
Turing machine A Turing machine is a mathematical model of computation describing an abstract machine that manipulates symbols on a strip of tape according to a table of rules. Despite the model's simplicity, it is capable of implementing any computer algori ...
s cannot represent all
real number In mathematics, a real number is a number that can be used to measure a continuous one- dimensional quantity such as a duration or temperature. Here, ''continuous'' means that pairs of values can have arbitrarily small differences. Every re ...
s, as the
cardinality The thumb is the first digit of the hand, next to the index finger. When a person is standing in the medical anatomical position (where the palm is facing to the front), the thumb is the outermost digit. The Medical Latin English noun for thum ...
of \mathbb exceeds the cardinality of \mathbb.


See also

* Fürer's algorithm *
Karatsuba algorithm The Karatsuba algorithm is a fast multiplication algorithm for integers. It was discovered by Anatoly Karatsuba in 1960 and published in 1962. Knuth D.E. (1969) '' The Art of Computer Programming. v.2.'' Addison-Wesley Publ.Co., 724 pp ...
* Mixed-precision arithmetic * Schönhage–Strassen algorithm * Toom–Cook multiplication * Little Endian Base 128


References


Further reading

* , Section 4.3.1: The Classical Algorithms * *, Chapter 9: Fast Algorithms for Large-Integer Arithmetic


External links


Chapter 9.3 of ''The Art of Assembly''
by Randall Hyde discusses multiprecision arithmetic, with examples in
x86 x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel, based on the 8086 microprocessor and its 8-bit-external-bus variant, the 8088. Th ...
-assembly. * Rosetta Code tas
Arbitrary-precision integers
Case studies in the style in which over 95 programming languages compute the value of 5**4**3**2 using arbitrary precision arithmetic. {{data types Computer arithmetic Computer arithmetic algorithms Management cybernetics