In mathematics, integer factorization is the decomposition of a positive integer into a product of integers. Every positive integer greater than 1 is either the product of two or more integer factors greater than 1, in which case it is a composite number, or it is not, in which case it is a prime number. For example, 15 is a composite number because 15 = 3 · 5, but 7 is a prime number because it cannot be decomposed in this way. If one of the factors is composite, it can in turn be written as a product of smaller factors, for example 60 = 3 · 20 = 3 · (5 · 4). Continuing this process until every factor is prime is called prime factorization; the result is always unique up to the order of the factors by the prime factorization theorem.
To factorize a small integer n using mental or pen-and-paper arithmetic, the simplest method is trial division: checking if the number is divisible by the prime numbers 2, 3, 5, and so on, up to the square root of n. For larger numbers, especially when using a computer, various more sophisticated factorization algorithms are more efficient. A prime factorization algorithm typically involves testing whether each factor is prime each time a factor is found.
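Trial division can be sketched in a few lines of Python. This is a minimal illustration, not an optimized routine; the function name `trial_division` is chosen here for the example. It tries 2 and then every odd candidate rather than only primes, which gives the same result because a composite candidate can never divide what is left once its prime factors have been removed.

```python
def trial_division(n):
    """Return the prime factors of n, with multiplicity, in non-decreasing order."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = []
    d = 2
    # If n = a * b with a <= b, then a <= sqrt(n), so candidates above sqrt(n) are unnecessary.
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1 if d == 2 else 2  # after 2, only odd candidates are tried
    if n > 1:
        factors.append(n)  # the remaining cofactor has no divisor up to its square root, so it is prime
    return factors
```

For instance, `trial_division(60)` returns the factors 2, 2, 3, 5, and `trial_division(1)` returns an empty list, matching the convention that 1 is the empty product.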
When the numbers are sufficiently large, no efficient non-quantum integer factorization algorithm is known. However, it has not been proven that such an algorithm does not exist. The presumed difficulty of this problem is important for the algorithms used in cryptography such as RSA public-key encryption and the RSA digital signature. Many areas of mathematics and computer science have been brought to bear on this problem, including elliptic curves, algebraic number theory, and quantum computing.
Not all numbers of a given length are equally hard to factor. The hardest instances of this problem (for currently known techniques) are semiprimes, the products of two prime numbers. When both primes are large (for instance, more than two thousand bits long), randomly chosen, and about the same size (but not too close, for example, to avoid efficient factorization by Fermat's factorization method), even the fastest prime factorization algorithms on the fastest classical computers can take enough time to make the search impractical; that is, as the number of digits of the integer being factored increases, the number of operations required to perform the factorization on any classical computer increases drastically.
Many cryptographic protocols are based on the presumed difficulty of factoring large composite integers or a related problem, for example, the RSA problem. An algorithm that efficiently factors an arbitrary integer would render RSA-based public-key cryptography insecure.
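The link between factoring and RSA can be illustrated with a deliberately tiny key. The sketch below assumes a textbook toy modulus n = 3233 = 61 × 53 with public exponent e = 17; these values and the helper names are illustrative only. Once n is factored, the private exponent follows from a single modular inversion.

```python
def factor_semiprime(n):
    """Naive trial division; feasible here only because the toy modulus is tiny."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    raise ValueError("n has no non-trivial factor")

def recover_private_exponent(n, e):
    """Derive the RSA private exponent from the public key by factoring n."""
    p, q = factor_semiprime(n)
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)  # modular inverse of e modulo phi (Python 3.8+)

n, e = 3233, 17                 # toy public key (assumed values)
message = 65
ciphertext = pow(message, e, n)
d = recover_private_exponent(n, e)
recovered = pow(ciphertext, d, n)  # equals the original message
```

For moduli of cryptographic size the call to `factor_semiprime` is exactly the step that no known classical algorithm performs efficiently.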
Prime decomposition
By the fundamental theorem of arithmetic, every positive integer has a unique prime factorization. (By convention, 1 is the empty product.)
Testing whether the integer is prime can be done in polynomial time, for example, by the AKS primality test. If composite, however, the polynomial time tests give no insight into how to obtain the factors.
Given a general algorithm for integer factorization, any integer can be factored into its constituent prime factors by repeated application of this algorithm. The situation is more complicated with special-purpose factorization algorithms, whose benefits may not be realized as well or even at all with the factors produced during decomposition. For example, if n = 171 × p × q where p < q are very large primes, trial division will quickly produce the factors 3 and 19 but will take p divisions to find the next factor. As a contrasting example, if n is the product of the primes 13729, 1372933, and 18848997161, where 13729 × 1372933 = 18848997157, Fermat's factorization method will begin with a = ⌈√n⌉ = 18848997159, which immediately yields b = √(a² − n) = √4 = 2 and hence the factors a − b = 18848997157 and a + b = 18848997161. While these are easily recognized as composite and prime respectively, Fermat's method will take much longer to factor the composite number because the starting value of ⌈√18848997157⌉ = 137292 for a is a factor of 10 from 1372933.
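The contrast between close and distant factors can be tried out with a short sketch of Fermat's method. The function name `fermat_factor` and the step counter are additions for illustration, and the inputs are assumed example values: 18848997157 × 18848997161 (two factors that differ by 4) and 18848997157 = 13729 × 1372933 (two factors that are far apart).

```python
from math import isqrt

def fermat_factor(n):
    """Return (p, q, steps) with p * q == n for odd n, searching a = ceil(sqrt(n)), a + 1, ...

    steps counts how many times a had to be incremented before a*a - n was a perfect square.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("this sketch expects an odd positive integer")
    a = isqrt(n)
    if a * a < n:
        a += 1  # a = ceil(sqrt(n))
    steps = 0
    while True:
        b2 = a * a - n
        b = isqrt(b2)
        if b * b == b2:          # n = a^2 - b^2 = (a - b)(a + b)
            return a - b, a + b, steps
        a += 1
        steps += 1

close = fermat_factor(18848997157 * 18848997161)  # factors differ by 4
far = fermat_factor(18848997157)                  # factors differ by a factor of about 100
```

The first call succeeds at the starting value of a, while the second has to walk a upward for many iterations, which is the behaviour the paragraph above describes.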
Current state of the art
Among the b-bit numbers, the most difficult to factor in practice using existing algorithms are those semiprimes whose factors are of similar size. For this reason, these are the integers used in cryptographic applications.
In 2019, a 240-digit (795-bit) number (
RSA-240) was factored by a team of researchers including
Paul Zimmermann, utilizing approximately 900 core-years of computing power. These researchers estimated that a 1024-bit RSA modulus would take about 500 times as long.
The largest such semiprime yet factored was
RSA-250, an 829-bit number with 250 decimal digits, in February 2020. The total computation time was roughly 2700 core-years of computing using Intel
Xeon Gold 6130 at 2.1 GHz. Like all recent factorization records, this factorization was completed with a highly optimized implementation of the
general number field sieve
run on hundreds of machines.
Time complexity
No algorithm has been published that can factor all integers in polynomial time, that is, that can factor a b-bit number n in time O(b^k) for some constant k. Neither the existence nor non-existence of such algorithms has been proved, but it is generally suspected that they do not exist.
There are published algorithms that are faster than O((1 + ε)^b) for all positive ε, that is, sub-exponential. The algorithm with the best known theoretical asymptotic running time is the general number field sieve (GNFS), first published in 1993, running on a b-bit number n in time:
: \exp\left(\left(\left(\tfrac{8}{3}\right)^{2/3} + o(1)\right)(\ln n)^{1/3}(\ln\ln n)^{2/3}\right).
For current computers, GNFS is the best published algorithm for large n (more than about 400 bits). For a quantum computer, however, Peter Shor discovered an algorithm in 1994 that solves it in polynomial time. Shor's algorithm takes only O(b^3) time and O(b) space on b-bit number inputs. In 2001, Shor's algorithm was implemented for the first time, by using NMR techniques on molecules that provide seven qubits.
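To get a feel for how a sub-exponential bound scales, the heuristic GNFS estimate exp(((64/9)^(1/3) + o(1)) (ln n)^(1/3) (ln ln n)^(2/3)) can be evaluated numerically. The sketch below simply drops the o(1) term, which is an assumption of this example and not part of the analysis, so the absolute values are meaningless; only ratios between key sizes are loosely informative. The function name `gnfs_cost` is hypothetical.

```python
from math import exp, log

def gnfs_cost(bits):
    """Heuristic GNFS work factor for a modulus of the given bit length (o(1) term dropped)."""
    ln_n = bits * log(2)              # natural logarithm of n, taking n close to 2**bits
    c = (64 / 9) ** (1 / 3)           # the same constant as (8/3)**(2/3)
    return exp(c * ln_n ** (1 / 3) * log(ln_n) ** (2 / 3))

# Relative effort of a 1024-bit modulus compared with a 795-bit one
ratio = gnfs_cost(1024) / gnfs_cost(795)
```

With the o(1) term ignored, the ratio comes out in the hundreds, the same order of magnitude as the "about 500 times" estimate quoted for RSA-240 versus a 1024-bit modulus.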
In order to talk about complexity classes such as P, NP, and co-NP, the problem has to be stated as a decision problem. A common formulation is: given natural numbers n and k, does n have a factor d with 1 < d ≤ k?
It is known to be in both NP and co-NP, meaning that both "yes" and "no" answers can be verified in polynomial time. An answer of "yes" can be certified by exhibiting a factorization n = d(n/d) with d ≤ k. An answer of "no" can be certified by exhibiting the factorization of n into distinct primes, all larger than k; one can verify their primality using the AKS primality test, and then multiply them to obtain n. The fundamental theorem of arithmetic guarantees that there is only one possible string of increasing primes that will be accepted, which shows that the problem is in both UP and co-UP. It is known to be in BQP because of Shor's algorithm.
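The two kinds of certificate can be made concrete with a small verifier. This sketch assumes the decision problem is phrased as "does n have a factor d with 1 < d ≤ k?", and it substitutes a naive trial-division primality check for the AKS test, so it shows the shape of the verification rather than its polynomial running time. All function names are illustrative.

```python
from math import prod

def is_prime(p):
    """Naive primality check, standing in for a polynomial-time test such as AKS."""
    if p < 2:
        return False
    d = 2
    while d * d <= p:
        if p % d == 0:
            return False
        d += 1
    return True

def verify_yes(n, k, d):
    """Check a 'yes' certificate: d is a factor of n with 1 < d <= k."""
    return 1 < d <= k and n % d == 0

def verify_no(n, k, primes):
    """Check a 'no' certificate: the non-decreasing list of prime factors of n, all larger than k."""
    return (
        list(primes) == sorted(primes)
        and all(p > k and is_prime(p) for p in primes)
        and prod(primes) == n
    )
```

For example, with n = 91 the instance k = 10 has the "yes" certificate d = 7, while the instance k = 5 has the "no" certificate [7, 13].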
The problem is suspected to be outside all three of the complexity classes P, NP-complete, and co-NP-complete. It is therefore a candidate for the NP-intermediate complexity class.
In contrast, the decision problem "Is n a composite number?" (or equivalently: "Is n a prime number?") appears to be much easier than the problem of specifying factors of n. The composite/prime problem can be solved in polynomial time (in the number of digits of n) with the AKS primality test. In addition, there are several probabilistic algorithms that can test primality very quickly in practice if one is willing to accept a vanishingly small possibility of error. The ease of primality testing is a crucial part of the RSA algorithm, as it is necessary to find large prime numbers to start with.
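One widely used probabilistic test of this kind is the Miller–Rabin test, which the text above does not name; it is shown here purely as an example of the idea. In this sketch a prime is never rejected, and a composite number survives a single round with probability at most 1/4, so 20 rounds leave an error probability of at most 4^(-20). The function name and the default round count are choices made for this example.

```python
import random

def is_probable_prime(n, rounds=20):
    """Miller-Rabin test: False means certainly composite, True means probably prime."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
        if n % p == 0:
            return n == p
    # Write n - 1 = 2^s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a witnesses that n is composite
    return True
```

Note that a negative answer reveals nothing about the factors of n, which is the gap between primality testing and factoring that the paragraph above points out.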
Factoring algorithms
Special-purpose
A special-purpose factoring algorithm's running time depends on the properties of the number to be factored or on one of its unknown factors: size, special form, etc. The parameters which determine the running time vary among algorithms.
An important subclass of special-purpose factoring algorithms is the ''Category 1'' or ''First Category'' algorithms, whose running time depends on the size of the smallest prime factor. Given an integer of unknown form, these methods are usually applied before general-purpose methods to remove small factors. For example, naive trial division is a Category 1 algorithm.
* Trial division
* Wheel factorization
* Pollard's rho algorithm, which has two common flavors to identify group cycles: one by Floyd and one by Brent.
* Algebraic-group factorization algorithms, among which are Pollard's p − 1 algorithm, Williams' p + 1 algorithm, and Lenstra elliptic curve factorization
* Fermat's factorization method
* Euler's factorization method
* Special number field sieve
* Difference of two squares
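As an illustration of one entry in the list above, here is a minimal sketch of Pollard's rho algorithm using Floyd's cycle detection. The iteration function x² + c mod n, the starting value 2, and the function name are conventional choices assumed for this example rather than anything fixed by the text.

```python
from math import gcd

def pollard_rho(n, c=1):
    """Try to find a non-trivial factor of a composite n; return None if this attempt fails."""
    if n % 2 == 0:
        return 2

    def f(x):
        return (x * x + c) % n

    tortoise = hare = 2
    while True:
        tortoise = f(tortoise)   # advances one step
        hare = f(f(hare))        # advances two steps (Floyd's cycle detection)
        d = gcd(abs(tortoise - hare), n)
        if d == n:
            return None          # the cycle closed modulo n itself; retry with another c
        if d > 1:
            return d

factor = pollard_rho(8051)  # 8051 = 83 * 97
```

The expected running time depends on the size of the smallest prime factor rather than on n itself, which is why the method is listed among the special-purpose algorithms. When the function returns `None`, the usual remedy is to call it again with a different constant `c`.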
General-purpose
A general-purpose factoring algorithm, also known as a ''Category 2'', ''Second Category'', or
''Kraitchik'' ''family'' algorithm,
has a running time which depends solely on the size of the integer to be factored. This is the type of algorithm used to factor
RSA numbers. Most general-purpose factoring algorithms are based on the
congruence of squares method.
* Dixon's factorization method
* Continued fraction factorization (CFRAC)
* Quadratic sieve
* Rational sieve
* General number field sieve
* Shanks's square forms factorization (SQUFOF)
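The congruence of squares idea behind most of these methods is that a pair x, y with x² ≡ y² (mod n) and x ≢ ±y (mod n) yields a factor gcd(x − y, n). The sketch below uses a small hand-picked example, n = 1649, which is an assumption of this illustration: 41² ≡ 32 and 43² ≡ 200 (mod 1649), and although neither residue is a square, their product 6400 = 80² is.

```python
from math import gcd

def factor_from_congruence(x, y, n):
    """Given x^2 = y^2 (mod n), return a non-trivial factor of n, or None if the pair is useless."""
    if (x * x - y * y) % n != 0:
        raise ValueError("x and y are not a congruence of squares modulo n")
    d = gcd(x - y, n)
    return d if 1 < d < n else None

n = 1649
x = (41 * 43) % n   # combining the two relations gives x^2 = 32 * 200 = 80^2 (mod n)
y = 80
factor = factor_from_congruence(x, y, n)
```

The algorithms in the list differ mainly in how they find and combine such relations; the final gcd step is the same.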
Other notable algorithms
* Shor's algorithm, for quantum computers
Heuristic running time
In number theory, there are many integer factoring algorithms that heuristically have expected running time
: L_n\left[\tfrac{1}{2}, 1 + o(1)\right] = e^{(1 + o(1))\sqrt{(\log n)(\log\log n)}}
in little-o and L-notation. Some examples of those algorithms are the elliptic curve method and the quadratic sieve. Another such algorithm is the class group relations method proposed by Schnorr, Seysen, and Lenstra, which they proved only assuming the unproved generalized Riemann hypothesis.
Rigorous running time
The Schnorr–Seysen–Lenstra probabilistic algorithm has been rigorously proven by Lenstra and Pomerance to have expected running time L_n[1/2, 1 + o(1)] by replacing the GRH assumption with the use of multipliers. The algorithm uses the class group of positive binary quadratic forms of discriminant Δ, denoted by G_Δ. G_Δ is the set of triples of integers (a, b, c) in which those integers are relatively prime.
Schnorr–Seysen–Lenstra algorithm
The integer n to be factored is assumed to be an odd positive integer greater than a certain constant. In this factoring algorithm the discriminant Δ is chosen as a multiple of n, Δ = −dn, where d is some positive multiplier. The algorithm expects that for one d there exist enough smooth forms in G_Δ. Lenstra and Pomerance show that the choice of d can be restricted to a small set to guarantee the smoothness result.
Denote by P_Δ the set of all primes q with Kronecker symbol (Δ/q) = 1. By constructing a set of generators of G_Δ and prime forms f_q of G_Δ with q in P_Δ, a sequence of relations between the set of generators and the f_q is produced. The size of q can be bounded by c_0(log|Δ|)^2 for some constant c_0.
The relation that will be used is a relation between the product of powers that is equal to the neutral element of G_Δ. These relations will be used to construct a so-called ambiguous form of G_Δ, which is an element of G_Δ of order dividing 2. By calculating the corresponding factorization of Δ and by taking a gcd, this ambiguous form provides the complete prime factorization of n. This algorithm has these main steps:
Let n be the number to be factored.
To obtain an algorithm for factoring any positive integer, it is necessary to add a few steps to this algorithm such as trial division, and the Jacobi sum test.
Expected running time
The algorithm as stated is a probabilistic algorithm as it makes random choices. Its expected running time is at most L_n[1/2, 1 + o(1)].
See also
* Aurifeuillean factorization
* Bach's algorithm for generating random numbers with their factorizations
* Canonical representation of a positive integer
* Factorization
* Multiplicative partition
* p-adic valuation
* Integer partition – a way of writing a number as a sum of positive integers.
References
* Chapter 5: Exponential Factoring Algorithms, pp. 191–226. Chapter 6: Subexponential Factoring Algorithms, pp. 227–284. Section 7.4: Elliptic curve method, pp. 301–313.
* Donald Knuth. ''The Art of Computer Programming'', Volume 2: ''Seminumerical Algorithms'', Third Edition. Addison-Wesley, 1997. Section 4.5.4: Factoring into Primes, pp. 379–417.
External links
* msieve – SIQS and NFS – has helped complete some of the largest public factorizations known
* Richard P. Brent, "Recent Progress and Prospects for Integer Factorisation Algorithms", ''Computing and Combinatorics'', 2000, pp. 3–22
* Manindra Agrawal, Neeraj Kayal, Nitin Saxena, "PRIMES is in P." ''Annals of Mathematics'' 160(2): 781–793 (2004). August 2005 version PDF
* Eric W. Weisstein, "RSA-640 Factored", ''MathWorld Headline News'', November 8, 2005
* Dario Alpern's Integer factorization calculator – A web app for factoring large integers