Multiple Polynomial Quadratic Sieve
   HOME

TheInfoList



OR:

The quadratic sieve
algorithm In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algo ...
(QS) is an
integer factorization In mathematics, integer factorization is the decomposition of a positive integer into a product of integers. Every positive integer greater than 1 is either the product of two or more integer factors greater than 1, in which case it is a comp ...
algorithm and, in practice, the second-fastest method known (after the
general number field sieve In number theory, the general number field sieve (GNFS) is the most efficient classical algorithm known for factoring integers larger than . Heuristically, its complexity for factoring an integer (consisting of bits) is of the form : \begin & ...
). It is still the fastest for
integer An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...
s under 100 decimal digits or so, and is considerably simpler than the number field sieve. It is a general-purpose factorization algorithm, meaning that its running time depends solely on the size of the integer to be factored, and not on special structure or properties. It was invented by
Carl Pomerance Carl Bernard Pomerance (born 1944 in Joplin, Missouri) is an American number theorist. He attended college at Brown University and later received his Ph.D. from Harvard University in 1972 with a dissertation proving that any odd perfect number ...
in 1981 as an improvement to Schroeppel's linear sieve.


Basic aim

The algorithm attempts to set up a
congruence of squares In number theory, a congruence of squares is a congruence commonly used in integer factorization algorithms. Derivation Given a positive integer ''n'', Fermat's factorization method relies on finding numbers ''x'' and ''y'' satisfying the equali ...
modulo In computing and mathematics, the modulo operation returns the remainder or signed remainder of a division, after one number is divided by another, the latter being called the '' modulus'' of the operation. Given two positive numbers and , mo ...
''n'' (the integer to be factorized), which often leads to a factorization of ''n''. The algorithm works in two phases: the ''data collection'' phase, where it collects information that may lead to a congruence of squares; and the ''data processing'' phase, where it puts all the data it has collected into a
matrix Matrix (: matrices or matrixes) or MATRIX may refer to: Science and mathematics * Matrix (mathematics), a rectangular array of numbers, symbols or expressions * Matrix (logic), part of a formula in prenex normal form * Matrix (biology), the m ...
and solves it to obtain a congruence of squares. The data collection phase can be easily
parallelized Parallel computing is a type of computation in which many calculations or processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. There are several different for ...
to many processors, but the data processing phase requires large amounts of memory, and is difficult to parallelize efficiently over many nodes or if the processing nodes do not each have enough memory to store the whole matrix. The
block Wiedemann algorithm The block Wiedemann algorithm for computing kernel vectors of a matrix over a finite field is a generalization by Don Coppersmith of an algorithm due to Doug Wiedemann. Wiedemann's algorithm Let M be an n\times n square matrix over some finite ...
can be used in the case of a few systems each capable of holding the matrix. The naive approach to finding a congruence of squares is to pick a random number, square it, divide by ''n'' and hope the least non-negative remainder is a perfect square. For example, 80^2 \equiv 441 = 21^2 \pmod. This approach finds a congruence of squares only rarely for large ''n'', but when it does find one, more often than not, the congruence is nontrivial and the factorization is complete. This is roughly the basis of
Fermat's factorization method Fermat's factorization method, named after Pierre de Fermat, is based on the representation of an odd integer as the difference of two squares: :N = a^2 - b^2. That difference is algebraically factorable as (a+b)(a-b); if neither factor equals on ...
. The quadratic sieve is a modification of
Dixon's factorization method In number theory, Dixon's factorization method (also Dixon's random squares method or Dixon's algorithm) is a general-purpose integer factorization algorithm; it is the prototypical factor base method. Unlike for other factor base methods, its run- ...
. The general running time required for the quadratic sieve (to factor an integer ''n'') is conjectured to be :e^ =L_n\left /2,1\right/math> in the
L-notation ''L''-notation is an asymptotic notation analogous to big-O notation, denoted as L_n alpha,c/math> for a bound variable n tending to infinity. Like big-O notation, it is usually used to roughly convey the rate of growth of a function, such as the ...
. The constant ''e'' is the base of the natural logarithm.


The approach

To factorize the integer ''n'', Fermat's method entails a search for a single number ''a'', , such that the remainder of ''a''2 divided by ''n'' is a square. But these ''a'' are hard to find. The quadratic sieve consists of computing the remainder of ''a''2/''n'' for several ''a'', then finding a subset of these whose product is a square. This will yield a congruence of squares. For example, consider attempting to factor the number 1649. We have: 41^2 \equiv 32, 42^2 \equiv 115, 43^2 \equiv 200 \pmod. None of the integers 32, 115, 200 is a square, but the product 32 \cdot 200 = 80^2 is a square. We also had :32 \cdot 200 \equiv 41^2 \cdot 43^2 = (41 \cdot 43)^2 \equiv 114^2 \pmod since 41 \cdot 43 \equiv 114 \pmod. The observation that 32 \cdot 200 = 80^2 thus gives a
congruence of squares In number theory, a congruence of squares is a congruence commonly used in integer factorization algorithms. Derivation Given a positive integer ''n'', Fermat's factorization method relies on finding numbers ''x'' and ''y'' satisfying the equali ...
:114^2 \equiv 80^2 \pmod. Hence 114^2 - 80^2 = (114 + 80) \cdot (114 - 80) = 194 \cdot 34 = k \cdot 1649 for some integer k. We can then factor : 1649 = \gcd( 194, 1649 ) \cdot \gcd( 34, 1649 ) = 97 \cdot 17 using the
Euclidean algorithm In mathematics, the Euclidean algorithm,Some widely used textbooks, such as I. N. Herstein's ''Topics in Algebra'' and Serge Lang's ''Algebra'', use the term "Euclidean algorithm" to refer to Euclidean division or Euclid's algorithm, is a ...
to calculate the
greatest common divisor In mathematics, the greatest common divisor (GCD), also known as greatest common factor (GCF), of two or more integers, which are not all zero, is the largest positive integer that divides each of the integers. For two integers , , the greatest co ...
. So the problem has now been reduced to: given a set of integers, find a subset whose product is a square. By the
fundamental theorem of arithmetic In mathematics, the fundamental theorem of arithmetic, also called the unique factorization theorem and prime factorization theorem, states that every integer greater than 1 is prime or can be represented uniquely as a product of prime numbers, ...
, any positive integer can be written uniquely as a product of
prime power In mathematics, a prime power is a positive integer which is a positive integer power of a single prime number. For example: , and are prime powers, while , and are not. The sequence of prime powers begins: 2, 3, 4, 5, 7, 8, 9, 11, 13, 16, 1 ...
s. We do this in a vector format; for example, the prime-power factorization of 504 is 23325071, it is therefore represented by the exponent vector (3,2,0,1). Multiplying two integers then corresponds to adding their exponent vectors. A number is a square when its exponent vector is even in every coordinate. For example, the vectors (3,2,0,1) + (1,0,0,1) = (4,2,0,2), so (504)(14) is a square. Searching for a square requires knowledge only of the parity of the numbers in the vectors, so it is sufficient to compute these vectors mod 2: (1,0,0,1) + (1,0,0,1) = (0,0,0,0). So given a set of (0,1)-vectors, we need to find a subset which adds to the
zero vector In mathematics, a zero element is one of several generalizations of the number zero to other algebraic structures. These alternate meanings may or may not reduce to the same thing, depending on the context. Additive identities An '' additive id ...
mod 2. This is a
linear algebra Linear algebra is the branch of mathematics concerning linear equations such as :a_1x_1+\cdots +a_nx_n=b, linear maps such as :(x_1, \ldots, x_n) \mapsto a_1x_1+\cdots +a_nx_n, and their representations in vector spaces and through matrix (mathemat ...
problem since the
ring (The) Ring(s) may refer to: * Ring (jewellery), a round band, usually made of metal, worn as ornamental jewelry * To make a sound with a bell, and the sound made by a bell Arts, entertainment, and media Film and TV * ''The Ring'' (franchise), a ...
\mathbb/2\mathbb can be regarded as the
Galois field In mathematics, a finite field or Galois field (so-named in honor of Évariste Galois) is a field that contains a finite number of elements. As with any field, a finite field is a set on which the operations of multiplication, addition, subtr ...
of order 2, that is we can divide by all non-zero numbers (there is only one, namely 1) when calculating modulo 2. It is a theorem of linear algebra that with more vectors than each vector has entries, a
linear dependency In the theory of vector spaces, a set of vectors is said to be if there exists no nontrivial linear combination of the vectors that equals the zero vector. If such a linear combination exists, then the vectors are said to be . These concepts ...
always exists. It can be found by
Gaussian elimination In mathematics, Gaussian elimination, also known as row reduction, is an algorithm for solving systems of linear equations. It consists of a sequence of row-wise operations performed on the corresponding matrix of coefficients. This method can a ...
. However, simply squaring many random numbers mod ''n'' produces a very large number of different
prime A prime number (or a prime) is a natural number greater than 1 that is not a product of two smaller natural numbers. A natural number greater than 1 that is not prime is called a composite number. For example, 5 is prime because the only ways ...
factors, and thus very long vectors and a very large matrix. The trick is to look specifically for numbers ''a'' such that ''a''2 mod ''n'' has only small prime factors (they are
smooth number In number theory, an ''n''-smooth (or ''n''-friable) number is an integer whose prime factors are all less than or equal to ''n''. For example, a 7-smooth number is a number in which every prime factor is at most 7. Therefore, 49 = 72 and 15750 = 2 ...
s). They are harder to find, but using only smooth numbers keeps the vectors and matrices smaller and more tractable. The quadratic sieve searches for smooth numbers using a technique called
sieving A sieve (), fine mesh strainer, or sift is a tool used for separating wanted elements from unwanted material or for controlling the particle size distribution of a sample, using a screen such as a woven mesh or net or perforated sheet mate ...
, discussed later, from which the algorithm takes its name.


The algorithm

To summarize, the basic quadratic sieve algorithm has these main steps: # Choose a smoothness bound ''B''. The number π(''B''), denoting the number of prime numbers less than ''B'', will control both the length of the vectors and the number of vectors needed. # Use sieving to locate π(''B'') + 1 numbers ''ai'' such that ''bi'' = (''ai''2 mod ''n'') is ''B''-smooth. #Factor the ''bi'' and generate exponent vectors mod 2 for each one. # Use linear algebra to find a subset of these vectors which add to the zero vector. Multiply the corresponding ''ai ''together and give the result mod ''n'' the name ''a''; similarly, multiply the ''bi'' together which yields a ''B''-smooth square ''b''2. #We are now left with the equality ''a''2 = ''b''2 mod ''n'' from which we get two square roots of (''a''2 mod ''n''), one by taking the square root in the integers of ''b''2 namely ''b'', and the other the ''a'' computed in step 4. # We now have the desired identity: (a+b)(a-b)\equiv0 \pmod n. Compute the GCD of ''n'' with the difference (or sum) of ''a'' and ''b''. This produces a factor, although it may be a trivial factor (''n'' or 1). If the factor is trivial, try again with a different linear dependency or different ''a''. The remainder of this article explains details and extensions of this basic algorithm.


How QS optimizes finding congruences

The quadratic sieve attempts to find pairs of integers ''x'' and ''y''(''x'') (where ''y''(''x'') is a function of ''x'') satisfying a much weaker condition than ''x''2 ≡ ''y''2 (mod ''n''). It selects a set of
primes A prime number (or a prime) is a natural number greater than 1 that is not a product of two smaller natural numbers. A natural number greater than 1 that is not prime is called a composite number. For example, 5 is prime because the only ways ...
called the ''factor base'', and attempts to find ''x'' such that the least absolute remainder of ''y''(''x'') = ''x''2 mod ''n'' factorizes completely over the factor base. Such ''y'' values are said to be ''smooth'' with respect to the factor base. The factorization of a value of ''y''(''x'') that splits over the factor base, together with the value of ''x'', is known as a ''relation''. The quadratic sieve speeds up the process of finding relations by taking ''x'' close to the square root of ''n''. This ensures that ''y''(''x'') will be smaller, and thus have a greater chance of being smooth. :y(x)=\left(\left\lceil\sqrt\right\rceil+x\right)^2-n\hboxx\hbox :y(x)\approx 2x\left\lceil\sqrt\right\rceil This implies that ''y'' is on the order of 2''x''[]. However, it also implies that ''y'' grows linearly with ''x'' times the square root of ''n''. Another way to increase the chance of smoothness is by simply increasing the size of the factor base. However, it is necessary to find at least one smooth relation more than the number of primes in the factor base, to ensure the existence of a linear dependency.


Partial relations and cycles

Even if for some relation ''y''(''x'') is not smooth, it may be possible to merge two of these ''partial relations'' to form a full one, if the two ''y''s are products of the same prime(s) outside the factor base. ote that this is equivalent to extending the factor base.For example, if the factor base is and ''n'' = 91, there are partial relations: : : Multiply these together: : and multiply both sides by (11−1)2 modulo 91. 11−1 modulo 91 is 58, so: :(58\cdot 21\cdot 29)^2\equiv 2^1\cdot7^1\pmod :14^2\equiv 2^1\cdot7^1\pmod producing a full relation. Such a full relation (obtained by combining partial relations) is called a ''cycle''. Sometimes, forming a cycle from two partial relations leads directly to a congruence of squares, but rarely.


Checking smoothness by sieving

There are several ways to check for smoothness of the ''y''s. The most obvious is by
trial division Trial division is the most laborious but easiest to understand of the integer factorization algorithms. The essential idea behind trial division tests to see if an integer , the integer to be factored, can be divided by each number in turn that ...
, although this increases the running time for the data collection phase. Another method that has some acceptance is the
elliptic curve method The Lenstra elliptic-curve factorization or the elliptic-curve factorization method (ECM) is a fast, sub-exponential running time, algorithm for integer factorization, which employs elliptic curves. For general-purpose computer, general-purpose ...
(ECM). In practice, a process called ''sieving'' is typically used. If ''f''(''x'') is the
polynomial In mathematics, a polynomial is a Expression (mathematics), mathematical expression consisting of indeterminate (variable), indeterminates (also called variable (mathematics), variables) and coefficients, that involves only the operations of addit ...
f(x)=x^2-n we have :\begin f(x)&=x^2-n \\ f(x+kp) &= (x+kp)^2-n \\ &= x^2+2xkp+(kp)^2-n \\ &= f(x)+2xkp+(kp)^2\equiv f(x)\pmod \end Thus solving ''f(x)'' ≡ 0 (mod ''p'') for ''x'' generates a whole sequence of numbers ''y'' for which ''y''=''f''(''x''), all of which are divisible by ''p''. This is finding a square root modulo a prime, for which there exist efficient algorithms, such as the Shanks–Tonelli algorithm. (This is where the quadratic sieve gets its name: ''y'' is a quadratic polynomial in ''x'', and the sieving process works like the
Sieve of Eratosthenes In mathematics, the sieve of Eratosthenes is an ancient algorithm for finding all prime numbers up to any given limit. It does so by iteratively marking as composite number, composite (i.e., not prime) the multiples of each prime, starting with ...
.) The sieve starts by setting every entry in a large array ''A''[] of bytes to zero. For each ''p'', solve the quadratic equation mod ''p'' to get two roots ''α'' and ''β'', and then add an approximation to log(''p'') to every entry for which ''y''(''x'') = 0 mod ''p'' ... that is, ''A'' 'kp'' + ''α''and ''A'' 'kp'' + ''β'' It is also necessary to solve the quadratic equation modulo small powers of ''p'' in order to recognise numbers divisible by small powers of a factor-base prime. At the end of the factor base, any ''A''[] containing a value above a threshold of roughly log(''x''2−''n'') will correspond to a value of ''y''(''x'') which splits over the factor base. The information about exactly which primes divide ''y''(''x'') has been lost, but it has only small factors, and there are many good algorithms for factoring a number known to have only small factors, such as trial division by small primes, SQUFOF, Pollard rho, and ECM, which are usually used in some combination. There are many ''y''(''x'') values that work, so the factorization process at the end doesn't have to be entirely reliable; often the processes misbehave on say 5% of inputs, requiring a small amount of extra sieving.


Example of basic sieve

This example will demonstrate standard quadratic sieve without logarithm optimizations or prime powers. Let the number to be factored N = 15347, therefore the ceiling of the square root of N is 124. Since N is small, the basic polynomial Y(X) = (X + 124)^2 - 15347 is enough.


Data collection

Since N is small, only four primes are necessary. The first four primes p for which 15347 has a square root mod p are 2, 17, 23, and 29 (in other words, 15347 is a
quadratic residue In number theory, an integer ''q'' is a quadratic residue modulo operation, modulo ''n'' if it is Congruence relation, congruent to a Square number, perfect square modulo ''n''; that is, if there exists an integer ''x'' such that :x^2\equiv q \pm ...
modulo each of these primes). These primes will be the basis for sieving. Now we construct our sieve V_X of Y(X) = (X + \lceil\sqrt\rceil)^2 - N = (X + 124)^2 - 15347 and begin the sieving process for each prime in the basis, choosing to sieve the first 0 \le X < 100 of Y(X): : \beginV &= \begin Y(0) & Y(1) & Y(2) & Y(3) & Y(4) & Y(5) & \cdots & Y(99) \end \\ & =\begin 29 & 278 & 529 & 782 & 1037 & 1294 & \cdots & 34382 \end\end The next step is to perform the sieve. For each p in our factor base \lbrace 2, 17, 23, 29\rbrace solve the equation :Y(X) \equiv (X + \lceil\sqrt\rceil)^2 - N \equiv 0 \pmod to find the entries in the array V which are divisible by p. For p=2 solve (X + 124)^2 - 15347 \equiv 0 \pmod to get the solution X \equiv \sqrt - 124 \equiv 1 \pmod. Thus, starting at X = 1 and incrementing by 2, each entry will be divisible by 2. Dividing each of those entries by 2 yields : V = \begin 29 & 139 & 529 & 391 & 1037 & 647 & \cdots & 17191 \end Similarly for the remaining primes p in \lbrace 17, 23, 29\rbrace the equationX \equiv \sqrt - 124 \pmod is solved. Note that for every p > 2, there will be 2 resulting linear equations due to there being 2 modular square roots. : \begin X \equiv \pm\sqrt - 124 \equiv && 8 - 124 & \equiv 3 & \pmod \\ \equiv && -8 - 124 & \equiv 4 & \pmod \\ X \equiv \pm\sqrt - 124 \equiv && 11 - 124 & \equiv 2 & \pmod \\ \equiv &&\,-11 - 124 & \equiv 3 & \pmod \\ X \equiv \pm\sqrt - 124 \equiv && 8 - 124 & \equiv 0 & \pmod \\ \equiv && -8 - 124 & \equiv 13& \pmod \\ \end Each equation X \equiv a \pmod results in V_x being divisible by p at x = a and each ''p''th value beyond that. Dividing V by p at a, a + p, a + 2p, a + 3p, etc., for each prime in the basis finds the smooth numbers which are products of unique primes (first powers). : V = \begin 1 & 139 & 23 & 1 & 61 & 647 & \cdots & 17191 \end Any entry of V that equals 1 corresponds to a smooth number. Since V_0, V_3, and V_ equal one, this corresponds to:


Matrix processing

Since smooth numbers Y have been found with the property Y \equiv Z^2 \pmod, the remainder of the algorithm follows equivalently to any other variation of
Dixon's factorization method In number theory, Dixon's factorization method (also Dixon's random squares method or Dixon's algorithm) is a general-purpose integer factorization algorithm; it is the prototypical factor base method. Unlike for other factor base methods, its run- ...
. Writing the exponents of the product of a subset of the equations : \begin 29 &= 2^0 \cdot 17^0 \cdot 23^0 \cdot 29^1 \\ 782 &= 2^1 \cdot 17^1 \cdot 23^1 \cdot 29^0 \\ 22678 &= 2^1 \cdot 17^1 \cdot 23^1 \cdot 29^1 \\ \end as a matrix\pmod yields: : S \cdot \begin 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 \end \equiv \begin 0 & 0 & 0 & 0 \end \pmod A solution to the equation is given by the left null space, simply : S = \begin1 & 1 & 1 \end Thus the product of all three equations yields a square modulo N. : 29 \cdot 782 \cdot 22678 = 22678^2 and : 124^2 \cdot 127^2 \cdot 195^2 = 3070860^2 So the algorithm found : 22678^2 \equiv 3070860^2 \pmod Testing the result yields \gcd(3070860 - 22678, 15347) = 103, a nontrivial factor of 15347, the other being 149. This demonstration should also serve to show that the quadratic sieve is only appropriate when N is large. For a number as small as 15347, this algorithm is overkill.
Trial division Trial division is the most laborious but easiest to understand of the integer factorization algorithms. The essential idea behind trial division tests to see if an integer , the integer to be factored, can be divided by each number in turn that ...
or Pollard rho could have found a factor with much less computation.


Multiple polynomials

In practice, many different
polynomial In mathematics, a polynomial is a Expression (mathematics), mathematical expression consisting of indeterminate (variable), indeterminates (also called variable (mathematics), variables) and coefficients, that involves only the operations of addit ...
s are used for ''y'' so that when ''y''(''x'') starts to become large, resulting in poor density of smooth ''y'', this growth can be reset by switching polynomials. As usual, we choose ''y''(''x'') to be a square modulo ''n'', but now with the form y(x)=(Ax+B)^2-n \qquad A,B\in\mathbb. B is chosen such that B^2 = n \pmod A, so B^2 - n = AC for some C. The polynomial y(x) can then be written as y(x) = A\cdot(Ax^2 + 2Bx + C). If ''A'' is a square or a smooth number, then only the factor (Ax^2 + 2Bx + C) has to be checked for smoothness. This approach, called Multiple Polynomial Quadratic Sieve (MPQS), is ideally suited for
parallelization Parallel computing is a type of computation in which many calculations or processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. There are several different for ...
, since each processor involved in the factorization can be given ''n'', the factor base and a collection of polynomials, and it will have no need to communicate with the central processor until it has finished sieving with its polynomials.


Large primes


One large prime

If, after dividing by all the factors less than ''A'', the remaining part of the number (the cofactor) is less than ''A''2, then this cofactor must be prime. In effect, it can be added to the factor base, by sorting the list of relations into order by cofactor. If ''y''(''a'') = 7⋅11⋅23⋅137 and ''y''(''b'') = 3⋅5⋅7⋅137, then ''y''(''a'')''y''(''b'') = 3⋅5⋅72⋅11⋅23⋅1372. This works by reducing the threshold of entries in the sieving array above which a full factorization is performed.


More large primes

Reducing the threshold even further, and using an effective process for factoring y(x) values into products of even relatively large primes—ECM is superb for this—can find relations with most of their factors in the factor base, but with two or even three larger primes. Cycle finding then allows combining a set of relations sharing several primes into a single relation.


Parameters from realistic example

To illustrate typical parameter choices for a realistic example on a real implementation including the multiple polynomial and large prime optimizations, the too
msieve
was run on a 267-bit
semiprime In mathematics, a semiprime is a natural number that is the product of exactly two prime numbers. The two primes in the product may equal each other, so the semiprimes include the squares of prime numbers. Because there are infinitely many prime n ...
, producing the following parameters: * Trial factoring cutoff: 27 bits * Sieve interval (per polynomial): 393216 (12 blocks of size 32768) * Smoothness bound: 1300967 (50294 primes) * Number of factors for polynomial ''A'' coefficients: 10 ''(see Multiple polynomials above)'' * Large prime bound: 128795733 (26 bits) ''(see Large primes above)'' * Smooth values found: 25952 by sieving directly, 24462 by combining numbers with large primes * Final matrix size: 50294 × 50414, reduced by filtering to 35750 × 35862 * Nontrivial dependencies found: 15 * Total time (on a 1.6 GHz UltraSPARC III): 35 min 39 seconds * Maximum memory used: 8 MB


Factoring records

Until the discovery of the
number field sieve In number theory, the general number field sieve (GNFS) is the most efficient classical algorithm known for factoring integers larger than . Heuristically, its complexity for factoring an integer (consisting of bits) is of the form : \begin & ...
(NFS), QS was the asymptotically fastest known general-purpose factoring algorithm. Now,
Lenstra elliptic curve factorization The Lenstra elliptic-curve factorization or the elliptic-curve factorization method (ECM) is a fast, sub-exponential running time, algorithm for integer factorization, which employs elliptic curves. For general-purpose factoring, ECM is the thi ...
has the same asymptotic running time as QS (in the case where ''n'' has exactly two prime factors of equal size), but in practice, QS is faster since it uses
single-precision Single-precision floating-point format (sometimes called FP32 or float32) is a computer number format, usually occupying 32 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. A floati ...
operations instead of the
multi-precision In computer science, arbitrary-precision arithmetic, also called bignum arithmetic, multiple-precision arithmetic, or sometimes infinite-precision arithmetic, indicates that calculations are performed on numbers whose digits of precision are pot ...
operations used by the elliptic curve method. On April 2, 1994, the factorization of
RSA-129 In mathematics, the RSA numbers are a set of large semiprimes (numbers with exactly two prime factors) that were part of the RSA Factoring Challenge. The challenge was to find the prime factors of each number. It was created by RSA Laboratories ...
was completed using QS. It was a 129-digit number, the product of two large primes, one of 64 digits and the other of 65 digits. The factor base for this factorization contained 524339 primes. The data collection phase took 5000 MIPS-years, done in distributed fashion over the Internet. The data collected totaled 2 GB. The data processing phase took 45 hours on
Bellcore iconectiv supplies communications providers with network planning and management services. The company’s cloud-based information as a service network and operations management and numbering solutions span trusted communications, digital identi ...
's (now
Telcordia Technologies iconectiv supplies communications providers with network planning and management services. The company’s cloud-based information as a service network and operations management and numbering solutions span trusted communications, digital identi ...
)
MasPar MasPar Computer Corporation (later NeoVista Software, Inc.) was a minisupercomputer vendor that was founded in 1987 by Jeff Kalb. The company was based in Sunnyvale, California. History While Kalb was the vice-president of the division of Digita ...
(massively parallel) supercomputer. This was the largest published factorization by a general-purpose algorithm, until NFS was used to factor
RSA-130 In mathematics, the RSA numbers are a set of large semiprimes (numbers with exactly two prime factors) that were part of the RSA Factoring Challenge. The challenge was to find the prime factors of each number. It was created by RSA Laboratories ...
, completed April 10, 1996. All RSA numbers factored since then have been factored using NFS. The current QS factorization record is the 140-digit (463-bit) RSA-140, which was factored by Patrick Konsor in June 2020 using approximately 6,000 core hours over 6 days.


Implementations


PPMPQS and PPSIQS

SIMPQS
is a fast implementation of the self-initialising multiple polynomial quadratic sieve written by William Hart. It provides support for the large prime variant and uses Jason Papadopoulos' block Lanczos implementation for the linear algebra stage. SIMPQS is accessible as the qsieve command in the
SageMath SageMath (previously Sage or SAGE, "System for Algebra and Geometry Experimentation") is a computer algebra system (CAS) with features covering many aspects of mathematics, including algebra, combinatorics, graph theory, group theory, differentia ...
computer algebra package, is part of the
Fast Library for Number Theory The Fast Library for Number Theory (FLINT) is a C library for number theory applications. The two major areas of functionality currently implemented in FLINT are polynomial arithmetic over the integers and a quadratic sieve. The library is de ...
, or can be downloaded in source form. SIMPQS is optimized for use on Athlon and Opteron machines, but will operate on most common 32- and 64-bit architectures. It is written entirely in C. *
factoring applet
by Dario Alpern, that uses the quadratic sieve if certain conditions are met. * The
PARI/GP PARI/GP is a computer algebra system with the main aim of facilitating number theory computations. Versions 2.1.0 and higher are distributed under the GNU General Public License. It runs on most common operating systems. System overview The P ...
computer algebra package includes an implementation of the self-initialising multiple polynomial quadratic sieve implementing the large prime variant. It was adapted by Thomas Papanikolaou and Xavier Roblot from a sieve written for the LiDIA project. The self-initialisation scheme is based on an idea from the thesis of Thomas Sosnowski. * A variant of the quadratic sieve is available in the
MAGMA Magma () is the molten or semi-molten natural material from which all igneous rocks are formed. Magma (sometimes colloquially but incorrectly referred to as ''lava'') is found beneath the surface of the Earth, and evidence of magmatism has also ...
computer algebra package. It is based on an implementation of Arjen Lenstra from 1995, used in his "factoring by email" program.
msieve
an implementation of the multiple polynomial quadratic sieve with support for single and double large primes, written by Jason Papadopoulos. Source code and a Windows binary are available.
YAFU
written by Ben Buhrow, is probably the fastest available implementation of the quadratic sieve. For example,
RSA-100 In mathematics, the RSA numbers are a set of large semiprimes (numbers with exactly two prime factors) that were part of the RSA Factoring Challenge. The challenge was to find the prime factors of each number. It was created by RSA Laboratories ...
was factored in less than 15 minutes on four cores of a 2.5 GHz Xeon 6248 CPU. All of the critical subroutines make use of
AVX2 Advanced Vector Extensions (AVX, also known as Gesher New Instructions and then Sandy Bridge New Instructions) are SIMD extensions to the x86 instruction set architecture for microprocessors from Intel and Advanced Micro Devices (AMD). They w ...
or
AVX-512 AVX-512 are 512-bit extensions to the 256-bit Advanced Vector Extensions SIMD instructions for x86 instruction set architecture (ISA) proposed by Intel in July 2013, and first implemented in the 2016 Intel Xeon Phi x200 (Knights Landing), and then ...
SIMD Single instruction, multiple data (SIMD) is a type of parallel computer, parallel processing in Flynn's taxonomy. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneousl ...
instructions for AMD or Intel processors. It uses Jason Papadopoulos' block Lanczos code. Source code and binaries for Windows and Linux are available.
Ariel
a simple Java implementation of the quadratic sieve for didactic purposes. * Th
java-math-library
contains probably the fastest quadratic sieve written in Java (the successor of PSIQS 4.0).
Java QS
an open source Java project containing basic implementation of QS. Released at February 4, 2016 by Ilya Gazman
C Quadratic Sieve
a factorizer written entirely in C containing a self-initializing Quadratic Sieve implementation. The software released in 2022 can factor numbers in batches with JSON or CSV output. Its source code by Michel Leonard does not rely on an external library to factor 240-bit numbers in a minute and 300-bit numbers in 2 hours. * Th
RcppBigIntAlgos
package by Joseph Wood, provides an efficient implementation of the multiple polynomial quadratic sieve for the
R programming language R is a programming language for statistical computing and data visualization. It has been widely adopted in the fields of data mining, bioinformatics, data analysis, and data science. The core R language is extended by a large number of so ...
. It is written in C++ and is capable of comfortably factoring 100-digit semiprimes. For example, a 300-bit semiprime (91 digits) was factored in 1 hour and the
RSA-100 In mathematics, the RSA numbers are a set of large semiprimes (numbers with exactly two prime factors) that were part of the RSA Factoring Challenge. The challenge was to find the prime factors of each number. It was created by RSA Laboratories ...
was factored in under 10 hours on a MacBook Air with an Apple M2 processor.


See also

*
Lenstra elliptic curve factorization The Lenstra elliptic-curve factorization or the elliptic-curve factorization method (ECM) is a fast, sub-exponential running time, algorithm for integer factorization, which employs elliptic curves. For general-purpose factoring, ECM is the thi ...
*
primality test A primality test is an algorithm for determining whether an input number is prime. Among other fields of mathematics, it is used for cryptography. Unlike integer factorization, primality tests do not generally give prime factors, only stating wheth ...


References

* * * Describes many practical implementation details.


Other external links

* Reference pape
"The Quadratic Sieve Factoring Algorithm"
by Eric Landquist {{DEFAULTSORT:Quadratic Sieve Integer factorization algorithms