MPQS
   HOME

TheInfoList



OR:

The quadratic sieve
algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algorithms are used as specificat ...
(QS) is an
integer factorization In number theory, integer factorization is the decomposition of a composite number into a product of smaller integers. If these factors are further restricted to prime numbers, the process is called prime factorization. When the numbers are suf ...
algorithm and, in practice, the second fastest method known (after the
general number field sieve In number theory, the general number field sieve (GNFS) is the most efficient classical algorithm known for factoring integers larger than . Heuristically, its complexity for factoring an integer (consisting of bits) is of the form :\exp\left( ...
). It is still the fastest for
integer An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign (−1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the language ...
s under 100 decimal digits or so, and is considerably simpler than the number field sieve. It is a general-purpose factorization algorithm, meaning that its running time depends solely on the size of the integer to be factored, and not on special structure or properties. It was invented by
Carl Pomerance Carl Bernard Pomerance (born 1944 in Joplin, Missouri) is an American number theorist. He attended college at Brown University and later received his Ph.D. from Harvard University in 1972 with a dissertation proving that any odd perfect number h ...
in 1981 as an improvement to Schroeppel's linear sieve.


Basic aim

The algorithm attempts to set up a
congruence of squares In number theory, a congruence of squares is a congruence commonly used in integer factorization algorithms. Derivation Given a positive integer ''n'', Fermat's factorization method relies on finding numbers ''x'' and ''y'' satisfying the equali ...
modulo In computing, the modulo operation returns the remainder or signed remainder of a division, after one number is divided by another (called the '' modulus'' of the operation). Given two positive numbers and , modulo (often abbreviated as ) is t ...
''n'' (the integer to be factorized), which often leads to a factorization of ''n''. The algorithm works in two phases: the ''data collection'' phase, where it collects information that may lead to a congruence of squares; and the ''data processing'' phase, where it puts all the data it has collected into a
matrix Matrix most commonly refers to: * ''The Matrix'' (franchise), an American media franchise ** ''The Matrix'', a 1999 science-fiction action film ** "The Matrix", a fictional setting, a virtual reality environment, within ''The Matrix'' (franchis ...
and solves it to obtain a congruence of squares. The data collection phase can be easily parallelized to many processors, but the data processing phase requires large amounts of memory, and is difficult to parallelize efficiently over many nodes or if the processing nodes do not each have enough memory to store the whole matrix. The
block Wiedemann algorithm The block Wiedemann algorithm for computing kernel vectors of a matrix over a finite field is a generalization by Don Coppersmith of an algorithm due to Doug Wiedemann. Wiedemann's algorithm Let M be an n\times n square matrix over some finite ...
can be used in the case of a few systems each capable of holding the matrix. The naive approach to finding a congruence of squares is to pick a random number, square it, divide by ''n'' and hope the least non-negative remainder is a
perfect square ''Perfect Square'' is a 2004 concert film of the alternative rock Musical ensemble, band R.E.M. (band), R.E.M., filmed on July 19, 2003, at the bowling green, Bowling Green in Wiesbaden, Germany. It was released by Warner Reprise Video on March 9, ...
. For example, 80^2 \equiv 441 = 21^2 \pmod. This approach finds a congruence of squares only rarely for large ''n'', but when it does find one, more often than not, the congruence is nontrivial and the factorization is complete. This is roughly the basis of
Fermat's factorization method Fermat's factorization method, named after Pierre de Fermat, is based on the representation of an odd integer as the difference of two squares: :N = a^2 - b^2. That difference is algebraically factorable as (a+b)(a-b); if neither factor equals one, ...
. The quadratic sieve is a modification of
Dixon's factorization method In number theory, Dixon's factorization method (also Dixon's random squares method or Dixon's algorithm) is a general-purpose integer factorization algorithm; it is the prototypical factor base method. Unlike for other factor base methods, its run- ...
. The general running time required for the quadratic sieve (to factor an integer ''n'') is :e^ =L_n\left /2,1\right/math> in the
L-notation ''L''-notation is an asymptotic notation analogous to big-O notation, denoted as L_n alpha,c/math> for a bound variable n tending to infinity. Like big-O notation, it is usually used to roughly convey the rate of growth of a function, such as the ...
. The constant ''e'' is the base of the natural logarithm.


The approach

To factorize the integer ''n'', Fermat's method entails a search for a single number ''a'', , such that the remainder of ''a''2 divided by ''n'' is a square. But these ''a'' are hard to find. The quadratic sieve consists of computing the remainder of ''a''2/''n'' for several ''a'', then finding a subset of these whose product is a square. This will yield a congruence of squares. For example, consider attempting to factor the number 1649. We have: 41^2 \equiv 32, 42^2 \equiv 115, 43^2 \equiv 200 \pmod. None of the integers 32, 115, 200 is a square, but the product 32 \cdot 200 = 80^2 is a square. We also had :32 \cdot 200 \equiv 41^2 \cdot 43^2 = (41 \cdot 43)^2 \equiv 114^2 \pmod since 41 \cdot 43 \equiv 114 \pmod. The observation that 32 \cdot 200 = 80^2 thus gives a
congruence of squares In number theory, a congruence of squares is a congruence commonly used in integer factorization algorithms. Derivation Given a positive integer ''n'', Fermat's factorization method relies on finding numbers ''x'' and ''y'' satisfying the equali ...
:114^2 \equiv 80^2 \pmod. Hence 114^2 - 80^2 = (114 + 80) \cdot (114 - 80) = 194 \cdot 34 = k \cdot 1649 for some integer k. We can then factor : 1649 = \gcd( 194, 1649 ) \cdot \gcd( 34, 1649 ) = 97 \cdot 17 using the
Euclidean algorithm In mathematics, the Euclidean algorithm,Some widely used textbooks, such as I. N. Herstein's ''Topics in Algebra'' and Serge Lang's ''Algebra'', use the term "Euclidean algorithm" to refer to Euclidean division or Euclid's algorithm, is an effi ...
to calculate the
greatest common divisor In mathematics, the greatest common divisor (GCD) of two or more integers, which are not all zero, is the largest positive integer that divides each of the integers. For two integers ''x'', ''y'', the greatest common divisor of ''x'' and ''y'' is ...
. So the problem has now been reduced to: given a set of integers, find a subset whose product is a square. By the
fundamental theorem of arithmetic In mathematics, the fundamental theorem of arithmetic, also called the unique factorization theorem and prime factorization theorem, states that every integer greater than 1 can be represented uniquely as a product of prime numbers, up to the ord ...
, any positive integer can be written uniquely as a product of
prime power In mathematics, a prime power is a positive integer which is a positive integer power of a single prime number. For example: , and are prime powers, while , and are not. The sequence of prime powers begins: 2, 3, 4, 5, 7, 8, 9, 11, 13, 16, 17 ...
s. We do this in a vector format; for example, the prime-power factorization of 504 is 23325071, it is therefore represented by the exponent vector (3,2,0,1). Multiplying two integers then corresponds to adding their exponent vectors. A number is a square when its exponent vector is even in every coordinate. For example, the vectors (3,2,0,1) + (1,0,0,1) = (4,2,0,2), so (504)(14) is a square. Searching for a square requires knowledge only of the
parity Parity may refer to: * Parity (computing) ** Parity bit in computing, sets the parity of data for the purpose of error detection ** Parity flag in computing, indicates if the number of set bits is odd or even in the binary representation of the r ...
of the numbers in the vectors, so it is sufficient to compute these vectors mod 2: (1,0,0,1) + (1,0,0,1) = (0,0,0,0). So given a set of (0,1)-vectors, we need to find a subset which adds to the
zero vector In mathematics, a zero element is one of several generalizations of the number zero to other algebraic structures. These alternate meanings may or may not reduce to the same thing, depending on the context. Additive identities An additive identi ...
mod 2. This is a
linear algebra Linear algebra is the branch of mathematics concerning linear equations such as: :a_1x_1+\cdots +a_nx_n=b, linear maps such as: :(x_1, \ldots, x_n) \mapsto a_1x_1+\cdots +a_nx_n, and their representations in vector spaces and through matrices. ...
problem since the
ring Ring may refer to: * Ring (jewellery), a round band, usually made of metal, worn as ornamental jewelry * To make a sound with a bell, and the sound made by a bell :(hence) to initiate a telephone connection Arts, entertainment and media Film and ...
\mathbb/2\mathbb can be regarded as the
Galois field In mathematics, a finite field or Galois field (so-named in honor of Évariste Galois) is a field that contains a finite number of elements. As with any field, a finite field is a set on which the operations of multiplication, addition, subtra ...
of order 2, that is we can divide by all non-zero numbers (there is only one, namely 1) when calculating modulo 2. It is a theorem of linear algebra that with more vectors than each vector has entries, a
linear dependency In the theory of vector spaces, a set of vectors is said to be if there is a nontrivial linear combination of the vectors that equals the zero vector. If no such linear combination exists, then the vectors are said to be . These concepts are ...
always exists. It can be found by
Gaussian elimination In mathematics, Gaussian elimination, also known as row reduction, is an algorithm for solving systems of linear equations. It consists of a sequence of operations performed on the corresponding matrix of coefficients. This method can also be used ...
. However, simply squaring many random numbers mod ''n'' produces a very large number of different
prime A prime number (or a prime) is a natural number greater than 1 that is not a product of two smaller natural numbers. A natural number greater than 1 that is not prime is called a composite number. For example, 5 is prime because the only ways ...
factors, and so very long vectors and a very large matrix. The trick is to look specifically for numbers ''a'' such that ''a''2 mod ''n'' has only small prime factors (they are
smooth number In number theory, an ''n''-smooth (or ''n''-friable) number is an integer whose prime factors are all less than or equal to ''n''. For example, a 7-smooth number is a number whose every prime factor is at most 7, so 49 = 72 and 15750 = 2 × 32 × ...
s). They are harder to find, but using only smooth numbers keeps the vectors and matrices smaller and more tractable. The quadratic sieve searches for smooth numbers using a technique called
sieving A sieve, fine mesh strainer, or sift, is a device for separation process, separating wanted elements from unwanted material or for controlling the particle size distribution of a sample, using a screen such as a warp and weft, woven mesh or n ...
, discussed later, from which the algorithm takes its name.


The algorithm

To summarize, the basic quadratic sieve algorithm has these main steps: # Choose a smoothness bound ''B''. The number π(''B''), denoting the number of prime numbers less than ''B'', will control both the length of the vectors and the number of vectors needed. # Use sieving to locate π(''B'') + 1 numbers ''ai'' such that ''bi'' = (''ai''2 mod ''n'') is ''B''-smooth. #Factor the ''bi'' and generate exponent vectors mod 2 for each one. # Use linear algebra to find a subset of these vectors which add to the zero vector. Multiply the corresponding ''ai ''together and give the result mod ''n'' the name ''a''; similarly, multiply the ''bi'' together which yields a ''B''-smooth square ''b''2. #We are now left with the equality ''a''2 = ''b''2 mod ''n'' from which we get two square roots of (''a''2 mod ''n''), one by taking the square root in the integers of ''b''2 namely ''b'', and the other the ''a'' computed in step 4. # We now have the desired identity: (a+b)(a-b)\equiv0 \pmod n. Compute the GCD of ''n'' with the difference (or sum) of ''a'' and ''b''. This produces a factor, although it may be a trivial factor (''n'' or 1). If the factor is trivial, try again with a different linear dependency or different ''a''. The remainder of this article explains details and extensions of this basic algorithm.


The algorithm's pseudo code

algorithm Quadratic sieve is Choose smoothness bound B let t = \pi(B) for i \in do Choose x \in \ a_i = x + \lfloor\sqrt\rfloor b_i = a_i^2 - n (where b_i = \prod_^t p_j^ ) while (check-p_t-smooth(b_i) = false) do Let v_i = (v_, v_, ..., v_) = (e_ \bmod 2, e_ \bmod 2, ..., e_ \bmod 2) Find T \subseteq \: \sum_ v_i = 0 \in \mathbb_2 let x = \prod_ a_i \bmod n let y = \prod_^t p_j^ \bmod n if x \not\equiv -y \bmod n and x \not\equiv y then return gcd(x - y, n) , gcd(x + y, n) else : return to main loop.


How QS optimizes finding congruences

The quadratic sieve attempts to find pairs of integers ''x'' and ''y''(''x'') (where ''y''(''x'') is a function of ''x'') satisfying a much weaker condition than ''x''2 ≡ ''y''2 (mod ''n''). It selects a set of
primes A prime number (or a prime) is a natural number greater than 1 that is not a product of two smaller natural numbers. A natural number greater than 1 that is not prime is called a composite number. For example, 5 is prime because the only ways ...
called the ''factor base'', and attempts to find ''x'' such that the least absolute remainder of ''y''(''x'') = ''x''2 mod ''n'' factorizes completely over the factor base. Such ''y'' values are said to be ''smooth'' with respect to the factor base. The factorization of a value of ''y''(''x'') that splits over the factor base, together with the value of ''x'', is known as a ''relation''. The quadratic sieve speeds up the process of finding relations by taking ''x'' close to the square root of ''n''. This ensures that ''y''(''x'') will be smaller, and thus have a greater chance of being smooth. :y(x)=\left(\left\lceil\sqrt\right\rceil+x\right)^2-n\hboxx\hbox :y(x)\approx 2x\left\lceil\sqrt\right\rceil This implies that ''y'' is on the order of 2''x''[]. However, it also implies that ''y'' grows linearly with ''x'' times the square root of ''n''. Another way to increase the chance of smoothness is by simply increasing the size of the factor base. However, it is necessary to find at least one smooth relation more than the number of primes in the factor base, to ensure the existence of a linear dependency.


Partial relations and cycles

Even if for some relation ''y''(''x'') is not smooth, it may be possible to merge two of these ''partial relations'' to form a full one, if the two ''y''s are products of the same prime(s) outside the factor base. ote that this is equivalent to extending the factor base.For example, if the factor base is and ''n'' = 91, there are partial relations: : : Multiply these together: : and multiply both sides by (11−1)2 modulo 91. 11−1 modulo 91 is 58, so: :(58\cdot 21\cdot 29)^2\equiv 2^1\cdot7^1\pmod :14^2\equiv 2^1\cdot7^1\pmod producing a full relation. Such a full relation (obtained by combining partial relations) is called a ''cycle''. Sometimes, forming a cycle from two partial relations leads directly to a congruence of squares, but rarely.


Checking smoothness by sieving

There are several ways to check for smoothness of the ''y''s. The most obvious is by
trial division Trial division is the most laborious but easiest to understand of the integer factorization algorithms. The essential idea behind trial division tests to see if an integer ''n'', the integer to be factored, can be divided by each number in turn ...
, although this increases the running time for the data collection phase. Another method that has some acceptance is the
elliptic curve method The Lenstra elliptic-curve factorization or the elliptic-curve factorization method (ECM) is a fast, sub-exponential running time, algorithm for integer factorization, which employs elliptic curves. For general-purpose factoring, ECM is the thi ...
(ECM). In practice, a process called ''sieving'' is typically used. If ''f''(''x'') is the
polynomial In mathematics, a polynomial is an expression consisting of indeterminates (also called variables) and coefficients, that involves only the operations of addition, subtraction, multiplication, and positive-integer powers of variables. An exa ...
f(x)=x^2-n we have :\begin f(x)&=x^2-n \\ f(x+kp) &= (x+kp)^2-n \\ &= x^2+2xkp+(kp)^2-n \\ &= f(x)+2xkp+(kp)^2\equiv f(x)\pmod \end Thus solving ''f(x)'' ≡ 0 (mod ''p'') for ''x'' generates a whole sequence of numbers ''y'' for which ''y''=''f''(''x''), all of which are divisible by ''p''. This is finding a square root modulo a prime, for which there exist efficient algorithms, such as the Shanks–Tonelli algorithm. (This is where the quadratic sieve gets its name: ''y'' is a quadratic polynomial in ''x'', and the sieving process works like the
Sieve of Eratosthenes In mathematics, the sieve of Eratosthenes is an ancient algorithm for finding all prime numbers up to any given limit. It does so by iteratively marking as composite (i.e., not prime) the multiples of each prime, starting with the first prime n ...
.) The sieve starts by setting every entry in a large array ''A''[] of bytes to zero. For each ''p'', solve the quadratic equation mod ''p'' to get two roots ''α'' and ''β'', and then add an approximation to log(''p'') to every entry for which ''y''(''x'') = 0 mod ''p'' ... that is, ''A'' 'kp'' + ''α''and ''A'' 'kp'' + ''β'' It is also necessary to solve the quadratic equation modulo small powers of ''p'' in order to recognise numbers divisible by small powers of a factor-base prime. At the end of the factor base, any ''A''[] containing a value above a threshold of roughly log(''x''2−''n'') will correspond to a value of ''y''(''x'') which splits over the factor base. The information about exactly which primes divide ''y''(''x'') has been lost, but it has only small factors, and there are many good algorithms for factoring a number known to have only small factors, such as trial division by small primes,
SQUFOF Shanks's square forms factorization is a method for integer factorization devised by Daniel Shanks as an improvement on Fermat's factorization method. The success of Fermat's method depends on finding integers x and y such that x^2-y^2=N, where N ...
, Pollard rho, and ECM, which are usually used in some combination. There are many ''y''(''x'') values that work, so the factorization process at the end doesn't have to be entirely reliable; often the processes misbehave on say 5% of inputs, requiring a small amount of extra sieving.


Example of basic sieve

This example will demonstrate standard quadratic sieve without logarithm optimizations or prime powers. Let the number to be factored ''N'' = 15347, therefore the ceiling of the square root of ''N'' is 124. Since ''N'' is small, the basic polynomial is enough: ''y''(''x'') = (''x'' + 124)2 − 15347.


Data collection

Since ''N'' is small, only 4 primes are necessary. The first 4 primes ''p'' for which 15347 has a square root mod ''p'' are 2, 17, 23, and 29 (in other words, 15347 is a
quadratic residue In number theory, an integer ''q'' is called a quadratic residue modulo ''n'' if it is congruent to a perfect square modulo ''n''; i.e., if there exists an integer ''x'' such that: :x^2\equiv q \pmod. Otherwise, ''q'' is called a quadratic no ...
modulo each of these primes). These primes will be the basis for sieving. Now we construct our sieve V_X of Y(X) = (X + \lceil\sqrt\rceil)^2 - N = (X+124)^2-15347 and begin the sieving process for each prime in the basis, choosing to sieve the first 0 ≤ X < 100 of Y(X): : \beginV &= \begin Y(0) & Y(1) & Y(2) & Y(3) & Y(4) & Y(5) & \cdots & Y(99) \end \\ & =\begin 29 & 278 & 529 & 782 & 1037 & 1294 & \cdots & 34382 \end\end The next step is to perform the sieve. For each ''p'' in our factor base \lbrace 2, 17, 23, 29\rbrace solve the equation :Y(X) \equiv (X + \lceil\sqrt\rceil)^2 - N \equiv 0 \pmod to find the entries in the array ''V'' which are divisible by ''p''. For p=2 solve (X + 124)^2 - 15347 \equiv 0 \pmod to get the solution X \equiv \sqrt-124 \equiv 1 \pmod. Thus, starting at X=1 and incrementing by 2, each entry will be divisible by 2. Dividing each of those entries by 2 yields : V = \begin 29 & 139 & 529 & 391 & 1037 & 647 & \cdots & 17191 \end Similarly for the remaining primes ''p'' in \lbrace 17, 23, 29\rbrace the equationX \equiv \sqrt - 124 \pmod is solved. Note that for every ''p'' > 2, there will be 2 resulting linear equations due to there being 2 modular square roots. : \begin X & \equiv \sqrt - 124 & \equiv 8 - 124 & \equiv 3\pmod \\ & & \equiv 9 - 124 & \equiv 4\pmod \\ X & \equiv \sqrt - 124 & \equiv 11 - 124 & \equiv 2\pmod \\ & & \equiv 12 - 124 & \equiv 3\pmod \\ X & \equiv \sqrt - 124 & \equiv 8 - 124 & \equiv 0\pmod \\ & & \equiv 21 - 124 & \equiv 13\pmod \\ \end Each equation X \equiv a \pmod results in V_x being divisible by ''p'' at ''x''=''a'' and each ''p''th value beyond that. Dividing ''V'' by ''p'' at ''a'', ''a''+''p'', ''a''+2''p'', ''a''+3''p'', etc., for each prime in the basis finds the smooth numbers which are products of unique primes (first powers). : V = \begin 1 & 139 & 23 & 1 & 61 & 647 & \cdots & 17191 \end Any entry of ''V'' that equals 1 corresponds to a smooth number. Since V_0, V_3, and V_ equal one, this corresponds to:


Matrix processing

Since smooth numbers ''Y'' have been found with the property Y \equiv Z^2 \pmod, the remainder of the algorithm follows equivalently to any other variation of
Dixon's factorization method In number theory, Dixon's factorization method (also Dixon's random squares method or Dixon's algorithm) is a general-purpose integer factorization algorithm; it is the prototypical factor base method. Unlike for other factor base methods, its run- ...
. Writing the exponents of the product of a subset of the equations : \begin 29 &= 2^0 \cdot 17^0 \cdot 23^0 \cdot 29^1 \\ 782 &= 2^1 \cdot 17^1 \cdot 23^1 \cdot 29^0 \\ 22678 &= 2^1 \cdot 17^1 \cdot 23^1 \cdot 29^1 \\ \end as a matrix \pmod yields: : S \cdot \begin 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 \end \equiv \begin 0 & 0 & 0 & 0 \end \pmod A solution to the equation is given by the
left null space In mathematics, the kernel of a linear map, also known as the null space or nullspace, is the linear subspace of the domain of the map which is mapped to the zero vector. That is, given a linear map between two vector spaces and , the kernel of ...
, simply : S = \begin1 & 1 & 1 \end Thus the product of all 3 equations yields a square (mod N). : 29 \cdot 782 \cdot 22678 = 22678^2 and : 124^2 \cdot 127^2 \cdot 195^2 = 3070860^2 So the algorithm found : 22678^2 \equiv 3070860^2 \pmod Testing the result yields GCD(3070860 - 22678, 15347) = 103, a nontrivial factor of 15347, the other being 149. This demonstration should also serve to show that the quadratic sieve is only appropriate when ''n'' is large. For a number as small as 15347, this algorithm is overkill.
Trial division Trial division is the most laborious but easiest to understand of the integer factorization algorithms. The essential idea behind trial division tests to see if an integer ''n'', the integer to be factored, can be divided by each number in turn ...
or Pollard rho could have found a factor with much less computation.


Multiple polynomials

In practice, many different
polynomial In mathematics, a polynomial is an expression consisting of indeterminates (also called variables) and coefficients, that involves only the operations of addition, subtraction, multiplication, and positive-integer powers of variables. An exa ...
s are used for ''y'', since only one polynomial will not typically provide enough (''x'', ''y'') pairs that are smooth over the factor base. The polynomials used must have a special form, since they need to be squares modulo ''n''. The polynomials must all have a similar form to the original ''y''(''x'') = ''x''2 − ''n'': :y(x)=(Ax+B)^2-n \qquad A,B\in\mathbb Assuming B^2-n is a multiple of A, so that B^2-n = AC the polynomial y(x) can be written as y(x) = A\cdot(Ax^2+2Bx+C). If ''A'' is a square, then only the factor (Ax^2+2Bx+C) has to be considered. This approach (called MPQS, Multiple Polynomial Quadratic Sieve) is ideally suited for
parallelization Parallel computing is a type of computation in which many calculations or processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. There are several different fo ...
, since each
processor Processor may refer to: Computing Hardware * Processor (computing) **Central processing unit (CPU), the hardware within a computer that executes a program *** Microprocessor, a central processing unit contained on a single integrated circuit (I ...
involved in the factorization can be given ''n'', the factor base and a collection of polynomials, and it will have no need to communicate with the central processor until it is finished with its polynomials.


Large primes


One large prime

If, after dividing by all the factors less than ''A'', the remaining part of the number (the cofactor) is less than ''A''2, then this cofactor must be prime. In effect, it can be added to the factor base, by sorting the list of relations into order by cofactor. If y(a) = 7*11*23*137 and y(b) = 3*5*7*137, then y(a)y(b) = 3*5*11*23 * 72 * 1372. This works by reducing the threshold of entries in the sieving array above which a full factorization is performed.


More large primes

Reducing the threshold even further, and using an effective process for factoring y(x) values into products of even relatively large primes - ECM is superb for this - can find relations with most of their factors in the factor base, but with two or even three larger primes. Cycle finding then allows combining a set of relations sharing several primes into a single relation.


Parameters from realistic example

To illustrate typical parameter choices for a realistic example on a real implementation including the multiple polynomial and large prime optimizations, the too
msieve
was run on a 267-bit
semiprime In mathematics, a semiprime is a natural number that is the product of exactly two prime numbers. The two primes in the product may equal each other, so the semiprimes include the squares of prime numbers. Because there are infinitely many prime nu ...
, producing the following parameters: * Trial factoring cutoff: 27 bits * Sieve interval (per polynomial): 393216 (12 blocks of size 32768) * Smoothness bound: 1300967 (50294 primes) * Number of factors for polynomial ''A'' coefficients: 10 ''(see Multiple polynomials above)'' * Large prime bound: 128795733 (26 bits) ''(see Large primes above)'' * Smooth values found: 25952 by sieving directly, 24462 by combining numbers with large primes * Final matrix size: 50294 × 50414, reduced by filtering to 35750 × 35862 * Nontrivial dependencies found: 15 * Total time (on a 1.6 GHz UltraSPARC III): 35 min 39 seconds * Maximum memory used: 8 MB


Factoring records

Until the discovery of the
number field sieve In number theory, the general number field sieve (GNFS) is the most efficient classical algorithm known for factoring integers larger than . Heuristically, its complexity for factoring an integer (consisting of bits) is of the form :\exp\left( ...
(NFS), QS was the asymptotically fastest known general-purpose factoring algorithm. Now,
Lenstra elliptic curve factorization The Lenstra elliptic-curve factorization or the elliptic-curve factorization method (ECM) is a fast, sub-exponential running time, algorithm for integer factorization, which employs elliptic curves. For general-purpose factoring, ECM is the th ...
has the same asymptotic running time as QS (in the case where ''n'' has exactly two prime factors of equal size), but in practice, QS is faster since it uses
single-precision Single-precision floating-point format (sometimes called FP32 or float32) is a computer number format, usually occupying 32 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. A floating- ...
operations instead of the multi-precision operations used by the elliptic curve method. On April 2, 1994, the factorization of
RSA-129 In mathematics, the RSA numbers are a set of large semiprimes (numbers with exactly two prime factors) that were part of the RSA Factoring Challenge. The challenge was to find the prime factors of each number. It was created by RSA Laboratories in ...
was completed using QS. It was a 129-digit number, the product of two large primes, one of 64 digits and the other of 65. The factor base for this factorization contained 524339 primes. The data collection phase took 5000 MIPS-years, done in distributed fashion over the Internet. The data collected totaled 2 GB. The data processing phase took 45 hours on
Bellcore iconectiv is a supplier of network planning and network management services to telecommunications providers. Known as Bellcore after its establishment in the United States in 1983 as part of the break-up of the Bell System, the company's name ...
's (now
Telcordia Technologies iconectiv is a supplier of network planning and network management services to telecommunications providers. Known as Bellcore after its establishment in the United States in 1983 as part of the break-up of the Bell System, the company's name ...
)
MasPar MasPar Computer Corporation was a minisupercomputer vendor that was founded in 1987 by Jeff Kalb. The company was based in Sunnyvale, California. History While Kalb was the vice-president of the division of Digital Equipment Corporation (DEC) t ...
(massively parallel) supercomputer. This was the largest published factorization by a general-purpose algorithm, until NFS was used to factor
RSA-130 In mathematics, the RSA numbers are a set of large semiprimes (numbers with exactly two prime factors) that were part of the RSA Factoring Challenge. The challenge was to find the prime factors of each number. It was created by RSA Laboratories in ...
, completed April 10, 1996. All
RSA number In mathematics, the RSA numbers are a set of large semiprimes (numbers with exactly two prime factors) that were part of the RSA Factoring Challenge. The challenge was to find the prime factors of each number. It was created by RSA Laboratories in ...
s factored since then have been factored using NFS. The current QS factorization record is the 140 digit (463 bit) RSA-140, which was factored by Patrick Konsor in June 2020 using approximately 6,000 core hours over 6 days.


Implementations


PPMPQS and PPSIQS

mpqs

SIMPQS
is a fast implementation of the self-initialising multiple polynomial quadratic sieve written by William Hart. It provides support for the large prime variant and uses Jason Papadopoulos' block Lanczos code for the linear algebra stage. SIMPQS is accessible as the qsieve command in the
SageMath SageMath (previously Sage or SAGE, "System for Algebra and Geometry Experimentation") is a computer algebra system (CAS) with features covering many aspects of mathematics, including algebra, combinatorics, graph theory, numerical analysis, numbe ...
computer algebra package or can be downloaded in source form. SIMPQS is optimized for use on Athlon and Opteron machines, but will operate on most common 32- and 64-bit architectures. It is written entirely in C. *
factoring applet
by Dario Alpern, that uses the quadratic sieve if certain conditions are met. * The
PARI/GP PARI/GP is a computer algebra system with the main aim of facilitating number theory computations. Versions 2.1.0 and higher are distributed under the GNU General Public License. It runs on most common operating systems. System overview The P ...
computer algebra package includes an implementation of the self-initialising multiple polynomial quadratic sieve implementing the large prime variant. It was adapted by Thomas Papanikolaou and Xavier Roblot from a sieve written for the LiDIA project. The self initialisation scheme is based on an idea from the thesis of Thomas Sosnowski. * A variant of the quadratic sieve is available in the
MAGMA Magma () is the molten or semi-molten natural material from which all igneous rocks are formed. Magma is found beneath the surface of the Earth, and evidence of magmatism has also been discovered on other terrestrial planets and some natural sa ...
computer algebra package. It is based on an implementation of Arjen Lenstra from 1995, used in his "factoring by email" program.
msieve
an implementation of the multiple polynomial quadratic sieve with support for single and double large primes, written by Jason Papadopoulos. Source code and a Windows binary are available.
YAFU
written by Ben Buhrow, is similar to msieve but is faster for most modern
processors A central processing unit (CPU), also called a central processor, main processor or just processor, is the electronic circuitry that executes instructions comprising a computer program. The CPU performs basic arithmetic, logic, controlling, and ...
. It uses Jason Papadopoulos' block Lanczos code. Source code and binaries for Windows and Linux are available.
Ariel
a simple Java implementation of the quadratic sieve for didactic purposes. * Th
java-math-library
contains probably the fastest quadratic sieve written in Java (the successor of PSIQS 4.0).
Java QS
an open source Java project containing basic implementation of QS. Released at February 4, 2016 by Ilya Gazman
C Quadratic Sieve
a factorizer written entirely in C containing implementation of self-initialising Quadratic Sieve. The project is mathematically inspired by a William Hart's FLINT factorizer. The source released in 2022 by Michel Leonard does not rely on external library, it is capable of factoring 220-bit numbers in a minute. * Th
RcppBigIntAlgos
package by Joseph Wood, provides an efficient implementation of the multiple polynomial quadratic sieve for the
R programming language R is a programming language for statistical computing and graphics supported by the R Core Team and the R Foundation for Statistical Computing. Created by statisticians Ross Ihaka and Robert Gentleman, R is used among data miners, bioinform ...
. It is written in C++ and is capable of comfortably factoring 100 digit
semiprimes In mathematics, a semiprime is a natural number that is the product of exactly two prime numbers. The two primes in the product may equal each other, so the semiprimes include the squares of prime numbers. Because there are infinitely many prime ...
. For example,
RSA-100 In mathematics, the RSA numbers are a set of large semiprimes (numbers with exactly two prime factors) that were part of the RSA Factoring Challenge. The challenge was to find the prime factors of each number. It was created by RSA Laboratories in ...
was factored in under 16 hours on a 2.8 GHz Quad-Core Intel Core i7 personal computer.


See also

*
Lenstra elliptic curve factorization The Lenstra elliptic-curve factorization or the elliptic-curve factorization method (ECM) is a fast, sub-exponential running time, algorithm for integer factorization, which employs elliptic curves. For general-purpose factoring, ECM is the th ...
*
primality test A primality test is an algorithm for determining whether an input number is prime. Among other fields of mathematics, it is used for cryptography. Unlike integer factorization, primality tests do not generally give prime factors, only stating whet ...


References

* Section 6.1: The quadratic sieve factorization method, pp. 227–244. *


Other external links

* Reference pape
"The Quadratic Sieve Factoring Algorithm"
by Eric Landquist {{DEFAULTSORT:Quadratic Sieve Integer factorization algorithms