Integer Factoring – Quadratic Sieve

SPOILER ALERT!!! This is a long and rather boring article, quite technical; it was written in preparation for a Java implementation of the QS that will be included in the Factoring App presented on this site. The Java library is not ready yet, but it is under development.

In previous posts we saw something about integer factorization, in particular Trial Division and Pollard's Rho method. These methods require time exponential in the size of the problem: the size of the problem "factorize n" is log n, while these methods require on average O(√n) and O(∜n) steps, that is O(2^(b/2)) and O(2^(b/4)) complexity, where b = log₂ n is the size of n in bits.

We know that PRIMES can be solved in polynomial time (PRIMES is the problem of deciding whether a number is prime or composite, not necessarily finding a factor of it), but we don't know a tight upper bound for the running time of FACTOR. Since the eighties some sub-exponential algorithms for FACTOR have been found, but so far all of them are super-polynomial: functions f(n) that are asymptotically o(2^(n/a)) for every a, while every n^a is asymptotically o(f(n)); an example of such a function is n^(log n).

The Quadratic Sieve (QS) is one of the main algorithms for factoring large numbers, perhaps the most suitable for numbers with fewer than 120 digits. It was introduced by Pomerance in 1982 and it was the first factoring algorithm with sub-exponential heuristic running time.

Here we'll try to explain it; let us first recall some concepts. We assume that the reader is familiar with modular arithmetic. In particular:

  • p | n means that p divides n, that is n = pq for some integer q
  • a ≡ b (mod n) means that a = kn + b for some integer k (b is the remainder of the division of a by n)
  • the notion of quadratic residue plays a fundamental role in this algorithm; we refer to Wikipedia for it. Here, for graphical reasons, the Legendre symbol will be written as (a/p).

1.1 Fermat’s method

Let n = ab be an odd composite number (so that a and b are both odd). Then we can always write it as a difference of squares:

n = ab = [(a+b)/2]² – [(a−b)/2]² = x² – y²

Fermat's method tries to create such a difference of squares by considering the sequence x₀ = ceil(√n), x₁ = ceil(√n)+1, …, xₖ = ceil(√n)+k and looking for an xᵢ such that xᵢ² – n = y² for some integer y.

In this case we immediately find n = xᵢ² – y² = (xᵢ+y)(xᵢ−y).

This method has a worst case of O(n) cycles (and each cycle requires a multiplication, a subtraction and a square root extraction!), which is much worse than the O(√n) divisions required by trial division in the worst case. Nevertheless, finding a difference of squares is the approach of the QS and of the Number Field Sieve, the two main modern integer factoring algorithms.
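
As a rough sketch (certainly not the final library code), Fermat's method could look like this in Java; BigInteger.sqrt() requires Java 9 or later:

    import java.math.BigInteger;

    public class Fermat {
        // Returns a factor of an odd composite n found as a difference of squares,
        // or null if none is found within maxSteps iterations.
        static BigInteger fermat(BigInteger n, long maxSteps) {
            BigInteger x = n.sqrt();                                        // floor(sqrt(n))
            if (x.multiply(x).compareTo(n) < 0) x = x.add(BigInteger.ONE);  // ceil(sqrt(n))
            for (long k = 0; k < maxSteps; k++) {
                BigInteger diff = x.multiply(x).subtract(n);                // x^2 - n
                BigInteger y = diff.sqrt();
                if (y.multiply(y).equals(diff)) {                           // x^2 - n = y^2
                    return x.subtract(y);                                   // n = (x+y)(x-y)
                }
                x = x.add(BigInteger.ONE);
            }
            return null;
        }
    }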

1.2 y-smooth numbers

A y-smooth number is a number whose prime factors are all less than or equal to y. For example 120 = 2³ · 3 · 5 is 5-smooth.
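
For example, a straightforward y-smoothness test by trial division could be (a hypothetical helper; primesUpToY is assumed to contain every prime up to y, in increasing order):

    public class Smoothness {
        // Returns true if n is y-smooth, i.e. all its prime factors are <= y.
        static boolean isSmooth(long n, long[] primesUpToY) {
            for (long p : primesUpToY) {
                while (n % p == 0) n /= p;
                if (n == 1) return true;
            }
            return n == 1;
        }
    }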

1.3 Modular difference of squares

Fermat's method tries to directly find a relation n = x² – y² in order to factor n as (x+y)(x−y). Now consider instead

(1) x² ≡ y² (mod n).

If

(2) x ≢ ± y (mod n)

we can obtain a non-trivial factor of n from gcd(n, x−y). In fact from (1) we know

x² – y² = kn, that is (x+y)(x−y) = kn

but from (2) n divides neither (x+y) nor (x−y), so it must share some non-trivial factor with each of the two terms.
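
In code, once such a pair (x, y) is found, the factor extraction is just a gcd; a minimal sketch with BigInteger:

    import java.math.BigInteger;

    public class CongruenceOfSquares {
        // Given x^2 ≡ y^2 (mod n), gcd(n, x - y) is a divisor of n; if also x ≢ ±y (mod n)
        // it is a proper non-trivial factor. In practice one checks the result and looks
        // for another pair (x, y) if it is 1 or n.
        static BigInteger factorFromCongruence(BigInteger n, BigInteger x, BigInteger y) {
            return n.gcd(x.subtract(y));
        }
    }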

1.4 Obtaining a modular difference of squares

If we have a set of y-smooth numbers (mod n), like

u₁² ≡ abc (mod n)
u₂² ≡ ac (mod n)
u₃² ≡ b (mod n)

we can combine them into

(u₁u₂u₃)² ≡ (abc)² (mod n)

and then try to factor n as in paragraph 1.3. Similarly to y-smooth numbers, we call a relation

u² ≡ v (mod n)

a y-smooth relation if v is y-smooth.
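
As a sketch of this combination step: if we already know a subset of relations uᵢ² ≡ vᵢ (mod n) whose product of the vᵢ is a perfect square, we can build x and y and fall back to the gcd of paragraph 1.3 (a hypothetical helper; BigInteger.sqrt() requires Java 9+):

    import java.math.BigInteger;
    import java.util.List;

    public class Combine {
        // us.get(i)^2 ≡ vs.get(i) (mod n) for each i, and the product of the vs is assumed
        // to be a perfect square. Returns gcd(n, x - y), hopefully a proper factor of n.
        static BigInteger combine(BigInteger n, List<BigInteger> us, List<BigInteger> vs) {
            BigInteger x = BigInteger.ONE;
            BigInteger prodV = BigInteger.ONE;
            for (BigInteger u : us) x = x.multiply(u).mod(n);
            for (BigInteger v : vs) prodV = prodV.multiply(v);
            BigInteger y = prodV.sqrt().mod(n);   // exact square root by assumption
            return n.gcd(x.subtract(y));
        }
    }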

1.5 The Quadratic Sieve

The Quadratic Sieve proposes an efficient way to find y-smooth modular relations. In [2] there is a great explanation of the method and its complexity. An important parameter is the smoothness bound for the relations; the bound is called B, so we search for B-smooth relations.

QS sieves the polynomial f(x) = x² − n, similarly to the sieve of Eratosthenes, to quickly find B-smooth relations and tries to combine them into a modular difference of squares.

If f(x) is divisible by p, then all f(x + kp) are divisible by p as well.

If we have an array M of elements representing the indexes of the xᵢ, and we know that p divides f(xᵢ), then we can mark cell i and every cell i + kp of M. Using sieving in place of trial dividing all the f(x) saves a lot of computational resources.

If after sieving with all the primes up to y the cell i of M results completely factored, then f(xᵢ) is y-smooth.

For xₖ = ceil(√n)+k, k > 0, the value f(xₖ) = xₖ² − n = uₖ can be read as xₖ² ≡ uₖ (mod n), so if sieving f(x) gives us a sufficient number of B-smooth modular relations, we can try to combine them into a modular difference of squares as in 1.4.

This explains the name of the algorithm: it sieves a quadratic polynomial to construct a modular difference of squares.
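
A very simplified sketch of the sieving step over an interval of M values x = ceil(√n)+i, i = 0..M-1, assuming that for every prime p of the factor base (see 2.1) the roots of X² ≡ n (mod p) have already been computed. Plain long arithmetic is used, so this only makes sense for small n, and divisions are kept for clarity where a real implementation would add logarithms (see 2.5):

    import java.util.ArrayList;
    import java.util.List;

    public class QsSieve {
        // f(x) = x^2 - n for x = sqrtN + i, i = 0..M-1 (sqrtN = ceil(sqrt(n))).
        // factorBase[j] is a prime p, roots[j] the solutions of X^2 ≡ n (mod p), reduced mod p.
        // Returns the indexes i such that f(sqrtN + i) factors completely over the factor base.
        static List<Integer> sieve(long n, long sqrtN, int M, int[] factorBase, int[][] roots) {
            long[] remaining = new long[M];
            for (int i = 0; i < M; i++) remaining[i] = (sqrtN + i) * (sqrtN + i) - n;
            for (int j = 0; j < factorBase.length; j++) {
                int p = factorBase[j];
                for (int r : roots[j]) {
                    // first index i >= 0 such that (sqrtN + i) ≡ r (mod p)
                    int start = (int) (((r - sqrtN) % p + p) % p);
                    for (int i = start; i < M; i += p) {
                        while (remaining[i] % p == 0) remaining[i] /= p;   // strip the full power of p
                    }
                }
            }
            List<Integer> smooth = new ArrayList<>();
            for (int i = 0; i < M; i++) if (remaining[i] == 1) smooth.add(i);
            return smooth;
        }
    }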

Some enhancements

Here we list a series of enhancements for QS; in this post we don't go into detail about them, but in the future we'll hopefully show them in a real Java implementation, comparing their effectiveness.

2.1 Factor base

When sieving we don't need to use all the primes p < B, but only those for which a solution to X² − n ≡ 0 (mod p) exists, i.e. the primes p such that n is a quadratic residue modulo p. We can precompute them, together with the solutions of X² ≡ n (mod p), and reduce the sieving effort. These primes form the factor base (FB).
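
A sketch of the factor base construction using Euler's criterion (for an odd prime p not dividing n, n is a quadratic residue mod p if and only if n^((p-1)/2) ≡ 1 (mod p)); primesUpToB is a hypothetical precomputed list of the primes up to B:

    import java.math.BigInteger;
    import java.util.ArrayList;
    import java.util.List;

    public class FactorBase {
        // Keeps only the primes p <= B for which n is a quadratic residue mod p,
        // i.e. the primes that can actually divide some f(x) = x^2 - n.
        static List<Integer> build(BigInteger n, List<Integer> primesUpToB) {
            List<Integer> fb = new ArrayList<>();
            for (int p : primesUpToB) {
                if (p == 2) { fb.add(2); continue; }            // 2 is always kept for odd n
                BigInteger bp = BigInteger.valueOf(p);
                BigInteger euler = n.modPow(bp.subtract(BigInteger.ONE).shiftRight(1), bp);
                if (euler.equals(BigInteger.ONE)) fb.add(p);    // (n/p) = 1
            }
            return fb;
        }
    }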

2.2 Multiplicative factor

The candidate n can be multiplied by a small factor k to obtain a factor base with smaller primes (and so a higher probability of finding B-smooth relations).
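
A rough sketch of the idea, as a heavily simplified version of the Knuth-Schroeppel criterion: score each small multiplier k by how many small odd primes admit kn as a quadratic residue (the real criterion also weights each prime by its expected contribution):

    import java.math.BigInteger;
    import java.util.List;

    public class Multiplier {
        // Picks the multiplier k in [1, maxK] for which k*n is a quadratic residue
        // modulo the largest number of small odd primes. Very naive scoring.
        static int choose(BigInteger n, int maxK, List<Integer> smallOddPrimes) {
            int bestK = 1, bestScore = -1;
            for (int k = 1; k <= maxK; k++) {
                BigInteger kn = n.multiply(BigInteger.valueOf(k));
                int score = 0;
                for (int p : smallOddPrimes) {
                    BigInteger bp = BigInteger.valueOf(p);
                    BigInteger euler = kn.modPow(bp.subtract(BigInteger.ONE).shiftRight(1), bp);
                    if (euler.equals(BigInteger.ONE)) score++;
                }
                if (score > bestScore) { bestScore = score; bestK = k; }
            }
            return bestK;
        }
    }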

2.3 Combination of the B-smooth relations

After collecting at least |FB|+1 B-smooth relations we can try to combine them into a square. Consider this example:

u₁² ≡ a³bc (mod n)
u₂² ≡ ac (mod n)
u₃² ≡ bd² (mod n)

We need to combine them into

(u₁u₂u₃)² ≡ (a²bcd)² (mod n)

The simplest approach is Gaussian reduction, with some nice simplifications and optimizations, but it requires O(B³) operations (actually the big O here hides a constant much smaller than 1); alternatively other methods can be used, like the Block Lanczos method, which requires O(B²) operations.

Using Gaussian reduction we can represent each relation as a vector containing the exponents, found during sieving, of each prime of the factor base. In particular, since we only care about the parity of the exponents, we just need to know whether each one is 0 or 1, so it is a binary vector. It can be stored as raw bits in memory (so 1 byte stores 8 primes of the FB); moreover the row operations are simple XORs, so using int or long for storing the data we can process 32 or 64 primes at once.
Many primes of the factor base won't be used at all during sieving, so we can make an a priori reduction of the matrix, removing all the columns without any 1.
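
A compact sketch of the Gaussian reduction over GF(2) with rows packed into long words, as described above; it returns one dependency, i.e. a subset of relations whose exponent vectors XOR to zero (a real implementation would collect all the dependencies, since a single one may lead to a trivial gcd):

    import java.util.ArrayList;
    import java.util.List;

    public class GaussGF2 {
        // rows[i] is the exponent vector (mod 2) of the i-th relation, packed into longs:
        // bit j of rows[i] is the parity of the exponent of the j-th factor base prime.
        // Returns the indexes of a set of relations whose vectors XOR to zero, or null.
        static List<Integer> findDependency(long[][] rows, int numColumns) {
            int numRows = rows.length, words = rows[0].length;
            // history[i] tracks which original relations are currently mixed into row i
            long[][] history = new long[numRows][(numRows + 63) / 64];
            for (int i = 0; i < numRows; i++) history[i][i / 64] |= 1L << (i % 64);

            boolean[] usedAsPivot = new boolean[numRows];
            for (int col = 0; col < numColumns; col++) {
                int pivot = -1;
                for (int i = 0; i < numRows; i++)
                    if (!usedAsPivot[i] && getBit(rows[i], col)) { pivot = i; break; }
                if (pivot < 0) continue;
                usedAsPivot[pivot] = true;
                for (int i = 0; i < numRows; i++) {
                    if (i != pivot && getBit(rows[i], col)) {
                        for (int w = 0; w < words; w++) rows[i][w] ^= rows[pivot][w];
                        for (int w = 0; w < history[i].length; w++) history[i][w] ^= history[pivot][w];
                    }
                }
            }
            // any non-pivot row is now all zero and its history describes a dependency
            for (int i = 0; i < numRows; i++) {
                if (usedAsPivot[i]) continue;
                boolean zero = true;
                for (long w : rows[i]) if (w != 0) { zero = false; break; }
                if (!zero) continue;
                List<Integer> dependency = new ArrayList<>();
                for (int j = 0; j < numRows; j++)
                    if ((history[i][j / 64] >>> (j % 64) & 1L) == 1) dependency.add(j);
                return dependency;
            }
            return null;
        }

        static boolean getBit(long[] v, int bit) {
            return (v[bit / 64] >>> (bit % 64) & 1L) == 1;
        }
    }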

2.4 Large Prime Variation and Double Large Prime Variation

Suppose that during the sieving some relation has an unfactored part greater than 1 but less than B². Since it has no prime factor up to B, as in Eratosthenes's sieve this remainder must be prime. We can save these relations for later use. In fact, if two relations have the same large final factor c, we can recombine them into a relation useful for our search:

u₁² ≡ a³bc (mod n)
u₂² ≡ ac (mod n)

become u₁²u₂² ≡ a⁴bc² (mod n), which can be used with the other relations in the algebra phase, since c now appears with an even exponent. This approach may seem cumbersome, but when factoring large candidates, as the sieving proceeds, the vast majority of the relations found comes from the large prime variation. Moreover it allows using a smaller B, and a smaller factor base also implies that we need to find fewer B-smooth relations.
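
A sketch of the bookkeeping for the single large prime variation: partial relations are stored in a map keyed by their large prime, and as soon as a second partial relation with the same large prime shows up, the two are merged into a full relation (the PartialRelation container and its fields are hypothetical placeholders):

    import java.math.BigInteger;
    import java.util.HashMap;
    import java.util.Map;

    public class LargePrimeCollector {
        // Hypothetical container for u^2 ≡ v (mod n), with v = smoothPart * largePrime.
        static class PartialRelation {
            BigInteger u, smoothPart, largePrime;
            PartialRelation(BigInteger u, BigInteger smoothPart, BigInteger largePrime) {
                this.u = u; this.smoothPart = smoothPart; this.largePrime = largePrime;
            }
        }

        private final Map<BigInteger, PartialRelation> pending = new HashMap<>();

        // Returns a full relation u^2 ≡ v (mod n), with v equal to a B-smooth number
        // times the square of the shared large prime, or null if this is the first
        // partial relation seen with this large prime.
        PartialRelation add(PartialRelation r, BigInteger n) {
            PartialRelation other = pending.remove(r.largePrime);
            if (other == null) {
                pending.put(r.largePrime, r);
                return null;
            }
            BigInteger u = r.u.multiply(other.u).mod(n);
            BigInteger v = r.smoothPart.multiply(other.smoothPart)
                            .multiply(r.largePrime.multiply(r.largePrime));
            return new PartialRelation(u, v, BigInteger.ONE);
        }
    }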

The double and triple large prime variations aim to use an even smaller B, but to use them we need a fast way to factor the relatively large unfactored parts; a suitable method could be the Elliptic Curve Method.

2.5 Small Prime Variation

Small primes in the factor base, like 2, 3, 5 and so on, give a small contribution to the factorization of f(x); nevertheless they are responsible for a large amount of the sieving work, so we can choose a threshold below which the primes of the factor base are not used, compensating with a corrective value.
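
In practice this goes together with sieving by adding logarithms instead of dividing: each cell accumulates log p for the primes that hit it, primes below the threshold are skipped, and a corrective constant lowers the acceptance bound; only the surviving cells are trial divided. A rough sketch, with hypothetical precomputed inputs:

    public class SmallPrimeSieve {
        // logF[i] is log|f(x_i)| for the i-th cell; factorBase[j] is a prime and starts[j]
        // the sieve start indexes for its roots. Primes below smallPrimeBound are skipped
        // (small prime variation); their average contribution is absorbed by 'correction'.
        static boolean[] sieveWithLogs(double[] logF, int[] factorBase, int[][] starts,
                                       int smallPrimeBound, double correction) {
            int M = logF.length;
            double[] acc = new double[M];
            for (int j = 0; j < factorBase.length; j++) {
                int p = factorBase[j];
                if (p < smallPrimeBound) continue;              // skip the small primes
                double logP = Math.log(p);
                for (int start : starts[j])                     // one start per root of X^2 ≡ n (mod p)
                    for (int i = start; i < M; i += p) acc[i] += logP;
            }
            boolean[] candidate = new boolean[M];
            for (int i = 0; i < M; i++)
                candidate[i] = acc[i] >= logF[i] - correction;  // only these get trial divided
            return candidate;
        }
    }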

2.6 Multiple Polynomials

The function f(x) = x² − n with x near √n has values of the order of √n, but as x grows the value of f(x) gets bigger and bigger and B-smooth relations become sparser. We can try to use different polynomials in order to always work with values around √n, taking care to build polynomials with the same factor base as the original one. For example, if we simply used fₖ(x) = x² − kn for x near √(kn), they wouldn't work, since they have a different factor base (they would work if we used every prime less than B and not only the factor base).

Here we don't give details on how to change the polynomial, but always working with values around √n gives a great speedup with respect to using a single polynomial whose values get bigger and bigger. The name of this technique is Multiple Polynomial Quadratic Sieve (MPQS).

2.7 Self Initializing Multiple Polynomials

Similar to MPQS, but the initialization work done for one polynomial allows creating several polynomials, so that we can change them more often; in this way f(x) is on average smaller and B-smooth relations are more likely.

Bibliography

[1] C. Pomerance. Analysis and comparison of some integer factoring algorithms. In H. Lenstra, Jr. and R. Tijdeman, editors, Computational methods in number theory, Part I, volume 154 of Math. Centre Tracts, pages 89-139, Math. Centrum, 1982

[2] R. Crandall, C. Pomerance. Prime Numbers, a Computational Perspective, II edition, Springer, 2005