The polymath blog

October 27, 2009

(Research thread V) Determinstic way to find primes

Filed under: finding primes, research — Terence Tao @ 10:25 pm

It’s probably time to refresh the previous thread for the “finding primes” project, and to summarise the current state of affairs.

The current goal is to find a deterministic way to locate a prime in an interval [z,2z] in time that breaks the “square root barrier” of \sqrt(z) (or more precisely, z^{1/2+o(1)}).  Currently, we have two ways to reach that barrier:

  1. Assuming the Riemann hypothesis, the largest prime gap in [z,2z] is of size z^{1/2+o(1)}.  So one can simply test consecutive numbers for primality until one gets a hit (using, say, the AKS algorithm, any number of size z can be tested for primality in time z^{o(1)}.
  2. The second method is due to Odlyzko, and does not require the Riemann hypothesis.  There is a contour integration formula that allows one to write the prime counting function \pi(z) up to error z^{1+o(1)}/T in terms of an integral involving the Riemann zeta function over an interval of length O(T), for any 1 \leq T \leq z.  The latter integral can be computed to the required accuracy in time about z^{o(1)} T.  With this and a binary search it is not difficult to locate an interval of width z^{1+o(1)}/T that is guaranteed to contain a prime in time z^{o(1)} T.  Optimising by choosing T = z^{1/2} and using a sieve (or by testing the elements for primality one by one), one can then locate that prime in time z^{1/2+o(1)}.

Currently we have one promising approach to break the square root barrier, based on the polynomial method, but while individual components of this approach fall underneath the square root barrier, we have not yet been able to get the whole thing below (or even matching) the square root.  I will sketch the approach (as far as I understand it) below; right now we are needing some shortcuts (e.g. FFT, fast matrix multiplication, that sort of thing) that can cut the run time further.

– The polynomial method –

The polynomial method begins with the following observation: in order to quickly find a prime in [z,2z], it suffices to be able to quickly solve the prime decision problem: given a subinterval [a,b] of [z,2z], decide whether such an interval contains a prime or not.  If one can solve this problem in, say, z^{0.499+o(1)} time, then one can find a prime in this time also by binary search.

Actually, using Odlyzko’s method we can already narrow down to an interval of length z^{0.501+o(1)} with a lot of primes in it in z^{0.499+o(1)} time, so we only need to break the square root barrier for the decision problem for intervals [a,b] of length z^{0.501+o(1)} or less.

The decision problem is equivalent to determining whether the prime polynomial

f(x) := \sum_{a \leq p \leq b} x^p (1)

is non-trivial or not, where p ranges over primes in the interval [a,b].

Now, the prime polynomial, as it stands, has a high complexity; the only obvious way to compute it is to enumerate all the primes from a to b, which could take z^{0.501+o(1)} time in the worst case.  But we can improve matters by working modulo 2; note that as the coefficients of f are either 1 or 0, it suffices to decide whether f is non-trivial modulo 2.

The reason we do this is the observation that if n is a natural number, then the number of solutions to the diophantine equation n=lm with 1 \leq l < m is odd when n is prime, and usually even when n is composite.  (There are some rare exceptions to this latter fact, when n contains square factors, but it seems likely that one can deal with these latter cases by Möbius inversion, exploiting the convergence of the sum \sum_{d=1}^\infty \frac{1}{d^2}.)  So, the prime polynomial f modulo 2 is morally equal to the variant polynomial

\tilde f(x) := \sum_{1 \leq l < m: a \leq lm \leq b} x^{lm}.  (2)

So a toy problem would be to decide whether (2) was non-zero modulo 2 or not in time z^{0.499+o(1)} or better.

The reason that (2) is more appealing than (1) is that the primes have disappeared from the problem.  Instead, one is computing a sum over a fairly simple region \Omega := \{ (l,m): 1 \leq l < m, a \leq lm \leq b \} bounded by two hyperbolae and two lines.

The point is now this: if \tilde f(x) vanishes modulo 2, then it also vanishes modulo (2,g(x)) for any low-degree polynomial g (degree z^{0.01+o(1)} or better), and more generally $\tilde f(x^n)$ vanishes modulo (2,g(x)).  Conversely (if one is lucky), if \tilde f(x^n) vanishes modulo (2,g(x)) for sufficiently many n, g, then it should be that \tilde f vanishes.  So this leads to the following strategy:

  • Goal 1: Find a collection of n,g such that if \tilde f(x^n) vanishes modulo (2,g(x)) for all the pairs n,g, then \tilde f vanishes modulo 2.
  • Goal 2: Find a way to decide whether \tilde f(x^n) vanishes modulo (2,g(x)) for all the required n,g in time z^{0.499+o(1)} or better.

One way to achieve Goal 1 is to forget about n, and choose the g so that the least common multiple of all the g (modulo 2) cannot divide \tilde f.  Since \tilde f is basically a polynomial of degree z^{0.501+o(1)} shifted by a monomial, one obvious way to proceed would be to pick more than z^{0.501+o(1)} polynomials g.  But then it looks unlikely that one can beat the square root barrier in Goal 2.  Similarly if one varies n as well as g.

On the other hand, we have a partial result in Goal 2: for any fixed n and g, we can compute \tilde f(x^n) \hbox{ mod } (2, g(x)) in time below the square root barrier, e.g. in time z^{0.49+o(1)}.  For instance, setting n=1 and g(x)=x-1, we can compute \tilde f(1) in this time.  Unfortunately a single n,g is not nearly enough to solve Goal 2 yet, so we either need a further advance on Goal 1, or some very efficient way to test non-vanishing of \tilde f(x^n) modulo (2,g) for many pairs (n,g) at a time (e.g. by an FFT type approach).

The partial result is based on the fact that \tilde f(x) has an arithmetic circuit complexity below the square root level, i.e. it can be expressed in terms of O(z^{0.48+o(1)}) (say) arithmetic operations.  As such, for any low-degree g (say degree O(z^{o.01+o(1)})), \tilde f(x^n) \mod (2,g(x) ) can be computed in O(z^{0.49+o(1)}) time (using fast multiplication for mod g arithmetic if necessary).

Let’s sketch how the circuit complexity result works.  Recall that (2) is a sum over the geometric region \Omega.  Using the geometric series formula, one can convert this sum over \Omega to a sum over what is basically the boundary of \Omega.  This boundary has O( \sqrt{z} ) points, so this shows that \tilde f has an arithmetic circuit complexity of z^{1/2} already.  But one can do better by using the Farey sequence to represent the discrete hyperbolae that bound \Omega by line segments.  The sum over each line segment is basically a quadratic sum of the form \sum_{n=n_0}^{n_1} x^{a n^2 + bn + c} for various coefficients a,b,c.  It seems that one can factorise this sum as a matrix product and use ideas from the Strassen fast multiplication algorithm to give this a slightly better circuit complexity than the crude bound of O( n_1 - n_0 ); see these notes; there may also  be other approaches to computing this quickly (e.g. FFT).

Where we’re still stuck right now is scaling up this single case of Goal 2 to the more general case we need.  Alternatively, we need to strengthen Goal 1 by cutting down substantially the number of pairs (n,g) we need to test…

5 Comments »

  1. [...] http://polymathprojects.org/2009/10/27/research-thread-v-determinstic-way-to-find-primes/ Possibly related posts: (automatically generated)You’ll Always Find Your Way Back Home New ClipSOME BEST WAYS 2 FIND NEW MUSICRavelry Finds ~ New Ways to Warm UpNo Way Home: The Rules [...]

    Pingback by New thread for deterministic way to find primes « Euclidean Ramsey Theory — October 28, 2009 @ 5:08 pm | Reply

  2. Regarding writing the prime generating function f(X) as a sum \sum_i \alpha_i(X)\beta_i(X), where the \alpha_i and \beta_i are sparse polynomials (say, m terms each): I had a conversation with a friend of mine this past weekend who is an expert on real algebraic geometry, and who was in town for the FOCS conference here at Georgia Tech. He said that in the R[x] context (the \alpha_i, \beta_i are polynomials with real coefficients) this problem is a well-studied in terms of trying to generalize Descarte’s Rule of Sign. If a single polynomial has only m terms, Descarte’s rule implies that it can have at most O(m) non-zero real roots, and apparently it is a studied open problem to extend this to a sum of products of two sparse polynomials like we have (but in R[x]). Perhaps there are some techniques that algebraic geometers have developed that would be useful in bounding the number of roots of our analogous polynomials in F_2[x], or at least good conjectures. I didn’t get the chance to ask him what is known on this problem, but I will soon, and report back if I discover anything on it…

    Incidentally, *he* initiated the discussion of this problem — not me — as it pertained to one of the FOCS talks on arithmetic circuit complexity.

    Comment by Ernie Croot — October 28, 2009 @ 6:40 pm | Reply

    • Oops… I meant to say “bounding the number of roots… among 1,x,x^2,...,x^N, N \sim z^{0.49} in F_2[x]/(2,g(x))”.

      Comment by Ernie Croot — October 28, 2009 @ 7:01 pm | Reply

    • It seems that the result on Descartes Rule uses special properties of the reals, and probably won’t extend to the complex (or finite field) setting. Here is the problem I asked my friend:

      —-

      Problem. Suppose that f_1, ..., f_k, g_1,...,g_k are all polynomials
      in F_2[X] having m terms each. Let g(X) be an irreducible polynomial,
      such that the order of x (mod 2,g(x)) is at least m^2 k^2, say. Let

      f(X) = f_1(X)g_1(X) + ... + f_k(X)g_k(X).

      Show that f(1), f(x), f(x^2), ..., f(x^(2mk)) can’t all be 0 in F_2[X]/(g(x)).

      —-

      And here was his response:

      One remark about the problem. The fact that there exists any bound at all in terms of m and k in the real case has to do with the fact that we are bounding the number of real roots. Of course, no such bounds exist for complex roots — this makes me suspect that standard tools of algebraic geometry are probably not very useful in this context.

      An exponential bound (in m as well as k) in the real case follows from Khovansky’s theory Pfaffian functions (you might want to look at the book “Fewnomials” by Khovansky).
      But I do not think it can have applications in the finite field case where an exponential bound is probably obvious.

      Comment by Ernie Croot — October 29, 2009 @ 10:19 pm | Reply

  3. I am not sure of this would help but would an approach on de randomizing a probabilistic algorithm for finding primes help? I attended midwest theory day and it seemed it may apply.
    http://pages.cs.wisc.edu/~jkinne/research/KvMS_full_manuscript.pdf

    Comment by Joshua Herman — December 17, 2009 @ 4:01 am | Reply


RSS feed for comments on this post. TrackBack URI

Leave a comment

Blog at WordPress.com.