The polymath blog

August 28, 2009

(Research Thread IV) Determinstic way to find primes

Filed under: finding primes,research — Terence Tao @ 1:43 am

This post will be somewhat abridged due to my traveling schedule.

The previous research thread for the “finding primes” project is now getting quite full, so I am opening up a fresh thread to continue the project.

Currently we are up against the “square root barrier”: the fastest time we know of to find a k-digit prime is about \sqrt{10^k} (up to \exp(o(k)) factors), even in the presence of a factoring oracle (though, thanks to a method of Odlyzko, we no longer need the Riemann hypothesis).  We also have a “generic prime” razor that has eliminated (or severely limited) a number of potential approaches.

One promising approach, though, proceeds by transforming the “finding primes” problem into a “counting primes” problem.  If we can compute prime counting function \pi(x) in substantially less than \sqrt{x} time, then we have beaten the square root barrier.

Currently we have a way to compute the parity (least significant bit) of \pi(x) in time x^{1/2+o(1)}, and there is hope to improve this (especially given the progress on the toy problem of counting square-frees less than x).  There are some variants that also look promising, for instance to work in polynomial extensions of finite fields (in the spirit of the AKS algorithm) and to look at residues of \pi(x) in other moduli, e.g. \pi(x) mod 3, though currently we can’t break the x^{2/3+o(1)} barrier for that particular problem.


  1. I worked out a way to leverage a power savings improvement (over the hyperbola method) in computing sums of divisors, to break the square-root barrier for computing the parity of \pi(x), assuming of course I haven’t made any errors. I have written a two page note on it, which can be found by going to my homepage at

    Click to access squarerootbarrier.pdf

    (I wrote it in terms of the way I first explained how to compute parity of \pi(x), not Terry’s cleaner version — I am sure the algorithm can be rephrased in terms of the cleaner approach.)

    As to breaking to the square-root barrier for computing divisor sums, I think one can just use some linear approximations to curves y = N/x, along with Pick’s Theorem to count lattice points in polygonal regions — if so, this would give a much more elegant approach than I wrote on before (the `crude method’). There are some issues with the approximating lines not having endpoints that are lattice points, and issues with points between the approximating lines and the curve, but I think these can perhaps be dealt with. I may write more on this in a day or two.

    Comment by Ernie Croot — August 28, 2009 @ 1:55 pm | Reply

  2. [[ Parity may be all we need! ]]

    It occurred to me just now that computing the parity of \pi(x) is almost sufficient to solve the “finding primes” problem by itself, without need for further bits of \pi(x).

    Indeed, suppose we have a parity oracle, and have somehow found an interval [a,b] in which \pi(a) and \pi(b) have different parity. Then for any c between a and b, the parity of \pi(c) must differ from either that of \pi(a) or that of \pi(b). So with one query to the parity oracle, we can pass to an interval of half the size in which the parities of the endpoints differ. Iterating this a logarithmic number of times, we end up at a prime.

    So, with a parity oracle, all we need to do to win is to find two k-digit numbers a,b in which the parities of \pi(a), \pi(b) differ. I’m stuck on how one is supposed to do precisely that deterministically (it’s trivial probabilistically, of course), but it does seem to be a reduction in difficulty in the problem…

    Comment by Terence Tao — August 29, 2009 @ 5:01 am | Reply

    • “(it’s trivial probabilistically, of course),”

      Comment by Gil Kalai — August 29, 2009 @ 9:48 am | Reply

      • Huh. I thought it was obvious that the parity of pi(x) was equidistributed (i.e. approximately half of all numbers have an even number of primes less than them) but now that I think about it, I can’t get this without some conjectures on prime gaps on the level of GUE. I suppose the primes could conceivably come in pairs with small separation between them, so that one parity class overwhelms the other in size, though clearly this is not what actually happens.

        Anyway, the only way things can go wrong with a parity oracle is of all the constructible k-digit numbers in the world have the same parity of pi(x), which really is bizarre but I can’t see how to rule it out yet.

        Comment by Terence Tao — August 29, 2009 @ 9:52 am | Reply

      • Actually, now that I think about it, for every prime p, p and p-1 have opposite pi-parity, so the probability of hitting a given pi-parity has to be at least 1/log x by the prime number theorem, so one can get two different parities in polynomial time probabilistically (but this is not so different from the probabilistic algorithm to find a prime in the first place).

        Comment by Terence Tao — August 29, 2009 @ 9:53 pm | Reply

        • That’s a nice idea, finding two numbers where the parity of \pi(x) differs (though, I knew about it, and it was sort of implicit in one of the postings I wrote earlier, that if the parity of \pi(x) on some long interval [a,b] is odd, as provided by an oracle, then we are done). The same idea works for the F_2[x] approach — just find a long interval, say, [a,2a], such that the the generating function for the primes in this interval is not divisible by a small-degree polynomial — then, some binary searching can be used to do the rest (you can just cut the interval in half, and mod the two generating functions out by the small degree polynomial, and one of these congruences must be non-zero, then iterate).

          The F_2[x] approach is looking a lot more promising to me, especially since I think that the methods of breaking the square-root barrier for finding the parity of pi(x) can be transferred to the polynomial ring setting (I will need to look at Harald’s posting below carefully to see if this is the case). Just the other day, an idea occurred to me, and though it was wrong, it had an insight that I think might lead somewhere. The idea was this: not only can we compute the prime generating function f(x) mod (2,g(x))g(x) has low degree — very quickly, but we can likewise compute f(x^q) mod (2,g(x)) very quickly. THe fleeting thought that passed through my mind was that if both of these turned out to be 0 mod (2,g(x)), then g(x^q) divides f(x), which is impossible for large q. Although I realized half a second later that this is nonsense, it led me to the following: not only can be compute f(x) mod (2, g(x)) quickly, but we can compute any number of compositions f(f_1(f_2...(x)))) (mod 2, g(x)) quickly, at least if the f_i(x) have very few terms (i.e. they are lacunary). It seems to me that one can somehow utilize this (as I had attempted with the `nonsense idea’), choosing the polynomials g(x) to mod out by, to gain control over an exponential explosion of roots of f(x); if so, then I can imagine that this can be used to obtain a fast algorithm (breaking the square-root barrier) to locate primes. I’ll think about it…

          I once wrote a note to Michael Filaseta (or maybe I just told it to him? can’t remember) that may have some useful ideas — I’ll have to dig it up from my computer account if it exists. As I recall, he had an algorithm to quickly determine whether certain lacunary polynomials are irreducible. Upon telling me about his method, I found a completely different approach, that worked by seeing how the ideal (2) splits in finite extensions of Z (and I have totally forgotten how the algorithm worked). By Kummer’s prime ideal factorization theorem, this is related to seeing how polynomials split in F_2[x] — so, there may be some ideas in that note that are relevant to our problem. I’ll keep you posted if anything develops.

          Comment by Ernie Croot — August 30, 2009 @ 3:07 am | Reply

  3. Regarding our very basic problem let me point out that there are two variants:

    1) (effective) find a constant A and a deterministic algorithm that find a k-digit prime wose running time is bounded above by A k^{10}.

    2) (non effective) Prove that there is a deterministic time algorithm whose running time is bounded above by A k^{10} for some A.

    As far as I can see Cramer’s conjecture gives only a non effective algorithm (version 2).

    Comment by Gil Kalai — August 29, 2009 @ 9:24 am | Reply

    • Does the distinction between effective and non effective version of the problem make sense? What can we expect from complexity theoretic assumptions of hardness? (Or does the distinction extend to them.)What is the status of the existing (exponential) deterministic algorithms.

      Comment by Gil Kalai — August 31, 2009 @ 6:16 am | Reply

  4. It seems to me that I can compute \sum_{n<=N} \tau(n) in time O(N^{1/3} (\log N)^3). The argument is similar to elementary proofs (such as the one waved at in the exercises to Chapter 3 of Vinogradov's Elements of Number Theory) of the fact that the number of points under the hyperbola equals (main term) + O(N^{1/3} (\log N)^2). Thanks to Ernie Croot for the inspiration – all mistakes below are my own.

    What we must do is compute \sum_{n <= N^{ 1/2 } } \lfloor N/n \rfloor in time O(N^{1/3} (\log N)^3). Assume that N/x^2  =  a/q + \eta/q^2, \hbox{gcd}(a,q)=1, \eta<=1/\sqrt{5}. Assume furthermore that
    q<=Q, where Q=N^{\theta-1/3}/10. Then the sum

    \sum_{x<=n<x+q} \{N/n\}

    can be computed in time O(\log x) with an error term < 1/2.

    Before I prove this lemma, let me show why it is enough for attaining our goal (namely, computing \sum_{n\leq \sqrt{N}} \lbrack N/n\rbrack with *no* error term). We know that

    \sum_{x<=n<=x+q} \lbrack N/n \rbrack = \sum_{x<=n<x+q} N/n - \sum_{x<=n<x+q} \{N/n\} = N*(\log(x+q)-\log(x)) - \sum_{x<=n<x+q} \{N/n\}.

    We also know that \sum_{x<=n<=x+q} \lbrack N/n \rbrack is an integer. Thus,
    it is enough to compute \sum_{x<=n<x+q} \{N/n\} with an error term <1/2
    in order to compute \sum_{x<=n<x+q} \lbrack N/n\rbrack exactly.

    We now partition the range n=\{N^\theta,...,2 N^\theta\}, 1/3<=\theta<=1/2, into intervals of the form x<=n<x+q, where q is the denominator of a good approximation to N/x^2, that is to say, an approximation of the form a/q, \hbox{gcd}(a,q)=1, q<=Q with an error term <= 1/\sqrt{5} q^2. Such good approximations are provided to us by Hurwitz's approximation theorem. Moreover, it shouldn't be hard to show that, as x varies, the q's will be fairly evenly distributed in [1,Q]. (Since Hurwitz's approximation is either one of the ends of the interval containing N/x^2 in the Farey series with upper bound Q/2 or the new Farey fraction produced within that interval, it is enough to show that Dirichlet's more familiar approximations have fairly evenly distributed denominators.) This means that 1/q should be about (\log Q)/Q in the average.

    Thus, the number of intervals of the form x<=n<x+q into which \{N^\theta,\ldots ,2 N^\theta\} has been partitioned should be about (\log Q) N / Q. Since the contribution of each interval to the sum \sum_{N^\theta<=n<=2N^\theta} \lfloor N/n\rfloor can (by Lemma 1 and the paragraph after its statement) be computed exactly in time O(\log x), we can compute the entire sum in \sum_{N^\theta n<= 2 N^\theta} \lfloor N/n\rfloor in time O((\log x) (\log Q) N^\theta/Q) = O((\log N)^2 N^{1/3}).
    (There are bits of the sum (at the end and the beginning) that belong to two truncated intervals, but those can be computed in time O(Q) \ll O(N^{1/6}).)

    We partition \{1,2,\ldots ,\sqrt{N}\} into O(\log N) intervals of the form \{N^\theta,\ldots ,2 N^\theta\}, and obtain a total running time of O((\log N)^3 N^{1/3}),
    as claimed.

    The proof of Lemma 1 is in my next post.

    Comment by H A Helfgott — August 29, 2009 @ 10:29 pm | Reply

    • Harald, that’s fantastic! I will look at it very closely. It will certainly save me a lot of work trying to figure out how to use Pick’s theorem in just the right way, and in any case it N^{1/3} or thereabouts is about the limit of how good I thought that my methods could possibly work.

      Comment by Ernie Croot — August 30, 2009 @ 3:12 am | Reply

  5. Proof of Lemma 1.- We can write

    N/n = N/x - N/x^2 t + \eta_2 t^2 / N^{3\theta-1}     = N/x - (a/q + \eta/q^2) t + \eta_2 t^2 / N^{3\theta-1},

    where n = x + t, 0<=t<q an integer, |\eta|<=1/\sqrt{5} and |\eta_2|<=1fs independent of n. Since q<=Q and Q=N^{\theta-1/3}/10, we have
    \eta t^2 / N^{3\theta-1} < 1/1000q. We also have \eta/q^2 t <= 1/\sqrt{5} q.
    Thus, \eta t^2 / N^{3\theta-1} + \eta/q^2 t < 1/2 q. It follows that

    |\{N/n\} - \{N/x - at/q\}| < 1/(2 q)

    except when \{N/x-at/q\}  1 - 1/(2 q). That exception
    can happen for only one value of t=0 \ldots q-1 (namely, when at is congruent mod q to
    the integer closest to \{N/x\} q) and we can easily find that t (and isolate it
    and compute its term exactly) in time O(\log n) by taking the inverse of a \mod q.

    Thus, we get the sum \sum_{x<=n<x+q} \{N/n\} in time O(\log n) with an error term less than (1/2 q) * q = 1/2 once we know the sum

    \sum_{0<=t<q} \{N/x - at/q\}

    exactly. But this sum is equal to \sum_{0<=r<q} \{r/q + \epsilon/q\}, where
    \epsilon = \{qN/x\}, and that sum is simply (q-1)/2 + \epsilon. Thus, we have
    computed the sum \sum_{x<=n<x+q} \{N/n\} in time O(\log n) QED.

    Comment by H A Helfgott — August 29, 2009 @ 10:49 pm | Reply

  6. Obvious, unrelated and fairly empty remark: we do not really need to know pi(x) mod q – we need is a fast algorithm for determining whether q|(\pi(x+t)-\pi(x)) (a yes or no question, rather than a value mod q, though it really amounts to much the same). Since primes between x and 2x do exist, there is a q between 2 and log(x) that does not divide \pi(2x)-\pi(x). We fix that q, and start to refine the interval \pi(2x)-\pi(x)
    in a binary fashion as Terry described.

    Comment by H A Helfgott — August 30, 2009 @ 1:53 am | Reply

  7. According to Titchmarsh’s Theory of the Zeta Function, in the same paper that Vornoi obtains the x^{1/3+\epsilon} error term for \sum_{x\leq n} \tau_{2}(x) he obtains a x^{1/2+\epsilon} error term for \sum_{x\leq n} \tau_{3}(x). There is actually a proof of this in Titchmarsh’s book (chapter 12), but the argument there is analytic and not combinatorial like Helfgott’s.

    I haven’t taken a close look, but Vornoi’s 1903 (French) article is online at: .

    Of course, the hope is that this could be modified to improve our current method for computing \sum_{x\leq n} \tau_{3}(x), which runs in O(x^{2/3}) steps.

    Comment by Mark Lewko — August 31, 2009 @ 7:32 am | Reply

  8. The link to Voronoi’s article seems to be broken; can you fix it?

    Comment by H A Helfgott — August 31, 2009 @ 5:42 pm | Reply

  9. Thanks!

    Comment by H A Helfgott — August 31, 2009 @ 7:18 pm | Reply

  10. […] the Polymath blog. But please be polite: read some background first, and take a look at some of the research threads to get a feel for how things work, and what’s already […]

    Pingback by Michael Nielsen » Finding Primes: A Fun Subproblem — September 1, 2009 @ 9:12 pm | Reply

  11. I just typed up a short note this morning with an idea for how to push the F_2[x] idea forward (I did’t have to teach this morning because of a university convocation). I decided to write it up as a latex/pdf file, which can be accessed by going to

    Click to access f2_idea.pdf

    Comment by Ernie Croot — September 3, 2009 @ 3:31 pm | Reply

    • Looks promising! Unfortunately I am swamped with my ongoing lecture series but I do like the idea of extending our current parity algorithm to a polynomial ring; polynomialisation worked for AKS, so we can hope it does so for our problem also…

      Comment by Terence Tao — September 4, 2009 @ 1:53 pm | Reply

      • It looks like it took a lot of time to put together those slides for your talks. They are very nicely done. I especially liked the Cosmic Distance Ladder slides.

        I found a way to rewrite the generating function for the primes in F_2[x] in terms of a sum of a small number of polynomials to high powers. That should make it much easier to attack my goal 1 in the note above, and probably also goals 2 and 3 (actually, thinking a little bit ahead, it seems to me that if goal 2 could be solved, quite likely goal 3 can be as well — the reasons are a little complicated, and maybe I will write about it later).

        The new idea I had is based on the following observation: in F_2[x] we have that

        (x+1)^{2^k-1} = 1 + x + x^2 + \cdots + x^{2^k-1}.

        So, for example, we can write

        1 + x + \cdots + x^{23} = (x+1)^{15} + x^{16} (x+1)^7.

        In general, a geometric series 1 + x + \cdots + x^n can be written in this way using at most \log_2(n+1) terms — the number of terms is the number of 1’s in the base-2 expansion of n+1. So, replacing the (x^{ad}-1)/(x^d-1) by the appropriate sums of these $x^c (1+x)^{2^k-1}$’s will only increase the number of terms in our formula for the generating function by a factor O(\log n) — quite small. And, of course, it gives us the flexibility of working with sparse polynomials. I haven’t tried yet to see if this makes Goal 1 tractable (perhaps now some kind of Vandermonde identity will polish it off), but it sure looks a lot easier to me now!

        Comment by Ernie Croot — September 5, 2009 @ 2:24 pm | Reply

      • And here is a further idea that builds on the new new way to write the generating function (that uses the (x+1)^{2^k-1} identity), though I haven’t had time yet to try it out (maybe at the end of the week): we have reduced the generating function to a sparse sum of terms like x^c (x^d + 1)^{2^k-1}, and we would like to show that as we successively replace x with x^2, x^3, ..., x^N, N = n^{1/2+o(1)}, say, that we must have at least one of these sums is non-zero \pmod{2,g(x)}. If we just had sums of powers of x by themselves, or just sums of (x^d-1)^{2^k-1} by themselves, then probably it would be much easier to show that one of these sums was non-zero. Well, it turns out that there is a way to combine together both of these types of sums in a nice way, and perhaps this will lead to the non-vanishing result we seek.

        The idea is to use matrix tensors: first, suppose that A is the matrix whose jth row is the vector (x^{j c_i})_{i=1,...,t} (where t \sim N or so), where the c_i are the exponents appearing in the terms x^{c_i} (x^d + 1)^{2^k-1} that we have; and let B be the matrix whose jth row is the vector ((x^{jd_i}+1)^{2^{k_i}-1})_{i=1,...,t}. Of course as it stands, these matrices will have 0 determinant, because some columns will be duplicated — never mind that, I know how to fix that, I think. Now, if we take the tensor A \bigotimes B, within this t^2 \times t^2 matrix will be a sub-matrix consisting of the terms x^{j c_i} (x^{jd_i}+1)^{2^{k_i}-1}. There is some hope that this can be used to show that if the tensor is non-singular, then not all of our sums of x^{jc} (x^{jd}+1)^{2^k-1} vanish \pmod{2,g(x)}, especially since there are lots of other sub-matrices within the tensor that contain usable information about our generating function for the primes (and perhaps there is a way to relate the rank of sub-matrices of the tensor to ranks of sub-matrices of A and B — I would need to brush up on my multilinear algebra).

        Let’s assume all this can be made to work somehow. Then how do we show that the tensor is non-singular? Well, that is where the beautiful fact that the determinant of the tensor of A with B is a power of the determinant of A times a power of the determinant of B. In other words, if A and B are non-singular, then so is the tensor. Now, A is a Vandermonde, so is well-understood, and B looks a lot simpler to analyze than the matrix having terms x^c (x^d+1)^{2^k-1} that corresponds to our prime generating function.

        Well, all that’s just a thought… I’ll have to see if it can be made into something usable.


        On another front, it occurred to me how we might approach the problem of solving goal 2, of rapidly computing the generating function in F_2[x]/(g(x)) at several different powers of x: what we could do is temporarily ascend out of the F_2[x] context, and work in Z[x]/(g(x)) instead (assume g(x) is monic). The advantage to doing this is that there is some hope of using Fourier analysis in the complex numbers, which gets around the problem of our finite field not having just the right kinds of roots of unity to apply discrete FFTs. Thinking ahead, I see that going to Z[x] will have its own problems, however, due to the fact that when we mod out by g(x), unless g(x) is a cyclotomic polynomial (or at least a monic polynomial with really low Mahler measure) the coefficients of the residue will be enormous, requiring very high precision to compute accurately enough for this to be viable. There may be some ways around this, though.

        There are two other possibilities for attacking goal 2 that I know about: first, I think perhaps one can express it in terms of a matrix multiplication problem, though maybe I am mistaken — I haven’t had time to think about it. If this is the case, then there is Strassen’s algorithm, and various improvements that might work just well enough to give us a quick algorithm to locate primes (quicker than Odlyzko’s). And the other approach is one I remembered seeing in Knuth’s book “Semi-numerical Algorithms” (in my opinion, the best math and CS books that exists!) ages ago, in the section on fast multiplication of integers. As I recall, Knuth presents the standard FFT algorithm for this, but then also presents a simple recursive algorithm that requires no Fourier series at all, yet achieves a running time almost as good as what FFTs can deliver. Perhaps there are some ideas in that algorithm that can be transferred to our problem.

        Comment by Ernie Croot — September 8, 2009 @ 2:26 am | Reply

        • Actually, I guess that one can easily force large submatrices of a tensor to be singular (and one doesn’t need to use tensor matrices, but can just use tensor products — I don’t know why I didn’t think of that first). Still, I wonder whether the non-vanishing problem can be made into essentially a simple linear algebra problem somehow, that uses as input such data as the fact that Vandermondes are non-singular.

          Comment by Ernie Croot — September 8, 2009 @ 2:10 pm | Reply

    • Here I will discuss a new way that FFTs can be used to vastly speed up our algorithm for locating primes using the F_2[x] prime generating functions: basically, we begin by applying Odlyzko’s algorithm to locate, in time x^{0.49}, an interval

      I\ :=\ [z, z+n^{0.51}]\  \subseteq\ [n,2n],

      that has at least one prime number. Recall that the F_2[x] approach requires that we compute sums like

      f(X)\ :=\ \sum_{d \leq \sqrt{n}} {x^{d a_d} - 1 \over x^d - 1}

      at the values

      X\ =\ x,x^2,x^3,...,x^N,\ N \sim n^{1/2}.

      The trivial algorithm to do this takes time n \sim N^2 or so. Assuming that we can perform FFT’s quickly using polynomials in F_2[x]/(g(x)), where g is irreducible and of degree (\log n)^{O(1)}, it is easy to see how to speed this process up; and, in fact, we should get an algorithm that runs in time n^{0.51}.

      Well, it turns out that there is a somewhat less obvious way to use FFT’s to speed up the process, which seems more portable (to other variations of our prime search algorithm) than the easy-to-see application of FFTs, and which is the point of this posting: suppose that d \leq \sqrt{n} has the property that de is the largest multiple of d that lands in our interval I, and suppose that d is near to \sqrt{z}, but just slightly smaller. Then, we know that

      (d-1)(e+1),\ (d-2)(e+2),\ ...,\ (d-k)(e+k),

      all belong to I as well, at least for k  0 \atop a^2 + b^2 \leq n} X^{a^2 + b^2}

      to deal with. Well, these are quite similar to

      \left ( \sum_{a \leq \sqrt{n}} X^{a^2} \right )^2;

      and so, if the inner sum here can be evaluated at X=x much quicker than by the trivial algorithm, then it gives us a way to speed up the F_2[x] algorithm.

      There is yet another way that we can use this idea to speed up the search for primes, which involves passing to arithmetic progressions (perhaps related to one of GIl’s suggestions), and which I will write on in the next posting.

      Comment by Ernie Croot — September 12, 2009 @ 10:08 pm | Reply

      • I see there was a “formula does not parse”. Here it is again, rewritten slightly:

        Here I will discuss a new way that FFTs can be used to vastly speed up our algorithm for locating primes using the F_2[x] prime generating functions: basically, we begin by applying Odlyzko’s algorithm to locate, in time x^{0.49}, an interval

        I\ :=\ [z, z+n^{0.51}]\  \subseteq\ [n,2n],

        that has at least one prime number. Recall that the F_2[x] approach requires that we compute sums like

        f(X)\ :=\ \sum_{d \leq \sqrt{n}} {x^{d a_d} - 1 \over x^d - 1}

        at the values

        X\ =\ x,x^2,x^3,...,x^N,\ N \sim n^{1/2}.

        The trivial algorithm to do this takes time n \sim N^2 or so. Assuming that we can perform FFT’s quickly using polynomials in F_2[x]/(g(x)), where g is irreducible and of degree (\log n)^{O(1)}, it is easy to see how to speed this process up; and, in fact, we should get an algorithm that runs in time n^{0.51}.

        Well, it turns out that there is a somewhat less obvious way to use FFT’s to speed up the process, which seems more portable (to other variations of our prime search algorithm) than the easy-to-see application of FFTs, and which is the point of this posting: suppose that d \leq \sqrt{n} has the property that de is the largest multiple of d that lands in our interval I, and suppose that d is near to \sqrt{z}, but just slightly smaller. Then, we know that

        (d-1)(e+1),\ (d-2)(e+2),\ ...,\ (d-k)(e+k),

        all belong to I as well, at least for k smaller than n^{1/4}, say. If d is somewhat smaller than \sqrt{z}, then there will be a similar idea that one can use, and for the purposes of this note we will not discuss these other, slightly more complicated cases.

        Let us look at the generating function for these new numbers that land in our interval I. It is given as

        X^{de} + X^{(d-1)(e+1)} + X^{(d-2)(e+2)} + \cdots + X^{(d-k)(e+k)};

        and unfortunately, the exponent here is quadratic in k, making it difficult to evaluate quickly. But there is actually a way to do it using FFTs, which will allow us to quickly evaluate it at even just one value for X, which is rather surprising!

        Here is the idea: let’s simplify matters a little, and suppose the generating function we wish to evaluate instead is

        h_1(X)\ :=\ 1 + X + X^4 + X^9 + \cdots + X^{(k-1)^2}.

        Let D = \sqrt{k}, where we assume k is a square, and let

        h_0(Y)\ :=\ 1 + x Y + x^4 Y^2 + x^9 Y^3 + \cdots + x^{(D-1)^2} Y^D\ \in\ F_2[x]/(g(x)).

        Note that we treat these x^{j^2} as coefficients, and of course Y as the variable. Then, it is easy to check that

        h_1(x)\ =\ h_0(1) + x^{D^2} h_0(x^{2D}) + x^{4D^2} h_0(x^{4D}) + \cdots + x^{ (D-1)^2 D^2} h_0(x^{2(D-1)D})

        So, if we can evaluate

        h_0(1),\ h_0(x^{2D}),\ ...,\ h_0(x^{2(D-1)D})\ \in\ F_2[x]/(g(x))

        quickly, then we can find h_1(x) quickly. And, evaluating h_0 at these powers of x is exactly what FFTs are used for!

        What’s more is that perhaps one can somehow use this to quickly compute the generating function for the primes that are 1 mod 4. The idea is that we will have generating functions like

        \sum_{a,b > 0 \atop a^2 + b^2 \leq n} X^{a^2 + b^2}

        to deal with. Well, these are quite similar to

        \left ( \sum_{a \leq \sqrt{n}} X^{a^2} \right )^2;

        and so, if the inner sum here can be evaluated at X=x much quicker than by the trivial algorithm, then it gives us a way to speed up the F_2[x] algorithm.

        There is yet another way that we can use this idea to speed up the search for primes, which involves passing to arithmetic progressions (perhaps related to one of GIl’s suggestions), and which I will write on in the next posting.

        Comment by Ernie Croot — September 12, 2009 @ 10:12 pm | Reply

    • In case we don’t have FFTs available in the F_2[x]/(g(x)) context, I have figured out how to use Strassen’s algorithm as a substitute. I think I wrote about this before, but it took me a little thinking to see how to make it work. I have typed up a short 2-page note on how to use Strassen to evaluate the sum

      \sum_{0 \leq d \leq k-1} x^{d^2} \pmod{2,g(x)}

      substantially faster than time k (recall that FFTs allow us to do this in time k^{1/2} or so; Strassen is somewhat worse, running in time about k^{1-c} for some 0 < c < 1/2, but still it is much better than time k.)

      This note can be found by going here:

      Click to access fast_strassen.pdf

      I believe this means that if we combine this result with the ideas I wrote about before, we can evaluate the prime generating function for an interval [z,z+z^{0.51}] in time substantially faster than z^{1/2}. In other words, we have now rigorously crossed the square-root barrier in evaluating the generating function!

      Furthermore, in light of these recent improvements, we now know that the prime generating function has somewhat lower “arithmetic complexity'' that it first had seemed. This *should* mean that we no longer need to evaluate it at

      x, x^2, ..., x^{N},\ N \sim n^{1/2}

      in order to achieve the non-vanishing we need in order to locate primes. In fact, my preliminary calculations, based on a certain conjecture of mine (that I have not written about), implies that

      x, x^2, ..., x^{M},\ M \sim n^{7/16}

      should suffice. Of course, this will involve solving a stronger version of my Goal 1 in a previous note. I have some ideas on this, and will hammer away at them… hopefully I will have something to report next week.

      Comment by Ernie Croot — September 14, 2009 @ 12:28 am | Reply

  12. Regarding the parity of pi(n) I wonder what is known about the following simple (non algorithmic) questions:

    1) Consider an arithmetic progression an+b (think about a being a large integer).

    We can ask what is the smallest n before we witness two members in the progression with opposite parity of pi.
    The thought that the parity of pi(n) will behave rather randomly for far apart integers suggests that that n=K log(a) log log (a) or something like that would suffice (given a for all b’s).

    When n is exponential in k then the gaps between terms of the arithmetic progression will become so much smaller than the gaps between primes so we should be able to find two terms with opposite signs of pi. (Actually I am not sure that even this heuristic argument can be turned to a proof. Even if it does there is a huge gap between log k and exp (k).)

    2) A slightly more general question is: given a polynomial an^2 +bn +c are there two terms with opposite signs of pi? Here beside the feeling that the sign of pi will behave like a random string of +1 -1 I do not see any potential argument why the sign of pi could not be the same for all terms. Is it known?

    BTW could it be easier to compute pi(n) (mod 4) rather then mod 3? (Maybe this is related to Ernie’s approach.)

    Comment by Gil — September 3, 2009 @ 6:46 pm | Reply

  13. A couple of very qualitative observations.

    The bad case on parity is where the primes in the initially chosen range (a,b) come in pairs which are ‘close together’ so that most of the integers in (a,b) come between prime pairs. So it may be worth looking at the distribution of ‘isolated primes’, or trying to identify where these may be found.

    Another possible strategy to find a parity change in the first place would be to choose an interval (a,b) and extend it by not too much to find a parity change, rather than by working to reduce the length of the interval.

    Comment by Mark Bennet — September 3, 2009 @ 10:01 pm | Reply

  14. If you tried to find a k digit prime, then a quantum computer would be able to do it in 2^c(k^1/2), c is a constant. I think. The exponent in k would become one in square root of k. In the wiki the question is asked if p = bqp how does that affect the problem. Might we get a similiar reduction from an exponent in k to an exponent in the square root of the square root of k? If p=bqp then is there some way to simulate a quantum turing machine with a turing machine? How does the time it takes to do a problem change when going from the quatum computer to its simulation?

    Comment by Kristal Cantwell — September 6, 2009 @ 3:49 am | Reply

    • I think a quantum computer can produce random numbers. If it could produce a k digit random number x in polynomial(k) time then we could test the numbers around this number for primes if the expected prime free gap is k and the deviation is k, which I think happens in a poisson process the testing say 100k numbers around x ought to give a probability of less than 1/3 and since the testing time is a polynomial function of k the entire process should take place in polynomial time and that would put this problem in bqp and also make the problem polynomial if p = bqp.

      Comment by Kristal Cantwell — October 2, 2009 @ 4:18 am | Reply

      • The above doesn’t work. The distribution is not known to be poisson. However for the above method to fail I think at least 1/3 of the numbers in the interval must be in clumps of 100k or more. This is a lot of clumping. Could it somehow interfere with what is known about the distribution of prime numbers. If so the above method could possibly be made to work.

        Comment by Kristal Cantwell — October 3, 2009 @ 2:28 am | Reply

        • I think I can fix the problem noted above. Here it is: If a quantum computer could produce a k digit random number x in polynomial(k) time then we could test x to see if it is prime. It has probability of 1/k of being prime. If we repeat this process 2k times the probability of not finding a prime is e^-2 which is less than 1/3. Since the testing time is a polynomial function of k the entire process should take place in polynomial time and that would put this problem in bqp and also make the problem polynomial if p = bqp.

          Comment by Kristal Cantwell — October 3, 2009 @ 3:44 pm | Reply

          • In regards to the above. I think everything except possibly the random number generator in the quantum computer will work. That is because there is a discussion on the wiki about finding primes with O(k) random bits:


            So it looks if we have the random number generator we get what we want. There are also various random number generators available using various physical phenomena. The algorithm could be implemented with them but there is a problem that they may not be provably random.

            Comment by Kristal Cantwell — October 4, 2009 @ 6:03 pm | Reply

            • So from the above we need a random number generator for a quantum computer
              to get this problem in BQP. I have a reference for one in the paragraph below

              In the paper “Quantum theory, the Church-Turing principle and the universal quantum computer” by David Deutsch
              in section 3, “Properties of the universal computer” after the first paragraph there is a subsection, “Random numbers and discrete stochastic systems” a program for a quantum computer generating true random numbers is given. The paper is available at:

              Click to access quantum_theory.pdf

              Comment by Kristal Cantwell — October 5, 2009 @ 3:16 pm | Reply

  15. I’ve tried to write up the overview of Ernie’s “polynomial strategy” (Thread 12) at

    But I was not able to understand Ernie’s point about how being able to quickly sum quadratic sums such as \sum_{a \leq \sqrt{n}} x^{a^2} helps us sum the prime counting function \sum_{p \leq n} x^p \approx \sum_{d \leq \sqrt{n}} (x^{da_d}-1)/(x^d-1) any faster… could this point be elaborated?

    The squares, by the way, are the only numbers with an odd divisor function, so one can express \sum_{a \leq \sqrt{n}} x^{a^2} as \sum_{ab \leq n} x^{ab} modulo 2. Not sure if this is helpful for anything…

    Comment by Terence Tao — September 19, 2009 @ 5:44 am | Reply

    • It will take me a little time to write it up carefully, but let me just try to say right now what the idea is, in more detail: Basically, by first applying Odlyzko’s algorithm to restrict to an interval [z,z+z^{0.51}] (or maybe the z^{0.51} needs to be z^{0.501} to get a gain), what it buys you is the fact that the contribution to the prime generating function for this interval of the divisors d near to \sqrt{z} is small — that’s just because for any d near \sqrt{z} the geometric series

      z^{da} + z^{d(a+1)} + \cdots + z^{db},\ a = \lceil z/d \rfloor,\ b = \lfloor (z+z^{0.51})/d \rfloor,

      has only about z^{0.1} terms. So, rather than working out the contributions of the generating functions for these large d one-by-one, one gets a savings by working with several at once.

      Now let us suppose that d = \lfloor \sqrt{z} \rfloor, and let

      a = \lceil z/d \rceil,\ b = \lfloor (z+z^{0.51})/d \rfloor.

      A fragment of our generating function for the primes is the following sum

      \sum_{\sqrt{z} - z^{1/4} < e \leq \sqrt{z}} \sum_{z \leq ef \leq z + z^{0.51}} X^{ef}.\ \ \ (1)

      Now, as we said, each of these inner sums contain about z^{0.1} terms. And, what we will do is rewrite it as a sum of about z^{0.1} sums, where each sum itself has about z^{1/4} terms. *One* of these z^{0.1} sums will be

      \sum_{0 \leq j \leq z^{1/4}} X^{(d-j)(a + j)}.\ \ \ (2)

      Note that the exponent here is quadratic in j, though it certainly is *not* the toy sum \sum_{j \leq z^{1/4}} X^{j^2}. And, the Strassen (or perhaps FFT's used in the right way) algorithm can evaluate it at X=x substantially faster than z^{1/4} computations.

      Now, this sum (2) has z^{1/4} terms; and so, if it, and the other z^{0.1} sums that make up (1), can be evaluated using fewer than z^{1/4-0.1} = z^{0.15} compuatations, then we have beaten in the “trivial'' (sum over the (x^{da} - x^{db})/(x^d - 1)) way to evaluate (1) above. Ok, this seems a little too much to ask for (even given our Strassen algorithm); but, replacing the 0.1 by 0.01 or 0.001, it surely is true.

      When one works with d's that are somewhat smaller than z - z^{1/4}, one needs to work instead with d's that are in certain arithmetic progressions (instead of d, d-1, d-2, ... one would consider d - r, d-2r, d-3r,...); that is, for these smaller d, by working with arithmetic progressions one can get the d's to dovetail, so that the same d isn't considered twice — I know that sounds too vague… I'll try to write it up in the next day or two.

      Comment by Ernie Croot — September 20, 2009 @ 8:50 pm | Reply

      • Ok after this, I’m going to just write everything up in latex and post on my website, instead of having to deal with “formula does not parse” again. Let me try to write up the missing “formula does not parse”:

        X^{da} + X^{d(a+1)} + \cdots + X^{db},


        a = \lceil z/d \rceil,


        b = \lfloor (z+z^{0.51})/d \rfloor.

        [I fixed the bad LaTeX. Yes, the inability for commenters to edit their own comments is an annoying technical flaw in this format. It’s definitely a must-have for any future project of this type. -T.]

        Comment by Ernie Croot — September 20, 2009 @ 8:54 pm | Reply

  16. I have started writing up the algorithm to evaluate the prime generating function quickly in (F_2[x]/(g(x)))[X] at X=x. But since it may take me a day or two to complete it, and since some people who want to think about it might not want to wait that long, I thought I would list out the link to what I have written so far (3 pages) — it will take shape over the next few days, and should become a polished draft by Wednesday (Sept 23):

    Click to access fast_primes.pdf


    Also, I thought I would explain why I didn’t post anything yesterday, as I said I would: I spent a great deal of time trying to work through two very promising approaches to solving Goal 1 in the note I wrote (and in the note on the polymath wiki). It looked like I had solved the problem completely, but alas these ideas fell apart on closer inspection. Perhaps in a few days I will post a link to a note describing several different approaches I have had, that turned out not to work, in the event somebody can see how to push them through.

    Comment by Ernie Croot — September 21, 2009 @ 1:35 am | Reply

    • Great! I’ll try to have a look at it soon.

      It occurs to me that even if we don’t manage to formally prove Goal 1, we may be able to come up with a heuristic probabilistic argument or something (e.g. numerics) that would indicate that Goal 1 is true (with perhaps finitely many exceptions). If we then complete Goal 2 and Goal 3, we would then have a deterministic algorithm that works and reaches the square root barrier conditional on a concrete number-theoretic hypothesis, with a good chance of breaking that barrier. That’s already a reasonable result, though of course we would like to make the result unconditional if at all possible.

      Comment by Terence Tao — September 21, 2009 @ 4:20 am | Reply

    • The version of fast_primes.pdf up on my webpage right now basically has all the ideas in place. There may be some fussing about moving this or that constant around, and of course polishing it somewhat to make it more readable or getting the best-possible dependencies among various parameters, but all the key ideas are there. Unfortunately, I will not be able to complete the paper or polish it until much later in the week, as I have too many other things to do right now.

      Comment by Ernie Croot — September 22, 2009 @ 3:02 pm | Reply

    • OK, I read through the draft and I think I understand better what you are doing now… you are starting with the task of quickly summing multiple geometric series such as \sum_{d \leq \sqrt{n}} \sum_{j \leq \lfloor n/d\rfloor} x^{jd} modulo some stuff, and use the geometric series formula to reduce matters to computing stuff like \sum_{d \leq \sqrt{n}} x^{d \lfloor n/d \rfloor}. Then you use Harald’s rational approximation ideas to chop this sum into a bunch of quadratic sums \sum_{j \leq J} x^{aj^2+bj+c} of non-negligible length, so that any improvement in one’s ability to sum quadratic sums leads to an improvement in the original summation.

      So this seems to establish Goals 2, 3 as long as t is small, but we still need to work out Goal 1, is that right?

      Also, the original Odlyzko step to reduce to an interval of size n^{0.51} seems to not yet be used, though I can imagine it would be useful for Goal 1 by improving the sparsity of various key polynomials… would you agree?

      Comment by Terence Tao — September 23, 2009 @ 7:39 am | Reply

      • Exactly right (I guess it really is some polynomial analogue of Harald’s idea, though there are some extra subtleties), including the last point about using Odlyzko’s idea (one can still quickly compute the prime generating function, its just that it may have no terms at all). Curiously, if one doesn’t restrict to the interval of width n^{0.51} at the outset, then one cannot use this polynomial analogue of Harald’s idea (at least I don’t see how to do it so as to run faster than z^{1/2}) that I used to obtain a speedup.

        And yes, Goal 1 is the main task to complete right now. One idea for this is to find a way to compute the prime generating function for the interval [z,z+z^{0.51}], with the primes restricted to certain arithmetic progressions a \pmod{d}, where d is fixed and a varies. The point would be that (a) there are a lot fewer primes in each of these progressions than in the whole interval (there should be about z^{0.51}/d of them), meaning that one doesn’t have to evaluate the prime generating function at x,x^2,x^3,...,x^{z^{0.51}}, but instead only x,x^2,x^3,...,x^{z^{0.51}/d} (even just using the Vandermonde idea will show this); and (b) of course you don’t know in advance which of the progressions a \pmod{d} contains primes, but there might be a way to relate the generating function for one progression a \pmod{d} to all the others b \pmod{d} (or at least to some of the b‘s), thereby greatly speeding things up — I don’t have time to explain why I think this might be possible right now, but maybe I will write this up this weekend instead of finishing the writeup on the fast evaluation of the prime generating function, since I think anyone can see now how it works.

        Another idea for how to attack Goal 1 is to attempt to rewrite the matrix with i,jth entry equal to the geometric series (x^{ij a_i} - 1)/(x^i - 1), that we wish to show doesn’t have the vector (1,1,1,...,1) in its kernel. I spent some time trying to look at variants of this matrix that still would be useful to our problem (such as permuting entries on each row, which doesn’t affect the problem of whether (1,1,...,1) is in its kernel), and I was inspired by some identities such as the fact that the 2 variable Wronskian factors nicely into a product of two 1 variable Wronskians (as is used to show non-vanishing of certain partial derivatives in the usual proof of the Thue-Siegel-Dyson theorem). I spent a fair amount of time trying various other types of related identities, but couldn’t quite get what I needed. I suppose this idea is related to the way that I applied Strassen to represent polynomial evaluation at multiple points by a matrix product problem (involving Vandermondes or quasi-Vandermondes).

        Another way to try to show that the matrix doesn’t have (1,1,1,...,1) in its kernel, or has no kernel at all, is to think in terms of interpolation. For example, one way to show that Vandermondes don’t vanish — in a way quite different from the approach of just computing their determinant — is to show that one can construct a polynomial of a given degree d that takes on any values at all, at any given set of d+1 points, which can be done using, say, the Lagrange interpolation formula. Might there be some equivalent of the interpolation idea for our matrix of geometric series?

        Another idea is to use the fact that we have control of the polynomial g(x) — we can take it to be whatever we want, and so why not take it to be something that makes the task of evaluating the generating function at lot easier. I played around with this some. One that I tried was x^m + x + 1. My reasoning for thinking about this one was that when you mod a high power of x out by it, you get a product of a power of x and a power of (x+1); and then, you can use the binomial theorem to think about the coefficients of that power of (x+1) (as in the formula (x+1)^{2^j-1} = 1 + x + ... + x^{2^j - 1}).

        I will write up several more ideas I have had this weekend (including a way to look for collisions between two generating functions for subsets of primes in [z,z+z^{0.51}]) …

        Comment by Ernie Croot — September 23, 2009 @ 3:45 pm | Reply

        • OK, thanks for the clarifications about the role of Odlyzko’s method

          I quite like the first approach to Goal 1. It now seems that for any a, b in [x, x+x^{0.51}], the polynomial f(x) = \sum_{a \leq p \leq b} x^p \hbox{ mod } 2 has an arithmetic circuit complexity of O(x^{0.49+o(1)}), so in particular for any g(x) of degree at most x^{0.001} (say), f(x) \hbox{ mod } 2, g(x) can be computed in time O( x^{0.491+o(1)} ) (say). Setting g(x) to be the polynomial x^q-1, this allows us to compute the parity of the number of primes in [a,b] equal to a given residue class modulo q. We can do this for all q \leq x^{0.001} and still stay under the square root barrier in cost.

          Probabilistically, it seems extremely unlikely that one could ever find an interval [a,b] in [x,x+x^{0.51}] with a non-zero number of primes, such that the set of such primes intersects every residue class mod q for q \leq x^{0.001} in an even number of elements, but I don’t see currently how to rule out this possibility.

          What made the AKS argument work, by the way, was a multiplicativity property that greatly amplified the various algebraic identities they had; more precisely, if (X+a)^m = X^m + a and (X+a)^n = X^n + a, where X is a root of unity of order coprime to n,m, then also (X+a)^{nm} = X^{nm} + a. This allowed them to exponentiate a polylogarithmic number of algebraic identities into a superpolynomial number. We don’t seem to have discovered an analogous amplification trick here yet, but perhaps we will eventually.

          Comment by Terence Tao — September 23, 2009 @ 10:52 pm | Reply

  17. Here I will talk about the first idea from my last posting. Recall that the idea was to take the interval I := [z, z + z^{0.51}], and consider the generating function for the primes in it that are a \pmod{q}, for some q \geq 2. In what follows I will be thinking q as a small power of z, though hopefully in later version of the arguments it can be taken closer to \sqrt{z}.

    Essentially, I hope to show (a) how the work one does on computing the prime generating function for primes that are a \pmod{q} can be used to speed up the process for other progressions b \pmod{q}; and, (b) find a way to skip past certain progressions a \pmod{q} whose generating function f(X) vanishes at X=x mod (2,g(x)). Completing this second task would be especially important for producing an algorithm to locate primes that breaks the square-root barrier.

    Let us first note that without attempting these tasks, working with a \pmod{q} already gives us a speedup over the “trivial algorithm” used to evaluate the prime generating function at X=1,x,x^2,x^3,...,x^t, where t is the number of primes under consideration (\ll z^{0.51}/\log z in the case where we work with all the primes in $I$). The speedup comes from two directions: first, when one goes to compute the generating function, only at most about z^{0.51}/q or so of the divisors d \leq z^{1/2} contribute a term (x^{d a_d} - 1)/(x^d - 1). Basically, what happens is that quite often none of the products

    d\lfloor 1 + z/d \rfloor,\ d \lfloor 2 + z/d \rfloor,\ ...,\ d \lfloor (z + z^{0.51}/d \rfloor

    are a \pmod{q}, at least it is the case when d is sufficiently near to \sqrt{z}. And the second place where we get a speedup is the fact that if we work with the primes that are a \pmod{q}, we only need evaluate the generating function at X=1,x,x^2,...,x^{z^{0.51}/q}, and check for non-vanishing mod (2,g(x)). The reason is that even just using the “Vandermonde method”, since the generating function has at most about z^{0.51}/q terms, one only needs to check non-vanishing for that many powers of x.

    I had thought that I could solve task (b) this past weekend, but it didn’t seem to work out. Basically, what I tried to do was to show that there exists a polynomial h(X,Y), of not too large degree (say of degree z^{1/2} in each variable), such that for certain small values of j=0,1,2,...,t, when one evaluates h(x^j, x^{j^2}) \pmod{2,g(x)}, one gets the prime generating function for the primes in our interval that lie in a certain arithmetic progression a_j \pmod{q}. The hope then was to show that by solving for j making h(x^j,x^{j^2}) \not \equiv 0 \pmod{2,g(x)}, one could “skip ahead” to a progression a_j \pmod{q} containing loads of primes. I still think this method has a decent chance of succeeding, and I will try to take another pass at proving it in a few days, when I am less busy.


    On another front, I discovered that one can factor the matrix of geometric series (x^{i j a_j} - 1)/(x^j - 1) that comes up in our problem — it can be written as a product AB, where the matrix A is z^{1/2} \times z^{0.51} (or thereabouts), and where B is z^{0.51} \times z^{1/2}. A is basically a Vandermonde, whose ith row is

    (x^{im}, x^{i (m+1)}, ..., x^{i (m+k)}),

    where the integers in our interval I are m,m+1,...,m+k. B, on the other hand, is a 0-1 matrix, with a 1 in the ith row and jth column when j divides the ith number of our interval. Although this matrix factorization is not as non-trivial as one would like, it might at least allow one to more easily see such things as the fact that our matrix AB of geometric series has large rank (assuming it does — surely it does).

    I can certainly imagine how to show that B has large rank, and of course we know that A has large rank from the fact that it is a fragment of a Vandermonde. If we could somehow relate the kernel of A to the image of B, we would be done as far as showing that AB has large rank.

    I feel that other kinds of matrix factorization can be found, some more useful than the one above. The reason I say this is that the Chinese Remainder Theorem can be encoded as a matrix product: say you have two 0-1 vectors of length n. For the first one, put a 1 in the ith position if i \equiv a \pmod{q}, and 0 otherwise; and, do the same for the second vector, but put 1 in the ith position if i \equiv a' \pmod{q'}. Taking the dot product of these two vectors gives you the number of integers up to n that are in both of these progressions at once. One can imagine several ways of using this fact (actually, generalizations of it, where the 1‘s are replaced by powers of x) to produce factorizations of matrices like our AB

    Comment by Ernie Croot — September 28, 2009 @ 3:24 pm | Reply

    • One more thing: the polynomial h(X,Y) I tried to construct above, although of large degree (say at most z^{1/2} in each variable), was to be very sparse, say having fewer than z^{1/2-\epsilon} terms. Without the sparseness, it would be useless as far as allowing one to “skip ahead”.

      Comment by Ernie Croot — September 28, 2009 @ 3:30 pm | Reply

  18. It occurred to me that one can write the geometric series (x^{da_d} - 1)/(x^d - 1) in a different way than a sum of a few polynomials x^c (x^d+1)^{2^m-1} that is nicer from the point of view of an eventual application of FFTs or related methods. First, let me discuss the problem with using the geometric series formula, and expressions such as x^c (x^d+1)^{2^m-1}: First, the x^d-1 in the denominator of the geometric series formula is quite nasty to deal with, both for FFTs and probably also methods to show that certain matrices (that I have discussed in previous postings) are nonsingular. And the trouble with the x^c(x^d+1)^{2^m-1} is that it is the composition of two not-so-trivial polynomials, namely the (2^m-1)th power map, for m not too small, and the translated power map x \to x^d + 1 — compositions of this sort seem quite hard for FFTs to handle, though it doesn’t seem as bad as the geometric series formulas (at least not to me). If in place of the power (2^m-1), we had something quite a bit smaller — bounded, in fact — while perhaps increasing the number of terms in our lacunary x^d + 1 a little bit (but not by too much), it would seem to me to be much easier for an FFT approach to get somewhere, for a variety of reasons I don’t want to get into. Well, there *are* identities that we can use in place of these that are better in this sense; specifically, consider the “base m generating function” identity

    1 + x^d + x^{2d} + \cdots + x^{d ((m-1) + (m-1)m + \cdots + (m-1) m^k)}

    \ \ \ =\ \prod_{i=0}^k (1 + x^{d m^i} + x^{2d m^i} + \cdots + x^{d (m-1) m^i}).

    If, say, we use d=1 and perhaps k=20, then if our geometric progression is to go up to about \sim z, we basically can use m \approx z^{1/21} — that is, we have the product of 21 polynomials with few terms.

    Identities of this sort might also be easier to deal with when showing that our various geometric series matrices are nonsingular, the reason being that in the matrices that come up, each entry will be the product of only 21 lacunary polynomials. It is not difficult (at least I don’t think so) to relate the non-singularity of matrices with lacunary polynomial entries, evaluated down at column at 1,x,x^2,...,x^n, with the non-singularity of certain Vandermonde matrices; and if the same can be said for matrices where each entry is a small number of products of lacunary polynomials, then we are in business!

    Comment by Ernie Croot — September 30, 2009 @ 4:28 am | Reply

    • Let me add to this, in case I wasn’t clear: We know how to compute the prime generating function at X=1,x,x^2,...,x^{\lfloor z^{0.1}\rfloor} altogether in time below the square-root barrier. But it would be nice to be able to take this all the way up to about x^{\lfloor \sqrt{z}\rfloor} or so. If the generating function were just a polynomial having about z^{0.5} terms, then perhaps an FFT method would do it; but we don’t have that, so we have to hunt around for identities to put the generating function in a form that is “good enough” for some generalization of the standard FFT methods to work.

      Comment by Ernie Croot — September 30, 2009 @ 8:09 pm | Reply

  19. Ernie –

    I’ve been away from this page for a short while, and I suspect I am not the only one who is having difficulty following – you seem to be doing all of the work right now. Would you mind typing up a sketch of your overall strategy, including why you think working with polynomials is best?

    Comment by H A Helfgott — October 1, 2009 @ 4:01 pm | Reply

    • Here is a link to an article about the polynomial strategy:

      Comment by Kristal Cantwell — October 1, 2009 @ 5:36 pm | Reply

    • Harald — I have been preoccupied lately writing up my NSF grant proposal, which is due in 2 days, so I don’t have time right now to do much. As Kristal pointed out below, I think Terry has put up some of it on the webpage, but there is still lots more to put up there.

      Right now, the main problem is to confront Goal 1 in my previous postings. One idea is to try to maneuver around it altogether by restricting to arithmetic progressions in [z, z + z^{0.51}], but I am feeling less optimistic about that approach just now. Last night, before going to sleep, it occurred to me that one can change Goal 1 into a different sort of problem related to inner products — instead of trying to show that a certain Vandermonde-like matrix is non-singular, one can attempt to show that in a certain sequence of square matrices, one induces an inner product in which (1,1,...,1) is not self-orthogonal. I know that’s nowhere near a good enough description — when this grant business is all over with, I will be able to devote a lot more time to writing things up properly…

      Comment by Ernie Croot — October 1, 2009 @ 8:07 pm | Reply

    • Yeah, I should be doing more of this, but I have been preoccupied with my Australia trip recently. But thanks for reminding me about it. Some of this is indeed on the wiki, but I can try to condense it here:

      Basically, Ernie has a reasonably quick way of computing the expression \sum_{a \leq p \leq b} x^p \hbox{ mod } 2, g(x) for intervals [a,b] \subset [1,N] of length at most N^{0.51+o(1)} and polynomials g of low degree (degree N^{o(1)}. This expression can be computed in something like N^{0.49+o(1)} time. This recovers the earlier result that the parity of the number of primes in [a,b] can be computed in this time, but we can now also do things like the parity of the number of primes in [a,b] that is equal to r mod q for any q = N^{o(1)}.

      (The way this is computed, by the way, is basically to rewrite \sum_{a \leq p \leq b} x^p mod 2 as \sum_{d,m: d<m, a \leq dm \leq b} x^{dm}, and use Farey sequences to represent the domain of summation of (d,m) as a polygon. After one application of the geometric series formula, it reduces to the task of evaluating quadratic sums such as \sum_{1 \leq j \leq J} x^{aj^2+bj+c} in o(J) time.)

      Ideally, we would like to have a N^{0.49+o(1)} algorithm for determining whether an interval [a,b] of length O(N^{0.51+o(1)}) is completely devoid of primes, as this would then (with a binary search) locate a prime in about the same amount of time. [Using Odlyzko's Riemann zeta method one can locate an interval of length N^{0.51} with at least one prime in time N^{0.49+o(1)}.] The problem is that currently, to do this, we need to test \sum_{a \leq p \leq b} x^p \hbox{ mod } 2, g(x) for far too many choices of g – about N^{0.51}, in fact, leading to a total run time closer to N than to \sqrt{N}. The hope is to use an FFT method (or, failing that, something weaker such as Strassen fast multiplication) to cut down on the computational cost, or to use some number theoretic or Vandermonde determinant magic to cut down on the number of g one needs to test. (Note that the latter route is basically the secret to the AKS algorithm.)

      Comment by Terence Tao — October 1, 2009 @ 11:36 pm | Reply

  20. This is slightly tangential to the main program, but it occurred to me just now that there was an analogy here between what we are doing and polynomial identity testing. In particular, if we can find a way to compute \sum_{a \leq p \leq b} x^p \hbox{ mod } 2, g(x) in polynomial time for g of logarithmic size, then the problem of determining whether [a,b] contains a prime should be a BPP problem (if it does, then the above expression will be non-zero for a non-trivial fraction of the g’s, I think, by the large sieve or something). So this would solve the original finding primes problem subject to P=BPP (which we were not able to do before because of a pseudoprime counterexample).

    Comment by Terence Tao — October 2, 2009 @ 12:57 am | Reply

  21. I recently came across the following paper that looks like it might be highly relevant to our problem of quickly evaluating the prime generating function at many points:

    Click to access U07.pdf

    I am particularly interested in the multi-point, multi-variable polynomial composition evaluation algorithm, and I think there is a chance that it can be used to evaluate our generating function at many points as quickly as we could hope for.

    On another matter, I spent the past two weeks trying about a dozen different ideas for attacking Goal 1 (including the one I mentioned above in response to Harald’s question), though most didn’t work out. What I am trying to do right now is prove not only Goal 1, but in fact a strengthening of it, where we only need evaluate the generating function at X=1,x,x^2,...,x^D, D \sim z^{0.49}, in order to find a point where it doesn’t vanish. My idea is to try to rewrite our generating function f(X) in the form

    f(X) = \sum_i \alpha_i(X)\beta_i(X),\ \ (*)

    where the \alpha_i and \beta_i are sparse polynomials such that the total number of terms among all of them is at most, say, z^{0.49}. If this can be accomplished, then our strengthening of Goal 1 would following from the following conjecture:

    Click to access good_rank2.pdf

    The contribution of the geometric series

    1 + X^d + X^{2d} + \cdots

    for d < z^{0.49}, say, can be put into this form (*) using the ideas from comment 19 above. What I attempted to do for those d satisfying

    z^{0.49} \leq d < z^{1/2}

    was to unwind our Strassen algorithm approach to evaluating sums like \sum_j x^{j^2}, and I *was* able to put express this as a sum of a small number of \alpha_i(x)\beta_i(x), but unfortunately the total number of terms among these \alpha_i and \beta_i together, was too big.

    The next approach I tried was to make use of the fact that the particular Strassen matrices we use have a certain self-similarity — turns out that each of the two matrices that arise can be broken up into blocks, where each block can be expressed in terms of the others. The hope, then, is to develop an alternative to Strassen's method, using the special properties of these matrices (which is not shared by matrices that would come up in more general multi-point evaluation problems). So far I have not succeeded…

    Another idea for how to put f(X) into the form (*) is to unwind the algorithms in the above paper I linked to (containing the multi-point algorithm), just like I unwound the Strassen approach. I don't think will work, however, because the multi-point algorithms make special use of the polynomial g(x) to achieve its speedup — I think, for various reasons, we need a method that is independent of the g(x).

    On another front, I have some ideas for how to prove the Conjecture liked to above (the like to my homepage). In particular, there are slight generalizations of Vanermonde determinant formulas that look like they might be relevant, for various reasons. Specifically, let f(x) be a polynomial with \ell terms, let a_1,...,a_k be any distinct integers, and let M be the matrix whose i,j entry is x^{(i-1) a_j} f(x^{i-1}). Then, the determinant of M is something like

    \pm \prod_i f(x^{i-1}) \prod_{i < j} (x^{a_i} - x_{a_j}).

    And now, one can think of the rows of this matrix M as corresponding to a product f(x) h(x), where h(x) = \sum_i x^{a_i} — basically, writing this polynomial as \sum_i x^{a_i} f(x), each term corresponds to a column of the matrix. Ok, so that's just one polynomial — maybe the matrix corresponding to a sum of products like this can likewise be analyzed.

    Comment by Ernie Croot — October 21, 2009 @ 10:07 pm | Reply

    • Looking closer at the paper I linked to in the previous post, I am not as enthusiastic now that it will help us to evaluate f(X) quickly at lots of points. But it gave me an idea that I think might lead somewhere: Suppose that g(X) has degree d, where 2^d \approx \sqrt{z}, say. Then, the order of x mod (2,g(x)) is about \sqrt{z}. What this means is that if h_1(X) is a polynomial

      h_1(X) = X^{a_1} + X^{a_2} + \cdots + X^{a_k},

      then if we let

      h_2(X) = X^{b_1} + X^{b_2} + \cdots + X^{b_k},\ b_i = a_i\ mod\ 2^d,

      then we have that for all integers j,

      h_1(x^j)\ \equiv\ h_2(x^j) \pmod{2,g(x)}.

      So, if h_1(X) is our prime generating function, then if we can find the corresponding polynomial h_2(X) quickly enough (and we can try to do this in C[x], say, where things might be easier), which has degree at most \sqrt{z}, we can use an F_2 FFT to evaluate it at X=1,x,x^2,...,x^{2^d} in time z^{1/2} (\log z)^{O(1)}. And, since the field F_2[x]/(g(x)) (we assume g was irreducible) contains $2^d$ roots of unity we don’t have to even be clever about how exactly to use FFTs for our fast evaluation problem, as these powers of x *are* our 2^dth roots of unity.

      There is one point to worry about here, concerning how this would fit in with Goal 1: In crafting Goal 1 we had assumed that x had quite large order, say order z^2 or something (see the original note where I introduced Goal 1), yet for the above idea to work we need to work with much smaller values for the order of x. Well, it’s probably ok to do this — I haven’t thought through yet the consequences of using a g(x) of such low degree. If a stronger version of Goal 1 could be proved, where one only needs to evaluate f(X) at X=1,x,x^2,...,x^{\lfloor z^{0.49}\rfloor} or so, then maybe we can even take g(x) of degree $d$ satisfying 2^d \approx z^{0.49} or so.

      Comment by Ernie Croot — October 22, 2009 @ 5:33 am | Reply

      • Hmm, this is interesting; it is the first time I have seen this approach give (or have the potential to give) a run time close to z^{0.5}, rather than z^{0.49} \times z^{0.49} which seemed to be what the previous approaches were giving (each individual evaluation of \sum_{a \leq p \leq b} x^p \mod 2, g(x) could be evaluated below the square root barrier, but the number of such evaluations was also close to the square root).

        I’m still a bit worried though that when scanning an interval [a,b] of length z^{0.51} for primes, one may need to evaluate the polynomial at about z^{0.51-o(1)} points, because otherwise I don’t see how to prevent the polynomial \sum_{a \leq p \leq b} x^p from factorising as the product of the minimal polynomials of all the points being tested. (Here I am assuming that the minimal polynomials have degree z^{o(1)}.) This would keep the run time above z^{0.51-o(1)} unless there was some way to test for non-vanishing at these points with sublinear efficiency.

        It would of course be highly unlikely that this polynomial would be so smooth, but one would need some algebraic miracle or something to rigorously show that this doesn’t happen…

        Comment by Terence Tao — October 22, 2009 @ 6:13 am | Reply

      • One observation following your comment: suppose in the interval [a,b] there were exactly two primes, and they differed by a multiple of 2^m for some m. Then \sum_{a \leq p \leq b} x^p \mod 2,g(x) would vanish for any g of degree at most m. So it does seem that we need g of moderately large degree (larger than \log_2 z^{0.51}) in order to deterministically detect primes.

        Comment by Terence Tao — October 22, 2009 @ 6:38 am | Reply

  22. It occurred to me that we might be able to show that the task of determining whether a prime lies in [a,b] could lie in BPP, at least. (This is not a triviality; on the wiki page , there is a reasonable case that one could concoct a set of pseudoprimes for which one could not solve this problem quickly even when P=BPP.) The point is that testing polynomial identities over finite fields is in BPP, because such an identity either holds everywhere or fails for a large fraction of inputs, and so can be placed in BPP by random sampling. And this whole approach is basically polynomial identity testing.

    Actually, now that I think about it, we won’t get BPP, but rather BP-z^{0.49}, which isn’t nearly as impressive, but still non-trivial at least…

    Comment by Terence Tao — October 22, 2009 @ 6:22 am | Reply

    • If the a quantum computer has a random number generator and can simulate a turing machine couldn’t it solve BQP problems?

      In fact from

      BQP contains BPP

      Furthermore since we have a source of randomness in BPP and the density is 1/log b for primes then using the random source to get a random candidate will give a number that has 1/log b probability of being prime repeating this 2log b times will give success with probability 1-(1-1/log b)^(2log b) or roughly 1-(e^-2) which means probability of failure is less than 1/3. I may be missing seeing something but it looks like the problem is in both BQP and BPP.

      Comment by Kristal Cantwell — October 22, 2009 @ 4:56 pm | Reply

      • Kristal, complexity classes such as P, BPP, NP, BQP, etc. apply only to decision problems (“Yes/No” questions), not search problems (“Find an X”). Finding a prime is a search problem, not a decision problem, and so does not lie in P, BPP, etc.

        A relevant decision problem, though, is “Does there exist a prime in the interval [a,b]?”. If one can solve this decision problem quickly, one can solve the search problem quickly, by a binary search starting from [x,2x].

        It is still open whether this decision problem is in BPP. The problem is that the density of primes in [a,b] could be very low, and so it is not clear even after polylogarithmically many searches that one has a 2/3 chance or more of finding a prime (the needle in the haystack problem). But it seems from Ernie’s work that this problem is at least in BP-z^{0.49} when a,b = O(z), by which I mean that there is a probabilistic algorithm which after O(z^{0.49}) work will correctly determine whether there is a prime or not in [a,b] with a failure probability of at most 1/3 in either case.

        Comment by Terence Tao — October 22, 2009 @ 5:40 pm | Reply

        • You are right BPQ is for decisions. There is FBPQ see

          I think there should also be a FBPP but I can’t find it.
          I think the problem is is FBPQ. We are given 10^k to 10^k+1
          with a quantum number computer we have random numbers and can guess randomly each guess and test is polynomial in k
          and and has probability of success 1/k so if we repeat this test 2k times we should have probabliity of success 1-e^2
          and we should have the problem in FBQP. This looks better than any search involving z^.49 because I think z is exponential in k. If FBPP were defined as FBPQ then something similar should show the problem is in FBPP.

          Comment by Kristal Cantwell — October 26, 2009 @ 7:14 pm | Reply

          • That should be BQP and FBQP in the above instead of BPQ and FBPQ. Also the problem is in BQP rather than is is BQP. Sorry about these errors.

            Comment by Kristal Cantwell — October 26, 2009 @ 7:28 pm | Reply

  23. I finally got around to looking up the fast multiplication algorithms in Knuth’s book “Semi-numerical Algorithms” that I had talked about before, and it turns out that one of these is called “Karatsuba’s identity”, which is somewhat similar to Strassen’s identity, only it doesn’t involve matrices is a lot easier to use. In fact, I think there is a chance of using it in combination with FFTs to get a quick-running algorithm to evaluate our prime generating function. Here is the idea: Suppose that we can write

    f(X)\ =\ \sum_{i=1}^k \alpha_i(X) \beta_i(X),

    where f(X) is our generating function for the primes in [z,z+z^{0.501}], and where the \alpha_i and \beta_i are polynomials such that the total number of terms among all of them is at most, say, z^{0.49} or so. What we would like to do is quickly evaluate f(X) at, say,

    X=1,x,x^2,...,x^N \pmod{2,g(x)}\ \ \ \ (*),

    where N \sim z^{0.49} or so; and, we’d like to be able to do this using a lot fewer than z^{1/2} operations.

    Now let D be some integer (that we choose later), and write each of the \alpha_i(X) as

    \alpha_i(X)\ =\ \alpha_{i,1}(X) + X^D \alpha_{i,2}(X),


    \alpha_{i,1}(X)\ =\ \sum_{j \leq D-1} c_j X^j,


    X^D \alpha_{i,2}(X)\ =\ \sum_{j \geq D} c_j X^j,

    where c_0,c_1,... are the coefficients of \alpha_i(X); and, write \beta_i(X) similarly — that is,

    \beta_i(X)\ =\ \beta_{i,1}(X) + X^D \beta_{i,2}(X).

    Ideally, we want D to be such that each \alpha_{i,1}(X), \alpha_{i,2}(X) have about half as many terms as \alpha_i(X), and we want the analogous to hold for the \beta_i(X) — of course, in order to guarantee that this is possible, we would have to be able to select that decomposition for f(X) above carefully.

    Now, assuming that we can find such a D (and have such a decomposition for f(X)), we then observe the following, which is basically Karatsuba’s identity:

    f(X)\ =\  \Sigma_1 + \Sigma_2 + \Sigma_3,


    \Sigma_1\ =\ (1 - X^D) \sum_{i=1}^k \alpha_{i,1}(X)\beta_{i,1}(X),

    \Sigma_2\ =\ (X^{2D} - X^D) \sum_{i=1}^k \alpha_{i,2}(X) \beta_{i,2}(X),

    \Sigma_3\ =\ X^D \sum_{i=1}^k (\alpha_{i,1}(X) + \alpha_{i,2}(X))(\beta_{i,1}(X) + \beta_{i,2}(X)).

    The point here is that we have replaced on sum of products of two polynomials (the \alpha_i(X)\beta_i(X)) with three sums of such products, but where each polynomial has degree about half what we had before — each of these sums (forgetting the factors (1-X^D) and so on) involves products of two polynomials of degree at most D. The idea is then to iterate the above process, starting with *these* sums, and then replacing each by three sums of products of polynomials each of degree D/2, and so on. Eventually, we get down to sums of products of polynomials that we can just expand out by trivial methods, as they will have few terms to begin with, and then apply FFTs to evaluate them at the points (*) above.

    Unfortunately, it is not always the case that the new polynomials produced at each iteration have fewer terms, since the \alpha_i and \beta_i may have been sparse to begin with (in fact, likely are). But what we can hope for is that the initial polynomials \alpha_i and \beta_i can be chosen carefully so that after a small number of iterations, we get “mixing” — that is, the number of terms in, say, each of the polynomials

    (\alpha_{i,1}(X) + \alpha_{i,2}(X))\ \ and\ \ (\beta_{i,1}(X) + \beta_{i,2}(X))

    is not much bigger than the number in each of

    \alpha_{i,1}(X), \alpha_{i,2}(X), \beta_{i,1}(X), \beta_{i,2}(X).

    In other words, we will want that sufficiently many iterations into the above process we have that, say, \alpha_{i,1}(X) shares many terms in common with \alpha_{i,2}(X), and the same for the \beta_{i,j}(X)‘s.

    Comment by Ernie Croot — October 25, 2009 @ 3:31 am | Reply

  24. […] Terence Tao started a polymath blog here, where he initiated four discussion threads (1, 2, 3 and 4) on deterministic ways to find primes (I’m not quite sure how that’s proceeding — […]

    Pingback by Polymath again « What Is Research? — October 26, 2009 @ 10:44 pm | Reply

  25. Hate to be boring about it, but you can attain deterministic time bound O(10^(k/2 + o(1))), the same time
    bound you got, by simply sieving an array that many bits long. You need the RH to assure that any this-size region will contain a prime, however. All multiples of numbers up to this bound, are removed by the sieving.

    Well, you probably already thought of that. (I don’t know how to use this blog thing to tell what happened before. Where’s the how-to guide?) Sure is low-tech.

    As another comment, I’m not sure this whole topic has much interest. To explain, we have easy randomized algorithms that will do the job in polynomial(k) time with high probability. These algorithms are very very very simple. Any deterministic algorithm you invent is almost certainly, by comparison, going to be utterly worthless, way more complicated and slower. That’s the first problem. The second problem is, suppose I derandomize that algorithm by simply using certain deterministic pseudorandom number generators (I have some which I can prove, starting from a random seed, will pass every possible polytime statistical test provided that ANY generator exists that does… furthermore I can prove they are nearly as immune to randomness tests as it is possible for any algorithm to be…). So, in that case, I’ve solved your problem OR there exists no such pseudo-random number generator. So in the event I haven’t solved your problem, a way, way, way more important and profound result is at hand. And very weak randomness properties are required in your application.

    So all this suggests to me, that this was a fairly stupid problem to choose.

    Mind you, I won’t say it’s totally stupid since I admit it would feel interesting to know the answer to your problem. It’s kind of like the AKS primality test was allegedly a great development… and yes it was in some sense… but from a practical point of view, it is a total waste of time.

    My final comment is: are you aware of the following papers?
    Janos Pintz, William L. Steiger, Endre Szemeredi: Two Infinite Sets of Primes with Fast Primality Tests,
    STOC 1988 pp 504-509, journal version is
    Maths of Computation Vol. 53, No. 187, July 1989 pp.399-406.

    Eric Bach: How to Generate Factored Random Numbers, SIAM J. Computing 17 (1988) pp.179-193

    These were done pre-AKS, but are better than AKS for many purposes of your ilk.

    Comment by Warren D Smith (PhD) — May 8, 2011 @ 11:48 pm | Reply

  26. 1 мар 2011 Коль скоро вы крыться придерживаться годной вам диеты в пользу кого ума, ваш ум довольно заниматься почти максимальной отдачей вплоть до
    Диетическое и лечебное питание при различных заболеваниях.
    диеты, как похудеть, упражнения, фитнес, красивая фигура, гадания.
    27 май 2011 Диета Пьера Дюкана во Франции неимоверно популярна. Но не все так розово, Диеты бывают разные – гипокалорийные, белковые и… счастливые.
    Диетическое питание. Средства для похудения, диеты, похудеть быстро. Советы и рекомендации.

    Comment by Lincidoro — July 15, 2011 @ 10:11 pm | Reply

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Blog at

%d bloggers like this: