The polymath blog

August 13, 2009

(Research Thread III) Deterministic way to find primes

Filed under: finding primes,research — Terence Tao @ 5:10 pm

This is a continuation of Research Thread II of the “Finding primes” polymath project, which is now full.  It seems that we are facing particular difficulty breaching the square root barrier, in particular the following problems remain open:

  1. Can we deterministically find a prime of size at least n in o(\sqrt{n}) time (assuming hypotheses such as RH)?  Assume one has access to a factoring oracle.
  2. Can we deterministically find a prime of size at least n in O(n^{1/2+o(1)}) time unconditionally (in particular, without RH)? Assume one has access to a factoring oracle.

We are still in the process of weighing several competing strategies to solve these and related problems.  Some of these have been effectively eliminated, but we have a number of still viable strategies, which I will attempt to list below.  (The list may be incomplete, and of course totally new strategies may emerge also.  Please feel free to elaborate or extend the above list in the comments.)

Strategy A: Find a short interval [x,x+y] such that \pi(x+y) - \pi(x) > 0, where \pi(x) is the number of primes less than x, by using information about the zeroes \rho of the Riemann zeta function.

Comment: it may help to assume a Siegel zero (or, at the other extreme, to assume RH).

Strategy B: Assume that an interval [n,n+a] consists entirely of u-smooth numbers (i.e. no prime factors greater than u) and somehow arrive at a contradiction.  (To break the square root barrier, we need a = o(\sqrt{u}), and to stop the factoring oracle from being ridiculously overpowered, n should be subexponential size in u.)

Comment: in this scenario, we will have n/p close to an integer for many primes between \sqrt{u} and u, and n/p far from an integer for all primes larger than u.

Strategy C: Solve the following toy problem: given n and u, what is the distance to the closest integer to n which contains a factor comparable to u (e.g. in [u,2u])?  [Ideally, we want a prime factor here, but even the problem of getting an integer factor is not fully understood yet.]  Beating \sqrt{u} here is analogous to breaking the square root barrier in the primes problem.


  1. The trivial bound is u/2 – just move to the nearest multiple of u to n.  This bound can be attained for really large n, e.g. n =(2u)! + u/2.  But it seems we can do better for small n.
  2. For u \leq n \leq 2u, one trivially does not have to move at all.
  3. For 2u \leq n \leq u^2, one has an upper bound of O(n/u), by noting that having a factor comparable to u is equivalent to having a factor comparable to n/u.
  4. For n \sim u^2, one has an upper bound of O(\sqrt{u}), by taking x^2 to be the first square larger than n, y^2 to be the closest square to x^2-n, and noting that (x-y)(x+y) has a factor comparable to u and is within O(\sqrt{u}) of n.  (This paper improves this bound to O(u^{0.4}) conditional on a strong exponential sum estimate.)
  5. For n=poly(u), it may be possible to take a dynamical systems approach, writing n base u and incrementing or decrementing u and hope for some equidistribution.   Some sort of “smart” modification of u may also be effective.
  6. There is a large paper by Ford devoted to this sort of question.

Strategy D. Find special sequences of integers that are known to have special types of prime factors, or are known to have unusually high densities of primes.

Comment. There are only a handful of explicitly computable sparse sequences that are known unconditionally to capture infinitely many primes.

Strategy E. Find efficient deterministic algorithms for finding various types of “pseudoprimes” – numbers which obey some of the properties of being prime, e.g. 2^{n-1}=1 \hbox{ mod } n.  (For this discussion, we will consider primes as a special case of pseudoprimes.)

Comment. For the specific problem of solving 2^{n-1}=1 \hbox{ mod } n there is an elementary observation that if n obeys this property, then 2^n-1 does also, which solves this particular problem; but this does not indicate how to, for instance, have 2^{n-1}=1 \hbox{ mod } n and 3^{n-1}=1 \hbox{ mod } n obeyed simultaneously.

As always, oversight of this research thread is conducted at the discussion thread, and any references and detailed computations should be placed at the wiki.


  1. About the zeta-function strategy, one thing I have just remembered (but which may have been in the back of my mind) is that Selberg already proved in 1943 that, on the Riemann Hypothesis, one has \pi(x+y)-\pi(x)\sim y/\log x for almost all x, for any function y=y(x) growing, roughly, faster than (\log x)^c where c>2.

    The paper starts on page 160 of the first volume of Selberg’s collected works; most of it is available on Google books here.

    Comment by Emmanuel Kowalski — August 13, 2009 @ 7:35 pm | Reply

    • I thought I might mention some heuristics that allow one to guess results such as Selberg’s result without having to do all the formal computations. As mentioned in earlier posts, we have the explicit (though not absolutely convergent) formula

      \displaystyle \sum_{n < x} \Lambda(n) = x - \sum_\rho \frac{x^\rho}{\rho} + \ldots

      where the \ldots are relatively uninteresting terms. If one introduces a spatial uncertainty of A (where 1 \leq A \ll x), one can truncate the sum over zeroes to those zeroes with height O(x/A), at the cost of introducing an error of size O(A); this is basically because x^\rho oscillates more than once over the interval [x,x+A] when |\hbox{Im}(\rho)| \gg x/A and thus (morally) cancels itself out when averaging over such an interval. Heuristically, this gives us something like

      \displaystyle \pi(x+y)-\pi(x) \approx \frac{y}{\log x} - \frac{y}{\log x} \sum_{\rho = O( \frac{x}{y} )} x^{\rho-1}.

      Assuming RH and writing \rho = 1/2 + i \sigma, one gets heuristically

      \displaystyle \pi(x+y)-\pi(x) \approx \frac{y}{\log x} [ 1 - \frac{1}{\sqrt{x}} \sum_{\sigma = O( \frac{x}{y} )} e^{i \sigma \log x}].

      The number of zeroes with imaginary part at most T is about T \log T, so there are about \frac{x}{y} \log x zeroes in play here. Pretending the exponential sum behaves randomly (which can be justified in an L^2 sense by standard almost orthogonality methods), the sum is expected to be about \sqrt{ \frac{x}{y} \log x} on the average, and the main term dominates the error term once y \gg \log x. This is a logarithm better than Selberg's result; I think this comes from the fact that we are looking at sharply truncated sums \pi(x+y)-\pi(x) rather than smoothed out counterparts, and the above heuristics have to be weakened to reflect this.

      The same type of analysis, by the way, gives agreement with the Poisson statistics for \pi(x+y)-\pi(x) for y \sim \log x at the second moment level, assuming GUE. It also explains why we start getting good worst-case bounds once y \gg \sqrt{x} \log x.

      Unfortunately, average case results don't seem to be of direct use to our problem here due to the generic prime issue (or Gil's razor). If we had a way of efficiently computing or estimating this random sum \sum_\sigma e^{i \sigma \log x} for some deterministically obtainable x, we might get somewhere…

      Comment by Terence Tao — August 13, 2009 @ 8:07 pm | Reply

      • I agree that as a straightforward analytic computation, this is unlikely to work. The “long shot” hope is to really a real algorithmic part of the argument — i.e., some procedure with “if … then … else ….” steps — to locate a good place to look for primes between x and 2x — since these are essentially distinct from purely analytic arguments and could possibly go beyond the average/genericity issue.

        Comment by Emmanuel Kowalski — August 13, 2009 @ 11:58 pm | Reply

    • I submitted a proposal that the prime counting function is given by:
      N=sum( arctan(tan(Pi*Bernoulli(2*n)*GAMMA(2*n+1))) *(2*n+1))/Pi, n=3..M), M=m/2 for m even and M =m/2-1/2 for m odd. This formula yields the exact number of primes less than m. It was derived from the Riemann hypothesis and a a solution that I submitted that was never looked at. If this formula is right, then I have solve Riemann’s hypothesis! I invite a review of my paper.

      Comment by Michael Anthony — March 14, 2010 @ 12:26 pm | Reply

      • Note also that the above prime counting formula is also given in log form as:

        N= sum(i(n+1/2)*log( (1-i*tan(Pi*bernoulli(2*n)*gamma(2*n))/(1-i*tan(Pi*Bernoulli(2*n)*GAMMA(2*n+1))/Pi , n=3..m); which for large n reduces to the Prime Number theorem.

        Comment by Michael Anthony — March 14, 2010 @ 12:32 pm | Reply

  2. Let me stick my neck out and follow up on the objection to my post #45. It’s true that without better control of the heuristic P(p+1)\approx p^{.3} the argument is in trouble. However, I think that assuming P(p+1) = p^{.3} \pm \epsilon p^{.3} is a bit beyond worst-case. As long as the quantity c(\delta) in the Erdos-type theorem is strictly decreasing for \delta in the range .3 \leq\delta\leq .3+\epsilon, (which I think follows from the Baker-Harmen methods) then we should be able to assume that P(p+1) = p^{.3} + o(p^{.3}). Of course, we’d need something more quantitative than this to be useful.

    Let’s be very optimistic for second. If we could get down to P(p+1) = p^{.3} + O(p^{.3 \times .53}) (that is to say we could get the error in the estimate P(p+1)\approx p^{.3} to be slightly worse than a square-root) then I think the procedure works as stated without enlarging any of the search intervals. The reason for this is that the search space resulting from not knowing where in the interval [x,x+cx^{.53}] the large prime (p) we are looking for is, and the search space resulting from not knowing where in the interval [p^{.3},p^{.3}+p^{.3\times.53}] the small prime (P(p+1)) is, would entirely overlap. Of course as the error estimate worsens from O(p^{.3 \times .53}) towards o(p^{.3}) the overall performance decreases towards the O(10^{k})^{.53}.

    However, I think it would be possible to get some mileage out of even a very weak error estimate. Say we could get P(p+1) = p^{.3}+O(p^{.3}/\ln^{a}(p)) (which is probably the most likely estimate we would be able to obtain, if we could get anything at all). Then I think this method should get some sort of logarithmic improvement over the prime gap results, such as moving us down to around O(k^{-a/3}(10^{k})^{.53}). For large enough a this might be able to move us towards the ultra-weak conjecture on RH prime gap result.

    Of course, all of this assumes we could get some sort of control on the error in the approximation P(p+1) \approx p^{.3}, and the pay-off would most likely be very small.

    Comment by Mark Lewko — August 13, 2009 @ 8:21 pm | Reply

    • Admittedly, this is just saying that a favorable estimate in some other problem would translate into a favorable estimate for our problem. Of course, we are not at a shortage of “other problems” of this form.

      Comment by Mark Lewko — August 13, 2009 @ 9:22 pm | Reply

  3. Here is a suggestion on how to combine strategy A with strategy D, following Harald Helfgott’s suggestion, to attack the o(\sqrt{n}) problem (problem 1 above): Instead of using the Riemann zeta function, one can use the zeta function associated to the number ring Z[i], say, to count those primes p that are a sum of two squares. Assuming the RH for this zeta function (and assuming that it doesn’t have appreciably more zeros than the usual zeta function — this would be need to be looked up), we should be able to say that every interval [n, n+ \sqrt{n} \log n], say, contains the “right number” of primes that are 1 \pmod{4} (and I suppose one can also use the usual Dirichlet L-functions for this task… so we might only need assume GRH, not ERH).

    But now these primes p \equiv 1 \pmod{4} are the sum of two squares, and there are only O(n/\sqrt{\log n}) numbers up to {}n that are a sum of two squares. Assuming that it is easy to “walk through” only those numbers (that are a sum of two squares) in our interval [n, x+\sqrt{n}\log n], it seems to me that this shaves a factor of \sqrt{\log n} off the running time to locate a prime (but maybe this can also be achieved by using a sieve on small primes for [n, n+\sqrt{n} \log n] to begin with — i.e. maybe an easier method will do).

    But now that was what we got just with the form x^2 + y^2. The same trick should work with any norm form, though I doubt one could get any better than a \log n factor improvement in the running time this way. Anyways, it was just a thought…

    Comment by Ernie Croot — August 13, 2009 @ 8:35 pm | Reply

    • The question of walking/enumerating fast the numbers represented by x^2+y^2, without repetition, seems quite difficult to me. The characterizations of these numbers which I know are all based on properties of the factorization of the integer (all odd prime divisors which are congruent to -1 modulo 4 must appear with even multiplicity). As a toy problem, is there a quick way to check that a number congruent to 1 modulo 4 is not divisible by two distinct primes congruent to -1 modulo 4?

      Comment by Emmanuel Kowalski — August 13, 2009 @ 11:55 pm | Reply

      • Yes, I guess you are right. There might still be something that can be salvaged, though. For example, there are formulas for the smaller of x or y such that x^2 + y^2 = p, for a prime p \equiv 1 \pmod{4} (though, the formulas are too unwieldy for an algorithm), and there is a dynamical systems approach due to Roger Heath-Brown that identifies the fixed point of the system with (x,y) that might be useful (I doubt it). I seem to recall also that there is a standard way to find x and y using continued fractions, though I can’t remember just now how that algorithm works. In any case, none of these ideas seem to work (if so, it’s probably just a reworking of the Euclidean algorithm, which is all continued fractions really are).

        Maybe one can look at sequences of Farey fractions somehow. For example, say we want a sum of squares that comes close to a^2 + b^2; in fact, let’s say we want a whole lot of them that we can find quickly. Maybe what would be good is to have them take the form (a + ct)^2 + (b - dt)^2 for some integer t (this, of course, won’t cover all close sums of squares, but it is a start). Now, the difference between this and a^2 + b^2 is something like 2act - 2bdt + (ct)^2 + (dt)^2. And, of course, if we choose c/d to be a close rational approximation to a/b, but where, say, c,d are much smaller than a,b, then this difference will be small. In fact, say |a/b - c/d| = 1/bd, with |d| < \sqrt{b}. Then, the difference between the two sums-of-squares will be (ct)^2 + (dt)^2, which is quite small for t small. And then maybe one can glue together a bunch of these arithmetic progressions (a + ct, b - dt), so as to cover the whole interval [x, x + \sqrt{x}\log x]. Hmmm… I guess we still can get these APs to overlap a lot — it would be good if we could somehow pick the APs to be as disjoint as possible (I have a paper on this sort of thing — picking lots of disjoint APs — but it seems completely irrelevant to this problem).

        Ok, it's starting to look like this approach (combining A and D) won't work in the way I laid out. I think it would be interesting if there were a way to do it, though.

        Comment by Ernie Croot — August 14, 2009 @ 12:42 am | Reply

        • Looking again at what I wrote here, I see that I didn’t express myself clearly. Let me try again: in the first paragraph, all I am saying is that if we knew what the x‘s are such that x^2 + y^2, x < y sums to a prime in the interval under consideration, we could search only along these and avoid the problem of repeat representations.

          And in the second paragraph all I am trying to say is that maybe what we can do is partition the set of numbers in our interval that are a sum of two squares, into families, where the numbers within each family are automatically distinct ( the numbers of the form (a + ct)^2 + (b-dt)^2 are all distinct for fixed a,b,c,d where t varies). Of course, we need that the families themselves don't overlap, and that the arithmetic progressions (a + ct, b - dt) used to define the families, also don't overlap — this is obviously the hard part.

          Comment by Ernie Croot — August 14, 2009 @ 9:51 pm | Reply

      • I don’t get this. Surely running through the numbers of the form x^2 + y^2 in an interval roughly of the form [n,n+sqrt(n)] is an easy task, and surely the time taken should be roughly O(sqrt(n))? This is all that is being required in this argument, it seems to me. What would be hard would be to pick a number of the form x^2+y^2 uniformly and at random (i.e., without multiplicities).

        Comment by H — August 14, 2009 @ 2:34 am | Reply

        • Actually, you do seem to lose the factor of O(sqrt(log N)) in running time that you intended to win by proceeding in this way. The problem is that you’ll end up producing most of the non-primes of the form x^2+y^2 many times over – once for each time the form represents the non-prime. I don’t see a way to avoid this (though obviously one needs to check for primality only once).

          Comment by HH — August 14, 2009 @ 3:34 am | Reply

          • I guess what I had in mind really was the enumeration problem as follows: given x, find the smallest integer larger than x which is of the form a^2+b^2. How fast can this be done deterministically?

            In some ways, the numbers a^2+b^2, when multiplicity of representation is forbidden, are worse than the prime numbers. For instance, their associated zeta function is roughly controlled by the square root of the zeta function (of \mathbf{Q}(i)), so it has singularities even at the zeros of \zeta(s).

            (In the other direction, sometimes those numbers are better; e.g., it is possible to get the asymptotic formula for their counting function using pure sieve, as done first by Iwaniec, I think — the asymptotic itself was first proved by Landau using analytic methods).

            Comment by Emmanuel Kowalski — August 14, 2009 @ 3:12 pm | Reply

            • This look like a very nice problem. Can it be done fast with a randomized algorithm?(without a factoring oracle?). Also, why things become easier when multiplicity of representations is not forbidden?

              Comment by Gil Kalai — August 14, 2009 @ 3:22 pm | Reply

              • I don’t know any randomized algorithm here (but maybe randomized algorithms are not very good for this type of problem where one asks for a very specific answer? for finding a number of this type in a longish interval, on the other hand, one could simply look randomly for a prime congruent to 1 modulo 4).

                Analytically, summing any smooth enough function over integers a^2+b^2, if multiplicity is allowed, can be written as


                and because this is a double sum with two free variables of a smooth function one can apply, for instance, Poisson summation, or many other tools.

                Getting rid of the multiplicity means, in this language, inserting a term 1/m(a^2+b^2), where m(n) is the multiplicity:

                \sum_{a}\sum_{b} f(a^2+b^2)/m(a^2+b^2).

                Because the function m(n) is complicated arithmetically (it depends on the prime factorization of the argument), this is not a nice smooth summation anymore.

                For instance, I don’t think there exists a proof of the counting of those numbers (the case where f(x) is the characteristic function of an interval [1,N]) which doesn’t use multiplicative methods (zeta functions or sieve).

                Comment by Emmanuel Kowalski — August 14, 2009 @ 6:06 pm | Reply

                • I wonder what is known about the complexity of deciding if a number is the sum of two squares. Is it as hard as factoring? Also, what is known about the distribution of gaps between numbers which are sums of two squares.

                  Comment by Gil Kalai — August 15, 2009 @ 7:55 pm | Reply

                  • I’m not sure if this exactly answers your question, but there is a guaranteed to be number representable as the sum of two squares in the interval [x, x+5x^{1/4}]. I don’t have a reference for a proof of this, but it is stated (and attributed to Littlewood) in connection with open problem 64 in Montgomery’s 10 Lectures on the Interface of Harmonic Analysis and Number Theory

                    Comment by Mark Lewko — August 15, 2009 @ 8:44 pm | Reply

                    • I think this fact can be established by a simple greedy algorithm argument: if one lets a^2 be the largest square less than x, then a^2 is within 2\sqrt{x} or so of x; if one then lets b^2 be the first square larger than x-a^2, then b^2 should be within about 5x^{1/4} or so of x-a^2, so a^2+b^2 is a sum of two squares in [x,x+5x^{1/4}]. (A very similar argument, regarding differences of two squares rather than sums, is sketched in Comment 4 after Strategy C in the post.)

                      It might be of interest to see if there is some way to narrow this interval; the techniques for doing so might be helpful for us.

                      Comment by Terence Tao — August 15, 2009 @ 8:52 pm

                    • That is, in fact, open problem 64 (to establish this with [x,x+o(x^{1/4})]).

                      Comment by Mark Lewko — August 15, 2009 @ 9:03 pm

                    • but regardless of what can be proved, what is the “truth” regarding the density and the distribution of gaps between integers which are the sum of two squares?

                      Comment by Gil Kalai — August 15, 2009 @ 9:29 pm

                    • Halberstam, in a paper in Math. Mag. 56 (1983), gives an elementary argument of Bambah and Chowla that replaces 5x^{1/4} with 3x^{1/4}. I think this is also discussed in one of the early chapters of the book of Montgomery and Vaughan, but I don’t have this available to check. My memory is that they say that nothing better is known. (Halberstam also does, but this is 20 years earlier).

                      In the opposite direction, Richards (Adv. in Math. 46, 1982), proves that the gaps s_{n+1}-s_n between consecutive sums of two squares are infinitely often
                      at least (1/4-\epsilon)\log n, where \epsilon>0 is arbitrarily small.

                      The average gap is of size \sqrt{\log s_n}. I don’t know if there are analogues of the conjectured Poisson distributions of the numbers of primes in intervals of logarithmic length, though that might be something fairly reasonable to expect.

                      Comment by Emmanuel Kowalski — August 15, 2009 @ 9:54 pm

                    • The greedy strategy argument of Bambah and Chowla shows that every interval (x-2\sqrt{2}x^{1/4},x] contains a sum of two squares, for x \geq 1. Nothing better is known even today (the paper of Bambah and Chowla is from around 1945). Not only Littlewood, but also Chowla, stated the improvement to o(x^{1/4} as a research problem.

                      Comment by Anon — August 16, 2009 @ 4:59 am

  4. There is another old construction of pseudoprimes due to Cipolla (1904). Let F_m = 2^{2^m}+1 be the m-th Fermat number. Note that n=F_m satisfies 2^{n-1} = 1 \mod{n}. Now take any sequence m_1<m_2< \ldots < m_r with r \geq 2. Then F_{m_1} F_{m_2} \cdots F_{m_r} is a pseudoprime if and only if m_r < 2^{m_1}.

    Unfortunately, it seems that it doesn't give a denser set of pseudoprimes. One can see this by approximating F_m by 2^{2^m} and by taking logs. And of course, we would like to construct pseudoprimes with as few prime factors as possible.

    Comment by François Brunault — August 13, 2009 @ 11:27 pm | Reply

  5. If a number is passes the Fermat primality test but is not prime it is called a Carmichael number. The number 1729 which is also the smallest number which is the sum of two different cubes in two different ways is a Carmichael number. There are an infinite number of them of them and there are at least n^2/7 between 1 and n.
    if each factor of (6k+1)(12k+1)(18k+1) is prime then their product is a Carmichael number. More on these numbers are at

    One way to find x simultaneously satisfying 2^x-1 = 0 mod x and 3^x-1 = 0 mod x that is not prime would be to try to find a triple of primes 6k+1, 12k+1, 18k+1 as above. If the primes where randomly distributed with density 1/logx then about 1/(log x)^3 searches ought to do it so possibly that would be an algorithm that would find one such k digit number in an polynomial of k time if the AKS test were used to test primality.

    Comment by Kristal Cantwell — August 13, 2009 @ 11:39 pm | Reply

    • Of course the smallest Carmichael number is 561 so the search for k-digit numbers would have to start with 3.

      Comment by Kristal Cantwell — August 14, 2009 @ 4:40 pm | Reply

    • This and related ideas are also discussed in the following paper.

      A. Granville and C. Pomerance, Two contradictory conjectures concerning Carmichael numbers, Math. Comp., 71 (2001), 883-908.

      Granville and Pomerance list essentially all known and conjectured density estimates related to this, as well as some experimental evidence.

      Comment by François Dorais — August 14, 2009 @ 6:08 pm | Reply

  6. […] Possibly related posts: (automatically generated)Thread on BloggingProposed Polymath3 […]

    Pingback by Polymath4 « Euclidean Ramsey Theory — August 13, 2009 @ 11:56 pm | Reply

  7. I started thinking whether proof theoretic methods could shed some new light on this problem. I was originally aiming for negative results, which is where proof theory usually excels, but while doing some preliminary investigations, I found the following wonderful paper:

    J. B. Paris, A. J. Wilkie and A. R. Woods, Provability of the pigeonhole principle and the existence of infinitely many primes, J. Symbolic Logic 53 (1988), 1235-1244.

    In this paper, the authors show that the infinitude of primes is provable in \mathrm{I}\Delta_0 + \Omega_1. A standard formulation of his system consists of the usual defining equations for {0},{1},{+},{\times},{\leq} as well as the functions {|x| = \lfloor\log_2(x+1)\rfloor} and {x\#y = 2^{|x||y|}} (this is the \Omega_1 part) together with induction for bounded formulas (this is the \mathrm{I}\Delta_0 part).

    Their proof is rather clever. The authors show that if there is no prime in {[a,a^{11}]} then there is a bounded formula \phi(x,y) that describes the graph of an injection from {[0,9a|a|]} to {[0,8a|a|]}, thereby violating a very weak form of the Pigeon Hole Principle (PHP). Before then, they show that for every standard positive integer n and every bounded formula \phi(x,y), it is provable in \mathrm{I}\Delta_0 + \Omega_1 that: for every z \geq 1 the formula \phi(x,y) does not describe the graph of an injection from {[0,(n+1)z]} to {[0,nz]}.

    By standard witnessing theorems of bounded arithmetic, this means that there is a function such that p(x) \geq x is always prime and p(x) is polynomial time computable using an oracle somewhere in the polynomial hierarchy. In fact, I gather that the following paper proves a suitable form of PHP in the weaker system T^2_2.

    A. Maciel, T. Pitassi and T., A. R. Woods, A new proof of the weak pigeonhole principle, J. Comput. System Sci. 64 (2002), 843-872.

    Improving this to S^1_2 [or T^1_2, or S^2_2] would give a polytime algorithm for producing large primes [modulo polynomial local search, modulo a NP oracle]. There are a lot of follow up literature relating weak PHP and infinitude of primes to other common problems in complexity theory and proof theory, but I haven’t had a chance to survey many of them.

    On the other hand, a proof that S^1_2 does not prove the infinitude of primes would also be interesting. This would not imply that there is no polynomial time algorithm for producing large primes, but it would show that the proof of correctness of such an algorithm would require stronger axioms. Possibly Peano arithmetic would suffice to prove correctness, but the proof may require second-order axioms (e.g. deep facts about zeros of \zeta(s)) or, albeit unlikely, much worse (e.g. ZFC + large cardinals)!

    Comment by François Dorais — August 14, 2009 @ 5:11 am | Reply

    • Well, the NP-oracle algorithm is obvious: once we know that {[x,t(x)]} contains a prime for some term {t} of the language, we can repeatedly bisect this interval and query the oracle to see which side contains a prime. Since for every term {t} there is a standard {e} such that |t(x)| \leq |x|^e+e, this homes in on a prime in polynomial time. So I would conjecture that some weak Bertrand’s Postulate is provable in S^2_2, namely that there is a term t such that S^2_2 proves that there is always a prime in the interval {[x,t(x)]}. This term could be much larger than x^{11}, leaving room for weaker forms of PHP or some completely different idea…

      The PLS possibility is intriguing. The dynamical system attack suggested by Tao may be on the right track (though we may want to increase the number of neighbors a bit). Note that the PLS algorithm doesn’t need to produce the prime itself, only some datum that can be decoded into a suitable prime in polytime, so that gives a bit more room to set up good cost functions. I don’t have definite thoughts, but this looks plausible.

      I’m afraid I still have no intuition for the polytime possibility…

      Comment by François Dorais — August 14, 2009 @ 6:51 am | Reply

    • To bring this down to Earth a little, the weak pigeonhole principles mentioned above are related to the existence of collision-free hash functions (in the weakest of the two senses). Suppose \phi(x,y) (which is allowed to depend on {z} too) is a simple enough bounded formula which always defines the graph of a function from {[0,t(z)]} to {[0,z]} for some term t such that t(z) \geq z+1 for all {z}. Then, finding a S^1_2 proof that, for all {z}, \phi(x,y) does not represent the graph of an injection from [0,t(z)] to {[0,z]} is essentially equivalent to finding a polynomial time algorithm that always produces a collision for \phi, i.e., two numbers in {[0,t(z)]} that map to the same value via \phi, at least for all standard numbers {z}.

      A common bottleneck in the work relating PHP and primes is the difficulty of factoring, so I will provisionally assume that I have a factoring oracle. In this case, there is a simple bounded formula \phi as above such that either there are polynomial time constructable primes in all sufficiently long intervals, or \phi describes the graph of a collision-free hash function. (Warning: I’ve only superficially checked the details of this argument, so I may have missed a sneaky inadmissible use of induction somewhere.) Because of the nature of this result, I suspect that similar (possibly better) results already exist in descriptive complexity. Perhaps this rings a bell to some readers?

      A little side remark about factoring – it is probably not a mere coincidence that the main part of the most efficient factoring algorithms we know consist in finding a collision for some function…

      Comment by François Dorais — August 14, 2009 @ 7:56 pm | Reply

      • Below is the key result of Woods in its general form, which goes through in S^1_2 + Factoring. The most subtle point is to show in S^1_2 that every prime that divides a number must occur in the prime factorization of that number.

        The description of the function H in the proof is \Sigma^b_1 modulo the factoring oracle, so the witnessing theorem for S^1_2 applies and this can be used to relate collision-free hashing functions with long intervals without polynomial time constructable primes (modulo factoring).

        As usual |x| = \lceil\log_2(x+1)\rceil, and when a x = p_1^{e_1}\cdots p_n^{e_n} is factorization it is always assumed that p_1 < \cdots < p_n, e_1,\dots,e_n \geq 1, and that the factorization is coded in a manner which is easily decodable in S^1_2.

        Theorem (Woods). If every element of the interval $[b+1,b+a]$ is $a$-smooth, then there is an injection from ${[0,a|b|-1]}$ into ${[0,(1+\lfloor a/2 \rfloor)|b|+2a|a|-1]}.$

        I give the construction in some detail since this is slightly more general than the proof in PWW and I wanted to convince myself that this does indeed work in S^1_2 + Factoring. However, the tedious verifications are left to the reader.

        Sketch of proof. For every x \in [0,a|b|-1] there is a unique triple (i,j,k) such that

        \displaystyle (i-1)|b| \leq x \leq i|b|-1

        and, factoring b+i = p_1^{e_1} \cdots p_{n}^{e_n}, we have 1 \leq j \leq n, 1 \leq k \leq e_j, and

        \displaystyle (i-1)|b| + u_j + (k-1)|p_j| \leq x \leq (i-1)|b| + u_j + k|p_j| - 1,


        \displaystyle u_j = e_1|p_1| + \cdots + e_{j-1}|p_{j-1}|.

        So u_1 = 0 and u_j + k|p_j| \leq u_n+e_n|p_n| \leq 2|a| - 1.

        The function H will be linear on each interval described a triple (i,j,k) as above. There are two main cases depending on whether (*) there is a 1 \leq \ell \leq n such that there are no multiples of p_\ell^{e_\ell+1} in {[b+1,b+a]} and there are no multiples of p_\ell^{e_\ell} in {[b+1,b+i-1]}.

        Case (*) holds. Let \ell be the first element of {[1,n]} that satisfies (*). Then define

        \displaystyle H(x) = x - (i-1)|b| + 2a|a| + \lfloor(1+p_\ell)/2\rfloor|b|.

        In other words, H maps the interval {[(i-1)|b|,i|b|-1]} onto the interval of length |b| starting with

        \displaystyle 2a|a|+\left\lfloor\frac{p_\ell+1}{2}\right\rfloor|b|.

        Since p_\ell \leq a, we have

        \displaystyle 2a|a| \leq H(x) \leq (1+\lfloor a/2\rfloor)|b| + 2a|a|-1.

        Case (*) fails. Then either (1) b+i is not the first multiple of p_j^{e_j} in {[b+1,b+a]}, or (2) b+i is the first multiple of p_j^{e_j} in {[b+1,b+a]} but there is a multiple of p_j^{e_j+1} in {[b+1,b+a]}.

        Ad (1). Let b+s be the first multiple of p_j^{e_j} in {[b+1,b+a]} (so s < i). Note that p_j^{e_j} divides i - s. Factor i - s = q_1^{f_1}\cdots q_m^{f_m} and say p_j = q_h. Then set

        \displaystyle H(x) = x + 2(i-s-1)|a| + v_h - (i-1)|b| - u_j,

        where u_j is as above and similarly

        \displaystyle v_h = f_1|q_1|+\cdots+f_{h-1}|p_{h-1}|.

        In other words, H maps the interval

        \displaystyle [(i-1)|b| + u_j + (k-1)|p_j|,(i-1)|b| + u_j + k|p_j|-1]

        onto the interval

        \displaystyle [(i-s-1)|a| + v_j + (k-1)|q_h|,(i-s-1)|a| + u_j + k|q_h|-1],

        which both have length |p_j| = |q_h|.

        Ad (2). Let b+s be the first multiple of p_j^{e_j+1} in {[b+1,b+a]} (so s > i). Note that p_j^{e_j} divides s - i. Factor s - i = q_1^{f_1}\cdots q_m^{f_m} and say p_j = q_h. Then set

        \displaystyle H(x) = x + 2(s-i+t)|a| + v_h - (i-1)|b| - u_j,

        where u_j and v_h are as above and

        \displaystyle t = p_j^k\left\lfloor\frac{a-s}{p^k_j}\right\rfloor-1.

        In other words, H maps the interval

        \displaystyle [(i-1)|b| + u_j + (k-1)|p_j|,(i-1)|b| + u_j + k|p_j|-1]

        onto the interval

        \displaystyle [(s-i+t)|a| + v_j + (k-1)|q_h|,(s-i+t)|a| + u_j + k|q_h|-1],

        which both have length |p_j| = |q_h|.

        Note that in both subcases (1) and (2), we have H(x) \leq 2a|a|-1.

        Comment by François Dorais — August 15, 2009 @ 2:15 am | Reply

        • I am trying to understand this construction. What happens if two different triples (i,j,k) and (i',j',k') with i \neq i' satisfy condition (*)? Is there an argument that guarantees that the two corresponding p_\ell are different? Otherwise I do not see why the mapping is injective. I’m sorry if it’s a dumb question.

          Comment by Anonymous — August 15, 2009 @ 4:15 pm | Reply

          • The verifications are indeed tedious! There are a two cases: If the exponent of p is the same (say e) in b+i and b+i' then they cannot both be the first multiple of p^e in {[b+1,b+a]} as required by the second half of (*). If the exponents are different then whichever has the smallest exponent fails the first half of (*).

            Comment by François Dorais — August 15, 2009 @ 6:47 pm | Reply

    • The main reason for analyzing the question in S^1_2 is the following result of Sam Buss.

      Witnessing Theorem. Let \phi(x,y) be a \Sigma_1^b formula. If S^1_2 \vdash (\forall x)(\exists y)\phi(x,y) then there is a polynomial time computable function f(x) such that S^1_2 \vdash (\forall x)\phi(x,f(x)).

      If \phi(x,y) is equivalent to ‘y \geq x and {y} is prime’ then from a proof of the infinitude of primes in S^1_2 we could automatically extract a polytime algorithm for producing large primes. On the other hand, if the infinitude of primes is not provable in S^1_2 then we would not necessarily conclude that there is no polytime algorithm for producing large primes, but the correctness of this algorithm would not be formalizable in S^1_2. (Basically, the proof would require much more powerful tools than elementary arithmetic.)

      There is a little snag here: the usual formula for ‘{y} is prime’, namely

      \mathrm{prime}(y) \equiv (\forall u,v \leq y)(y = uv \rightarrow u = y \lor v = y),

      is \Pi^b_1 instead of \Sigma^b_1. Of course, this is not a problem with a factoring oracle, but this is a nagging assumption. The AKS algorithm does provide a \Sigma^b_1 formula to express ‘{y} is prime’. The snag is that the correctness of the AKS algorithm may not be provable in S^1_2, i.e., it is conceivable that a model of S^1_2 may contain an infinitude of “AKS pseudoprimes” but only a bounded number of actual primes. While the extracted algorithm for producing “AKS pseudoprimes” would still produce actual primes in the real world, this situation would be rather awkward.

      The problem with the correctness of AKS in S^1_2 is proving the existence of a suitable parameter r (as in Lemma 3 of Terry Tao’s post on AKS). The simple proof requires finding a small prime r that does not divide n^i-1 for i \leq |n|^2. So this seems to require the infinitude of prime lengths, i.e., a proof of

      (\forall x)(\exists y)(x \leq y \land \mathrm{prime}(|y|)).

      (Note that the quantifiers in \mathrm{prime}(|y|) are sharply bounded.) This looks a lot easier to tackle than the infinitude of primes, but it is still not that easy. Indeed, this is a lot like trying to prove the infinitude of primes in \mathrm{I}\Delta_0, which is an open problem. I don’t know how to do it, so let me pose it as a problem.

      Problem: Prove the infinitude of prime lengths in S^1_2.

      A negative answer would be fine, but be aware that negative answer would also solve negatively the question of the infinitude of primes \mathrm{I}\Delta_0.

      This difficulty suggests the possibility of working in S^1_3 instead. This system is just like S^1_2 except that it has a new basic function symbol x \#_3 y = 2^{|x|\#|y|}. (Yes, there is a whole hierarchy of sharp functions, each of which has subexponential growth.) In S^1_3 the lengths are closed under the sharp operation and the PWW proof in \mathrm{I}\Delta_0 + \Omega_1 can be reproduced for lengths. I’ve heard it claimed that the Witnessing Theorem above generalizes to S^1_3 with “polynomial” replaced by “ever-so-slightly-superpolynomial” (i.e., bounded by |t(x)| where t(x) is a term of S^1_3), which is likely true but I never saw a proof…

      Comment by François Dorais — August 15, 2009 @ 10:40 pm | Reply

      • Here’s an idea for the problem. Given a length |x| use a number {z} of length |x|^2 to do a sieve of Eratosthenes. Place a 1 bit in the {i}-th digit of {z} iff {i} is a multiple of some 2 \leq u \leq |x|. Show by induction on |x| (or PIND on {x}) that z has at least one 0 bit beyond the |x|-th. Anyway, this doesn’t immediately give the right parameter {r}, but a similar trick might work…

        Comment by François Dorais — August 16, 2009 @ 12:17 am | Reply

      • The trick above allows us to count just about anything that only involves only length-sized numbers. So basic sieve methods should go through and give lower bounds on the density of length primes. This is enough to pick a suitable parameter r (which doesn’t have to be optimal since any r \leq |n|^{17000} would be just fine for our purposes).

        Propositions 4 and 5 (again referring to Terry Tao’s presentation of AKS) are more problematic since \mathcal{G} is too big to count. Proving that the various products in the proof of proposition 4 are distinct is fine since the polynomials all have length-size degree. The little pigeonhole trick at the start of the proof proposition 5 is fine since the exponents i,j,i',j' are all length-size. The final part of the proof of proposition 5 is problematic.

        It seems that this boils down to another pigeonhole problem. This one is a little different since formulating it as an injection from {[0,2^t]} to {[0,n^{\sqrt{t}}]} seems to require solving discrete logarithm problem! Maybe there’s a better way to think about this?

        Comment by François Dorais — August 16, 2009 @ 3:35 am | Reply

    • I think it’s time to recap some and draw some preliminary conclusions.

      Here is a quick guide for those who are unfamiliar with bounded arithmetic: The system S^1_2 is the essentially the minimal system of arithmetic that can formalize all polynomial time algorithms. The system T^1_2 is marginally stronger and allows some simple search algorithms (usually optimization algorithms) to be formalized too. So “provable in S^1_2” can be read as provable using exclusively polynomial-time methods. Similarly, “provable in T^1_2” can be read as provable using polynomial-time methods and some simple optimization methods, but nothing else.

      More precisely, here are the two basic facts which stem from the witnessing theorems of bounded arithmetic:

      (1) From a proof of the infinitude of AKS-primes in S^1_2 one can automatically extract a polynomial time algorithm for producing primes (in the real world).

      (2) From a proof of the infinitude of AKS-primes in T^1_2 one can automatically extract a polynomial local search (PLS) algorithm for producing primes (in the real world).

      Since the AKS theorem does not appear to be formalizable in S^1_2, I use “AKS-prime” for numbers that pass the AKS test and “prime” for numbers without proper divisors. (Of course, there is no such distinction in the real world.) A PLS algorithm wouldn’t be as good as a polytime algorithm, but such an algorithm would show that finding primes is either not NP-hard or NP = co-NP. See also the recent work of Beckmann and Buss [] which allows to draw similar conclusions from proofs in T^i_2 for i \geq 2.

      The line of thought using a factoring oracle and relating finding primes with some pigeonhole principles has some sketchy aspects and I can’t draw anything very useful from it at this time. Basically, while the factoring oracle may be inoffensive, I have a hard time convincing myself of that. So I can only draw conclusions from a S^1_2 proof of polytime factorization, which is even more unlikely than polytime factorization.

      Now for some preliminary conclusions, which are only heuristic but are worth keeping in mind.

      While I initially thought that the non-provability of the AKS theorem in S^1_2 would be a negative, I now realize that it could be a blessing. Indeed, it suggests that AKS-like congruences may be more amenable to polynomial time attacks than pure divisibility (e.g. attacks using sieves and prime density). So Avi Wigderson’s proposal (Research Thread II, comment #18) may deserve more attention than it has been getting thus far. Moreover, result (2) suggests first trying something with an optimization flavor to it.

      While density methods and variations based on divisibility may be somewhat limited for polynomial time attacks, this does not mean that they should be neglected. Indeed, analytic number theory provides a wealth of second-order information about this which may help evade these limitations. Perhaps a hybrid solution of the following type: try solving the AKS congruences using this method; if that fails, then there must be an unusually large number of primes in this small set.

      Comment by François Dorais — August 16, 2009 @ 6:21 pm | Reply

    • Here is the simple formula you are looking for:
      arctan(tan(Pi*bernoulli(2*n)*GAMMA(2*n+1)))*(2*n+1)/Pi= 1 for Prime or zero for non-prime!
      In log form:
      i*(n+1/2)*log((1-tan(Pi*bernoulli(2*n)*GAMMA(2*n+1)))*(2*n+1)/Pi)/(1+tan(Pi*bernoulli(2*n)*GAMMA(2*n+1)))*(2*n+1)/Pi) )= 1 for Prime or zero for non-prime!

      Comment by Michael Anthony — March 14, 2010 @ 1:10 pm | Reply

  8. Here’s another suggestion, just using the usual Riemann zeta function: our explicit formula for \psi(x) = \sum_{n \leq x} \Lambda(n) is x - \sum_{\zeta(\rho)= 0} x^\rho/\rho. Now maybe we can work with a small set of values x, specifically chosen to nullify the effect of a large proportion of the zeros up to height, say, \sqrt{x} or so. How? Well here is a first attempt: consider F(x) := \psi(x) + \psi(x/2) + \cdots + \psi(x/N) (and of course we will want to show, say, that F(x+\sqrt{x}) - F(x) is “large”, so that one of the intervals [x/n, (x+\sqrt{x})/n] has loads of primes). Applying the explicit formula, we find that F(x) = x(1 + 1/2 + \cdots + 1/n) - \sum_\rho (x^\rho/\rho)(1 + 1/2^\rho + \cdots + 1/N^\rho). Notice here that 1 + 1/2^\rho + \cdots + 1/N^\rho is a fragment of the Riemann zeta function, and its closeness to $\latex \zeta(s)$ can be determined through the usual integration-by-parts analytic continuation \zeta(s) = s/(s-1) - s \int_1^\infty \{t\}/t^{s+1} dt (where \{t\}$ is the fractional part of t) for Re(s) > 0. But we know that \zeta(\rho) = 0, so I think this says that our Dirichlet polynomials should all be close to 1/(\rho-1) (I would need to check my calculations) when evaluated at these roots \rho. If this is correct, then we get a formula something like F(x) = x(1 + \cdots + 1/n) - \sum_\rho x^\rho/\rho(\rho-1) + E, for some error E. The fact that we get essentially a \rho^2 in the denominator is good news, assuming I haven’t made an error.

    Assuming this is all correct, the skeptic in me says that when the dust has settled we get no gain whatsoever.

    Comment by Ernie Croot — August 14, 2009 @ 4:11 pm | Reply

    • Cramér’s conjecture would permit one to find a prime by testing all the integers in a single very short interval. Does the problem get easier by allowing to test the integers in a collection of widely scattered very short intervals? The question is inspired by the observation that Chebyshev’s method allows us to count weighted prime powers in the set

      E_x = (x/2,x] \cup (x/4,x/3] \cup (x/6,x/5] \cup \cdots

      with extreme precision, in fact

      \left|\sum_{n \in E_x}\Lambda(n) - \log(2)x\right| \leq \log(x)

      What I am suggesting is, maybe the explicit formula approach can be made more powerful with the extra flexibility of searching in several very short intervals. If they are not too many, but are very short, the additional work will not be sgnificant.

      Comment by Anon — August 14, 2009 @ 5:21 pm | Reply

      • I forgot to say that what I am suggesting is a slight elaboration on Ernie’s strategy, to consider something like

        $\psi(x + h(x)) – \psi(x) + \psi(c_2x + h_2(x) – \psi(c_2x) + \cdots + \psi(c_m + h_m(x)) – \psi(c_mx)$

        with suitable $c_2,\ldots,c_m$ and $h(x),h_1(x),\ldots,h_m(x)$ to knock out terms in the explicit formula. Here $m$ could be as large as a power of $\log(x)$.

        Comment by Anon — August 14, 2009 @ 5:40 pm | Reply

        • That *may* work, but I would think you will need the h_i(x) to be quite large (\sqrt{x} or so — which is what we are aiming to slightly improve after all), but I doubt you could knock out lots of terms with m only of size a power of log — with only a power of log number of terms I would think you would start losing control over the corresponding terms in the explicit formula for zeros a little bigger than m itself (which is a power of log).

          A serious problem with this sort of approach is that you don’t know ahead of time where the zeros of zeta are located on the half-line (assuming RH), and so you are sort of forced to choose the c_i‘s and the h_i(x)‘s to conform to some short piece of the zeta function. Well, there may be other nice choices out there (like terms of zeta smoothed in some way, or maybe terms coming from a short Euler product, or maybe a short segment of a completely different zeta function with which you play the usual zero-repulsion games), but if so then I would think you don’t get anything much better.

          It just occurred to me that maybe one can modify an idea of Jean Louis Nicolas (on a completely unrelated sort of problem). I’ll think about it…

          Comment by Ernie Croot — August 14, 2009 @ 7:21 pm | Reply

    • I just realized I think there is a factor N^{1-\rho} missing in my calculations. Probably this means that one would need to do some smoothing (of the Dirichlet polynomial) for there to be any chance of finding primes more quickly by this method. At any rate, I don’t think these methods have a chance of breaking the square-root barrier.

      Here is the calculation in detail: for arbitrarily small \epsilon > 0 we have \sum_{n = 1}^N 1/n^s = \int_{1-\epsilon}^N d [x]/x^s =  [x]/x^s  |_{1-\epsilon}^N + s \int_{1-\epsilon}^N [x] dx/x^{s+1}. Here, [x] = x - \{x\}, where \{x\} denotes the fractional part of x. Rewriting the [x] in the integral as x - \{x\}, we get that the Dirichlet polynomial is N^{1-s} (1 - s/(s-1)) + s/(s-1) - s \int_{1-\epsilon}^N \{x\} dx/x^{s+1}. Now, these last two terms here are approximately \zeta(s) for Re(s) > 0 and N large, because Re(s+1) > 1, making the integral absolutely convergent — of course we are using here the formula \zeta(s) = s/(s-1) - s \int_{1-\epsilon}^\infty \{x\} dx/x^{s+1}. If we have \zeta(\rho) = 0, then, we get that our Dirichlet polynomial is approximately N^{1-\rho}/(1-\rho).

      Comment by Ernie Croot — August 15, 2009 @ 6:05 pm | Reply

      • Please correct typo in Gamma…Gamma(2*n)>> Gamma(2*n+1), In my paper primes and the Riemann Hypothesis, I show that the primes obey, the relation
        frac(theta)=arctan(tan(theta)). You see the primes obey this rule for
        theta = Pi*bernoulli(2*n)*Gamma(2*n+1), only if 2*n+1 is a prime. The function theta is a consequence of the Riemann Hypothesis: s=1/2+i*tan(theta)/2.
        In fact, I show that:
        Zeta(s) = sum(-2^nu*bernoulli(nu)*GAMMA(1-s)/(factorial(nu)*GAMMA(-s-nu+2)), nu = 0 .. infinity); and that when the function vanishes, s/(1-s) = (1/2+(1/2*I)*tan(theta))/(1/2-(1/2*I)*tan(theta))= exp(theta) iff RH is true. The connection with primes is obvious, since when s=-2*n and 2*n+1 is a prime, the function takes an exponential form, and von-Staudt Clausen makes it clear the primes can be counted.

        Comment by Michael Anthony — March 14, 2010 @ 12:50 pm | Reply

  9. Here is an elementary observation, which replaces the problem of finding an integer with a factor in a range with that of finding an integer without a factor in a range.

    Suppose one is looking for a prime larger than u. Observe that if a number n > u does not have any prime factor larger than u, then it must have a factor in the interval [u,u^2], by multiplying the prime factors of n (which are all at most u) one at a time until one falls into this interval. In fact, given any number 1 \leq m \leq n, one must have a factor in the interval [m,mu] by the same reasoning.

    So, to find a prime larger than u in at most a steps, one just needs to find an interval [n,n+a] and an interval [m,mu] with m<n such that the multiples of all the integers in [m,mu] do not fully cover [n,n+a].

    This seems beyond the reach of sieve theory, but at least we no longer have the word “prime” in the problem…

    Comment by Terence Tao — August 15, 2009 @ 3:39 am | Reply

    • Here is a variant of the above idea.

      Suppose we are given a factoring oracle that factors k-digit numbers in O(1) time. What’s the largest prime we can currently generate in O(k) time? Trivially we can get a prime of size k or so (up to logs) by scanning [k+1,2k] for primes. If one assumes ABC and factors all the numbers in, say, [2^k+1, 2^k+k], one can get a prime of size k^{2-o(1)} at least (see my comment in II.10). But this is so far the best known.

      Suppose we wanted to show that [2^k+1,2^k+k] had an integer with a prime factor of at least k^{10}, say. If not, then all numbers here are k^{10}-smooth. I think this (perhaps with a dash of ABC) shows that all numbers here have a huge number of divisors – \exp(k / \log k) or so. So perhaps we can show that the divisor function \tau cannot be so huge on this interval?

      Comment by Terence Tao — August 15, 2009 @ 3:10 pm | Reply

  10. This might already have been discussed… Let a_1 = 2, and a_{j+1} be the largest prime factor of \prod_{i=1}^j a_i. Since we have a factoring oracle, this is a deterministic way of generating primes, however, what is the best known lower bound on $\latex a_j$? It increases quite rapidly but it is not guaranteed that a_{j+1} > a_j. For example, the first elements of this sentence are 2, 3, 5, 7, 11, 13, 509, 6211, 358711, 2010733, 1478047, 5096818106023799, 13427486454730774291, 2200195137345942520234685042012598697858953613.

    Comment by Anonymous — August 16, 2009 @ 1:41 am | Reply

    • I think the lower bound is something like 53. It will not go through all values there is a proof that some lower values are forbidden and it is not monotonic but I don’t think there are any nice lower bounds.

      Comment by Kristal Cantwell — August 16, 2009 @ 3:25 am | Reply

  11. […] current discussion thread for this project is here. Michael Nielsen has a wiki page that summarizes the known results and current ideas. Finally, […]

    Pingback by Deterministic way of finding primes « Who groks in beauty? — August 16, 2009 @ 1:51 am | Reply

  12. I think that one can compute the parity of \pi(x) using only \sqrt{x} or so bit operations, but maybe I am mistaken. If I am right, then if one knows that, say, the interval [x,2x] contains an odd number of primes (as given to one by an oracle, perhaps), then one can use a binary search to locate one using only about \sqrt{x} (times some logs) or so bit operations.

    Here is the idea: it is fairly easy to compute

    S := \sum_{n \leq x} \tau(n)

    in time \sqrt{x} or so, using the fact that for n not a square, \tau(n) is twice the number of divisors of n that are less than \sqrt{n}; so, roughly, forgetting the contribution of the squares, the sum is

    2 \sum_{d \leq \sqrt{x}} |\{ d^2 \leq n \leq x\ :\ d | n\}|.

    Consider S modulo 4. An individual \tau(n) will be divisible by 4 if there are at least two primes that each divide n to an odd power, or a single prime that divides n to a power that is 3 mod 4. So, the only n‘s for which \tau(n) is not divisible by 4 are the numbers of the form p^{4t+1} y^2, p is prime not dividing y, along with the squares; and, we have that \tau(p^{4t+1} y^2) \equiv 2 \pmod{4}. This means that

    S\ \equiv\ 2 |\{p^{4t+1} y^2 \leq x\ :\ (p,y)=1,\ p\ {\rm prime}\}| + \sum_{n \leq \sqrt{x}} \tau(n^2) \pmod{4}.

    Note that this last sum can be computed using only about x^{1/4} or so bit operations; and so, using only this many bit operations we can easily compute the parity of

    S'\ :=\ |\{ p^{4t+1} y^2 \leq x\ :\ (p,y)=1,\ p\ {\rm prime}\}|.

    Taking into account the fact that there are at most O(x^{1/5}) numbers of the form p^{4t+1} \leq x, t \geq 1, it is easy to see that we can just as easily compute the parity of

    S''\ :=\ |\{py^2 \leq x\ :\ p\ {\rm prime}\}|\ =\ \pi(x) + \sum_{n \geq 2} \pi(x/n^2)

    using only \sqrt{x} or so bit operations. But in fact we can just as quickly evaluate the parity of the sum

    S'''\ :=\ \pi(x) + \sum_{y > 1 \atop (y, k!)=1} \pi(x/y^2),

    where k > 0 will be chosen to depend on a parameter \epsilon > 0 below. The reason we can as quickly evaluate this sum (well, up to a factor k^{O(k)} or so) is that we just need to restrict our initial sum over \tau to those n \leq x that are coprime to k!.

    Now let T(x) denote the time required to compute \pi(x) modulo 2. Then clearly we have that

    T(x)\ =\ \sqrt{x} (k \log x)^{O(k)} + \sum_{y > 1,\ {\rm gcd}(y,k!)=1} T(x/y^2),

    since to compute the parity of \pi(x) we just need to compute S''' mod 2, and then the terms \pi(x/y^2) mod 2.

    If we have that T(z) < z^{1/2 + \epsilon}, for z_0 < z < x/k^2, say, then we deduce that

    T(x)\  <\ \sqrt{x} (k \log x)^{O(k)} + \sum_{y > k} T(x/y^2)\  <\ \sqrt{x} (\log x)^{O(1)} + x^{1/2+\epsilon} \sum_{y > k} 1/y^{1+2\epsilon}.

    For k big enough in terms of \epsilon this should be smaller than x^{1/2+\epsilon}. So it seems to me that we have a quick algorithm for computing the parity of \pi(x), assuming I am not overlooking something obvious. I feel like I must be overlooking something — it can’t be this simple, can it?

    Assuming this is right, maybe a similar method will allow us to compute \pi(x) mod 3, mod 5, …, just as easily, and therefore \pi(x) itself via the Chinese Remainder Theorem.

    Comment by Ernie Croot — August 16, 2009 @ 2:15 pm | Reply

    • One issue that may be a concern is that a given integer n could have multiple representations of the form p^{t+1} y^2, so one would have to take this into account.

      A related observation: if f(n) and g(n) are arithmetic functions whose partial sums can be computed in \sqrt{n} polylog(n) time, then the Dirichlet convolution f*g can also be computed in \sqrt{n} polylog(n) time by the Gauss hyperbola method. (This is implicit in your first step, as \tau is the Dirichlet convolution of 1 with 1.) Unfortunately, there is a big difference between the “smooth” arithmetic functions (whose Dirichlet series have no poles in the critical strip) and the “rough” ones (where there are plenty of poles). One can’t get from the former to the latter by Dirichlet convolution. 1 and \tau are examples of the former, while \Lambda and \mu are examples of the latter. Jumping from the former to the latter seems related to me to the infamous “parity problem”…

      Comment by Terence Tao — August 16, 2009 @ 3:20 pm | Reply

      • I understand what you are saying, and in fact I had the same reservations (knowing that functions like tau behave rather differently from mu, say). But I don’t see where the problem is. Following your comment, I thought maybe it was in going from S’ to S” (where the coprimeness condition is dropped), but that can’t be where the problem lies because from the fact that y < \sqrt{x} we only have to worry about primes smaller than \sqrt{x} in computing S''. Of course we also have that a number of the form p y^2 has a unique such representation, so that problem is avoided too; as is the problem of uniqueness of representation of p^{4t+1} y^2, given that (p,y) = 1.

        Assuming that the algorithm is correct (and I have tried hard to see where it could be wrong, and still feel that there has to be an error somewhere), one thing that *maybe* takes care of the parity concerns is the fact that the algorithm is useless as far as locating primes, unless one is given that there are an odd number of primes in some long interval to begin with — i.e. unless one is given some initial non-trivial parity information about the prime counting function on some interval.

        Comment by Ernie Croot — August 17, 2009 @ 1:59 am | Reply

        • I haven’t found a problem in first reading myself; philosophically, I don’t see a problem with the parity of \pi(x) being computable in \sqrt{x} time (I’m thinking by analogy with the class numbers of quadratic fields, which are very mysterious and seem unpredictable, except that their parity is known — since Gauss — in terms of the factorization of the discriminant of the field).

          Comment by Emmanuel Kowalski — August 17, 2009 @ 2:17 am | Reply

          • Let me also point out something else: Selberg’s original formulation of the “parity problem” does not apply here. In that formulation he said that you could take a set of integers having an even number of prime factors, satisfy all the sieve axioms, and yet obviously the set has no primes. But in the algorithm I described we are working mod 2 — mod 2 information is so fine that the “main term” parity objection of Selberg simply doesn’t apply. Furthermore, information about sums of arithmetical functions mod small primes also should escape parity-type objections. This perhaps gives one a ray of hope of escaping the usual “analytic obstructions” to determining \pi(x) quickly — perhaps there is a mod 3, mod 5, … version of the argument (again, assuming it is correct) that will allow us to determine \pi(x) quickly via CRT (in other words, `arithmetical [i.e. algebraic] methods’ give us a way out of `analytic obstructions’).

            Comment by Ernie Croot — August 17, 2009 @ 3:51 am | Reply

        • OK, I went through it and it looks correct to me – though I haven’t seen anything like it before, so I’m still getting used to it. Emmanuel’s comment is reassuring, though.

          I guess the next thing is to look at what tau(n) mod 8 tells us… some combination of the parity of the semiprime counting function and pi(n) mod 4? Unfortunately the hyperbola method doesn’t seem to let us quickly compute the former given the parity of pi(n), but I didn’t check this too carefully.

          If one wants mod 3 information, I think one may have to work with the second divisor function \tau_2(n) since one will want \tau_2(p)=3

          Comment by Terence Tao — August 17, 2009 @ 3:57 am | Reply

          • Hmm, I can’t figure out how to compute \sum_{n<x} \tau_2(n) = \sum_{a,b,c: abc \leq x} 1 in time x^{1/2}; the best I can do is x^{2/3} or so. (My earlier comment about how the Dirichlet convolution of two quantities whose partial sums were square-root-computable also had partial sums that were square-root computable was incorrect.) That's a shame, because \sum_{n<x} \tau_2(n) \hbox{ mod } 9 does seem to be encoding something about \pi(n) \hbox{ mod } 3, modulo simpler terms…

            Comment by Terence Tao — August 17, 2009 @ 4:38 am | Reply

            • Here’s a suggestion for how to do this, and perhaps all the higher divisor sums as well, though I am not sure it will work: fix an integer N. Then, since \tau(N) = N^{o(1)} we also get \tau_d(N) = x^{o(1)}, for any d \geq 1. This *suggests* to me that the lattice points in the region xyz \leq N; x,y,z > 0, can be partitioned into those in at most N^{o(1)} integer simplices (well, certainly if \tau_d(N) were at least N^{1/2+\epsilon} or so we would have no chance at all with this type of geometric approach); and if so we could use an off-the-shelf algorithm to count lattice points in simplices (e.g. I think Jesus de Loera has some algorithms to do this; and I think also Barvinok has a nice analytic algorithm). Of course, one cannot just connect up the lattice points along the boundary xyz = N to form some kind of approximate triangulation of the region, because our region xyz \leq N; x,y,z > 0 is highly non-convex, but maybe there is some kind of nice simplicial complex that one can find easily.

              Comment by Ernie Croot — August 17, 2009 @ 1:29 pm | Reply

              • This sounds promising. I did some further computations and it does indeed seem that if one can compute \sum_{n \leq x} \tau_k(n) \hbox{ mod } (k+1)^2 in O(x^{1/2+o(1)}) time then one should be able to compute \pi(n) \hbox{ mod } k+1 in O_k(x^{1/2+o(1)}) time, because the latter is essentially the contribution of the square-free numbers to the former, and the numbers which have a square factor can be recursively removed (perhaps with a dash of Mobius inversion) within the square root time budget. We need to do this for k out to about \log n or so in order to get the chinese remainder theorem to detect changes in \pi(n) deterministically (although probabilistically one could do this with far fewer moduli), so I guess we need to keep the dependences on k to be subexponential, but perhaps we should defer that problem until we can at least solve the k=2 issue.

                I had two thoughts that may be relevant. Firstly, if one assumes access to x^{1/2+o(1)} memory as well as x^{1/2+o(1)} time, then one can precompute just about any arithmetical function, say \tau_k(n), for n \leq x^{1/2+o(1)}, and then by an FFT one can also compute the partial sums \sum_{n \leq y} \tau_k(n) for y \leq x^{1/2+o(1)}. I don’t immediately see how this could be useful but it may come in handy later.

                Secondly, it may be that k=2 is in fact the worst case; as k increases one may be able to approach x^{1/2} again. My intuition here is that when working out a sum \sum_{a_1 \ldots a_{k+1} \leq x} 1, one should (usually) be able to find a subset of the a_1 \ldots a_{k+1} whose product is relatively close to x^{1/2}, and the degree of approximation should get better as k increases. Unfortunately I don’t know how to exploit this fact to get an efficient algorithm, and also there are the tail cases when a few of the a_j dominate in which one doesn’t get anywhere close to x^{1/2}. (For this least-significant-digit analysis, one cannot afford to throw away any error terms unless they are divisible by the modulus of interest.)

                Comment by Terence Tao — August 17, 2009 @ 3:44 pm | Reply

                • I think I figured out how to do it, following a nice lunch: look at the lattice point counting problem backwards — in other words, instead of trying to count the number of lattice points satisfying xyz \leq N, count the number xyz \geq N; x,y,z \leq N. Unlike the region xyz \leq N, this one is convex; and so, one can easily form a polytope where the vertices on xyz = N form a boundary.

                  I think the same should work in higher dimensions, after some modification. The point is that I think we no longer have convexity when we go to higher dimensions, but we can decompose the polytope into a small number of convex pieces.

                  Comment by Ernie Croot — August 17, 2009 @ 4:09 pm | Reply

                  • This is a neat idea, but there is one possible problem I see – perhaps there are lattice points with xyz < N that are not within the convex hull of the lattice points with xyz=N? Consider for instance the case when N is prime. (Of course, in this case we are done immediately with our finding primes problem, but still…)

                    Comment by Terence Tao — August 17, 2009 @ 4:32 pm | Reply

                    • Ok, that was silly of me. Perhaps, though, there is a way to find the other points on the convex hull, by considering solutions to xyz = N' for N' near to N. Actually, I guess if N has a small number of prime factors then this won’t do, because the set of N' that one would have to look through is quite large.

                      When I get some free time I will think about this some more…

                      Comment by Ernie Croot — August 17, 2009 @ 4:47 pm

                • This reminds me of results in the thesis of Fouvry which roughly state that if one understands (in the sieve sense) a certain number of \tau_k-functions, then one can get information beyond the Bombieri-Vinogradov theorem concerning the primes. Unfortunately, I can’t find this mentioned in the MathSciNet reviews of his early papers, and possibly that part of the thesis was not published. I will try to have a look at the original when I can.

                  Analytically, the \tau_k functions become worse and worse when k increases (this is related to their Dirichlet series having “degree” k). In particular, the error term in the summation formula becomes worse with k, unless one uses something like the Lindelöf Hypothesis.

                  Comment by Emmanuel Kowalski — August 17, 2009 @ 4:19 pm | Reply

    • Related to this approach, there is a nice identity of Linnik. Let \Lambda(n) be the van Mangoldt function and t_{j}(n) the number of representations of n as ordered products of integers greater than 1, then \Lambda(n) = \ln(n) \sum_{j=1}^{\infty} \frac{(-1)^{j-1}}{j} t_{j}(n). The sum is rather short since t_{j}(n)=0 for j larger than about \ln(n). Note that the function t_{j}(n) is related to \tau_{k}(n) by the relation t_j(n) = \sum_{k=0}^{j}(-1)^{j-k} {j \choose k} \tau_{k}(n). Again, t_{2}(n) is computable in n^{1/2} steps, however t_{j}(n), for larger j, appears more complicated. Curiously this is a fundamental ingredient in the Friedlander and Iwaniec work.

      Comment by Mark Lewko — August 17, 2009 @ 6:32 pm | Reply

      • Three quick comments:

        (1) I was slightly confused when I wrote the above comment. However, I think the observation is still useful, since \sum_{n\leq x}\Lambda(n)=\sum_{n\leq x}\ln(n)\sum_{j=1}^{\infty} \frac{(-1)^{j-1}}{j} t_{j}(n). So to find \sum_{n\leq x} \Lambda(n) we only need to know \sum_{n\leq x} t_j(n) for j\leq\ln(n). Of course we could use the Chinese remainder theorem idea instead here.

        2) I don’t seem to be able to compute D_{k}(x) := \sum_{n\leq x}\tau_{k}(n) in less than x^{1-1/k} steps. To get this let D_{k}(x) = \sum_{ab\leq x} \tau_{k-1}(a) = \sum_{a\leq x^{1/k}} \sum_{b\leq x/a} \tau_{k-1}(b) + \sum_{x^{1/k}<b \leq x} \sum_{a\leq x/b} \tau_{k-1}(a)= \sum_{a \leq x^{1/k}}\sum_{b\leq x/a}\tau_{k-1}(b)+\sum_{a\leq x^{1-1/k}}\tau_{k-1}(a)\sum_{x^{1/k}\leq b\leq x/a}1. So, at least for this approach, we essentially need to compute \tau_{k-1}(n) for n \leq x^{1-1/k}. This agrees with Tao's computation since my k=3 is his k=2. This is reminiscent of the trivial divisor problem estimates, which brings me to

        (3) It is interesting though that we have reduced the computation of \sum_{n<x} \Lambda(n) to the computation of D_{k}(n) for k\leq \ln(n). At this point, one might forget about computing the values D_{k}(n) and plug in estimates for D_{k}(n). These estimates are of the form D_{k}(x) = x P_{k}(\ln(x)) + \Delta_{k}(x), for explicit polynomials P_{k}. In our situation we need \Delta_{k}(x) \leq c_k x^{\alpha_{k}} for \alpha_{k} \leq1-\delta for all k. As far as I can find there aren't known estimates of this form. However, Titchmarsh conjectures that \alpha_{k} = \frac{1}{2} - 1/2k.

        More specifically we have that \sum_{n\leq x} t_j(n)=\sum_{k=0}^{j} (-1)^{j-k}{j\choose k}\sum_{n\leq x}\tau_{k}(n)=\sum_{k=0}^{j}(-1)^{j-k} {j\choose k}D_{k}(x). Assuming the binomial coefficients don't do too much damage, this should give us an estimate for \sum_{n<x} \Lambda(n) with an error term O(x^{1/2+\epsilon}). So this would get us down to O(10^{k})^{.5+\epsilon} on Titchmarsh's conjecture. Moreover, if the main term worked out to be x (and, again, the binomial coefficients aren't a problem), I think this would show that Titchmarsh's conjecture implies RH. (If I were to guess, the problem with this will be that we need cancellation in the binomial sum.)

        Comment by Mark Lewko — August 18, 2009 @ 12:26 am | Reply

        • 2 corrections and a comment:

          1) The mangled calculation in the second paragraph should be: D_{k}(x) = \sum_{ab\leq x} \tau_{k-1}(a) = \sum_{a\leq x^{1/k}} \sum_{b\leq x/a} \tau_{k-1}(b) + \sum_{x^{1/k}<b \leq x} \sum_{a\leq x/b} \tau_{k-1}(a)= \sum_{a \leq x^{1/k}}\sum_{b\leq x/a}\tau_{k-1}(b)+\sum_{a\leq x^{1-1/k}}\tau_{k-1}(a)\sum_{x^{1/k}\leq b\leq x/a}1 [Corrected.]

          2) The x/\ln(x) in the last paragraph should just be x. [Corrected.]

          On reflection, the last paragraph of the previous post is absurd for any number of reasons. Most notably, Titchmarsh's Theory of the Zeta Function gives that (in the notation above) that \alpha_{k} \leq 1/2 is equivalent to the Lindelof hypothesis. So my casual heuristic at the end of the post could be reworded as "the Lindelof Hypothesis implies RH". In fact the binomial coefficients cause all sorts of problems. Sterling's formula shows that the binomial coefficients get as large as n/\sqrt{\ln_{2}(n)} which is near the main term, before multiplying by the error terms \Delta_{k}(n)).

          On take-a-way from this is that while \sum_{n\leq x} \Lambda(n) is determind from the values D_{k}(n)'s, the relation is very subtle. Specifically, it seems nearly impossible to turn a good estimes for the D_{k}(n)'s into a good estimate for \sum_{n\leq x} \Lambda(n).

          Comment by Mark Lewko — August 18, 2009 @ 7:42 am | Reply

  13. I have a question about the background. Are the conjectures on the Poisson’s distribution of primes in small intervals, (and the distribution of gaps between primes; and the Hardy Littelwood conjectures) related (or equivalent) to conjectures regarding gaps between zeroes of the zeta function? (I remember hearing a lot about the later question and the relation with random matrix theory but the actual implication to primes is not fresh in my memory (and probably never was)).

    In general, (since we are willing to take standard conjectures regarding the zeta function for granted) do we expect properties of Riemann zeta function to have baring on behavior of primes in short (polylog) intervals?

    Comment by Gil — August 17, 2009 @ 7:10 am | Reply

    • Yes, distributional information about primes (and almost primes) in short intervals is more or less equivalent to distributional information about zeroes. For instance, in this paper of Goldston and Montgomery,

      they show (using RH) that the pair correlation conjecture about zeroes is equivalent to the second moment of primes in short intervals matching the Poisson prediction.

      There is a later paper by Bogolmony and Keating,

      which shows (heuristically, at least) that a sufficiently strong version of the prime tuples conjecture implies the full GUE conjecture, though I am not sure exactly what they do with the error terms in the prime tuples conjecture.

      I discuss these sort of issues a little in my blog post,

      Of course, all of these facts only give average-case information about primes in short intervals rather than worst-case, so are subject to the razor.

      Comment by Terence Tao — August 17, 2009 @ 3:55 pm | Reply

      • I have three general remarks: If, for example, we can show how to find a large prime in a short interval of integers but we need to start the interval at a random place; or if we can show using a factoring oracle that in such a short interval on average a large prime is present (maybe this is obvious), such results are also interesting “derandomization” results. It is true that they do not save us many random bits and that maybe some general principles (e.g. Nisan’s pseudorandom generator) can save us even more random bits but still it is an interesting form of partial derandomziation. (Derandomization can have the following form: construct a hybrid algorithm of a specific type where one part is deterministic.)

        2. The various razors gives us “primes” with unusual (maybe even “statistically unusual”) properties. So it is not impossible that we will be able to come up with plausible conjectures for some anti razor razor.

        3.Regarding “Strategy D: Find special sequences of integers that are known to have special types of prime factors, or are known to have unusually high densities of primes.”

        I think we should try to come up with some conjectures in the direction that sequences a(n) described by low complexity means cannot have unusually high density of primes or unusually small gaps between primes. And maybe showing negative results will even be slightly easier than positive results. Maybe some negative result of this kind can be proved (or are already known) for sequences based on polynomials. It would be nice to include in such conjectures all the sequences that people (naively) suggested.

        We do not want to allow that a(n) be anything computable in time poly(log (n)) because this will allow (more or less) to let a(n) be the nth prime. But anything we can imagine that still allow a(n)=n^7+n^3-5 or a(n)=2^n+1 should be ok.

        Is there even a single sequence with higher density of primes known? (even based on some standard conjectures?)

        Comment by Gil Kalai — August 17, 2009 @ 5:00 pm | Reply

        • I was about to say that I was not sure that I understood your remark about the nth prime being computable in time poly(log(n)). But now I think I do. We could define a(n) to be the smallest prime greater than {}n if there is a prime between {}n and n+(\log n)^3, and {}n otherwise. That will give us far too many primes. Is that the kind of thing you meant?

          I was also about to suggest low arithmetic complexity of some kind. For example, we could regard the arithmetic complexity of n^7+n^3-5 as being at most 8 because one can write it as (nnnn+1)nnn-5. Here we are allowed to call up {}n for free. But to get 2^n takes roughly \log n operations, by repeated squaring, so it doesn’t help. On the other hand, it is much simpler to get 2^n from 2^{n-1} than it is to get the nth prime from the previous one. Perhaps that could be used in some way. I.e., one asks for bounded arithmetic complexity, but one can call on the previous value, or perhaps even all previous values (provided we can specify them sufficiently simply, or else one might be able to do some silly encodings), of the function.

          Comment by gowers — August 17, 2009 @ 5:20 pm | Reply

        • I was carried away a little, but i want to “prevent” that when a(n) is computed its value is guaranteed to be prime via primality testing algorithm or something like that.

          Comment by Gil — August 17, 2009 @ 9:29 pm | Reply

        • Sorry, my first general remark looks meaningless to me (it is obvious that there are good intervals just from the density). Update: I managed to understand what I meant. Statistical conjectures regarding gaps between primes (which come short of Cramer’s conjecture) allow to reduce the number of random intervals you must look (so the number of random bits when you restrict yourself to this argorithm). Similarly, statistical properties of the number of non-S-smooth numbers in large intervals, (stronger than just what you get from the density of these numbers) will also lead to “saving” some random bits when you insist on an algorithm that looks for large prime factor in large intervals and what very high probabilities for success. Anyway (even without this dubious justification), it seems that the problem of showing that the probability for non-S-smoth numbers in large intervals is very small (rather than 0) is an interesting variation for Gowers’ original suggestion and it is not affected by our razor.

          Comment by Gil — August 18, 2009 @ 5:59 pm | Reply

      • If the result we are trying to prove and we have all constructable numbers with k digits have prime free gaps g of size say k^100 then could the distribution of times still be normal or have tail that is as thin as the normal distribution?

        Comment by Kristal Cantwell — August 18, 2009 @ 4:05 pm | Reply

        • Unfortunately, the answer seems to be yes: the set of constructible numbers is only poly(k) in size, and is thus extremely sparse compared against the set of all k-digit numbers, which is exponential in size. One could have all sorts of pathological behaviour near constructible numbers and still have a perfectly ordinary distributional law on all numbers. (This is closely related to the discussion on “generic primes” on the wiki, and on the “razor” on this thread.)

          Comment by Terence Tao — August 18, 2009 @ 4:45 pm | Reply

  14. Let me make a general comment about Cramer’s conjecture, though maybe it has been made before (it’s hard to keep up with all the postings): a much weaker conjecture than Cramer’s, that also would allow us to find a prime number quickly, is the assumption that given x, there exists a quickly determinable set A of size at most (\log x)^{O(1)} such that \{1,2,...,x\} \subseteq A + P(x), where P(x) denotes the set of primes up to x. The reason this conjecture is nicer than Cramer’s, as far as determining primes quickly, is that it includes under its purview many other types of additive results (Cramer can be interpreted as an additive result of sorts). For example, there is a nice paper by Granville et al on primes plus powers-of-two; and there is a beautiful paper of Christian Elsholtz on something called “Ostmann’s Conjecture” that has some “sieve boot-strapping” ideas that might be useful.


    I think I will post a lot less frequently to the polymath project, as classes have started back here at Georgia Tech… it’s a pity this project wasn’t started earlier in the summer (I may post something later this week, though, on how to use the Szemeredi-Trotter theorem to locate primes — there are some ideas in an unpublished note I sent to someone once that I think might allow one to do this, but I need to think it over first to see if it is silly).

    Comment by Ernie Croot — August 17, 2009 @ 1:58 pm | Reply

    • Well, one of the great things about this type of collaboration is that (in principle) one can contribute as frequently or as rarely as one wishes… though catching up is indeed a major problem. I should work on the wiki again to try to incorporate the recent progress (including several of your neat ideas)…

      One problem with the A+P approach is that it is still subject to the razor; one could delete all the elements of x-A from P and end up with a set of redacted primes which still looks the same as the set of all primes for the purposes of additive combinatorics or the circle method, but for which A+P now omits a single point x. This is basically why results about sums A+B of two dense sets generally have to allow for a small number of exceptions, whereas sums of three dense sets don’t.

      But if there is some way to avoid the razor, there may well be a chance here…

      Comment by Terence Tao — August 17, 2009 @ 4:26 pm | Reply

  15. I’d like to suggest a pseudorandom generator approach to the problem
    that does not hit the wall of proving strong circuit lower bounds.

    We want a function G : \{0,1\}^{\ell} \rightarrow \{0,1\}^n
    such that |\mathrm{Pr}_X[PRIME(X)] - \mathrm{Pr}_Y[PRIME(G(Y))]| \leq 1/n^2 (say), where X \in_U \{0,1\}^n and Y \in_U \{0,1\}^\ell. If G(y) is
    computable in time t(\ell) then we can find an {n}-bit
    prime in time 2^\ell t(\ell) by going through all seeds. This
    is polynomial if \ell = O(\log n) and t(\ell) = 2^{O(\ell)}.

    The main idea of the approach is to take a Nisan-Wigderson-looking
    generator G_f based on a well-understood Boolean function
    f : \{0,1\}^d \rightarrow \{0,1\} (instead of a hard Boolean
    function as would traditionally be done) and argue that a primality
    test is not effective to distinguish the output of this generator from
    the uniform distribution. Intuitively, if {f} is so simple
    that it has very little to do with primality, it is hard to expect
    that a primality test can be used to break the generator. Let us try
    to make this idea precise.

    Here is the Nisan-Wigderson generator construction. Let f : \{0,1\}^d \rightarrow \{0,1\} be a Boolean function. Let S = (S_1,\ldots,S_n) be an (\ell, d, q)-design, which means that
    each S_i is a subset of \{1,\ldots,\ell\} of
    cardinality exactly {d} and every two distinct S_i
    intersect in at most {q} places. The Nisan-Wigderson generator
    G_{f,S} : \{0,1\}^{\ell} \rightarrow \{0,1\}^n is defined as
    follows: G_{f,S}(y) = f(y|_{S_1}) \cdots f(y|_{S_n}), where
    y|_{S_i} denotes the restriction of {y} to the bits in
    positions S_i. Think of \ell as c \log n, of
    {d} as c' \log n for a c' \le c, and of {q} as c'' \log n for a c'' \le c'. Such designs

    The analysis of the Nisan-Wigderson generator shows that if {D} is a distinguisher in the sense that \mathrm{Pr}_X[D(X)] - \mathrm{Pr}_Y[D(G(Y))] \ge 1/n^2, then the following procedure
    predicts f(z) significantly better than guessing:

    Procedure A(z): given {z}, choose i \in_U \{1,ldots,n\}, choose r = (r_i,\ldots,r_n) \in_U \{0,1\}^{n-i+1}, choose s \in_U \{0,1\}^{[\ell]-S_i}, let
    {y} be the result of inserting {z} in {s} in
    the places dictated by {S_i} (leaving a string in \{0,1\}^{[\ell]} as a result), let x = (f(y|_{S_1}),\ldots, f(y|_{S_{i-1}}), r_i,\ldots,r_n), compute D(x), if the
    outcome is {1} return r_i, otherwise return 1-r_i.

    The precise result says that if \mathrm{Pr}_X[D(X)] - \mathrm{Pr}_Y[D(G(Y))] \ge 1/n^2, then \mathrm{Pr}_{z,i,r,s}[A(z) = f(z)] \ge 1/2 + 1/n^3 (note the loss of
    a 1/n factor from 1/n^2 to 1/n^3). This is
    proved by the standard hybrid argument.

    Now let us interpret the meaning of this when {D} is a
    primality test. The algorithm A(z) predicts f(z) by
    checking whether an {n}-bit number {x} that is
    composed of i-1 somewhat complicated bits (the first part of
    {x}) followed by n-i+1 random bits (the second part of
    {x}) is prime (or non-prime), and returning the {i}-th
    bit of {x}. This is a uniformly chosen number in an interval
    of the form [k 2^{n-i+1},k 2^{n-i+2}), where {i} is
    chosen uniformly at random in \{1,\ldots,n\} and {k}
    follows a somewhat complicated distribution depending on the function
    {f} and the design {S}. Now, if {f} and
    {S} are chosen sufficiently simple (instead of complicated as
    the hardness-vs-randomness paradigm suggests), there is a chance that
    one can analyse the distribution of {k} and show that the
    marginal distribution on the {i}-th bit of a random {x} chosen uniformly in the random interval [k 2^{n+i-1}, (k+1) 2^{n+i-1}) is almost independent of whether {x} is prime or not. If {f} is a balanced function so that \mathrm{Pr}_z[f(z) = 1] = 1/2, this would contradict the fact that
    A(z) guesses f(z) significantly better than guessing.

    There are tons of simple {f}‘s one can look at, starting with
    f(z) = parity(z) or even f(z) = firstbit(z), and
    similarly for {S} (which by the way, need not be strictly a
    design in the sense of having small intersections since that is not
    used in the analysis above).

    Comment by Albert Atserias — August 17, 2009 @ 2:59 pm | Reply

    • I like this idea! Doesn’t seem very easy though…

      Considering that G(y) must be even half the time, we see that f must take values 0,1 equally often. Not difficult.

      Considering G(y) modulo 3, we see that the probability that

      \sum_{i=1}^n (-1)^{i-1} f(y|_{S_i}) \equiv 0 \pmod{3}

      must be close to 1/3.

      Considering G(y) modulo 2^k-1, we get similar restrictions relating the sums

      F_{a,k}(y) = \sum_{i \equiv a \pmod{k}} f(y|_{S_i}

      for $a = 0,\dots,k-1$.

      Assuming some (perhaps strong) facts regarding the equidistribution of primes in arithmetic progressions, these are potentially the only restrictions on f,S we need to worry about.

      Choosing S very well and f simple enough, we may be able to control the sums F_{a,k}(y) with small perturbations of y and get the right probabilities?

      Comment by François Dorais — August 17, 2009 @ 5:13 pm | Reply

    • Assuming GRH, a distinguishing pair S,f should give a roughly \widetilde{O}(2^{n/2}) algorithm for producing an approximately n-bit prime. Basically, equidistribution of primes modulo 2^i prevents small choices of i in procedure A from distinguishing since the predictor used is too close to random guessing. For large choices of i, the intervals [k2^{n-i+1},(k+1)2^{n-i+1}) are short enough to scan until we find primes to explain the discrepancy. Since G only takes O(n^c) values modulo 2^i we can afford to check them all. (Of course, if we miraculously pick a non-distinguishing pair S,f then we have a polytime algorithm!)

      Maybe there are smart choices of S,f that could eliminate the GRH assumption?

      Are you sure S being a block design is not necessary?

      Comment by François Dorais — August 17, 2009 @ 8:11 pm | Reply

      • About the no-need of {S} to be a design. The small intersection property of {S} is only needed by NW when they want to turn the procedure A(z) into a small circuit that predicts {f} without using {f} itself: for fixed {i, r, s} that achieve the bias of A(z), each f(y|_{S_j}) with j \leq i-1 depends only on c'' \log n bits of the input {z} and therefore we can plug a circuit of size n^{c''} in place of f(y|_{S_j}). If c'' is small enough with respect to c', this leaves a small circuit. But here we do not care getting a circuit so we leave the f(y|_{S_j})‘s. Small intersection is not used anywhere else in the analysis of NW.

        About your win-win argument. From what I understand here, why O(2^{n/2})? Let us say we divide by cases i \leq 1+4\log_2 n and i \geq 2+4\log_2 n. The first case is ruled out by equidistribution (?) of primes modulo 2^i as the predictor gets success roughly 1/2 + 1/n^4. The second case leaves an interval [k 2^{n-i+1}, (k+1) 2^{n-i+1}) of length 4 n^4 which we can check exhaustively to explain the discrepancy (and hence find primes). But is this what you have in mind? I guess the question is rather: what is the exact result on equidistribution of primes that you plan to use? We might want to choose {f} and {S} so that {k} is rather uniformly distributed.

        Comment by Albert Atserias — August 17, 2009 @ 10:04 pm | Reply

        • What I wrote was not very clear. I think the main issue is that I am reading the numbers backwards, which is rather confusing, to say the least! When I read G_{f,S}(y) = f(y|_{S_1}) \cdots f(y|_{S_n}), I think of f(y_{S_1}) as the least significant bit, so your intervals are arithmetic progressions in my universe. To clean things up, I’ll redefine

          G_{f,S}(y) = 1 + 2f(y|_{S_1}) + \cdots + 2^nf(y|_{S_n})

          so the we avoid the pesky even/odd issue too. So now the prediction algorithm can be restated as follows.

          Procedure A: Given {z}. First, pick i \in_U \{1,ldots,n\}. Then pick y \in \{0,1\}^\ell such that y|_{S_i} = z, uniformly at random. Finally, pick x \equiv G_{f,s} \pmod{2^i}, 0 \leq x \leq 2^n-1, uniformly at random. If x is prime return x_i otherwise return 1-x_i (where x_i = \lfloor x/2^i \rfloor \mathrm{mod} 2).

          Equidistribution of primes then means that for fixed i the sets

          P(t,2^i) = \{ p < 2^n : p \equiv t \pmod{2^i}, p prime\}

          all have roughly equal size for odd t. Under GRH, the error is something like O(n^2 2^{n/2}). So when the algorithm picks i small, it is pretty much randomly guessing. For large i, the sizes of the P(t,2^{i+1}) can vary wildly, but the arithmetic progression \{ x < 2^n : x \equiv t \pmod{2^i} \} has size 2^{n-i}.

          So the prime finding algorithm runs like this: Pick suitable f, S. Compute the numbers G_{f,S}(z) as above for z \in \{0,1\}^\ell (note that 2^\ell = O(n^c)). If one of these numbers is prime, then bingo! Otherwise, systematically perturb the few most significant bits of the numbers just computed until a prime is found.

          My earlier scribbles suggest that you under GRH only need to perturb the top half bits of each number, but that may have been slightly optimistic.

          Comment by François Dorais — August 17, 2009 @ 11:50 pm | Reply

          • Ah! I made a silly mistake in my scribbles – I didn’t consider the dependency on the modulus in the GRH estimate. This algorithm is probably a whole lot slower than O(2^{n/2}) because of that! I think I’ll let number theorists figure out whether this algorithm is worthwhile…

            Comment by François Dorais — August 18, 2009 @ 1:42 am | Reply

    • I posted some details and discussions around this idea here. The “win-win” twist that I described earlier does indeed give a deterministic \widetilde{O}(2^{n/2}) algorithm for generating n-bit primes under GRH. In fact, Albert’s original idea (using intervals instead of arithmetic progressions) with the “win-win” twist also gives an \widetilde{O}(2^{n/2}) algorithm assuming RH. Further analysis is needed to see if the RH/GRH assumptions can be eliminated with a suitable choice of f,S.

      As a “proof of concept” experiment, I implemented the algorithm (without the win-win twist) to generate 1000-bit primes using only 121 random bits. (Specifically, I used the parity function together with the block design generated by cubic polynomials over \mathbb{F}_{11} as described in the Nisan-Wigderson paper.) I didn’t try all 2^{121} possible seeds, but I picked three random samples of 1000 seeds. The three runs gave me 4, 1, 5 primes. These agree with the expected 2.9 primes. The variance seems high, but I suspect that’s the price to pay for using 121 bits to do the work of 1000. Anyway, this doesn’t prove anything except for the fact that the idea is not fundamentally flawed.

      Comment by François Dorais — August 20, 2009 @ 8:57 am | Reply

  16. Beurling primes and integers

    Tammy Ziegler gave me some very useful links to papers regarding Beurling numbers that contain some results and conjectures about them. Especially relevant is a paper by Jeff Lagarias. This is related to several aspects of our discussion.

    We start with a collection of real numbers called B-primes p_1<p_2<\dots <p_n,\dots and consider the ordered collection of all products, a_1<a_2,\dots. These products are referred to as B-integers.

    For such systems you can talk about the "prime number theorem", "zeta function," "analytic continuation," "RH" and various statement about the gap distributions betweeb the primes (or related gaps between zeroes of the zeta function.) Beurling proved in the 30s the PNT under great generality is such cases.(What is needed is that the number of integers smaller than x is close enough to x.) For the stronger statements there are various counterexamples(Malliavin gave in the early 60s examples that satisfies known nice property of the zeta function and yet violate RH.).

    Here are 2 basic examples for B-numbers

    Example 1: The B-primes are the ordinary primes, the B-integers are the ordinary positive integers.

    Example 2: The B-primes are of two types: a) primes which are 1 (mod 4) each with weight 2. b) Squares of primes which are 3(mod 4).
    (The B-integers are those integers which can be expressed as the sum of 2 squares weighted by the number of such representations)

    We come to Lagarias’s work:

    The B-integers satisfies Delone property if for some r and R, the gaps a_{i+1}-a_i are always bigger than r>0 and smaller than R.

    Lagarias’s Conjecture 1: Beurling numbers with the Delone property satisfy (RH)

    Lagarias’ Conjecture 2: Beurling numbers with the Delone property are obtained from the ordinary primes by deleting a finite number number of primes and adding certain sets of integers.

    Lagarias proves Conjecture 1 Conjecture 2 for the case where the primes are integers and gave a complete characterization of such B-primes. [You mean Conjecture 2, right? – T.] [Yes, thnx]

    If Conjecture 2 is correct then Conjecture 1 is equivalent to the RH. Of course these conjectures seem very strong and looking for counter examples may be a good idea but nevertheless we can also try to consider even stronger conjectures.

    Delone property also seems too strong (even example 2 is excluded) so lets us vaguely state:

    “Strong Lagarias’ Conjecture:” Beurling integers with strong decay of gaps satisfy (RH).

    Problem 1: Find (perhaps in the existing literature) an appealing form of “Strong Lagarias Conjecture”

    Problem 2: What are the relations between the distribution of gaps for the B-integers and the distribution of gaps for the B-primes.

    Problem 3 (in the anti razor razor direction): Will strong decay for the distribution of gaps for B-integers suffice to prove results about gaps of S-smooth numbers?

    Challenge 4: Find stochastic models of Breuling primes for which the Breuling integers satisfies the conjectures on quick decay of gaps.

    In the Delone case, Lagarias’s conjecture conflicts with such stochastic example (which is reasonable), but if we relax the Delong condition and try to immitate very closely the conjectured gap properties of primes maybe stochastic examples with nice properties of primes can be found.

    I suppose that finding stochastic models for primes-like numbers is a big subject and I know very little about it.(I am aware of a book by Arratia, Barbour and Tavare on probabilistic approach to logarithmic combinatorial structures. (but did not read it).)

    Example 3: Consider a random subset of primes, each prime is taken with probability 1/2 and each taken prime is given a weight 2. I do not know if this example (or some other weighted version) satisfies (RH). I remember I was told that without the weights you do not have even analytic continuation but I do not have a link for this result.

    We can formulate our algorithmic problem for general Beurling prime systems. Suppose that we have a cost 1 oracle to access the $n$th B-integer, and to check for B-primality and (possibly) to B-factor, how many steps are needed to find a prime number greater than the nth B-integer.

    There are various papers by Titus Hilberdink and coauthors with all sort of amazing connections to various other areas, and Tammy also mentioned a recent Mertens’ formula by Ricard Ollofson.

    Comment by Gil Kalai — August 18, 2009 @ 1:21 pm | Reply

    • Regarding the “strong Lagarias conjecture” it can be perhaps more narural to try to relate the gaps between the “square free Beurling numbers” and the gaps between the “primes”. I.e. to try to relate to conjectured behavior of gaps between primes (or for zeroes of zeta function) to the conjectured behavior of gaps between squere free numbers (is the later conjectured to be Poisson? are squere freeness like coin flips?) hoping that for some stochatics models the connection be easier. Maybe the relation between gaps of square free numbers and between gaps of primes a standard issue? are there standard conjectures (theorems??) about it?

      Another cute thought is to understand the Beurling “transform” from “primes” to “integers” for continuous measures. (Start with a measure mu supported on [2,infinity] and consider the sum of mu mu*mu mu*mu*mu etc.) From Ollofson’s paper I realize that Beurling himself already studied it but I wonder what is known about it.

      Comment by Gil — August 21, 2009 @ 11:13 am | Reply

  17. […] generate a prime number of at least digits. The post isn’t a contribution to the ongoing research discussion, but rather describes background material I wanted to understand better: a proof of a mathematical […]

    Pingback by Michael Nielsen » Bertrand’s postulate — August 18, 2009 @ 1:53 pm | Reply

  18. I was thinking a little about the toy problem of how to estimate the sum \sum_{abc \leq x} 1 in time x^{1/2+o(1)} rather than x^{2/3+o(1)}. To simplify matters I looked at the model case where a, b are restricted to be comparable to x^{1/3}, say x^{1/3} \leq a,b \leq 2x^{1/3}. Then one is counting the lattice points under the curve c = \frac{x}{ab}. The naive algorithm of summing \lfloor x/ab \rfloor for x^{1/3} \leq a,b \leq 2x^{1/3} is clearly a x^{2/3+o(1)} algorithm.

    It seems to me that it would suffice to understand how to count lattice points under curves c = P(a,b) for x^{1/3} \leq a,b \leq 2x^{1/3}, where P is a polynomial of bounded degree that takes values comparable to x^{1/3} on this square. The point is that by subdividing the a,b square into subsquares of length x^{1/3-\varepsilon} for some \varepsilon > 0, and then Taylor expanding \frac{x}{ab} on each subsquare up to accuracy x^{-10} (say), one can reduce things to the polynomial problem. (The x^{-10} error can be eliminated by noting from the divisor bound that there are only x^{o(1)} lattice points on the curve c=\frac{x}{ab}, and all the other lattice points lie a distance at least x^{-2/3} away (since the denominator ab has size about x^{2/3}), so one just has to manually correct for x^{o(1)} lattice points by checking whether they lie under the Taylor approximant or not.)

    In the special case when the polynomial P decouples as P(a,b) = Q(a)+R(b), it seems like an FFT is a possibility. The idea is to round off Q and R to the nearest integer, and convolve the images of Q and R using an FFT. In principle this only requires x^{1/3+o(1)} time. Unfortunately the rounding occasionally runs into “carry” issues. Since we cannot afford to count even a single lattice point incorrectly in this strategy, one has to fight harder to deal with this issue; one can add more precision but this degrades the run time. Admittedly there is a bit of room between x^{1/3} and x^{1/2} so perhaps this can be salvaged.

    Perhaps rewriting c \leq \frac{x}{ab} as \log c \leq \log x - \log a - \log b may allow FFT methods to be used? Again one is running into the problem though that more precision is needed. (Also, the Fourier transform of the log-integers is close to the zeta function, so we may just be rehashing old Strategy A options here.)

    Comment by Terence Tao — August 18, 2009 @ 8:53 pm | Reply

  19. I had a few thoughts about the finding consecutive squarefree ingegers problem today.

    First an elementary observation. The problem is equivalent to finding a large squarefree value of the quadratic polynomial n^2+n (simply by the relation n^2+n=(n+1)n). (Also I think, modulo some congruence conditions, the problem is also equivalent to finding squarefree solutions to the polynomial n^2-1).

    Secondly, it would be nice to obtain unconditional results in the direction of this problem (without a factoring oracle). Related to this there is a paper of Lenstra (“Miller’s Primality Test”) in which he gives a polynomial-time deterministic test that identifies squarefree numbers (the problem is that it might fail to report a squarefree number). The algorithm is fairly simple: If p^{n-1} = 1 \mod n for every prime p such that p< \ln^2(n) then n is square-free.

    Some possible directions to go with this would be to (1) try to understand the exceptional set of "pseduo-squarefree numbers", and show our current algorithms work using this test (2) try to prove that some variant can always detect squarefree numbers, possibly assuming some number theoretic conjecture (such as GRH), that is find a Miller-Rabin-type test, and (3) try to apply the ideas of AKS to get a polynomial-time deterministic test without error.

    Comment by Mark Lewko — August 19, 2009 @ 12:04 am | Reply

    • Looking at the results of Eric Bach [Explicit bounds for primality testing and related problems, Math. Comp. 55 (1990), 355-380], I think that replacing p < \log(n)^2 by p \leq 3\log(n)^2 in Lenstra’s test turns it into a “Carmichaelity” test under ERH. So I suppose that very few squarefree numbers pass this test.

      Comment by François Dorais — August 19, 2009 @ 12:50 am | Reply

  20. I have a small remark on Croot’s strategy (post #12). Let me briefly summarize the strategy. In his post Croot showed how to compute the parity of \pi(x) in x^{.5} steps. It has also been observed that if we can compute D_{k}(x) = \sum_{n\leq x} \tau_{k}(n) for all k \leq \ln(x) then we can recover \pi(x) exactly. We could then use a binary search to find a prime near x in around x^{.5} steps. The difficulty with this, so far at least, is that we haven’t been able to compute D_{2}(x) in fewer than x^{2/3} steps. The hope is that we could compute all of D_{k}(x) (for k < \ln(x) in time x^{.5} and thus compute \pi(x) in around x^{.5} steps. We could then find a prime a large prime using a binary search, in around x^{.5} steps.

    It seems to me that if we succeed in this strategy we not only will be able to find some k digit prime in around O(10^{k})^{.5} steps, but we will also have have solved a much more general problem: we'll be able to compute the n-th prime in around \sqrt{n} steps. By the prime number theorem we know that the n-th prime is between [0, c n\ln(n)], we'll then should be able to locate it using in a binary search (querying \pi(n) at each stage) in around O(n^{1/2}) steps (ignoring logs).

    I don't have any particular reason to think that this shouldn't be possible. However, the fact that it would solve a much more general problem is cause for thought.

    Comment by Mark Lewko — August 20, 2009 @ 3:43 am | Reply

  21. If we can find a fast algorithm to solve these lattice counting problems, I think we should be able to use it in sympathy with the prime gap results.

    Let’s assume RH. We then have that there is a prime in the interval [x, x+ x^{1/2}] (I’m going to ignore logs throughout this). Now to find this prime using a binary search we could compute \pi(x+h) for various values of 0 \leq h \leq x^{1/2}. However, this is really more information than we need. It would suffice to know \pi(x+h)- \pi(x) for these values of h.

    Using Linnik’s identity, to find \pi(x+h)-\pi(x) (or, more precisely \sum_{x\leq n \leq x+h}\Lambda(n)) all we need to know is \sum_{x \leq n \leq x+h} \tau_{k}(n) for k\leq \ln(x+h). In the k=1 case, this is \sum_{x\leq ab \leq x+h} 1. Now if h=\sqrt{x}, we should be able to compute this in about x^{1/4} steps. (Similarly, I think we should be able to compute the parity of \pi(x+x^{1/2})-\pi(x) in x^{1/4} steps. )

    Now if we can find a way to compute all of the \sum_{x \leq n \leq x+h} \tau_{k}(n) in x^{1/2-\delta} steps (x^{1/2} is the trivial mark here), I think, it will get is below O(10^{k})^{.5}. I think a modification to my calculations above (August 18, 2009 @ 12:26 am) gives that \sum_{x \leq n \leq x+h} \tau_{k}(n) is computable in x^{.5(1-1/k)} steps, but this isn’t bounded away from x^{.5} for large k.

    Similar remarks should hold starting with the Baker-Harmen unconditional prime gap result instead of RH.

    Comment by Mark Lewko — August 21, 2009 @ 8:30 am | Reply

  22. Here are some comments on the “parity of pi” approach (I had a little free time yesterday!):

    1. By generalizing the approach in just the right way, one can develop a completely different way to locate primes, that does not involve the prime counting function. The idea is to work in the polynomial ring F_2[x] instead of just F_2: I think one can compute the prime generating function

    f_N(x)\ :=\ \sum_{p \leq N \atop p\ prime} x^p

    in F_2[x] using the same approach as finding the parity of \pi(x). Of course, because f_N(x) has \sim N/\log N terms, by itself it is useless; HOWEVER, if one is given a polynomial g(x), of degree less than N^\epsilon, say, then I think one should be able to compute f_N(x) \pmod{g(x)} in time only slightly worse than N^{1/2 + \epsilon}. The reason is that the polynomial f_N(x) has a “short description” — it looks something like

    \sum_{d \leq \sqrt{N}} {x^{d n_d} - 1 \over x^d - 1},

    for carefully chosen n_d. Of course that isn’t quite right, because as we know, computing \pi(x) \pmod{2} is done recursively — this is only the first step in the recursion.

    Ok, so to show that there is a prime number in some interval [M,N], all we have to do is present a polynomial g(x) (preferably very low degree) such that $g(x)$ does NOT divide f_{M-1}(x) + f_N(x) in F_2[x]. It seems to me that in order to have a chance of making this method work, one will need an explicit form for the generating function f_N(x), similar to the approximate formula above, except that perhaps some of the coefficients of the (x^{d n_d}-1)/(x^d - 1) are not 1 (maybe they will be binomial coefficients or something, like in Mark Lewko’s nice formulas). Note that this method will tell us nothing about the prime counting function, so it is a completely different type of approach than the parity-of-pi method.

    How could we locate a non-divisor g(x)? Here are some suggestions:

    A. Maybe there is a way to rapidly determine the power that a single polynomial divides f_N(x). If so, then we could just attempt to show that (x+1)^k fails to divide our polynomial for some k.

    B. One can manipulate the function f_N(x) a little, to determine the generating function for the number of number of integers in [1,N] that are multiples of some small primes, and then one very large prime. For our purposes, that is all we need to locate a large prime, since we could recover it by doing some factoring. The fact that such a manipulation to the generating function of f_N(x) is the sort we have easy control over, it seems plausible that through some combination of looking for small degree polynomial divisors and such adjustments to our generating function, we could locate a low-degree non-divisor.

    C. We don’t just have to work with the function \tau = 1*1 to locate primes — we can also work with Jacobi or Kronecker symbols. In other words, we can work with \tau^* = \delta * 1, where \delta(n) = (n/q)_2. I think when we do this, after the dust has settled, we will be able to find the parity of the number of primes \leq x that are squares mod q (well, for q prime); and in the context of the polynomial method above, we will be able to locate primes that are squares mod q, q prime. Now, when we go to working with Kronecker symbols, it seems to me that this will only require minor adjustments to our generating function formula — adjustments that we can get our hands on, and hopefully show that some adjustment leaves us with a polynomial not divisible by a certain low-degree polynomial.

    2. The other comment I wanted to make concerns the function \sum_{abc \leq x} 1 that we were trying to count before. We were trying to do this to determine the prime counting function mod 3. Well, there are other ways to find \pi(x) mod 3 that might be easier — I haven’t spent much time thinking about it, but here goes: let \chi(n) be a non-principal, cubic Dirichlet character mod 7. Then define the multiplicative function \tau^*(n) = (\chi*1)(n). If n is square-free, and not divisible by 7, then I think that \tau^*(n^2) \pmod{9} is non-zero precisely when n is a prime that is 1 or -1 mod 7. By taking other kinds of convolution here, I think one can tell whether n is in other cosets of the index-3 multiplicative subgroup \{1,-1\}. So, it seems to me that by counting sums like \sum_{n \leq x \atop n\ square-free} \tau^*(n^2), one can determine \pi(x) \pmod{3}.

    Of course, this approach has its own drawbacks: One *does* have a kind of hyperbola method here to compute such sums, but the fact that we work with \tau^(n^2), instead of \tau^*(n), means that there are too many terms to sort through, to give a \sqrt{x} algorithm. Well, that is just what I surmised by a back-of-the envelope calculation — maybe there are smarter ways to compute the sum.

    Later today I will try to think about the incidence geometry approach I wrote about earlier this week (Szemeredi-Trotter), and will write about it either today or tomorrow if it works out…

    Comment by Ernie Croot — August 22, 2009 @ 3:00 pm | Reply

  23. On the discussion thread, we discovered that a method of Odlyzko lets us deterministically find k-digit primes in time (10^k)^{1/2+o(1)} without RH (i.e. solving Problem 2); I wrote up the method at

    There is also a combinatorial method (the Meissel-Lehmer method) that gives a x^{2/3+o(1)} algorithm,

    Click to access meissel.lehmer.pdf

    The bottleneck in the Odlyzko method is that it takes about O(y) time and space to compute the number of primes in [x-y,x] for y close to x^{1/2} (by sieve of Eratosthenes). If, for instance, one could compute the number of primes in [x-x^{0.501},x] in time x^{0.499+o(1)}, then Odlyzko’s method would give a x^{0.499+o(1)} algorithm.

    Comment by Terence Tao — August 22, 2009 @ 6:01 pm | Reply

    • Alternatively: Odlyzko’s method lets us deterministically zoom in (in x^{0.499+o(1)} time) on an interval of size x^{0.501} which is guaranteed to contain a prime (in fact, by pigeonhole one can ensure that the density of primes here is at least 1/\log x). But we don’t yet know how to actually extract a prime from this interval yet in anything better than x^{1/2+o(1)} time.

      Comment by Terence Tao — August 22, 2009 @ 6:05 pm | Reply

    • I wrote up a sketch of the Meissel-Lehmer method at

      There are a number of bottlenecks preventing this method from going past x^{2/3}. One of them is to compute the sum \sum_{x^{1/3} < p,q < x^{2/3}: pq < x} 1, where p, q range over primes. I don’t see a way to do this in time faster than x^{2/3}.

      Comment by Terence Tao — August 22, 2009 @ 6:38 pm | Reply

  24. Looking at this problem from the other direction I wonder, how long prime-free intervals can be explicitly constructed?
    That is, without simply testing all integers up to some size, and assuming some upper bound for the integers considered.

    Comment by Klas Markström — August 23, 2009 @ 12:47 pm | Reply

    • Well, the example of the prime-free intervals [p!+2;p!+p] has been mentionned by Gil already I think. Wikipedia remarks that in fact [p!+2;p!+q] is prime-free, where q is the prime following p.

      Maybe we can push this example: it seems to me that for any integer k the intervals [k\cdot p!+2; k\cdot p!+p] are also prime-free (since k\cdot p!+2 is also divisible by 2, k\cdot p!+3 by 3 and so on). For instance in the case p=3 it leads to pairs of non-primes spaced by 3!=6, namely [8,9], [14,15]… which is a kind of block-sieve maybe. The way various intervals are removed (and overlap) for different primes p is probably non-trivial, I’ll try to get some numerics.

      Comment by Thomas Sauvaget — August 23, 2009 @ 3:08 pm | Reply

    • If this theorem is not true then we simply construct a number and it is surrounded by a large prime free interval(if k is the number of digits then we can pick a number say 100 and the interval is of size at least k^100. So if this theorem were not true then the only cost would constructing the number and that would be relatively cheap.

      Comment by Kristal Cantwell — August 23, 2009 @ 6:34 pm | Reply

  25. Since the larger context of our problem is to find relations between computational complexity problems (and hierarchies of conjectures) and number theoretic problems (and hierarchies of conjectures) let’s try to examine the following heuristic (vaguely stated)suggestion:

    (PIR) “Every problem which is computationally easy for integers is also computationally easy for primes.”

    For example: finding the smallest integer larger than an integer x is easy, and by Cramer’s conjecture so is finding the smallest prime larger than x. Is there an integer in the interval [\alpha,\beta] is easy so (PIR) suggests that so is answering the question “Is there a prime in the above interval?”. (Again this is true under Cramer.)

    “Is there an integral vector satisfying a a system of inequalities?” is known to be an easy question when the number of variables is bounded, and (PIR) asserts that so is the question about such vectors with all-prime-coordinates. If it is easy to determine if a certain Diophantine equation has integer solution in a certain range then it should also be easy to determine if it has prime solution, etc.

    Maybe (PIR) is silly and admits simple counterexamples but I cannot think of any.

    (Unlike deterministically finding primes the issue here is not derandomization.) The reverse principle seems untrue since factoring is easy for primes and apparently hard for integers.(And so is the question if an integer is the sum of 2 squares.)

    Comment by Gil Kalai — August 23, 2009 @ 9:18 pm | Reply

    • Presumably one has to narrow the set of problems to the “interesting” ones to avoid grue paradoxes. For instance, if one defines a grue number to be a number which is either composite, or is the n^th prime where n is the Godel number of a proof of an undecidable statement, then finding a grue integer is trivial but finding a grue prime is undecidable.

      Of course, the artificiality here came from the fact that primality was used to define the notion of a grue number. I suppose if one avoided using primality or other number-theoretic notions in defining the problem at hand, then one could perhaps get somewhere. If the primes were sufficiently computationally pseudorandom then I guess any low-complexity problem could not distinguish between primes and integers (by definition of computational pseudorandomness) and so this might be a formalisable version of the principle.

      Comment by Terence Tao — August 24, 2009 @ 1:46 am | Reply

    • It’s also very easy to decide whether there is an integral solution to m=n+2 in a given range. I’d be quite interested in an algorithm that could find prime solutions …

      Comment by gowers — August 24, 2009 @ 9:13 am | Reply

    • Right, if we want to propose formal version rather than a heuristics, we will need to talk about low complexity problems. Even for them this heuristics represents a vastly strong conjecture on primes being pesudorandom. (So strong, that maybe there is a “real” counterexample – something which is easy for integers and hard for primes like factoring is in the other direction).

      In particular, finding twin primes in an interval has (based on what we conjecture about primes) a polynomial-time algorithm because if the interval is large the algorithm just say YES, (relying on a strong version of Cramer’s conjecture), and if the interval is small it checks all numbers there.

      Comment by Gil Kalai — August 24, 2009 @ 11:23 am | Reply

    • Actually, I do not know if such a principle holds even when you replace the primes by, say, a truly random subset of the integers (say of density 1/2).

      Comment by Gil — August 24, 2009 @ 2:10 pm | Reply

      • I believe you can modify Lenstra’s algorithm for integer linear programming to get a probabilistic algorithm for “random subset of the integers linear programming” by randomly sampling the lower-dimensional slices in the recursive step. The density of the random subset will clearly affect the required number of samples and thus the running time.

        Applied to prime numbers, the algorithm’s strategy for finding primes in an interval would just reduce to randomly sampling integers in the interval and checking primality. Same goes for what the algorithm would do to find twin primes. Is anything is known about random sampling and twin primes?

        Comment by Nicolai Hähnle — August 28, 2009 @ 1:33 pm | Reply

        • This is interesting. I suppose the conjecture would be that a random k-digit number is a young twin prime with probability k^{-2}; but this just like Cramer’s conjecture are well well beyond reach.

          So for integer programming in bounded dimension when the random subset is of density 1/2, does your argument gives average polynomial running time or even O(poly) running time almost surely. And what precisely the random sampling statement for primes you would need to extend it?

          I suppose the next problem to play with is Frobenius coin problem. Ravi Kannan proved that there is a polynomial algorithm to determine the largest amount you canot pay with relatively prime k coins. We want to say that we can also determine the largest integer in a random subset of integers (or the largest prime) that we cannot pay. This does not look automatic from the problem but maybe follows from Kannan’s algorithm.

          Comment by Gil Kalai — August 29, 2009 @ 9:01 am | Reply

  26. Here is an idea for how to cross the square-root barrier (and possibly even compute the parity of pi in sub-exponential time), which is perhaps similar to Terry’s P(a,b) strategy: at the top level in two of the “parity of pi approaches” (the F_2 and the F_2[x] approaches I wrote about), we must compute sums \sum_{n \leq x} \tau(n) mod 2. I believe this can be facilitated by applying the following basic identity

    1/(n+j) = j/(n+1) - (j-1)/n - j(j-1)/n(n+1)(n+j),

    (I suppose one can think of it as some type of 2D Taylor/Laurent expansion, as in Terry’s P(a,b) approach, but I prefer to just think of it as some polynomial arithmetic.)

    as I will explain; and, of course there are identities involving many more terms of this sort — relating 1/(n+j) to 1/n, 1/(n+1), 1/(n+2), ..., 1/(n+k), for some k.

    Now the way this can be used is as follows: our sum-of-tau can be expressed in terms of the sum \sum_{d \leq \sqrt{x}} \lfloor x/d\rfloor, and let us consider the contribution of those d \in [x^{1/3+\epsilon}, x^{1/2}]: first, note that applying the above identity we have that for any such d and for 1 \leq j \leq x^{3\epsilon/2}/10, say, we have that

    x/(d+j) = jx/(d+1) - (j-1)x/d - E,

    where E  x^{1/3 + \epsilon} or so, we just have to look up in the table the entry corresponding to which of our mod 2 intervals a_0 = x/D \pmod{2} and a_1 = x/(D+1) \pmod{2} happen to lie in.

    This clearly can all be done in time substantially less than x^{1/2}, as it doesn’t take that long to build the table, at least if \epsilon > 0 isn’t too big. Also, here is a way to deal with the issue of mod 2 carries: first of all, they occur only very rarely, and in building the N \times N table, we can add data indicating exactly where they can occur given the initial data a_0, a_1 — in this way we can know where to be more careful about them.

    Ok, so that is a crude first step to how to beat the square-root barrier. Now, what *should* probably be the case is that there is some very quick algorithm that determines the number of terms in the sequence a_0, a_1, ..., a_N that round down to an odd number, given a_0,a_1. And furthermore, there should be an equally simple algorithm that works with more terms — i.e. the natural extension of the above where we relate 1/(n+j) to 1/n, 1/(n+1), ..., 1/(n+k). If this is so, then my calculations indicate that the parity of the sum \sum_{d \leq \sqrt{x}} \lfloor x/d \rfloor should be computable in sub-exponential time.

    That wouldn’t give us a sub-exponential time to compute the parity of \pi(x) just yet, though, because there are still around \sqrt{x} levels of iteration to consider. But I think that once one unrolls these iterations, and expresses \pi(x) in terms of linear combinations of sums of divisor functions, one might have a chance to evaluate the parity of these quickly enough (perhaps through heavy use of Kummer’s Theorem on p-divisiblity of binomial coefficients, with p = 2) to beat the square-root barrier, and possibly even get a sub-exponential time algorithm. If this is so, then I think there is a good chance that a similar idea will work with the F_2[x] algorithm I mentioned in a previous posting because, after all, modding the prime generating function out by the polynomial x in F_2[x] is equivalent to determining the parity of \pi(x), and one wouldn’t expect that it should be any harder for other low-degree polynomials.

    Comment by Ernie Croot — August 24, 2009 @ 1:19 am | Reply

  27. Some of what I wrote seems to be missing from the above posting — maybe it is an error with wordpress. Here is the stuff below where the error E is introduced:

    where E  x^{1/3 + \epsilon} or so, we just have to look up in the table the entry corresponding to which of our mod 2 intervals a_0 = x/D \pmod{2} and a_1 = x/(D+1) \pmod{2} happen to lie in.

    This clearly can all be done in time substantially less than x^{1/2}, as it doesn’t take that long to build the table, at least if \epsilon > 0 isn’t too big. Also, here is a way to deal with the issue of mod 2 carries: first of all, they occur only very rarely, and in building the N \times N table, we can add data indicating exactly where they can occur given the initial data a_0, a_1 — in this way we can know where to be more careful about them.

    Ok, so that is a crude first step to how to beat the square-root barrier. Now, what *should* probably be the case is that there is some very quick algorithm that determines the number of terms in the sequence a_0, a_1, ..., a_N that round down to an odd number, given a_0,a_1. And furthermore, there should be an equally simple algorithm that works with more terms — i.e. the natural extension of the above where we relate 1/(n+j) to 1/n, 1/(n+1), ..., 1/(n+k). If this is so, then my calculations indicate that the parity of the sum \sum_{d \leq \sqrt{x}} \lfloor x/d \rfloor should be computable in sub-exponential time.

    That wouldn’t give us a sub-exponential time to compute the parity of \pi(x) just yet, though, because there are still around \sqrt{x} levels of iteration to consider. But I think that once one unrolls these iterations, and expresses \pi(x) in terms of linear combinations of sums of divisor functions, one might have a chance to evaluate the parity of these quickly enough (perhaps through heavy use of Kummer’s Theorem on p-divisiblity of binomial coefficients, with p = 2) to beat the square-root barrier, and possibly even get a sub-exponential time algorithm. If this is so, then I think there is a good chance that a similar idea will work with the F_2[x] algorithm I mentioned in a previous posting because, after all, modding the prime generating function out by the polynomial x in F_2[x] is equivalent to determining the parity of \pi(x), and one wouldn’t expect that it should be any harder for other low-degree polynomials.

    Comment by Ernie Croot — August 24, 2009 @ 1:25 am | Reply

    • Ok, WordPress did it again. Here is one last attempt, following a slight rewording:

      where $E$ at most 1/10 or so. Modding both sides out by 2 (we are working here with R/2Z, not Z/2Z) we get

      x/(d+j) \equiv jx/(d+1) - (j-1)x/d - E \pmod{2};

      and so, if we knew the values of x/(d+1) and x/d mod 2, then to within a good error, we can easily determine the value of x/(d+j) mod 2. The upshot of this is that, forgetting the issue of wraparound mod 2 (e.g. the number 1.01 is within 1/10 of 0.99, yet 1.01 rounds down to 1 which is odd while 0.99 rounds down to 0 which is even — this issue surely can be dealt with easily), given these two initial values we should be able to quickly determine the parity of the numbers \lfloor x/(d+j) \rfloor for 1 \leq j \leq x^{3\epsilon/2}/10, thereby speeding up our sum-of-divisors-mod-2 calculation considerably.

      Well, here is a crude way to do this: subdivide the interval [0,2) up into a say N := x^{3\epsilon/2} or so sub-intevals. Then, build an N x N table, whose (h,i) entry counts the parity of the number of terms in the sequence a_0, a_1, a_2, ..., a_N that round down to an odd number, where the sequence is generated as follows: let a_0 be the center point of the hth interval, and let a_1 be the center point of the ith interval. Then, let a_j be produced through the relation

      a_j \equiv j a_1 - (j-1) a_0 \pmod{2}.

      One can build this table in time at most x^{O(\epsilon)}. Then, to determine the parity of a short piece like \sum_{D \leq d \leq D + N} \lfloor x/d \rfloor, where D > x^{1/3 + \epsilon} or so, we just have to look up in the table the entry corresponding to which of our mod 2 intervals a_0 = x/D \pmod{2} and a_1 = x/(D+1) \pmod{2} happen to lie in.

      This clearly can all be done in time substantially less than x^{1/2}, as it doesn’t take that long to build the table, at least if \epsilon > 0 isn’t too big. Also, here is a way to deal with the issue of mod 2 carries: first of all, they occur only very rarely, and in building the N \times N table, we can add data indicating exactly where they can occur given the initial data a_0, a_1 — in this way we can know where to be more careful about them.

      Ok, so that is a crude first step to how to beat the square-root barrier. Now, what *should* probably be the case is that there is some very quick algorithm that determines the number of terms in the sequence a_0, a_1, ..., a_N that round down to an odd number, given a_0,a_1. And furthermore, there should be an equally simple algorithm that works with more terms — i.e. the natural extension of the above where we relate 1/(n+j) to 1/n, 1/(n+1), ..., 1/(n+k). If this is so, then my calculations indicate that the parity of the sum \sum_{d \leq \sqrt{x}} \lfloor x/d \rfloor should be computable in sub-exponential time.

      That wouldn’t give us a sub-exponential time to compute the parity of \pi(x) just yet, though, because there are still around \sqrt{x} levels of iteration to consider. But I think that once one unrolls these iterations, and expresses \pi(x) in terms of linear combinations of sums of divisor functions, one might have a chance to evaluate the parity of these quickly enough (perhaps through heavy use of Kummer’s Theorem on p-divisiblity of binomial coefficients, with p = 2) to beat the square-root barrier, and possibly even get a sub-exponential time algorithm. If this is so, then I think there is a good chance that a similar idea will work with the F_2[x] algorithm I mentioned in a previous posting because, after all, modding the prime generating function out by the polynomial x in F_2[x] is equivalent to determining the parity of \pi(x), and one wouldn’t expect that it should be any harder for other low-degree polynomials.

      Comment by Ernie Croot — August 24, 2009 @ 1:27 am | Reply

      • I’m boarding a plane shortly (yet again), and so can’t comment too intelligently on this very interesting approach, but I just had one quick comment: the iteration levels are basically required in order to restrict to squarefree integers (note that the parity of \pi(x) is essentially \sum_{n<x} \tau(n) \mu^2(n) mod 4). So a toy problem would be to beat the square root barrier to compute \sum_{n<x} \mu^2(n), the number of squarefrees less than x. If one can do this in some simple combinatorial manner then there is hope that the iterative process to restrict to squarefrees in general can also get past this barrier.

        Comment by Terence Tao — August 24, 2009 @ 1:51 am | Reply

        • I think I can solve the toy problem in n^{1/3} steps. The algorithm depends on access to values of the Mobius function, so we probably need to assume a factoring oracle.

          By Mobius inversion we have that \sum_{x \leq n} \mu(n)^2 = \sum_{x \leq n} \sum_{l | x} \mu(l^{1/2}) = \sum_{l^{2} m \leq n} \mu(l) . Now we can split this last sum up as \sum_{l^{2} m \leq n} \mu(l) = \sum_{\substack{l^2 m\leq n \\ l \leq n^{1/3}-1}}\mu(l)+\sum_{\substack{ l^2 m \leq n \\ l\geq n^{1/3}}} \mu(l).

          Letting \left[ a\right] denote the floor of a, we have that the first sum is equal to \sum_{l \leq n^{1/3}-1} \mu(l) \left[\frac{n}{l^2} \right] and is computable in about n^{1/3} steps. Rearranging the second sum we have that \sum_{\substack{ l^2 m\leq n \\ l\geq n^{1/3}}}\mu(l)= \sum_{m\leq n^{1/3}} \mu(\sqrt{\frac{x}{m}}). So this is also computable in n^{1/3} steps.

          This is inspired by Roth’s proof that the largest gap between a square-free number x and the next square-free number is O(x^{1/3}).

          Comment by Mark Lewko — August 24, 2009 @ 5:29 am | Reply

        • That’s a nice problem, and probably one can do even better than Mark’s very nice approach — perhaps getting down to x^{1/5} or better (the only reason I say this is that there are analytic estimates to count square-frees in interval widths down that low — so it seems that there are methods to sample very short intervals, analytically at least). I will think on it when I get a little free time…

          It occurs to me that one can indeed determine the parity of \pi(x) in time quicker than x^{1/2}, perhaps down to x^{1/3} or lower: basically, from my original posting on the topic, we know that the sum of divisors up to x can be used to determine the parity of

          \pi(x) + \pi(x/4) + \pi(x/9) + \cdots,

          and we now know we can do this in time significantly below x^{1/2}, even just using the “crude method” I discussed above — so, maybe our running time is something like x^{0.49}, say.

          Now what we can do in this sum of pi’s above is just evaluate the last few terms trivially, by observing that

          \sum_{x^{1/3} < d < x^{1/2}} \pi(x/d^2)\ =\ \sum_{p \leq x^{1/3} \atop p\ prime} |\{ x^{1/3} < d < x^{1/2} : p \in [x/d^2, \sqrt{x}]\}|.

          This can clearly be computed in time at worst x^{1/3 + o(1)}. And what this does is it leaves us only x^{1/3} terms to evaluate to compute the parity of \pi(x). When all is said and done, this gives us a way to find the parity in time x^{0.49}.

          Surely this sum over these d \in [x^{1/3},x^{1/2}] can be computed a lot quicker using, say, some variant of Odlyzko's method. If so, then it wouldn't surprise me if when we replace the “crude algorithm'' with something much better (to compute the sum-of-tau's in sub-exponential time, say) we end up with a way to find the parity in time x^{1/4} or better. In fact, it wouldn't surprise me if it could be done in sub-exponential time, by adding extra layers of iteration (if we can compute \pi(x) a little more quickly, then maybe we can compute weighted sums over primes — like what we had above — a little more quickly, which then means we can compute \pi(x) even quicker than before, which then means…)


          Now let's say we *could* compute the sum \sum_{n \leq x} \tau(n) in sub-exponential time by the 1/n, 1/(n+1), ..., 1/(n+k) method. Then perhaps a similar approach works with the \tau^*(n) function I wrote in a previous posting, where \tau^* = \chi * 1, where \chi is a cubic Dirichlet character mod 7. If this is so, then it seems to me that we can compute \pi(x) mod 3 in time far below the square-root barrier as well. And a similar trick should work with other Dirichlet characters. The upshot of this is that it might be possible, after a lot of work, to compute \pi(x) exactly in time x^{1/4} or better!

          As happened to me last week, I will have to take another week off from polymath to do my teaching and other work. See you next week…

          Comment by Ernie Croot — August 24, 2009 @ 2:11 pm | Reply

          • Actually, I see that there is an issue with the fact that \sum_{d \geq 2} (1/d^2)^{0.49} doesn’t converge, meaning that one needs to be a little more careful in order to get a x^{0.49} algorithm to find the parity of \pi(x). Surely it can be done somehow…

            Comment by Ernie Croot — August 24, 2009 @ 2:56 pm | Reply

        • I think I can solve the toy problem in o(n^1/2) which is less than n^1/3 but doesn’t need a factoring oracle. It uses at least n^1/2 in memory so while it breaks the barrier in terms of time it pays the price in storage. It is based on the sieve of Atkins:

          This computes the primes up to n in O(n/log log n)
          time. The idea is as follows we first use this sieve to find the number of square free numbers up to n^1/4 and save the value for each value less than n^1/4 then for each of these numbers i we find the number of primes in the region n/p^2 is k plus a fractional value then for each prime bigger than n^1/4 we look at those primes for which n/p^2 =k rounded down has the same value then we used the data from the algorithm for lower values to see how man of the values one through k are square free then we have an accurate count of the number of numbers removed by the primes and the whole process should be o(n^1/2) although the memory used is at least O(n^1/2).

          Comment by Kristal Cantwell — August 26, 2009 @ 9:28 pm | Reply

  28. Since the Odlyzko (or Meissel-Lehmer) method compute \pi(x) exactly, we can use it to compute the n-th prime exactly via a binary search in around x^{1/2} steps.

    This is stronger than we need, since we only aim to find some k-digit prime. One strategy for trading some of this precision for speed is to try to compute \pi(x+h)-\pi(x), where h is about \sqrt{x} using these methods. With a binary search, and the RH prime gap result, we just need to get this computation below x^{1/2} steps. I spent a while trying to modify the Meissel-Lehmer method to do this, however I haven’t fully figured out how to make this work. Maybe someone else has some thoughts about this.

    Let me say a little bit about how this strategy might work. I’ll use the notation from the exposition of the Meissel-Lehmer method in the book “Algorithmic Number Theory” by Eric Bach and Jeffery Shallit.

    Let \pi(x,x+h) be the number of primes in the interval [x,x+h] and let \phi(x,x+h,a) denote the number of integers in [x,x+h] such that all of the prime factors are greater than the a-th prime, p_{a}. Also let P_{k}(x,x+h,a) denote the number of integers in the range with exactly k prime divisors, all of which are greater than p_{a}. Clearly \phi(x,x+h,a) = P_{0}(x,x+h,a) + P_{1}(x,x+h,a) + \ldots (here we define P_{0}(\cdot) =1). Now let a = \pi( (x+h)^{1/3}). This gives us that \phi(x,x+h,a) = 1 +\pi(x,x+h) +P_{2}(x,x+h,a) . Rearranging this gives us that \pi(x,x+h) = \phi(x,x+h,a)-1 - P_{2}(x,x+h,a).

    Let us first deal with P_{2}(x,x+h,a). This can be expanded as P_{2}(x,x+h,a) = \sum_{p_{a}\leq p\leq\sqrt{x+h}} \pi(x/p,(x+h)/p) . Of course, \pi(x,y) = \pi(x)-\pi(y). So the largest value of \pi(\cdot) we will need to evaluate will be around (x+h)^{2/3}. As long as h is of lower order than x (such as h=\sqrt{x}) then the largest value of \pi(\cdot) we need to evaluate is around x^{2/3}. Using the Meissel-Lehmer method (without any modifications) we can do this is x^{4/9} steps (using Odlyzko’s method, we could get this down to x^{1/3} steps). I’d like to say that it shouldn’t be much more expensive than this to compute all of the values of \pi(n) we need for n \leq x^{2/3}, since we should be able to cache a lot of steps, but I’m unsure about this.

    We are left to compute \phi(x,x+h,a). We can recursively compute this using the identity \phi(x, x+h ,a) =\phi(x,x+h,a-1) - \phi(x/p_{a}, (x+h)/p_{a}, a-1). Now (on RH) we assume that h\leq\sqrt{x}. To evaluate \phi(x,x+h,a) recursively, we can form a tree with a top node of \phi(x,x+h,a) and two branches leading from this node to \phi(x,x+h,a-1) and \phi(x/p_{a}, (x+h)/p_{a}, a-1). From each of these nodes we form two more branches, again using the identity, until a stopping condition is met. There are two situations in which the evaluation of \phi(x,y,a) is easy. First if a=1 then \phi(x,y,a) is the number of odd elements in the interval [x,y]. Also if y \leq x+1 then \phi(x,y,a) is easy to compute. We could keep expanding our tree until one of these two conditions are met, but this isn’t efficient. Notice that every node of the tree is of the form \phi(x/n,y/n,b) where [x,y] was our initial interval. Moreover, an expression of this form can occur on at most 1 node of the tree.

    Our stopping condition will be either of the following:

    (i) b=1 and n x^{1/3}. (these are “special leaves”)

    Since a distinct \phi(x/n,y/n,b) can occur on at most one node, there are at most x^{1/3} ordinary leaves. Moreover, we have an exact expression to compute these, so the sum over these can be done in about x^{1/3} steps.

    The special leaves are the problem however. The n-value in the parent of a special leave must satisfy n< x^{1/3}, and b < x^{1/3} (by construction), which implies that there are less than x^{2/3} special leaves. Unfortunetly this isn't good enough, since any computation that looks at all of them won't get below x^{1/2} steps.

    Comment by Mark Lewko — August 24, 2009 @ 2:02 am | Reply

    • WordPress truncated the stopping condition, it should have been:

      (i) b=1 and n\leq x^{1/3} (these are “ordinary leaves”)
      (ii) n\geq x^{1/3}+1. (these are “special leaves”)

      Comment by Mark Lewko — August 24, 2009 @ 5:47 am | Reply

  29. Wow – amazing synchronicity. I stumbled across this page trying to find if anyone, anywhere, knew of faster ways to count sums of divisor functions, and it turns out that you folks look like you’re not only talking about that, but for very similar reasons to me.

    A few years ago, I wrote some code in C to calculate the count of primes < n by calculating the various sum of divisor functions and then applying Linnik's identity. I tried to do everything I could to speed it up, but ultimately I never could get it fast enough to be competitive. I think the current version runs in something like O(n^4/5) time (at least from eyeballing it) and O(epsilon) space, but because the various loops for the sums of divisor functions are particularly sped up by aggressive use of wheels, I think its constant time factors are pretty good – or at least for an algorithm with O(epsilon) for memory. It calculates 10^12 in about a second on my reasonably decent laptop with a wheel culling out primes up to 19.

    The code is here (the code is at the bottom of this pdf, along with descriptions of what I'm doing above it)

    if anyone wants to look over it or play with it. I keep trying to find some way to rearrange things in the algorithm to find some clever way to cache… something… to massively speed things up, but so far I've had no real luck. I've tried a lot of things, but I clearly haven't stumbled on the right idea yet. I could list some of those if anyone is interested.

    At any rate, if any of you feel like taking a look at the code and seeing if you have any brilliant ideas, I would be overjoyed. While I wasn't looking, this whole topic (counting sums of divisors quickly) has turned into my Ahab-style White Whale.

    Comment by Nathan McKenzie — August 27, 2009 @ 10:02 am | Reply

  30. […] previous research thread for the “finding primes” project is now getting quite full, so I am opening up a fresh […]

    Pingback by (Research Thread IV) Determinstic way to find primes « The polymath blog — August 28, 2009 @ 1:44 am | Reply

  31. What about the misspelling in the title?

    [Corrected, thanks – T.]

    Comment by Ruth — December 19, 2018 @ 6:11 pm | Reply

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Blog at

%d bloggers like this: