I’ve been feeling slightly guilty over the last few days because I’ve been thinking privately about the problem of improving the Roth bounds. However, the kinds of things I was thinking about felt somehow easier to do on my own, and my plan was always to go public if I had any idea that was a recognisable advance on the problem.
I’m sorry to say that the converse is false: I am going public, but as far as I know I haven’t made any sort of advance. Nevertheless, my musings have thrown up some questions that other people might like to comment on or think about.
Two more quick remarks before I get on to any mathematics. The first is that I still think it is important to have as complete a record of our thought processes as is reasonable. So I typed mine into a file as I was having them, and the file is available here to anyone who might be interested. The rest of this post will be a sort of digest of the contents of that file. The second remark is that I am writing this as a post rather than a comment because it feels to me as though it is the beginning of a strand of discussion rather than the continuation of one, though it grows out of some of the comments made on the last post. Note that since we are operating on the Polymath blog, anybody else is free to write a post too (if you are likely to be one of the main contributors, haven’t got moderator status and want it, get in touch and I can organize it).
The starting point for this line of thought is that the main difficulty we face seems to be that Bourgain’s Bohr-sets approach to Roth is in a sense the obvious translation of Meshulam’s argument, but because we have to make a width sacrifice at each iteration it gives a $(\log N)^{-1/2}$ type bound rather than a $(\log N)^{-1}$ type bound. Sanders’s argument gives a $(\log N)^{-1}$ type bound, but if we use that then it is no longer clear how to import the new ideas of Bateman and Katz. Therefore, peculiar as it might seem to jettison one of the two papers that made this project seem like a good one in the first place, it is surely worth thinking about whether the width sacrifice that Bourgain makes (and that is also made in subsequent refinements of Bourgain’s method, due to Bourgain and Sanders) is fundamentally necessary or merely hard to avoid.
After thinking about this question in somewhat vague terms for quite a while, I have now reached a more precise formulation of it. To begin with, I want to avoid the technical issue of regularity, which can be thought of as arising from the fact that a lattice is a discrete set and therefore behaves a little strangely at small distance scales. We sort of know that that creates only technical difficulties, so if we want to get a feel for what is true, then it is convenient to think of a Bohr set as being a symmetric convex body in $\mathbb{R}^d$ for some $d$.
The question I want to consider is this. Let $B$ be a convex body in $\mathbb{R}^d$, let $\delta>0$, and let $A$ be a subset of $B$ of relative density $\delta$ that contains no 3AP with common difference of length greater than $\eta$. (This last condition is needed, since every set of positive measure contains a non-trivial 3AP, by the Lebesgue density theorem. It can be thought of as admitting that our set-up isn’t really continuous but just looks continuous at an appropriate distance scale.) Is it possible to show that there is a density increase of around $\delta^2$ on a structured subset of $B$ of comparable width?
Actually, what I really want is (I think, though I haven’t checked this formulation as carefully as I should have) a trigonometric function $\tau$ such that $\mathbb{E}_{x\in B}(1_A(x)-\delta)\tau(x)\geq c\delta^2$ for some absolute constant $c>0$. The reason this would be nice is that we could then pass to a subset $B'$ of $B$ on which $A$ would have increased density, and the width of $B'$ would be comparable to that of $B$. The set $B'$ would no longer be convex: it would look more like a union of parallel slabs cut out of $B$. But it would be Freiman isomorphic to something like a “convex body” in $\mathbb{R}^{d+1}$. Going back to Bohr sets, we ought to have no trouble getting from a $d$-dimensional Bohr set to a $(d+1)$-dimensional Bohr set of the same width. And that would be much more like the Meshulam set-up where the codimension increases and that is all.
Reasons to be pessimistic.
Let me try to put as strongly as I can the argument that there is no hope of getting a density increase without a width sacrifice.
To begin with, think what a typical 3AP looks like. For the purposes of this argument, I’ll take $B$ to be a sphere in $\mathbb{R}^d$. Since $B$ is convex, the average of two points in $B$ always lies in $B$. Therefore, there is a one-to-one correspondence between pairs of points in $B$ and triples of points in $B$ that form a 3AP. What does the average of two random points of $B$ typically look like? Of course, it can be any point in $B$, but if $B$ is high-dimensional, then a random point in $B$ is close to the boundary, and a random second point in $B$ is not only also close to the boundary, but it is approximately orthogonal to the first point (assuming that $B$ is centred at the origin). Therefore, the average of the two points typically lives close to a sphere of radius $1/\sqrt{2}$ times the radius of $B$. Therefore, if we take $A$ to be the set of points of $B$ whose distance from the centre is at least $\lambda$ times the radius of $B$, for a fixed $\lambda$ strictly between $1/\sqrt{2}$ and $1$, we have a set of measure exponentially close to 1 (by which I mean exponentially in the dimension of $B$) with exponentially fewer 3APs than there are in $B$.
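The concentration claim above can be checked numerically. The following is a minimal sketch, assuming the unit ball in $\mathbb{R}^d$ with uniform measure; the function names and the choice of dimension are illustrative and not taken from the post.

```python
import math
import random


def random_point_in_ball(dim, rng):
    """Uniform random point in the unit ball of R^dim.

    A Gaussian vector gives a uniform direction, and a radius of
    U^(1/dim) with U uniform in [0, 1) gives the correct radial law.
    """
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(t * t for t in g))
    radius = rng.random() ** (1.0 / dim)
    return [radius * t / norm for t in g]


def midpoint_norms(dim, samples, seed=0):
    """Norms of the midpoints of `samples` independent random pairs."""
    rng = random.Random(seed)
    norms = []
    for _ in range(samples):
        x = random_point_in_ball(dim, rng)
        y = random_point_in_ball(dim, rng)
        norms.append(math.sqrt(sum(((a + b) / 2) ** 2 for a, b in zip(x, y))))
    return norms


if __name__ == "__main__":
    norms = midpoint_norms(dim=400, samples=300)
    print("mean midpoint norm:", sum(norms) / len(norms))
    print("1/sqrt(2):         ", 1 / math.sqrt(2))
```

In dimension 400 the midpoint norms cluster tightly around $1/\sqrt{2}$, so a set that avoids a neighbourhood of that radius loses almost all of its 3APs while keeping almost all of its measure.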
What this simple example shows is that if we want to obtain a density increase, it will not be enough to use the fact that $A$ contains few 3APs: we will have to use the fact that it contains no 3APs. Even having exponentially few 3APs doesn’t help. So a straightforward Roth-style Fourier manipulation doesn’t work.
Pushing this example slightly further, even the set $\{x\in B:\|x\|\geq(1-\delta/d)r\}$, where $r$ is the radius of $B$, has measure roughly $\delta$. What can we say about the 3APs it contains? They live close to the boundary of a sphere, and that forces them to have small common difference. So one way that we might exploit the fact that $A$ has no 3APs rather than just very few 3APs would be just to count 3APs with a small common difference (the smallness depending on both $\delta$ and $d$). Since there are exponentially fewer of these than there are 3APs in general, we really would be using more than just that there are exponentially few 3APs.
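As a sanity check on the measure of a thin shell, here is a short computation. It is only a sketch under an assumption about which set is meant: a shell of relative thickness $\delta/d$ inside a ball $B$ of radius $r$ in $\mathbb{R}^d$, with $\delta$ the target density.

```latex
% Assumption: S is the shell of relative thickness \delta/d in a ball B of radius r.
\[
  S = \{x \in B : (1-\delta/d)\,r \le \|x\| \le r\},
  \qquad
  \frac{\operatorname{vol}(S)}{\operatorname{vol}(B)}
  = 1 - \Bigl(1-\frac{\delta}{d}\Bigr)^{d}
  \approx 1 - e^{-\delta}
  \approx \delta
  \quad\text{for small } \delta .
\]
```

So a shell of density about $\delta$ has thickness of order $\delta r/d$, which is why the relevant notion of a small common difference should depend on both the density and the dimension.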
But if we restrict attention to 3APs with small common difference, can we hope to find a “global” density increase, as opposed to the “local” density increase one would obtain by passing to a smaller-width Bohr set? Consider what happens, for instance, if one has a subset of $\{1,2,\dots,N\}$ with the wrong number of 3APs with small common difference. If “small” means “at most $m$” and $m$ is substantially less than $N$, then one can take a fairly random union of intervals of length $10m$, say. This will have many more than its fair share of 3APs of common difference less than $m$, but because of the randomness we will not detect any global correlation with a trigonometric function.
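The random-union-of-intervals example can be simulated directly. This is a rough sketch under simplifying assumptions that are not in the post: the set lives in $\mathbb{Z}_N$ rather than $\{1,\dots,N\}$, the intervals are aligned blocks of length $10m$, and each block is included independently with probability $1/2$. All function names are made up for the illustration.

```python
import cmath
import math
import random


def random_union_of_blocks(N, block, rng):
    """Indicator list of a random union of aligned blocks in Z_N."""
    indicator = [0] * N
    for start in range(0, N, block):
        if rng.random() < 0.5:  # keep each block with probability 1/2
            for x in range(start, min(start + block, N)):
                indicator[x] = 1
    return indicator


def short_3ap_density(indicator, m):
    """Fraction of pairs (x, d), 1 <= d <= m, with x, x+d, x+2d all in the set."""
    N = len(indicator)
    hits = sum(
        indicator[x] and indicator[(x + d) % N] and indicator[(x + 2 * d) % N]
        for x in range(N)
        for d in range(1, m + 1)
    )
    return hits / (N * m)


def largest_nontrivial_coefficient(indicator):
    """max over r != 0 of |E_x 1_A(x) e(-rx/N)|, by a direct (slow) DFT."""
    N = len(indicator)
    roots = [cmath.exp(-2j * math.pi * k / N) for k in range(N)]
    support = [x for x in range(N) if indicator[x]]
    return max(
        abs(sum(roots[(r * x) % N] for x in support)) / N for r in range(1, N)
    )


if __name__ == "__main__":
    m, N = 2, 3000
    A = random_union_of_blocks(N, 10 * m, random.Random(1))
    delta = sum(A) / N
    print("density:", delta, " fair share of 3APs:", delta ** 3)
    print("short 3AP density:", short_3ap_density(A, m))
    print("largest nontrivial Fourier coefficient:", largest_nontrivial_coefficient(A))
```

With these parameters the short-difference 3AP density stays far above $\delta^3$, while the largest nontrivial coefficient is governed by a random sum over the blocks and should shrink as the number of blocks grows.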
Reasons to be less pessimistic.
We seem to be in a difficult situation: one example appears to force us to consider small common differences (though in fact Bourgain doesn’t, because instead of restricting the difference of the 3AP he restricts its central element to lie in a small Bohr set), while another example appears to suggest that from the fact that there are no 3APs with small common difference one cannot conclude that there is global correlation with a trigonometric function.
However, there is a mismatch between the two examples, and at the moment I cannot rule out that the mismatch is pointing to something fundamental. The mismatch is this: the first example (where we take a sphere and remove the heart) works because we are in a high dimension, whereas the second works because we are in a low dimension.
Let me explain what I mean. The first example relied strongly on measure concentration, so it is clear that it needed us to be in a high dimension. As for the second, it relied on our being able to say that if you take two points $x, x+d$ in the set with $d$ small, then $x+2d$ is likely to be in the set. To achieve that, we took a union of balls of radius quite a bit larger than the smallness of the small common differences. But in high dimensions, if you want to be able to conclude from the fact that $x$ and $x+d$ both belong to some ball that $x+2d$ also probably belongs to that ball, then you need the radius of the ball to be much larger: a constant factor is nothing like good enough. (Why? Because the two points will almost always be on the boundary and as far away as the smallness condition allows. And since the boundary is “curved on average”, or something like that, $x+2d$ will not then be in the ball unless the radius is large enough for the boundary to feel flat at that distance scale.)
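The contrast between low and high dimensions in this paragraph can be seen in a small experiment. The sketch below assumes a Euclidean ball of radius 10 and common differences of length exactly 1 in a uniformly random direction; these parameters and the function names are illustrative choices, not taken from the post.

```python
import math
import random


def random_direction(dim, rng):
    """Uniform random unit vector in R^dim."""
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(t * t for t in g))
    return [t / norm for t in g]


def doubling_stays_inside(dim, radius, step, samples, seed=0):
    """Estimate P(x + 2d in ball | x and x + d in ball).

    x is uniform in the ball of the given radius and d is a uniformly
    random vector of length `step`.
    """
    rng = random.Random(seed)
    kept = inside = 0
    while kept < samples:
        r = radius * rng.random() ** (1.0 / dim)
        x = [r * t for t in random_direction(dim, rng)]
        d = [step * t for t in random_direction(dim, rng)]
        if math.sqrt(sum((a + b) ** 2 for a, b in zip(x, d))) > radius:
            continue  # x + d fell outside the ball: discard this sample
        kept += 1
        if math.sqrt(sum((a + 2 * b) ** 2 for a, b in zip(x, d))) <= radius:
            inside += 1
    return inside / kept


if __name__ == "__main__":
    for dim in (1, 400):
        print(dim, doubling_stays_inside(dim, radius=10.0, step=1.0, samples=400))
```

In dimension 1 a radius ten times the step is ample, whereas in dimension 400 the same ratio leaves $x+2d$ outside the ball most of the time, in line with the remark that a constant factor is nothing like good enough.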
What interests me about this is that the smallness you need in order to deal with the first example seems to be very closely related to the smallness you need in order to make the second example work. In the first case, you need to drop down to a distance scale $\eta$ with the property that if you take two typical points $x, x+d$ in $B$ (which will be near the boundary) with $\|d\|\leq\eta$, then $x+2d$ will typically belong to $B$ as well. If we now try to create an example of the second type out of balls of radius $\rho$, then in order to get it to work, we need to have balls that are large enough to have the property that if $x, x+d$ belong to such a ball and $\|d\|\leq\eta$, then $x+2d$ probably does as well. In other words, we seem to be forced to take our sub-balls of $B$ to be as big as $B$ itself.
Who is correct, the optimist or the pessimist?
One problem with the optimist’s argument above is that it is qualitative. I argued qualitatively that the very high-dimensionality that makes the first example work stops the second example working. But it is conceivable that one might manage to turn that into a rigorous argument that showed that one could get away with dropping the width by a factor of 2 instead of something like and that, it turns out, would produce only -type savings in the final bound.
But if the optimist is correct, then a natural question arises: how would one go about turning that qualitative argument into a rigorous and quantitative proof that not having small APs leads to a global correlation with a trigonometric function? One’s first instinct is to think that it would be necessary to classify Bohr sets according to their “true” dimension, or something like that — which would be difficult, as the structure of a Bohr set depends in subtle ways on the various linear relations between the characters that define it. If we take it as read that any such classification would be bound to lose constants in a way that would destroy any hope of the kind of exact result we would need, what does that leave?
The main thing we have to decide is our “distance scale”. That is, we are given a Bohr set $B$ and we need to define some other set $B'$ and restrict attention to 3APs with common difference in $B'$. Or perhaps we will prefer to choose something more general like a probability measure $\mu$ that is concentrated on “small” values, and choose our common difference $\mu$-randomly. But how do we make that choice without understanding all about $B$?
The only possible answer I can think of is to define $B'$ or $\mu$ in some simple way in terms of $B$ that is designed to give you sensible answers in the cases we understand. For instance, if $B$ is actually a subspace of $\mathbb{F}_3^n$ then we want to consider all differences, so we want $\mu$ to be the characteristic measure of $B$. And if $B$ is a $d$-dimensional sphere, then we want $B'$ to be a ball of radius chosen such that if you take a “shell” of $B$ of measure $\delta$ then an average 3AP will have common difference comparable to that radius.
It looks more and more as though it would be necessary to consider not just Bohr sets in isolation, but Bohr sets as members of ensembles. Fortunately, thanks to work of Ben Green and Tom Sanders, we have the idea of a Bourgain system to draw on there.
Let me give one further thought that makes me dare to hope a little bit. It seems that quite a lot of our problems are caused by the fact that high-dimensional Bohr sets have boundaries, and measure concentrated on those boundaries. But if we pass to a “shell” (by which I mean a set that is near the boundary) then it does not have a boundary. (By the way, any argument that seems to be making us consider a spherical shell is sort of interesting, given that it raises the hope of ultimately connecting with the Behrend lower bound.) In order to think about what happens when we are in a high-dimensional set with no boundary, let us now suppose we are in the group We are immediately encouraged to note that if a set in this group contains no 3APs, then we get correlation with a trigonometric function by the usual Fourier argument.
What happens, though, if we restrict the common difference to be small in some sense? I’m not sure, but let me at least do the Fourier calculation. It is not hard to check that transforms to which can also be written as or as
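For readers who want to see a calculation of this kind written out, here is one version. It is a sketch under conventions chosen for this illustration and not necessarily those of the post: $G$ is a finite abelian group with dual group $\widehat{G}$ written additively, $f$ is a function on $G$, and $\mu$ is a probability measure on $G$ from which the common difference is drawn.

```latex
% Assumed conventions (not from the post):
%   \hat f(r) = E_x f(x) \overline{r(x)},   f(x) = \sum_r \hat f(r) r(x),
%   \tilde\mu(r) = \sum_d \mu(d) r(d),      (r+s)(x) = r(x) s(x).
\begin{align*}
\Lambda
  &= \mathbb{E}_{x \in G} \sum_{d \in G} \mu(d)\, f(x)\, f(x+d)\, f(x+2d) \\
  &= \sum_{q,r,s \in \widehat{G}} \hat f(q)\hat f(r)\hat f(s)\,
     \Bigl(\mathbb{E}_{x}\,(q+r+s)(x)\Bigr)
     \Bigl(\sum_{d} \mu(d)\,(r+2s)(d)\Bigr) \\
  &= \sum_{r,s \in \widehat{G}} \hat f(-r-s)\,\hat f(r)\,\hat f(s)\,\tilde\mu(r+2s) \\
  &= \sum_{u \in \widehat{G}} \tilde\mu(u)
     \sum_{s \in \widehat{G}} \hat f(s-u)\,\hat f(u-2s)\,\hat f(s).
\end{align*}
```

The second line uses $q(x)\,r(x+d)\,s(x+2d) = (q+r+s)(x)\,(r+2s)(d)$, the third uses orthogonality of characters to force $q=-r-s$, and the fourth substitutes $u=r+2s$. When $\mu$ is uniform on $G$, $\tilde\mu$ is the indicator of $u=0$ and the expression collapses to the familiar $\sum_s \hat f(s)^2\hat f(-2s)$.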
In low dimensions I would normally deal with such a sum by taking to be an AP with smoothed edges so that it had absolutely summable Fourier coefficients (which could easily be arranged to be real and non-negative). And then I would simply use averaging to say that there must be some value of for which the sum is large. In high dimensions this is not good enough: the sum of the coefficients is exponential in the dimension, so the density increase we would get would be exponentially small. So how would we exploit the high-dimensionality? (Or perhaps it just isn’t the case that a local-ish 3APs count implies a global correlation.) I have just the vaguest of ideas here, which is that in high dimensions the set of places where the Fourier coefficients of are large are fairly dissociated. Perhaps one can show somehow that it is not possible for to be large for many that form a dissociated set, so that the only way for the whole sum to be large is if there is some for which it is very large. In other words, perhaps we can show that the naive averaging argument really is very inefficient.
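To see how quickly the sum of the coefficients grows with the dimension, here is a small computation. It is a sketch under assumptions not taken from the post: the smoothed AP is modelled by the normalised convolution of the interval $\{0,\dots,m-1\}$ with its reflection in $\mathbb{Z}_N$ (a Fejér-type kernel), and the high-dimensional measure is the product of `dim` copies of it on $\mathbb{Z}_N^{\mathrm{dim}}$.

```python
import cmath
import math


def smoothed_ap_coefficients(N, m):
    """Fourier coefficients sum_d mu(d) e(-rd/N) of the smoothed AP measure.

    mu is the normalised convolution of [0, m) with its reflection, so each
    coefficient equals |sum_{a < m} e(-ra/N)|^2 / m^2, which is real and >= 0.
    """
    coeffs = []
    for r in range(N):
        s = sum(cmath.exp(-2j * math.pi * r * a / N) for a in range(m))
        coeffs.append(abs(s) ** 2 / m ** 2)
    return coeffs


def coefficient_sum(N, m, dim):
    """Sum of all Fourier coefficients of the dim-fold product measure."""
    return sum(smoothed_ap_coefficients(N, m)) ** dim


if __name__ == "__main__":
    N, m = 31, 4
    for dim in (1, 5, 10, 20):
        print(dim, coefficient_sum(N, m, dim))
```

In one dimension the coefficients sum to $N\mu(0)=N/m$, so for the product measure the sum is $(N/m)^{\mathrm{dim}}$. An averaging argument that divides by this sum therefore gives a bound that decays exponentially in the dimension.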
I’m going to leave this here, but let me quickly make a remark about the pdf file that I linked to at the beginning of this post. It is not meant to be anything like a polished document, which means it shouldn’t even necessarily be assumed to be correct. In fact, at the top of page 11 I made quite an important mistake: the expression I wrote down is not the probability I said it was; to get that probability one needs to replace the third by an average of some characteristic functions rather than characteristic measures, and that means that the approach works rather less neatly than I had hoped it would.