[At the end of a survey paper on additive combinatorics and computational complexity which is to appear in SIGACT News, I list three major open questions in additive combinatorics which might be amenable to a "computer science proof." They are all extremely well studied questions, by very smart people, for the past several years, so they are all very long shots. I don't recommend anybody to start working on them, but I think it is good that as many people as possible know about these questions, because when the right technique comes along its applicability can be more quickly realized.]
The first question is to improve the Triangle Removal Lemma. I have talked here about what the triangle removal lemma is, how one can prove it from the Szemerédi Regularity Lemma, and how it implies the length-3 case of Szemerédi’s Theorem.
As a short recap, the Triangle Removal Lemma states that if is an -vertex graph with triangles, then there is a set of edges such that the removal of those edges eliminates all the triangles. Equivalently, it says that if a graph has triangles which are all pair-wise edge-disjoint, then there must be triangles overall.
The connection with Szemerédi’s Theorem is that if is an abelian group with elements, and is a subset of with no length-3 arithmetic progressions (i.e., is such that there are no three distinct elements in such that ), then we can construct a graph that has vertices, pair-wise edge-disjoint triangles, and no other triangles. This contradicts the triangle removal lemma if , and so we must have .
This is great, until we start looking at the relationships between the constants hidden by the notation. Quantitatively, the Triangle Removal Lemma states that for every there is a such that if a graph has at least pair-wise edge-disjoint triangles, then it has at least triangles. The only known proof, however, has incredibly small: grows like a tower of exponentials of height polynomial in . The proof uses the Szemerédi Regularity Lemma, and the Regularity Lemma is known to require such very bad dependencies.
63 years ago, Behrend showed that , prime, has a subset that contains no length-3 arithmetic progression and whose size is . (Last year, Elkin gave the first improvement in 62 years to Behrend’s bound, but the improvement is only a multiplicative polylog factor.) Combined with the graph construction mentioned above, this gives a graph with vertices, edge-disjoint triangles, and no other triangle. Thus, the graph has triangles where , but one needs to remove edges to make it triangle-free, where . This shows that, in the Triangle Removal Lemma, must grow super-polynomially in , and be at least .
The question is to shorten the gap between the tower-of-exponential relationship between and coming from the proof via the Szemerédi Regularity Lemma and the mildly super-polynomial lower bound coming from the above argument.
Gowers has constructed graphs showing that the tower-of-exponentials bounds in the Szemerédi Regularity are necessary. One idea on improving the lower bound could be to look at Gowers’s examples and show that with similar techniques one can construct stronger lower bound examples for the Triangle Removal Lemma. Gowers, however, has given a lot of thought to the Triangle Removal Lemma, and it is safe to assume that he knows his own construction well, so I think that this direction is completely hopeless.
In terms of improving the upper bound, one could first consider a direct proof of the Triangle Removal Lemma via iterative partitioning, instead of proving the Regularity Lemma from iterative partitioning and proving the Triangle Removal Lemma by reduction. When one considers the direct proof (which is straightforward), one sees that at every step there are many choices of sets to use to refine the partition. One may hope that, among those choices, one could find one that makes the partition grow polynomially or sub-exponentially instead of exponentially. For example, if one could show that the energy can be increased while only increasing the number of atoms in the partition by a polynomial, in the end we would get a double-exponential relationship between and , which would be quite amazing. (It would recover the bounds of Roth’s original proof for arithmetic progressions of length 3.) Again, people with a lot of experience on iterative partitioning methods have thought about this problem, so this approach is also likely to be hopeless.
So maybe one needs to think of something completely different, and put the problem in a context in which fresh techniques can be applied. Here are some observations I made about two years ago. Unfortunately, they circle back to problems that people have thought about for 40+ years. An equivalent way of stating the Triangle Removal Lemma is to say that there is a function which tends to infinity when tends to zero such that if a graph has triangles then they can all be removed by removing edges. This means, in particular, that if the graph contains exactly triangles, then there is an edge that belongs to at least triangles.
Can we find a different proof of this fact?
Note that Theorem 1 is enough to establish Szemerédi’s Theorem for progressions of length 3: take a graph with edge-disjoing triangles and no other triangle; we need to reach a contradiction if is large enough compared with . Applying Theorem 1, we find that there is an edge that belongs to at least triangles, and this is more than if is large enough, contradicting the assumption that all the triangles in the graph are edge-disjoint.
Now, label each edge (indeed, every pair of vertices) in a graph by the number of triangles it belongs to, and ignore multiplicative constant factors (so we don’t need to worry whether we are counting triangles as ordered 3-tuples or unordered ones and so on). To prove Theorem 1 we need to argue that the labels are “irregularly distributed,” so it is natural to look at their variance. Consider then the average, for a random “edge” (or, rather, for a random pair of vertices, which may or may not be an edge), of the square of the number of triangles it belongs to.
If the graph has exactly triangles, then by Cauchy-Schwarz it is easy to see that the above quantity is at least . If all triangles can be removed by removing edges, then a more careful Cauchy-Schwarz argument shows that the average square label is at least . If one can show that this average square is at least times a term that goes to infinity, then in particular some label must be at least times a term that goes to infinity, which proves Theorem 1. The following statement, then, is “intermediate” between the full Triangle Removal Lemma and Theorem 1, and it might be the right question to think about:
Terry Tao observed that, up to normalization, the average of counts the number of diamonds of the graph, that is, the number of patterns made of four vertices, and looking like two triangles with an edge in common. So Theorem 2 can be stated equivalently as
Terry wrote a fascinating post putting this question in the broad context of extremal questions about “local patterns” in graphs. In the comments to that post, Vlado Nikiforov points out the notion of “book with pages” (a pattern consisting of triangles sharing a common edge — a diamond is a “book with 2 pages”) about which Erdös and others considered many questions since the 1960s. (Although I am not sure whether a statement exactly equivalent to Theorem 3 has been formulated before, related conjectures have been around for more than 40 years. Note that Theorem 3 is true; the question is how good a function we can get, and whether we can find a proof that does not go through the Szemerédi Regularity Lemma.)