After discussing Szemeredi’s Theorem and the analytical approaches of Roth and Gowers, let’s see some ideas and open problems in the combinatorial approach.
The following result is a good starting point.
Triangle Removal Lemma. For every
there is a constant
such that if
is an
-vertex graph with at most
triangles, then it is possible to make
triangle-free by removing at most
edges.
This result follows easily from the Szemeredi Regularity Lemma, and it is a prototype of several results in the field of property testing.
Indeed, consider the problem of distinguishing triangle-free graphs from graphs that are not even close to being triangle-free. For the purpose of this example, we think of a graph as “close to being triangle-free” if it can be made triangle-free by removing at most edges, for some small constant
. Then the Lemma tells us that in a not-even-close graph we have at least
triangles, and so a sample of
vertices is likely to contain a triangle. So here is an algorithm for the problem: pick at random
vertices and see if they induce a triangle. We note that:
- The algorithm satisfies our requirement;
- The running time is a constant independent of
and dependent only on
;
- this all makes sense only if
grows moderately as a function of
.
Let us now see that the Triangle Removal Lemma proves Szemeredi’s Theorem for sequences of length 3. As we discussed earlier, it is enough to prove the Theorem in groups of the type , with
prime; we will do better and prove the Theorem in any abelian group. So let
be an abelian group of size
, an let
be a subset of
of size
. Construct a tri-partite graph
where
are sets of vertices, each of size
, and each a copy of the group
. The set of edges is defined as follows:
- for
and
,
is an edge if there is an
such that
- for
and
,
is an edge if there is a
such that
- for
and
,
is an edge if there is a
such that
Now we notice that if is a triangle, then there are
in
such that (after some cancellations)
, which means that
is an arithmetic progression of length 3 in the group
. In fact, we see that the number of triangles in the graph is precisely
times the number of triples
in arithmetic progression in
.
Consider now the triangles corresponding to the “trivial” progressions of the form
. (These are the triangles
.) We can see that these triangles are edge-disjoint, and so in order just to remove such triangles we have to remove from the graph at least
edges. So the Triangle Removal Lemma implies that there are at least
triangles in the graph, and so at least
triples in arithmetic progression in
. If
, then some of those arithmetic progressions must be non-trivial, and we have the length-3 case of Szemeredi’s theorem.
We see that both for the “property testing” application and for the Szemeredi Theorem application it is important to have good quantitative bounds on . We know that, in Szemeredi’s theorem,
must be super-polynomial in
, and so, in the Triangle Removal Lemma,
must be super-polynomial in
.
The known bounds, however, are quite unreasonable: we only know how to bound by a tower of exponentials whose height is
. Unfortunately, such unreasonable bounds are unavoidable in any proof of the Triangle Removal Lemma based on Szemeredi’s Regularity Lemma. This is because, as proved by Gowers, such tower-of-exponentials bounds are necessary in the Regularity Lemma.
Can we find a different proof of the Triangle Removal Lemma that has only a singly- or doubly-exponential ? That would be wonderful, both for property testing (because the proof would probably apply to other sub-graph homomorphism problems) and for additive combinatorics. Over the last year, I have found at least three such proofs. Unfortunately they were all fatally flawed.
Or, is there a more-than-exponential lower bound for the Triangle Removal Lemma? This would also be a major result: it would show that several results in property testing for dense graphs have no hope of being practical, and that there is a “separation” between the quantitative version of Szemeredi’s Theorem provable with analytical methods versus what can be proved with the above reduction. Besides, it’s not often that we have super-exponential lower bounds for natural problems, with Gowers’s result being a rare exception.
By the way, what about Szemeredi’s Theorem for longer sequences? For sequences of length one needs a “
-clique removal lemma” for
-uniform hypergraphs, which in turn can be derived from the proper generalization of the Regularity Lemma to hypergraphs. This turns out to be quite complicated, and it has been accomplished only very recently in independent work by Nagle, Rödl and Schacht; and by Gowers. An alternative proof has been given by Tao. The interested reader can find more in expository paper by Gowers and Tao.
What about Szemeredi’s own proof? It does use the Regularity Lemma, which was conceived and proved specifically for this application, and it involves a reduction to a graph-theoretic problem. I have to admit that I don’t know much else.
Very clear and nicely written exposition – thanks!
Pingback: The Triangle Removal Lemma « in theory