# The Triangle Removal Lemma

[At the end of a survey paper on additive combinatorics and computational complexity which is to appear in SIGACT News, I list three major open questions in additive combinatorics which might be amenable to a “computer science proof.” They are all extremely well studied questions, by very smart people, for the past several years, so they are all very long shots. I don’t recommend anybody to start working on them, but I think it is good that as many people as possible know about these questions, because when the right technique comes along its applicability can be more quickly realized.]

The first question is to improve the Triangle Removal Lemma. I have talked here about what the triangle removal lemma is, how one can prove it from the Szemerédi Regularity Lemma, and how it implies the length-3 case of Szemerédi’s Theorem.

As a short recap, the Triangle Removal Lemma states that if ${G}$ is an ${n}$-vertex graph with ${o(n^3)}$ triangles, then there is a set of ${o(n^2)}$ edges such that the removal of those edges eliminates all the triangles. Equivalently, it says that if a graph has ${\Omega(n^2)}$ triangles which are all pair-wise edge-disjoint, then there must be ${\Omega(n^3)}$ triangles overall.

The connection with Szemerédi’s Theorem is that if ${H}$ is an abelian group with ${n}$ elements, and ${A}$ is a subset of ${H}$ with no length-3 arithmetic progressions (i.e., ${A}$ is such that there are no three distinct elements ${a,b,c}$ in ${A}$ such that ${b-a = c-b}$), then we can construct a graph ${G=(V,E)}$ that has ${3n}$ vertices, ${|A| \cdot n}$ pair-wise edge-disjoint triangles, and no other triangles. This contradicts the triangle removal lemma if ${|A| = \Omega(n)}$, and so we must have ${|A| = o(n)}$.

This is great, until we start looking at the relationships between the constants hidden by the ${o(\cdot )}$ notation. Quantitatively, the Triangle Removal Lemma states that for every ${\epsilon}$ there is a ${\delta = \delta(\epsilon)}$ such that if a graph has at least ${\epsilon \cdot n^2}$ pair-wise edge-disjoint triangles, then it has at least ${\delta \cdot n^3}$ triangles. The only known proof, however, has ${\delta}$ incredibly small: ${1/\delta}$ grows like a tower of exponentials of height polynomial in ${1/\epsilon}$. The proof uses the Szemerédi Regularity Lemma, and the Regularity Lemma is known to require such very bad dependencies.

63 years ago, Behrend showed that ${{\mathbb Z}/N{\mathbb Z}}$, ${N}$ prime, has a subset ${A}$ that contains no length-3 arithmetic progression and whose size is ${N/2^{O(\sqrt {\log N})}}$. (Last year, Elkin gave the first improvement in 62 years to Behrend’s bound, but the improvement is only a multiplicative polylog ${N}$ factor.) Combined with the graph construction mentioned above, this gives a graph with ${3N}$ vertices, ${N^2/^{O(\sqrt {\log N})}}$ edge-disjoint triangles, and no other triangle. Thus, the graph has ${\leq \delta N^3}$ triangles where ${\delta < 1/N}$, but one needs to remove ${> \epsilon N^2}$ edges to make it triangle-free, where ${\epsilon > 2^{-O(\sqrt{\log N})}}$. This shows that, in the Triangle Removal Lemma, ${1/\delta}$ must grow super-polynomially in ${1/\epsilon}$, and be at least ${1/\epsilon^{\log 1/\epsilon}}$.

The question is to shorten the gap between the tower-of-exponential relationship between ${1/\delta}$ and ${1/\epsilon}$ coming from the proof via the Szemerédi Regularity Lemma and the mildly super-polynomial lower bound coming from the above argument.

Gowers has constructed graphs showing that the tower-of-exponentials bounds in the Szemerédi Regularity are necessary. One idea on improving the lower bound could be to look at Gowers’s examples and show that with similar techniques one can construct stronger lower bound examples for the Triangle Removal Lemma. Gowers, however, has given a lot of thought to the Triangle Removal Lemma, and it is safe to assume that he knows his own construction well, so I think that this direction is completely hopeless.

In terms of improving the upper bound, one could first consider a direct proof of the Triangle Removal Lemma via iterative partitioning, instead of proving the Regularity Lemma from iterative partitioning and proving the Triangle Removal Lemma by reduction. When one considers the direct proof (which is straightforward), one sees that at every step there are many choices of sets to use to refine the partition. One may hope that, among those choices, one could find one that makes the partition grow polynomially or sub-exponentially instead of exponentially. For example, if one could show that the energy can be increased while only increasing the number of atoms in the partition by a polynomial, in the end we would get a double-exponential relationship between ${1/\delta}$ and ${1/\epsilon}$, which would be quite amazing. (It would recover the bounds of Roth’s original proof for arithmetic progressions of length 3.) Again, people with a lot of experience on iterative partitioning methods have thought about this problem, so this approach is also likely to be hopeless.

So maybe one needs to think of something completely different, and put the problem in a context in which fresh techniques can be applied. Here are some observations I made about two years ago. Unfortunately, they circle back to problems that people have thought about for 40+ years. An equivalent way of stating the Triangle Removal Lemma is to say that there is a function ${c(\delta)}$ which tends to infinity when ${\delta}$ tends to zero such that if a graph has ${\leq \delta n^3}$ triangles then they can all be removed by removing ${\leq n^2/c(\delta)}$ edges. This means, in particular, that if the graph contains exactly ${\delta n^3}$ triangles, then there is an edge that belongs to at least ${\delta n\cdot c(\delta)}$ triangles.

Theorem 1 There is a function ${c(\delta)}$ such that ${\lim_{\delta \rightarrow 0} c(\delta) = \infty}$ and such that in any ${n}$-vertex graph with exactly ${\delta n^3}$ triangles there is an edge that belongs to at least ${\delta n\cdot c(\delta)}$ triangles.

Can we find a different proof of this fact?

Note that Theorem 1 is enough to establish Szemerédi’s Theorem for progressions of length 3: take a graph with ${\alpha n^2}$ edge-disjoing triangles and no other triangle; we need to reach a contradiction if ${n}$ is large enough compared with ${\alpha}$. Applying Theorem 1, we find that there is an edge that belongs to at least ${\alpha \cdot c(\alpha/n)}$ triangles, and this is more than ${1}$ if ${n}$ is large enough, contradicting the assumption that all the triangles in the graph are edge-disjoint.

Now, label each edge (indeed, every pair of vertices) in a graph by the number of triangles it belongs to, and ignore multiplicative constant factors (so we don’t need to worry whether we are counting triangles as ordered 3-tuples or unordered ones and so on). To prove Theorem 1 we need to argue that the labels are “irregularly distributed,” so it is natural to look at their variance. Consider then the average, for a random “edge” (or, rather, for a random pair of vertices, which may or may not be an edge), of the square of the number of triangles it belongs to.

If the graph has exactly ${\delta n^3}$ triangles, then by Cauchy-Schwarz it is easy to see that the above quantity is at least ${\delta^2 n^2}$. If all triangles can be removed by removing ${n^2/c(\delta)}$ edges, then a more careful Cauchy-Schwarz argument shows that the average square label is at least ${\delta^2 n^2 c(\delta)}$. If one can show that this average square is at least ${\delta^2 n^2}$ times a term that goes to infinity, then in particular some label must be at least ${\delta n}$ times a term that goes to infinity, which proves Theorem 1. The following statement, then, is “intermediate” between the full Triangle Removal Lemma and Theorem 1, and it might be the right question to think about:

Theorem 2 There is a function ${c'(\delta)}$ such that ${\lim_{\delta \rightarrow 0} c'(\delta) = \infty}$ and such that in any ${n}$-vertex graph ${G=(V,E)}$ with exactly ${\delta n^3}$ triangles, if we label each vertex pair ${(u,v)}$ by the number ${t(u,v)}$ of triangles involving ${u}$ and ${v}$, then

$\displaystyle \mathop{\mathbb E}_{(u,v) \in V^2} t^2(u,v) \geq \delta^2 \cdot n^2 \cdot c'(\delta)$

Terry Tao observed that, up to normalization, the average of ${(t(u,v))^2}$ counts the number of diamonds of the graph, that is, the number of patterns made of four vertices, and looking like two triangles with an edge in common. So Theorem 2 can be stated equivalently as

Theorem 3 There is a function ${c'(\delta)}$ such that ${\lim_{\delta \rightarrow 0} c'(\delta) = \infty}$ and such that any ${n}$-vertex graph ${G=(V,E)}$ with exactly ${\delta n^3}$ triangles has at least ${\delta n^4 c'(\delta)}$ diamonds.

Terry wrote a fascinating post putting this question in the broad context of extremal questions about “local patterns” in graphs. In the comments to that post, Vlado Nikiforov points out the notion of “book with ${k}$ pages” (a pattern consisting of ${k}$ triangles sharing a common edge — a diamond is a “book with 2 pages”) about which Erdös and others considered many questions since the 1960s. (Although I am not sure whether a statement exactly equivalent to Theorem 3 has been formulated before, related conjectures have been around for more than 40 years. Note that Theorem 3 is true; the question is how good a function ${c'(\cdot)}$ we can get, and whether we can find a proof that does not go through the Szemerédi Regularity Lemma.)

## 8 thoughts on “The Triangle Removal Lemma”

1. Has there been work towards showing that the Behrend bound is optimal in some restricted setting? For example, if the graph is tripartite and each vertex participates in the same number of triangles?

2. This would imply that Behrend’s bound is optimal for Roth’s theorem, which is a question completely out of reach of current techniques, and possibly it’s not even true.

3. I see. But to understand the structure of the problem, perhaps it’s valuable to study it for restricted families of graphs?

4. Arguably, two most exciting applications of the TRL are the proof of Roth’ Theorem and the proof of the Corners Theorem, and in both cases it suffices to have a much relaxed version of the TRL; namely:

For any $\delta>0$ there exists $n_0=n_0(\delta)$ such that for every graph $\Gamma$ on $n\ge n_0$ vertices, and every system of at least $\delta n^2$ edge-disjoint triangles in $\Gamma$, there exists at least one more triangle in $\Gamma$.

I wonder whether it is possible to find a reasonably simple and “effective” proof of this relaxed version of the TRL, and also whether, using a kind of Varnavides argument, one can derive from it the “full TRL”.