# Ellenberg’s announcement of a solution to the cap-set problem

Jordan Ellenberg has just announced a resolution of the “cap problem” using techniques of Croot, Lev and Pach, in a self-contained three-page paper. This is a quite unexpected development for a long-standing open problem in the core of additive combinatorics.

Perhaps the right starting point for this story is 1936, when Erdos and Turan conjectured that, for every ${k}$, if ${A}$ is a subset of ${\{1,\ldots, N\}}$ without ${k}$-terms arithmetic progressions, then ${|A|= o_k(n)}$, or, equivalently, that if ${A}$ is a subset of the integers of positive density, then it must have arbitrarily long arithmetic progressions. Their goal in stating this conjecture was that resolving it would be a stepping stone to proving that the prime numbers have arbitrarily long arithmetic progressions. This vision came true several decades later. Szemeredi proved the conjecture in 1975, and Green and Tao proved that the primes contain arbitrarily long arithmetic progressions in 2004, with Szemeredi’s theorem being a key ingredient in their proof.

Rewinding a bit, the first progress on the Erdos-Turan conjecture came from Roth, who proved the ${k=3}$ case In 1955. Roth’s proof establishes that if ${A \subseteq \{ 1,\ldots, N\}}$ does not have length-3 arithmetic progressions, then ${|A|}$ is at most, roughly ${N/\log\log N}$. Erdos also conjectured that the bound should be ${o(N/\log N)}$, and if this were true it would imply that the primes have infinitely many length-3 arithmetic progressions simply because of their density.

Roth’s proof uses Fourier analysis, and Meshulam, in 1995, noted that the proof becomes much cleaner, and it leads to better bounds, if one looks at the analog problem in ${{\mathbb F}_p^n}$, where ${{\mathbb F}_p}$ is a finite field (of characteristic different from 2). In this case, the question is how big can ${A\subseteq {\mathbb F}_p^n}$ be if it does not have three points on a line. An adaptation of Roth’s techniques gives an upper bound of the order of ${p^n/n}$, which, for constant ${p}$, is of the order of ${N/\log N}$ if ${N}$ is the size of the universe of which ${A}$ is a subset.

Bourgain introduced a technique to work on ${{\mathbb Z}/N{\mathbb Z}}$ “as if” it where a vector space over a finite field, and proved upper bounds of the order of ${N/\sqrt {\log N}}$ and then ${N/(\log N)^{3/4}}$ to the size of a subset of ${\{ 1,\ldots , N\}}$ without length-3 arithmetic progressions. The latest result in this line is by Sanders, who proved a bound of ${(N poly\log\log N)/\log N}$, very close to Erdos’s stronger conjecture.

How far can these results be pushed? A construction of Behrend’s shows that there is a set ${A\subseteq \{ 1,\ldots, N\}}$ with no length-3 arithmetic progression and size roughly ${N/2^{\sqrt {\log N}}}$. The construction is simple (it is a discretization of a sphere in ${\sqrt {\log N}}$ dimensions) and it has some unexpected other applications. This means that the right bound in Roth’s theorem is of the form ${N^{1-o(1)}}$ and that the “only” question is what is the ${N^{-o(1)}}$ term.

In the finite vector space case, there is no analog of Behrend’s construction, and so the size of say, the largest subset of ${{\mathbb F}_3^n}$ without three points on a line, was completely open, with an upper bound of the order of ${3^n/n}$ and lower bounds of the order of ${c^n}$ for some constant ${c<3}$. The cap problem was the question of whether the right bound is of the form ${3^{(1-o(1)) }}$ or not.

Two weeks ago, Croot, Lev and Pach proved that if ${A}$ is a subset of ${({\mathbb Z}/4{\mathbb Z})^n}$ without length-3 arithmetic progressions, then ${|A|}$ is at most of the order of ${4^{.926 \cdot n}}$. This was a strong indication that the right bound in the cap problem should be sub-exponential.

This was done a couple of days ago by Ellenberg, who proved an upper bound of the form ${(2.756)^n}$ holds in ${{\mathbb F}_3^n}$. The proof is not specific to ${{\mathbb F}_3}$ and generalizes to all finite fields.

Both proofs use the polynomial method. Roughly speaking, the method is to associate a polynomial to a set of interest (for example, by finding a non-zero low-degree polynomial that is zero for all points in the set), and then to proceed with the use of simple properties of polynomials (such as the fact that the space of polynomials of a certain degree has a bounded dimension, or that the set of zeroes of a univariate non-zero polynomial is at most the degree) applied either to the polynomial that we constructed or to the terms of its factorization.

Let ${P_d}$ be the vector space of ${n}$-variate polynomials over ${{\mathbb F}_3}$ of total degree ${d}$ that are cube-free (that is, such that all variables occur in monomials with degree 0, 1, or 2), and let ${m_d}$ be its dimension.

If ${A\subseteq {\mathbb F}_3^n}$ is a set such that there are no distinct ${a,b,c}$ such that ${a+b+c=0}$ (a different property from being on a line, but the draft claims that the same argument works for the property of not having three points on a line as well), then Ellenberg shows that $\displaystyle m_d - 3^n + |A| \leq 3 m_{d/2}$

then the bound follows from computing that ${m_{2n/3} \geq 3^n - c^n}$ and ${m_{n/3} \leq c^n}$ for ${c \approx 2.756\cdots}$.

The finite field Kakeya problem is another example of a problem that had resisted attacks from powerful Fourier-analytic proofs, and was solved by Zeev Dvir with a relatively simple application of the polynomial method. One may hope that the method has not yet exhausted its applicability.

Gil Kalai has posted about further consequence of the results of Croot, Lev, Pach and Ellenberg.

## 3 thoughts on “Ellenberg’s announcement of a solution to the cap-set problem”

1. “(a different property from being on a line, but the draft claims that the same argument works for the property of not having three points on a line as well)”

In F3, we have -c = 2c, so a set avoiding a+b+c=0 also avoids a + b = 2c. The latter condition is equivalent to avoiding a three-term progression.

2. Dear Professor,

A major complexity class is NL. NL is the class of languages that are decidable on a nondeterministic logspace machine. A logspace machine is a Turing machine with a read-only input tape, a write-only output tape, and a read/write work tapes. The work tapes may contain at most O(log n) symbols. In addition, Sanjeev Arora described in his book “Computational complexity: A modern approach” the certificate-based definition for NL.

The certificate-based definition of NL assumes that a deterministic logspace machine verifies the elements in a NL language using another separated read-only tape for the certificate. On each step of the machine the machine’s head on that tape can either stay in place or move to the right. In particular, it cannot reread any bit to the left of where the head currently is. For that reason this kind of special tape is called “read once”.

CLIQUE is a well-known NP-complete problem. We show that a certificate u which represents a clique V’ of size k on a graph G can be verified by a deterministic logspace machine M using an additional special read-once input tape, and thereby CLIQUE will be in NL as a direct consequence of this previous definition. If any single NP-complete problem can be solved in polynomial time, then every NP problem has a polynomial time algorithm. However, every problem in NL is in P too. Consequently, we can conclude that P = NP.

You can understand more this idea and debug my verifier algorithm in

https://hal.archives-ouvertes.fr/hal-01316353/document

Any suggestion will be welcome,