One of the generally exciting things about additive combinatorics is the confluence of techniques from analysis, combinatorics, and ergodic theory. One of the specifically exciting things to me has been the applicability of the analytic and the combinatorial techniques to computer science. It makes sense that we should make some effort to understand the ergodic-theoretic techniques too, just in case.
So what is ergodic theory? Here is what I learned from wikipedia. The general idea seems that one has a system which is in a certain state which belongs to a possibly huge and complex state space, and the system evolves deterministically in discrete time; one would like to have a general idea of what states the system goes through in the long run, if the start state is in a certain set.
So one has a set (the “states”), which could be a discrete set (though these are not the cases where the theory says very interesting things) like or a continuous one, like say, the interval [0,1]. One also has a measure on , which assigns a “size” to subsets of ; for example we could have when is , or could be the Lebesgue measure of when is [0,1]. In most interesting cases, including [0,1], it is not possible to assign a measure to every subset of and still satisfy the minimal axioms that a function must obey to be called a measure (things like: if and are disjoint, then ); in particular, it is not possible to define a uniform measure on the set of infinite bit-strings (which is the same as [0,1]) so that every subset has a measure, as seen by the solution of this puzzle. In such a case, one has a collection of subsets of , and is only defined on elements of , which are of course called “measurable.” It is important that be closed under complementation and under countable union, that is, that be a -algebra.
This notion makes sense even in the finite setting. If is finite, then one can easily see that if is a -algebra then there is a partition of such that is precisely the collection of sets that can be obtained as unions of blocks of the partition. (Usually, however, in the finite setting one defines for every subset, and so contains all subsets and is generated by the discrete partition.)
Finally, one has an operator that describes the evolution of the system. For example, if , we could have . It is important for the theory that be “measure-preserving” that is that for every measurable set . (In particular, this requires that if is measurable then is measurable, where is the set . If we are in the finite setting, and does not contain all sets, then this measurability condition is quite interesting, because it says that if is the partition that defines the -algebra, then for every we must have that is itself a union of blocks.)
The whole thing is a “measure-preserving dynamical system,” and is the object of study of ergodic theory. Typically, for a starting point we would like to understand the sequence . An interesting case is when the sequence is “all over the space,” meaning that if is a set containing an fraction of the measure of , then roughly of the first elements of the sequence land in . More formally, this is the case when
The ergodic theorem states that if is ergodic, then (1) happens for all , except possibly an exceptional set of measure zero. The property of being ergodic is a sort of “expansion” property of ; it is the requirement that if then , then is , when applied to a set, always generate a noticeable fraction of new elements. In fact, the conclusion of the ergodic theorem is stronger, because one does not just get (1), but also
for nice enough functions .
So far, this seems very remote from combinatorics, except for the feeling that the ergodic theorem is somewhat analogous to discrete theorems on convergence of random walks on expanders (I don’t know if there is a precise connection).
Going back to Szemeredi’s theorem, say, for sequences of length 3, we want to prove that for every , and for every large enough , if has size at least , then contains length-3 progressions. Indeed, known proofs give something stronger: that for every there is an such that for every subset ,
This means that there are at least length-3 progressions where and the progression is contained in . (In particular, if is large enough, then has a length-3 progression.)
If we define , then we can rewrite (2) as
This begins to look like an expression from ergodic theory.
Now, here is a result by Furstenberg which looks similar to (3), but without epsilons and deltas:
If is a measure-preserving system, , and , then
It takes some work to show that Furstenberg’s result implies Szemeredi’s theorem. Instead of trying to explain how this works, I refer you to Terry Tao’s excellent exposition in parts (3) and (4) of this post.
Broadly, there are two points to this approach: one is to use “compactness” arguments to reduce a finitary statement with epsilons and deltas to a cleaner infinitary statement; the other is to abstract away most of the structure of the given problem, and to come up with a more general and abstract statement that can be easier to prove by “inductive” arguments. (Here “induction” is in quotes because it is often an infinitary kind of induction that relies on Zorn’s Lemma.)