One of the three questions I was asked most frequently at STOC was why I did not define Gowers uniformity in my talk on this joint work with Alex Samorodnitsky. Is it because the definition is very complicated? Not at all: it is just that the definition seems very arbitrary unless one has the time to tell some stories about it, and to look at a few equivalent versions of the definition and some of its basic properties.

My favorite source is this paper by Green and Tao; the first 6-7 sections are a wonderful review of what is known. Another great source is this survey paper by Green, which presents several open questions that could be of interest to us, especially the question of how large a subset of F_{3}^{n} can be if no three of its points lie on a line. (The big question is whether the answer is (3-o(1))^{n} or not.) There is also a question related to “linearity testing,” which I will probably discuss in another post. Green has also written a more recent survey paper that is more limited in scope but has more technical details.

In this post, I will tell a story about Gowers uniformity in the computer science language of linearity testing and low-degree testing for Boolean functions, which may be the gentlest introduction for many readers. I will almost exclusively refer to Boolean functions, but the theory applies more generally to bounded complex-valued functions whose domain is an Abelian group. (Some definitions and results, however, are more simply stated in the special case of Boolean functions.)

Umesh often complains that, during technical talks, the *suspense kills him,* meaning that the speaker goes on and on with technical definitions and results, without giving any indication of where he or she is going.

To spoil all suspense: the eventual goal is to have a “higher degree” Fourier analysis that allows us to talk about correlations of a given function with low-degree polynomials. (In contrast with ordinary Fourier analysis, which, in the Boolean case, talks about correlations of a given function with linear functions.) For every fixed *k*, we will define a “dimension-*k*” norm for Boolean functions; such norms are conjectured to “approximate” the correlation of the function with the closest degree-*(k-1)* polynomial. The conjecture is a theorem for *k=2* (by Fourier analysis) and for *k=3* (by a tour de force), and it is open otherwise. What is known, however, is that functions with small norm have several useful “pseudorandomness” properties, and that functions with noticeably large norm have useful “structure.” (Low-degree-polynomial-like structure.) Most proofs that make use of Gowers uniformity have a two-part structure: if the norm is small, we get what we want from the pseudorandomness; if the norm is large, we get what we want from the structure. In our work on PCP, for example, when we analyse the soundness, we can argue that if the PCP proof is “pseudorandom” then the verifier is about as likely to accept it as it is to accept a random proof (hence, with very low probability), but a PCP proof with “structure” can be “decoded” into a valid witness (hence, this cannot happen in the soundness case).

Let’s get started with the definition. If

*f: {0,1}^{n} -> {0,1}*

is a Boolean function, and *y* is an element of *{0,1}*^{n}, let us define

*f*_{y} (x) := f(x) – f(x+y)

where all operations are mod 2 (or bit-wise mod 2). We think of *f*_{y} as a sort of “derivative” of *f*. In particular, if *f* is a constant then *f*_{y} is identically 0 for every *y*, and if *f* is a degree-d polynomial then *f*_{y} is a degree-(d-1) polynomial.

We also define *f*_{y,z} as *(f*_{y})_{z}, and so on.
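As a concrete (and entirely illustrative) sketch, here is how one might compute these derivatives for a Boolean function given as a black box; the helper names are my own, not standard:

```python
def xor(x, y):
    """Bit-wise sum mod 2 of two points of {0,1}^n, given as tuples of bits."""
    return tuple((a + b) % 2 for a, b in zip(x, y))

def derivative(f, y):
    """The 'derivative' f_y(x) = f(x) - f(x+y), with all operations mod 2."""
    return lambda x: (f(x) - f(xor(x, y))) % 2

# Example: f(x) = x0 * x1 is a degree-2 polynomial over F_2; its derivative
# in the direction y = (1,0) works out to the degree-1 function x -> x1.
f = lambda x: (x[0] * x[1]) % 2
fy = derivative(f, (1, 0))
```

Iterating `derivative` gives the higher derivatives *f*_{y,z} and so on, and one can check that each application drops the degree by one, as stated above.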

Now, consider the following test of “non-constancy:” pick *x,y* at random, and check whether *f*_{y}(x)=0. If *f* is a constant, then the test accepts with probability 1; one can also argue that the test always accepts with probability at least 1/2, and that it accepts with probability *1/2+eps* if and only if *f* has agreement *1/2 + eps’* with a constant function.
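To see the test in action, here is a brute-force calculation of its exact accept probability (a sketch with my own naming; the cost is exponential in *n*, so it is only for tiny examples):

```python
import itertools

def accept_prob_nonconstancy(f, n):
    """Exact probability, over uniformly random x and y in {0,1}^n,
    that f_y(x) = 0, i.e. that f(x) = f(x+y)."""
    points = list(itertools.product((0, 1), repeat=n))
    hits = total = 0
    for x in points:
        for y in points:
            x_plus_y = tuple((a + b) % 2 for a, b in zip(x, y))
            hits += (f(x) == f(x_plus_y))
            total += 1
    return hits / total

const = lambda x: 0           # a constant passes with probability 1
parity = lambda x: sum(x) % 2  # a balanced linear function passes with probability exactly 1/2
```

On these two examples the probabilities come out to exactly 1 and exactly 1/2, matching the two extremes described above.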

This may not be too exciting, but consider the next “logical” step, that is, pick *x,y,z* at random, and check if *f*_{y,z}(x)=0. If *f* is a degree-1 polynomial (an affine function), then the test accepts with probability 1. Also, it can be shown that the test accepts with probability at least 1/2. The interesting thing, which is easy to prove using Fourier analysis, is that the test accepts with probability noticeably larger than 1/2 if and only if *f* has agreement noticeably larger than 1/2 with a degree-1 function. Indeed, the test checks whether

*f(x) - f(x+y) - f(x+z) + f(x+y+z) = 0*

and, up to a change of variables, this is the same as checking if

*f(x) + f(y) + f(z) = f(x+y+z)*

a slight variant of the Blum-Luby-Rubinfeld linearity test.
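Again purely as an illustration (function names are mine), the accept probability of this k=2 test can be computed exactly by brute force; an affine function always passes, while a genuinely quadratic function does not:

```python
import itertools

def accept_prob_affinity(f, n):
    """Exact accept probability, over random x, y, z in {0,1}^n, of the check
    f(x) - f(x+y) - f(x+z) + f(x+y+z) = 0 mod 2."""
    pts = list(itertools.product((0, 1), repeat=n))
    add = lambda a, b: tuple((u + v) % 2 for u, v in zip(a, b))
    hits = total = 0
    for x in pts:
        for y in pts:
            for z in pts:
                s = (f(x) - f(add(x, y)) - f(add(x, z)) + f(add(add(x, y), z))) % 2
                hits += (s == 0)
                total += 1
    return hits / total

affine = lambda x: (x[0] + x[1] + 1) % 2      # degree-1 polynomial: always accepted
quadratic = lambda x: (x[0] * x[1]) % 2       # degree-2: rejected with constant probability
```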

So you see where this is going: pick *x,x1,…,xk* at random and check if *f*_{x1,…,xk}(x) =0. If *f* is a degree-(k-1) polynomial then the test accepts with probability 1.

Alon, Kaufman, Krivelevich, Litsyn and Ron proved in RANDOM 2003 that if the test accepts with probability *1-eps* then *f* has agreement *1-O(2^{k} eps)* with a degree-(k-1) polynomial. Samorodnitsky, in a manuscript that predates our PCP work, proved the more challenging “low-end” result that, for k=3, if the test accepts with probability *1/2 + eps* then the function has agreement *1/2 + eps’* with a degree-2 polynomial.

The low-end result is open for k=4! Furthermore, for k=3, eps’ is exponentially small in eps. It is an open question to have a polynomial dependency.

So, what about these Gowers uniformity norms? Take the probability that the test with *x1…xk* accepts, subtract 1/2 from the probability, and multiply by 2, so you get a number between 0 and 1. This is the dimension-k Gowers uniformity of *f*; Alex and I use the notation U^{k}(f). The 2^{k}th root of the number I just described is the dimension-k Gowers uniformity norm of *f*, denoted ||f|| with subscript *U*^{k} (sorry, HTML does not allow subscripts of subscripts).
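In code, the normalization just described is the following (a trivial sketch, with my own function name):

```python
def uniformity_from_accept_prob(p, k):
    """Turn the accept probability p of the dimension-k test into the
    dimension-k Gowers uniformity U^k(f) and the corresponding norm."""
    u = 2.0 * (p - 0.5)                  # subtract 1/2, then multiply by 2
    norm = max(u, 0.0) ** (1.0 / 2 ** k)  # the norm is the 2^k-th root of U^k(f)
    return u, norm
```

For instance, a test that accepts with probability 1 gives uniformity 1 and norm 1, while a test that accepts with probability exactly 1/2 gives uniformity 0 and norm 0.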

What’s with this subtracting 1/2 and multiplying by 2 business? The analytical study of Boolean functions is often cleaner if we think of them as functions

*f: {0,1}^{n} -> {-1,1}*

where an output bit *b* is replaced by *(-1)^{b}*. With this notation, we can define

*f*_{y} (x) := f(x) * f(x+y)

and now the dimension-k uniformity can be given the following equivalent, but much cleaner, definition:

*U*^{k} (f) := **E**_{x,x1,…,xk} f_{x1…xk} (x)

Now the advantage is that the definition can be applied to any real-valued function (which is often useful), not just Boolean functions. The 2^{k}th root of *U*^{k} (f) is a norm on the space of real-valued functions with domain *{0,1}*^{n}. The definition extends to functions *f: G -> R*, where *G* is any Abelian group (we just need to re-interpret the + in the definition) and *R* denotes the real numbers. The definition can also be extended to complex-valued functions, with a small change to the definition of *f*_{y}.
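Here is a brute-force sketch of this expectation definition (my own illustration, with exponential cost, so only usable for tiny *n* and *k*). The iterated derivative *f*_{x1…xk}(x) in the ±1 notation is the product of *f* over the 2^{k} points obtained by adding to *x* the sum of each subset of the directions *x1,…,xk*:

```python
import itertools

def gowers_uniformity(f, n, k):
    """Exact U^k(f) = E_{x, x1, ..., xk} f_{x1...xk}(x) for f: {0,1}^n -> {-1,1}."""
    pts = list(itertools.product((0, 1), repeat=n))
    add = lambda a, b: tuple((u + v) % 2 for u, v in zip(a, b))
    total = count = 0
    for tup in itertools.product(pts, repeat=k + 1):
        x, dirs = tup[0], tup[1:]
        prod = 1
        # f_{x1...xk}(x): product of f over all 2^k "corners" of the cube
        # spanned by the directions x1, ..., xk, anchored at x.
        for subset in itertools.product((0, 1), repeat=k):
            p = x
            for bit, d in zip(subset, dirs):
                if bit:
                    p = add(p, d)
            prod *= f(p)
        total += prod
        count += 1
    return total / count

parity = lambda x: (-1) ** (sum(x) % 2)            # linear: U^2 = 1
quadratic = lambda x: (-1) ** ((x[0] * x[1]) % 2)  # degree 2: U^2 strictly below 1
```

For Boolean *f* this agrees with the earlier description, since the expectation equals *2p-1* where *p* is the accept probability of the dimension-k test in the {0,1} notation.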

With this notation, Alex proves that, for Boolean *f*, *U*^{3}(f) is noticeably large if and only if *f* has noticeable correlation with a degree-2 polynomial. Green and Tao, in their 72-page paper referenced above, prove an analogous result for bounded functions *f: G -> C*, where *G* is an Abelian group of order not divisible by 2 or by 3, and *C* denotes the complex numbers. (If the group contains a large cyclic subgroup, then it is quite messy to even state the theorem. If *G* is of the form *F*^{n}_{q} where *n* is large compared with *q*, then the statement is simply about noticeable norm versus correlation with polynomials.)

And, to summarize again the hundreds of pages of mathematical knowledge we have about these norms, if the norm is small, the function is kind of pseudorandom; if the norm is not small, then the function is kind of “structured,” in a way that sort of resembles a low-degree polynomial. It is open to turn “sort of resembles” into “is correlated with.” (And it cannot be done if the domain of the function is a group with a large cyclic subgroup — in that case, there are more complicated conjectures.)