# Yidali wan sui!

After Italy scored the overtime goal against Australia on Monday, CCTV commentator Huang Jianxiang went beserk, shouting 马尔蒂尼今天生日快乐, happy birthday Maldini, 意大利万岁 long live (literally, “live 10,000 years”) Italy, and so on. We share the sentiment. He later had to apologize.

At YouTube there is an extended transcript (in English).

By the way, is it Go West by the Pet Shop Boys playing at the end in the background?

Update: as the commenters correctly point out, Go West is by the Village People, and the Pet Shop Boys version is a later cover. Shame on me. To atone, here is the Village People video:

# Not one of those girlie blogs

Microsoft Research has developed a tool to measure how butch is a web site. (Via Doug Tygar.) In Theory scores 54% male. It must be all those manly posts about math. (Although I have to agree with the first comment here.)

# Property testing and Szemeredi’s Theorem

After discussing Szemeredi’s Theorem and the analytical approaches of Roth and Gowers, let’s see some ideas and open problems in the combinatorial approach.

The following result is a good starting point.

Triangle Removal Lemma. For every $\delta >0$ there is a constant $\epsilon(\delta) > 0$ such that if $G$ is an $n$-vertex graph with at most $\epsilon(\delta)n^3$ triangles, then it is possible to make $G$ triangle-free by removing at most $\delta n^2$ edges.

This result follows easily from the Szemeredi Regularity Lemma, and it is a prototype of several results in the field of property testing.

Indeed, consider the problem of distinguishing triangle-free graphs from graphs that are not even close to being triangle-free. For the purpose of this example, we think of a graph as “close to being triangle-free” if it can be made triangle-free by removing at most $\delta n^2$ edges, for some small constant $\delta > 0$. Then the Lemma tells us that in a not-even-close graph we have at least $\epsilon(\delta) n^3$ triangles, and so a sample of $O(1/\epsilon(\delta))$ vertices is likely to contain a triangle. So here is an algorithm for the problem: pick at random $O(1/\epsilon(\delta))$ vertices and see if they induce a triangle. We note that:

• The algorithm satisfies our requirement;
• The running time is a constant independent of $n$ and dependent only on $\delta$;
• this all makes sense only if $1/\epsilon(\delta)$ grows moderately as a function of $1/\delta$.

Let us now see that the Triangle Removal Lemma proves Szemeredi’s Theorem for sequences of length 3. As we discussed earlier, it is enough to prove the Theorem in groups of the type ${\mathbf Z}_N$, with $N$ prime; we will do better and prove the Theorem in any abelian group. So let $G$ be an abelian group of size $N$, an let $A$ be a subset of $G$ of size $\delta N$. Construct a tri-partite graph $(X,Y,Z,E)$ where $X,Y,Z$ are sets of vertices, each of size $N$, and each a copy of the group $G$. The set of edges is defined as follows:

1. for $x \in X$ and $y \in Y$, $(x,y)$ is an edge if there is an $a \in A$ such that $x+a=y$
2. for $x \in X$ and $z \in Z$, $(x,z)$ is an edge if there is a $b\in A$ such that $x + b + b = z$
3. for $y \in Y$ and $z \in Z$, $(y,z)$ is an edge if there is a $c \in A$ such that $y+c=z$

Now we notice that if $x,y,z$ is a triangle, then there are $a,b,c$ in $A$ such that (after some cancellations) $a+c = b+b$, which means that $a,b,c$ is an arithmetic progression of length 3 in the group $G$. In fact, we see that the number of triangles in the graph is precisely $N$ times the number of triples $(a,b,c)$ in arithmetic progression in $A$.

Consider now the $N \cdot |A| = \delta N^2$ triangles corresponding to the “trivial” progressions of the form $a,a,a$. (These are the triangles $x,x+a,x+a+a$.) We can see that these triangles are edge-disjoint, and so in order just to remove such triangles we have to remove from the graph at least $\delta N^2$ edges. So the Triangle Removal Lemma implies that there are at least $\epsilon(\delta)N^3$ triangles in the graph, and so at least $\epsilon(\delta)N^2$ triples in arithmetic progression in $A$. If $N > \delta/\epsilon(\delta)$, then some of those arithmetic progressions must be non-trivial, and we have the length-3 case of Szemeredi’s theorem.

We see that both for the “property testing” application and for the Szemeredi Theorem application it is important to have good quantitative bounds on $\epsilon(\delta)$. We know that, in Szemeredi’s theorem, $N$ must be super-polynomial in $1/\delta$, and so, in the Triangle Removal Lemma, $1/\epsilon(\delta)$ must be super-polynomial in $1/\delta$.

The known bounds, however, are quite unreasonable: we only know how to bound $1/\epsilon(\delta)$ by a tower of exponentials whose height is $poly(1/\delta)$. Unfortunately, such unreasonable bounds are unavoidable in any proof of the Triangle Removal Lemma based on Szemeredi’s Regularity Lemma. This is because, as proved by Gowers, such tower-of-exponentials bounds are necessary in the Regularity Lemma.

Can we find a different proof of the Triangle Removal Lemma that has only a singly- or doubly-exponential $\epsilon(\delta)$? That would be wonderful, both for property testing (because the proof would probably apply to other sub-graph homomorphism problems) and for additive combinatorics. Over the last year, I have found at least three such proofs. Unfortunately they were all fatally flawed.

Or, is there a more-than-exponential lower bound for the Triangle Removal Lemma? This would also be a major result: it would show that several results in property testing for dense graphs have no hope of being practical, and that there is a “separation” between the quantitative version of Szemeredi’s Theorem provable with analytical methods versus what can be proved with the above reduction. Besides, it’s not often that we have super-exponential lower bounds for natural problems, with Gowers’s result being a rare exception.

By the way, what about Szemeredi’s Theorem for longer sequences? For sequences of length $k$ one needs a “$k$-clique removal lemma” for $(k-1)$-uniform hypergraphs, which in turn can be derived from the proper generalization of the Regularity Lemma to hypergraphs. This turns out to be quite complicated, and it has been accomplished only very recently in independent work by Nagle, Rödl and Schacht; and by Gowers. An alternative proof has been given by Tao. The interested reader can find more in expository paper by Gowers and Tao.

What about Szemeredi’s own proof? It does use the Regularity Lemma, which was conceived and proved specifically for this application, and it involves a reduction to a graph-theoretic problem. I have to admit that I don’t know much else.

# More on the Poincare conjecture

The Strings 2006 conference is under way in Beijing, and yesterday Field Medalist Yau Shing-Tung gave a popular talk on the Poincare conjecture. Yau’s talk was preceded by taped remarks by Richard Hamilton. A transcript of Hamilton’s remarks is here, the People’s Daily article is here and a first-hand account is here.

# The True North strong and free

The 30th Frameline film festival is under way. It can exist, and be such a big production, thanks to the contributions of the Frameline members and to the major sponsors. In addition, each screening has its own sponsor. The movie I saw today was sponsored by … Canada!

No kidding.

This is not an isolated act of kindness. The Frameline programmer who introduced the movie said that, over time, the Canadian government has contributed more than the US federal government to the festival.

I have two things to say: (1) foreign aid to needy countries is very noble; (2) I blame the electoral college.

# In Theory: non-judgemental for five more years

Found today in a fortune cookie:

At 20 years of age the will reigns; at 30 the wit; at 40 the judgement.

# My gut is better than yours

Scott complains that “art snobs” want to have it both ways: to assert that the beauty of art is an aesthetic experience, a gut feeling, and, at the same time, that certain tastes are more valid than others. What is this, do they think that their guts are better than ours? That’s certainly not how we do things in complexity theory. Or is it?

When we talk about the “beauty” of a theorem or of a proof, we rarely refer to the literal statement of the theorem, much less to the fact that the proof is correct.

The beauty of a theorem is typically found in the way it fits into the bigger fabric of a theory, how it is explained by, and how it explains, other results. The statement of a theorem can feel comforting, surprising, or even unsettling, or worrisome. In a proof, we appreciate economy, a way of getting to the point in what feels like the “right” way, an unexpected use of techniques, a quick turn and a surprising ending.

If we think of, say, Reingold’s proof that L=SL, we (meaning, I and some other people) appreciate the statement for being a major milestone in the program of derandomization, and we appreciate the proof for the simplicity of its structure and for the cleverness of the way its technical tools are used.

Although, in the end, it is a matter of taste to see that the statement is important and that the proof is beautiful, I don’t think that the taste of the expert in the area is equally valid as the taste of the non-expert who says, oh this is an algorithm for connectivity that runs in n100 time and it does not even work on directed graphs?

(I am so delighted to have a discussion where one can take the notion of mathematical beauty for granted, and then use it to argue about artistic beauty.)

This wouldn’t be a non-technical In Theory post without an unnecessary personal story, so let me conclude with my experience with modern dance.

At one point, a few years ago, I was taken to see several modern dance performances. My first experience was not unlike Woody Allen’s in Small Time Crooks. While the sound system was playing annoying and discordant sounds, people on stage were moving around in what looked like a random way. I was convinced that the nearly sold-out audience at Zellerbach (it’s a big theater) was there simply to feel good about themeselves and nobody could really like that stuff. There was something that puzzled me, however: at one point, everybody laughed, presumably because something (intentionally) funny happened in the choreography. Everybody got it (except me, of course), so perhaps there was something going on in those random movements, after all. After a few more shows, the experience started to feel less agonizing, and I started to notice that I would like some segments better than others, and that this would agree with what others thought. Finally, one night, I laughed out when a dancer did something really funny on stage, and so did the rest of the audience. Aha, the brainwashing had succeeded! I left the theater feeling good about myself… No, wait, this is so not the point I wanted to make…

# Polynomials and subspaces

My latest post on Szemeredi’s theorem was long and rambling, mostly because I had no idea what I was talking about. There are, however, a couple of neat questions, and I am afraid they got lost in the rambling, so let me state them again.

1. It is either known or probably derivable from known proof techniques that

If f:{0,1}^n -> {0,1} has agreement 1/2+eps with a degree-k polynomial, then there is an affine subspace V of {0,1}n, of dimension n/c, and a linear function L, such that f and L have agreement 1/2 + eps’ when restricted to V. Here c=c(k) is a constant that depends only on k and eps’=eps'(eps,k) is a constant that depends only on eps and on k.

This is somewhat surprising, even if you look at the special case of a function f that is a polynomial (as opposed to being correlated with a polynomial).

Is there a simple proof of this result? Is the result true if you replace “subspace” by “restriction”? (Meaning, you are only allowed to fix all but a constant fraction of the variables.) Shouldn’t this be useful in complexity theory somewhere?

[Update 7/14/06: the degree-2 case has indeed a very simple proof. A more general result is proved here, in section 7. For higher degree, the result is actually false, in the sense that it is impossible to find a subspace of dimension constant*n. Avi Wigderson and I have found a counterexample already for degree 3, and Ben Green has sharpened our lower bound. The best one can do for degree-k polynomials is to find a subspace of dimension about n(1/(k-1)).]

2. If f has agreement 1/2+eps with a linear function L in a subspace V of dimension n-t, then there is a linear function L’ such that f and L’ agree on a fraction at least 1/2+eps/2t of all of {0,1}n

This is fairly easy to see.

3. This is a fascinating challenge: find an simple proof (or any proof that fits in fewer than, say, 20 pages) that if f:{0,1}n->{-1,1} has dimension-k Gowers uniformity at least eps, then there is a linear subspace V of dimension n/c and a linear function L, such that f and L have agreement at least 1/2+eps’ on V. (Where c=c(k) and eps’=eps'(eps,k).)

Chances are that such a proof could be turned into a simpler proof of Gowers’s quantitative version of Szemeredi’s theorem. (A proof of the above result for boolean functions can probably be “translated back” from Gowers’s paper, and it would probably be at least 50-60 pages. As I understand it, such a proof would first argue that f is correlated with a polynomial in a large subspace, and then proceed with (1) above. I wonder: can one prove such a statement without mentioning polynomials at all?)

4. Unfortunately the conclusion cannot be strenghtened to “… there is a linear subspace V of dimension n-o(n)…”. Consider

f(x1,…,x1):= x1*x2+…+ xn-1*xn mod 2

Its dimension-3 Gowers uniformity is 1, but, if you could find a subspace V of dimension n-o(n) on which f agrees with a linear function on a 1/2+eps’ fraction of inputs, then you could also find a linear function that agrees with f on a 1/2+eps/2o(n) fraction of inputs. But we know that no linear function agrees with f on more than a 1/2+1/2n/2 fraction of inputs.