# Talagrand’s Generic Chaining

Welcome to phase two of in theory, in which we again talk about math. I spent last Fall teaching two courses and getting settled, I mostly traveled in January and February, and I have spent the last two months on my sofa catching up on TV series. Hence I will reach back to last Spring, when I learned about Talagrand’s machinery of generic chaining and majorizing measures from Nikhil Bansal, in the context of our work with Ola Svensson on graph and hypergraph sparsification. Here I would like to record what I understood about the machinery, and in a follow-up post I plan to explain the application to hypergraph sparsification.

# Lies, Damn Lies, and Herbert London

I am grading the final projects of my class, I am trying to clear the backlog of publishing all the class notes, I am way behind on my STOC reviews, and in two days I am taking off for a complicated two-week trip involving planes, trains and a rented automobile, as well as an ambitious plan of doing no work whatsoever from December 20 to December 31.

So, today I was browsing Facebook, and when I saw a post containing an incredibly blatant arithmetic mistake (which none of the several comments seemed to notice) I spent the rest of the morning looking up where it came from.

The goal of the post was to make the wrong claim that people have been paying more than enough money into social security (through payroll taxes) to support the current level of benefits. Indeed, since the beginning, social security has been paying individuals more than they put in, and now that population and salaries have stopped growing, social security is also paying out retired people more than it gets from working people, so that the “trust fund” (whether one believes it is a real thing or an accounting fiction) will run out in the 2030s unless some change is made.

This is a complicated matter, but the post included a sentence to the effect that $4,500 a year, with an interest of 1% per year “compounded monthly”, would add up to $1.3 million after 40 years. This is not even in the right order of magnitude (it adds up to about $220k), and this should be obvious without making the calculation. Who would write such a thing, and why?

My first stop was a July 2012 post on Snopes, which commented on a very similar viral email. Snopes points out various mistakes (including the rate of social security payroll taxes), but the calculation in the Snopes email, while based on wrong assumptions, has correct arithmetic: it says that $4,500 a year, with a 5% interest, becomes about $890k after 49 years.

So how did the viral email with the wrong assumptions and correct arithmetic morph into the Facebook post with the same wrong assumptions but also the wrong arithmetic? I don’t know, but here is an August 2012 post on, you can’t make this stuff up, Accuracy in Media, which Wikipedia describes as a “media watchdog.” The post is attributed to Herbert London, who has a PhD from Columbia, is a member of the Council on Foreign Relations, and used to be the president of a conservative think-tank. Currently, he has an affiliation with King’s College in New York. London’s post has the sentence I saw in the Facebook post:

> (…) an employer’s contribution of $375 per month at a modest one percent rate compounded over a 40 year work experience the total would be $1.3 million.

The rest of the post is almost identical to the July 2012 message reported by Snopes. Where did Dr. London get his numbers? Maybe he compounded this hypothetical saving at 1% per month? No, because that would give more than $4 million. One does get about $1.3 million if one saves $375 a month for thirty years with a return of 1% per month, though.
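The arithmetic is easy to check. Here is a quick Python sanity check of the three figures above (a contribution of $375 a month, i.e. $4,500 a year); the helper function is my own, not from any of the posts:

```python
def future_value(monthly_deposit, monthly_rate, months):
    """Future value of a fixed monthly deposit, with interest
    compounded monthly at the given per-month rate."""
    total = 0.0
    for _ in range(months):
        total = (total + monthly_deposit) * (1 + monthly_rate)
    return total

# 1% *per year*, compounded monthly, for 40 years: about $220k,
# nowhere near $1.3 million.
print(round(future_value(375, 0.01 / 12, 40 * 12)))

# 1% *per month* for 40 years: more than $4 million.
print(round(future_value(375, 0.01, 40 * 12)))

# 1% *per month* for 30 years: about $1.3 million, which seems
# to be where the figure came from.
print(round(future_value(375, 0.01, 30 * 12)))
```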

Perhaps a more interesting question is why this “fake math” is coming back after five years. In 2012, Paul Ryan put forward a plan to “privatize” Social Security, and such a plan is now being revived. The only way to sell such a plan is to convince people that if they saved in a private account the amount of payroll taxes that “goes into” Social Security, they would get better benefits. This may be factually wrong, but that’s hardly the point.

# Ellenberg’s announcement of a solution to the cap-set problem

Jordan Ellenberg has just announced a resolution of the “cap problem” using techniques of Croot, Lev and Pach, in a self-contained three-page paper. This is a quite unexpected development for a long-standing open problem in the core of additive combinatorics.

Perhaps the right starting point for this story is 1936, when Erdos and Turan conjectured that, for every ${k}$, if ${A}$ is a subset of ${\{1,\ldots, N\}}$ without ${k}$-term arithmetic progressions, then ${|A|= o_k(N)}$, or, equivalently, that if ${A}$ is a subset of the integers of positive density, then it must have arbitrarily long arithmetic progressions. Their goal in stating this conjecture was that resolving it would be a stepping stone to proving that the prime numbers have arbitrarily long arithmetic progressions. This vision came true several decades later. Szemeredi proved the conjecture in 1975, and Green and Tao proved that the primes contain arbitrarily long arithmetic progressions in 2004, with Szemeredi’s theorem being a key ingredient in their proof.

Rewinding a bit, the first progress on the Erdos-Turan conjecture came from Roth, who proved the ${k=3}$ case in 1955. Roth’s proof establishes that if ${A \subseteq \{ 1,\ldots, N\}}$ does not have length-3 arithmetic progressions, then ${|A|}$ is at most roughly ${N/\log\log N}$. Erdos also conjectured that the bound should be ${o(N/\log N)}$, and if this were true it would imply that the primes have infinitely many length-3 arithmetic progressions simply because of their density.

Roth’s proof uses Fourier analysis, and Meshulam, in 1995, noted that the proof becomes much cleaner, and leads to better bounds, if one looks at the analogous problem in ${{\mathbb F}_p^n}$, where ${{\mathbb F}_p}$ is a finite field (of characteristic different from 2). In this case, the question is how big ${A\subseteq {\mathbb F}_p^n}$ can be if it does not have three points on a line. An adaptation of Roth’s techniques gives an upper bound of the order of ${p^n/n}$, which, for constant ${p}$, is of the order of ${N/\log N}$ if ${N}$ is the size of the universe of which ${A}$ is a subset.

Bourgain introduced a technique to work on ${{\mathbb Z}/N{\mathbb Z}}$ “as if” it were a vector space over a finite field, and proved upper bounds of the order of ${N/\sqrt {\log N}}$ and then ${N/(\log N)^{3/4}}$ on the size of a subset of ${\{ 1,\ldots , N\}}$ without length-3 arithmetic progressions. The latest result in this line is by Sanders, who proved a bound of ${(N\, {\rm poly}\log\log N)/\log N}$, very close to Erdos’s stronger conjecture.

How far can these results be pushed? A construction of Behrend’s shows that there is a set ${A\subseteq \{ 1,\ldots, N\}}$ with no length-3 arithmetic progression and size roughly ${N/2^{\sqrt {\log N}}}$. The construction is simple (it is a discretization of a sphere in ${\sqrt {\log N}}$ dimensions) and it has other unexpected applications. This means that the right bound in Roth’s theorem is of the form ${N^{1-o(1)}}$ and that the “only” question is what the ${N^{-o(1)}}$ term is.

In the finite vector space case, there is no analog of Behrend’s construction, and so the size of, say, the largest subset of ${{\mathbb F}_3^n}$ without three points on a line was completely open, with an upper bound of the order of ${3^n/n}$ and lower bounds of the order of ${c^n}$ for some constant ${c<3}$. The cap problem was the question of whether the right bound is of the form ${3^{(1-o(1))n}}$ or not.

Two weeks ago, Croot, Lev and Pach proved that if ${A}$ is a subset of ${({\mathbb Z}/4{\mathbb Z})^n}$ without length-3 arithmetic progressions, then ${|A|}$ is at most of the order of ${4^{.926 \cdot n}}$. This was a strong indication that the right bound in the cap problem should be sub-exponential.

This was done a couple of days ago by Ellenberg, who proved that an upper bound of the form ${(2.756)^n}$ holds in ${{\mathbb F}_3^n}$. The proof is not specific to ${{\mathbb F}_3}$ and generalizes to all finite fields.

Both proofs use the polynomial method. Roughly speaking, the method is to associate a polynomial to a set of interest (for example, by finding a non-zero low-degree polynomial that is zero for all points in the set), and then to proceed with the use of simple properties of polynomials (such as the fact that the space of polynomials of a certain degree has a bounded dimension, or that the set of zeroes of a univariate non-zero polynomial is at most the degree) applied either to the polynomial that we constructed or to the terms of its factorization.

Let ${P_d}$ be the vector space of ${n}$-variate polynomials over ${{\mathbb F}_3}$ of total degree at most ${d}$ that are cube-free (that is, such that every variable occurs in each monomial with degree 0, 1, or 2), and let ${m_d}$ be its dimension.

If ${A\subseteq {\mathbb F}_3^n}$ is a set such that there are no distinct ${a,b,c}$ such that ${a+b+c=0}$ (a different property from being on a line, but the draft claims that the same argument works for the property of not having three points on a line as well), then Ellenberg shows that

$\displaystyle m_d - 3^n + |A| \leq 3 m_{d/2}$

and the bound follows by taking ${d = 4n/3}$ and computing that ${m_{4n/3} \geq 3^n - c^n}$ and ${m_{2n/3} \leq c^n}$ for ${c \approx 2.756}$.
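The growth rate of these dimensions is easy to explore numerically. The following dynamic-programming sketch (my own illustration, not from Ellenberg’s paper) counts cube-free monomials of total degree at most ${d}$; for ${d = 2n/3}$ the count is at most ${c^n}$ with ${c \approx 2.7551}$:

```python
def cube_free_monomials(n, d):
    """Dimension m_d: number of monomials in n variables with every
    exponent in {0, 1, 2} and total degree at most d."""
    ways = [0] * (d + 1)  # ways[s] = exponent vectors seen so far with sum s
    ways[0] = 1
    for _ in range(n):
        new = [0] * (d + 1)
        for s in range(d + 1):
            if ways[s]:
                for e in (0, 1, 2):  # exponent of the next variable
                    if s + e <= d:
                        new[s + e] += ways[s]
        ways = new
    return sum(ways)

n = 99
m = cube_free_monomials(n, 2 * n // 3)
print(m ** (1 / n))  # a bit below c, approaching it as n grows
```

For ${n = 99}$ the normalized count ${m_{2n/3}^{1/n}}$ is already close to ${2.7}$; convergence to ${c}$ is slow because of polynomial factors.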

The finite field Kakeya problem is another example of a problem that had resisted attacks from powerful Fourier-analytic proofs, and was solved by Zeev Dvir with a relatively simple application of the polynomial method. One may hope that the method has not yet exhausted its applicability.

Gil Kalai has posted about further consequences of the results of Croot, Lev, Pach and Ellenberg.

# Paul Erdös’s 102-ennial

Paul Erdös would be 102 years old this year, and in celebration of this the Notices of the AMS have published a two-part series of essays on his life and his work: [part 1] and [part 2].

Of particular interest to me is the story of the problem of finding large gaps between primes; recently Maynard, Ford, Green, Konyagin, and Tao solved an Erdös $10,000 question in this direction. It is probably the Erdös open question with the highest associated reward ever solved (I don’t know where to look up this information; for comparison, Szemeredi’s theorem was a $1,000 question), and it is certainly the question whose statement involves the most occurrences of “$\log$“.

# Harald Helfgott on Growth in Groups

The Bulletin of the AMS is going to publish a 57-page survey on growth in groups, which is already online, and which touches several topics of interest to readers of in theory, including the recent work of Bourgain and Gamburd on expander Cayley graphs of $SL_2(p)$ and the work of Helfgott and Seress on the diameter of permutation groups.

# Alexander Grothendieck

Alexander Grothendieck died on Thursday at age 86 in Saint-Girons.

Grothendieck, who started his work in functional analysis, is known for his far-reaching (and still incomplete) program of creating new foundations for algebraic geometry, a work that he led at IHES in the 50s and 60s and that is documented in the thousands of pages of EGA and SGA. If modern algebraic geometry is built on schemes, has applications to number theory, has definitions given in a category-theoretic language, and is completely incomprehensible to the novice, it is all thanks to Grothendieck’s vision.

In his 40s, and at the top of his game, Grothendieck left IHES over the issue of military funding, and progressively detached himself from the mathematical community, while embracing environmental and anti-war causes.

Grothendieck’s life story, from his escaping Nazi Germany, to his revolution in mathematics, to his radical politics and later hermit life, is spellbinding. Some of it is told in a great two-part article in the Notices of the AMS (part 1, part 2) and I would also recommend this more technical essay and this more philosophical one.

Grothendieck had a flair for effective naming, he had a way with words, and he liked elaborate metaphors. Here is his famous “how to break a nut” analogy describing his style of mathematical research:

> I can illustrate the second approach with the same image of a nut to be opened. The first analogy that came to my mind is of immersing the nut in some softening liquid, and why not simply water? From time to time you rub so the liquid penetrates better, and otherwise you let time pass. The shell becomes more flexible through weeks and months—when the time is ripe, hand pressure is enough, the shell opens like a perfectly ripened avocado!

# Old and new allegiances

The Italian prime minister is in the United States for the UN general assembly meeting, and he was in the San Francisco bay area on Sunday and Monday. (No, this is not the one who paid money to an underage prostitute and had her released from custody by falsely claiming she was the niece of Mubarak, it’s the new one.)

Those two days were busy: he met Italian start-up founders in the area, went to a dinner at Stanford hosted by John Hennessy, presided over the inauguration of a new Italian-language school, visited Twitter, Google, Facebook and Yahoo, and met the local Italian research community.

For the last event, a few colleagues and I were asked to give a short presentation. Not being sure what to say to a prime minister, I asked a colleague who chairs an Italian computer science department for some data on state funding of university research in computer science, and whether there was a way to turn this data into a recommendation; his response could be summarized as “we cannot be saved, there is no hope.” This might have made a good theme for a presentation, but instead I talked about the importance of fundamental research, and of working on ideas for which the technology is not ready yet, so that when the technology is ready the ideas are mature. Politicians are good at feigning attention when their mind is elsewhere, and he feigned it well.

Yesterday I was “interviewed” as part of the process to naturalize as an American citizen. Part of the interview is a test of the knowledge that an American is supposed to have. I liked to think that the officer would bring up a map of the world and ask me to point to France, then I would point to Brazil, and he would say, “great, now you are ready to be an American!” (Instead he asked how many US senators there are, when the constitution was written, and things like that.) The vibe was very different from any other interaction I have had with the American immigration system before; now it’s not any more “who are you, why are you stealing our jobs, and how do we know you are not a terrorist,” but it’s all “yay, you are going to be one of us.”

# Riemann zeta functions and linear operators

A conjectural approach to the Riemann hypothesis is to find a correspondence between the zeroes of the zeta function and the eigenvalues of a linear operator. This was first conceived by Hilbert and by Pólya in the 1910s. (See this correspondence with Odlyzko.) In the subsequent century, there have been a number of rigorous results relating the zeroes of zeta functions in other settings to eigenvalues of linear operators. There is also a connection between the distribution of known zeroes of the Riemann zeta function, as derived from explicit calculations, and the distribution of eigenvalues of the Gaussian unitary ensemble.

In a previous post we talked about the definition of the Ihara zeta function of a graph, and Ihara’s explicit formula for it, in terms of a determinant that is closely related to the characteristic polynomial of the adjacency matrix of the graph, so that the zeroes of the zeta function determine the eigenvalues of the graph. And if the zeta function of the graph satisfies a “Riemann hypothesis,” meaning that, after a change of variable, all zeroes and poles of the zeta function are ${1/\sqrt {d-1}}$ in absolute value, then the graph is Ramanujan. Conversely, if the graph is Ramanujan then its Ihara zeta function must satisfy the Riemann hypothesis.

As I learned from notes by Pete Clark, the zeta functions of curves and varieties in finite fields also involve a determinant, and the “Riemann hypothesis” for such curves is a theorem (proved by Weil for curves and by Deligne for varieties), which says that (after an analogous change of variable) all zeroes and poles of the zeta function must be ${\sqrt q}$ in absolute value, where ${q}$ is the size of the finite field. This means that one way to prove that a family of ${(q+1)}$-regular graphs is Ramanujan is to construct, for each graph ${G_n}$ in the family, a variety ${V_n}$ over ${{\mathbb F}_q}$ such that the determinant that comes up in the zeta function of ${G_n}$ is the “same” (up to terms that don’t affect the roots, and to the fact that if ${x}$ is a root for one characteristic polynomial, then ${1/x}$ has to be a root for the other) as the determinant that comes up in the zeta function of the variety ${V_n}$. Clark shows that one can construct such families of varieties (in fact, curves are enough) for all the known algebraic constructions of Ramanujan graphs, and one can use families of varieties for which the corresponding determinant is well understood to give new constructions. For example, for every ${d}$, if ${k}$ is the number of distinct prime divisors of ${d-1}$ (which would be one if ${d-1}$ is a prime power) Clark gets a family of graphs with second eigenvalue ${2^k \sqrt{d-1}}$. (The notes call ${k}$ the number of distinct prime divisors of ${d}$, but it must be a typo.)

So spectral techniques underlie fundamental results in number theory and algebraic geometry, which then give us expander graphs. Sounds like something that we (theoretical computer scientists) should know more about.

The purpose of these notes is to explore a bit more the statements of these results, although we will not even begin to discuss their proofs.

One more reason why I find this material very interesting is that the several applications of polynomials in computer science (constructing error-correcting codes, secret sharing schemes, ${k}$-wise independent generators, expanders of polylogarithmic degree, the codes used in PCP constructions, self-reducibility of the permanent, combinatorial bounds proved via the polynomial method, and so on) all eventually rely on three things:

• A multivariate polynomial restricted to a line can be seen as a univariate polynomial (and restricting to a ${k}$-dimensional subspace gives a ${k}$-variate polynomial); this means that results about multivariate polynomials can often be proved via a “randomized reduction” to the univariate case;

• A univariate polynomial of degree ${d}$ has at most ${d}$ roots, which follows from unique factorization;

• Given ${d}$ desired roots ${a_1,\ldots,a_d}$ we can always find a univariate polynomial of degree ${d}$ which has those roots, by defining it as ${\prod_i (x-a_i)}$.

This is enough to explain the remarkable pseudo-randomness of constructions that we get out of polynomials, and it is the set of principles that underlie much of what is below, except that we are going to restrict polynomials to the set of zeroes of another polynomial, instead of a line, and this is where things get much more complicated, and interesting.
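As a concrete illustration of these three facts, here is a short sketch over a small prime field; the field size ${p=101}$ and the specific polynomials are my own choices for the example:

```python
P = 101  # arithmetic below is in the prime field F_101

def eval_poly(coeffs, x):
    """Evaluate a univariate polynomial (coefficients listed from the
    constant term up) at x, modulo P, via Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % P
    return result

def poly_from_roots(roots):
    """Fact 3: build prod_i (x - a_i) as a coefficient list."""
    coeffs = [1]
    for a in roots:
        shifted = [0] + coeffs                         # x * p(x)
        scaled = [(-a * c) % P for c in coeffs] + [0]  # -a * p(x)
        coeffs = [(u + v) % P for u, v in zip(shifted, scaled)]
    return coeffs

# Fact 3: a degree-3 polynomial with prescribed roots 3, 7, 20
f = poly_from_roots([3, 7, 20])
assert all(eval_poly(f, a) == 0 for a in (3, 7, 20))

# Fact 2: a nonzero univariate polynomial of degree d has at most d roots
num_roots = sum(1 for x in range(P) if eval_poly(f, x) == 0)
assert num_roots <= len(f) - 1

# Fact 1: restricting the bivariate x^2 y + 3x + y^2 to the line
# (x, y) = (t, 2t + 3) gives a univariate polynomial in t (of degree 3
# here), so it can vanish at no more than 3 points of the line
g = lambda t: (t * t * (2 * t + 3) + 3 * t + (2 * t + 3) ** 2) % P
assert sum(1 for t in range(P) if g(t) == 0) <= 3
```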

Before getting started, however, I would like to work out a toy example (which is closely related to what goes on with the Ihara zeta function), showing how an expression that looks like the one defining zeta functions can be expressed as a characteristic polynomial of a linear operator, how its roots (which are then the eigenvalues of the operator) help bound a counting problem, and how one can get such bounds directly from the trace of the operator.

If we have quantities ${C_n}$ that we want to bound, then the “zeta function” of the sequence is

$\displaystyle z(T) := \exp\left( \sum_n \frac {C_n}{n} T^n \right)$

For example, and this will be discussed in more detail below, if ${P(\cdot,\cdot)}$ is a bivariate polynomial over ${{\mathbb F}_q}$, we may be interested in the number ${C_1}$ of solutions of the equation ${P(x,y) = 0}$ over ${{\mathbb F}_q}$, as well as the number ${C_n}$ of solutions over the extension ${{\mathbb F}_{q^n}}$. In this case, the zeta function of the curve defined by ${P}$ is precisely

$\displaystyle z(T) = \exp\left( \sum_n \frac {C_n}{n} T^n \right) \ \ \ \ \ (1)$

and Weil proved that ${z(\cdot)}$ is a rational function, and that if ${\alpha_1,\ldots,\alpha_{2g}}$ are the zeroes and poles of ${z()}$, that is, the roots of the numerator and the denominator, then they are all at most ${\sqrt q}$ in absolute value, and one has

$\displaystyle | C_n - q^n | \leq \sum_i |\alpha_i|^n \leq 2g q^{n/2} \ \ \ \ \ (2)$
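For ${n=1}$ and a curve of genus ${g=1}$, the bound ${| C_1 - q | \leq 2\sqrt q}$ can be checked by brute force; the curve ${y^2 = x^3 - x}$ below is my own choice of example, not one from the text:

```python
import math

def affine_points(p):
    """Count the solutions (x, y) in F_p x F_p of y^2 = x^3 - x."""
    return sum(1 for x in range(p) for y in range(p)
               if (y * y - (x ** 3 - x)) % p == 0)

for p in (5, 7, 11, 13, 17, 19):
    c1 = affine_points(p)
    # Weil (Hasse) bound for this genus-1 curve: |C_1 - p| <= 2 sqrt(p)
    assert abs(c1 - p) <= 2 * math.sqrt(p)
```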

How does one get an approximate formula for ${C_n}$ as in (2) from the zeroes of a function like (1), and what do determinants and traces have got to do with it?

Here is our toy example: suppose that we have a graph ${G=(V,E)}$ and that ${c_\ell}$ is the number of cycles of the graph of length ${\ell}$. Here by “cycle of length ${\ell}$” we mean a sequence ${v_0,\ldots,v_\ell}$ of vertices such that ${v_0=v_\ell}$ and, for ${i=0,\ldots,\ell-1}$, ${\{ v_i,v_{i+1} \} \in E}$. So, for example, in a graph in which the vertices ${a,b,c}$ form a triangle, ${a,b,c,a}$ and ${a,c,b,a}$ are two distinct cycles of length 3, and ${a,b,a}$ is a cycle of length 2.

We could define the function

$\displaystyle z(T) := \exp\left( \sum_\ell \frac {c_\ell}{\ell} T^\ell \right)$

and hope that its zeroes have to do with the computation of ${c_\ell}$, and that ${z}$ can be defined in terms of the characteristic polynomial.

Indeed, we have (remember, ${T}$ is a complex number, not a matrix):

$\displaystyle z(T) = \frac 1 {\det (I - TA) }$

and, if ${\alpha_1,\ldots,\alpha_n}$ are the poles of ${z(T)}$, then

$\displaystyle c_\ell \leq \sum_i |\alpha_i|^{-\ell}$

How did we get these bounds? Well, we cheated because we already knew that ${c_\ell = {\rm Tr}(A^\ell) = \sum_i \lambda_i^\ell}$, where ${\lambda_i}$ are the eigenvalues of ${A}$. This means that

$\displaystyle z(T) = \exp \left( \sum_\ell \sum_i \frac {\lambda_i^\ell}{\ell} T^\ell \right)$

$\displaystyle = \prod_i \exp \left( \sum_\ell \frac {\lambda_i^\ell}{\ell} T^\ell \right)$

$\displaystyle = \prod_i \frac 1 {1- T\lambda_i }$

$\displaystyle = \frac 1 { \det(I - TA) }$

where we used ${\frac 1 {1-x} = \exp\left( \sum_n \frac { x^n}{n} \right)}$, valid for ${|x|<1}$. Now we see that the poles of ${z}$ are precisely the inverses of the eigenvalues of ${A}$.
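Both the identity and the counting interpretation are easy to verify numerically. The sketch below (my own check, for the triangle graph, whose eigenvalues are ${2,-1,-1}$) compares brute-force cycle counts with ${{\rm Tr}(A^\ell)}$, and the truncated series defining ${z(T)}$ with ${1/\det(I-TA)}$:

```python
import math
from itertools import product

# adjacency matrix of the triangle on vertices {0, 1, 2}
A = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]
n = 3

def cycles(length):
    """Brute-force count of cycles v_0, ..., v_len with v_0 = v_len,
    encoding a cycle by its first `length` vertices."""
    return sum(1 for w in product(range(n), repeat=length)
               if all(A[w[i]][w[(i + 1) % length]] for i in range(length)))

def trace_power(ell):
    """Tr(A^ell), computed by repeated matrix multiplication."""
    M = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(ell):
        M = [[sum(M[i][k] * A[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    return sum(M[i][i] for i in range(n))

# c_ell equals Tr(A^ell): e.g. 6 cycles of length 2 and 6 of length 3
for ell in range(2, 7):
    assert cycles(ell) == trace_power(ell)

# z(T) = 1/det(I - TA): compare the defining series, truncated far
# past convergence, with det(I - TA) = (1 - 2T)(1 + T)^2
T = 0.1
series = sum(trace_power(ell) / ell * T ** ell for ell in range(1, 40))
assert abs(math.exp(series) - 1 / ((1 - 2 * T) * (1 + T) ** 2)) < 1e-9
```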

Now we are ready to look more closely at various zeta functions.

I would like to thank Ken Ribet and Jared Weinstein for the time they spent answering questions. Part of this post will follow, almost word for word, this post of Terry Tao. All the errors and misconceptions below, however, are my own original creation.