Breaking News!

You may remember that, in July, Italy won the football Word Cup. No? Best defense ever? Materazzi? Headbutt? 意大利万岁? Anybody?

Anyways, I was in Rome that night, I borrowed a friend’s camera, and I took some pictures on the street. I gave the camera back, and he told me he would email me the pictures “in a few days.” I got the pictures today, which is what he meant. I suppose anybody who has sent me a paper to referee, only to be assured I would send the review “in a few days,” is now nodding knowingly.

I did not know how the settings of the camera worked, it was night, we were never able to stop the car (except in traffic), so the pictures are dark and shaky, then the battery run out just when we got to the center, plus I ran out of gas, I had a flat tire, I didn’t have enough money for cab fare, my tux didn’t come back from the cleaners, an old friend came in from out of town, someone stole my car, there was an earthquake, a terrible flood, LOCUSTS!

Having dispensed with the excuses, in the interest of timely dissemination here are some of the pictures.

We start driving from Monte Sacro, a neighborhood in the North-East of the city, about 5 miles away from the center. It’s less than two hours since the game is over, and a newspaper kiosk is selling day-after newspapers with chronicles of the game.

In this much time they wrote the articles, printed the papers, and got them all over the city, which is, of course, completely gridlocked. This shows that when something is really important, Italians can be efficient. (No, I don’t know why there is advertising for Newsweek in a Monte Sacro newspaper kiosk.)

After more than an hour, we get to the Muro Torto, the wide road (with tunnels) that runs along historical walls and goes toward Piazza del Popolo.

Almost all the traffic is, of course, in the direction toward the center, which is where we are trying to go.

This gentleman has “W la fica” writeen on his chest. (Sorry, no translation.)

Since the traffic is not moving, one guy has the time to get out of his car and climb on top of a truck.

Then there is the group of guys running around in tighty whiteys.

This guy, instead, is jumping up and down on a Mercedes S-series. Note that he removed his shoes, so there is not risk of damaging the car.

The friend behind him, instead, his standing on the windshield. The Germans sure know how to make sturdy cars.

Then there is the Zidane coffin.

The lady in the red car is not showing a lot of enthusiasm. Seven people have fit into this small Citroen cabrio.

Note again the serious lady in the red car, and the fact that nobody is driving the Citroen. Carrying open alchoolic beverages in a car is actually legally in Italy. Driving this way, however, is allowed only on special occasions.

This is a Fiat 500, the car on which I learned how to drive. (No, I am not that old, it was a used car!)

This is the closest I got to taking a picture of Piazza del Popolo.

Catenaccio (heavy chain – the kind used to lock a gate) is the term used to define Italy’s defense.


Slaughtering the Cow to Get the Butter

Americans are fond of rankings and lists, and the end of the year is the time when you see top ten this and worst seven that wherever you look.

Media Matters has compiled a list of the 11 most outrageous comments of 2006 by right-wing commentators.

There is, for example, nationally syndacated obese radio host Rush Limbaugh pointing out that obesity is more prevalent in heavily Republican states and least prevalent in heavily Democratic ones, thus showing that obesity is caused by leftist liberal policies. But the best part is when he explains that, in dealing with the poor, the government is not teaching the poor how to slaughter the cow to get the butter, it just gives them the butter. Just listen to it.

Among the honorable mentions, nationally syndacated radio host Neal Boortz commented on the hairstyle of Representative Cynthia McKinney’s of Georgia as follows: “She looks like a ghetto slut.” Representative McKinney is black.

It is, however, the screen captions on Fox News which are the most hilarious.

Indonesia, Papua New Guinea, and Berkeley

This is were the notable earthquakes of the last few days have taken place, according to the US geological survey web site. And Berkeley had the distinction of three earthquakes in four days, as reported, among other sources, in the People’s Daily.

It must be especially annoying if you happen to be living on a tree.

An associated press article is so very reassuring:

But the minor earthquakes should not be interpreted as omens of a more destructive one to come, said Jessica Sigala, a geophysicist with the National Earthquake Information Center.

“It could mean there’s something coming, it could mean there’s nothing coming,” Sigala said.

Merry Christmas! Happy Holydays!

[Update 12/27: now that a real earthquake has hit Taiwan, this post sounds especially needless and petulant.]

Characters and Expansion

In which we look at the representation theory of Abelian groups and at the discrete Fourier transform, we completely understand the eigenvalues and eigenvectors of the Cayley graph of an Abelian group, and we apply this powerful theory to prove that the cycle is a connected graph.

When we talked about group representations, we saw how to map each element of a group into a matrix in such a way that group addition becomes matrix multiplication.

In general, if G is a finite group, then an n-dimensional representation of G is a map

\rho: G \rightarrow {\mathbb C}^{n\times n}

such that \rho is a group homomorphism between G and the group of matrices \rho(G). That is, we have that \rho(g) is an invertible matrix for every g, \rho(0)=I, \rho(a+b) = \rho(a) \times \rho(b) and \rho(-g) = (\rho(g))^{-1}.

(Scott will be delighted to know that the theory need not stop at finite matrices, and one may also look at representations mapping group elements into invertible linear transformations over an infinite-dimensional Hilbert space. To study finite groups, however, finite-dimensional representations are good enough.)

Note that \rho need not be injective, and so, for every group, we can always define the 1-dimensional representation \rho(g) = 1 for every g. Perhaps unsurprisingly, this is usually called the trivial representation. The precise statement of a theorem of Frobenius that we mentioned earlier is that every non-trivial representation of PSL(2,p) has dimension at least (p-1)/2.

In an Abelian group, however, there are plenty of 1-dimensional representations, and, in fact, all the information about the group is “encoded” in the set of its 1-dimensional representations. We will call a 1-dimensional representation a character of the group. (This equivalence holds only for finite groups, otherwise we need an extra condition.) Hence a character of a group G is a mapping

\chi: G \rightarrow {\mathbb C}

such that \chi(g) \neq 0 for every g, \chi(0)=1, \chi(a+b) = \chi(a)\cdot \chi(b) and \chi(-g) = (\chi(g))^{-1}. In a finite group G, we always have the property that if we add an element to itself |G| times we get 0; from this we can deduce that, if \chi() is a character, then \chi(g) must be a |G|-th root of 1, that is, it must be a number of the form e^{2\pi x i/|G|} for some x. In particular, |\chi(g)|=1 for every g.

(For general groups, we say that \chi is a character if it is a 1-dimensional representation and it has the property that |\chi(g)|=1 for every g. While the latter property is a consequence of the former for finite groups and, more generally, for torsion groups, this is not the case for other groups. For example, \chi(x) := e^{x} is a 1-dimensional representation for \mathbb Z but it is not a character.)

At first sight, it might look like a finite group might have an infinite number of characters (or at least that’s what I thought when I first saw the definition). There is however a simple argument that shows that a finite group G cannot have more than |G| characters.

Consider the set of all functions of the form f: G \rightarrow {\mathbb C}; we can think of it as a |G|-dimensional vector space over \mathbb C. We have that the characters of G are linearly independent, and so there cannot be more than |G| of them. In fact, even more is true: the characters of G are orthogonal to each other. For two functions f,h, define their inner product as

(f,g) := \frac1 {|G|} \sum_{a\in G} f(a) \overline{g(a)}

Then, if \chi and \eta are two different characters, we have

(\chi,\eta) = \frac1 {|G|} \sum_{a\in G} \chi(a) \overline{\eta(a)}=0

The normalization in the definition of inner product is convenient because then we also have (\chi,\chi)=1 if \chi is a character. (We will not prove these claims, but the proofs are very simple.)

We will now show that the cyclic group {\mathbb Z}_N has indeed N distinct characters \chi_0,\ldots,\chi_{N-1}. This means that this set of characters forms an orthonormal basis for the set of functions f: {\mathbb Z}_N \rightarrow {\mathbb C}, and that each such function can be written as a linear combination

f(x) = \sum_{c=0}^{N-1} \hat f(c) \chi_c (x) (*)
where the coefficients \hat f(c) can be computed as the inner product
\hat f(c) = (f,\chi_c)

(The equation (*) is the Fourier expansion of f, and the function \hat f() is called the Fourier transform of f.)

Here is how to define the characters \chi_c: for each c, define \chi_c(x):= e^{2\pi c x /N}. It is immediate to verify that each such function is a character, and that, for c=0,\ldots,N-1 we are, indeed, defining N different functions. By the previously mentioned results, there is no other function.

Let’s see one more example: consider ({\mathbb Z}_2)^n, that is, \{0,1\}^n with the operation of bit-wise XOR. For each {\mathbf a} = (a_1,\ldots,a_n)\in \{0,1\}^n, define the function

\chi_{\mathbf a} (x_1,\ldots,x_n) := (-1)^{\sum_i a_i x_i }
We can again verify that all these 2^n functions are distinct, that each of them is a character, and so that they include all the characters of ({\mathbb Z}_2)^n. The reader should be able to see the pattern and to reconstruct the N_1\times \cdots N_k characters of the group
{\mathbb Z}_{N_1} \times \cdots \times {\mathbb Z}_{N_k}
thus having a complete understanding of the set of characters of any given finite Abelian group.

At long last, we can get to the subject of expansion.

Let A be a finite Abelian group, and S be a set of generators such that if a\in S then -a\in S. Definite the Cayley graph G=(A,E) where every element of A is a vertex and where (a,b) is an edge if a-b\in S. This is clearly an |S|-regular undirected graph. Let \chi_1,\ldots,\chi_{|A|} be the set of characters of A. Think of each character \chi_i as an |A|-dimensional vector. (The vector that has the value \chi_i(a) in the a-th coordinate.) Then these vectors are orthogonal to each other. We claim that they are also eigenvectors for the adjacency matrix of G.

Note a surprising fact here: the graph G depends on the group A and on the choice of generators S. The eigenvectors of the adjacency matrix, however, depend on A alone. (The eigenvalues, of course, will depend on S.)

The proof of the claim is immediate. Let M be the adjacency matrix of G, then the b-th entry of the vector \chi_i \times M is

\sum_{a\in A} \chi_i(a) M(a,b) = \sum_{s\in S} \chi_i(b+s) =
\sum_{s\in S} \chi_i(b)\chi_i (s) = \chi_i (b) \sum_{s\in S} \chi_i(s)
We are done! The vector \chi_i is indeed an eigenvector, of eigenvalue \sum_{s\in S} \chi_i(s). Since we have |A| eigenvectors, and they are linearly independent, we have found all eigenvalues, with multiplicities.

Let’s look at the cycle with N vertices. It is the Cayley graph of {\mathbb Z}_N with generators \{-1,1\}. We have N eigenvalues, one for each c=0,\ldots,N, and we have
\lambda_c = \chi_c(-1) + \chi_c(1) = e^{-2\pi c i/N} +  e^{-2\pi c i/N}
this may seem to contradict our earlier claim that all eigenvalues were going to be real. But two lines of trigonometry show that, in fact,
\lambda_c = 2 \cos (2\pi c/N)
As expected in a 2-regular graph, the largest eigenvalue is \lambda_0 = 2\cos(0)=2. The second largest eigenvalue, however, is 2\cos(2\pi /N), which is 2-\Theta(1/N^2). This means that the expansion of the cycle is at least \Omega(1/N^2) and, in particular, the cycle is a connected graph!

(After we are done having a good laugh at the expense of algebraic graph theory, I should add that this calculation, plus some relatively easy facts, implies that a random walk on the cycle mixes in O(N^2) steps. Not only this is the right bound, but it is also something that does not have a completely straightforward alternative proof.)

We do a little better with the hypercube. The hypercube is the Cayley graph of the group ({\mathbb Z}_2)^n with the generator set that contains all the vectors that have precisely one 1. Now we have one eigenvalue for each choice of (a_1,\ldots,a_n), and the corresponding eigenvalue is
\sum_{i=1}^n (-1)^{a_i}.

The largest eigenvalue, when all the a_i are zero, is n, as expected in an n-regular graph. The second largest eigenvalue, however, is at most n-2. This means that the expansion of the hypercube is at least 1. (As it happens, it is precisely 1, as shown by a dimension cut.)

Applied Philosophy

This weekend the New York Times magazine has an article by Princeton philosopher Peter Singer, on the subject of philantropy, poverty and ethics.

Singer is known for his position that all human lives have the same value, a position that sounds entirely uncontroversial until one starts to explore its ramifications. Consider for example the case of real estate investor Zell Kravinsky.

Kravinsky gave almost all of his \$45 million real estate fortune to health-related charities, retaining only his modest family home in Jenkintown, near Philadelphia, and enough to meet his family’s ordinary expenses. After learning that thousands of people with failing kidneys die each year while waiting for a transplant, he contacted a Philadelphia hospital and donated one of his kidneys to a complete stranger.

[…] Kravinsky has a mathematical mind — a talent that obviously helped him in deciding what investments would prove profitable — and he says that the chances of dying as a result of donating a kidney are about 1 in 4,000. For him this implies that to withhold a kidney from someone who would otherwise die means valuing one’s own life at 4,000 times that of a stranger, a ratio Kravinsky considers “obscene.”

I have to say that I have never found Utilitarianism convincing (and you don’t want to get an Utilitarian started with his mental experiments), even though I admit that there is hardly any known better premise on which to reason about Ethics.

The article has a lot of interesting information, including the following argument, which I had never seen before in these terms:

Thomas Pogge, a philosopher at Columbia University, has argued that at least some of our affluence comes at the expense of the poor. He bases this claim not simply on the usual critique of the barriers that Europe and the United States maintain against agricultural imports from developing countries but also on less familiar aspects of our trade with developing countries. For example, he points out that international corporations are willing to make deals to buy natural resources from any government, no matter how it has come to power. This provides a huge financial incentive for groups to try to overthrow the existing government. Successful rebels are rewarded by being able to sell off the nation’s oil, minerals or timber.

In their dealings with corrupt dictators in developing countries, Pogge asserts, international corporations are morally no better than someone who knowingly buys stolen goods […]

Zeilberger on why P differs from NP

Scott Aaronson, Lance Fortnow and Bill Gasarch have discussed the reasons why they believe P differs from NP here, here and here.

Doron Zeilberger, motivated by Avi Wigderson’s talk in Madrid devotes his latest opinion to the subject. (Via Luca A.)

His dismissal of the creativity cannot be mechanized argument is based on his long-standing belief (which I share) that human creativity, especially human mathematical creativity can in fact be emulated by algorithms and that, in the long run, algorithms will end up being superior. I think this is a misunderstanding of the argument, whose point is rather that creating something good (by humans and by algorithms) seems to require much more effort than appreciating something good, and that there are levels of “genius” which we would be able to recognize if we saw them but that we are prepared to consider infeasible. In the end, this is the same as the a fool can ask more questions than a wise man can answer argument that Zeilberger himself proposes.

Then there is the issue of whether it is fair to say that “P $\neq$ NP is a statement that affirms its own intractability.” Indeed, the P versus NP question is a statement about asymptotics, while proving it is a problem of finite size.

I have two observations.

One is that the “natural proofs” results show that, assuming strong one-way functions exist (an assumption in the “ballpark” of P $\neq$ NP) there are boolean functions that are efficiently computable but have all the efficiently computable properties of random functions. This means that any circuit lower bound proof must work in a way that either would fail when applied to random functions (and there are reasons why it is difficult to come up with such proofs) or would rely on hard-to-compute properties of the function in question. So although the proof is a finite object, it does define an “algorithm” (the one that describes the properties of the function that are used in the proof) and such algorithm cannot be asymptotically efficient.

The other is that, however cleaner our theories are when formulated asymptotically, we should not lose sight of the fact that the ultimate goals of complexity theory are finite results. It will be a historic milestone when we prove that P $\neq$ NP by showing that SAT requires time at least $2^{-300} \cdot n^{(\log n) / 100 }$, but the real deal is to show that there is no circuit of size less than $2^{300}$ that solves SAT on all formulae having 10,000 clauses or fewer. The statements that we care about are indeed finite.

Russian humor

The January issue of the Notices of the AMS is out, and it contains an article by Anatoly Vershik on the Clay Millenium prize. You may remember a discussion we had on the wisdom of awards in general, in the context of the AMS idea of creating a fellows program and of the pros and cons of having a best paper award at STOC and FOCS.

Vershik returns to some of the standard points in this debate, and makes a few new ones. Although the tone of the article is completely serious, there are hints of deadpan humor (especially in the way he characterizes the opinions of others).

There is really no connection, but I was reminded of Closing the collapse gap, the hilarious presentation by Dmitry Orlov (found via the Peking Duck), in which he argues that when the U.S. economy will collapse, we will be much less prepared than the Russians were, and we will be in much deeper troubles. As stated in the comments at the Peking Duck, when Russians are deadly serious about something, they deal with it through dark humour.

Eigenvalues and expansion

Before trying to describe how one reasons about the eigenvalues of the adjacency matrix of a Cayley graph, I would like to describe the “easy direction” of the relationships between eigenvalues and expansion.

First, a one-minute summary of linear algebra. If A is an n\times n matrix, then a (column) non-zero vector x is an eigenvector of A provided that, for some scalar \lambda, we have Ax= \lambda x. If so, then \lambda is an eigenvalue of A. A couple of immediate observations:

  • The equation Ax=\lambda x can be rewritten as (A-\lambda I)x = 0, so if \lambda is an eigenvalue of A then the columns of the matrix A-\lambda I are not linearly independent, and so the determinant of A-\lambda I is zero. But we can write det(A-\lambda I) as a degree-n polynomial in the unknown \lambda, and a degree-n polynomial cannot have more than n roots.

    So, an n\times n matrix has at most n eigenvalues.

  • If \lambda is an eigenvalue of A, then the set of vectors x such that Ax=\lambda x is a linear space. If \lambda_1 and \lambda_2 are different eigenvalues, then their respective spaces of eigenvectors are independent (they only intersect at 0).

If A is a symmetric real-valued matrix, then a series of miracles (or obvious things, depending on whether you have studied this material or not) happen:

  • All the roots of the polynomial det(A-\lambda I) are real; so, counting multiplicities, A has n real eigenvalues.
  • If \lambda is an eigenvalue of A, and it is a root of multiplicity k of det(A-\lambda I), then the space of vectors x such that Ax=\lambda x has dimension k.
  • If \lambda_1 \neq \lambda_2 are two different eigenvalues, then their respective spaces of eigenvectors are orthogonal

From this fact, it is easy to conclude that starting from symmetric real matrix A we can find a sequence of n numbers

\lambda_1\geq \ldots \geq \lambda_n

and a sequence of n unit vectors v_1,\ldots,v_n such that the v_i are orthogonal to each other and such that A v_i = \lambda_i v_i.

Now the claim is that these n numbers and n vectors completely describe A. Indeed, suppose that we are given a vector x and that we want to compute Ax. First, we can write x as a linear combination of the v_i:

x= \alpha_1 v_1 + \cdots \alpha_n v_n

and since the v_i are an orthonormal basis, we know what the coefficients \alpha_i are like: each \alpha_i is just the inner product of x and v_i.

Now, by linearity, we have

Ax = \alpha_1 A v_1 + \cdots \alpha_n A v_n

and, since the v_i are eigenvectors,

= \alpha_1 \lambda_1 v_1 + \cdots + \alpha_n \lambda_n v_n

So, once we express vectors in the basis defined by the v_i, we see that multiplication by A is just the effect of multiplying coordinate i by the value \lambda_i.

But there is more. Recall that \alpha_i = v_i^T x and so we can also write

Ax = (\lambda_1 v_1 v_1^T) x + \cdots + (\lambda_n v_n v_n^T) x

and so, in particular, matrix A is just

A = \lambda_1 v_1 v_1^T + \cdots + \lambda_n v_n v_n^T

a sum of n rank-1 matrices, with the coefficients of the sum being the eigenvalues.

Let now A be the adjacency matrix of a d-regular undirected graph (so A is real-valued and symmetric), \lambda_1\geq \cdots \geq \lambda_n be its eigenvalues, and v_1,\ldots,v_n be a corresponding set of orthonormal eigenvectors.

It is not hard to see (no, really, it’s two lines, see the bottom of page 2 of these notes) that all eigenvalues are between d and -d. Also, d is always an eigenvalue, as shown by the eigenvector (1,\ldots,1). This means that \lambda_1 = d and we can take

v_1 = \left( \frac 1 {\sqrt n},\cdots, \frac 1 {\sqrt n} \right)

and we have

A = (d v_1 v_1^T) x + \cdots + (\lambda_n v_n v_n^T) x

Suppose now that all the other \lambda_i, i=2,\ldots,n are much smaller than d in absolute value. Then, intuitively, the sum is going to be dominated by the first term, and A behaves sort of like the matrix (d v_1 v_1^T), that is, the matrix that has the value \frac{d}{n} in each entry.

You may be skeptical that such an approximation makes sense, but see where this leads. If our graph is G=(V,E), and (S,V-S) is a cut of the set of vertices into two subsets, let 1_S be the n-dimensional vector that is 1 in coordinates corresponding to S and 0 in the other coordinates, and define 1_{V-S} similarly. Then the number of edges crossing the cut is precisely

1_S^T A 1_{V-S}

and if we replace A by the matrix that has \frac{d}{n} in each coordinate we get the approximation |S|\cdot|V-S|\cdot \frac {d}{n}, which is the expected number of edges that would cross the cut if we had a random graph of degree d. This is pretty much the right result: if |\lambda_2|,\ldots, |\lambda_n| are all much smaller than d, and if |S| and |V-S| are at least a constant times n, then the number of edges crossing the cut is indeed very close to |S|\cdot |V-S| \cdot \frac {d}{n}. This result is known as the Expander Mixing Lemma, and the proof is just a matter of doing the “obvious” calculations. (Write 1_S and 1_{V-S} in the eigenvector basis, distribute the product, remember that the v_i are orthogonal, use Cauchy-Schwarz …)

It is perhaps less intuitive that, provided that all other eigenvalues are at least a little bit smaller than d, then a reasonably large number of edges crosses the cut. In particular, if |S| \leq n/2 and if \lambda_2 \leq d-\epsilon, then at least \epsilon  |S|/2 edges cross the cut. That is, if all eigenvalues except the largest one are at most d-\epsilon then the edge expansion of the graph is at least \epsilon/2. (It takes about five lines to prove this, see the beginning of the proof of Thm 1 in these notes.)

To summarize: the adjacency matrix of an undirected graph is symmetric, and so all eigenvalues are real. If the graph is d-regular, then d is the largest eigenvalue, and all others are between d and -d. If all the others are between d-\epsilon and -d, then the graph has edge-expansion \frac{\epsilon}{2} (or better).

Group representation

(As will become clear soon, I am totally out of my depth: Readers who find glaring mistakes, and even small ones, should note them in the comments.)

In the previous post on expanders we left off with the observation that if we want to construct an expanding Cayley graph, then we need to start from a group where all n elements of the group can be obtained by a sum of O(log n) terms chosen, with repetitions, from a set of O(1) generators; this means that a constant root of the n^{O(1)} ways of rearranging such sums must lead to different results. The group, then, has to be non-commutative in a very strong sense.

I will describe a way in which a group used to construct expanders is “very non-commutative.” This will not be a sufficient property for the expansion, but it is closely related and it will come up in the paper of Gowers that we will talk about later.

With this objective in mind, let’s see the notion of group representation. Let’s start from the simplest group, the cyclic group of N elements Z_N with the operation of addition (or Z/NZ as mathematicians write). One can visualize this group as N points equally spaced along a circle in the plane, with element a forming an angle of 2\pi a/ N with the x axis. If we want to add an element that forms an angle \alpha to an element that forms an angle \beta, that’s the same as putting the “two angles next to each other” and we get a point that forms an angle \alpha+\beta. More algebraically, we identify an element a with the complex number e^{2\pi a i/n} and we realize group addition as multiplication of complex numbers. There is, however, another way of visualizing the cyclic group on a cycle. We can think of group element a as the operation that takes a point on the cycle and moves it by an angle of 2\pi a/N. Addition of group elements now corresponds to “composition,” that is, corresponds to applying one rotation after the other.

In this view, element a is now the function that maps complex number x into complex number x\cdot e^{2\pi a i/N}.

Suppose now that our group is a product Z_{N_1} \times \cdots \times Z_{N_k} of cyclic groups. This means that a group elements if a k-tuple of the form (a_1,\ldots,a_d), and that

(a_1,\ldots,a_d) + (b_1,\ldots,b_d) = (a_1+b_1 \bmod N_1,\ldots,a_k+b_k \bmod N_k)

It now makes sense to view a group element (a_1,\ldots,a_d) as a “high-dimensional rotation” operation that takes in input k complex numbers (x_1,\ldots,x_k) and outputs the k complex numbers

(x_1 \cdot e^{2\pi a_1 i /N_1},\ldots, x_k \cdot e^{2\pi a_k i /N_k})

If we take this view, we have, again, the property that group addition becomes composition of functions. Note, also, that the function that we have associated to each group element is a very simple type of linear function: it is simply multiplication of x times a diagonal matrix that has diagonal

(e^{2\pi a_1 i /N_1},\ldots, e^{2\pi a_k i /N_k})

Notice, also, that if f is a function of the form f(x) = A\cdot x, where A is a matrix, and g is a function of the form B\cdot x where B is a matrix, then f(g(x))= A\cdot B\cdot x. That is, for linear functions, function composition is the same as matrix multiplication.

To summarize, we have started from the group Z_{N_1} \times \cdots \times Z_{N_k}, and we have found a way to associate a complex-valued diagonal matrix to each group element in such a way that group addition becomes matrix multiplication.

It is known that all finite Abelian groups can be written as Z_{N_1} \times \cdots \times Z_{N_k}, so this type of representation via diagonal matrices is possible for every finite Abelian group.

What about more general groups? If G is an arbitrary finite group, it is possible to associate to each element a block-diagonal matrix in such a way that group addition becomes matrix multiplication.

(It is common to call a group operation “multiplication” when a group is not Abelian, but for consistency with the rest of the post I will keep calling it addition.)

(By the way, it is also possible to represent infinite groups by associating a linear operator to each group element, but we will only discuss finite groups here.)

If the group is Abelian, then the matrices are diagonal, that is, they are block-diagonal matrices with “block size one.” So one way of quantifying “how non-Abelian” is a group is to consider how small the blocks can be in such matrix representations. That’s the dimension of the representation.

Here is an example of a family of groups whose representations cannot be low-dimensional. Let p be a prime (it could also be a prime power) and let us consider 2 \times 2 matrices whose entries are integers \bmod p. Let us restrict ourselves to matrices whose determinant is 1 modulo p, and consider the operation of matrix multiplication (where, also, all operations are mod p). This set of matrices forms a group, because the matrices are invertible (the determinant is non-zero) and the set contains the identity matrix and is closed under product. This group is called SL(2,p). The group SL(2,p) contains the tiny subgroup of two elements \{ I,-I\}; in the representation of SL(2,p) this shows up as a block of size 1. If we take the quotient of SL(2,p) by \{ I,-I \} then we get another group, which is called PSL(2,p).

It is now a theorem of Frobenius that every representation of PSL(2,p) has dimension at least (p-1)/2. This is really large compared to the size of PSL(2,p): the group SL(2,p) has p(p-1)^2 elements, and so PSL(2,p) has p(p-1)^2/2 elements. The dimension of the representation is thus approximately the cube root of the number of elements of the group.

Going back to representations of Abelian groups, we see that, in that case, not only each block had size one, but also that the entire matrix had size at most logarithmic in the number of elements of the group. This shows that SL(2,p) and PSL(2,p) are, at least in this particular sense, “very non-Abelian,” and it is an encouraging sign that they may be useful in constructing expanders and other quasi-random objects.

On his way toward a question about subsets of groups not containing triples of the form \{ a, b, a+b\}, Gowers shows that every dense Cayley graph constructed on PSL(2,p) has constant edge expansion. (By dense, we mean that the degree is \Omega(n), where n is the number of vertices.) This is a result that, assuming Frobenius theorem, has a simple proof, which I hope to understand and describe at a later point.

The celebrated Ramanujan graphs of Lubotzky, Phillips and Sarnak are constant-degree Cayley graphs constructed from PGL(2,p), which is the same as PSL(2,p) except that we put no restriction on the determinant. Their relationship between degree and eigenvalue gap is best possible. The analysis of Lubotzky et al. is, unfortunately, completely out of reach.