Eigenvalues and expansion

Before trying to describe how one reasons about the eigenvalues of the adjacency matrix of a Cayley graph, I would like to describe the “easy direction” of the relationships between eigenvalues and expansion.

First, a one-minute summary of linear algebra. If A is an n\times n matrix, then a (column) non-zero vector x is an eigenvector of A provided that, for some scalar \lambda, we have Ax= \lambda x. If so, then \lambda is an eigenvalue of A. A couple of immediate observations:

  • The equation Ax=\lambda x can be rewritten as (A-\lambda I)x = 0, so if \lambda is an eigenvalue of A then the columns of the matrix A-\lambda I are not linearly independent, and so the determinant of A-\lambda I is zero. But we can write det(A-\lambda I) as a degree-n polynomial in the unknown \lambda, and a degree-n polynomial cannot have more than n roots.

    So, an n\times n matrix has at most n eigenvalues.

  • If \lambda is an eigenvalue of A, then the set of vectors x such that Ax=\lambda x is a linear space. If \lambda_1 and \lambda_2 are different eigenvalues, then their respective spaces of eigenvectors are independent (they only intersect at 0).

If A is a symmetric real-valued matrix, then a series of miracles (or obvious things, depending on whether you have studied this material or not) happen:

  • All the roots of the polynomial det(A-\lambda I) are real; so, counting multiplicities, A has n real eigenvalues.
  • If \lambda is an eigenvalue of A, and it is a root of multiplicity k of det(A-\lambda I), then the space of vectors x such that Ax=\lambda x has dimension k.
  • If \lambda_1 \neq \lambda_2 are two different eigenvalues, then their respective spaces of eigenvectors are orthogonal

From this fact, it is easy to conclude that starting from symmetric real matrix A we can find a sequence of n numbers

\lambda_1\geq \ldots \geq \lambda_n

and a sequence of n unit vectors v_1,\ldots,v_n such that the v_i are orthogonal to each other and such that A v_i = \lambda_i v_i.

Now the claim is that these n numbers and n vectors completely describe A. Indeed, suppose that we are given a vector x and that we want to compute Ax. First, we can write x as a linear combination of the v_i:

x= \alpha_1 v_1 + \cdots \alpha_n v_n

and since the v_i are an orthonormal basis, we know what the coefficients \alpha_i are like: each \alpha_i is just the inner product of x and v_i.

Now, by linearity, we have

Ax = \alpha_1 A v_1 + \cdots \alpha_n A v_n

and, since the v_i are eigenvectors,

= \alpha_1 \lambda_1 v_1 + \cdots + \alpha_n \lambda_n v_n

So, once we express vectors in the basis defined by the v_i, we see that multiplication by A is just the effect of multiplying coordinate i by the value \lambda_i.

But there is more. Recall that \alpha_i = v_i^T x and so we can also write

Ax = (\lambda_1 v_1 v_1^T) x + \cdots + (\lambda_n v_n v_n^T) x

and so, in particular, matrix A is just

A = \lambda_1 v_1 v_1^T + \cdots + \lambda_n v_n v_n^T

a sum of n rank-1 matrices, with the coefficients of the sum being the eigenvalues.

Let now A be the adjacency matrix of a d-regular undirected graph (so A is real-valued and symmetric), \lambda_1\geq \cdots \geq \lambda_n be its eigenvalues, and v_1,\ldots,v_n be a corresponding set of orthonormal eigenvectors.

It is not hard to see (no, really, it’s two lines, see the bottom of page 2 of these notes) that all eigenvalues are between d and -d. Also, d is always an eigenvalue, as shown by the eigenvector (1,\ldots,1). This means that \lambda_1 = d and we can take

v_1 = \left( \frac 1 {\sqrt n},\cdots, \frac 1 {\sqrt n} \right)

and we have

A = (d v_1 v_1^T) x + \cdots + (\lambda_n v_n v_n^T) x

Suppose now that all the other \lambda_i, i=2,\ldots,n are much smaller than d in absolute value. Then, intuitively, the sum is going to be dominated by the first term, and A behaves sort of like the matrix (d v_1 v_1^T), that is, the matrix that has the value \frac{d}{n} in each entry.

You may be skeptical that such an approximation makes sense, but see where this leads. If our graph is G=(V,E), and (S,V-S) is a cut of the set of vertices into two subsets, let 1_S be the n-dimensional vector that is 1 in coordinates corresponding to S and 0 in the other coordinates, and define 1_{V-S} similarly. Then the number of edges crossing the cut is precisely

1_S^T A 1_{V-S}

and if we replace A by the matrix that has \frac{d}{n} in each coordinate we get the approximation |S|\cdot|V-S|\cdot \frac {d}{n}, which is the expected number of edges that would cross the cut if we had a random graph of degree d. This is pretty much the right result: if |\lambda_2|,\ldots, |\lambda_n| are all much smaller than d, and if |S| and |V-S| are at least a constant times n, then the number of edges crossing the cut is indeed very close to |S|\cdot |V-S| \cdot \frac {d}{n}. This result is known as the Expander Mixing Lemma, and the proof is just a matter of doing the “obvious” calculations. (Write 1_S and 1_{V-S} in the eigenvector basis, distribute the product, remember that the v_i are orthogonal, use Cauchy-Schwarz …)

It is perhaps less intuitive that, provided that all other eigenvalues are at least a little bit smaller than d, then a reasonably large number of edges crosses the cut. In particular, if |S| \leq n/2 and if \lambda_2 \leq d-\epsilon, then at least \epsilon  |S|/2 edges cross the cut. That is, if all eigenvalues except the largest one are at most d-\epsilon then the edge expansion of the graph is at least \epsilon/2. (It takes about five lines to prove this, see the beginning of the proof of Thm 1 in these notes.)

To summarize: the adjacency matrix of an undirected graph is symmetric, and so all eigenvalues are real. If the graph is d-regular, then d is the largest eigenvalue, and all others are between d and -d. If all the others are between d-\epsilon and -d, then the graph has edge-expansion \frac{\epsilon}{2} (or better).

About these ads

3 thoughts on “Eigenvalues and expansion

  1. One minor correction,
    Luca:”…, and $A$ behaves sort of like the matrix $(d v_i^T v_i)$, that is, the matrix …”

    $v_i$ should be $v_1$.

    Great job, Luca. We are awaiting the rest of this series of posts.

  2. I greatly enjoy your technical posts – they’re both intuitive and informative. Looking forward to more of them…

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s