In which we finish the proof of Cheeger’s inequalities.
It remains to prove the following statement.
In which we generalize the notion of normalized Laplacian to irregular graphs, we extend the basic spectral graph theory results from last lecture to irregular graphs, and we prove the easy direction of Cheeger’s inequalities.
1. Irregular Graphs
Let G = (V, E) be an undirected graph, not necessarily regular. We will assume that every vertex has non-zero degree. We would like to define a normalized Laplacian matrix associated to G so that the properties we proved last time remain true: that the multiplicity of 0 as an eigenvalue is equal to the number of connected components of G, that the largest eigenvalue is at most 2, and that it is 2 if and only if (a connected component of) the graph is bipartite.
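As a quick numerical sanity check of these three properties, here is a small sketch, assuming the standard definition of the normalized Laplacian of an irregular graph as L = I − D^{−1/2} A D^{−1/2}, where D is the diagonal matrix of degrees. The example graph is a path, which is connected, irregular, and bipartite, so 0 should be an eigenvalue of multiplicity 1 and the largest eigenvalue should be exactly 2:

```python
import numpy as np

# Path graph on 4 vertices: irregular (degrees 1, 2, 2, 1), connected, bipartite.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
deg = A.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))

# Normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.
L = np.eye(4) - D_inv_sqrt @ A @ D_inv_sqrt
evals = np.linalg.eigvalsh(L)  # eigenvalues in ascending order

print(np.round(evals, 6))
```

For the 4-vertex path the spectrum works out to 0, 1/2, 3/2, 2, matching all three properties: 0 appears once (one connected component), everything is at most 2, and 2 is attained because the path is bipartite.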
In which we introduce the Laplacian matrix and we prove our first results in spectral graph theory.
1. The Basics of Spectral Graph Theory
Given an undirected graph G, the approach of spectral graph theory is to associate a symmetric real-valued matrix to G, and to relate the eigenvalues of the matrix to combinatorial properties of G.
For the sake of this lecture, we will restrict ourselves to the case in which G is a d-regular graph, and we will then see how to extend our results to apply to irregular graphs as well.
The most natural matrix to associate to G is the adjacency matrix A, such that A_{ij} = 1 if {i, j} is an edge and A_{ij} = 0 otherwise. In the second part of the course, in which we will study expander graphs, the adjacency matrix will indeed be the most convenient matrix to work with. For the sake of the algorithms that we will analyze in the first part of the course, however, a slight variation called the normalized Laplacian is more convenient.
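As a sketch of why the two matrices carry the same information in the regular case (assuming the usual definition L = I − (1/d)A of the normalized Laplacian of a d-regular graph), the spectra of A and L determine each other via the map λ ↦ 1 − λ/d; the 4-cycle makes the correspondence concrete:

```python
import numpy as np

# Adjacency matrix of the 4-cycle, a 2-regular graph.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
d = 2

L = np.eye(4) - A / d  # normalized Laplacian of a d-regular graph

adj_evals = np.sort(np.linalg.eigvalsh(A))  # -2, 0, 0, 2
lap_evals = np.sort(np.linalg.eigvalsh(L))  # 0, 1, 1, 2

# The two spectra are related by lambda -> 1 - lambda/d.
print(adj_evals, lap_evals)
```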
In which we describe what this course is about.
This class is about applications of linear algebra to graph theory and to graph algorithms. In the finite-dimensional case, linear algebra deals with vectors and matrices, and with a number of useful concepts and algorithms, such as determinants, eigenvalues, eigenvectors, and solutions to systems of linear equations.
The application to graph theory and graph algorithms comes from associating, in a natural way, a matrix to a graph G, and then interpreting the above concepts and algorithms in graph-theoretic language. The most natural representation of a graph as a matrix is via the adjacency matrix, and certain related matrices, such as the Laplacian and the normalized Laplacian, will be our main focus. We can think of n-dimensional Boolean vectors as representing a partition of the vertices, that is, a cut in the graph, and we can think of arbitrary vectors as fractional cuts. From this point of view, eigenvalues are the optima of continuous relaxations of certain cut problems, the corresponding eigenvectors are optimal solutions, and connections between spectrum and cut structure are given by rounding algorithms that convert fractional solutions into integral ones. Flow problems are dual to cut problems, so one would expect linear algebraic techniques to be helpful in finding flows in networks: this is the case, via the theory of electrical flows, which can be found as solutions to linear systems.
The course can be roughly subdivided into three parts: in the first part of the course we will study spectral graph algorithms, that is, graph algorithms that make use of eigenvalues and eigenvectors of the normalized Laplacian of the given graph. In the second part of the course we will look at constructions of expander graphs, and their applications. In the third part of the course, we will look at fast algorithms for solving systems of linear equations of the form Lx = b, where L is the Laplacian of a graph, their applications to finding electrical flows, and the applications of electrical flows to solving the max flow problem.
In which we review linear algebra prerequisites.
The following background from linear algebra will be sufficient for the sake of this course: to know what eigenvalues and eigenvectors are, to know that real symmetric matrices have real eigenvalues and admit an orthonormal basis of real eigenvectors, and to know the variational characterization of eigenvalues.
1. Basic Definitions
If z is a complex number, then we let z* denote its conjugate. Note that a complex number z is real if and only if z = z*. If M is a matrix, then M* denotes the conjugate transpose of M, that is, (M*)_{ij} = (M_{ji})*. If the entries of M are real, then M* = M^T, where M^T is the transpose of M, that is, the matrix such that (M^T)_{ij} = M_{ji}.
We say that a matrix M is Hermitian if M = M*. In particular, real symmetric matrices are Hermitian.
If x, y are two vectors, then their inner product is defined as

⟨x, y⟩ := x* y = Σ_i (x_i)* y_i

Notice that, by definition, we have ⟨x, y⟩ = (⟨y, x⟩)* and ⟨x, x⟩ = ||x||². Note also that, for two matrices A, B, we have (AB)* = B* A*, and that for every matrix M and every two vectors x, y, we have

⟨M x, y⟩ = x* M* y = ⟨x, M* y⟩
If M is a square matrix, λ is a scalar, x is a non-zero vector and we have

M x = λ x

then we say that λ is an eigenvalue of M and that x is an eigenvector of M corresponding to the eigenvalue λ.
2. The Spectral Theorem
We want to prove
Theorem 1 (Spectral Theorem) Let M be an n×n symmetric matrix with real-valued entries; then there are n real numbers λ_1, ..., λ_n (not necessarily distinct) and n orthonormal real vectors x_1, ..., x_n such that x_i is an eigenvector of λ_i.
Assuming the fundamental theorem of algebra (that every non-constant polynomial has a complex root) and basic properties of the determinant, the cleanest proof of the spectral theorem is to proceed by induction on n, to show that M must have a real eigenvalue λ_1 with a real eigenvector x_1, and to show that M maps vectors orthogonal to x_1 to vectors orthogonal to x_1. Then one applies the inductive hypothesis to M restricted to the (n−1)-dimensional space of vectors orthogonal to x_1 and one recovers the remaining eigenvalues and eigenvectors.
The cleanest way to formalize the above proof is to give all definitions and results in terms of linear operators M : V → V, where V is an arbitrary vector space over the reals. This way, however, we would be giving several definitions that we would never use in the future, so, instead, the inductive proof will use a somewhat inelegant change of basis to pass from an n×n matrix to an (n−1)×(n−1) matrix.
We begin by showing that a real symmetric matrix has real eigenvalues and eigenvectors.
We first note that every matrix has a complex eigenvalue.
Lemma 3 For every n×n matrix M, there is an eigenvalue λ ∈ C and an eigenvector x ∈ C^n, x ≠ 0, such that M x = λ x.
Proof: Note that λ is an eigenvalue for M if and only if the system of equations

(M − λ I) x = 0

has a non-zero solution x, which is true if and only if the rows of M − λ I are not linearly independent, which is true if and only if

det(M − λ I) = 0

Now note that the mapping λ ↦ det(M − λ I) is a univariate polynomial of degree n in λ, and so it must have a root by the fundamental theorem of algebra.
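Numerically, this is the statement that the roots of the characteristic polynomial are exactly the eigenvalues; numpy can form the characteristic polynomial of a matrix and find its complex roots. The rotation matrix below is a real example whose eigenvalues ±i are complex, as the lemma allows:

```python
import numpy as np

# A real matrix with no real eigenvalues: rotation by 90 degrees.
M = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# Coefficients of the characteristic polynomial det(lambda*I - M) = lambda^2 + 1.
coeffs = np.poly(M)

# Its complex roots coincide with the eigenvalues of M.
roots = np.sort_complex(np.roots(coeffs))
evals = np.sort_complex(np.linalg.eigvals(M))

print(coeffs, roots, evals)
```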
Next we show that if M is real and symmetric, then its eigenvalues are real.
Lemma 4 If M is Hermitian, then, for every x and y,

⟨M x, y⟩ = ⟨x, M y⟩
Lemma 5 If M is Hermitian, then all the eigenvalues of M are real.
Proof: Let M be a Hermitian matrix and let λ be a scalar and x a non-zero vector such that M x = λ x. We will show that λ = λ*, which implies that λ is a real number.

We note that

λ* ⟨x, x⟩ = ⟨λ x, x⟩ = ⟨M x, x⟩ = ⟨x, M x⟩ = ⟨x, λ x⟩ = λ ⟨x, x⟩

and by the fact that x ≠ 0, and so ⟨x, x⟩ ≠ 0, we have λ* = λ.
In order to prove Theorem 2, it remains to argue that, for a real eigenvalue λ of a real symmetric matrix M, we can find a real eigenvector.
Proof: Let M be a real symmetric matrix; then M has a real eigenvalue λ and a (possibly complex-valued) eigenvector x = a + ib, where a and b are real vectors. We have

M a + i M b = λ a + i λ b

from which (recalling that the entries of M and the scalar λ are real) it follows that M a = λ a and that M b = λ b; since a and b cannot both be zero, it follows that λ has a real eigenvector.
We are now ready to prove the spectral theorem.
Proof: We proceed by induction on n. The case n = 1 is trivial.
Assume that the statement is true for dimension n−1. Let λ_1 be a real eigenvalue of M and x_1 be a real eigenvector of λ_1.

Now we claim that, for every vector x that is orthogonal to x_1, M x is also orthogonal to x_1. Indeed, the inner product of M x and x_1 is

⟨x_1, M x⟩ = ⟨M x_1, x⟩ = λ_1 ⟨x_1, x⟩ = 0
Let V be the (n−1)-dimensional subspace of R^n that contains all the vectors orthogonal to x_1. We want to apply the inductive hypothesis to M restricted to V; we cannot literally do that, because the theorem is not stated in terms of arbitrary linear operators over vector spaces, so we will need to do that by fixing an appropriate basis for V.
Let B be an n×(n−1) matrix that computes a bijective map from R^{n−1} to V. (If b_1, ..., b_{n−1} is an orthonormal basis for V, then B is just the matrix whose columns are the vectors b_i.) Let also B' be the (n−1)×n matrix such that, for every x ∈ V, B B' x = x. (We can set B' = B^T, where B is as described above.) We apply the inductive hypothesis to the (n−1)×(n−1) matrix

M' := B^T M B

and we find eigenvalues λ_2, ..., λ_n and orthonormal eigenvectors y_2, ..., y_n for M'.
For every 2 ≤ i ≤ n, we have

B^T M B y_i = λ_i y_i and so B B^T M B y_i = λ_i B y_i

Since B y_i is orthogonal to x_1, it follows that M B y_i is also orthogonal to x_1, and so B B^T M B y_i = M B y_i, so we have

M B y_i = λ_i B y_i

and, defining x_i := B y_i, we have

M x_i = λ_i x_i

Finally, we observe that the vectors x_1, ..., x_n are orthogonal. By construction, each x_i with i ≥ 2 is orthogonal to x_1, and, for every 2 ≤ i < j ≤ n, we have that

⟨x_i, x_j⟩ = ⟨B y_i, B y_j⟩ = ⟨y_i, y_j⟩ = 0

because the columns of B are orthonormal.
We have not verified that the vectors x_i have norm 1 (which is true, because B preserves norms), but in any case we could scale them to have norm 1.
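Numerically, the conclusion of the spectral theorem is what numpy.linalg.eigh computes for a real symmetric matrix: real eigenvalues and an orthonormal set of real eigenvectors, which also give the diagonalization M = V diag(λ) V^T. A small sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
M = (B + B.T) / 2  # a random real symmetric matrix

evals, V = np.linalg.eigh(M)  # columns of V are orthonormal eigenvectors

# Orthonormality: V^T V = I; diagonalization: M = V diag(evals) V^T.
print(np.allclose(V.T @ V, np.eye(5)))
print(np.allclose(V @ np.diag(evals) @ V.T, M))
```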
3. Variational Characterization of Eigenvalues
We conclude these notes with the variational characterization of eigenvalues for real symmetric matrices.
Theorem 6 Let M be an n×n symmetric matrix, and λ_1 ≤ λ_2 ≤ ... ≤ λ_n be the eigenvalues of M in non-decreasing order. Then

λ_k = min_{S : dim(S) = k} max_{x ∈ S, x ≠ 0} (x^T M x) / (x^T x)

The quantity (x^T M x) / (x^T x) is called the Rayleigh quotient of M with respect to x, and we will denote it by R_M(x).
Proof: Let v_1, ..., v_n be orthonormal eigenvectors of the eigenvalues λ_1, ..., λ_n, as promised by the spectral theorem. Consider the k-dimensional space spanned by v_1, ..., v_k. For every vector x = Σ_{i=1..k} a_i v_i in such a space, the numerator of the Rayleigh quotient is

Σ_i a_i² λ_i ≤ λ_k Σ_i a_i²

The denominator is clearly Σ_i a_i², and so R_M(x) ≤ λ_k. This shows that

min_{S : dim(S) = k} max_{x ∈ S, x ≠ 0} R_M(x) ≤ λ_k
For the other direction, let S be any k-dimensional space: we will show that S must contain a vector of Rayleigh quotient at least λ_k. Let T be the span of v_k, ..., v_n; since T has dimension n − k + 1 and S has dimension k, they must have some non-zero vector in common. Let x be one such vector, and let us write x = Σ_{i=k..n} a_i v_i. The numerator of the Rayleigh quotient of x is

Σ_i a_i² λ_i ≥ λ_k Σ_i a_i²

and the denominator is Σ_i a_i², so R_M(x) ≥ λ_k.
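A quick numerical illustration of the extreme cases of the theorem: the Rayleigh quotient of any non-zero vector lies between the smallest and largest eigenvalues, and the corresponding eigenvectors attain the two bounds (the random symmetric matrix below is just a stand-in test case):

```python
import numpy as np

def rayleigh(M, x):
    """Rayleigh quotient x^T M x / x^T x."""
    return (x @ M @ x) / (x @ x)

rng = np.random.default_rng(1)
B = rng.standard_normal((6, 6))
M = (B + B.T) / 2  # random real symmetric matrix

evals, V = np.linalg.eigh(M)  # eigenvalues in ascending order

# Rayleigh quotients of random vectors stay between the extreme eigenvalues,
# and the extreme eigenvectors attain them.
quotients = [rayleigh(M, rng.standard_normal(6)) for _ in range(1000)]
print(evals[0] <= min(quotients), max(quotients) <= evals[-1])
print(np.isclose(rayleigh(M, V[:, 0]), evals[0]),
      np.isclose(rayleigh(M, V[:, -1]), evals[-1]))
```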
We have the following easy consequence.

Fact 7 If λ_1 is the smallest eigenvalue of a real symmetric matrix M, then

λ_1 = min_{x ≠ 0} R_M(x)

Furthermore, every minimizer is an eigenvector of λ_1.

Proof: The identity is the case k = 1 of the previous theorem. For the furthermore part, let λ_1 ≤ ... ≤ λ_n be the list of eigenvalues of M in non-decreasing order, and v_1, ..., v_n be corresponding orthonormal eigenvectors. If x = Σ_i a_i v_i is any vector, then

R_M(x) = (Σ_i a_i² λ_i) / (Σ_i a_i²)

If R_M(x) = λ_1, then a_i = 0 for every i such that λ_i > λ_1, that is, x is a linear combination of eigenvectors of λ_1, and hence it is an eigenvector of λ_1.
Fact 8 If λ_n is the largest eigenvalue of a real symmetric matrix M, then

λ_n = max_{x ≠ 0} R_M(x)

Furthermore, every maximizer is an eigenvector of λ_n.
Proof: Apply Fact 7 to the matrix −M.
Fact 9 If λ_2 is the second smallest eigenvalue of a real symmetric matrix M, and x_1 is an eigenvector of the smallest eigenvalue λ_1, then

λ_2 = min_{x ≠ 0, x ⊥ x_1} R_M(x)

Furthermore, every minimizer is an eigenvector of λ_2.
Proof: A more conceptual proof would be to consider the restriction of M to the space orthogonal to x_1, and then apply Fact 7 to such a linear operator. But, since we have not developed the theory for general linear operators, we would need to explicitly reduce to an (n−1)-dimensional case via a projection operator, as in the proof of the spectral theorem.
Instead, we will give a more hands-on proof. Let λ_1 ≤ λ_2 ≤ ... ≤ λ_n be the list of eigenvalues of M, with multiplicities, and v_1, ..., v_n be orthonormal vectors as given by the spectral theorem. We may assume that v_1 = x_1, possibly by changing the orthonormal basis of the eigenspace of λ_1. For every vector x = Σ_{i=2..n} a_i v_i orthogonal to x_1, its Rayleigh quotient is

R_M(x) = (Σ_{i ≥ 2} a_i² λ_i) / (Σ_{i ≥ 2} a_i²) ≥ λ_2

and the minimum is achieved by vectors x such that a_i = 0 for every λ_i > λ_2, that is, for vectors which are linear combinations of the eigenvectors of λ_2, and so every minimizer is an eigenvector of λ_2.
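The statement of Fact 9 can also be checked numerically: after projecting random vectors onto the orthogonal complement of v_1, every Rayleigh quotient is at least the second smallest eigenvalue, and the second eigenvector attains it (again with a random symmetric matrix as a stand-in):

```python
import numpy as np

def rayleigh(M, x):
    return (x @ M @ x) / (x @ x)

rng = np.random.default_rng(2)
B = rng.standard_normal((6, 6))
M = (B + B.T) / 2  # random real symmetric matrix

evals, V = np.linalg.eigh(M)  # ascending order: evals[0] is the smallest
v1, v2 = V[:, 0], V[:, 1]

# Rayleigh quotients of vectors orthogonal to v1 are all at least lambda_2,
# and the eigenvector v2 attains the minimum.
quotients = []
for _ in range(1000):
    x = rng.standard_normal(6)
    x = x - (x @ v1) * v1  # project away the v1 component
    quotients.append(rayleigh(M, x))

print(min(quotients) >= evals[1] - 1e-9, np.isclose(rayleigh(M, v2), evals[1]))
```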
This semester, starting tomorrow, I am teaching a course on spectral methods and expanders. This is similar to a course I offered twice at Stanford, but this time it will be a 15-week course instead of a 10-week one.
The Stanford course had two main components: (1) spectral algorithms for sparsest cut, and comparisons with LP and SDP based methods, and (2) properties and constructions of expanders.
I will use the additional time to talk a bit more about spectral algorithms, including clustering algorithms, and about constructions of expanders, and to add a third part about electrical networks, sparsification, and max flow.
Lecture notes will be posted here after each lecture.
In some more detail, the course will start with a review of linear algebra and a proof of basic spectral graph theory facts, such as the multiplicity of 0 as an eigenvalue of the Laplacian being the same as the number of connected components of a graph.
Then we will introduce expansion and conductance, and prove Cheeger’s inequality. We will do so in the language of approximation algorithms, and we will see how the analysis of Fiedler’s algorithm given by Cheeger’s inequality compares to the Leighton-Rao analysis of the LP relaxation and the Arora-Rao-Vazirani analysis of the SDP relaxation. Then we will prove several variants of Cheeger’s inequality, interpreting them as analyses of spectral algorithms for clustering and max cut.
In the second part of the course, we will see properties of expanders and combinatorial and algebraic constructions of expanders. We will talk about the theory that gives eigenvalues and eigenvectors of Abelian Cayley graphs, the zig-zag graph product, and the Margulis-Gabber-Galil construction. I would also like to talk about the expansion of random graphs, and to explain how one gets expander constructions from Selberg’s “3/16 theorem,” although I am not sure if there will be time for that.
The first two parts will be tied together by looking at the MCMC algorithm to approximate the number of perfect matchings in a dense bipartite graph. The analysis of the algorithm depends on the mixing time of a certain exponentially big graph, the mixing time will be determined (as shown in a previous lecture on properties of expanders) by the eigenvalue gap, the eigenvalue gap will be determined (as shown by Cheeger’s inequality) by the conductance, and the conductance can be bounded by constructing certain multicommodity flows (as shown in the analysis of the Leighton-Rao algorithms).
In the third part, we will talk about electrical networks, effective resistance and electrical flows, see how to get sparsifiers using effective resistance, sketch how to solve Laplacian equations in nearly linear time, and see how to approximate max flow using electrical flows.
Suppose that we want to construct a very good family of d-regular expander graphs. The Alon-Boppana theorem says that the best we can hope for, from the point of view of spectral expansion, is to have second eigenvalue about 2√(d−1), and we would certainly be very happy with a family of graphs in which λ_2 ≤ O(√d).
Known constructions of expanders produce Cayley graphs (or sometimes Schreier graphs, which is a related notion), because it is easier to analyze the spectra of such graphs. If G is a group with operation · and a^{−1} is the inverse of element a, and S is a symmetric set of generators (meaning that s ∈ S if and only if s^{−1} ∈ S), then the Cayley graph Cay(G, S) is the graph whose vertices are the elements of G and whose edges are the pairs (a, b) such that b = a·s for some s ∈ S.
When the group is Abelian, there is good news and bad news. The good news is that the eigenvectors of such graphs are completely characterized (they are the characters of G) and the eigenvalues are given by a nice formula. (See here and here.) The bad news is that constant-degree Cayley graphs of Abelian groups cannot be expanders.
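To make the good news concrete in the simplest case: for the cyclic group Z_n the characters are x ↦ e^{2πikx/n}, so the eigenvalues of Cay(Z_n, S) are λ_k = Σ_{s ∈ S} e^{2πiks/n}. A sketch checking this formula against a direct computation for the cycle Cay(Z_8, {+1, −1}):

```python
import numpy as np

n, S = 8, [1, 7]  # Cay(Z_8, {+1, -1}) is the 8-cycle

# Adjacency matrix of the Cayley graph.
A = np.zeros((n, n))
for a in range(n):
    for s in S:
        A[a, (a + s) % n] = 1

# Eigenvalues from the character formula: sum over s in S of e^{2 pi i k s / n}.
ks = np.arange(n)
formula = np.array([sum(np.exp(2j * np.pi * k * s / n) for s in S) for k in ks])
formula = np.sort(formula.real)  # imaginary parts cancel since S is symmetric

direct = np.sort(np.linalg.eigvalsh(A))
print(np.allclose(formula, direct))
```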
That’s very bad news, but it is still possible to get highly expanding graphs of polylogarithmic degree as Cayley graphs of Abelian groups.
Here we will look at the extreme case of a family of graphs of degree (n−1)/2, where n is the number of vertices. Even with such a high degree, the weak version of the Alon-Boppana theorem still implies that the second largest (in absolute value) eigenvalue must be Ω(√n), and so we will be happy if we get a graph in which it is O(√n). Highly expanding graphs of degree about n/2 are interesting because they have some of the properties of random graphs from the G_{n,1/2} distribution. In turn, graphs from G_{n,1/2} have all kinds of interesting properties with high probability, including being essentially the best known Ramsey graphs and having the kind of discrepancy property that gives seedless extractors for two independent sources. Unfortunately, these stronger properties cannot be certified by spectral methods. The graph that we will study today is believed to have such stronger properties, but there is no known promising approach to prove such conjectures, so we will content ourselves with proving strong spectral expansion.
The graph is the Paley graph. If p is a prime such that p ≡ 1 (mod 4), Z_p is the group of addition modulo p, and S is the set of non-zero elements of Z_p of the form r² mod p, then the graph is just Cay(Z_p, S). That is, the graph has a vertex v for each v ∈ Z_p, and two vertices a, b are adjacent if and only if there is an r such that a − b ≡ r² (mod p). (The condition p ≡ 1 (mod 4) guarantees that −1 is a square mod p, so that S is symmetric.)
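A sketch constructing the Paley graph for p = 13 and computing its spectrum; it is a classical fact that, besides the degree (p−1)/2, the only eigenvalues are (−1 ± √p)/2, so the second largest eigenvalue in absolute value is (1 + √p)/2, of the order of √n as hoped:

```python
import numpy as np

p = 13  # prime with p = 1 mod 4, so the set of squares is symmetric
squares = {(r * r) % p for r in range(1, p)}

# Adjacency matrix: a ~ b iff a - b is a non-zero square mod p.
A = np.zeros((p, p))
for a in range(p):
    for s in squares:
        A[a, (a + s) % p] = 1

evals = np.sort(np.linalg.eigvalsh(A))
d = (p - 1) // 2

# Largest eigenvalue is the degree; all others are (-1 +- sqrt(p)) / 2.
print(np.round(evals, 4))
```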
Let G be a d-regular graph, and let

λ_1 ≥ λ_2 ≥ ... ≥ λ_n

be the eigenvalues of the adjacency matrix of G, counted with multiplicities and sorted in descending order.

How good can the spectral expansion of G be?
1. Simple Bounds
The simplest bound comes from a trace argument. We have

tr(A²) = Σ_i λ_i²

by using one definition of the trace and

tr(A²) = Σ_v (A²)_{vv} ≥ n d

using the other definition and observing that (A²)_{vv} counts the walks that go from v to v in two steps, of which there are at least d: follow an edge to a neighbor of v, then follow the same edge back. (There could be more if G has multiple edges or self-loops.)

So we have

λ_2² + ... + λ_n² ≥ n d − d²

and hence max{λ_2, |λ_n|} ≥ √(d (n − d) / (n − 1)).
A condition of the form n ≥ (1 + ε) d is necessary to get lower bounds of the order of √d; in the clique, for example, we have d = n − 1 and λ_2 = ... = λ_n = −1.
A trace argument does not give us a lower bound on λ_2, and in fact it is possible to have λ_2 = 0 and λ_n = −d, for example in the complete bipartite graph K_{d,d}.
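A small check of the trace computation on the 6-cycle (2-regular, n = 6), where tr(A²) computed either way equals nd:

```python
import numpy as np

# 6-cycle: 2-regular on n = 6 vertices.
n, d = 6, 2
A = np.zeros((n, n))
for v in range(n):
    A[v, (v + 1) % n] = A[v, (v - 1) % n] = 1

evals = np.linalg.eigvalsh(A)

trace_sq = np.trace(A @ A)   # one definition of tr(A^2)
sum_sq = np.sum(evals ** 2)  # the other definition

print(trace_sq, sum_sq)  # both equal n*d = 12 for a simple d-regular graph
```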
If the diameter of G is at least 4, it is easy to see that λ_2 ≥ √d. Let a, b be two vertices at distance at least 4. Define a vector x as follows: x_a = √d, x_v = 1 if v is a neighbor of a, x_b = −√d, and x_v = −1 if v is a neighbor of b (with x_v = 0 for all other vertices). Note that there cannot be any edge between a neighbor of a and a neighbor of b. Then we see that x^T x = 4d, that x^T A x ≥ 4 d √d (because there are 2d edges, each counted twice, that each give a contribution of √d to x^T A x) and that x is orthogonal to the all-ones vector, so that λ_2 ≥ R_A(x) ≥ √d.
2. Nilli’s Proof of the Alon-Boppana Theorem
Nilli’s proof of the Alon-Boppana theorem gives

λ_2 ≥ 2√(d − 1) − c_d / D

where D is the diameter of G and c_d is a quantity that depends only on d. This means that if one has an infinite family of (constant) degree-d graphs, and every graph in the family satisfies λ_2 ≤ λ, then one must have λ ≥ 2√(d − 1). This is why families of Ramanujan graphs, in which λ_2 ≤ 2√(d − 1), are special, and so hard to construct, or even to prove the existence of.
Friedman proves a stronger bound, in which the error term goes down with the square of the diameter. Friedman’s proof is the one presented in the Hoory-Linial-Wigderson survey. I like Nilli’s proof, even if it is a bit messier than Friedman’s, because it starts off with something that clearly is going to work; the first two or three ways you try to establish the bound don’t work (believe me, I tried, because I didn’t see why some steps in the proof had to be that way), but eventually you find the right way to break up the estimate and it works.
So here is Nilli’s proof.
There are no references and, most likely, plenty of errors. If you use the notes and find mistakes, please let me know by either emailing luca at berkeley or leaving a comment here.
In preparation for the special program on spectral graph theory at the Simons Institute, which starts in a week, I have been reading on the topics of the theory that I don’t know much about: the spectrum of random graphs, properties of highly expanding graphs, spectral sparsification, and so on.
I have been writing some notes for myself, and here is something that bothers me: what do you call the second largest, in absolute value, eigenvalue of the adjacency matrix of a graph, without resorting to the sentence I just wrote? And how do you denote it?
I have noticed that the typical answer to the first question is “second eigenvalue,” but this is a problem when it creates confusion with the actual second largest eigenvalue of the adjacency matrix, which could be a very different quantity. The answer to the second question seems to be either a noncommittal “” or a rather problematic “.”
For my own use, I have started to use a notation of my own, which can certainly use some improvement, but I am still at a loss concerning terminology.
Perhaps one should start from where this number is coming from, and it seems that its important property is that, if the graph is d-regular and has n vertices, and has adjacency matrix A, this number is the spectral norm of A − (d/n)·J (where J is the matrix with ones everywhere), so that it measures the distance of A from the “perfect d-regular expander” in a norm that is useful to reason about cuts and is also tractable to compute.
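A sketch of this quantity on a small regular graph: subtracting (d/n)J only kills the all-ones eigenvector, so the spectral norm of A − (d/n)J equals the largest absolute value among the remaining eigenvalues (the 6-cycle is used as the test case):

```python
import numpy as np

# 6-cycle: 2-regular, n = 6. Adjacency eigenvalues: 2, 1, 1, -1, -1, -2.
n, d = 6, 2
A = np.zeros((n, n))
for v in range(n):
    A[v, (v + 1) % n] = A[v, (v - 1) % n] = 1

J = np.ones((n, n))
shifted_norm = np.linalg.norm(A - (d / n) * J, ord=2)  # spectral norm

evals = np.sort(np.linalg.eigvalsh(A))
nontrivial = evals[:-1]  # drop the top eigenvalue d
print(shifted_norm, np.max(np.abs(nontrivial)))
```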
So, since it is the spectral norm of a modification of the adjacency matrix, how about calling it the [adjective] spectral norm? I would vote for shifted spectral norm, because I would think of subtracting (d/n)·J as a sort of shift.
Please, do better in the comments!