CS294 Lecture 6: Higher-order Cheeger Inequality

In which we state an analog of Cheeger’s inequalities for the {k}-th smallest Laplacian eigenvalue, and we discuss the connection between this result and the analysis of spectral partitioning algorithms.

1. Cheeger-type Inequalities for {\lambda_k}

Let {G=(V,E)} be an undirected {d}-regular graph, {A} its adjacency matrix, {L = I - \frac 1d A} its normalized Laplacian matrix, and {0 = \lambda_1 \leq \cdots \leq \lambda_n \leq 2} be the eigenvalues of {L}, counted with multiplicities and listed in non-decreasing order.
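
As a quick sanity check (not part of the lecture notes), the following small numpy sketch builds {L = I - \frac 1d A} for the {n}-cycle, a 2-regular graph, and confirms that the eigenvalues lie in {[0,2]} and that {\lambda_1 = 0}.

```python
import numpy as np

# Minimal sketch (not from the notes): normalized Laplacian of a d-regular graph.
# Here the graph is the n-cycle, which is 2-regular.
n, d = 8, 2
A = np.zeros((n, n))
for v in range(n):
    A[v, (v + 1) % n] = A[(v + 1) % n, v] = 1   # edges of the cycle

L = np.eye(n) - A / d                            # L = I - (1/d) A
eigvals = np.linalg.eigvalsh(L)                  # sorted in non-decreasing order
print(eigvals)                                   # lambda_1 = 0 and all values are in [0, 2]
assert abs(eigvals[0]) < 1e-9 and eigvals[-1] <= 2 + 1e-9
```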

In Handout 2, we proved that {\lambda_k = 0} if and only if {G} has at least {k} connected components, that is, if and only if there are {k} disjoint sets {S_1,\ldots,S_k} such that {\phi(S_i) = 0} for {i=1,\ldots,k}. In this lecture and the next one we will prove a robust version of this fact.

First we introduce the notion of higher-order expansion. If {S_1,\ldots,S_k} is a collection of disjoint sets, then their order-{k} expansion is defined as

\displaystyle  \phi_k (S_1,\ldots,S_k) = \max_{i=1,\ldots,k} \ \ \phi(S_i)

and the order-{k} expansion of a graph {G} is

\displaystyle  \phi_k(G) = \min_{S_1,\ldots,S_k \ \rm disjoint} \ \ \phi_k(S_1,\ldots,S_k)

If the edges of a graph represent a relation of similarity or affinity, a low-expansion collection of sets {S_1,\ldots,S_k} represents an interesting notion of clustering, because the vertices in each set {S_i} are more related to each other than to the rest of the graph. (Additional properties are desirable in a good clustering, and we will discuss this later.)

We will prove the following higher-order Cheeger inequalities:

\displaystyle  \frac { \lambda_k } 2 \leq \phi_k(G) \leq O(k^{3.5} ) \sqrt {\lambda_k}

Stronger upper bounds are known, but the bound above is easier to prove from scratch. It is known that {\phi_k (G) \leq O(k^2) \sqrt{\lambda_k}} and that {\phi_k(G) \leq O_\epsilon ( \sqrt {\log k} ) \cdot \sqrt {\lambda_{(1+\epsilon) \cdot k} } }.
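
The lower bound can be checked by brute force on a tiny example. The sketch below is only an illustration (it assumes {\phi(S) = |E(S, V \setminus S)| / (d|S|)}, as in the earlier lectures): it enumerates all pairs of disjoint non-empty sets in a 6-cycle and verifies that {\lambda_2 / 2 \leq \phi_2(G)}.

```python
import numpy as np
from itertools import chain, combinations

def subsets(vs):
    # all non-empty subsets of vs
    return chain.from_iterable(combinations(vs, r) for r in range(1, len(vs) + 1))

def phi(S, A, d):
    # expansion phi(S) = |E(S, V \ S)| / (d |S|)  (assumed definition)
    S = set(S)
    cut = sum(A[u, v] for u in S for v in range(len(A)) if v not in S)
    return cut / (d * len(S))

# toy graph: the 6-cycle, which is 2-regular
n, d, k = 6, 2, 2
A = np.zeros((n, n))
for v in range(n):
    A[v, (v + 1) % n] = A[(v + 1) % n, v] = 1
lam = np.linalg.eigvalsh(np.eye(n) - A / d)

# brute-force order-2 expansion: minimize max(phi(S1), phi(S2)) over disjoint S1, S2
phi_2 = min(max(phi(S1, A, d), phi(S2, A, d))
            for S1 in subsets(range(n))
            for S2 in subsets(set(range(n)) - set(S1)))

print(lam[k - 1] / 2, phi_2)     # here 0.25 <= 1/3, as the lower bound predicts
assert lam[k - 1] / 2 <= phi_2 + 1e-9
```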


CS294 Lecture 5: Cheeger-type Inequalities for λn

In which we prove an analog of Cheeger’s inequalities for the largest Laplacian eigenvalue and we show how to use it to develop a spectral approximation algorithm for Max Cut.

1. Cheeger-type Inequalities for {\lambda_n}

Let {G=(V,E)} be an undirected graph (not necessarily regular), {D} its diagonal matrix of degrees, {A} its adjacency matrix, {L = I - D^{-1/2} A D^{-1/2}} its normalized Laplacian matrix, and {0 = \lambda_1 \leq \cdots \leq \lambda_n \leq 2} be the eigenvalues of {L}, counted with multiplicities and listed in non-decreasing order.

In Handout 2, we proved that {\lambda_k = 0} if and only if {G} has at least {k} connected components and {\lambda_n = 2} if and only if there is a connected component of {G} (possibly, all of {G}) that is bipartite.

A special case of the former fact is that {\lambda_2 = 0} if and only if the graph is disconnected, and the Cheeger inequalities give a “robust” version of this fact, showing that {\lambda_2} is small if and only if the expansion of the graph is small. In these notes we will see a robust version of the latter fact: we will identify a combinatorial parameter that is zero if and only if the graph has a bipartite connected component, that is small if and only if the graph is “close” (in an appropriate sense) to having a bipartite connected component, and we will show that this parameter is small if and only if {2-\lambda_n} is small.

Recall that

\displaystyle  2- \lambda_n = \min_{{\bf x} \in {\mathbb R}^n - \{ {\bf 0} \}} \ \ \frac { \sum_{\{ u,v \} \in E} (x_u + x_v)^2 } {\sum_{v\in V} d_v x_v^2 }
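
For a concrete check (an illustration, not part of the notes), the sketch below builds the normalized Laplacian of a small irregular graph, evaluates the quotient above at {{\bf x} = D^{-1/2} {\bf z}}, where {{\bf z}} is a top eigenvector of {L}, and confirms that it equals {2 - \lambda_n}.

```python
import numpy as np

# Sketch: a triangle with a pendant vertex (irregular and non-bipartite).
edges = [(0, 1), (1, 2), (2, 0), (0, 3)]
n = 4
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1
deg = A.sum(axis=1)
D_isqrt = np.diag(1 / np.sqrt(deg))
L = np.eye(n) - D_isqrt @ A @ D_isqrt             # normalized Laplacian

lam, Z = np.linalg.eigh(L)
x = D_isqrt @ Z[:, -1]                            # x = D^{-1/2} z, a minimizer of the quotient
quotient = sum((x[u] + x[v]) ** 2 for u, v in edges) / (deg @ (x ** 2))
print(2 - lam[-1], quotient)                      # the two values agree
assert abs((2 - lam[-1]) - quotient) < 1e-9
```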

We will study the following combinatorial problem, which formalizes the task of finding an “almost bipartite connected component:” we are looking for a non-empty subset of vertices {S\subseteq V} (we allow {S=V}) and a bipartition {(A,B)} of {S} such that there is a small number of “violating edges” compared to the number of edges incident on {S}, where an edge {\{ u,v\}} is violating if it is in the cut {(S,V-S)}, if it has both endpoints in {A}, or if it has both endpoints in {B}. (Note that if there are no violating edges, then {S} is a bipartite connected component of {G}.)

It will be convenient to package the information about {A,B,S} as a vector {{\bf y} \in \{ -1,0,1 \}^n}, where the non-zero coordinates correspond to {S}, and the partition of {S} is given by the positive versus negative coordinates. We define the “bipartiteness ratio” of {{\bf y}} as

\displaystyle  \beta( {\bf y} ) := \frac { \sum_{\{ u,v \} \in E} |y_u + y_v| } {\sum_{v\in V} d_v| y_v| }

Note that in the numerator we have the number of violating edges, with edges contained in {A} or in {B} counted with a weight of 2, and edges from {S} to {V-S} counted with a weight of 1. In the denominator we have the sum of the degrees of the vertices of {S} (also called the volume of {S}, and written {vol(S)}) which is, up to a factor of 2, the number of edges incident on {S}.

(Other definitions would have been reasonable, for example in the numerator we could just count the number of violating edges without weights, or we could have the expression {\sum_{\{ u,v \} \in E} (y_u + y_v)^2 }. Those choices would give similar bounds to the ones we will prove, with different multiplicative constants.)

We define the bipartiteness ratio of {G} as

\displaystyle  \beta(G) = \min_{{\bf y} \in \{-1,0,1\}^n - \{ {\bf 0} \} } \ \ \beta({\bf y} )

We will prove the following analog of Cheeger’s inequalities:

\displaystyle  \frac {2-\lambda_n}{2} \leq \beta(G) \leq \sqrt { 2 \cdot (2 - \lambda_n) }
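
Both inequalities can be verified exhaustively on a small graph. The following sketch (an illustration only, not from the post) enumerates all {{\bf y} \in \{-1,0,1\}^n - \{ {\bf 0} \}} for a 4-vertex graph and checks that {(2-\lambda_n)/2 \leq \beta(G) \leq \sqrt{2(2-\lambda_n)}}.

```python
import numpy as np
from itertools import product

# toy graph: a triangle with a pendant vertex
edges = [(0, 1), (1, 2), (2, 0), (0, 3)]
n = 4
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1
deg = A.sum(axis=1)
D_isqrt = np.diag(1 / np.sqrt(deg))
lam_n = np.linalg.eigvalsh(np.eye(n) - D_isqrt @ A @ D_isqrt)[-1]

def beta(y):
    # bipartiteness ratio of y, as defined above
    return sum(abs(y[u] + y[v]) for u, v in edges) / (deg @ np.abs(y))

beta_G = min(beta(np.array(y)) for y in product([-1, 0, 1], repeat=n) if any(y))
print((2 - lam_n) / 2, beta_G, np.sqrt(2 * (2 - lam_n)))
assert (2 - lam_n) / 2 <= beta_G + 1e-9
assert beta_G <= np.sqrt(2 * (2 - lam_n)) + 1e-9
```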


CS294 Lecture 3: Cheeger Inequalities

In which we generalize the notion of normalized Laplacian to irregular graphs, we extend the basic spectral graph theory results from last lecture to irregular graphs, and we prove the easy direction of Cheeger’s inequalities.

1. Irregular Graphs

Let {G=(V,E)} be an undirected graph, not necessarily regular. We will assume that every vertex has non-zero degree. We would like to define a normalized Laplacian matrix associated to {G} so that the properties we proved last time are true: that the multiplicity of 0 as an eigenvalue is equal to the number of connected components of {G}, that the largest eigenvalue is at most 2, and that it is 2 if and only if (a connected component of) the graph is bipartite.


CS294 Lecture 2: Basics of Spectral Graph Theory

In which we introduce the Laplacian matrix and we prove our first results in spectral graph theory.

1. The Basics of Spectral Graph Theory

Given an undirected graph {G= (V,E)}, the approach of spectral graph theory is to associate a symmetric real-valued matrix to {G}, and to relate the eigenvalues of the matrix to combinatorial properties of {G}.

For the sake of this lecture, we will restrict ourselves to the case in which {G} is a {d}-regular graph, and we will then see how to extend our results to apply to irregular graphs as well.

The most natural matrix to associate to {G} is the adjacency matrix {A} such that {A_{i,j} = 1} if {\{ i,j\} \in E} and {A_{i,j} = 0} otherwise. In the second part of the course, in which we will study expander graphs, the adjacency matrix will indeed be the most convenient matrix to work with. For the sake of the algorithms that we will analyze in the first part of the course, however, a slight variation called the normalized Laplacian is more convenient.


CS294 Lecture 1: Introduction

In which we describe what this course is about.

1. Overview

This class is about applications of linear algebra to graph theory and to graph algorithms. In the finite-dimensional case, linear algebra deals with vectors and matrices, and with a number of useful concepts and algorithms, such as determinants, eigenvalues, eigenvectors, and solutions to systems of linear equations.

The application to graph theory and graph algorithms comes from associating, in a natural way, a matrix to a graph {G=(V,E)}, and then interpreting the above concepts and algorithms in graph-theoretic language. The most natural representation of a graph as a matrix is via the {|V| \times |V|} adjacency matrix of a graph, and certain related matrices, such as the Laplacian and normalized Laplacian matrix will be our main focus. We can think of {|V|}-dimensional Boolean vectors as representing a partition of the vertices, that is, a cut in the graph, and we can think of arbitrary vectors as fractional cuts. From this point of view, eigenvalues are the optima of continuous relaxations of certain cut problems, the corresponding eigenvectors are optimal solutions, and connections between spectrum and cut structures are given by rounding algorithms converting fractional solutions into integral ones. Flow problems are dual to cut problems, so one would expect linear algebraic techniques to be helpful to find flows in networks: this is the case, via the theory of electrical flows, which can be found as solutions to linear systems.
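
As one concrete instance of this dictionary (an illustrative sketch, not from the post): for the unnormalized Laplacian {L = D - A} and a Boolean indicator vector {{\bf x}} of a set {S}, the quadratic form {{\bf x}^T L {\bf x}} counts exactly the edges crossing the cut {(S, V - S)}.

```python
import numpy as np

# Illustration: x^T (D - A) x = number of edges cut by (S, V - S) when x is the 0/1 indicator of S.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
n = 4
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1
L = np.diag(A.sum(axis=1)) - A                    # unnormalized Laplacian

x = np.array([1, 1, 0, 0])                        # indicator of S = {0, 1}
cut_edges = sum(1 for u, v in edges if x[u] != x[v])
print(x @ L @ x, cut_edges)                       # both equal 3
assert x @ L @ x == cut_edges
```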

The course can be roughly subdivided into three parts: in the first part of the course we will study spectral graph algorithms, that is, graph algorithms that make use of eigenvalues and eigenvectors of the normalized Laplacian of the given graph. In the second part of the course we will look at constructions of expander graphs, and their applications. In the third part of the course, we will look at fast algorithms for solving systems of linear equations of the form {L {\bf x} = {\bf b}}, where {L} is the Laplacian of a graph, their applications to finding electrical flows, and the applications of electrical flows to solving the max flow problem.


CS294 Lecture 0: Linear Algebra Review

In which we review linear algebra prerequisites.

The following background from linear algebra will be sufficient for the sake of this course: to know what is an eigenvalue and an eigenvector, to know that real symmetric matrices have real eigenvalues and their real eigenvectors are orthogonal, and to know the variational characterization of eigenvalues.

1. Basic Definitions

If {x=a+ib} is a complex number, then we let {\bar x = a- ib} denote its conjugate. Note that a complex number {x} is real if and only if {x = \bar x}. If {M \in {\mathbb C}^{m \times n}} is a matrix, then {M^*} denotes the conjugate transpose of {M}, that is, {(M^*)_{i,j} = \overline{M_{j,i}}}. If the entries of {M} are real, then {M^* = M^T}, where {M^T} is the transpose of {M}, that is, the matrix such that {(M^T)_{i,j} = M_{j,i}}.

We say that a matrix {M} is Hermitian if {M = M^*}. In particular, real symmetric matrices are Hermitian.

If {{\bf v},{\bf w} \in {\mathbb C}^n} are two vectors, then their inner product is defined as

\displaystyle  \langle {\bf v},{\bf w} \rangle := {\bf v}^* {\bf w} = \sum_i \overline{v_i} \cdot w_i \ \ \ \ \ (1)

Notice that, by definition, we have {\langle {\bf v}, {\bf w} \rangle = (\langle {\bf w}, {\bf v} \rangle)^*} and {\langle {\bf v}, {\bf v} \rangle = || {\bf v} ||^2}. Note also that, for two matrices {A,B}, we have {(A \cdot B)^* = B^* \cdot A^*}, and that for every matrix {M} and every two vectors {{\bf x}}, {{\bf y}}, we have

\displaystyle  \langle M {\bf x} , {\bf y} \rangle = {\bf x}^* M^* {\bf y} = \langle {\bf x} , M^* {\bf y} \rangle
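
A quick numerical sanity check of this identity (a sketch using numpy, not part of the notes):

```python
import numpy as np

# Check <Mx, y> = <x, M* y> for a random complex matrix; np.vdot conjugates its first argument.
rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

lhs = np.vdot(M @ x, y)             # <Mx, y>
rhs = np.vdot(x, M.conj().T @ y)    # <x, M* y>
print(lhs, rhs)
assert np.isclose(lhs, rhs)
```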

If {M\in {\mathbb C}^{n \times n}} is a square matrix, {\lambda \in {\mathbb C}} is a scalar, {{\bf v} \in {\mathbb C}^n - \{ {\bf {0}} \}} is a non-zero vector and we have

\displaystyle   M {\bf v} = \lambda {\bf v} \ \ \ \ \ (2)

then we say that {\lambda} is an eigenvalue of {M} and that {{\bf v}} is an eigenvector of {M} corresponding to the eigenvalue {\lambda}.

2. The Spectral Theorem

We want to prove

Theorem 1 (Spectral Theorem) Let {M\in {\mathbb R}^{n\times n}} be a symmetric matrix with real-valued entries, then there are {n} real numbers (not necessarily distinct) {\lambda_1,\ldots,\lambda_n} and {n} orthonormal real vectors {{\bf x}_1,\ldots,{\bf x}_n}, {{\bf x}_i \in {\mathbb R}^n} such that {{\bf x}_i} is an eigenvector of {\lambda_i}.
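
Before turning to the proof, the statement is easy to check numerically (a sketch, assuming numpy): np.linalg.eigh returns real eigenvalues and orthonormal real eigenvectors for a real symmetric matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.standard_normal((5, 5))
M = (S + S.T) / 2                             # a random real symmetric matrix

lam, X = np.linalg.eigh(M)                    # the columns of X are eigenvectors
assert np.allclose(M @ X, X @ np.diag(lam))   # M x_i = lambda_i x_i
assert np.allclose(X.T @ X, np.eye(5))        # the x_i are orthonormal
```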

Assuming the fundamental theorem of algebra (that every polynomial has a complex root) and basic properties of the determinant, the cleanest proof of the spectral theorem is to proceed by induction on {n}, and to show that {M} must have a real eigenvalue {\lambda_1} with a real eigenvector {{\bf v}_1}, and to show that {M} maps vectors orthogonal to {{\bf v}_1} to vectors orthogonal to {{\bf v}_1}. Then one applies the inductive hypothesis to {M} restricted to the {(n-1)}-dimensional space of vectors orthogonal to {{\bf v}_1} and one recovers the remaining {(n-1)} eigenvalues and eigenvectors.

The cleanest way to formalize the above proof is to give all definitions and results in terms of linear operators {T: X \rightarrow X} where {X} is an arbitrary vector space over the reals. This way, however, we would be giving several definitions that we would never use in the future, so, instead, the inductive proof will use a somewhat inelegant change of basis to pass from {M} to an {(n-1) \times (n-1)} matrix {M'}.

We begin by showing that a real symmetric matrix has real eigenvalues and eigenvectors.

Theorem 2 If {M\in {\mathbb R}^{n\times n}} is symmetric, then there is a real eigenvalue {\lambda \in {\mathbb R}} and a real eigenvector {{\bf v} \in {\mathbb R}^n} such that {M {\bf v} = \lambda {\bf v}}.

First, we note that every matrix has a complex eigenvalue.

Lemma 3 For every matrix {M\in {\mathbb C}^{n \times n}}, there is an eigenvalue {\lambda \in {\mathbb C}} and an eigenvector {{\bf v} \in {\mathbb C}^n} such that {M {\bf v} = \lambda {\bf v}}.

Proof: Note that {\lambda} is an eigenvalue for {M} if and only if

\displaystyle  \exists {\bf x} \neq {\bf 0}. \ ( M - \lambda I ) {\bf x} = {\bf 0}

which is true if and only if the rows of {M-\lambda I} are not linearly independent, which is true if and only if

\displaystyle  \det (M - \lambda I ) = 0

Now note that the mapping {t \rightarrow \det (M - t I )} is a univariate polynomial of degree {n} in {t}, and so it must have a root {\lambda} by the fundamental theorem of algebra. \Box
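
Numerically, this can be illustrated as follows (a sketch; np.poly returns the coefficients of the characteristic polynomial {\det(tI - M)}):

```python
import numpy as np

# The eigenvalues of M are exactly the roots of its characteristic polynomial.
rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))

char_roots = np.sort_complex(np.roots(np.poly(M)))   # roots of det(tI - M)
eigvals = np.sort_complex(np.linalg.eigvals(M))
print(char_roots)
print(eigvals)
assert np.allclose(char_roots, eigvals)
```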

Next we show that if {M} is real and symmetric, then its eigenvalues are real.

Lemma 4 If {M} is Hermitian, then, for every {{\bf x}} and {{\bf y}},

\displaystyle  \langle M {\bf x} , {\bf y} \rangle = \langle {\bf x} , M {\bf y} \rangle

Proof:

\displaystyle  \langle M {\bf x} , {\bf y} \rangle = \langle {\bf x} , M^* {\bf y} \rangle = \langle {\bf x} , M {\bf y} \rangle

\Box

Lemma 5 If {M} is Hermitian, then all the eigenvalues of {M} are real.

Proof: Let {M} be an Hermitian matrix and let {\lambda} be a scalar and {{\bf x}} be a non-zero vector such that {M {\bf x} = \lambda {\bf x}}. We will show that {\lambda = \lambda^*}, which implies that {\lambda} is a real number.

We note that

\displaystyle  \langle M{\bf x}, {\bf x} \rangle = \langle \lambda {\bf x}, {\bf x} \rangle = \lambda^* ||x||^2

and

\displaystyle  \langle {\bf x}, M {\bf x} \rangle = \langle {\bf x}, \lambda {\bf x} \rangle = \lambda ||x||^2

and by the fact that {\langle M{\bf x}, {\bf x} \rangle = \langle {\bf x}, M{\bf x} \rangle} , we have {\lambda = \lambda^*}. \Box

In order to prove Theorem 2, it remains to argue that, for a real eigenvalue of a real symmetric matrix, we can find a real eigenvector.

Proof: Let {M \in {\mathbb R}^{n \times n}} be a real symmetric matrix; then {M} has a real eigenvalue {\lambda} and a (possibly complex-valued) eigenvector {{\bf z} = {\bf x} + i {\bf y}}, where {{\bf x}} and {{\bf y}} are real vectors. We have

\displaystyle  M {\bf x} + i M {\bf y} = \lambda {\bf x} + i \lambda {\bf y}

from which (recalling that the entries of {M} and the scalar {\lambda} are real) it follows that {M {\bf x} = \lambda {\bf x}} and that {M {\bf y} = \lambda {\bf y}}; since {{\bf x}} and {{\bf y}} cannot both be zero, it follows that {\lambda} has a real eigenvector. \Box

We are now ready to prove the spectral theorem.

Proof: We proceed by induction on {n}. The case {n=1} is trivial.

Assume that the statement is true for dimension {n-1}. Let {\lambda_1} be a real eigenvalue of {M} and {{\bf x}_1} be a real eigenvector of {\lambda_1}.

Now we claim that for every vector {{\bf y}} that is orthogonal to {{\bf x}_1}, the vector {M{\bf y}} is also orthogonal to {{\bf x}_1}. Indeed, the inner product of {M{\bf y}} and {{\bf x}_1} is

\displaystyle  \langle {\bf x}_1, M{\bf y} \rangle = \langle M {\bf x}_1 , {\bf y} \rangle = \langle \lambda_1 {\bf x}_1,{\bf y} \rangle = \lambda_1 \langle {\bf x}_1 , {\bf y} \rangle = 0

Let {X} be the {n-1}-dimensional subspace of {{\mathbb R}^n} that contains all the vectors orthogonal to {{\bf x}_1}. We want to apply the inductive hypothesis to {M} restricted to {X}; we cannot literally do that, because the theorem is not stated in terms of arbitrary linear operators over vector spaces, so we will need to do that by fixing an appropriate basis for {X}.

Let {B \in {\mathbb R}^{n \times (n-1)}} be a matrix that computes a bijective map from {{\mathbb R}^{n-1}} to {X}. (If {{\bf b}_1, \ldots,{\bf b}_{n-1}} is an orthonormal basis for {X}, then {B} is just the matrix whose columns are the vectors {{\bf b}_i}.) Let also {B' \in {\mathbb R}^{(n-1) \times n}} be the matrix such that, for every {{\bf y} \in X}, {BB' {\bf y} = {\bf y}}. (We can set {B' = B^T} where {B} is as described above.) We apply the inductive hypothesis to the matrix

\displaystyle  M' := B'MB \in {\mathbb R}^{(n-1) \times (n-1)}

and we find eigenvalues {\lambda_2,\ldots,\lambda_n} and orthonormal eigenvectors {{\bf y}_2,\ldots,{\bf y}_n} for {M'}.

For every {i=2,\ldots,n}, we have

\displaystyle  B'MB {\bf y}_i = \lambda_i {\bf y}_i

and so

\displaystyle  BB'MB{\bf y}_i = \lambda_i B {\bf y}_i

Since {B{\bf y}_i} is orthogonal to {{\bf x}_1}, it follows that {MB{\bf y}_i} is also orthogonal to {{\bf x}_1}, and so {BB'MB{\bf y}_i = MB{\bf y}_i}, so we have

\displaystyle  MB{\bf y}_i = \lambda_i B{\bf y}_i

and, defining {{\bf x}_i := B{\bf y}_i}, we have

\displaystyle  M {\bf x}_i = \lambda_i {\bf x}_i

Finally, we observe that the vectors {{\bf x}_i} are orthogonal. By construction, {{\bf x}_1} is orthogonal to {{\bf x}_2,\ldots,{\bf x}_n}, and, for every {2 \leq i < j \leq n}, we have that

\displaystyle  \langle {\bf x}_i , {\bf x}_j \rangle = \langle B {\bf y}_i , B {\bf y}_j \rangle = \langle {\bf y}_i , B^T B {\bf y} _j \rangle = \langle {\bf y}_i , {\bf y}_j \rangle = 0

We also note that the vectors {{\bf x}_i} have norm 1: indeed, {|| {\bf x}_i ||^2 = \langle B {\bf y}_i , B {\bf y}_i \rangle = \langle {\bf y}_i , B^T B {\bf y}_i \rangle = \langle {\bf y}_i , {\bf y}_i \rangle = 1}, because the columns of {B} are orthonormal. \Box
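
The inductive step can also be carried out numerically. The sketch below is only an illustration with numpy (choosing {B} via a QR decomposition is one possible implementation, not the only one): it starts from one eigenvector {{\bf x}_1}, builds {B}, recurses on {M' = B^T M B}, and lifts the eigenvectors of {M'} back to eigenvectors of {M}.

```python
import numpy as np

rng = np.random.default_rng(3)
S = rng.standard_normal((4, 4))
M = (S + S.T) / 2                             # a random real symmetric matrix
lam, X = np.linalg.eigh(M)
x1 = X[:, 0]                                  # a unit eigenvector of M

# Columns of B: an orthonormal basis of the space orthogonal to x1,
# obtained from a QR decomposition whose first column is x1.
Q, _ = np.linalg.qr(np.column_stack([x1, rng.standard_normal((4, 3))]))
B = Q[:, 1:]                                  # B in R^{4 x 3}, with B^T B = I
Mprime = B.T @ M @ B                          # the (n-1) x (n-1) matrix M' = B' M B

mu, Y = np.linalg.eigh(Mprime)
lifted = B @ Y                                # x_i = B y_i
assert np.allclose(M @ lifted, lifted @ np.diag(mu))              # still eigenvectors of M
assert np.allclose(np.sort(np.concatenate([[lam[0]], mu])), lam)  # the remaining eigenvalues are recovered
```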

3. Variational Characterization of Eigenvalues

We conclude these notes with the variational characterization of eigenvalues for real symmetric matrices.

Theorem 6 Let {M\in {\mathbb R}^{n \times n}} be a symmetric matrix, and {\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n} be the eigenvalues of {M} in non-decreasing order. Then

\displaystyle  \lambda_k = \min_{k-{\rm dim} \ X} \ \ \ \max_{{\bf x} \in X - \{ {\bf 0} \} } \ \ \frac{ {\bf x}^T M {\bf x}}{{\bf x}^T {\bf x}}

The quantity {\frac { {\bf x}^T M {\bf x}}{{\bf x}^T {\bf x}} } is called the Rayleigh quotient of {{\bf x}} with respect to {M}, and we will denote it by {R_M({\bf x})}.

Proof: Let {{\bf v}_1,\ldots,{\bf v}_n} be orthonormal eigenvectors of the eigenvalues {\lambda_1,\ldots,\lambda_n}, as promised by the spectral theorem. Consider the {k}-dimensional space spanned by {{\bf v}_1, \ldots,{\bf v}_k}. For every vector {{\bf x} = \sum_{i=1}^k a_i {\bf v}_i} in such a space, the numerator of the Rayleigh quotient is

\displaystyle  \sum_{i,j} a_i a_j {\bf v}_i^T M {\bf v}_j = \sum_{i,j} a_i a_j \lambda_j {\bf v}_i^T {\bf v}_j = \sum_{i=1}^k \lambda_i a_i^2 \leq \lambda_k \cdot \sum_{i=1}^k a_i^2

The denominator is clearly {\sum_{i=1}^k a_i^2}, and so {R_M({\bf x}) \leq \lambda_k}. This shows that

\displaystyle  \lambda_k \geq \min_{k-{\rm dim} \ X} \ \ \ \max_{{\bf x} \in X - \{ {\bf 0} \}} \ \ \frac{ {\bf x}^T M {\bf x}}{{\bf x}^T {\bf x}}

For the other direction, let {X} be any {k}-dimensional space: we will show that {X} must contain a vector of Rayleigh quotient {\geq \lambda_k}. Let {S} be the span of {{\bf v}_k,\ldots,{\bf v}_n}; since {S} has dimension {n-k+1} and {X} has dimension {k}, they must have some non-zero vector in common. Let {{\bf x}} be one such vector, and let us write {{\bf x} = \sum_{i=k}^n a_i {\bf v}_i}. The numerator of the Rayleigh quotient of {{\bf x}} is

\displaystyle  \sum_{i=k}^n \lambda_i a_i^2 \geq \lambda_k \sum_i a_i^2

and the denominator is {\sum_i a_i^2}, so {R_M({\bf x}) \geq \lambda_k}. \Box
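
As a numerical illustration of the theorem (a sketch, not part of the notes): over a {k}-dimensional subspace spanned by the orthonormal columns of a matrix {V}, the maximum Rayleigh quotient equals the largest eigenvalue of {V^T M V}, so the minimum is attained by the span of the first {k} eigenvectors, while any other {k}-dimensional subspace does at least as badly.

```python
import numpy as np

rng = np.random.default_rng(4)
S = rng.standard_normal((6, 6))
M = (S + S.T) / 2
lam, X = np.linalg.eigh(M)
k = 3

V = X[:, :k]                                                  # span of the first k eigenvectors
assert np.isclose(np.linalg.eigvalsh(V.T @ M @ V)[-1], lam[k - 1])

W, _ = np.linalg.qr(rng.standard_normal((6, k)))              # a random k-dimensional subspace
assert np.linalg.eigvalsh(W.T @ M @ W)[-1] >= lam[k - 1] - 1e-9
```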

We have the following easy consequence.

Fact 7 If {\lambda_1} is the smallest eigenvalue of a real symmetric matrix {M}, then

\displaystyle  \lambda_1 = \min_{{\bf x} \neq {\bf 0}} R_M({\bf x})

Furthermore, every minimizer is an eigenvector of {\lambda_1}.

Proof: The identity is the {k=1} case of the previous theorem. For the furthermore part, let {\lambda_1 \leq \cdots \leq \lambda_n} be the list of eigenvalues of {M} in non-decreasing order, and {{\bf v}_1,\ldots,{\bf v}_n} be corresponding eigenvectors. If {{\bf x} = \sum_i a_i {\bf v}_i} is any vector, then

\displaystyle  R_M( {\bf x} ) = \frac { \sum_i \lambda_i a_i^2}{\sum_i a_i^2}

If {R_M({\bf x}) = \lambda_1}, then {a_i = 0} for every {i} such that {\lambda_i > \lambda_1}, that is, {{\bf x}} is a linear combination of eigenvectors of {\lambda_1}, and hence it is an eigenvector of {\lambda_1}. \Box

Fact 8 If {\lambda_n} is the largest eigenvalue of a real symmetric matrix {M}, then

\displaystyle  \lambda_n = \max_{{\bf x} \neq {\bf 0}} R_M({\bf x})

Furthermore, every maximizer is an eigenvector of {\lambda_n}.

Proof: Apply Fact 7 to the matrix {-M}. \Box

Fact 9 If {\lambda_1} is the smallest eigenvalue of a real symmetric matrix {M}, and {{\bf x}_1} is an eigenvector of {\lambda_1}, then

\displaystyle  \lambda_2 = \min_{{\bf x} \neq {\bf 0},\ {\bf x} \perp {\bf x}_1} \ \ \ R_M({\bf x})

Furthermore, every minimizer is an eigenvector of {\lambda_2}.

Proof: A more conceptual proof would be to consider the restriction of {M} to the space orthogonal to {{\bf x}_1}, and then apply Fact 7 to such a linear operator. But, since we have not developed the theory for general linear operators, we would need to explicitly reduce to an {(n-1)}-dimensional case via a projection operator as in the proof of the spectral theorem.

Instead, we will give a more hands-on proof. Let {\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n} be the list of eigenvalues of {M}, with multiplicities, and {{\bf v}_1,\ldots,{\bf v}_n} be orthonormal vectors as given by the spectral theorem. We may assume that {{\bf v}_1 = {\bf x}_1}, possibly by changing the orthonormal basis of the eigenspace of {\lambda_1}. For every vector {{\bf x} = \sum_{i=2}^n a_i {\bf v}_i } orthogonal to {{\bf v}_1}, its Rayleigh quotient is

\displaystyle  R_M({\bf x}) = \frac { \sum_{i=2}^n \lambda_i a_i^2}{\sum_i a_i^2} \geq \lambda_2

and the minimum is achieved by vectors {{\bf x}} such that {a_i = 0} for every {\lambda_i > \lambda_2}, that is, for vectors {{\bf x}} which are linear combinations of the eigenvectors of {\lambda_2}, and so every minimizer is an eigenvector of {\lambda_2}. \Box
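
A small numerical check of Fact 9 (a sketch, not part of the notes): the second eigenvector attains {\lambda_2}, and random vectors orthogonal to {{\bf x}_1} never do better.

```python
import numpy as np

rng = np.random.default_rng(5)
S = rng.standard_normal((5, 5))
M = (S + S.T) / 2
lam, X = np.linalg.eigh(M)
x1 = X[:, 0]                                  # a unit eigenvector of the smallest eigenvalue

R = lambda x: (x @ M @ x) / (x @ x)           # Rayleigh quotient
assert np.isclose(R(X[:, 1]), lam[1])         # the second eigenvector achieves lambda_2

for _ in range(100):
    x = rng.standard_normal(5)
    x -= (x @ x1) * x1                        # project away the x1 component
    assert R(x) >= lam[1] - 1e-9              # no vector orthogonal to x1 does better
```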

Course on spectral methods and expanders

This semester, starting tomorrow, I am teaching a course on spectral methods and expanders. This is similar to a course I offered twice at Stanford, but this time it will be a 15-week course instead of a 10-week one.

The Stanford course had two main components: (1) spectral algorithms for sparsest cut, and comparisons with LP and SDP based methods, and (2) properties and constructions of expanders.

I will use the additional time to talk a bit more about spectral algorithms, including clustering algorithms, and about constructions of expanders, and to add a third part about electrical networks, sparsification, and max flow.

Lecture notes will be posted here after each lecture.

In some more detail, the course will start with a review of linear algebra and a proof of basic spectral graph theory facts, such as the multiplicity of 0 as an eigenvalue of the Laplacian being the same as the number of connected components of a graph.

Then we will introduce expansion and conductance, and prove Cheeger’s inequality. We will do so in the language of approximation algorithms, and we will see how the analysis of Fiedler’s algorithm given by Cheeger’s inequality compares to the Leighton-Rao analysis of the LP relaxation and the Arora-Rao-Vazirani analysis of the SDP relaxation. Then we will prove several variants of Cheeger’s inequality, interpreting them as analyses of spectral algorithms for clustering and max cut.

In the second part of the course, we will see properties of expanders and combinatorial and algebraic constructions of expanders. We will talk about the theory that gives eigenvalues and eigenvectors of Abelian Cayley graphs, the zig-zag graph product, and the Margulis-Gabber-Galil construction. I would also like to talk about the expansion of random graphs, and to explain how one gets expander constructions from Selberg’s “3/16 theorem,” although I am not sure if there will be time for that.

The first two parts will be tied together by looking at the MCMC algorithm to approximate the number of perfect matchings in a dense bipartite graph. The analysis of the algorithm depends on the mixing time of a certain exponentially big graph, the mixing time will be determined (as shown in a previous lecture on properties of expanders) by the eigenvalue gap, the eigenvalue gap will be determined (as shown by Cheeger’s inequality) by the conductance, and the conductance can be bounded by constructing certain multicommodity flows (as shown in the analysis of the Leighton-Rao algorithms).

In the third part, we will talk about electrical networks, effective resistance, and electrical flows, see how to get sparsifiers using effective resistance, sketch how to solve Laplacian equations in nearly linear time, and see how to approximate max flow using electrical flows.

End-of-year traditions

Having spent some time in Japan, I have learnt of the tradition of holding a Bōnenkai, literally a party to forget the year. Held either as a company end-of-year party, or by groups of friends, it’s a get-together in which people drink a lot and forget the bad things that happened to them during the year.

It occurred to me that this is the complement of Thanksgiving, in which you get together to remember the good things that happened during the year.

I don’t think there is anything else left to say about the difference between Japanese and American culture.

Interestingly, there are a couple more possibilities. One could remember the bad things that happened during the year, as in the airing of grievances during Festivus.

Finally, one could forget the good things, which is very much the Italian attitude.

Edited to add: I don’t know how I forgot (ah!) but there is a famous Neapolitan folk song that goes

Chi ha avuto, ha avuto, ha avuto
Chi ha dato, ha dato, ha dato,
Scurdammuce ‘o passato,
simm’e Napule, paisa’

which is roughly

Who has received, has received
Who has given, has given,
Let’s forget the past
We are [all] from Naples