In which we talk about even more generalizations of Cheeger’s inequalities, and we analyze the power method to find approximate eigenvectors, thus having a complete description of a polynomial-time approximation algorithm for sparsest cut
1. Irregular Graphs
For simplicity, we proved our results on and for regular graphs. Those results extend, essentially with the same proofs, to the case of irregular undirected graphs. In an irregular graph , the notion that generalizes edge expansion is called conductance. If is the degree of vertex , then the conductance of set of vertices is
We will call the sum of the degrees of a set of vertices the volume of the set, and denote it . The conductance of the graph is
Higher-order conductance is defined as higher-order expansion, but with conductance replacing expansion in the definition.
The Cheeger inequalities
still hold, with the same proof. With some abuse of notation, we will call the following quantity the Rayleigh quotient of
even if, technically, it is the Rayleigh quotient of , where is the diagonal matrix of degrees.
We can also adapt the proof of the higher-order Cheeger inequality to show
2. More Cheeger-type Bounds
We proved that if is the cut found by Fiedler’s algorithm using the eigenvector of , then
which is a good bound, although it usually underestimates the quality of the solutions found in practice. (There are graphs, however, for which the above inequality is tight within a constant factor.)
One case in which we can improve the analysis is when there are not too many eigenvalues close to
So we have
which is a better bound for families of graphs in which, for some , .
We will not have time to prove Theorem 1, but we will state the two main pieces of its proof.
That is, if , then there are values such that most entries of are close to one of those values.
The above lemma should be compared to the fact, which was a major piece in the proof of Cheeger’s inequality, that if is an arbitrary non-negative vector, then there is a threshold such that
One obtains Theorem 1 in the following. Start from an eigenvector of and, using the first step in the proof of Cheeger’s inequality, obtain a vector with non-negative entries such that and such that the support of contains at most vertices.
The set contains at most vertices, it is one of the cuts considered by Fiedler’s algorithm on input .
Another property of graphs in which is large for small is that they contain large expanders as induced subgraphs.
Theorem 4 There is a constant such that, for every graph and every , there exists a partition of the vertices into sets such that, if we call the subgraph induced by the vertex set , we have
Theorem 5 If , then there is a partition of the vertices into subsets such that
3. The Power Method
Earlier in this class, we showed that, if is a -regular graph, and is its normalized Laplacian matrix with eigenvalues , given an eigenvector of , Fiedler’s algorithm finds, in nearly-linear time , a cut such that .
More generally, if, instead of being given an eigenvector such that , we are given a vector such that , then the algorithm finds a cut such that . We will now see how to compute such a vector using arithmetic operations.
A symmetric matrix is positive semi-definite (abbreviated PSD) if all its eigenvalues are nonnegative. We begin by describing an algorithm that approximates the largest eigenvalue of a given symmetric PSD matrix. This might not seem to help very much because because we want to compute the second smallest, not the largest, eigenvalue. We will see, however, that the algorithm is easily modified to accomplish what we want.
3.1. The Power Method to Approximate the Largest Eigenvalue
The algorithm works as follows
Input: PSD matrix , parameter
- Pick uniformly at random
- for to
That is, the algorithm simply picks uniformly at random a vector with coordinates, and outputs .
Note that the algorithm performs arithmetic operations, where is the number of non-zero entries of the matrix .
where is the largest eigenvalue of .
Note that, in particular, we can have and .
Let be the eigenvalues of , with multiplicities, and be a system of orthonormal eigenvectors such that . Theorem 6 is implied by the following two lemmas
It remains to prove the two lemmas.
Proof: (Of Lemma 7) Let . The inner product is the random variable
Let us compute the first, second, and fourth moment of .
Recall that the Paley-Zygmund inequality states that if is a non-negative random variable with finite variance, then, for every , we have
which follows by noting that
We apply the Paley-Zygmund inequality to the case and , and we derive
Remark 1 The proof of Lemma 7 works even if is selected according to a 4-wise independent distribution. This means that the algorithm can be derandomized in polynomial time.
Proof: (Of Lemma 8) Let us write as a linear combination of the eigenvectors
where the coefficients can be computed as . We have
We need to prove a lower bound to the ratio of the above two quantities. We will compute a lower bound to the numerator and an upper bound to the denominator in terms of the same parameter.
Let be the number of eigenvalues larger than . Then, recalling that the eigenvalues are sorted in non-increasing order, we have
We also see that
So we have
Remark 2 Where did we use the assumption that is positive semidefinite? What happens if we apply this algorithm to the adjacency matrix of a bipartite graph?
3.2. Approximating the Second Largest Eigenvalue
Suppose now that we are interested in finding the second largest eigenvalue of a given PSD matrix . If has eigenvalues , and we know the eigenvector of , then is a PSD linear map from the orthogonal space to to itself, and is the largest eigenvalue of this linear map. We can then run the previous algorithm on this linear map.
Input: PSD matrix , vector parameter
- Pick uniformly at random
- for to
If is an orthonormal basis of eigenvectors for the eigenvalues of , then, at the beginning, we pick a random vector
that, with probability at least , satisfies . (Cf. Lemma 7.) Then we compute , which is the projection of on the subspace orthogonal to , that is
Note that and that .
The output is the vector
If we apply Lemma 8 to subspace orthogonal to , we see that when we have that, for every ,
We have thus established the following analysis.
Theorem 9 For every PSD matrix , positive integer and parameter , if is a length-1 eigenvector of the largest eigenvalue of , then with probability over the choice of , the algorithm Power2 outputs a vector such that
where is the second largest eigenvalue of , counting multiplicities.
3.3. The Second Smallest Eigenvalue of the Laplacian
Finally, we come to the case in which we want to compute the second smallest eigenvalue of the normalized Laplacian matrix of a -regular graph , where is the adjacency matrix of .
Consider the matrix . Then if are the eigenvalues of , we have that
are the eigenvalues of , and that is PSD. and have the same eigenvectors, and so is a length-1 eigenvector of the largest eigenvalue of .
By running algorithm Power2, we can find a vector such that
so, rearranging, we have
If we want to compute a vector whose Rayleigh quotient is, say, at most , then the running time will be , because we need to set , which is not nearly linear in the size of the graph if is, say .
For a running time that is nearly linear in for all values of , one can, instead, apply the power method to the pseudoinverse of . (Assuming that the graph is connected, is the unique vector such that , if , and if is parallel to .) This is because has eigenvalues , and so is PSD and is its largest eigenvalue.
Although computing is not known to be doable in nearly linear time, there are nearly linear time algorithms that, given , solve in the linear system , and this is the same as computing the product , which is enough to implement algorithm Power applied to .
(Such algorithms will be discussed in the third part of the course. The algorithms will find an approximate solution to the linear system , but this will be sufficient. In the following, we proceed as if the solution was exact.)
In time , we can find a vector such that , where is a random vector in , shifted to be orthogonal to and . What is the Rayleigh quotient of such a vector with respect to ?
Let be a basis of orthonormal eigenvectors for and . If are the eigenvalues of , then we have
and, for , we have
Write , where , and ssume that, as happens with probability at least , we have . Then
and the Rayleigh quotient of with respect to is
and the analysis proceeds similarly to the analysis of the previous section. If we let be the index such that then we can upper bound the numerator as
and we can lower bound the denominator as
and the Rayleigh quotient is