In which we analyze a nearly-linear time algorithm for finding an approximate eigenvector for the second eigenvalue of a graph adjacency matrix, to be used in the spectral partitioning algorithm.
In past lectures, we showed that, if $G$ is a $d$-regular graph, and $M$ is its normalized adjacency matrix with eigenvalues $1 = \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$, given an eigenvector of $\lambda_2$, the algorithm SpectralPartition finds, in nearly-linear time $O(|E| + |V|\log|V|)$, a cut $(S, V-S)$ such that $\phi(S) \leq 2\sqrt{\phi(G)}$.

More generally, if, instead of being given an eigenvector $x$ such that $Mx = \lambda_2 x$, we are given a vector $x \perp (1,\ldots,1)$ such that $x^T M x \geq (\lambda_2 - \epsilon)\, x^T x$, then the algorithm finds a cut such that $\phi(S) \leq 2\sqrt{\phi(G) + \epsilon}$. In this lecture we describe and analyze an algorithm that computes such a vector using $O\left((|V|+|E|) \cdot \frac{1}{\epsilon} \cdot \log \frac{|V|}{\epsilon}\right)$ arithmetic operations.
A symmetric matrix $M$ is positive semi-definite (abbreviated PSD) if all its eigenvalues are nonnegative. We begin by describing an algorithm that approximates the largest eigenvalue of a given symmetric PSD matrix. This might not seem very helpful, because the adjacency matrix of a graph is not PSD, and because we want to compute the second largest, not the largest, eigenvalue. We will see, however, that the algorithm is easily modified to approximate the second eigenvalue of a PSD matrix (if an eigenvector of the first eigenvalue is known), and that the adjacency matrix of a graph can easily be modified to be PSD.
1. The Power Method to Approximate the Largest Eigenvalue
The algorithm works as follows:

Algorithm Power

- Input: PSD symmetric matrix $M \in \mathbb{Q}^{n \times n}$, positive integer $t$
- Pick uniformly at random $x_0 \sim \{-1, 1\}^n$
- for $i := 1$ to $t$: $x_i := M x_{i-1}$
- return $x_t$
That is, the algorithm simply picks uniformly at random a vector $x$ with $\pm 1$ coordinates, and outputs $M^t x$.

Note that the algorithm performs $O((n + m) \cdot t)$ arithmetic operations, where $m$ is the number of non-zero entries of the matrix $M$.
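As a concrete illustration (not part of the original notes; the function names are mine), the Power algorithm just described can be sketched in a few lines of NumPy:

```python
import numpy as np

def power_method(M, t, rng=None):
    """One run of the Power algorithm: pick x_0 uniformly from {-1,1}^n,
    then apply M to it t times and return M^t x_0."""
    rng = np.random.default_rng() if rng is None else rng
    n = M.shape[0]
    x = rng.choice([-1.0, 1.0], size=n)  # x_0 ~ uniform over {-1,1}^n
    for _ in range(t):
        x = M @ x                        # x_i := M x_{i-1}
    return x

def rayleigh_quotient(M, x):
    """The quantity x^T M x / x^T x that the analysis bounds."""
    return (x @ M @ x) / (x @ x)
```

For a sparse matrix one would store $M$ in a sparse format so that each product costs $O(n+m)$ operations; the dense version above is just the simplest correct sketch.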
Theorem 1 For every PSD matrix $M$, positive integer $t$ and parameter $\epsilon > 0$, with probability $\geq 3/16$ over the choice of $x_0$, the algorithm Power outputs a vector $x_t$ such that

$$\frac{x_t^T M x_t}{x_t^T x_t} \geq \lambda_1 \cdot (1-\epsilon) \cdot \frac{1}{1 + 4n(1-\epsilon)^{2t}}$$

where $\lambda_1$ is the largest eigenvalue of $M$.

Note that, in particular, we can have $t = O\left(\frac{1}{\epsilon} \log \frac{n}{\epsilon}\right)$ and $\frac{x_t^T M x_t}{x_t^T x_t} \geq (1 - O(\epsilon)) \cdot \lambda_1$.
Let $\lambda_1 \geq \cdots \geq \lambda_n$ be the eigenvalues of $M$, with multiplicities, and $v_1, \ldots, v_n$ be a system of orthonormal eigenvectors such that $M v_i = \lambda_i v_i$. Theorem 1 is implied by the following two lemmas.

Lemma 2 Let $v \in \mathbb{R}^n$ be a vector such that $\|v\| = 1$, and pick $x \sim \{-1,1\}^n$ uniformly at random. Then

$$\mathbb{P}\left[ |\langle x, v \rangle| \geq \frac{1}{2} \right] \geq \frac{3}{16}$$

Lemma 3 Let $x \in \mathbb{R}^n$ be a vector such that $|\langle x, v_1 \rangle| \geq \frac{1}{2}$, and define $y := M^t x$. Then, for every $0 < \epsilon < 1$,

$$\frac{y^T M y}{y^T y} \geq \lambda_1 \cdot (1-\epsilon) \cdot \frac{1}{1 + 4\|x\|^2 (1-\epsilon)^{2t}}$$
It remains to prove the two lemmas.
Proof: (Of Lemma 2) Let $v = (v_1, \ldots, v_n)$. The inner product $\langle x, v \rangle$ is the random variable

$$S := \sum_i x_i v_i$$

Let us compute the first, second, and fourth moment of $S$:

$$\mathbb{E}\, S = 0$$

$$\mathbb{E}\, S^2 = \sum_i v_i^2 = 1$$

$$\mathbb{E}\, S^4 = 3 \left( \sum_i v_i^2 \right)^2 - 2 \sum_i v_i^4 \leq 3$$
Recall that the Paley-Zygmund inequality states that if $Z$ is a non-negative random variable with finite variance, then, for every $0 \leq \delta \leq 1$, we have

$$\mathbb{P}[ Z \geq \delta\, \mathbb{E}\, Z ] \geq (1-\delta)^2 \cdot \frac{(\mathbb{E}\, Z)^2}{\mathbb{E}\, Z^2}$$

which follows by noting that

$$\mathbb{E}\, Z = \mathbb{E}[ Z \cdot \mathbb{1}_{Z < \delta \mathbb{E} Z} ] + \mathbb{E}[ Z \cdot \mathbb{1}_{Z \geq \delta \mathbb{E} Z} ] \leq \delta\, \mathbb{E}\, Z + \sqrt{\mathbb{E}\, Z^2} \cdot \sqrt{\mathbb{P}[ Z \geq \delta\, \mathbb{E}\, Z ]}$$

where the second term is bounded by Cauchy-Schwarz; rearranging gives the inequality.
We apply the Paley-Zygmund inequality to the case $Z = S^2$ and $\delta = 1/4$, and we derive

$$\mathbb{P}\left[ S^2 \geq \frac{1}{4} \right] \geq \left( \frac{3}{4} \right)^2 \cdot \frac{1}{3} = \frac{3}{16}$$
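The $3/16$ bound is quite conservative. A quick Monte Carlo simulation (my own illustration, not part of the notes; the choice of $n$ and the random test vector are arbitrary) shows the empirical probability is typically much larger:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20
v = rng.normal(size=n)
v /= np.linalg.norm(v)                          # an arbitrary unit vector

trials = 20000
x = rng.choice([-1.0, 1.0], size=(trials, n))   # rows are samples from {-1,1}^n
frac = np.mean(np.abs(x @ v) >= 0.5)            # empirical P[ |<x,v>| >= 1/2 ]
# Lemma 2 guarantees frac is at least about 3/16 = 0.1875; for a "spread out"
# v the inner product is close to Gaussian, and the probability is near 0.6
```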
Remark 1 The proof of Lemma 2 works even if $x$ is selected according to a 4-wise independent distribution, because only the first four moments of $S$ were used. This means that the algorithm can be derandomized in polynomial time.
Proof: (Of Lemma 3) Let us write $x$ as a linear combination of the eigenvectors

$$x = a_1 v_1 + \cdots + a_n v_n$$

where the coefficients can be computed as $a_i = \langle x, v_i \rangle$. Note that, by assumption, $|a_1| \geq \frac{1}{2}$, and that, by orthonormality of the eigenvectors, $\|x\|^2 = \sum_i a_i^2$.

We have

$$y = a_1 \lambda_1^t v_1 + \cdots + a_n \lambda_n^t v_n$$

and so

$$y^T M y = \sum_i a_i^2 \lambda_i^{2t+1} \qquad \text{and} \qquad y^T y = \sum_i a_i^2 \lambda_i^{2t}$$
We need to prove a lower bound on the ratio of the above two quantities. We will compute a lower bound on the numerator and an upper bound on the denominator in terms of the same parameter.
Let $k$ be the number of eigenvalues larger than $\lambda_1 (1-\epsilon)$. Then, recalling that the eigenvalues are sorted in non-increasing order, we have

$$y^T M y \geq \sum_{i=1}^k a_i^2 \lambda_i^{2t+1} \geq \lambda_1 (1-\epsilon) \sum_{i=1}^k a_i^2 \lambda_i^{2t}$$

We also see that

$$\sum_{i=k+1}^n a_i^2 \lambda_i^{2t} \leq \lambda_1^{2t} (1-\epsilon)^{2t} \sum_{i=k+1}^n a_i^2 \leq \lambda_1^{2t} (1-\epsilon)^{2t} \|x\|^2 \leq 4 a_1^2 \lambda_1^{2t} (1-\epsilon)^{2t} \|x\|^2 \leq 4 (1-\epsilon)^{2t} \|x\|^2 \sum_{i=1}^k a_i^2 \lambda_i^{2t}$$

where we used $4 a_1^2 \geq 1$ and the fact that $a_1^2 \lambda_1^{2t}$ is one of the terms of $\sum_{i=1}^k a_i^2 \lambda_i^{2t}$. So we have

$$\frac{y^T M y}{y^T y} \geq \lambda_1 (1-\epsilon) \cdot \frac{\sum_{i=1}^k a_i^2 \lambda_i^{2t}}{\left(1 + 4 (1-\epsilon)^{2t} \|x\|^2\right) \cdot \sum_{i=1}^k a_i^2 \lambda_i^{2t}} = \lambda_1 (1-\epsilon) \cdot \frac{1}{1 + 4 \|x\|^2 (1-\epsilon)^{2t}}$$
Remark 2 Where did we use the assumption that $M$ is positive semidefinite? What happens if we apply this algorithm to the adjacency matrix of a bipartite graph?
2. Approximating the Second Eigenvalue
If $M$ is a PSD matrix, and if we know a unit-length eigenvector $v_1$ of the largest eigenvalue of $M$, we can approximately find the second eigenvalue with the following adaptation of the algorithm from the previous section.
Algorithm Power2

- Input: PSD symmetric matrix $M \in \mathbb{Q}^{n \times n}$, positive integer $t$, unit-length vector $v$
- Pick uniformly at random $x \sim \{-1, 1\}^n$
- $x_0 := x - \langle x, v \rangle \cdot v$
- for $i := 1$ to $t$: $x_i := M x_{i-1}$
- return $x_t$
If $v_1, \ldots, v_n$ is an orthonormal basis of eigenvectors for the eigenvalues $\lambda_1 \geq \cdots \geq \lambda_n$ of $M$, then, at the beginning, we pick a random vector

$$x = a_1 v_1 + \cdots + a_n v_n$$

that, with probability at least $3/16$, satisfies $|a_2| \geq \frac{1}{2}$. (Cf. Lemma 2.) Then we compute $x_0$, which is the projection of $x$ on the subspace orthogonal to $v_1$, that is

$$x_0 = a_2 v_2 + \cdots + a_n v_n$$

Note that $|a_2| \geq \frac{1}{2}$ still holds, and that $\|x_0\|^2 \leq \|x\|^2 = n$.

The output is the vector

$$y := x_t = M^t x_0$$
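As an illustrative NumPy sketch (names and details are my own, not from the notes), Power2 differs from the basic power iteration only in the initial projection step:

```python
import numpy as np

def power2(M, t, v, rng=None):
    """Sketch of Power2: v is assumed to be a unit-length eigenvector
    of the largest eigenvalue of the PSD matrix M."""
    rng = np.random.default_rng() if rng is None else rng
    n = M.shape[0]
    x = rng.choice([-1.0, 1.0], size=n)  # x ~ uniform over {-1,1}^n
    y = x - (x @ v) * v                  # x_0: project x off v
    for _ in range(t):                   # y = M^t x_0
        y = M @ y
    return y
```

Since $M$ maps the subspace orthogonal to $v_1$ to itself, the iterates stay orthogonal to $v_1$ (up to floating-point error), and the Rayleigh quotient of the output approximates $\lambda_2$.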
If we apply Lemma 3 to the subspace orthogonal to $v_1$, we see that when $|a_2| \geq \frac{1}{2}$ we have that, for every $0 < \epsilon < 1$,

$$\frac{y^T M y}{y^T y} \geq \lambda_2 \cdot (1-\epsilon) \cdot \frac{1}{1 + 4n(1-\epsilon)^{2t}}$$
We have thus established the following theorem.
Theorem 4 For every PSD matrix $M$, positive integer $t$ and parameter $\epsilon > 0$, if $v_1$ is a length-1 eigenvector of the largest eigenvalue of $M$, then with probability $\geq 3/16$ over the choice of $x_0$, the algorithm Power2 outputs a vector $y \perp v_1$ such that

$$\frac{y^T M y}{y^T y} \geq \lambda_2 \cdot (1-\epsilon) \cdot \frac{1}{1 + 4n(1-\epsilon)^{2t}}$$

where $\lambda_2$ is the second largest eigenvalue of $M$, counting multiplicities.
Finally, we come to the case in which $M$ is the normalized adjacency matrix of a $d$-regular graph $G$.

We know that $M$ has eigenvalues $1 = \lambda_1 \geq \cdots \geq \lambda_n \geq -1$, and that $\frac{1}{\sqrt n}(1, \ldots, 1)$ is an eigenvector of $\lambda_1 = 1$.

Consider now the matrix $M' := \frac{1}{2} I + \frac{1}{2} M$. Every eigenvector $v$ of $M$ with eigenvalue $\lambda$ is clearly also an eigenvector of $M'$ with eigenvalue $\frac{1+\lambda}{2}$, and vice versa; thus $M'$ has eigenvalues

$$1 = \frac{1+\lambda_1}{2} \geq \frac{1+\lambda_2}{2} \geq \cdots \geq \frac{1+\lambda_n}{2} \geq 0$$

and it is PSD.
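As a quick numerical sanity check (mine, not part of the notes; the random symmetric test matrix is arbitrary), one can verify that passing from $M$ to $\frac{1}{2} I + \frac{1}{2} M$ maps each eigenvalue $\lambda$ to $\frac{1+\lambda}{2}$:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(6, 6))
M = (A + A.T) / 2                    # an arbitrary symmetric matrix
Mp = 0.5 * np.eye(6) + 0.5 * M       # the shifted matrix M' = I/2 + M/2

lam = np.linalg.eigvalsh(M)          # eigenvalues in ascending order
lam_p = np.linalg.eigvalsh(Mp)
# each eigenvalue lambda of M becomes (1 + lambda)/2 in M'
assert np.allclose(lam_p, (1 + lam) / 2)
```

For a normalized adjacency matrix the eigenvalues lie in $[-1, 1]$, so the shifted eigenvalues lie in $[0, 1]$ and $M'$ is PSD.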
This means that we can run algorithm Power2 on the matrix $M'$ using the vector $v_1 = \frac{1}{\sqrt n}(1, \ldots, 1)$ and a parameter $t = O\left(\frac{1}{\epsilon} \cdot \log \frac{n}{\epsilon}\right)$. The algorithm finds, with probability $\geq 3/16$, a vector $y \perp (1,\ldots,1)$ such that

$$\frac{y^T M' y}{y^T y} \geq \frac{1+\lambda_2}{2} \cdot (1 - O(\epsilon))$$

which is equivalent to

$$\frac{y^T M y}{y^T y} \geq \lambda_2 - O(\epsilon)$$
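Putting the pieces together, here is an end-to-end sketch (my own illustration; the choice of graph, the number of trials, the fixed $t = 200$, and the per-step re-projection are choices I made for numerical robustness, not part of the notes) that runs the Power2 iteration on $M' = \frac{1}{2} I + \frac{1}{2} M$ for the cycle $C_8$, whose normalized adjacency matrix has $\lambda_2 = \cos(2\pi/8)$:

```python
import numpy as np

n = 8                                        # the cycle C_8, a 2-regular graph
M = np.zeros((n, n))
for i in range(n):
    M[i, (i + 1) % n] = M[i, (i - 1) % n] = 0.5   # normalized adjacency A/d

Mp = 0.5 * np.eye(n) + 0.5 * M               # M' = I/2 + M/2, a PSD matrix
v1 = np.ones(n) / np.sqrt(n)                 # unit eigenvector of lambda_1 = 1

rng = np.random.default_rng(0)
best = -np.inf
for _ in range(5):                           # independent trials boost the
    x = rng.choice([-1.0, 1.0], size=n)      # 3/16 success probability
    y = x - (x @ v1) * v1                    # project off v1
    for _ in range(200):                     # t iterations of Power2
        y = Mp @ y
        y -= (y @ v1) * v1                   # re-project to fight rounding drift
    best = max(best, (y @ M @ y) / (y @ y))  # Rayleigh quotient w.r.t. M

# best approximates lambda_2 = cos(2*pi/8) ~ 0.7071
```

In the notes' parameters one would take $t = O\left(\frac{1}{\epsilon} \log \frac{n}{\epsilon}\right)$; a fixed $t = 200$ is ample for this tiny graph.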