In which we prove the difficult direction of Cheeger’s inequality.
As in the past lectures, consider an undirected $d$-regular graph $G=(V,E)$, call $A$ its adjacency matrix, and $M = \frac 1d A$ its scaled adjacency matrix. Let $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$ be the eigenvalues of $M$, with multiplicities, in non-increasing order. We have been studying the edge expansion $h(G)$ of a graph, which is the minimum of $h(S)$ over all non-trivial cuts $(S, V-S)$ of the vertex set (a cut is trivial if $S = \emptyset$ or $S = V$), where the expansion of a cut is

$\displaystyle h(S) := \frac{E(S, V-S)}{d \cdot \min\{ |S|, |V-S| \}}$
We have also been studying the (uniform) sparsest cut problem, which is the problem of finding the non-trivial cut that minimizes $\phi(S)$, where the sparsity of a cut is

$\displaystyle \phi(S) := \frac{E(S, V-S)}{\frac dn \cdot |S| \cdot |V-S|}$
We are proving Cheeger's inequalities:

$\displaystyle \frac{1-\lambda_2}{2} \leq h(G) \leq \sqrt{2 \cdot (1-\lambda_2)}$
and we established the left-hand side inequality in the previous lecture, showing that the quantity $1-\lambda_2$ can be seen as the optimum of a continuous relaxation of $\phi(G)$, so that $1-\lambda_2 \leq \phi(G)$, and the left-hand side follows because $\phi(G) \leq 2\, h(G)$ by the definitions.
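As a quick sanity check (my own example, not from the lecture), consider the $n$-cycle, which is $2$-regular: the eigenvalues of its scaled adjacency matrix are $\cos(2\pi k/n)$, so $1-\lambda_2 = 1-\cos(2\pi/n) \approx \frac{2\pi^2}{n^2}$, while cutting the cycle into two arcs of $n/2$ vertices cuts two edges, giving $h(G) = \frac{2}{2 \cdot n/2} = \frac 2n$. Cheeger's inequalities then read, up to lower-order terms, $\frac{\pi^2}{n^2} \leq \frac 2n \leq \frac{2\pi}{n}$, and the example shows that the right-hand side inequality is tight up to a constant factor.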
Today we prove the more difficult, and interesting, direction. The proof will be constructive and algorithmic, and it can be seen as an analysis of the following algorithm.
Algorithm SpectralPartitioning:

- Input: graph $G=(V,E)$ and vector $x \in {\mathbb R}^V$
- Sort the vertices of $V$ in non-decreasing order of values of entries in $x$, that is, let $V = \{ v_1, \ldots, v_n \}$ where $x_{v_1} \leq x_{v_2} \leq \cdots \leq x_{v_n}$
- Let $i \in \{ 1, \ldots, n-1 \}$ be such that $h(\{ v_1, \ldots, v_i \})$ is minimal
- Output $S = \{ v_1, \ldots, v_i \}$
We note that the algorithm can be implemented to run in time $O(|V| \log |V| + |E|)$, assuming arithmetic operations and comparisons take constant time, because once we have computed $h(\{ v_1, \ldots, v_i \})$ it only takes time $O(\mathrm{deg}(v_{i+1}))$ to compute $h(\{ v_1, \ldots, v_{i+1} \})$.
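Here is a short Python sketch of the algorithm (my own code, not from the lecture; for simplicity it takes a dense adjacency matrix and runs in time $O(n^2)$ rather than the $O(|V| \log |V| + |E|)$ of the sparse implementation described above):

```python
import numpy as np

def spectral_partitioning(adj, x):
    """Sweep-cut algorithm: sort vertices by x_v and return the prefix set
    S = {v_1, ..., v_i} (i < n) of smallest expansion h(S)."""
    n = len(x)
    d = int(adj[0].sum())                 # assumes a d-regular graph
    order = np.argsort(x)                 # vertices in non-decreasing order of x_v
    in_S = np.zeros(n, dtype=bool)
    cut = 0.0                             # number of edges E(S, V - S), updated incrementally
    best_h, best_i = np.inf, 0
    for i in range(n - 1):                # consider S = {v_1, ..., v_{i+1}}
        v = order[i]
        # moving v into S cuts its edges to V - S and un-cuts its edges to S
        cut += adj[v, ~in_S].sum() - adj[v, in_S].sum()
        in_S[v] = True
        h = cut / (d * min(i + 1, n - i - 1))
        if h < best_h:
            best_h, best_i = h, i
    return set(order[: best_i + 1].tolist()), best_h
```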
We have the following analysis of the quality of the solution:

Lemma 1 (Analysis of Spectral Partitioning) Let $G=(V,E)$ be a $d$-regular graph, let $x \in {\mathbb R}^V$ be a vector orthogonal to $(1, \ldots, 1)$, define

$\displaystyle \delta := \frac{x^T (I - M) x}{x^T x}$

and let $S$ be the output of algorithm SpectralPartitioning on input $G$ and $x$. Then

$\displaystyle h(S) \leq \sqrt{2 \delta}$
Remark 1 If we apply the lemma to the case in which $x$ is an eigenvector of $\lambda_2$, then $\delta = 1 - \lambda_2$, and so we have

$\displaystyle h(G) \leq h(S) \leq \sqrt{2 \cdot (1-\lambda_2)}$

which is the difficult direction of Cheeger's inequalities.
Remark 2 If we run the SpectralPartitioning algorithm with the eigenvector of the second eigenvalue $\lambda_2$, we find a set $S$ whose expansion is

$\displaystyle h(S) \leq \sqrt{2 \cdot (1-\lambda_2)} \leq 2 \sqrt{h(G)}$
Even though this doesn’t give a constant-factor approximation to the edge expansion, it gives a very efficient, and non-trivial, approximation.
As we will see in a later lecture, there is a nearly linear time algorithm that finds a vector $x$ for which the expression $\delta$ in the lemma is very close to $1-\lambda_2$, so, overall, for any graph $G$ we can find a cut of expansion $O(\sqrt{h(G)})$ in nearly linear time.
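For concreteness, here is how one might feed an eigenvector of $\lambda_2$ to the sketch above, using scipy (the circulant test graph and all names are mine):

```python
import numpy as np
import scipy.sparse.linalg as spla

n, d = 40, 4
A = np.zeros((n, n))
for v in range(n):                        # 4-regular circulant graph: v ~ v +- 1, v +- 2
    for s in (1, 2):
        A[v, (v + s) % n] = A[v, (v - s) % n] = 1.0
lam, vecs = spla.eigsh(A / d, k=2, which='LA')   # two largest eigenvalues of M = A/d
x = vecs[:, 0]                                   # eigenvector of lambda_2 (lam is ascending)
S, h = spectral_partitioning(A, x)
print(h, np.sqrt(2 * (1 - lam[0])))              # h(S) <= sqrt(2 (1 - lambda_2))
```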
In June, I wrote about my work on using spectral partitioning to approximate Max Cut.
I have now posted a revised paper with a couple of new things.
One is an improved analysis, due to Moses Charikar, of the algorithm described in the June paper. Moses shows that if one starts from a graph in which the optimum cuts a $1-\epsilon$ fraction of edges, then the algorithm cuts at least a $1 - 4\sqrt{\epsilon} + 8\epsilon$ fraction of edges (and also at least half of the edges). Cutting more than a $1 - \frac 2\pi \sqrt{\epsilon}$ fraction of edges is Unique-Games-hard. Optimizing the fraction

$\displaystyle \min_{\epsilon}\ \frac{\max \left\{ \frac 12 ,\ 1 - 4\sqrt{\epsilon} + 8 \epsilon \right\}}{1-\epsilon}$

we see that the approximation ratio of the algorithm is always at least $.531$.
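Taking the bounds above at face value, the optimization is a one-liner to check numerically (my own snippet, assuming the $1 - 4\sqrt{\epsilon} + 8\epsilon$ bound as reconstructed):

```python
import numpy as np

eps = np.linspace(1e-9, 0.5, 1_000_000)
cut = np.maximum(0.5, 1 - 4 * np.sqrt(eps) + 8 * eps)  # fraction cut by the algorithm
print((cut / (1 - eps)).min())                          # approximation ratio ~ 0.531
```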
The second new result answers a question raised by an anonymous commenter in June: what about Max Cut Gain?
In the Max Cut problem, we are given an undirected graph $G=(V,E)$ and we want to find a partition $(A, V-A)$ of the set of vertices such that as many edges as possible have one endpoint in $A$ and one endpoint in $V-A$, and are hence cut by the partition.
It is easy, as recognized since the 1970s, to find a partition that cuts half of the edges and that, thus, is at least half as good as an optimal solution. No approximation better than 1/2 was known for this problem, until Goemans and Williamson famously proved that one could achieve a .878… approximation using semidefinite programming (SDP).
No other approximation algorithm achieving an approximation asymptotically better than 1/2 is known, and it seems that a fundamental difficulty is the following. Suppose we prove that a certain algorithm achieves approximation ratio $r > 1/2$. Then, given a graph in which the optimum is, say, $\frac 12 + \epsilon$ for small $\epsilon$, the algorithm and its analysis must provide a certificate that the optimum cut in the given graph is at most $\frac{1/2 + \epsilon}{r} < 1$, and there is no general technique to prove upper bounds to the Max Cut optimum of a general graph other than Semidefinite Programming. (And see here and here for negative results showing that large classes of linear programming relaxations are unable to give such certificates.)
Spectral techniques can prove upper bounds to Max Cut in certain cases (and can be seen as special cases of the upper bounds provided by the Goemans-Williamson relaxation).
In the simplified case in which $G$ is a $d$-regular graph, let $A$ be the adjacency matrix of $G$ and $\lambda_1 \geq \cdots \geq \lambda_n$ be the eigenvalues of $\frac 1d A$; then it is easy to show that

$\displaystyle \mathrm{maxcut}(G) \leq \frac 12 - \frac{\lambda_n}{2} \ \ \ \ \ (1)$

where $\mathrm{maxcut}(G)$ is the fraction of edges cut by an optimal solution. Unfortunately (1) does not quite have an approximate converse: there are graphs where $\lambda_n = -1$ but $\mathrm{maxcut}(G) = \frac 12 + o(1)$.
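A brute-force sanity check of (1) on a tiny example (my own code; the 5-cycle is 2-regular and non-bipartite):

```python
import itertools
import numpy as np

def maxcut_fraction(adj):
    """Exhaustive Max Cut; exponential time, tiny graphs only."""
    n = adj.shape[0]
    m = adj.sum() / 2
    best = 0.0
    for bits in itertools.product([0, 1], repeat=n - 1):
        side = np.array((0,) + bits)
        cut = sum(adj[u, v] for u in range(n) for v in range(u + 1, n)
                  if side[u] != side[v])
        best = max(best, cut / m)
    return best

adj = np.roll(np.eye(5), 1, axis=1) + np.roll(np.eye(5), -1, axis=1)  # 5-cycle
lam_n = np.linalg.eigvalsh(adj / 2).min()
print(maxcut_fraction(adj), 0.5 - lam_n / 2)   # 0.8 <= 0.904...
```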
The following fact, however, is always true and well known:

- $\lambda_n = -1$ if and only if $G$ contains a bipartite connected component.
Is there an "approximate" version of the above statement, characterizing the cases in which $1 + \lambda_n$ is small? Surprisingly, as far as I know the question had not been considered before.
For comparison, the starting point of the theory of edge expansion is the related fact

- $\lambda_2 = 1$ if and only if $G$ is disconnected.

This can be rephrased as:

- $1 - \lambda_2 = 0$ if and only if there is a non-empty $S \subseteq V$, $|S| \leq \frac{|V|}{2}$, such that $E(S, V-S) = 0$.
Cheeger's inequality characterizes the case in which $1 - \lambda_2$ is small:

- If there is a non-empty $S \subseteq V$, $|S| \leq \frac{|V|}{2}$, such that $E(S, V-S) \leq \epsilon \cdot d \cdot |S|$, then $1 - \lambda_2 \leq 2 \epsilon$;
- If $1 - \lambda_2 \leq \epsilon$, then there is a non-empty $S \subseteq V$, $|S| \leq \frac{|V|}{2}$, such that $E(S, V-S) \leq \sqrt{2\epsilon} \cdot d \cdot |S|$.
For a subset $S \subseteq V$ and a bipartition $(A, B)$ of $S$ (that is, $A \cup B = S$ and $A \cap B = \emptyset$), we say that an edge $(u,v)$ fails [to be cut by] the bipartition if $(u,v)$ is incident on $S$ but it is not the case that one endpoint is in $A$ and one endpoint is in $B$. (This means that either both endpoints are in $A$, or both endpoints are in $B$, or one endpoint is in $S$ and one endpoint is not in $S$.) Then we can express the well-known fact about $\lambda_n$ as

- $\lambda_n = -1$ if and only if there is a non-empty $S$ and a bipartition of $S$ with zero failed edges.
In this new paper I prove the following approximate version:
- If there is a non-empty $S$, and a bipartition of $S$ with at most $\epsilon \cdot d \cdot |S|$ failed edges, then $1 + \lambda_n \leq 2\epsilon$;
- If $1 + \lambda_n \leq \epsilon$, then there is a non-empty $S$, and a bipartition of $S$ with at most $\sqrt{2\epsilon} \cdot d \cdot |S|$ failed edges.
The following notation makes the similarity with Cheeger's inequality clearer. Define the edge expansion of a graph as

$\displaystyle h(G) := \min_{S:\ 1 \leq |S| \leq \frac{|V|}{2}} \frac{E(S, V-S)}{d \cdot |S|}$
Let us define the bipartiteness ratio of $G$ as

$\displaystyle \beta(G) := \min_{S,\ (A,B)} \frac{\mathrm{fail}(A,B)}{d \cdot |S|}$

that is, as the minimum, over non-empty sets $S$ and bipartitions $(A,B)$ of $S$, of the ratio between the number of failed edges of the bipartition and $d \cdot |S|$.
Then Cheeger's inequality gives

$\displaystyle \frac{1-\lambda_2}{2} \leq h(G) \leq \sqrt{2 \cdot (1-\lambda_2)}$

and our results give

$\displaystyle \frac{1+\lambda_n}{2} \leq \beta(G) \leq \sqrt{2 \cdot (1+\lambda_n)}$
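To make the parallel concrete, here is a brute-force check of the second chain of inequalities on the 5-cycle (my own code; exponential in $n$, so only for tiny graphs):

```python
import itertools
import numpy as np

def bipartiteness_ratio(adj, d):
    """beta(G): minimize (failed edges) / (d |S|) over non-empty S and
    bipartitions (A, B) of S; side = +1 (A), -1 (B), 0 (not in S)."""
    n = adj.shape[0]
    best = np.inf
    for sides in itertools.product([1, -1, 0], repeat=n):
        side = np.array(sides)
        if not side.any():
            continue
        failed = sum(adj[u, v] for u in range(n) for v in range(u + 1, n)
                     if adj[u, v] and (side[u] or side[v]) and side[u] * side[v] != -1)
        best = min(best, failed / (d * np.count_nonzero(side)))
    return best

adj = np.roll(np.eye(5), 1, axis=1) + np.roll(np.eye(5), -1, axis=1)  # 5-cycle
lam_n = np.linalg.eigvalsh(adj / 2).min()
beta = bipartiteness_ratio(adj, 2)
print((1 + lam_n) / 2, beta, np.sqrt(2 * (1 + lam_n)))  # 0.095 <= 0.1 <= 0.618
```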
This translates into an efficient algorithm that, given a graph $G$ such that $\mathrm{maxcut}(G) \geq 1-\epsilon$, finds a set $S$ and a bipartition of $S$ such that at least a $1 - O(\sqrt{\epsilon})$ fraction of the edges incident on $S$ are cut by the bipartition. Removing the vertices in $S$ and continuing recursively on the residual graph yields a .50769… approximation algorithm for Max Cut. (The algorithm stops making recursive calls, and uses a random partition, when the bipartition of $S$ found by the algorithm has too many failed edges.)
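The subroutine at the heart of such an algorithm is a two-sided sweep of a vector $x$, for example an eigenvector of the smallest eigenvalue of $\frac 1d A$. The following sketch is my own simplification, not the algorithm from the paper: it tries the sets $S_t = \{ v : |x_v| \geq t \}$, bipartitioned according to the sign of $x_v$.

```python
import numpy as np

def two_sided_sweep(adj, x):
    """For each threshold t, let S_t = {v : |x_v| >= t}, bipartitioned into
    A_t = {v : x_v <= -t} and B_t = {v : x_v >= t}; return the assignment
    (+1 for A, -1 for B, 0 for V - S) minimizing (failed edges) / (d |S_t|)."""
    n = len(x)
    d = adj[0].sum()                      # assumes a d-regular graph
    best_ratio, best_side = np.inf, None
    for t in np.unique(np.abs(x)):
        side = np.where(x >= t, -1, np.where(x <= -t, 1, 0))
        if not side.any():
            continue
        # an edge incident on S_t fails unless it goes between A_t and B_t
        failed = sum(adj[u, v] for u in range(n) for v in range(u + 1, n)
                     if adj[u, v] and (side[u] or side[v]) and side[u] * side[v] != -1)
        ratio = failed / (d * np.count_nonzero(side))
        if ratio < best_ratio:
            best_ratio, best_side = ratio, side
    return best_side, best_ratio
```

One would call it, say, with `x = np.linalg.eigh(A / d)[1][:, 0]`, the eigenvector of the smallest eigenvalue.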
The paper is entirely a by-product of the ongoing series of posts on edge expansion: the question of the relation between spectral techniques and Max Cut was asked by a commenter, and the probabilistic view of the proof of Cheeger's inequality that I wrote up in this post was very helpful in understanding the gap between $1 + \lambda_n$ and $\beta(G)$.
We continue to talk about the problem of estimating the expansion of a graph $G=(V,E)$, focusing on the closely related sparsest cut, defined as

$\displaystyle \phi(G) := \min_{S} \frac{E(S, V-S)}{\frac dn \cdot |S| \cdot |V-S|}$
The spectral partitioning algorithm first finds a vector $x$ minimizing

$\displaystyle \frac{\sum_{u,v} A_{u,v} |x_u - x_v|^2}{\frac dn \sum_{u,v} |x_u - x_v|^2} \ \ \ \ \ (1)$

(where $A$ is the adjacency matrix of $G$, and the minimum is over non-constant vectors) and then finds the best cut $(S, V-S)$ where $S$ is of the form $\{ v : x_v \leq t \}$ for a threshold $t$.
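To see why (1) is a relaxation of sparsest cut (a one-line check, spelling out what the post takes for granted): if $x = {\mathbf 1}_S$ is the indicator vector of a set $S$, then $\sum_{u,v} A_{u,v} |x_u - x_v|^2 = 2 E(S, V-S)$ and $\sum_{u,v} |x_u - x_v|^2 = 2 |S| \cdot |V-S|$, so the ratio in (1) is exactly $\phi(S)$; minimizing over all non-constant real vectors can therefore only give a value that is at most $\phi(G)$.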
We proved that if the quantity in (1) is $\epsilon$ and $G$ is $d$-regular, then the algorithm will find a cut of sparsity at most $\sqrt{8\epsilon}$, and that if $x$ is the eigenvector of the second eigenvalue, then it is an optimal solution to (1), and the cost of an optimal solution to (1), which equals $1-\lambda_2$ for the scaled adjacency matrix $\frac 1d A$, is a lower bound to $\phi(G)$. This means that the algorithm finds a cut of sparsity at most $\sqrt{8 \phi(G)}$.
Can the analysis be improved?
[In which we prove the “difficult part” of Cheeger’s inequality by analyzing a randomized rounding algorithm for a continuous relaxation of sparsest cut.]
We return to this month's question: if $G=(V,E)$ is a $d$-regular graph, how well can we approximate its edge expansion, defined as

$\displaystyle h(G) := \min_{S:\ |S| \leq \frac{|V|}{2}} \frac{\sum_{u \in S, v \notin S} A_{u,v}}{|S|}$

and its sparsest cut, defined as

$\displaystyle \phi(G) := \min_{S} \frac{\sum_{u \in S, v \notin S} A_{u,v}}{\frac 1n \cdot |S| \cdot |V-S|}$

where $A$ is the adjacency matrix of $G$.
We have looked at three continuous relaxations of $\phi(G)$: the spectral gap, the Leighton-Rao linear program, and the Arora-Rao-Vazirani semidefinite program.
As we saw, the spectral gap of $G$, defined as the difference $d - \lambda_2$ between the largest and the second largest eigenvalue of $A$, can be seen as the solution to a continuous optimization problem:

$\displaystyle d - \lambda_2 = \min_{x \perp (1,\ldots,1)} \frac{\sum_{\{u,v\} \in E} |x_u - x_v|^2}{\frac 1n \sum_{\{u,v\}} |x_u - x_v|^2} \ \ \ \ \ (2)$
It follows from the definitions that

$\displaystyle d - \lambda_2 \leq \phi(G) \leq 2 h(G)$

which is the "easy direction" of Cheeger's inequality, and the interesting thing is that $d - \lambda_2$ is never much smaller than these quantities, and it obeys

$\displaystyle h(G) \leq \sqrt{2 d \cdot (d-\lambda_2)} \ \ \ \ \ (1)$

which is the difficult part of Cheeger's inequality. When we normalize all quantities by the degree, the inequality reads as

$\displaystyle \frac{h(G)}{d} \leq \sqrt{2 \cdot \left( 1 - \frac{\lambda_2}{d} \right)}$
I have taught (1) in three courses and used it in two papers, but I had never really understood it; I consider a mathematical proof to be understood if one can see it as a series of inevitable steps. Many steps in the proofs of (1) I had read, however, looked like magic tricks with no explanation. Finally, however, I have found a way to describe the proof that makes sense to me. (I note that everything I will say in this post will be completely obvious to the experts, but I hope some non-expert will read it and find it helpful.)
We prove (1), as usual, by showing that given any $x \perp (1,\ldots,1)$ such that

$\displaystyle \frac{\sum_{\{u,v\} \in E} |x_u - x_v|^2}{\frac 1n \sum_{\{u,v\}} |x_u - x_v|^2} \leq \epsilon$

we can find a threshold $t$ such that the cut $(S, V-S)$ defined by $S := \{ v : x_v \leq t \}$ satisfies

$\displaystyle \frac{E(S, V-S)}{\min \{ |S|, |V-S| \}} \leq \sqrt{2 d \epsilon} \ \ \ \ \ (3)$
This not only gives us a proof of (1), but also an algorithm for finding sparse cuts when they exist: take a vector $x$ which is an eigenvector of the second eigenvalue (or simply a vector for which the Rayleigh quotient in (2) is small), sort the vertices according to the value of $x_v$, and find the best cut among the "threshold cuts." This is the "spectral partitioning" algorithm.
This means that proving (1) amounts to studying an algorithm that "rounds" the solution of a continuous relaxation to a combinatorial solution, and there is a standard pattern to such arguments in computer science: we describe a randomized rounding algorithm, study its average performance, and then argue that there is a fixed choice that is at least as good as an average one. Here, in particular, we would like to find a distribution over thresholds $t$, such that if we define $S := \{ v : x_v \leq t \}$ as a random variable in terms of $t$ we have

$\displaystyle \mathop{\mathbb E} [ E(S, V-S) ] \leq \sqrt{2 d \epsilon} \cdot \mathop{\mathbb E} [ \min \{ |S|, |V-S| \} ]$

and so, using linearity of expectation,

$\displaystyle \mathop{\mathbb E} \left[ E(S, V-S) - \sqrt{2 d \epsilon} \cdot \min \{ |S|, |V-S| \} \right] \leq 0$
from which we see that there must be a threshold $t$ in our sample space such that (3) holds. I shall present a proof that explicitly follows this pattern. It would be nice if we could choose $t$ uniformly at random in the interval $[\min_v x_v, \max_v x_v]$, but I don't think it would work. (Can any reader see a counterexample? I couldn't, but we'll come back to it.) Instead, the following works: assuming with no loss of generality that the median of $x$ is zero, $t$ can be chosen so that $t \cdot |t|$ is distributed uniformly in the interval $[\min_v x_v \cdot |\min_v x_v|,\ \max_v x_v \cdot |\max_v x_v|]$; equivalently, $t$ is chosen with probability density proportional to $|t|$. (This means that thresholds near the median are less likely to be picked than thresholds far from the median.) This choice seems to be a magic trick in itself, voiding the point I made above, but I hope it will become natural as we unfold our argument.
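To close with a quick experiment (my own code, not part of the original argument): one can sample $t$ with density proportional to $|t|$ as described, and check the two expectations numerically on a small graph.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 2
A = np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1)  # n-cycle
vals, vecs = np.linalg.eigh(A / d)        # eigenvalues in ascending order
x = vecs[:, -2] - np.median(vecs[:, -2])  # lambda_2 eigenvector, median moved to 0
eps = d * (1 - vals[-2])                  # the Rayleigh quotient in (2), roughly

# sample t so that t|t| is uniform in [min x |min x|, max x |max x|]
u = rng.uniform(x.min() * abs(x.min()), x.max() * abs(x.max()), size=20_000)
t = np.sign(u) * np.sqrt(np.abs(u))

S = x[None, :] <= t[:, None]              # one threshold cut per sample
cut = ((S[:, :, None] ^ S[:, None, :]) * A).sum(axis=(1, 2)) / 2
small = np.minimum(S.sum(axis=1), n - S.sum(axis=1))
print(cut.mean() / small.mean(), np.sqrt(2 * d * eps))  # first should be <= second
```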