In which we introduce semi-definite programming and a semi-definite programming relaxation of sparsest cut, and we reduce its analysis to a key lemma that we will prove in the next lecture(s)
Tag Archives: sparsest cut
CS294 Lecture 9: The Sparsest Cut Problem
In which we introduce the sparsest cut problem and the Leighton-Rao relaxation.
1. The Uniform Sparsest Cut problem, Edge Expansion and $\lambda_2$
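Let $G=(V,E)$ be an undirected graph with $n := |V|$ vertices.
We define the uniform sparsity of a cut $(S,V-S)$ as

$$ {\sf usc}_G(S) := \frac{E(S,V-S)}{|S| \cdot |V-S|} $$

where $E(S,V-S)$ is the number of edges with one endpoint in $S$ and one in $V-S$ (we will omit the subscript when clear from the context) and the uniform sparsest cut of a graph is

$$ {\sf usc}(G) := \min_{S} {\sf usc}_G(S) $$

In $d$-regular graphs, approximating the uniform sparsest cut is equivalent (up to a factor of 2 in the approximation) to approximating the edge expansion, because, for every cut $(S,V-S)$, we have

$$ \phi(S,V-S) = \frac{E(S,V-S)}{d \cdot \min \{ |S|, |V-S| \}} $$

and, noting that, for every $S$,

$$ \frac 1n |S| \cdot |V-S| \leq \min \{ |S|, |V-S| \} \leq \frac 2n |S| \cdot |V-S| $$

we have, for every $S$,

$$ \phi(S,V-S) \leq \frac nd \cdot {\sf usc}(S) \leq 2 \phi(S,V-S) $$

and so

$$ \phi(G) \leq \frac nd \cdot {\sf usc}(G) \leq 2 \phi(G) $$

It will be instructive to see that, in $d$-regular graphs, $\lambda_2$ is a relaxation of $\frac nd \, {\sf usc}(G)$, a fact that gives an alternative proof of the easy direction of Cheeger’s inequalities.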
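To make the factor-of-2 equivalence concrete, here is a minimal brute-force sketch that enumerates every cut of a small regular graph and checks that the edge expansion of each cut is sandwiched as $\phi \leq \frac nd \cdot {\sf usc} \leq 2\phi$. The helper names and the choice of the 6-cycle as a test graph are assumptions made for illustration, and the enumeration is exponential, so it is only meant for tiny graphs.

```python
from itertools import combinations

def cut_edges(edges, S):
    """Number of edges with exactly one endpoint in S."""
    return sum(1 for u, v in edges if (u in S) != (v in S))

def usc(n, edges, S):
    """Uniform sparsity E(S, V-S) / (|S| * |V-S|)."""
    return cut_edges(edges, S) / (len(S) * (n - len(S)))

def phi(n, d, edges, S):
    """Edge expansion E(S, V-S) / (d * min(|S|, |V-S|)) in a d-regular graph."""
    return cut_edges(edges, S) / (d * min(len(S), n - len(S)))

def all_cuts(n):
    """Every non-trivial subset S of {0, ..., n-1}."""
    for k in range(1, n):
        for S in combinations(range(n), k):
            yield frozenset(S)

# Hypothetical test graph: the 6-cycle, which is 2-regular.
n, d = 6, 2
edges = [(i, (i + 1) % n) for i in range(n)]

# Check the sandwich phi <= (n/d) * usc <= 2 * phi on every cut.
for S in all_cuts(n):
    scaled = (n / d) * usc(n, edges, S)
    assert phi(n, d, edges, S) <= scaled + 1e-12
    assert scaled <= 2 * phi(n, d, edges, S) + 1e-12

usc_G = min(usc(n, edges, S) for S in all_cuts(n))
phi_G = min(phi(n, d, edges, S) for S in all_cuts(n))
```

On the 6-cycle both minima are achieved by three consecutive vertices, which cut two edges.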
CS261 Lecture 16: Multicommodity flow
In which we define a multi-commodity flow problem, and we see that its dual is the relaxation of a useful graph partitioning problem. The relaxation can be rounded to yield an approximate graph partitioning algorithm.
CS359G Lecture 10: Non-uniform Sparsest Cut
In which we prove that there are $n$-point metric spaces that cannot be embedded into L1 with distortion $o(\log n)$, and we see further applications of Leighton-Rao type relaxations and of the use of metric embeddings as rounding schemes.
CS359G Lecture 8: the Leighton-Rao Relaxation
In which we introduce the Leighton-Rao relaxation of sparsest cut.
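Let $G=(V,E)$ be an undirected graph. Unlike past lectures, we will not need to assume that $G$ is regular. We are interested in finding a sparsest cut in $G$, where the sparsity of a non-trivial bipartition $(S,V-S)$ of the vertices is

$$ \phi_G(S) := \frac{\frac{1}{|E|} \cdot E(S,V-S)}{\frac{2}{|V|^2} \cdot |S| \cdot |V-S|} $$

which is the ratio between the fraction of edges that are cut by $(S,V-S)$ and the fraction of pairs of vertices that are disconnected by the removal of those edges.

Another way to write the sparsity of a cut is as

$$ \phi_G(S) = \frac{|V|^2}{2|E|} \cdot \frac{\sum_{i,j} A_{i,j} |1_S(i) - 1_S(j)|}{\sum_{i,j} |1_S(i) - 1_S(j)|} $$

where $A$ is the adjacency matrix of $G$ and $1_S(\cdot)$ is the indicator function of the set $S$.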
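As a sanity check, the two ways of writing the sparsity can be compared numerically. The sketch below assumes the normalization in which the fraction of disconnected pairs is counted over ordered pairs, that is $2|S| \cdot |V-S| / |V|^2$, and that sums over $i,j$ range over ordered pairs; the function names and the example graph are hypothetical.

```python
def num_edges(A):
    """|E| for a symmetric 0/1 adjacency matrix with zero diagonal."""
    return sum(map(sum, A)) / 2

def sparsity_by_counting(A, S):
    """(fraction of edges cut) / (fraction of ordered pairs separated)."""
    n = len(A)
    cut = sum(A[i][j] for i in S for j in range(n) if j not in S)
    frac_edges_cut = cut / num_edges(A)
    frac_pairs_separated = 2 * len(S) * (n - len(S)) / n**2
    return frac_edges_cut / frac_pairs_separated

def sparsity_by_indicator(A, S):
    """Same quantity written with the adjacency matrix and the indicator 1_S."""
    n = len(A)
    ind = [1 if v in S else 0 for v in range(n)]
    num = sum(A[i][j] * abs(ind[i] - ind[j]) for i in range(n) for j in range(n))
    den = sum(abs(ind[i] - ind[j]) for i in range(n) for j in range(n))
    return n**2 / (2 * num_edges(A)) * num / den

# Hypothetical example: the path 0 - 1 - 2 - 3, which is not regular.
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
S = {0, 1}
a = sparsity_by_counting(A, S)
b = sparsity_by_indicator(A, S)
```

Both expressions count each cut edge and each separated pair twice in the indicator form, so the factors of 2 cancel and the two values agree.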
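The observation that led us to see $1-\lambda_2$ as the optimum of a continuous relaxation of $\phi$ was to observe that $|1_S(i) - 1_S(j)| = |1_S(i) - 1_S(j)|^2$, and then relax the problem by allowing arbitrary functions $x: V \rightarrow {\mathbb R}$ instead of indicator functions $1_S : V \rightarrow \{0,1\}$.
The Leighton-Rao relaxation of sparsest cut is obtained using, instead, the following observation: if, for a set $S$, we define $d_S(i,j) := |1_S(i) - 1_S(j)|$, then $d_S(\cdot,\cdot)$ defines a semi-metric over the set $V$, because $d_S$ is symmetric, $d_S(i,i) = 0$, and the triangle inequality holds. So we could think about allowing arbitrary semi-metrics in the expression for $\phi$, and define

$$ LR(G) := \min_{d : V \times V \rightarrow {\mathbb R},\ d \ {\rm semi\mbox{-}metric}} \ \frac{|V|^2}{2|E|} \cdot \frac{\sum_{i,j} A_{i,j} d(i,j)}{\sum_{i,j} d(i,j)} \qquad (1) $$

This might seem like such a broad relaxation that there could be graphs on which $LR(G)$ bears no connection to $\phi(G)$, the sparsity of a sparsest cut of $G$. Instead, we will prove the fairly good estimate

$$ LR(G) \leq \phi(G) \leq O(\log |V|) \cdot LR(G) \qquad (2) $$

Furthermore, we will show that $LR(G)$ and an optimal solution $d(\cdot,\cdot)$ can be computed in polynomial time, and that the second inequality above has a constructive proof, from which we derive a polynomial time $O(\log |V|)$-approximate algorithm for sparsest cut.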
1. Formulating the Leighton-Rao Relaxation as a Linear Program
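The value $LR(G)$ and an optimal $d(\cdot,\cdot)$ can be computed in polynomial time by solving the following linear program

$$ \begin{array}{lll}
{\rm minimize} & \sum_{u,v} A_{u,v} d_{u,v} \\
{\rm subject\ to} \\
& \sum_{u,v} d_{u,v} = \frac{|V|^2}{2|E|} \\
& d_{u,z} \leq d_{u,v} + d_{v,z} & \forall u,v,z \in V \\
& d_{u,v} \geq 0 & \forall u,v \in V
\end{array} \qquad (3) $$

that has a variable $d_{u,v}$ for every unordered pair of distinct vertices $\{u,v\}$. Clearly, every solution to the linear program (3) is also a solution to the right-hand side of the definition (1) of the Leighton-Rao parameter, with the same cost. Also every semi-metric can be normalized so that $\sum_{u,v} d(u,v) = |V|^2/2|E|$ by multiplying every distance by a fixed constant, and the normalization does not change the value of the right-hand side of (1); after the normalization, the semimetric is a feasible solution to the linear program (3), with the same cost.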
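The normalization argument can be checked mechanically. The following sketch assumes that the linear program fixes $\sum_{u,v} d_{u,v} = |V|^2/(2|E|)$, with sums over ordered pairs, and minimizes $\sum_{u,v} A_{u,v} d_{u,v}$; under that assumption it scales a semi-metric, verifies that the result is still a semi-metric, and confirms that the LP objective of the scaled metric equals the cost in (1). All function names are hypothetical, and the example uses a cut metric on a 4-vertex path.

```python
def is_semimetric(d, tol=1e-12):
    """Check d(u,u) = 0, nonnegativity, symmetry and the triangle inequality."""
    n = len(d)
    for u in range(n):
        if abs(d[u][u]) > tol:
            return False
        for v in range(n):
            if d[u][v] < -tol or abs(d[u][v] - d[v][u]) > tol:
                return False
            for z in range(n):
                if d[u][z] > d[u][v] + d[v][z] + tol:
                    return False
    return True

def lr_cost(A, d):
    """Cost of the semi-metric d in (1): |V|^2/(2|E|) * sum(A*d) / sum(d)."""
    n = len(A)
    two_E = sum(map(sum, A))  # the entries of A sum to 2|E|
    num = sum(A[u][v] * d[u][v] for u in range(n) for v in range(n))
    den = sum(map(sum, d))
    return n**2 / two_E * num / den

def lp_objective(A, d):
    """Objective of the linear program: sum over ordered pairs of A[u][v]*d[u][v]."""
    n = len(A)
    return sum(A[u][v] * d[u][v] for u in range(n) for v in range(n))

def normalize(A, d):
    """Scale d so that its entries sum to |V|^2 / (2|E|)."""
    target = len(A) ** 2 / sum(map(sum, A))
    total = sum(map(sum, d))
    return [[x * target / total for x in row] for row in d]

# Hypothetical example: the path 0 - 1 - 2 - 3 and the cut metric of S = {0, 1}.
A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
S = {0, 1}
d = [[float((u in S) != (v in S)) for v in range(4)] for u in range(4)]
d_norm = normalize(A, d)
```

Because scaling multiplies the numerator and denominator of (1) by the same constant, `lr_cost` is unchanged, and after scaling the denominator cancels the factor $|V|^2/(2|E|)$, which is why the LP objective alone equals the cost.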
In the rest of this lecture and the next, we will show how to round a solution to (3) into a cut, achieving the logarithmic approximation promised in (2).
2. An L1 Relaxation of Sparsest Cut
In the Leighton-Rao relaxation, we relax distance functions of the form $d(i,j) = |1_S(i) - 1_S(j)|$ to completely arbitrary distance functions. Let us consider an intermediate relaxation, in which we allow distance functions that can be realized by an embedding of the vertices in an $\ell_1$ space.
Recall that, for a vector ${\bf x} \in {\mathbb R}^n$, its $\ell_1$ norm is defined as $|| {\bf x} ||_1 := \sum_i |x_i|$, and that this norm makes ${\mathbb R}^n$ into a metric space with the $\ell_1$ distance function

$$ || {\bf x} - {\bf y} ||_1 = \sum_i |x_i - y_i| $$

The distance function $d(i,j) = |1_S(i) - 1_S(j)|$ is an example of a distance function that can be realized by mapping each vertex to a real vector, and then defining the distance between two vertices as the $\ell_1$ norm of the difference of the respective vectors. Of course it is an extremely restrictive special case, in which the dimension of the vectors is one, and in which every vertex is actually mapped to either zero or one. Let us consider the relaxation of sparsest cut to arbitrary $\ell_1$ mappings, and define

$$ \phi'(G) := \inf_{m,\ f: V \rightarrow {\mathbb R}^m} \frac{|V|^2}{2|E|} \cdot \frac{\sum_{i,j} A_{i,j} || f(i) - f(j) ||_1}{\sum_{i,j} || f(i) - f(j) ||_1} $$

This may seem like another very broad relaxation of sparsest cut, whose optimum might bear no correlation with the sparsest cut optimum. The following theorem shows that this is not the case.
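Theorem 1 For every graph $G$, $\phi(G) = \phi'(G)$.
Furthermore, there is a polynomial time algorithm that, given a mapping $f: V \rightarrow {\mathbb R}^m$, finds a cut $S$ such that

$$ \frac{\sum_{u,v} A_{u,v} |1_S(u) - 1_S(v)|}{\sum_{u,v} |1_S(u) - 1_S(v)|} \leq \frac{\sum_{u,v} A_{u,v} || f(u) - f(v) ||_1}{\sum_{u,v} || f(u) - f(v) ||_1} \qquad (4) $$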
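Proof: We use ideas that have already come up in the proof of the difficult direction of Cheeger’s inequality. First, we note that for all nonnegative reals $a_1,\ldots,a_m$ and positive reals $b_1,\ldots,b_m$ we have

$$ \frac{a_1 + \cdots + a_m}{b_1 + \cdots + b_m} \geq \min_i \frac{a_i}{b_i} \qquad (5) $$

Let $f_i(v)$ be the $i$-th coordinate of the vector $f(v)$, thus $f(v) = (f_1(v),\ldots,f_m(v))$. Then we can decompose the right-hand side of (4) by coordinates, and write

$$ \frac{\sum_{u,v} A_{u,v} || f(u) - f(v) ||_1}{\sum_{u,v} || f(u) - f(v) ||_1} = \frac{\sum_i \sum_{u,v} A_{u,v} | f_i(u) - f_i(v) |}{\sum_i \sum_{u,v} | f_i(u) - f_i(v) |} \geq \min_i \frac{\sum_{u,v} A_{u,v} | f_i(u) - f_i(v) |}{\sum_{u,v} | f_i(u) - f_i(v) |} $$

This already shows that, in the definition of $\phi'$, we can map, with no loss of generality, to 1-dimensional $\ell_1$ spaces.
Let $i$ be the coordinate that achieves the minimum above. Because the cost function is invariant under shifts and scalings (that is, the cost of a function $x \rightarrow f(x)$ is the same as the cost of $x \rightarrow a f(x) + b$ for every two constants $a \neq 0$ and $b$) there is a function $g: V \rightarrow {\mathbb R}$ such that $g$ has the same cost as $f_i$ and it has a unit-length range $\max_v g(v) - \min_v g(v) = 1$.
Let us now pick a threshold $t$ uniformly at random from the interval $[\min_v g(v), \max_v g(v)]$, and define the random variables

$$ S_t := \{ v : g(v) \leq t \} $$

We observe that for every pair of vertices $u,v$ we have

$$ \mathop{\mathbb E} |1_{S_t}(u) - 1_{S_t}(v)| = |g(u) - g(v)| $$

and so we get

$$ \frac{\sum_{u,v} A_{u,v} || f(u) - f(v) ||_1}{\sum_{u,v} || f(u) - f(v) ||_1} \geq \frac{\sum_{u,v} A_{u,v} | g(u) - g(v) |}{\sum_{u,v} | g(u) - g(v) |} = \frac{\mathop{\mathbb E} \sum_{u,v} A_{u,v} |1_{S_t}(u) - 1_{S_t}(v)|}{\mathop{\mathbb E} \sum_{u,v} |1_{S_t}(u) - 1_{S_t}(v)|} $$

Finally, by an application of (5), we see that there must be a set $S$ among the possible values of $S_t$ such that (4) holds. Notice that the proof was completely constructive: we simply took the coordinate $f_i$ of $f$ with the lowest cost function, and then the “threshold cut” given by $f_i$ with the smallest sparsity.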
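The constructive proof translates almost directly into code. Below is a minimal sketch of the rounding procedure: it keeps the coordinate with the smallest one-dimensional ratio, then tries every threshold cut along that coordinate and returns the best one. The function names, the input representation (a list of coordinate tuples, one per vertex) and the example graph are assumptions for illustration; the sketch also assumes that at least one coordinate of the embedding is non-constant. Since only finitely many distinct threshold cuts exist, the shift and scaling step of the proof is not needed here.

```python
from math import inf

def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
    return A

def ratio(A, dist):
    """sum_{u,v} A[u][v] * dist(u,v) / sum_{u,v} dist(u,v), over ordered pairs."""
    n = len(A)
    num = sum(A[u][v] * dist(u, v) for u in range(n) for v in range(n))
    den = sum(dist(u, v) for u in range(n) for v in range(n))
    return num / den

def l1(x, y):
    """l1 distance between two vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(x, y))

def round_l1_embedding(A, f):
    """Return a cut S whose ratio is at most the l1 ratio of the embedding f."""
    n, m = len(A), len(f[0])
    # Step 1: keep the coordinate with the smallest one-dimensional ratio.
    best_g, best_val = None, inf
    for i in range(m):
        g = [f[v][i] for v in range(n)]
        if max(g) == min(g):
            continue  # constant coordinate: contributes 0 to both sums
        val = ratio(A, lambda u, v: abs(g[u] - g[v]))
        if val < best_val:
            best_g, best_val = g, val
    # Step 2: try every threshold cut S_t = {v : g(v) <= t}.
    best_S, best_cut = None, inf
    for t in sorted(set(best_g))[:-1]:  # the largest value would give S = V
        S = {v for v in range(n) if best_g[v] <= t}
        val = ratio(A, lambda u, v: abs((u in S) - (v in S)))
        if val < best_cut:
            best_S, best_cut = S, val
    return best_S

# Hypothetical example: two triangles joined by the edge (2, 3),
# with a made-up 2-dimensional embedding.
A = adjacency(6, [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])
f = [(0.0, 0.0), (0.1, 1.0), (0.2, 0.0), (0.9, 1.0), (1.0, 0.0), (1.1, 1.0)]
S = round_l1_embedding(A, f)
cut_ratio = ratio(A, lambda u, v: abs((u in S) - (v in S)))
l1_ratio = ratio(A, lambda u, v: l1(f[u], f[v]))
```

In this example the first coordinate has the smaller ratio, and the best threshold along it separates the two triangles.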
3. A Theorem of Bourgain
We will derive our main result (2) from the L1 “rounding” process of the previous section, and from the following theorem of Bourgain (the efficiency considerations are due to Linial, London and Rabinovich).
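Theorem 2 (Bourgain) Let $d: V \times V \rightarrow {\mathbb R}$ be a semimetric defined over a finite set $V$. Then there exists a mapping $f: V \rightarrow {\mathbb R}^m$ such that, for every two elements $u,v \in V$,

$$ || f(u) - f(v) ||_1 \leq d(u,v) \leq || f(u) - f(v) ||_1 \cdot c \cdot \log |V| $$

where $c$ is an absolute constant. Given $d$, the mapping $f$ can be found with high probability in randomized polynomial time in $|V|$.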
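To give a feeling for what such a mapping can look like, here is a small sketch of a Fréchet-style embedding, in which each coordinate is the distance from a random subset of points, with subsets sampled at geometrically decreasing densities. This is an illustration written in the spirit of the standard proofs and is not claimed to be the construction analyzed in these notes; the function names, the number of repetitions and the seed are all assumptions. The sketch only checks the easy, deterministic half of the guarantee, namely that the mapping does not expand distances; the logarithmic bound in the other direction is the hard part of the theorem and is not verified here.

```python
import math
import random

def frechet_embedding(d, reps=5, seed=0):
    """Map each point v to the vector (d(v, A) / m) over m random subsets A."""
    rng = random.Random(seed)
    n = len(d)
    scales = max(1, math.ceil(math.log2(n)))
    subsets = []
    for j in range(1, scales + 1):
        for _ in range(reps):
            # Include each point independently with probability 2^(-j).
            A = [v for v in range(n) if rng.random() < 2.0 ** -j]
            if A:
                subsets.append(A)
    if not subsets:
        subsets.append([0])  # fallback so that the embedding is well defined
    m = len(subsets)
    # d(v, A) is the distance from v to the closest point of A.
    return [[min(d[v][a] for a in A) / m for A in subsets] for v in range(n)]

def l1(x, y):
    """l1 distance between two vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(x, y))

# Hypothetical example: the shortest-path metric of the 8-cycle.
n = 8
d = [[min(abs(u - v), n - abs(u - v)) for v in range(n)] for u in range(n)]
f = frechet_embedding(d)
non_expanding = all(l1(f[u], f[v]) <= d[u][v] + 1e-9
                    for u in range(n) for v in range(n))
```

Each coordinate satisfies $|d(u,A) - d(v,A)| \leq d(u,v)$ by the triangle inequality, so dividing by the number of coordinates makes the whole mapping non-expanding.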
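To see that the above theorem of Bourgain implies (2), consider a graph $G$, and let $d$ be the optimal solution of the Leighton-Rao relaxation of the sparsest cut problem on $G$, and let $f: V \rightarrow {\mathbb R}^m$ be a mapping as in Bourgain’s theorem applied to $d$. Then

$$ LR(G) = \frac{|V|^2}{2|E|} \cdot \frac{\sum_{u,v} A_{u,v} d(u,v)}{\sum_{u,v} d(u,v)} \geq \frac{|V|^2}{2|E|} \cdot \frac{\sum_{u,v} A_{u,v} || f(u) - f(v) ||_1}{c \cdot \log |V| \cdot \sum_{u,v} || f(u) - f(v) ||_1} \geq \frac{1}{c \cdot \log |V|} \cdot \phi(G) $$

where the last inequality follows from Theorem 1.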
CS359G Lecture 1: Overview
In which we describe what this course is about.
1. Overview
This class is about the following topics:
- Approximation algorithms for graph partitioning problems. We will study approximation algorithms for the sparsest cut problem, in which one wants to find a cut (a partition into two sets) of the vertex set of a given graph so that a minimal number of edges cross the cut compared to the number of pairs of vertices that are disconnected by the removal of such edges.
This problem is related to estimating the edge expansion of a graph and to finding balanced separators, that is, ways to disconnect a constant fraction of the pairs of vertices in a graph after removing a minimal number of edges.
Finding balanced separators and sparse cuts arises in clustering problems, in which the presence of an edge denotes a relation of similarity, and one wants to partition vertices into few clusters so that, for the most part, vertices in the same cluster are similar and vertices in different clusters are not. For example, sparse cut approximation algorithms are used for image segmentation, by reducing the image segmentation problem to a graph clustering problem in which the vertices are the pixels of the image and the (weights of the) edges represent similarities between nearby pixels.
Balanced separators are also useful in the design of divide-and-conquer algorithms for graph problems, in which one finds a small set of edges that disconnects the graph, recursively solves the problem on the connected components, and then patches together the partial solutions and the edges of the cut, via either exact methods (usually dynamic programming) or approximate heuristics. The sparsity of the cut determines the running time of the exact algorithms and the quality of approximation of the heuristic ones.
We will study three approximation algorithms:
- The Spectral Partitioning Algorithm, based on linear algebra;
- The Leighton-Rao Algorithm, based on a linear programming relaxation;
- The Arora-Rao-Vazirani Algorithm, based on a semidefinite programming relaxation.
The three approaches are related, because the continuous optimization problem that underlies the Spectral Partitioning algorithm is a relaxation of the ARV semidefinite programming relaxation, and so is the Leighton-Rao relaxation. Rounding the Leighton-Rao and the Arora-Rao-Vazirani relaxations raises interesting problems in metric geometry, some of which are still open.
- Explicit Constructions of Bounded-Degree Expanders. Expander graphs are graphs with very strong connectivity and “pseudorandomness” properties. Constructions of constant-degree expanders are useful in a variety of applications, from the design of data structures, to the derandomization of algorithms, from efficient cryptographic constructions to being building blocks of more complex quasirandom objects.
There are two families of approaches to the explicit (efficient) construction of bounded-degree expanders. One is via algebraic constructions, typically ones in which the expander is constructed as a Cayley graph of a finite group. Usually these constructions are easy to describe but rather difficult to analyze. The study of such expanders, and of the related group properties, has become a very active research program, involving mostly ergodic theorists and number theorists. There are also combinatorial constructions, which are somewhat more complicated to describe but considerably simpler to analyze.
- Bounding the Mixing Time of Random Walks and Approximate Counting and Sampling. If one takes a random walk in a regular graph that is connected and not bipartite, then, regardless of the starting vertex, the distribution of the $t$-th step of the walk is close to the uniform distribution over the vertices, provided that $t$ is large enough. It is always sufficient for $t$ to be quadratic in the number of vertices; in some graphs, however, the distribution is near-uniform even when $t$ is just poly-logarithmic. Among other applications, the study of the “mixing time” (the time that it takes to reach the uniform distribution) of random walks has applications to analyzing the convergence time of certain randomized algorithms.
The design of approximation algorithms for combinatorial counting problems, in which one wants to count the number of solutions to a given NP-type problem, can be reduced to the design of approximately uniform sampling algorithms, in which one wants to approximately sample from the set of such solutions. For example, the task of approximately counting the number of perfect matchings can be reduced to the task of sampling almost uniformly from the set of perfect matchings of a given graph. One can design approximate sampling algorithms by starting from an arbitrary solution and then making a series of random local changes. The behavior of the algorithm then corresponds to performing a random walk in the graph that has a vertex for every possible solution and an edge for each local change that the algorithm can choose to make. Although the graph can have an exponential number of vertices in the size of the problem that we want to solve, it is possible for the approximate sampling algorithm to run in polynomial time, provided that a random walk in the graph converges to uniform in time poly-logarithmic in its size.
The study of the mixing time of random walks in graphs is thus a main analysis tool to bound the running time of approximate sampling algorithms (and, via reductions, of approximate counting algorithms).
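The convergence to uniform described above is easy to observe on small examples. The sketch below evolves the exact distribution of a random walk on a regular graph and measures its total variation distance from uniform after $t$ steps. The function names and the two test graphs are assumptions chosen for illustration: the 5-cycle is connected and not bipartite, so the walk mixes, while the 6-cycle is bipartite, so the walk alternates between the two sides and never gets close to uniform.

```python
def walk_step(A, p):
    """One step of the random walk on a d-regular graph with adjacency matrix A."""
    n = len(A)
    d = sum(A[0])  # every row sums to d in a d-regular graph
    return [sum(p[u] * A[u][v] for u in range(n)) / d for v in range(n)]

def tv_from_uniform(p):
    """Total variation distance between p and the uniform distribution."""
    n = len(p)
    return 0.5 * sum(abs(x - 1 / n) for x in p)

def distance_after(A, start, t):
    """Distance from uniform of the walk started at `start`, after t steps."""
    p = [0.0] * len(A)
    p[start] = 1.0
    for _ in range(t):
        p = walk_step(A, p)
    return tv_from_uniform(p)

def cycle(n):
    """Adjacency matrix of the n-cycle, a 2-regular graph."""
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        A[i][(i + 1) % n] = A[(i + 1) % n][i] = 1
    return A

odd = distance_after(cycle(5), 0, 50)   # connected, not bipartite: mixes
even = distance_after(cycle(6), 0, 50)  # bipartite: stays far from uniform
```

On the 6-cycle the walk is supported on only half of the vertices at every step, so its distance from uniform is always at least 1/2.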
These three research programs are pursued by largely disjoint communities, but share the same mathematical core.