CS359G Lecture 11: ARV

In which we introduce the Arora-Rao-Vazirani relaxation of sparsest cut, and discuss why it is solvable in polynomial time.

1. The Arora-Rao-Vazirani Relaxation

Recall that the sparsest cut {\phi(G)} of a graph {G=(V,E)} with adjacency matrix {A} is defined as

\displaystyle  \phi(G) = \min_{S\subseteq V} \frac {\frac 1{2|E|} \sum_{u,v} A_{u,v} |1_S(u) - 1_S(v)|} {\frac 1{|V|^2} \sum_{u,v} |1_S(u) - 1_S(v)|}

and the Leighton-Rao relaxation is obtained by noting that if we define {d(u,v) := |1_S(u) - 1_S(v)|} then {d(\cdot,\cdot)} is a semimetric over {V}, so that the following quantity is a relaxation of {\phi(G)}:

\displaystyle  LR(G) = \min_{\begin{array}{l} d:V\times V \rightarrow R\\ d\ {\rm semimetric} \end{array}} \frac {\frac 1{2|E|} \sum_{u,v} A_{u,v} d(u,v)} {\frac 1{|V|^2} \sum_{u,v} d(u,v)}

If {G} is {d}-regular, and we call {M:= \frac 1d \cdot A} the normalized adjacency matrix of {G}, and we let {\lambda_1=1\geq \lambda_2 \geq \cdots \geq \lambda_n} be the eigenvalues of {M} with multiplicities, then we proved in a past lecture that

\displaystyle   1-\lambda_2 = \min_{x: V \rightarrow {\mathbb R}} \frac {\frac 1{2|E|} \sum_{u,v} A_{u,v} |x(u)-x(v)|^2} {\frac 1{|V|^2} \sum_{u,v} |x(u)-x(v)|^2} \ \ \ \ \ (1)

which is also a relaxation of {\phi(G)}, because, for every {S}, every {u} and every {v}, the quantity {|1_S(u) - 1_S(v)|} is either 0 or 1, and so {|1_S(u) - 1_S(v)| = |1_S(u)-1_S(v)|^2}.

We note that if we further relax (1) by allowing {V} to be mapped into a higher-dimensional space {{\mathbb R}^m} instead of {{\mathbb R}}, and we replace {| \cdot - \cdot|^2} by {||\cdot - \cdot||^2}, the optimum remains the same.

Fact 1

\displaystyle  1-\lambda_2 = \min_{m, {\bf x}: V \rightarrow {\mathbb R}^m} \frac {\frac 1{2|E|} \sum_{u,v} A_{u,v} ||{\bf x}(u)-{\bf x}(v)||^2} {\frac 1{|V|^2} \sum_{u,v} ||{\bf x}(u)-{\bf x}(v)||^2}

Proof: For a mapping {{\bf x}:V\rightarrow {\mathbb R}^m}, define

\displaystyle  \delta({\bf x}) := \frac {\frac 1{2|E|} \sum_{u,v} A_{u,v} ||{\bf x}(u)-{\bf x}(v)||^2} {\frac 1{|V|^2} \sum_{u,v} ||{\bf x}(u)-{\bf x}(v)||^2}

It is enough to show that, for every {{\bf x}}, {1-\lambda_2 \leq \delta({\bf x})}. Let {x_i(v)} be the {i}-th coordinate of {{\bf x}(v)}. Then

\displaystyle  \delta({\bf x}) = \frac {\frac 1{2|E|} \sum_i \sum_{u,v} A_{u,v} |x_i (u)-x_i (v)|^2} {\frac 1{|V|^2} \sum_i \sum_{u,v} |x_i(u)-x_i(v)|^2}

\displaystyle  \geq \min_{i} \frac {\frac 1{2|E|} \sum_{u,v} A_{u,v} |x_i (u)-x_i (v)|^2} {\frac 1{|V|^2} \sum_{u,v} |x_i(u)-x_i(v)|^2}

\displaystyle  \geq 1-\lambda_2

where the last inequality follows from (1), and the second-to-last inequality follows from the fact, which we have already used before, that for nonnegative {a_1,\ldots,a_m} and positive {b_1,\ldots,b_m} we have

\displaystyle  \frac{a_1 + \cdots + a_m}{b_1 + \cdots + b_m} \geq \min_i \frac {a_i}{b_i}

\Box
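
As a sanity check, here is a short Python sketch (an illustration, not part of the lecture; it assumes the numpy library) that computes {1-\lambda_2} for the cycle, which is 2-regular, and verifies that {\delta({\bf x}) \geq 1-\lambda_2} for a random embedding of the vertices into {{\mathbb R}^3}.

    import numpy as np

    def delta(A, x):
        """delta(x) from the proof of Fact 1; x is an n-by-m array whose
        row u is the vector x(u) in R^m."""
        n = A.shape[0]
        num_edges = A.sum() / 2.0
        # pairwise squared distances ||x(u) - x(v)||^2
        sq = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2)
        numerator = (A * sq).sum() / (2 * num_edges)
        denominator = sq.sum() / n ** 2
        return numerator / denominator

    # the n-cycle, a 2-regular graph
    n = 8
    A = np.zeros((n, n))
    for u in range(n):
        A[u, (u + 1) % n] = A[(u + 1) % n, u] = 1.0

    M = A / 2.0                                   # normalized adjacency matrix
    eigs = np.sort(np.linalg.eigvalsh(M))[::-1]   # eigs[0] = 1 >= eigs[1] >= ...
    x = np.random.randn(n, 3)                     # a random embedding of V into R^3
    assert delta(A, x) >= (1 - eigs[1]) - 1e-9    # Fact 1: delta(x) >= 1 - lambda_2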

The above observations give the following comparison between the Leighton-Rao relaxation and the spectral relaxation: both are obtained by replacing {|1_S(u) - 1_S(v)|} with a “distance function” {d(u,v)}; in the Leighton-Rao relaxation, {d(u,v)} is constrained to satisfy the triangle inequality; in the spectral relaxation, {d(u,v)} is constrained to be the square of the Euclidean distance between {{\bf x}(u)} and {{\bf x}(v)} for some mapping {{\bf x} : V \rightarrow {\mathbb R}^m}.

The Arora-Rao-Vazirani relaxation is obtained by enforcing both conditions, that is, by considering distance functions {d(u,v)} that satisfy the triangle inequality and can be realized as {||{\bf x}(u) - {\bf x}(v)||^2} for some mapping {{\bf x}: V \rightarrow {\mathbb R}^m}.

Definition 2 A semimetric {d: V\times V \rightarrow {\mathbb R}} is called of negative type if there is a dimension {m} and a mapping {{\bf x}: V \rightarrow {\mathbb R}^m} such that {d(u,v) = || {\bf x}(u)-{\bf x}(v)||^2} for every {u,v\in V}.

With the above definition, we can formulate the Arora-Rao-Vazirani relaxation as

\displaystyle   ARV(G):= \min_{\begin{array}{c} d:V\times V \rightarrow R\\ d\ {\rm semimetric\ of\ negative\ type} \end{array}} \frac {\frac 1{2|E|} \sum_{u,v} A_{u,v} d(u,v)} {\frac 1{|V|^2} \sum_{u,v} d(u,v)} \ \ \ \ \ (2)

Remark 1 The relaxation (2) was first proposed by Goemans and Linial. Arora, Rao and Vazirani were the first to prove that it achieves an approximation guarantee which is better than the approximation guarantee of the Leighton-Rao relaxation.

We have, by definition,

\displaystyle  \max \{ LR(G), 1-\lambda_2(G) \} \leq ARV(G) \leq \phi(G)

and so the approximation results that we have proved for {1-\lambda_2} and {LR} apply to {ARV}. For every graph {G=(V,E)}

\displaystyle  \phi(G) \leq O(\log |V|) \cdot ARV(G)

and for every regular graph

\displaystyle  \phi(G) \leq \sqrt{8 \cdot ARV(G)}

Interestingly, the examples that we have given of graphs for which {LR} and {1-\lambda_2} give poor approximation are complementary. If {G} is a cycle, then {1-\lambda_2} is a poor approximation of {\phi(G)}, but {LR(G)} is a good approximation of {\phi(G)}; if {G} is a constant-degree expander then {LR(G)} is a poor approximation of {\phi(G)}, but {1-\lambda_2} is a good approximation.

When Goemans and Linial (separately) proposed to study the relaxation (2), they conjectured that it would always provide a constant-factor approximation of {\phi(G)}. Unfortunately, the conjecture turned out to be false, but Arora, Rao and Vazirani were able to prove that (2) does provide a strictly better approximation than the Leighton-Rao relaxation. In the next lectures, we will present parts of the proof of the following results.

Theorem 3 There is a universal constant {c} such that, for every graph {G=(V,E)},

\displaystyle  \phi(G) \leq c\cdot \sqrt{\log |V|} \cdot ARV(G)

Theorem 4 There is an absolute constant {c} and an infinite family of graphs {G_n = (V_n,E_n)} such that

\displaystyle  \phi(G_n) \geq c \cdot \log\log |V_n| \cdot ARV(G_n)

In the rest of this lecture we discuss the polynomial time solvability of (2).

2. The Ellipsoid Algorithm and Semidefinite Programming

Definition 5 If {C\subseteq {\mathbb R}^m} is a set, then a separation oracle for {C} is a procedure that, on input {{\bf x} \in R^m},

  • If {{\bf x}\in C}, outputs “yes”
  • If {{\bf x}\not\in C}, outputs coefficients {a_1,\ldots,a_m,b} such that

    \displaystyle  \sum_i x_i a_i < b

    but, for every {{\bf z} \in C},

    \displaystyle  \sum_i z_i a_i \geq b

Note that a set can have a separation oracle only if it is convex. Under certain additional mild conditions, if {C} has a polynomial time computable separation oracle, then the optimization problem

\displaystyle  \begin{array}{lll} {\rm minimize} & {\bf c}^T {\bf x} \\ {\rm subject\ to}\\ & A{\bf x} \geq {\bf b}\\ & {\bf x} \in C \end{array}

is solvable in polynomial time using the Ellipsoid Algorithm.

It remains to see how to put the Arora-Rao-Vazirani relaxation into the above form.

Recall that a symmetric matrix {X\in {\mathbb R}^{n \times n}} is positive semidefinite if all its eigenvalues are nonnegative. We will use the set of all {n\times n} positive semidefinite matrices as our set {C} (thinking of them as {n^2}-dimensional vectors). If we think of two matrices {M,M' \in {\mathbb R}^{n \times n}} as {n^2}-dimensional vectors, then their “inner product” is

\displaystyle  M \bullet M' := \sum_{i,j} M_{i,j} \cdot M'_{i,j}

Lemma 6 The set of {n\times n} positive semidefinite matrices has a separation oracle computable in time polynomial in {n}.

Proof: Given a symmetric matrix {X}, its smallest eigenvalue is

\displaystyle  \min_{{\bf z} \in {\mathbb R}^n, \ ||{\bf z}||=1 } {\bf z}^T X {\bf z}

the vector achieving the minimum is a corresponding eigenvector, and both the smallest eigenvalue and the corresponding eigenvector can be computed in polynomial time.

If we find that the smallest eigenvalue of {X} is non-negative, then we answer “yes.” Otherwise, if {{\bf z}} is a unit eigenvector of the smallest eigenvalue, we output the matrix {A = {\bf z} {\bf z}^T}. We see that we have

\displaystyle  A\bullet X = {\bf z}^T X {\bf z} < 0

but that, for every positive semidefinite matrix {M}, we have

\displaystyle  A \bullet M = {\bf z}^T M {\bf z} \geq 0

\Box
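
To make the proof concrete, here is a short Python sketch of the separation oracle (an illustration that uses numpy’s eigensolver in place of the polynomial-time eigenvalue computation; the tolerance parameter tol is only there to cope with the finite-precision issues discussed in Remark 2 below).

    import numpy as np

    def psd_separation_oracle(X, tol=1e-9):
        """Separation oracle for the cone of symmetric PSD matrices.
        Returns ("yes", None) if X is (numerically) positive semidefinite;
        otherwise returns ("no", A) with A = z z^T, where z is a unit
        eigenvector of the smallest eigenvalue, so that A . X < 0 while
        A . M >= 0 for every PSD matrix M."""
        eigenvalues, eigenvectors = np.linalg.eigh(X)   # ascending order
        smallest, z = eigenvalues[0], eigenvectors[:, 0]
        if smallest >= -tol:
            return "yes", None
        A = np.outer(z, z)      # A . X = z^T X z = smallest eigenvalue < 0
        return "no", A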

This implies that any optimization problem of the following form can be solved in polynomial time

\displaystyle   \begin{array}{lll} {\rm minimize} & C \bullet X \\ {\rm subject\ to}\\ & A^1\bullet X \geq b_1\\ & \cdots \\ & A^m \bullet X \geq b_m\\ & X \succeq 0 \end{array} \ \ \ \ \ (3)

where {C,A^1,\ldots,A^m} are square matrices of coefficients, {b_1,\ldots,b_m} are scalars, and {X} is a square matrix of variables. An optimization problem like the one above is called a semidefinite program.

It remains to see how to cast the Arora-Rao-Vazirani relaxation as a semidefinite program.

Lemma 7 For a symmetric matrix {M\in {\mathbb R}^{n \times n}}, the following properties are equivalent:

  1. {M} is positive semidefinite;
  2. there are a dimension {d} and vectors {{\bf x}_1,\ldots,{\bf x}_n \in {\mathbb R}^d} such that, for all {i,j}, {M_{i,j} = \langle {\bf x}_i , {\bf x}_j \rangle};
  3. for every vector {{\bf z} \in {\mathbb R}^n}, {{\bf z}^T M{\bf z} \geq 0}

Proof: That (1) and (3) are equivalent follows from the characterization of the smallest eigenvalue of {M} as the minimum of {{\bf z}^T M {\bf z}} over all unit vectors {{\bf z}}.

To see that (2) {\Rightarrow} (3), suppose that vectors {{\bf x}_1,\ldots,{\bf x}_n} exist as asserted in (2), take any vector {{\bf z}}, and see that

\displaystyle  {\bf z}^T M {\bf z} = \sum_{i,j} z(i) M_{i,j} z(j)

\displaystyle  = \sum_{i,j,k} z(i) x_i(k) x_j (k) z(j) = \sum_k \left( \sum_i z(i) x_i(k) \right)^2 \geq 0

Finally, to see that (1) {\Rightarrow} (2), let {\lambda_1,\ldots,\lambda_n} be the eigenvalues of {M} with multiplicities, and let {{\bf v}_1,\ldots,{\bf v}_n} be a corresponding orthonormal set of eigenvectors. Then

\displaystyle  M = \sum_k \lambda_k {\bf v}_k {\bf v}_k^T

that is,

\displaystyle  M_{i,j} = \sum_k \lambda_k v_k(i) v_k(j) = \langle {\bf x}_i, {\bf x}_j \rangle

if we define {{\bf x}_1,\ldots,{\bf x}_n} as the vectors such that {x_i(k) := \sqrt{\lambda_k} v_k(i)}, which is possible because, by (1), every {\lambda_k} is nonnegative. \Box
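
The direction (1) {\Rightarrow} (2) is constructive, and the following Python sketch (an illustration assuming numpy; the clipping step is only a guard against floating-point roundoff) recovers such vectors from the eigendecomposition.

    import numpy as np

    def gram_vectors(M):
        """Given a symmetric PSD matrix M, return an n-by-n array whose
        row i is the vector x_i of Lemma 7, so that M[i, j] = <x_i, x_j>."""
        lam, V = np.linalg.eigh(M)      # columns of V are orthonormal eigenvectors v_k
        lam = np.clip(lam, 0.0, None)   # guard against tiny negative roundoff
        return V * np.sqrt(lam)         # x_i(k) = sqrt(lambda_k) * v_k(i)

    M = np.array([[2.0, 1.0], [1.0, 2.0]])   # a small PSD example
    X = gram_vectors(M)
    assert np.allclose(X @ X.T, M)           # M[i, j] = <x_i, x_j>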

This means that the generic semidefinite program (3) can be rewritten as an optimization problem in which the variables are the vectors {{\bf x}_1,\ldots,{\bf x}_n} as in part (2) of the above lemma.

\displaystyle   \begin{array}{lll} {\rm minimize} & \sum_{i,j} C_{i,j} \langle {\bf x}_i , {\bf x}_j \rangle \\ {\rm subject\ to}\\ & \sum_{i,j} A^1_{i,j} \langle {\bf x}_i , {\bf x}_j \rangle \geq b_1\\ & \cdots \\ & \sum_{i,j} A^m_{i,j} \langle {\bf x}_i , {\bf x}_j \rangle \geq b_m\\ & {\bf x}_i \in {\mathbb R}^d \ \ \forall i \in \{ 1,\ldots, n\} \end{array} \ \ \ \ \ (4)

where the dimension {d} is itself a variable (although one could fix it, without loss of generality, to be equal to {n}). In this view, a semidefinite program is an optimization problem in which we wish to select {n} vectors such that their pairwise inner products satisfy certain linear inequalities, while optimizing a cost function that is linear in their pairwise inner products.

The square of the Euclidean distance between two vectors is a linear function of inner products

\displaystyle  || {\bf x} - {\bf y} ||^2 = \langle {\bf x} -{\bf y}, {\bf x} - {\bf y} \rangle = \langle {\bf x},{\bf x} \rangle - 2 \langle {\bf x},{\bf y} \rangle + \langle {\bf y},{\bf y} \rangle

and so, in a semidefinite program, we can include expressions that are linear in the pairwise squared distances (or squared norms) of the vectors. The ARV relaxation can be written as follows

\displaystyle  \begin{array}{lll} {\rm minimize} & \sum_{u,v} A_{u,v} || {\bf x}_u - {\bf x}_v ||^2\\ {\rm subject\ to}\\ & \sum_{u,v} || {\bf x}_u - {\bf x}_v ||^2 = \frac {|V|^2}{2|E|}\\ & || {\bf x}_u - {\bf x}_v||^2 \leq || {\bf x}_u - {\bf x}_w||^2 + || {\bf x}_w - {\bf x}_v||^2 & \forall u,v,w \in V\\ & {\bf x}_u \in {\mathbb R}^d & \forall u\in V \end{array}

and so it is a semidefinite program, and it can be solved in polynomial time.
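
For concreteness, here is a minimal Python sketch of the above program using the cvxpy modeling library (an assumption on my part: the lecture does not prescribe any particular solver or library, and the cubic number of triangle-inequality constraints makes this practical only for small graphs). The positive semidefinite variable X plays the role of the Gram matrix of inner products {\langle {\bf x}_u, {\bf x}_v \rangle}, so that {X_{u,u} + X_{v,v} - 2X_{u,v} = ||{\bf x}_u - {\bf x}_v||^2}.

    import numpy as np
    import cvxpy as cp

    def arv_relaxation(A):
        """ARV relaxation for a graph with (symmetric 0/1) adjacency matrix A.
        Returns the optimal value and the Gram matrix of the embedding."""
        n = A.shape[0]
        num_edges = A.sum() / 2.0
        X = cp.Variable((n, n), PSD=True)                 # X[u, v] = <x_u, x_v>
        d = lambda u, v: X[u, u] + X[v, v] - 2 * X[u, v]  # = ||x_u - x_v||^2

        constraints = []
        # normalization: sum_{u,v} d(u,v) = |V|^2 / (2|E|)
        constraints.append(sum(d(u, v) for u in range(n) for v in range(n))
                           == n * n / (2 * num_edges))
        # triangle inequalities: d is a semimetric
        for u in range(n):
            for v in range(n):
                for w in range(n):
                    constraints.append(d(u, v) <= d(u, w) + d(w, v))

        objective = cp.Minimize(sum(A[u, v] * d(u, v)
                                    for u in range(n) for v in range(n)))
        problem = cp.Problem(objective, constraints)
        problem.solve()
        return problem.value, X.value

With the normalization constraint in place, the value of the objective equals the ratio in (2), so the returned value is ARV(G) up to the numerical tolerance of the solver.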

Remark 2 Our discussion of polynomial time solvability glossed over important issues about numerical precision. To run the Ellipsoid Algorithm one needs, besides the separation oracle, to be given a ball that is entirely contained in the set of feasible solutions and a ball that entirely contains the set of feasible solutions, and the running time of the algorithm is polynomial in the size of the input, polylogarithmic in the ratio of the volumes of the two balls, and polylogarithmic in the desired amount of precision. At the end, one doesn’t get an optimal solution, which might not have a finite-precision exact representation, but an approximation within the desired precision. The algorithm is able to tolerate a bounded amount of imprecision in the separation oracle, which is an important feature because we do not have exact algorithms to compute eigenvalues and eigenvectors (the entries in the eigenvector might not have a finite-precision representation).

The Ellipsoid algorithm is typically not a practical algorithm. Algorithms based on the interior point method have been adapted to semidefinite programming, and run both in worst-case polynomial time and in reasonable time in practice.

Arora and Kale have developed an {\tilde O( (|V|+|E|)^2 /\epsilon^{O(1)})} time algorithm to solve the ARV relaxation within a multiplicative error {(1+\epsilon)}. The dependency on the error is worse than that of generic algorithms, which achieve polylogarithmic dependency, but this is not a problem in this application, because we are going to lose an {O(\sqrt {\log |V|})} factor in the rounding, so an extra constant factor coming from an approximate solution of the relaxation is a low-order consideration.

4 thoughts on “CS359G Lecture 11: ARV”

  1. Michel Goemans once said to me that he didn’t view the Goemans-Linial conjecture as an actual conjecture, and he could not remember whether he believed the integrality gap of (2) to be a constant or not.

    I believe his discussion of relaxation (2) is from this paper:
    http://www-math.mit.edu/~goemans/PAPERS/semidef-survey.ps
    On page 18, he says (paraphrasing) “If you can embed negative type metrics into l1 with constant distortion, then this shows the integrality gap of (2) is a constant. This is a very intriguing question”.

    I think it’s probably more accurate to say the “Goemans-Linial Question” rather than “Goemans-Linial Conjecture”. That said, I haven’t read what Linial wrote about it.

  2. In the display just after Remark 1, shouldn’t it be max{LR(G), 1-\lambda_2} <= ARV(G) <= phi(G) (and similarly in a couple of other places in the post)?

    Really enjoying this series of blogposts, by the way. Thanks!

  3. I checked Linial’s ICM paper. Linial asked “what is the minimal \alpha = \alpha(n) such that” an n-point negative type metric can be embedded in \ell_1 with distortion \alpha, which indicates that he was not necessarily expecting the answer to be a constant.

  4. 5 years later… The paper of mine from 1997 that Nick is referring to actually says (without paraphrasing): “If one could show that negative type metrics can be embedded into $l_2$ with distortion $O(\sqrt{\log n})$ (or possibly even into $l_1$ within a constant), this would give a worst-case ratio that is $O(\sqrt{\log n})$ (resp. constant)”. Definitely a question, but not a conjecture! We should always go back to the original sources…🙂 The statement of O(\sqrt{\log n}) was motivated partly by a result of mine (see Magen and Moharrami http://www.cs.toronto.edu/~avner/papers/no-dimred.pdf ) that a negative type metric in dimension d can be embedded into $l_2$ with distortion $\sqrt{d}$.
