How Good is the Leighton-Rao Relaxation?

Continuing our seemingly never-ending series on approximations to edge expansion, we come to the question of how good is the Leighton-Rao relaxation.

Given a graph G=(V,E), we are trying to approximate the sparsest cut problem (which, in turn, approximates the edge expansion problem within a factor of two) defined as

\displaystyle \phi(G) := \min_{S\subseteq V} \frac {edges(S,V-S)} {\frac 1n |S| \cdot |V-S|}

And we have seen that an equivalent characterization of sparsest cut is:

(1) \ \ \ \displaystyle \phi(G) = \min_{x\in {\mathbb R}^V}  \frac{ \sum_{i,j} A(i,j) | x_i - x_j | } {\frac 1n \sum_{i,j} |x_i - x_j | }

In the Leighton-Rao relaxation, we have

(2) \ \ \ \displaystyle LR(G) := \min_{d\in {\cal M}(V)} \frac{ \sum_{i,j} A(i,j) d(i,j) }{\frac 1n \sum_{i,j} d(i,j)}

where {\cal M}(V) is the set of all semi-metrics d: V \times V \rightarrow {\mathbb R}, that is, of all functions such that d(i,j) = d(j,i) \geq 0, d(i,i)=0, and that obey the “triangle inequality”

d(i,j) \leq d(i,k) + d(k,j)

for all i,j,k \in V.

This is clearly a relaxation of the sparsest cut problem, because for every x \in {\mathbb R}^V the “distances” |x_i - x_j| define indeed a semi-metric over V.

How bad can the relaxation be? Take a family of regular constant-degree expander graphs, that is graphs where \phi is a constant and the degree k is a constant independent of n=|V|. Define d(i,j) to be the shortest path distance between i and j in the graph. This is clearly a semi-metric, and we have

\displaystyle \sum_{i,j} A(i,j) d(i,j) = kn = O(n)

because along every edge the distance is precisely one, and

\displaystyle \sum_{i,j} d(i,j) = \Omega (n^2 \log_k n)

because, for every vertex i, at most k^t other vertices can be within
distance t, and so at least half the vertices are at distance \Omega( \log_k n).

So we have

\displaystyle LR(G) = O \left( \frac 1 {\log n} \right)

even though \phi(G) = \Omega(1), so the error in the approximation can be as bad as a factor of \log n. We shall prove that this is as bad as it gets.

Our task is then: given a feasible solution to (2), that is, a distance function defined on the vertices of G, find a feasible solution to sparsest cut whose cost is as small as possible, and no more than a O(\log n) factor off from the cost of the solution to (2) that we started from. Instead of directly looking for a cut, we shall look for a solution to (1), which is a more flexible formulation.

The general strategy will be, given d, to find a distribution X over vectors x \in {\mathbb R}^V such that for every two vertices i,j we have

(3) \ \ \ \displaystyle \Omega\left( \frac {1}{\log n} \right) \cdot d(i,j) \leq {\mathbb E} | X_i - X_j | \leq d(i,j)

Suppose that we start from a solution to (2) of cost c, that is, we have a semimetric d(\cdot,\cdot) such that

\displaystyle \sum_{i,j} A_{i,j} d(i,j) = c \cdot \frac 1n \sum_{i,j} d(i,j)

Then our distribution X is such that

\displaystyle {\mathbb E}  \sum_{i,j} A_{i,j} |X_i - X_j| \leq \sum_{i,j} A_{i,j} d(i,j)


\displaystyle \sum_{i,j} d(i,j) \leq O \left(  {\log n} \right) {\mathbb E} \sum_{i,j}|X_i - X_j|


\displaystyle {\mathbb E}  \left[ \sum_{i,j} A_{i,j} |X_i - X_j| - O(c\log n) \cdot \frac 1n \sum_{i,j} |X_i - X_j| \right] \leq 0

and there must exist a vector x in the support of the distribution X such that

\displaystyle \sum_{i,j} A_{i,j} |x_i - x_j| - O(c\log n) \cdot \frac 1n \sum_{i,j} |x_i - x_j|  \leq 0

and now we are done because the cost of x in (1) is at most O(\log n) times the optimum of the Leighton-Rao relaxation.

Though this works, it seems an overkill to require condition (3) for all pairs of vertices. It seems sufficient to require

{\mathbb E} |X_i-X_j| \leq d(i,j)

only for edges, and to require

d(i,j) \leq O(\log n) \cdot  {\mathbb E} |X_i-X_j|

only “on average” over all pairs of vertices. It can be shown, however, that if one wants to round a generalization of the Leighton-Rao relaxation to sparsest cut with “non-uniform demands,” then any rounding algorithm can be turned into a rounding algorithm that satisfies (3).

So how do we start from an arbitrary metric and come up with a probabilistic linear order of the vertices so that their distances in the original metric are well approximated in the linear ordering? We will describe a method of Bourgain, whose applicability in this setting is due Linial, London and Rabinovich.

The starting point is to see that if A is a set of vertices, and we define

\displaystyle X_i := d(A,i) :=  \min_{a\in A} d(a,i)

then, no matter how we choose A, we always have

|X_i - X_j | = |d(A,i) - d(A,j) | \leq d(i,j)

[Hint: use the triangle inequality to see that d(A,j) \leq d(A,i) + d(i,j).]

This means that “all we have to do” is to find a distribution over sets A such that for every i,j we have

(3) \ \ \ \displaystyle {\mathbb E} | d(A,i) - d(A,j) | \geq \Omega \left( \frac 1 {\log n} \right) \cdot d(i,j)

If you want to make it sound more high-brow, you may call such a distribution a Frechet embedding of d into \ell_1. Constructing such a distribution will be the subject of the next post.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s