# CS294 Lecture 13: ARV Analysis, cont’d

In which we continue the analysis of the ARV rounding algorithm

We are continuing the analysis of the Arora-Rao-Vazirani rounding algorithm, which rounds the solution of a Semidefinite Programming relaxation of sparsest cut into an actual cut, with an approximation ratio ${O(\sqrt {\log |V|})}$.

In previous lectures, we reduced the analysis of the algorithm to the following claim.

Lemma 1 Let ${d(\cdot, \cdot)}$ be a negative-type semimetric over a set ${V= \{1,\ldots,n\}}$, let ${{\bf x}_1,\ldots,{\bf x}_n \in {\mathbb R}^m}$ be vectors such that ${d(i,j) = ||{\bf x}_i - {\bf x}_j||^2}$, let ${{\bf g} \in {\mathbb R}^m}$ be a random vector with a standard Gaussian distribution, and let ${Y_i := \langle {\bf g}, {\bf x}_i\rangle}$.

Suppose that, for constants ${c',\sigma}$ and a parameter ${\ell}$, there is a ${\geq 10\%}$ probability that there are at least ${c' n}$ disjoint pairs ${(i,j)}$ such that ${d(i,j) \leq \ell}$ and ${|Y_i - Y_j| \geq \sigma}$.

Then there is a constant ${c_2}$, that depends only on ${c'}$ and ${\sigma}$, such that

$\displaystyle \ell \geq \frac {c_2}{\sqrt {\log n}}$

1. Concentration of Measure

In the last lecture, we have already introduced two useful properties of Gaussian distributions: a Gaussian random variable has a small probability of being much smaller than its standard deviation in absolute value, and a very small probability of being much larger than its standard deviation in absolute value. Here we introduce a third property of a somewhat different flavor.

For a set ${A\subseteq {\mathbb R}^m}$ and a distance parameter ${D}$, define

$\displaystyle A_D := \{ {\bf x} \in {\mathbb R}^m : \exists {\bf a } \in A. \ || {\bf x} - {\bf a } || \leq D\}$

the set of points at distance at most ${D}$ from ${A}$. Then we have:

Theorem 2 (Gaussian concentration of measure) There is a constant ${c_3}$ such that, for every ${\epsilon, \delta >0}$ and for every set ${A\subseteq {\mathbb R}^m}$, if

$\displaystyle \mathop{\mathbb P} [ A] \geq \epsilon$

then

$\displaystyle \mathop{\mathbb P}[ A_D] \geq 1 - \delta$

for every ${D\geq c_3 \cdot \sqrt{\log \frac 1 {\epsilon \delta}}}$, where the probabilities are taken according to the Gaussian measure in ${{\mathbb R}^m}$, that is ${\mathop{\mathbb P} [A] = \mathop{\mathbb P} [ {\bf g} \in A]}$, where ${{\bf g} = (g_1,\ldots,g_m)}$ and the ${g_i}$ are independent Gaussians of mean 0 and variance 1.
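For intuition about the ${\sqrt{\log \frac 1 {\epsilon \delta}}}$ dependence, one can work out the one-dimensional case in which ${A}$ is a halfspace (halfspaces are in fact the extremal sets for Gaussian concentration). The following sketch, using only the standard-library Gaussian CDF, compares the distance ${D}$ actually needed for a halfspace of measure ${\epsilon}$ to expand to measure ${1-\delta}$ against ${\sqrt{\log \frac 1 {\epsilon\delta}}}$; their ratio is a small constant, consistent with the theorem.

```python
import math
from statistics import NormalDist

nd = NormalDist()  # one-dimensional standard Gaussian

# Take A to be the halfspace { g : g_1 >= t } of Gaussian measure eps.
# Moving distance D from a point of A can decrease the first coordinate
# by at most D, so A_D is exactly the halfspace { g : g_1 >= t - D }.
eps, delta = 0.01, 0.01
t = nd.inv_cdf(1.0 - eps)            # P[ g_1 >= t ] = eps

# Smallest D with P[A_D] >= 1 - delta, versus the bound of Theorem 2.
D_needed = t - nd.inv_cdf(delta)
bound = math.sqrt(math.log(1.0 / (eps * delta)))

print(D_needed / bound)  # a small constant, about 1.5
```

(With ${\epsilon = \delta = 1\%}$, the exact distance is about ${4.65}$ while ${\sqrt{\log 10^4} \approx 3.03}$, so a constant ${c_3 \approx 1.5}$ suffices in this case.)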

The above theorem says that if we have some property that is true with ${\geq 1\%}$ probability for a random Gaussian vector ${{\bf g}}$, then there is a ${\geq 99\%}$ probability that ${{\bf g}}$ is within distance ${O(1)}$ of a vector ${{\bf g}'}$ that satisfies the required property. In high dimension ${m}$, this is a non-trivial statement because, with very high probability, ${|| {\bf g} ||}$ is about ${\sqrt m}$, and so the distance between ${{\bf g}}$ and ${{\bf g}'}$ is small relative to the length of the vector.
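This last point can be checked with a quick simulation (a sketch, with an arbitrarily chosen dimension and seed):

```python
import math
import random

random.seed(0)
m = 10_000  # a high dimension, chosen arbitrarily

# Sample a standard Gaussian vector g in R^m.
g = [random.gauss(0.0, 1.0) for _ in range(m)]
norm_g = math.sqrt(sum(x * x for x in g))

# ||g|| concentrates tightly around sqrt(m) = 100, so a displacement of
# length O(1) moves g by only about 1% of its length.
print(norm_g)  # close to 100
```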

We will use the following corollary.

Corollary 3 Let ${{\bf x}_1,\ldots, {\bf x}_n}$ be vectors in ${{\mathbb R}^m}$ and let ${d_{\rm max} = \max_{j=2,\ldots,n} ||{\bf x}_j - {\bf x}_1||^2}$. Let ${{\bf g}}$ be a random Gaussian vector in ${{\mathbb R}^m}$, and let ${Y_i = \langle {\bf x}_i, {\bf g} \rangle}$. If, for some ${k}$ and ${\epsilon}$, we have

$\displaystyle \mathop{\mathbb P} [ \exists j. \ Y_j - Y_1 \geq k ] \geq \epsilon$

then, for every ${\gamma > 0}$,

$\displaystyle \mathop{\mathbb P} [ \exists j. \ Y_j - Y_1 \geq k - c_3 \sqrt{\log 1/(\epsilon \gamma) } \cdot \sqrt{d_{\rm max}} ] \geq 1-\gamma$

Proof: Let

$\displaystyle A:= \{ {\bf g} : \exists j. \ Y_j - Y_1 \geq k \}$

By assumption, we have ${\mathop{\mathbb P} [ A ] \geq \epsilon}$, and so, by concentration of measure:

$\displaystyle \mathop{\mathbb P} [ \exists {\bf g}' . \ || {\bf g} - {\bf g}'|| \leq c_3 \sqrt{\log 1/(\epsilon\gamma)} \ \wedge \ {\bf g}' \in A ] \geq 1-\gamma$

The event in the above probability can be rewritten as

$\displaystyle \exists {\bf g}' \in {\mathbb R}^m \ \exists j \in \{ 2,\ldots, n \} . \ || {\bf g} - {\bf g}'|| \leq c_3 \sqrt{ \log \frac 1 {\epsilon \gamma} } \ \wedge \langle {\bf x}_j - {\bf x}_1 , {\bf g}' \rangle \geq k$

and, whenever this event holds, the witnessing ${j}$ and ${{\bf g}'}$ give us

$\displaystyle k \leq \langle {\bf x}_j - {\bf x}_1 , {\bf g}' \rangle$

$\displaystyle = \langle {\bf x}_j - {\bf x}_1 , {\bf g} \rangle + \langle {\bf x}_j - {\bf x}_1 , {\bf g}' - {\bf g} \rangle$

$\displaystyle \leq \langle {\bf x}_j - {\bf x}_1 , {\bf g} \rangle + || {\bf x}_j - {\bf x}_1|| \cdot || {\bf g}' - {\bf g} ||$

$\displaystyle \leq Y_j - Y_1 + \sqrt{d_{\rm max} } \cdot c_3 \sqrt{ \log \frac 1 {\epsilon \gamma} }$

from which we get ${Y_j - Y_1 \geq k - c_3 \sqrt{\log \frac 1 {\epsilon\gamma}} \cdot \sqrt{d_{\rm max}}}$ with probability at least ${1-\gamma}$, as required. $\Box$

The use of the above statement is by far the most innovative part of the analysis of Arora, Rao and Vazirani, so it is worth developing an intuitive feeling for its meaning.

Let’s say that we are interested in the distribution of ${p_{\rm max} := \max_{j=2,\ldots,n} Y_j - Y_1}$. We know that the random variables ${Y_j -Y_1}$ are Gaussians of mean 0 and standard deviation at most ${\sqrt{d_{\rm max}}}$, but it is impossible to say anything about, say, the average value or the median value of ${p_{\rm max}}$ without knowing something about the correlation of the random variables ${Y_j - Y_1}$.

Interestingly, the above Corollary says something about the concentration of ${p_{\rm max}}$ without any additional information. The corollary says that, for example, the first percentile of ${p_{\rm max}}$ and the 99-th percentile of ${p_{\rm max}}$ differ by at most ${O(\sqrt {d_{\rm max} } )}$, and that we have a concentration result of the form

$\displaystyle \mathop{\mathbb P} [ | p_{\rm max} - {\rm median} (p_{\rm max} ) | > t \cdot \sqrt{d_{\rm max}} ] \leq e^{-\Omega(t^2)}$

which is a highly non-trivial statement for any configuration of ${{\bf x}_i}$ for which ${p_{\rm max} \gg \sqrt {d_{\rm max}}}$.
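As a sanity check of the easy independent case (a sketch with arbitrary parameters): take ${{\bf x}_1 = {\bf 0}}$ and the other ${{\bf x}_j}$ to be standard basis vectors, so that ${d_{\rm max} = 1}$ and the ${Y_j - Y_1}$ are independent standard Gaussians. Then the median of ${p_{\rm max}}$ grows like ${\sqrt{2 \log n}}$, while the spread between its 1st and 99th percentiles stays ${O(1)}$; the content of the corollary is that the same ${O(\sqrt{d_{\rm max}})}$ bound on the spread holds for arbitrarily correlated configurations.

```python
import random

random.seed(1)
n, trials = 1000, 2000

# x_1 = 0 and x_j = e_j for j = 2,...,n, so Y_j - Y_1 = g_j: independent
# standard Gaussians, with d_max = max_j ||x_j - x_1||^2 = 1.
samples = sorted(max(random.gauss(0.0, 1.0) for _ in range(n - 1))
                 for _ in range(trials))

median = samples[trials // 2]
spread = samples[int(0.99 * trials)] - samples[int(0.01 * trials)]

# The median is around sqrt(2 log n), about 3.2 here, while the
# 1%-99% spread is an O(1) quantity, as the concentration bound predicts.
print(median, spread)
```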

2. Reworking the Assumption

Lemma 4 Under the assumptions of Lemma 1, there is a fixed set ${C\subseteq [n]}$, ${|C| \geq \frac {c'}{10} n}$, and a set ${M_{\bf g}}$ of disjoint pairs ${\{ i,j \}}$, dependent on ${{\bf g}}$, such that, for every ${{\bf g}}$ and for every pair ${\{ i,j \} \in M_{\bf g}}$ we have

$\displaystyle d(i,j) \leq \ell$

and

$\displaystyle | Y_i - Y_j | \geq \sigma$

and for all ${i \in C}$ we have

$\displaystyle \mathop{\mathbb P}[ \exists j\in C. \ \{ i,j \} \in M_{\bf g} ] \geq \frac {c'}{20}$

Proof: Let ${M_{\bf g}}$ be the set of disjoint pairs promised by the assumptions of Lemma 1. Construct a weighted graph ${G=([n], W)}$, where the weight of the edge ${\{ i,j\}}$ is ${\mathop{\mathbb P} [ \{ i,j \} \in M_{\bf g}]}$. Since the pairs are disjoint, the weighted degree of every vertex is at most 1, and the sum of the degrees is twice the expectation of ${|M_{\bf g}|}$; by the assumptions of Lemma 1, ${|M_{\bf g}| \geq c'n}$ with probability at least ${1/10}$, so the sum of the degrees is at least ${c' n /5}$.

Now, repeatedly delete from ${G}$ all vertices of degree less than ${c'/20}$, together with their incident edges, until no such vertex remains. Each deletion removes at most ${c'/10}$ from the sum of the degrees (twice the degree of the deleted vertex), so over at most ${n}$ deletions the sum of the degrees decreases by at most ${c'n/10}$; the residual graph thus has total degree at least ${c'n/10}$ and, since every degree is at most 1, at least ${c'n/10}$ vertices. Let ${C}$ be the set of surviving vertices: every ${i\in C}$ has degree at least ${c'/20}$ in the residual graph and, since the pairs of ${M_{\bf g}}$ are disjoint, this degree equals ${\mathop{\mathbb P}[ \exists j\in C. \ \{ i,j \} \in M_{\bf g} ]}$. $\Box$
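The deletion procedure in the proof can be sketched as follows (a toy illustration: the weights stand for the probabilities ${\mathop{\mathbb P}[\{i,j\} \in M_{\bf g}]}$, the threshold for ${c'/20}$, and the example graph and its numbers are made up):

```python
def dense_core(n, weights, threshold):
    """Repeatedly delete vertices of weighted degree below `threshold`
    (counting only edges between surviving vertices), as in the proof.

    weights: dict mapping frozenset({i, j}) -> edge weight.
    Returns the set of surviving vertices, each of weighted degree
    at least `threshold` within the returned set."""
    alive = set(range(n))

    def degree(v):
        return sum(w for e, w in weights.items() if v in e and e <= alive)

    changed = True
    while changed:
        changed = False
        for v in list(alive):
            if degree(v) < threshold:
                alive.discard(v)  # its incident edges stop counting too
                changed = True
    return alive

# A 6-vertex example: vertices 4 and 5 are deleted (degrees 0.05 and 0),
# after which every survivor has degree at least the threshold 0.1.
w = {frozenset(e): p for e, p in
     [((0, 1), 0.5), ((1, 2), 0.5), ((2, 3), 0.5), ((3, 4), 0.05)]}
print(dense_core(6, w, 0.1))  # {0, 1, 2, 3}
```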

By the above Lemma, the following result implies Lemma 1 and hence the ARV Main Lemma.

Lemma 5 Let ${C}$ be a set of points, let ${\{ {\bf x}_v \}_{v \in C}}$ be a collection of vectors in ${{\mathbb R}^m}$ such that ${d(u,v) := ||{\bf x}_u - {\bf x}_v ||^2}$ is a semimetric over ${C}$ with ${d(u,v) \leq 1}$ for all ${u,v \in C}$, let ${{\bf g}}$ be a random Gaussian vector in ${{\mathbb R}^m}$, define ${Y_v := \langle {\bf g}, {\bf x}_v \rangle}$, and suppose that, for every ${{\bf g}}$, we can define a set ${M_{\bf g}}$ of disjoint pairs such that

$\displaystyle \forall \{ u,v \} \in M_{\bf g}. \ | Y_u - Y_v | \geq \sigma \ \wedge \ d(u,v) \leq \ell$

and

$\displaystyle \forall u\in C. \ \mathop{\mathbb P} [ \exists v. \{ u,v\} \in M_{\bf g} ] \geq \epsilon$

Then

$\displaystyle \ell \geq \Omega_{\epsilon, \sigma} \left( \frac 1 {\sqrt{\log |C|}} \right)$