# CS294 Lecture 7: Higher order Cheeger inequality, cont’d

In which we state an analog of Cheeger’s inequalities for the ${k}$-th smallest Laplacian eigenvalue, and we discuss the connection between this result and the analysis of spectral partitioning algorithms

1. Review

Let ${G=(V,E)}$ be a ${d}$-regular undirected graph, ${L}$ its normalized Laplacian, ${0 = \lambda_1 \leq \cdots \leq \lambda_n \leq 2}$ be the Laplacian eigenvalues, and ${\phi_k(G)}$ be the order-${k}$ expansion of ${G}$.

We want to prove

$\displaystyle \phi_k (G) \leq O(k^{3.5}) \cdot \sqrt{\lambda_k}$

We will prove the somewhat stronger result that, given ${k}$ orthonormal vectors ${{\bf x}^{(1)}, \ldots, {\bf x}^{(k)}}$, we can find ${k}$ disjointly supported vectors ${{\bf y}^{(1)}, \ldots, {\bf y}^{(k)}}$ such that, for every ${i=1,\ldots, k}$,

$\displaystyle R_L( {\bf y}^{(i)} ) \leq O(k^7) \cdot \max_{j=1,\ldots,k} \ \ R_L({\bf x}^{(i)} )$

In order to do that, we define the mapping

$\displaystyle F(v) := ( x^{(1)}_v , \ldots, x^{(k)}_v ) \ \ \ \ \ (1)$

of vertices to ${{\mathbb R}^k}$ and the normalized distance

$\displaystyle dist(u,v) := \left \| \ \frac {F(u)} { ||F(u) ||} - \frac {F(v)} { ||F(v) ||} \right \| \ \ \ \ \ (2)$

between vertices, and we are going to prove the following two lemmas.

Lemma 1 (Localization) Given ${t}$ sets ${A_1,\ldots,A_t}$ such that, for every ${i=1,\ldots, t}$, ${\sum_{v \in A_i} || F(v)||^2 \geq \frac 12}$ and,for every ${u,v}$ in different sets ${dist(u,v) \geq \delta}$, we can construct ${t}$ disjointly supported vectors ${{\bf y}^{(1)},\ldots, {\bf y}^{(t)}}$ such that for every ${i=1,\ldots, t}$, we have

$\displaystyle R_L( {\bf y}^{(t)} ) \leq O(k \cdot \delta^{-2}) \cdot R_L(F)$

Lemma 2 (Well-Separated Sets) There are ${k}$ disjoint subsets of vertices ${A_1,\ldots, A_k}$ such that

• For every ${i=1,\ldots,k}$, ${\sum_{v\in A_i} ||F(v)||^2 \geq \frac 12}$
• For every ${u}$ and ${v}$ belonging to different sets, ${dist(u,v) \geq \Omega(k^{-3})}$

Recall that, for a function ${f: V \rightarrow {\mathbb R}^k}$ the Rayleigh quotient of ${f}$ is defined as

$\displaystyle R_L(f) := \frac { \sum_{ \{ u,v \}} || f(u) - f(v) ||^2 } {d \sum_v || f(v)||^2 }$

and, by definition of ${F}$, we have

$\displaystyle R_L(F) = \frac 1 k \sum_i R_L( {\bf x}^{(i)} )$

2. Some Preliminaries

We will prove some simple properties of the embedding ${F(\cdot)}$ and of the distance function ${dist(\cdot,\cdot)}$.

First, we observe that

$\displaystyle \sum_{v\in V} ||F(v)||^2 = \sum_v \sum_i (x^{(i)}_v)^2 = \sum_i || {\bf x}^{(i)} ||^2 = k$

Next, we prove the sense in which ${F(\cdot)}$ “spreads out” vertices across ${{\mathbb R}^k}$.

Lemma 3 For every unit vector ${{\bf w}\in {\mathbb R}^k}$,

$\displaystyle \sum_{v\in V} \langle F(v), {\bf w} \rangle ^2 = 1$

Proof: Consider the ${k\times n }$ matrix ${X}$ whose rows are the vectors ${{\bf x}^{(i)}}$ and whose columns are the points ${F(v)}$. Then we have

$\displaystyle \sum_{v\in V} \langle F(v), {\bf w} \rangle ^2 = || X^T {\bf w} ||^2 = {\bf w}^T X X^T {\bf w} = {\bf w}^T {\bf w} = 1$

where we used the fact that the rows of ${X}$ are orthogonal and so ${X X^T}$ is the identity. $\Box$

This means that, for every direction, the points ${F(v)}$ correlate with that direction in the same way, regardless of the direction itself.

In the proof of the localization lemma, we will make use of the following inequality: for every two vectors ${{\bf x}, {\bf y}}$,

$\displaystyle | \ || {\bf x}|| - || {\bf y}|| \ | \leq || {\bf x} - {\bf y} ||$

which is a consequence of Cauchy-Schwarz:

$\displaystyle ( ||{\bf x}|| - || {\bf y}||)^2 = ||{\bf x}||^2 + || {\bf y}||^2 - 2 || {\bf x}|| \cdot ||{\bf y} ||$

$\displaystyle \leq ||{\bf x}||^2 + || {\bf y}||^2 - 2 \langle {\bf x}, {\bf y} \rangle$

$\displaystyle = || {\bf x} - {\bf y} ||^2$

3. Localization

In this section we prove Lemma 1.

3.1. Proof Ideas

The basic idea is that we would like to define the vectors ${{\bf y}^{(i)}}$ as

$\displaystyle y^{(i)} _ v := {\bf 1}_{A_i}(v) \cdot || F(v)||$

The denominator of the Rayleigh quotient of such a vector is, by definition, at least ${1/2}$, and we might hope to upper bound the numerator of the Rayleigh quotient of ${{\bf y}^{(i)}}$ in terms of the numerator of the Rayleigh quotient of ${F}$, which is ${k R_L(F)}$.

Indeed, every edge ${\{ u,v \}}$ with both endpoints outside of ${A_i}$ contributes zero to the numerator of the Rayleigh quotient of ${{\bf y}^{(i)}}$, and every edge ${\{ u,v \}}$ with both endpoints in ${A_i}$ contributes

$\displaystyle ( \ ||F(u) || - || F(v) || \ )^2 \leq || F(u) - F(v) ||^2$

to the numerator of the Rayleigh quotient of ${{\bf y}^{(i)}}$, and the right-hand-side above is the contribution of the edge to the numerator of the Rayleigh quotient of ${F}$.

So far, so good, but the problem comes from edges ${\{ u,v \}}$ with one endpoint ${u \in A_i}$ and one endpoint ${v \not \in A_i}$. Such an edge contributes ${||F(u)||^2}$ to the Rayleigh quotient of ${{\bf y}^{(i)}}$ and ${||F(u)- F(v)||^2}$ to the Rayleigh quotient of ${F}$, and the former quantity could be much larger than the latter. If ${dist(u,v)}$ is large, however, ${||F(u)||^2}$ cannot be much larger than ${||F(u) - F(v)||^2}$, because of the following fact

Lemma 4

$\displaystyle ||F(v) || \cdot dist ( u,v) \leq 2 || F(u) - F(v) || \ \ \ \ \ (3)$

Proof:

$\displaystyle ||F(v) || \cdot dist ( u,v) = || F(v)|| \cdot \left\| \ \frac{F(u)}{||F(u) ||} - \frac{F(v) } {||F(v)||} \ \right\|$

$\displaystyle = \left\| F(u) \cdot \frac {||F(v)||}{||F(u)||} - F(v) \right\|$

$\displaystyle \leq \left\| F(u) \cdot \frac {||F(v)||}{||F(u)||} - F(u) \right\| + \| F(u) - F(v) \|$

$\displaystyle = \left \| F(u) \cdot \left( \frac {||F(v)||}{||F(u)||} - 1 \right) \right \| + \| F(u) - F(v) \|$

$\displaystyle = ||F(u)|| \cdot \left| \frac {||F(v)||}{||F(u)||} - 1 \right| + \| F(u) - F(v) \|$

$\displaystyle = | \ ||F(v) || - ||F(u)|| \ | + \| F(u) - F(v) \|$

$\displaystyle \leq 2 \| F(u) - F(v) \|$

$\Box$

We can conclude that the only problem comes from edges ${\{ u,v \}}$ such that ${u\in A_i}$, ${v\not \in A_i}$, and ${dist(u,v)}$ is small. To deal with such edges, we will modify the definition of ${{\bf y}^{(i)}}$, and use a “smoothed” version of the indicator function of ${A_i}$ instead of the actual indicator.

3.2. Proof

If ${v}$ is a vertex and ${A}$ is a set of vertices, we define

$\displaystyle dist(v,A) = \min_{u \in A} dist(v,u)$

For each ${i}$, we define the following smooth indicator function of ${A_i}$:

$\displaystyle \tau_i (v) = \left\{ \begin{array}{ll} 1 & \mbox{if } v \in A_i\\ 0 & \mbox{if } dist(v,A_i) \geq \frac \delta 2 \\ 1 - \frac 2 \delta \cdot dist(v, A_i) & \mbox{otherwise} \end{array} \right.$

Notice that the functions ${\tau_i(\cdot)}$ are disjointly supported: there cannot be a vertex ${v}$ such that ${\tau_i(v) >0}$ and ${\tau_j (v) > 0}$ for ${i\neq j}$, otherwise we would have ${dist(v, A_i) < \frac \delta 2}$ and ${dist(v, A_j) < \frac \delta 2}$, contradicting the well-separated condition on the sets ${A_i}$.

We define the vectors ${{\bf y}^{(i)}}$ as

$\displaystyle y^{(i)} _v = \tau_i (v) \cdot || F(v) ||$

The vectors ${{\bf y}^{(i)}}$ are disjointly supported, and it remains to understand their Rayleigh quotient.

The denominator of the Rayleigh quotient of ${{\bf y}^{(i)}}$ is

$\displaystyle \sum_{v\in V} \tau_i ^2 (v) \cdot || F(v) ||^2 \geq \sum_{v \in A_i} ||F(v)||^2 \geq \frac 12$

The contribution of an edge ${\{ u,v \}}$ to the numerator is the square of

$\displaystyle | y^{(i)}_v - y^{(i)}_u | = | \ \tau_i (v) \cdot || F(v) || - \tau_i (u) \cdot || F(u) || \ |$

$\displaystyle \leq | \ \tau_i (v) \cdot || F(v) || - \tau_i (v) \cdot || F(u) || \ | + | \ \tau_i (v) \cdot || F(u) || - \tau_i (u) \cdot || F(u) || \ |$

$\displaystyle = \tau_i (v) \cdot || F(v) - F(u) || + ||F(u)|| \cdot | \tau_i (v) - \tau_i (u) |$

$\displaystyle \leq || F(v) - F(u) || + ||F(u)|| \cdot \frac 2 \delta \cdot dist(v,u)$

$\displaystyle \leq || F(v) - F(u) || \cdot \left( 1 + \frac 4 \delta \right)$

where we used the inequality

$\displaystyle | \tau_i (v) - \tau_i (u) | \leq \frac 2 \delta \left| dist (v, A_i) - dist ( u, A_i) \right| \leq \frac 2 \delta dist(v,u)$

The numerator of the Rayleigh quotient of ${{\bf y}^{(i)}}$ is thus

$\displaystyle \sum_{ \{ u,v \} \in E} | y^{(i)}_v - y^{(i)}_u |^2 \leq O(\delta^{-2}) \sum_{ \{ u,v \}\in E} || F(v) - F(u)||^2 = O(\delta^{-2}) \cdot k R_L(F)$

and this proves Lemma 1.

4. Well-Separated Sets

In this section we prove Lemma 2, which follows easily from the following result.

Lemma 5 We can find disjoint sets of vertices ${T_1,\ldots, T_m}$ such that

• ${\sum_{i=1}^m \sum_{v\in T_i} || F(v)||^2 \geq k - \frac 14}$
• For every ${u}$, ${v}$ in different sets, ${dist(u,v) \geq \Omega(k^{-3}) }$
• For every set ${T_i}$, ${\sum_{v \in T_i} ||F(v)||^2 \leq 1 + \frac 1 {4k} }$

We can derive Lemma 2 from Lemma 5 as follows. Let us call the quantity ${\sum_{v\in A} ||F(v)||^2}$ the mass of a set ${A}$. Starting from the sets ${T_1,\ldots, T_m}$ promised by Lemma 5 we run the following process: as long as there are two sets both of mass ${< \frac 12}$ we merge them. Call ${A_1,\ldots, A_t}$ the sets of mass ${\geq \frac 12}$ obtained at the end of this process; in addition, there may be one more set of mass ${< \frac 12}$. Every set has mass ${\leq 1 + \frac 1 {4k}}$. This means that the total mass of the sets is at most ${\frac 12 + t \cdot \left( 1 + \frac 1 {4k} \right) \geq k - \frac 14}$, which implies ${t\geq k}$. Thus we have found ${k}$ sets of vertices, each of mass at least ${1/2}$, and such that any two sets have separation ${\Omega(k^{-3})}$.

Now we turn to the proof of Lemma 5. We are going to use the fact that, for every small cone in ${{\mathbb R}^k}$, the mass of vertices ${v}$ such that ${F(v)}$ is in the cone is also small and, in particular, it can made at most ${ 1 + \frac 1 {4k}}$. We will prove the Lemma by covering almost all the points ${F(v)}$ using a collection of well-separated small cones.

We first formalize the above intuition about cones. If ${R}$ (for region) is a subset of the unit sphere in ${{\mathbb R}^n}$, then the diameter of ${R}$ is

$\displaystyle diam(R) := \sup_{{\bf w}, {\bf z} \in R} || {\bf w} - {\bf z} ||$

and the cone generated by ${R}$ is the set ${\{ \alpha {\bf w} : \alpha \in {\mathbb R}_{\geq 0}, {\bf w} \in R \}}$ of non-negative scalar multiples of elements of ${R}$. The set of vertices covered by ${R}$, denoted ${V(R)}$ is the set of vertices ${v}$ such that ${F(v)}$ is in the cone generated by ${R}$ or, equivalently,

$\displaystyle V(R) := \left \{ v \in V : \frac {F(v)}{||F(v)||} \in R \right \}$

If ${R}$ has small diameter, then ${V(R)}$ has small mass.

Lemma 6 For every subset ${R}$ of the unit sphere,

$\displaystyle \sum_{v\in V(R)} ||F(v)||^2 \leq \left( 1 - \frac 12( diam(R))^2 \right)^{-2 }$

Proof: For every two unit vectors ${{\bf w}}$ and ${{\bf z}}$, we have

$\displaystyle \langle {\bf z} , {\bf w} \rangle = 1 - \frac 12 ||{\bf w} - {\bf z}||^2$

For every vertex ${v}$, call

$\displaystyle \bar F(v):= \frac { F(v)}{||F(v)||}$

Let ${{\bf w}}$ be a vector in ${R}$. Then we have

$\displaystyle 1 \geq \sum_{v\in V(R)} \langle F(v) , {\bf w} \rangle ^2$

$\displaystyle = \sum_{v\in V(R)} ||F(v)||^2 \cdot \langle \bar F(v) , {\bf w} \rangle ^2$

$\displaystyle = \sum_{v\in V(R)} ||F(v)||^2 \cdot \left( 1 - \frac 12 || \bar F(v) - {\bf w} || ^2 \right)^2$

$\displaystyle \geq \sum_{v\in V(R)} ||F(v)||^2 \cdot \left( 1 - \frac 12(diam(R)) ^2 \right)^2$

$\Box$

In particular, if ${diam(R) \leq \frac 1 {\sqrt 5 k}}$, then the mass of ${V(R)}$ is at most

$\displaystyle \left( 1 - \frac 1{10 k} \right)^{-2} \leq \left( 1 - \frac 1 {5k} \right)^{-1} = 1 + \frac 1 {5k-1} \leq 1 + \frac 1 {4k}$

Another observation is that, for every two subsets ${R_1,R_2}$ of the unit sphere,

$\displaystyle \min_{u\in V(R_1), v\in V(R_2)} dist(u,v) \geq \min_{{\bf w} \in R_1, {\bf z} \in R_2} || {\bf w} - {\bf z} ||$

Our approach will be to find disjoint subsets ${R_1,\ldots, R_m}$ of the unit sphere, each of diameter at most ${1/2\sqrt k}$, such that the total mass of the sets ${V(R_1), \ldots, V(R_m)}$ is at least ${k - \frac 14}$ and such that the separation between any two ${R_i,R_j}$ is at least ${\Omega(k^{-3})}$.

To do this, we tile ${{\mathbb R}^k}$ with axis-parallel cubes of side ${L = \frac 1{2k}}$, which clearly have diameter at most ${\frac 1 {2\sqrt k}}$, and, for every cube ${C}$, we define its core ${\tilde C}$ to be a cube with the same center as ${C}$ and of side ${L \cdot \left( 1 - \frac 1{4k^2} \right)}$. Note two points in the core of two different cubes have distance at least ${\frac 1 {8k^3}}$. Let now ${R_1, R_2, \ldots}$ be the intersections of the cube cores with the unit sphere. Since each ${R_i}$ is a subset of a core of a cube, it has diameter at most ${\frac 1 {2\sqrt k}}$, and the distance between any two points in different regions is at least ${\frac 1 {8k^3}}$. We claim that there is a way to choose the location of the centers of the cubes so that ${\sum_i \sum_{v\in V(R_i)} ||F(v)||^2 \geq k - \frac 14}$.

Let us start by a fixed configuration of the cubes and then apply an axis-parallel random shift (by adding to each coordinate, a random real in the range ${[0,L]}$. Then, for each fixed point in ${{\mathbb R}^n}$ and, in particular, for each point ${\bar F(v)}$, the probability that it falls in the core of a cube after the shift is at least ${1 - \frac 1{4k}}$, so the average mass of the vertices covered by the regions is at least ${k - \frac 14}$, and there must exist a shift that is at least as good.