CS294 Lecture 7: Higher order Cheeger inequality, cont’d

In which we state an analog of Cheeger’s inequalities for the {k}-th smallest Laplacian eigenvalue, and we discuss the connection between this result and the analysis of spectral partitioning algorithms

1. Review

Let {G=(V,E)} be a {d}-regular undirected graph, {L} its normalized Laplacian, {0 = \lambda_1 \leq \cdots \leq \lambda_n \leq 2} be the Laplacian eigenvalues, and {\phi_k(G)} be the order-{k} expansion of {G}.

We want to prove

\displaystyle  \phi_k (G) \leq O(k^{3.5}) \cdot \sqrt{\lambda_k}

We will prove the somewhat stronger result that, given {k} orthonormal vectors {{\bf x}^{(1)}, \ldots, {\bf x}^{(k)}}, we can find {k} disjointly supported vectors {{\bf y}^{(1)}, \ldots, {\bf y}^{(k)}} such that, for every {i=1,\ldots, k},

\displaystyle  R_L( {\bf y}^{(i)} ) \leq O(k^7) \cdot \max_{j=1,\ldots,k} \ \ R_L({\bf x}^{(i)} )

In order to do that, we define the mapping

\displaystyle  F(v) := ( x^{(1)}_v , \ldots, x^{(k)}_v )  \ \ \ \ \ (1)

of vertices to {{\mathbb R}^k} and the normalized distance

\displaystyle   dist(u,v) := \left \| \ \frac {F(u)} { ||F(u) ||} - \frac {F(v)} { ||F(v) ||} \right \| \ \ \ \ \ (2)

between vertices, and we are going to prove the following two lemmas.

Lemma 1 (Localization) Given {t} sets {A_1,\ldots,A_t} such that, for every {i=1,\ldots, t}, {\sum_{v \in A_i} || F(v)||^2 \geq \frac 12} and,for every {u,v} in different sets {dist(u,v) \geq \delta}, we can construct {t} disjointly supported vectors {{\bf y}^{(1)},\ldots, {\bf y}^{(t)}} such that for every {i=1,\ldots, t}, we have

\displaystyle  R_L( {\bf y}^{(t)} ) \leq O(k \cdot \delta^{-2}) \cdot R_L(F)

Lemma 2 (Well-Separated Sets) There are {k} disjoint subsets of vertices {A_1,\ldots, A_k} such that

  • For every {i=1,\ldots,k}, {\sum_{v\in A_i} ||F(v)||^2 \geq \frac 12}
  • For every {u} and {v} belonging to different sets, {dist(u,v) \geq \Omega(k^{-3})}

Recall that, for a function {f: V \rightarrow {\mathbb R}^k} the Rayleigh quotient of {f} is defined as

\displaystyle  R_L(f) := \frac { \sum_{ \{ u,v \}} || f(u) - f(v) ||^2 } {d \sum_v || f(v)||^2 }

and, by definition of {F}, we have

\displaystyle  R_L(F) = \frac 1 k \sum_i R_L( {\bf x}^{(i)} )

2. Some Preliminaries

We will prove some simple properties of the embedding {F(\cdot)} and of the distance function {dist(\cdot,\cdot)}.

First, we observe that

\displaystyle  \sum_{v\in V} ||F(v)||^2 = \sum_v \sum_i (x^{(i)}_v)^2 = \sum_i || {\bf x}^{(i)} ||^2 = k

Next, we prove the sense in which {F(\cdot)} “spreads out” vertices across {{\mathbb R}^k}.

Lemma 3 For every unit vector {{\bf w}\in {\mathbb R}^k},

\displaystyle  \sum_{v\in V} \langle F(v), {\bf w} \rangle ^2 = 1

Proof: Consider the {k\times n } matrix {X} whose rows are the vectors {{\bf x}^{(i)}} and whose columns are the points {F(v)}. Then we have

\displaystyle  \sum_{v\in V} \langle F(v), {\bf w} \rangle ^2 = || X^T {\bf w} ||^2 = {\bf w}^T X X^T {\bf w} = {\bf w}^T {\bf w} = 1

where we used the fact that the rows of {X} are orthogonal and so {X X^T} is the identity. \Box

This means that, for every direction, the points {F(v)} correlate with that direction in the same way, regardless of the direction itself.

In the proof of the localization lemma, we will make use of the following inequality: for every two vectors {{\bf x}, {\bf y}},

\displaystyle  | \ || {\bf x}|| - || {\bf y}|| \ | \leq || {\bf x} - {\bf y} ||

which is a consequence of Cauchy-Schwarz:

\displaystyle  ( ||{\bf x}|| - || {\bf y}||)^2 = ||{\bf x}||^2 + || {\bf y}||^2 - 2 || {\bf x}|| \cdot ||{\bf y} ||

\displaystyle  \leq ||{\bf x}||^2 + || {\bf y}||^2 - 2 \langle {\bf x}, {\bf y} \rangle

\displaystyle  = || {\bf x} - {\bf y} ||^2

3. Localization

In this section we prove Lemma 1.

3.1. Proof Ideas

The basic idea is that we would like to define the vectors {{\bf y}^{(i)}} as

\displaystyle  y^{(i)} _ v := {\bf 1}_{A_i}(v) \cdot || F(v)||

The denominator of the Rayleigh quotient of such a vector is, by definition, at least {1/2}, and we might hope to upper bound the numerator of the Rayleigh quotient of {{\bf y}^{(i)}} in terms of the numerator of the Rayleigh quotient of {F}, which is {k R_L(F)}.

Indeed, every edge {\{ u,v \}} with both endpoints outside of {A_i} contributes zero to the numerator of the Rayleigh quotient of {{\bf y}^{(i)}}, and every edge {\{ u,v \}} with both endpoints in {A_i} contributes

\displaystyle  ( \ ||F(u) || - || F(v) || \ )^2 \leq || F(u) - F(v) ||^2

to the numerator of the Rayleigh quotient of {{\bf y}^{(i)}}, and the right-hand-side above is the contribution of the edge to the numerator of the Rayleigh quotient of {F}.

So far, so good, but the problem comes from edges {\{ u,v \}} with one endpoint {u \in A_i} and one endpoint {v \not \in A_i}. Such an edge contributes {||F(u)||^2} to the Rayleigh quotient of {{\bf y}^{(i)}} and {||F(u)- F(v)||^2} to the Rayleigh quotient of {F}, and the former quantity could be much larger than the latter. If {dist(u,v)} is large, however, {||F(u)||^2} cannot be much larger than {||F(u) - F(v)||^2}, because of the following fact

Lemma 4

\displaystyle   ||F(v) || \cdot dist ( u,v) \leq 2 || F(u) - F(v) || \ \ \ \ \ (3)

Proof:

\displaystyle  ||F(v) || \cdot dist ( u,v) = || F(v)|| \cdot \left\| \ \frac{F(u)}{||F(u) ||} - \frac{F(v) } {||F(v)||} \ \right\|

\displaystyle  = \left\| F(u) \cdot \frac {||F(v)||}{||F(u)||} - F(v) \right\|

\displaystyle  \leq \left\| F(u) \cdot \frac {||F(v)||}{||F(u)||} - F(u) \right\| + \| F(u) - F(v) \|

\displaystyle  = \left \| F(u) \cdot \left( \frac {||F(v)||}{||F(u)||} - 1 \right) \right \| + \| F(u) - F(v) \|

\displaystyle  = ||F(u)|| \cdot \left| \frac {||F(v)||}{||F(u)||} - 1 \right| + \| F(u) - F(v) \|

\displaystyle  = | \ ||F(v) || - ||F(u)|| \ | + \| F(u) - F(v) \|

\displaystyle  \leq 2 \| F(u) - F(v) \|

\Box

We can conclude that the only problem comes from edges {\{ u,v \}} such that {u\in A_i}, {v\not \in A_i}, and {dist(u,v)} is small. To deal with such edges, we will modify the definition of {{\bf y}^{(i)}}, and use a “smoothed” version of the indicator function of {A_i} instead of the actual indicator.

3.2. Proof

If {v} is a vertex and {A} is a set of vertices, we define

\displaystyle  dist(v,A) = \min_{u \in A} dist(v,u)

For each {i}, we define the following smooth indicator function of {A_i}:

\displaystyle  \tau_i (v) = \left\{ \begin{array}{ll} 1 & \mbox{if } v \in A_i\\ 0 & \mbox{if } dist(v,A_i) \geq \frac \delta 2 \\ 1 - \frac 2 \delta \cdot dist(v, A_i) & \mbox{otherwise} \end{array} \right.

Notice that the functions {\tau_i(\cdot)} are disjointly supported: there cannot be a vertex {v} such that {\tau_i(v) >0} and {\tau_j (v) > 0} for {i\neq j}, otherwise we would have {dist(v, A_i) < \frac \delta 2} and {dist(v, A_j) < \frac \delta 2}, contradicting the well-separated condition on the sets {A_i}.

We define the vectors {{\bf y}^{(i)}} as

\displaystyle  y^{(i)} _v = \tau_i (v) \cdot || F(v) ||

The vectors {{\bf y}^{(i)}} are disjointly supported, and it remains to understand their Rayleigh quotient.

The denominator of the Rayleigh quotient of {{\bf y}^{(i)}} is

\displaystyle  \sum_{v\in V} \tau_i ^2 (v) \cdot || F(v) ||^2 \geq \sum_{v \in A_i} ||F(v)||^2 \geq \frac 12

The contribution of an edge {\{ u,v \}} to the numerator is the square of

\displaystyle  | y^{(i)}_v - y^{(i)}_u | = | \ \tau_i (v) \cdot || F(v) || - \tau_i (u) \cdot || F(u) || \ |

\displaystyle  \leq | \ \tau_i (v) \cdot || F(v) || - \tau_i (v) \cdot || F(u) || \ | + | \ \tau_i (v) \cdot || F(u) || - \tau_i (u) \cdot || F(u) || \ |

\displaystyle  = \tau_i (v) \cdot || F(v) - F(u) || + ||F(u)|| \cdot | \tau_i (v) - \tau_i (u) |

\displaystyle  \leq || F(v) - F(u) || + ||F(u)|| \cdot \frac 2 \delta \cdot dist(v,u)

\displaystyle  \leq || F(v) - F(u) || \cdot \left( 1 + \frac 4 \delta \right)

where we used the inequality

\displaystyle  | \tau_i (v) - \tau_i (u) | \leq \frac 2 \delta \left| dist (v, A_i) - dist ( u, A_i) \right| \leq \frac 2 \delta dist(v,u)

The numerator of the Rayleigh quotient of {{\bf y}^{(i)}} is thus

\displaystyle  \sum_{ \{ u,v \} \in E} | y^{(i)}_v - y^{(i)}_u |^2 \leq O(\delta^{-2}) \sum_{ \{ u,v \}\in E} || F(v) - F(u)||^2 = O(\delta^{-2}) \cdot k R_L(F)

and this proves Lemma 1.

4. Well-Separated Sets

In this section we prove Lemma 2, which follows easily from the following result.

Lemma 5 We can find disjoint sets of vertices {T_1,\ldots, T_m} such that

  • {\sum_{i=1}^m \sum_{v\in T_i} || F(v)||^2 \geq k - \frac 14}
  • For every {u}, {v} in different sets, {dist(u,v) \geq \Omega(k^{-3}) }
  • For every set {T_i}, {\sum_{v \in T_i} ||F(v)||^2 \leq 1 + \frac 1 {4k} }

We can derive Lemma 2 from Lemma 5 as follows. Let us call the quantity {\sum_{v\in A} ||F(v)||^2} the mass of a set {A}. Starting from the sets {T_1,\ldots, T_m} promised by Lemma 5 we run the following process: as long as there are two sets both of mass {< \frac 12} we merge them. Call {A_1,\ldots, A_t} the sets of mass {\geq \frac 12} obtained at the end of this process; in addition, there may be one more set of mass {< \frac 12}. Every set has mass {\leq 1 + \frac 1 {4k}}. This means that the total mass of the sets is at most {\frac 12 + t \cdot \left( 1 + \frac 1 {4k} \right) \geq k - \frac 14}, which implies {t\geq k}. Thus we have found {k} sets of vertices, each of mass at least {1/2}, and such that any two sets have separation {\Omega(k^{-3})}.

Now we turn to the proof of Lemma 5. We are going to use the fact that, for every small cone in {{\mathbb R}^k}, the mass of vertices {v} such that {F(v)} is in the cone is also small and, in particular, it can made at most { 1 + \frac 1 {4k}}. We will prove the Lemma by covering almost all the points {F(v)} using a collection of well-separated small cones.

We first formalize the above intuition about cones. If {R} (for region) is a subset of the unit sphere in {{\mathbb R}^n}, then the diameter of {R} is

\displaystyle  diam(R) := \sup_{{\bf w}, {\bf z} \in R} || {\bf w} - {\bf z} ||

and the cone generated by {R} is the set {\{ \alpha {\bf w} : \alpha \in {\mathbb R}_{\geq 0}, {\bf w} \in R \}} of non-negative scalar multiples of elements of {R}. The set of vertices covered by {R}, denoted {V(R)} is the set of vertices {v} such that {F(v)} is in the cone generated by {R} or, equivalently,

\displaystyle  V(R) := \left \{ v \in V : \frac {F(v)}{||F(v)||} \in R \right \}

If {R} has small diameter, then {V(R)} has small mass.

Lemma 6 For every subset {R} of the unit sphere,

\displaystyle  \sum_{v\in V(R)} ||F(v)||^2 \leq \left( 1 - \frac 12( diam(R))^2 \right)^{-2 }

Proof: For every two unit vectors {{\bf w}} and {{\bf z}}, we have

\displaystyle  \langle {\bf z} , {\bf w} \rangle = 1 - \frac 12 ||{\bf w} - {\bf z}||^2

For every vertex {v}, call

\displaystyle  \bar F(v):= \frac { F(v)}{||F(v)||}

Let {{\bf w}} be a vector in {R}. Then we have

\displaystyle  1 \geq \sum_{v\in V(R)} \langle F(v) , {\bf w} \rangle ^2

\displaystyle  = \sum_{v\in V(R)} ||F(v)||^2 \cdot \langle \bar F(v) , {\bf w} \rangle ^2

\displaystyle  = \sum_{v\in V(R)} ||F(v)||^2 \cdot \left( 1 - \frac 12 || \bar F(v) - {\bf w} || ^2 \right)^2

\displaystyle  \geq \sum_{v\in V(R)} ||F(v)||^2 \cdot \left( 1 - \frac 12(diam(R)) ^2 \right)^2

\Box

In particular, if {diam(R) \leq \frac 1 {\sqrt 5 k}}, then the mass of {V(R)} is at most

\displaystyle  \left( 1 - \frac 1{10 k} \right)^{-2} \leq \left( 1 - \frac 1 {5k} \right)^{-1} = 1 + \frac 1 {5k-1} \leq 1 + \frac 1 {4k}

Another observation is that, for every two subsets {R_1,R_2} of the unit sphere,

\displaystyle  \min_{u\in V(R_1), v\in V(R_2)} dist(u,v) \geq \min_{{\bf w} \in R_1, {\bf z} \in R_2} || {\bf w} - {\bf z} ||

Our approach will be to find disjoint subsets {R_1,\ldots, R_m} of the unit sphere, each of diameter at most {1/2\sqrt k}, such that the total mass of the sets {V(R_1), \ldots, V(R_m)} is at least {k - \frac 14} and such that the separation between any two {R_i,R_j} is at least {\Omega(k^{-3})}.

To do this, we tile {{\mathbb R}^k} with axis-parallel cubes of side {L = \frac 1{2k}}, which clearly have diameter at most {\frac 1 {2\sqrt k}}, and, for every cube {C}, we define its core {\tilde C} to be a cube with the same center as {C} and of side {L \cdot \left( 1 - \frac 1{4k^2} \right)}. Note two points in the core of two different cubes have distance at least {\frac 1 {8k^3}}. Let now {R_1, R_2, \ldots} be the intersections of the cube cores with the unit sphere. Since each {R_i} is a subset of a core of a cube, it has diameter at most {\frac 1 {2\sqrt k}}, and the distance between any two points in different regions is at least {\frac 1 {8k^3}}. We claim that there is a way to choose the location of the centers of the cubes so that {\sum_i \sum_{v\in V(R_i)} ||F(v)||^2 \geq k - \frac 14}.

Let us start by a fixed configuration of the cubes and then apply an axis-parallel random shift (by adding to each coordinate, a random real in the range {[0,L]}. Then, for each fixed point in {{\mathbb R}^n} and, in particular, for each point {\bar F(v)}, the probability that it falls in the core of a cube after the shift is at least {1 - \frac 1{4k}}, so the average mass of the vertices covered by the regions is at least {k - \frac 14}, and there must exist a shift that is at least as good.

4 thoughts on “CS294 Lecture 7: Higher order Cheeger inequality, cont’d

  1. Right after Lemma 5, in the paragraph showing the connection of Lemma 5 to Lemma 2, it states that “Every set has mass at most 1 + 1/4k. This means that the total mass of the sets is at most 1/2 + t (1 + 4k)”.
    Given the context, I assume that this refers to sets A_1, …, A_t that come up after the merging procedure. Nonetheless the fact that every set has mass at most 1+1/4k is true only for sets T_1, …, T_m. In my understanding the mass of a set A_i can be larger that 1+1/4k since A_i may come up from the merging of several of the sets T_1, …, T_m.
    What am I missing here?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s