In which we prove the difficult direction of Cheeger’s inequality.
As in the past lectures, consider an undirected $d$-regular graph $G=(V,E)$, call $A$ its adjacency matrix, and $M := \frac{1}{d} \cdot A$ its scaled adjacency matrix. Let $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$ be the eigenvalues of $M$, with multiplicities, in non-increasing order. We have been studying the edge expansion of a graph, which is the minimum of $h(S)$ over all non-trivial cuts $(S, V-S)$ of the vertex set (a cut is trivial if $S = \emptyset$ or $S = V$), where the expansion $h(S)$ of a cut is

$\displaystyle h(S) := \frac{E(S, V-S)}{d \cdot \min \{ |S|, |V-S| \} }$
We have also been studying the (uniform) sparsest cut problem, which is the problem of finding the non-trivial cut that minimizes $\sigma(S)$, where the sparsity $\sigma(S)$ of a cut is

$\displaystyle \sigma(S) := \frac{E(S, V-S)}{\frac{d}{n} \cdot |S| \cdot |V-S|}$
We are proving Cheeger's inequalities:

$\displaystyle \frac{1-\lambda_2}{2} \leq h(G) \leq \sqrt{2 \cdot (1-\lambda_2)}$

and we established the left-hand side inequality in the previous lecture, showing that the quantity $1-\lambda_2$ can be seen as the optimum of a continuous relaxation of $\sigma(G)$, so that $1 - \lambda_2 \leq \sigma(G)$, and $h(G) \geq \sigma(G)/2$ follows by the definition.
Today we prove the more difficult, and interesting, direction. The proof is constructive and algorithmic, and it can be seen as an analysis of the following algorithm.
Algorithm: SpectralPartitioning

- Input: graph $G=(V,E)$ and vector $x \in \mathbb{R}^V$
- Sort the vertices of $V$ in non-decreasing order of values of entries in $x$, that is let $V = \{ v_1, \ldots, v_n \}$ where $x_{v_1} \leq x_{v_2} \leq \cdots \leq x_{v_n}$
- Let $i \in \{ 1, \ldots, n-1 \}$ be such that $h( \{ v_1, \ldots, v_i \} )$ is minimal
- Output $S = \{ v_1, \ldots, v_i \}$
We note that the algorithm can be implemented to run in time $O(|E| + |V| \log |V|)$, assuming arithmetic operations and comparisons take constant time, because once we have computed $h(\{ v_1, \ldots, v_i \})$ it only takes time $O(\mathrm{deg}(v_{i+1}))$ to compute $h(\{ v_1, \ldots, v_{i+1} \})$.
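The sweep described above can be sketched in a few lines; the following is our own illustrative implementation (the function name and the adjacency-list input format are our choices, not from the lecture), maintaining the number of crossing edges incrementally as each vertex moves across the cut:

```python
def spectral_partitioning(adj, x):
    """Sweep cut: sort vertices by x, return the prefix set of minimum expansion.

    adj: adjacency list of a d-regular graph, adj[v] = neighbors of v.
    x:   list of real values, one per vertex.
    Runs in O(|E| + |V| log |V|): after sorting, each edge is touched twice.
    """
    n = len(adj)
    d = len(adj[0])                      # regular graph: all degrees are equal
    order = sorted(range(n), key=lambda v: x[v])
    position = {v: i for i, v in enumerate(order)}

    cut = 0                              # edges between the prefix and the rest
    best_i, best_h = None, float('inf')
    for i, v in enumerate(order[:-1]):   # prefixes {v_1}, ..., {v_1,...,v_{n-1}}
        for u in adj[v]:
            if position[u] > i:
                cut += 1                 # edge (v,u) now crosses the cut
            else:
                cut -= 1                 # edge (u,v) no longer crosses the cut
        h = cut / (d * min(i + 1, n - i - 1))
        if h < best_h:
            best_i, best_h = i, h
    return set(order[:best_i + 1]), best_h
```

For example, on a 6-cycle with $x = (0,1,2,3,4,5)$ the sweep returns the half-cycle $\{0,1,2\}$, of expansion $1/3$.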
We have the following analysis of the quality of the solution:
Lemma 1 (Analysis of Spectral Partitioning) Let $G=(V,E)$ be a $d$-regular graph, $x \in \mathbb{R}^V$ be a vector such that $x \perp \mathbf{1}$, let $M$ be the normalized adjacency matrix of $G$, define

$\displaystyle \delta := \frac{\sum_{\{u,v\} \in E} (x_u - x_v)^2}{d \cdot \sum_v x_v^2}$

and let $S$ be the output of algorithm SpectralPartitioning on input $G$ and $x$. Then

$\displaystyle h(S) \leq \sqrt{2 \delta} \ \ \ \ \ (1)$
Remark 1 If we apply the lemma to the case in which $x$ is an eigenvector of $\lambda_2$, then $\delta = 1 - \lambda_2$, and so we have

$\displaystyle h(G) \leq h(S) \leq \sqrt{2 \cdot (1 - \lambda_2)}$

which is the difficult direction of Cheeger's inequalities.
Remark 2 If we run the SpectralPartitioning algorithm with the eigenvector $x$ of the second eigenvalue $\lambda_2$, we find a set $S$ whose expansion is

$\displaystyle h(S) \leq \sqrt{2 \cdot (1 - \lambda_2)} \leq 2 \sqrt{h(G)}$

Even though this doesn't give a constant-factor approximation to the edge expansion, it gives a very efficient, and non-trivial, approximation.
As we will see in a later lecture, there is a nearly linear time algorithm that finds a vector $x$ for which the expression $\delta$ in the lemma is very close to $1 - \lambda_2$, so, overall, for any graph $G$ we can find a cut of expansion $O(\sqrt{h(G)})$ in nearly linear time.
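As a sanity check of Remark 2, one can compute the second eigenvector numerically and run the sweep cut on it. The following sketch uses a graph of our own choosing (two 4-cliques joined by a perfect matching, which is 4-regular) and verifies the guarantee $h(S) \leq \sqrt{2 \cdot (1-\lambda_2)}$:

```python
import numpy as np

# Two 4-cliques joined by a perfect matching: a 4-regular graph on 8 vertices.
n, d = 8, 4
A = np.zeros((n, n))
for i in range(4):
    for j in range(4):
        if i != j:
            A[i, j] = A[4 + i, 4 + j] = 1.0
    A[i, 4 + i] = A[4 + i, i] = 1.0
M = A / d

# eigh returns eigenvalues in ascending order, so w[-2] is lambda_2.
w, V = np.linalg.eigh(M)
lam2, x = w[-2], V[:, -2]

# Sweep cut: try every prefix of the vertices sorted by x.
order = np.argsort(x)
best_h = min(
    A[np.ix_(order[:i], order[i:])].sum() / (d * min(i, n - i))
    for i in range(1, n)
)
assert best_h <= np.sqrt(2 * (1 - lam2))  # the guarantee of Lemma 1
```

On this graph $\lambda_2 = 1/2$, and the minimizing prefix is one of the two cliques, of expansion $1/4$, comfortably below $\sqrt{2 \cdot (1-\lambda_2)} = 1$.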
1. Proof of the Lemma
In the past lecture, we saw that $1 - \lambda_2$ can be seen as the optimum of a continuous relaxation of sparsest cut. Lemma 1 provides a rounding algorithm for the real vectors which are solutions of the relaxation. In this section we will think of it as a form of randomized rounding. Later, when we talk about the Leighton-Rao sparsest cut algorithm, we will revisit this proof and think of it in terms of metric embeddings.
To simplify notation, we will assume that $V = \{ 1, \ldots, n \}$ and that $x_1 \leq x_2 \leq \cdots \leq x_n$. Thus our goal is to prove that there is an $i \in \{1, \ldots, n-1\}$ such that $h(\{1, \ldots, i\}) \leq \sqrt{2\delta}$.

We will derive Lemma 1 by showing that there is a distribution $D$ over sets $S$ of the form $\{ 1, \ldots, i \}$ such that

$\displaystyle \frac{ \mathop{\mathbb{E}} [ E(S, V-S) ] }{ \mathop{\mathbb{E}} [ d \cdot \min \{ |S|, |V-S| \} ] } \leq \sqrt{2\delta} \ \ \ \ \ (2)$
We need to be a bit careful in deriving the Lemma from (2). In general, it is not true that a ratio of averages is equal to the average of the ratios, so (2) does not imply that $\mathop{\mathbb{E}} [ h(S) ] \leq \sqrt{2\delta}$. We can, however, apply linearity of expectation and derive from (2) the inequality

$\displaystyle \mathop{\mathbb{E}} \left[ E(S, V-S) - \sqrt{2\delta} \cdot d \cdot \min \{ |S|, |V-S| \} \right] \leq 0$

So there must exist a set $S$ in the sample space such that

$\displaystyle E(S, V-S) - \sqrt{2\delta} \cdot d \cdot \min \{ |S|, |V-S| \} \leq 0$

meaning that, for that set $S$, we have $h(S) \leq \sqrt{2\delta}$. (Basically we are using the fact that, for random variables $X$ and $Y$ over the same sample space, although it might not be true that $\mathop{\mathbb{E}} [X/Y] = \mathop{\mathbb{E}}[X] / \mathop{\mathbb{E}}[Y]$, we always have $\Pr [ X/Y \leq \mathop{\mathbb{E}}[X] / \mathop{\mathbb{E}}[Y] ] > 0$, provided that $Y > 0$ over the entire sample space.)
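The parenthetical fact is easy to verify on a toy sample space. In this sketch (our own example, using exact rational arithmetic) the average of the ratios differs from the ratio of the averages, yet some sample point satisfies $X/Y \leq \mathop{\mathbb{E}}[X]/\mathop{\mathbb{E}}[Y]$:

```python
from fractions import Fraction as F

# A two-point sample space, uniform probabilities; Y > 0 everywhere.
X = [F(1), F(4)]
Y = [F(2), F(1)]
EX = sum(X) / len(X)                       # E[X] = 5/2
EY = sum(Y) / len(Y)                       # E[Y] = 3/2
ratios = [x / y for x, y in zip(X, Y)]     # X/Y at each sample point

assert sum(ratios) / len(ratios) != EX / EY   # E[X/Y] need not equal E[X]/E[Y]
assert min(ratios) <= EX / EY                 # but some point has X/Y <= E[X]/E[Y]
```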
From now on, we will assume that

- $x_{\lceil n/2 \rceil} = 0$, that is, the median of the entries of $x$ is zero
- $x_1^2 + x_n^2 = 1$

which can be done without loss of generality because adding a fixed constant to all entries of $x$, or multiplying all the entries by a fixed constant, does not change the value of $h(\{1,\ldots,i\})$ for any $i$, nor does it increase the value of $\delta$. (Multiplying by a constant leaves $\delta$ unchanged; adding a constant leaves the numerator of $\delta$ unchanged and, because $x \perp \mathbf{1}$, can only increase the denominator.) The reason for these choices is that they allow us to define a distribution $D$ over sets such that

$\displaystyle \mathop{\mathbb{E}}_{S \sim D} [ \min \{ |S|, |V-S| \} ] = \sum_v x_v^2 \ \ \ \ \ (3)$
We define the distribution $D$ over sets of the form $S = \{ 1, \ldots, i \}$, $i \in \{1, \ldots, n-1\}$, as the outcome of the following probabilistic process:

- We pick a real value $t$ in the range $[x_1, x_n]$ with probability density function $f(t) = 2|t|$. That is, for $x_1 \leq a \leq b \leq x_n$, $\Pr [ a \leq t \leq b ] = \int_a^b 2|t| \, dt$. Doing the calculation, this means that $\Pr[a \leq t \leq b] = |b^2 - a^2|$ if $a,b$ have the same sign, and $\Pr[a \leq t \leq b] = a^2 + b^2$ if they have different signs. (Note that $f$ is a valid density function because $\int_{x_1}^{x_n} 2|t| \, dt = x_1^2 + x_n^2 = 1$.)
- We let $S := \{ i : x_i \leq t \}$
According to this definition, the probability that an element $i$ with $x_i < 0$ belongs to the smaller of the sets $S$, $V-S$ is the same as the probability that it belongs to $S$ while $t < 0$ (when $t < 0$, the set $S$ is the smaller one, because it does not contain the median element), which is the probability that the threshold $t$ is in the range $[x_i, 0)$, and that probability is $x_i^2$. Similarly, the probability that an element $i$ with $x_i > 0$ belongs to the smaller of $S$, $V-S$ is the same as the probability that it belongs to $V-S$ while $t \geq 0$, which is the probability that $t$ is in the range $[0, x_i)$, which is again $x_i^2$. Summing over the vertices and applying linearity of expectation, we have established (3).
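Identity (3) can also be checked exactly: the threshold $t$ induces a discrete distribution on prefix sets, with $\Pr [ S = \{1, \ldots, i\} ]$ obtained by integrating the density over $[x_i, x_{i+1})$. A sketch of this computation, on a hand-chosen vector with zero median and $x_1^2 + x_n^2 = 1$ (the vector is our own example):

```python
def interval_prob(a, b):
    """Pr[a <= t <= b] under density 2|t|: |b^2 - a^2| if a,b have the
    same sign, a^2 + b^2 if they have different signs."""
    if a * b >= 0:
        return abs(b * b - a * a)
    return a * a + b * b

# Sorted entries with zero median and x_1^2 + x_n^2 = 0.36 + 0.64 = 1.
x = [-0.6, -0.3, 0.0, 0.4, 0.8]
n = len(x)

# S = {1..i} exactly when t lands in [x_i, x_{i+1}); sum up min(|S|, |V-S|).
expected_min = sum(
    interval_prob(x[i - 1], x[i]) * min(i, n - i)
    for i in range(1, n)
)
assert abs(expected_min - sum(v * v for v in x)) < 1e-12  # identity (3)
```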
We will now estimate the expected number of edges between $S$ and $V-S$.

$\displaystyle \mathop{\mathbb{E}} [ E(S, V-S) ] = \sum_{\{u,v\} \in E} \Pr [ \mbox{the edge } \{u,v\} \mbox{ is cut} ]$

The event that the edge $\{u,v\}$ is cut by the partition $(S, V-S)$ happens when the value $t$ falls in the range between $x_u$ and $x_v$. This means that

- If $x_u, x_v$ have the same sign, $\Pr [ \{u,v\} \mbox{ is cut} ] = | x_u^2 - x_v^2 |$
- If $x_u, x_v$ have different signs, $\Pr [ \{u,v\} \mbox{ is cut} ] = x_u^2 + x_v^2$

Some attempts show that a good expression to upper bound both cases is

$\displaystyle \Pr [ \{u,v\} \mbox{ is cut} ] \leq | x_u - x_v | \cdot ( |x_u| + |x_v| )$
Plugging into our expression for the expected number of cut edges, and applying the Cauchy-Schwarz inequality, we get

$\displaystyle \mathop{\mathbb{E}} [ E(S,V-S) ] \leq \sum_{\{u,v\} \in E} |x_u - x_v| \cdot (|x_u| + |x_v|) \leq \sqrt{ \sum_{\{u,v\} \in E} (x_u - x_v)^2 } \cdot \sqrt{ \sum_{\{u,v\} \in E} (|x_u| + |x_v|)^2 }$

The assumption of the Lemma tells us that

$\displaystyle \sum_{\{u,v\} \in E} (x_u - x_v)^2 = \delta \cdot d \cdot \sum_v x_v^2$

and we can rewrite the first square root accordingly, which gives us

$\displaystyle \mathop{\mathbb{E}} [ E(S,V-S) ] \leq \sqrt{ \delta \cdot d \cdot \sum_v x_v^2 } \cdot \sqrt{ \sum_{\{u,v\} \in E} (|x_u| + |x_v|)^2 }$

Finally, it remains to study the expression $\sum_{\{u,v\} \in E} (|x_u| + |x_v|)^2$. By applying the inequality $(a+b)^2 \leq 2a^2 + 2b^2$ (which follows by noting that $2ab \leq a^2 + b^2$), we derive

$\displaystyle \sum_{\{u,v\} \in E} (|x_u| + |x_v|)^2 \leq \sum_{\{u,v\} \in E} 2 x_u^2 + 2 x_v^2 = 2 d \sum_v x_v^2$

Putting all the pieces together we have

$\displaystyle \mathop{\mathbb{E}} [ E(S,V-S) ] \leq \sqrt{2 \delta} \cdot d \cdot \sum_v x_v^2$

which, together with (3), gives (2), which, as we already discussed, implies Lemma 1.
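The final bound can be verified numerically on a small example: compute $\mathop{\mathbb{E}}[E(S,V-S)]$ exactly from the cut probabilities and compare it against $\sqrt{2\delta} \cdot d \cdot \sum_v x_v^2$. A sketch on a 2-regular 6-cycle with a hand-chosen vector (our own example, with zero median and $x_1^2 + x_n^2 = 1$ in sorted order):

```python
import math

# 6-cycle (2-regular); sorted, the entries of x are -0.6, -0.2, 0, 0.1, 0.3, 0.8,
# so the median is 0 and the extreme entries satisfy 0.36 + 0.64 = 1.
d = 2
x = [-0.6, -0.2, 0.0, 0.3, 0.8, 0.1]
edges = [(i, (i + 1) % 6) for i in range(6)]

def cut_prob(a, b):
    """Pr[threshold t separates values a and b] under density 2|t|."""
    return abs(a * a - b * b) if a * b >= 0 else a * a + b * b

expected_cut = sum(cut_prob(x[u], x[v]) for u, v in edges)
sum_sq = sum(v * v for v in x)
delta = sum((x[u] - x[v]) ** 2 for u, v in edges) / (d * sum_sq)

# The inequality E[E(S, V-S)] <= sqrt(2*delta) * d * sum_v x_v^2 from the proof.
assert expected_cut <= math.sqrt(2 * delta) * d * sum_sq
```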