As life is tentatively returning to normal, I would like to once again post technical material here. Before returning to online optimization, I would like to start with something from 2015 that we never wrote up properly, that has to do with *graph curvature* and with *Buser inequalities in graphs*.

# Tag Archives: Cheeger inequality

# CS294 Lecture 7: Higher order Cheeger inequality, cont’d

*In which we state an analog of Cheeger’s inequalities for the -th smallest Laplacian eigenvalue, and we discuss the connection between this result and the analysis of spectral partitioning algorithms*

# CS294 Lecture 6: Higher-order Cheeger Inequality

**1. Cheeger-type Inequalities for **

Let be an undirected -regular graph, its adjacency matrix, its normalized Laplacian matrix, and be the eigenvalues of , counted with multiplicities and listed in non-decreasing order.

In Handout 2, we proved that if and only if has at least connected components, that is, if and only if there are disjoint sets such that for . In this lecture and the next one we will prove a robust version of this fact.

First we introduce the notion of higher-order expansion. If is a collection of disjoint sets, then their order- expansion is defined as

and the order- expansion of a graph is

If the edges of a graph represent a relation of similarity of affinity, a low-expansion collection of sets represents an interesting notion of clustering, because the vertices in each set are more related to each other than to the rest of the graph. (Additional properties are desirable in a good clustering, and we will discuss this later.)

We will prove the following higher-order Cheeger inequalities:

Stronger upper bounds are known, but the bound above is easier to prove from scratch. It is known that and that .

# CS294 Lecture 4: Cheeger Inequalities cont’d

# CS294 Lecture 3: Cheeger Inequalities

*In which we generalize the notion of normalized Laplacian to irregular graphs, we extend the basic spectral graph theory results from last lecture to irregular graphs, and we prove the easy direction of Cheeger’s inequalities.*

**1. Irregular Graphs **

Let be an undirected graph, not necessarily regular. We will assume that every vertex has non-zero degree. We would like to define a normalized Laplacian matrix associated to so that the properties we proved last time are true: that the multiplicity of 0 as an eigenvalue is equal to the number of connected components of , that the largest eigenvalue is at most 2, and that it is 2 if and only if (a connected component of) the graph is bipartite.

# Proof of the Cheeger inequality in manifolds

Having (non-rigorously) defined the Laplacian operator in manifolds in the previous post, we turn to the proof of the Cheeger inequality in manifolds, which we restate below.

Theorem 1 (Cheeger’s inequality)Let be an -dimensional smooth, compact, Riemann manifold without boundary with metric , let be the Laplace-Beltrami operator on , let be the eigenvalues of , and define theCheeger constantof to bewhere the is the boundary of , is the -dimensional measure, and is -th dimensional measure defined using . Then

We begin by recalling the proof of the analogous result in graphs, and then we will repeat the same steps in the context of manifolds.

Theorem 2 (Cheeger’s inequality in graphs)Let be a -regular graph, be its adjacency matrix, be its normalized Laplacian matrix, be the eigenvalues of , and define for every subset of vertices . Define the conductance of aswhere is the number of edges with one endpoint in and one endpoint in . Then

**1. Proof the Cheeger inequality in graphs **

We will use the *variational characterization* of the eigenvalues of the Laplacian of a graph .

and if is a minimizer in the above expression then

Following the definition of we see that

and so the minimum in (3) is 0, and it is achieved for . This means that

The expression in the right-hand-side of (4) is an important one, and it is called the Rayleigh quotient of , which we will denote by :

It is also useful to consider the variant of the Rayleigh quotient where there are no squares; this does not have a standard name, so let us call it the Rayleigh quotient and denote it by :

The proof of the graph Cheeger inequality now continues with the proof of the following three facts.

Lemma 3 (Rounding of embeddings)For every non-negative vector there is a value such that

Lemma 4 (Embedding of into )For every non-negative vector , we have

Lemma 5 (From an eigenvector to a non-negative vector)For every such that there is a non-negative such that and such that

Now let us start from a function that optimizes (4), so that and , then apply Lemma 5 to find a function such that the volume of the vertices having positive coordinates in is at most and such that . Then consider the vector such that ; by Lemma 4, we have , and by Lemma 3 there is a threshold such that the set has conductance . Since is a subset of the vertices having positive coordinates in , we have , and so

which is the Cheeger inequality for graphs. It remains to prove the three lemmas.

*Proof:* of Lemma 3. For each threshold , define the set

The idea of the proof is that if we pick at random then the probability that an edge belongs to is proportional to and the probability that is proportional to , so that the expected number of edges in is proportional to the numerator of and the expected number of vertices in is proportional to the denominator of ; if is small, it is not possible for to always be large for every .

To avoid having to normalize the range of to be between and , instead of taking averages over a random choice of , we will consider the integral over all values of . We have

because we can write , where if and otherwise, and we see that only the values of between and make , so .

We also have

and if we denote by the threshold such that is smallest among all the , then

so that

*Proof:* of Lemma 3. Let us consider the numerator of ; it is:

(we used Cauchy-Swarz)

(we used the definition of and Cauchy-Swarz again)

And so

*Proof:* of Lemma 5. Let be the median of , and consider defined as . We have

because the numerators of and are the same (the additive term cancels). The denominators are such that

because and the vector are orthogonal, and so by Pythagoras’s theorem the length-squared of equals the length-squared of plus the length-squared of .

Let us define and so that . We use the following fact:

Fact 6Let be disjointly supported non-negative vectors (“disjointly supported” means that they are non-zero on disjoint subsets of coordinates), then

*Proof:* The numerator of is

and, using orthogonality and Pythagoras’s theorem, the denominator of is

The fact now follows from the inequality

The lemma now follows by observing that and are non-negative and disjointly supported, so

and that both and have at most non-zero coordinate.

**2. Proof of the Cheeger inequality in manifolds **

We will now translate the proof of the graph Cheeger inequality to the setting of manifolds.

As you may remember, we started off by saying that is symmetric and so all its eigenvalues are real and they are given by the variational characterization. Now we are already in trouble because the operator on manifolds cannot be thought of as a matrix, so what does it mean for it to be symmetric? The consequence of symmetry that is exploited in the analysis of the spectrum of symmetric matrices is the fact that if is symmetric, then for every we have

and the property makes no references to coordinates, and it is well defined even for linear operators over infinite-dimensional spaces, provided that there is a notion of inner product. If we the define the inner product

on functions , and more generally

for functions , where is a vector space with inner product , then we can say that an operator is *self-adjoint* if

for all (appropriately restricted) functions . If is compact, this property is true for the Laplacian, and, in particular, and are adjoints of each others, that is,

(The discrete analog would be that is the transpose of .)

Self-adjointness (and appropriate conditions on ) imply a version of the spectral theorem and of the variational characterization. In particular, all eigenvalues of are real, and if there is a minimum one then it is

and if is a minimizer of the above, then

(The minimization is quantified over all functions that are square-integrable, and the minimum is achieved because if is compact then the space of such functions is also compact and the cost function that we are minimizing is continuous. In this post, whenever we talk about “all functions,” it should be understood that we are restricting to whatever space of functions makes sense in the context.)

From the property that and are adjoint, we have

so

where the Rayleigh quotient

is always non-negative, and it is zero for constant , so we see that and

By analogy with the graph case, we define the “ Rayleigh quotient”

And we can prove the analogs of the lemmas that we proved for graphs.

Lemma 7 (Rounding of embeddings)For every non-negative function there is a value such that

where the Cheeger constant of a subset of the manifold is

Lemma 8 (Embedding of into )For every non-negative function , we have

Lemma 9 (From an eigenfunction to a non-negative function)For every function such that there is a non-negative such that and such that

Let us see the proof of these lemmas.

*Proof:* of Lemma 7. For each threshold , define the set

Let be a threshold for which is minimized

We will integrate the numerator and denominator of over all . The *coarea formula* for nonnegative functions is

and we also have

which combine to

so that

*Proof:* of Lemma 8. Let us consider the numerator of ; it is:

We can apply the chain rule, and see that

which implies

and, after applying Caucy-Swarz,

And so

*Proof:* of Lemma 9. Let be a median of , and consider defined as . We have

because the numerators of and are the same (the derivatives of functions that differ by a constant are identical) and the denominators are such that

where we used the fact the integral of is zero.

Let us define and so that . We use the following fact:

Fact 10Let be disjointly supported non-negative functions (“disjointly supported” means that they are non-zero on disjoint subsets of inputs), then

*Proof:* We begin with the following observation: if is a non-negative function, and , then , because has to be a local minimum.

Consider the expression occurring in the numerator of . We have

But

because for every at least one of or is zero, and so at least one of or is zero.

Using this fact, we have that the numerator of is equal to the sum of the numerators of and :

and the denominator of is also the sum of the denominators of and :

because for every . The fact now follows from the inequality

The lemma now follows by observing that and are non-negative and disjointly supported, so

and that both and have a support of volume at most .

If anybody is still reading, it is worth observing a couple of differences between the discrete proof and the continuous proof.

The Rayleigh quotient is defined slightly differently in the continuous case. It would correspond to defining it as

in the discrete case.

If are disjointly supported and nonnegative, the sum of the numerators of the Rayleigh quotients and can be strictly smaller than the numerator of , while we always have equality in the continuous case. In the discrete case, the sum of the numerators of and can be up to twice the numerator of (this fact is useful, but it did not come up in this proof), while again we have exact equality in the continuous case.

The chain rule calculation

corresponds to the step

In the continuous case, and are “infinitesimally close”, so we can approximate by .

# The Cheeger inequality in manifolds

Readers of *in theory* have heard about Cheeger’s inequality a lot. It is a relation between the edge expansion (or, in graphs that are not regular, the conductance) of a graph and the second smallest eigenvalue of its Laplacian (a normalized version of the adjacency matrix). The inequality gives a worst-case analysis of the “sweep” algorithm for finding sparse cuts, it shows a necessary and sufficient for a graph to be an expander, and it relates the mixing time of a graph to its conductance.

Readers who have heard this story before will recall that a version of this result for vertex expansion was first proved by Alon and Milman, and the result for edge expansion appeared in a paper of Dodzuik, all from the mid-1980s. The result, however, is not called *Cheeger’s inequality* just because of Stigler’s rule: Cheeger proved in the 1970s a very related result on manifolds, of which the result on graphs is the discrete analog.

So, what is the *actual* Cheeger’s inequality?

Theorem 1 (Cheeger’s inequality)Let be an -dimensional smooth, compact, Riemann manifold without boundary with metric , let be the Laplace-Beltrami operator on , let be the eigenvalues of , and define theCheeger constantof to bewhere the is the boundary of , is the -dimensional measure, and is -th dimensional measure defined using . Then

The purpose of this post is to describe to the reader who knows nothing about differential geometry and who does not remember much multivariate calculus (that is, the reader who is in the position I was in a few weeks ago) what the above statement means, to describe the proof, and to see that it is in fact the *same proof* as the proof of the statement about graphs.

In this post we will define the terms appearing in the above theorem, and see their relation with analogous notions in graphs. In the next post we will see the proof.

# CS359G Lecture 4: Spectral Partitioning

*In which we prove the difficult direction of Cheeger’s inequality.*

As in the past lectures, consider an undirected -regular graph , call its adjacency matrix, and its scaled adjacency matrix. Let be the eigenvalues of , with multiplicities, in non-increasing order. We have been studying the *edge expansion* of a graph, which is the minimum of over all non-trivial cuts of the vertex set (a cut is trivial if or ), where the expansion of a cut is

We have also been studying the (uniform) *sparsest cut* problem, which is the problem of finding the non-trivial cut that minimizes , where the sparsisty of a cut is

We are proving Cheeger’s inequalities:

and we established the left-hand side inequality in the previous lecture, showing that the quantity can be seen as the optimum of a continuous relaxation of , so that , and follows by the definition.

Today we prove the more difficult, and interesting, direction. The proof will be constructive and algorithmic. The proof can be seen as an analysis of the following algorithm.

Algorithm: *SpectralPartitioning*

- Input: graph and vector
- Sort the vertices of in non-decreasing order of values of entries in , that is let where
- Let be such that is minimal
- Output

We note that the algorithm can be implemented to run in time , assuming arithmetic operations and comparisons take constant time, because once we have computed it only takes time to compute .

We have the following analysis of the quality of the solution:

Lemma 1 (Analysis of Spectral Partitioning)Let be a d-regular graph, be a vector such that , let be the normalized adjacency matrix of , defineand let be the output of algorithm

SpectralPartitioningon input and . Then

Remark 1If we apply the lemma to the case in which is an eigenvector of , then , and so we have

which is the difficult direction of Cheeger’s inequalities.

Remark 2If we run the SpectralPartitioning algorithm with the eigenvector of the second eigenvalue , we find a set whose expansion isEven though this doesn’t give a constant-factor approximation to the edge expansion, it gives a very efficient, and non-trivial, approximation.

As we will see in a later lecture, there is a nearly linear time algorithm that finds a vector for which the expression in the lemma is very close to , so, overall, for any graph we can find a cut of expansion in nearly linear time.

# CS359G Lecture 3: Cheeger’s inequality

*In which we prove the easy case of Cheeger’s inequality.*

**1. Expansion and The Second Eigenvalue **

Let be an undirected -regular graph, its adjacency matrix, its normalized adjacency matrix, and be the eigenvalues of .

Recall that we defined the *edge expansion* of a cut of the vertices of as

and that the edge expansion of is .

We also defined the related notion of the *sparsity* of a cut as

and ; the *sparsest cut* problem is to find a cut of minimal sparsity.

Recall also that in the last lecture we proved that if and only if is disconnected. This is equivalent to saying that if and only if . In this lecture and the next we will see that this statement admits an *approximate version* that, qualitatively, says that is small if and only if is small. Quantitatively, we have

Theorem 1 (Cheeger’s Inequalities)