The Cheeger inequality in manifolds

Readers of in theory have heard about Cheeger’s inequality a lot. It is a relation between the edge expansion (or, in graphs that are not regular, the conductance) of a graph and the second smallest eigenvalue of its Laplacian (a normalized version of the adjacency matrix). The inequality gives a worst-case analysis of the “sweep” algorithm for finding sparse cuts, it gives a necessary and sufficient condition for a graph to be an expander, and it relates the mixing time of a random walk on a graph to the conductance of the graph.

Readers who have heard this story before will recall that a version of this result for vertex expansion was first proved by Alon and Milman, and the result for edge expansion appeared in a paper of Dodziuk, all from the mid-1980s. The result, however, is not called Cheeger’s inequality just because of Stigler’s rule: Cheeger proved in the 1970s a closely related result on manifolds, of which the result on graphs is the discrete analog.

So, what is the actual Cheeger’s inequality?

Theorem 1 (Cheeger’s inequality) Let {M} be an {n}-dimensional smooth, compact, Riemannian manifold without boundary with metric {g}, let {L:= - {\rm div} \nabla} be the Laplace-Beltrami operator on {M}, let {0=\lambda_1 \leq \lambda_2 \leq \cdots } be the eigenvalues of {L}, and define the Cheeger constant of {M} to be

\displaystyle  h(M):= \inf_{S\subseteq M : \ 0 < \mu(S) \leq \frac 12 \mu(M)} \ \frac{\mu_{n-1}(\partial(S))}{\mu(S)}

where {\partial (S)} is the boundary of {S}, {\mu} is the {n}-dimensional measure, and {\mu_{n-1}} is the {(n-1)}-dimensional measure defined using {g}. Then

\displaystyle   h(M) \leq 2 \sqrt{\lambda_2} \ \ \ \ \ (1)
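
As a sanity check on (1), here is a sketch of the simplest example, the unit circle. It is an illustration rather than part of the theorem, and it only considers sets {S} that are finite unions of arcs. Take {M = S^1} with the arc-length metric, so that {n=1} and {\mu(M) = 2\pi}. The Laplace-Beltrami operator is {-\frac{d^2}{d\theta^2}}, whose eigenfunctions {1, \cos k\theta, \sin k\theta} have eigenvalues {0, 1, 1, 4, 4, \ldots}, so {\lambda_2 = 1}. A single arc {S} of length {\ell \leq \pi} has a boundary made of two points, so {\mu_0(\partial (S)) = 2}, and a union of {k} disjoint arcs has {2k} boundary points, which is only worse. Hence

\displaystyle  h(S^1) = \inf_{0 < \ell \leq \pi} \ \frac{2}{\ell} = \frac 2\pi \approx 0.64 \leq 2 = 2 \sqrt{\lambda_2}

which is consistent with (1).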

The purpose of this post is to describe to the reader who knows nothing about differential geometry and who does not remember much multivariate calculus (that is, the reader who is in the position I was in a few weeks ago) what the above statement means, to describe the proof, and to see that it is in fact the same proof as the proof of the statement about graphs.

In this post we will define the terms appearing in the above theorem, and see their relation with analogous notions in graphs. In the next post we will see the proof.

Hello, and Welcome to Fun with Expanders

Long in the planning, my online course on graph partitioning algorithms, expanders, and random walks, will start next month.

The course page is up for people to sign up. A friend of mine has compared my camera presence to Sheldon Cooper’s in “Fun with Flags,” which is sadly apt, but hopefully the material will speak for itself.

Meanwhile, I will be posting about some material that I have finally understood for the first time: the analysis of the Arora-Rao-Vazirani approximation algorithm, the Cheeger inequality in manifolds, and the use of the Selberg “3/16 theorem” to analyze expander constructions.

If you are not a fan of recorded performances, there will be a live show in Princeton at the end of June.

Holiday Readings

Now that the Winter break is coming, what to read in between decorating, cooking, eating, drinking and being merry?

The most exciting theoretical computer science development of the year was the improved efficiency of matrix multiplication by Stothers and Vassilevska-Williams. Virginia’s 72-page write-up will certainly keep many people occupied.

Terence Tao is teaching a class on expanders, and posting the lecture notes, of exceptionally high quality. It is hard to imagine something that would be a more awesome combination, to me, than Terry Tao writing about expanders. Maybe a biopic on Turing’s life, starring Joseph Gordon-Levitt, written by Dustin Lance Black and directed by Clint Eastwood.

Meanwhile, Ryan O’Donnell has been writing a book on the analysis of Boolean functions, by “serializing” it on a blog platform.

Now that I have done my part, you do yours. If I wanted to read a couple of books (no math, no theory) during the winter, what would you recommend? Don’t recommend what you think I would like, recommend what you like best, and why.

CS359G Lecture 16: Constructions of Expanders

In which we give an explicit construction of expander graphs of polylogarithmic degree, state the properties of the zig-zag product of graphs, and provide an explicit construction of a family of constant-degree expanders using the zig-zag product and the polylogarithmic-degree construction.

A family of expanders is a family of graphs {G_n = (V_n,E_n)}, {|V_n|=n}, such that each graph is {d_n}-regular, and the edge-expansion of each graph is at least {h}, for an absolute constant {h} independent of {n}. Ideally, we would like to have such a construction for each {n}, although it is usually enough for most applications that, for some constant {c} and every {k}, there is an {n} for which the construction applies in the interval {\{ k, k+1, \ldots, ck \}}, or even the interval {\{ k, \ldots, ck^c\}}. We would also like the degree {d_n} to be slowly growing in {n} and, ideally, to be bounded above by an explicit constant. Today we will see a simple construction in which {d_n = O(\log^2 n)} and a more complicated one in which {d_n = O(1)}.

An explicit construction of a family of expanders is a construction in which {G_n} is “efficiently computable” given {n}. The weakest sense in which a construction is said to be explicit is when, given {n}, the (adjacency matrix of the) graph {G_n} can be constructed in time polynomial in {n}. A stronger requirement, which is necessary for several applications, is that given {n} and {i\in \{ 1,\ldots,n\}}, the list of neighbors of the {i}-th vertex of {G_n} can be computed in time polynomial in {\log n}.

In many explicit constructions of constant-degree expanders, the construction is extremely simple, and besides satisfying the stricter definition of “explicit” above, it is also such that the adjacency list of a vertex is given by a “closed-form formula.” The analysis of such constructions, however, usually requires very sophisticated mathematical tools.

Example 1 Let {p} be a prime, and define the graph {G_p = (V_p,E_p)} in which {V_p = \{ 0,\ldots,p-1\}}, and, for {a\in V_p - \{ 0\}}, the vertex {a} is connected to {a+1 \bmod p}, to {a-1 \bmod p} and to its multiplicative inverse {a^{-1} \bmod p}. The vertex {0} is connected to {1}, to {p-1}, and has a self-loop. Counting self-loops, the graph is 3-regular: it is the union of a cycle over {V_p} and of a matching over the {p-3} vertices {V_p - \{ 0,1,p-1 \}}; the vertices {0}, {1}, {p-1} have a self-loop each. There is a constant {h>0} such that, for each {p}, the graph {G_p} has edge expansion at least {h}. Unfortunately, no elementary proof of this fact is known. The graph {G_{59}} is shown in the picture below.
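
To make Example 1 concrete, here is a minimal Python sketch of the graph. The function names `neighbors` and `edge_expansion` are made up for this illustration. The neighbor list needs only a few arithmetic operations modulo {p}, including one modular inverse, so it is computable in time polynomial in {\log p}. The brute-force expansion routine is exponential in {p} and is only meant for tiny primes; it does not replace the proof that the expansion is bounded below by a constant.

```python
def neighbors(a: int, p: int) -> list:
    """Neighbors of vertex a in G_p, with multiplicity (a self-loop lists a itself)."""
    if a == 0:
        # 0 has no multiplicative inverse; Example 1 gives it a self-loop instead.
        third = 0
    else:
        third = pow(a, -1, p)  # modular inverse (requires Python 3.8+)
    return [(a + 1) % p, (a - 1) % p, third]


def edge_expansion(p: int) -> float:
    """Brute-force min over cuts of E(S, V-S) / (3 * min(|S|, |V-S|))."""
    adj = [neighbors(a, p) for a in range(p)]
    best = float("inf")
    # Vertex p-1 is always kept outside S, so every cut is enumerated exactly once.
    for mask in range(1, 2 ** (p - 1)):
        in_s = [(mask >> v) & 1 for v in range(p)]
        size = sum(in_s)
        # Count edges leaving S; self-loops never cross a cut.
        crossing = sum(1 for a in range(p) if in_s[a] for b in adj[a] if not in_s[b])
        best = min(best, crossing / (3 * min(size, p - size)))
    return best


print(neighbors(2, 59))    # successor, predecessor, inverse of 2 modulo 59
print(edge_expansion(11))  # exact expansion of the 11-vertex graph
```

Listing a self-loop as the vertex itself keeps every adjacency list at length 3, which matches the convention in the example that self-loops count toward the degree.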

Constructions based on the zig-zag graph product, which we shall see next, are more complicated to describe, but much simpler to analyze.

We begin by describing a building block in the construction, which is also an independently interesting construction: a family of expanders with polylogarithmic degree, which have both a very simple description and a very simple analysis.

CS359G Lecture 3: Cheeger’s inequality

In which we prove the easy case of Cheeger’s inequality.

1. Expansion and The Second Eigenvalue

Let {G=(V,E)} be an undirected {d}-regular graph, {A} its adjacency matrix, {M = \frac 1d \cdot A} its normalized adjacency matrix, and {1=\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n} be the eigenvalues of {M}.

Recall that we defined the edge expansion of a cut {(S,V-S)} of the vertices of {G} as

\displaystyle  h(S) := \frac {E(S,V-S)}{d \cdot \min \{ |S|, |V-S| \} }

and that the edge expansion of {G} is {h(G) := \min_{S\subseteq V} h(S)}.

We also defined the related notion of the sparsity of a cut {(S,V-S)} as

\displaystyle  \phi(S) := \frac {E(S,V-S)}{ \frac dn \cdot |S| \cdot |V-S| }

and {\phi(G) := \min_S \phi(S)}; the sparsest cut problem is to find a cut of minimal sparsity.

Recall also that in the last lecture we proved that {\lambda_2 = 1} if and only if {G} is disconnected. This is equivalent to saying that {1-\lambda_2 = 0 } if and only if {h(G)=0}. In this lecture and the next we will see that this statement admits an approximate version that, qualitatively, says that {1-\lambda_2} is small if and only if {h(G)} is small. Quantitatively, we have

Theorem 1 (Cheeger’s Inequalities)

\displaystyle  \frac{1-\lambda_2}2 \leq h(G) \leq \sqrt{2 \cdot (1-\lambda_2) } \ \ \ \ \ (1)
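
The definitions above can be checked numerically on a small example. The sketch below is an illustration, not part of the lecture: it computes {h(S)}, {\phi(S)} and {h(G)} by brute force for the 8-cycle, and compares {h(G)} with the two sides of (1). It relies on the standard fact that the normalized adjacency matrix of the {n}-cycle has eigenvalues {\cos (2\pi k/n)}, so no eigenvalue solver is needed. All function names are hypothetical.

```python
import math
from itertools import combinations


def cut_size(edges, s):
    """Number of edges with exactly one endpoint in the vertex set s."""
    return sum(1 for u, v in edges if (u in s) != (v in s))


def expansion(edges, n, d, s):
    """h(S) = E(S, V-S) / (d * min(|S|, |V-S|))."""
    return cut_size(edges, s) / (d * min(len(s), n - len(s)))


def sparsity(edges, n, d, s):
    """phi(S) = E(S, V-S) / ((d/n) * |S| * |V-S|)."""
    return cut_size(edges, s) / ((d / n) * len(s) * (n - len(s)))


def graph_expansion(edges, n, d):
    """h(G) by brute force over all nonempty S with |S| <= n/2 (exponential time)."""
    return min(
        expansion(edges, n, d, set(s))
        for k in range(1, n // 2 + 1)
        for s in combinations(range(n), k)
    )


n, d = 8, 2
cycle = [(i, (i + 1) % n) for i in range(n)]  # the 8-cycle, which is 2-regular
h = graph_expansion(cycle, n, d)
lambda_2 = math.cos(2 * math.pi / n)  # second largest eigenvalue of M for the cycle
gap = 1 - lambda_2
print(gap / 2, h, math.sqrt(2 * gap))  # lower bound, h(G), upper bound
```

For the 8-cycle the best cut splits the cycle into two paths of four vertices each, cutting two edges, so {h(G) = 2/(2\cdot 4) = 1/4}, which lies between the two bounds. Restricting the search to {|S| \leq n/2} loses nothing, because {h(S) = h(V-S)}.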

CS359G Lecture 1: Overview

In which we describe what this course is about.

1. Overview

This class is about the following topics:

  1. Approximation algorithms for graph partitioning problems. We will study approximation algorithms for the sparsest cut problem, in which one wants to find a cut (a partition into two sets) of the vertex set of a given graph so that a minimal number of edges cross the cut compared to the number of pairs of vertices that are disconnected by the removal of such edges.

    This problem is related to estimating the edge expansion of a graph and to finding balanced separators, that is, ways to disconnect a constant fraction of the pairs of vertices in a graph after removing a minimal number of edges.

    Finding balanced separators and sparse cuts arises in clustering problems, in which the presence of an edge denotes a relation of similarity, and one wants to partition vertices into few clusters so that, for the most part, vertices in the same cluster are similar and vertices in different clusters are not. For example, sparse cut approximation algorithms are used for image segmentation, by reducing the image segmentation problem to a graph clustering problem in which the vertices are the pixels of the image and the (weights of the) edges represent similarities between nearby pixels.

    Balanced separators are also useful in the design of divide-and-conquer algorithms for graph problems, in which one finds a small set of edges that disconnects the graph, recursively solves the problem on the connected components, and then patches the partial solutions and the edges of the cut, via either exact methods (usually dynamic programming) or approximate heuristics. The sparsity of the cut determines the running time of the exact algorithms and the quality of approximation of the heuristic ones.

    We will study three approximation algorithms:

    1. The Spectral Partitioning Algorithm, based on linear algebra;
    2. The Leighton-Rao Algorithm, based on a linear programming relaxation;
    3. The Arora-Rao-Vazirani Algorithm, based on a semidefinite programming relaxation.

    The three approaches are related, because the continuous optimization problem that underlies the Spectral Partitioning algorithm is a relaxation of the ARV semidefinite programming relaxation, and so is the Leighton-Rao relaxation. Rounding the Leighton-Rao and the Arora-Rao-Vazirani relaxations raises interesting problems in metric geometry, some of which are still open.

  2. Explicit Constructions of Bounded-Degree Expanders. Expander graphs are graphs with very strong connectivity and “pseudorandomness” properties. Constructions of constant-degree expanders are useful in a variety of applications, from the design of data structures, to the derandomization of algorithms, from efficient cryptographic constructions to being building blocks of more complex quasirandom objects.

    There are two families of approaches to the explicit (efficient) construction of bounded-degree expanders. One is via algebraic constructions, typically ones in which the expander is constructed as a Cayley graph of a finite group. Usually these constructions are easy to describe but rather difficult to analyze. The study of such expanders, and of the related group properties, has become a very active research program, involving mostly ergodic theorists and number theorists. There are also combinatorial constructions, which are somewhat more complicated to describe but considerably simpler to analyze.

  3. Bounding the Mixing Time of Random Walks and Approximate Counting and Sampling. If one takes a random walk in a regular graph that is connected and not bipartite, then, regardless of the starting vertex, the distribution of the {t}-th step of the walk is close to the uniform distribution over the vertices, provided that {t} is large enough. It is always sufficient for {t} to be quadratic in the number of vertices; in some graphs, however, the distribution is near-uniform even when {t} is just poly-logarithmic. Among other uses, the study of the “mixing time” (the time that it takes to reach the uniform distribution) of random walks helps in analyzing the convergence time of certain randomized algorithms.

    The design of approximation algorithms for combinatorial counting problems, in which one wants to count the number of solutions to a given NP-type problem, can be reduced to the design of approximately uniform sampling in which one wants to approximately sample from the set of such solutions. For example, the task of approximately counting the number of perfect matchings can be reduced to the task of sampling almost uniformly from the set of perfect matchings of a given graph. One can design approximate sampling algorithms by starting from an arbitrary solution and then making a series of random local changes. The behavior of the algorithm then corresponds to performing a random walk in the graph that has a vertex for every possible solution and an edge for each local change that the algorithm can choose to make. Although the graph can have an exponential number of vertices in the size of the problem that we want to solve, it is possible for the approximate sampling algorithm to run in polynomial time, provided that a random walk in the graph converges to uniform in time poly-logarithmic in its size.

    The study of the mixing time of random walks in graphs is thus a main analysis tool to bound the running time of approximate sampling algorithms (and, via reductions, of approximate counting algorithms).
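
The convergence of a random walk to the uniform distribution, described in the third topic above, can be observed directly on small graphs. The following sketch is an illustration under stated assumptions: "close to uniform" is measured by total variation distance with an arbitrary threshold of 0.01, and the helper names (`cycle`, `complete`, `step`, `mixing_time`) are made up for this example. It evolves the exact distribution of the walk rather than sampling it.

```python
def cycle(n):
    """Adjacency lists of the n-cycle (connected; not bipartite when n is odd)."""
    return [[(i - 1) % n, (i + 1) % n] for i in range(n)]


def complete(n):
    """Adjacency lists of the complete graph on n vertices."""
    return [[v for v in range(n) if v != u] for u in range(n)]


def step(dist, adj):
    """One step of the walk: each vertex splits its probability evenly among its neighbors."""
    new = [0.0] * len(dist)
    for u, mass in enumerate(dist):
        for v in adj[u]:
            new[v] += mass / len(adj[u])
    return new


def distance_from_uniform(dist):
    """Total variation distance between dist and the uniform distribution."""
    n = len(dist)
    return 0.5 * sum(abs(x - 1.0 / n) for x in dist)


def mixing_time(adj, eps=0.01, start=0, max_steps=100_000):
    """Smallest t such that the t-th step of the walk from `start` is eps-close to uniform."""
    dist = [0.0] * len(adj)
    dist[start] = 1.0
    for t in range(max_steps + 1):
        if distance_from_uniform(dist) <= eps:
            return t
        dist = step(dist, adj)
    raise RuntimeError("walk did not mix within max_steps (is the graph bipartite?)")


for name, graph in [("K_31", complete(31)), ("C_15", cycle(15)), ("C_31", cycle(31))]:
    print(name, mixing_time(graph))
```

The complete graph mixes within a couple of steps, while the odd cycles take far longer, and roughly doubling the length of the cycle makes the mixing time grow much faster than linearly. This matches the contrast drawn above between graphs that need a number of steps polynomial in the number of vertices and graphs that mix almost immediately.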

These three research programs are pursued by largely disjoint communities, but share the same mathematical core.

Graph Conductance and Gossip Spreading

Flavio Chierichetti, Silvio Lattanzi and Alessandro Panconesi, of the Sapienza University of Rome, have shown tight connections between the time it takes for a rumor to spread in a social network and the conductance of the network, in two recent papers in the past SODA and the next STOC. (The SODA paper is also notable for its “no thanks” section at the end.)

Flavio and Alessandro were recently invited on Italian national television to talk about gossip spreading in social networks, and the video (in Italian, no subtitles) is notable for Flavio writing, around 2:30, the formula for graph conductance on the “blackboard” provided by the producers.