Hello, and Welcome to Fun with Expanders

Long in the planning, my online course on graph partitioning algorithms, expanders, and random walks will start next month.

The course page is up for people to sign up. A friend of mine has compared my camera presence to Sheldon Cooper’s in “Fun with Flags,” which is sadly apt, but hopefully the material will speak for itself.

Meanwhile, I will be posting about some material that I have finally understood for the first time: the analysis of the Arora-Rao-Vazirani approximation algorithm, the Cheeger inequality in manifolds, and the use of the Selberg “3/16 theorem” to analyze expander constructions.

If you are not a fan of recorded performances, there will be a live show in Princeton at the end of June.

CS359G Lecture 1: Overview

In which we describe what this course is about.

1. Overview

This class is about the following topics:

  1. Approximation algorithms for graph partitioning problems. We will study approximation algorithms for the sparsest cut problem, in which one wants to find a cut (a partition into two sets) of the vertex set of a given graph so as to minimize the ratio between the number of edges that cross the cut and the number of pairs of vertices that are disconnected by the removal of such edges.

    This problem is related to estimating the edge expansion of a graph and to finding balanced separators, that is, ways to disconnect a constant fraction of the pairs of vertices in a graph after removing a minimal number of edges.

    Finding balanced separators and sparse cuts arises in clustering problems, in which the presence of an edge denotes a relation of similarity, and one wants to partition vertices into few clusters so that, for the most part, vertices in the same cluster are similar and vertices in different clusters are not. For example, sparse cut approximation algorithms are used for image segmentation, by reducing the image segmentation problem to a graph clustering problem in which the vertices are the pixels of the image and the (weights of the) edges represent similarities between nearby pixels.

    Balanced separators are also useful in the design of divide-and-conquer algorithms for graph problems, in which one finds a small set of edges that disconnects the graph, recursively solves the problem on the connected components, and then patches the partial solutions and the edges of the cut, via either exact methods (usually dynamic programming) or approximate heuristics. The sparsity of the cut determines the running time of the exact algorithms and the quality of approximation of the heuristic ones.

    We will study three approximation algorithms:

    1. The Spectral Partitioning Algorithm, based on linear algebra;
    2. The Leighton-Rao Algorithm, based on a linear programming relaxation;
    3. The Arora-Rao-Vazirani Algorithm, based on a semidefinite programming relaxation.

    The three approaches are related, because the continuous optimization problem that underlies the Spectral Partitioning Algorithm is a relaxation of the ARV semidefinite programming relaxation, and so is the Leighton-Rao relaxation. Rounding the Leighton-Rao and the Arora-Rao-Vazirani relaxations raises interesting problems in metric geometry, some of which are still open.

  2. Explicit Constructions of Bounded-Degree Expanders. Expander graphs are graphs with very strong connectivity and “pseudorandomness” properties. Constructions of constant-degree expanders are useful in a variety of applications, from the design of data structures to the derandomization of algorithms, and from efficient cryptographic constructions to their use as building blocks of more complex quasirandom objects.

    There are two families of approaches to the explicit (efficient) construction of bounded-degree expanders. One is via algebraic constructions, typically ones in which the expander is constructed as a Cayley graph of a finite group. Usually these constructions are easy to describe but rather difficult to analyze. The study of such expanders, and of the related group properties, has become a very active research program, involving mostly ergodic theorists and number theorists. The other is via combinatorial constructions, which are somewhat more complicated to describe but considerably simpler to analyze.

  3. Bounding the Mixing Time of Random Walks and Approximate Counting and Sampling. If one takes a random walk in a regular graph that is connected and not bipartite, then, regardless of the starting vertex, the distribution of the t-th step of the walk is close to the uniform distribution over the vertices, provided that t is large enough. It is always sufficient for t to be quadratic in the number of vertices; in some graphs, however, the distribution is near-uniform even when t is just poly-logarithmic. The study of the “mixing time” of random walks (the time that it takes to get close to the uniform distribution) has applications, among others, to analyzing the convergence time of certain randomized algorithms.

    The design of approximation algorithms for combinatorial counting problems, in which one wants to count the number of solutions to a given NP-type problem, can be reduced to the design of approximately uniform sampling algorithms, in which one wants to approximately sample from the set of such solutions. For example, the task of approximately counting the number of perfect matchings can be reduced to the task of sampling almost uniformly from the set of perfect matchings of a given graph. One can design approximate sampling algorithms by starting from an arbitrary solution and then making a series of random local changes. The behavior of the algorithm then corresponds to performing a random walk in the graph that has a vertex for every possible solution and an edge for each local change that the algorithm can choose to make. Although the graph can have an exponential number of vertices in the size of the problem that we want to solve, it is possible for the approximate sampling algorithm to run in polynomial time, provided that a random walk in the graph converges to uniform in time poly-logarithmic in its size.

    The study of the mixing time of random walks in graphs is thus a main analysis tool to bound the running time of approximate sampling algorithms (and, via reductions, of approximate counting algorithms).
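To make the contrast between slow and fast mixing in item 3 concrete, here is a small sketch in plain Python. It is only an illustration, not part of the course material: the helper names (`walk_step`, `tv_from_uniform`, `steps_to_mix`), the choice of total variation distance, and the threshold 0.01 are all assumptions made for the example. It compares an odd cycle (regular, connected, not bipartite, but slow to mix) with a complete graph (which mixes almost immediately).

```python
def walk_step(adj, dist):
    """One step of the simple random walk: each vertex splits its
    probability mass equally among its neighbors."""
    new = [0.0] * len(adj)
    for u, nbrs in enumerate(adj):
        share = dist[u] / len(nbrs)
        for v in nbrs:
            new[v] += share
    return new


def tv_from_uniform(dist):
    """Total variation distance between dist and the uniform distribution."""
    n = len(dist)
    return 0.5 * sum(abs(p - 1.0 / n) for p in dist)


def steps_to_mix(adj, start=0, eps=0.01, max_steps=100_000):
    """Smallest t such that the walk started at `start` is within eps of
    uniform after t steps, or None if max_steps is exceeded."""
    dist = [0.0] * len(adj)
    dist[start] = 1.0
    for t in range(max_steps + 1):
        if tv_from_uniform(dist) <= eps:
            return t
        dist = walk_step(adj, dist)
    return None


def cycle(n):
    """Adjacency lists of the n-cycle (not bipartite when n is odd)."""
    return [[(i - 1) % n, (i + 1) % n] for i in range(n)]


def complete(n):
    """Adjacency lists of the complete graph on n vertices."""
    return [[j for j in range(n) if j != i] for i in range(n)]


n = 51  # odd, so the cycle is not bipartite
slow = steps_to_mix(cycle(n))
fast = steps_to_mix(complete(n))
print("cycle:", slow, "steps; complete graph:", fast, "steps")
```

On the cycle the number of steps grows roughly quadratically with the number of vertices, while on the complete graph a couple of steps suffice. The graphs that arise in sampling algorithms are exponentially large, so only the second kind of behavior is useful there.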

These three research programs are pursued by largely disjoint communities, but share the same mathematical core.
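For readers who prefer to see the sparsest cut objective from item 1 written out, here is a brute-force sketch in plain Python. It assumes the unnormalized definition given above (edges crossing the cut divided by the number of pairs of vertices separated by the cut); the function names and the example graph, two triangles joined by a single edge, are made up for illustration. It enumerates all cuts, so it is exponential in the number of vertices and only meant for tiny graphs. None of the three approximation algorithms listed above work this way.

```python
from itertools import combinations


def cut_sparsity(edges, side, n):
    """Edges crossing (side, rest) divided by the number of vertex pairs
    that the cut separates, which is |side| * |rest|."""
    crossing = sum(1 for u, v in edges if (u in side) != (v in side))
    return crossing / (len(side) * (n - len(side)))


def sparsest_cut_brute_force(n, edges):
    """Try every cut with a nonempty smaller side; return the best one found."""
    best_side, best_value = None, float("inf")
    for size in range(1, n // 2 + 1):
        for side in combinations(range(n), size):
            value = cut_sparsity(edges, set(side), n)
            if value < best_value:
                best_side, best_value = set(side), value
    return best_side, best_value


# Two triangles {0,1,2} and {3,4,5} joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
side, value = sparsest_cut_brute_force(6, edges)
print(side, value)
```

In this example the best cut separates the two triangles: one edge crosses it and it disconnects 3 × 3 = 9 pairs of vertices, for a sparsity of 1/9.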
