# Buser Inequalities in Graphs

As life is tentatively returning to normal, I would like to once again post technical material here. Before returning to online optimization, I would like to start with something from 2015 that we never wrote up properly, that has to do with graph curvature and with Buser inequalities in graphs.

The starting point is the Cheeger inequalities on graphs.

If ${G= (V,E)}$ is a ${d}$-regular graph, and ${A}$ is its adjacency matrix, then we define the normalized Laplacian matrix of ${G}$ as ${L: = I - A/d}$, and we call ${\lambda_2}$ the second smallest (counting multiplicities) eigenvalue of ${L}$. A basic fact in the spectral theory of graphs is that this eigenvalue has a variational characterization as the solution of the following optimization problem:

$\displaystyle \lambda_2 = \min_{x \in {\mathbb R}^V : \langle x, {\bf 1} \rangle=0} \ \frac { x^T L x}{||x||^2} = \min_{x \in {\mathbb R}^V : \langle x,{\bf 1} \rangle=0} \ \frac { \sum_{(u,v) \in E} (x_u - x_v)^2}{d\sum_v x_v^2 }$
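As a quick sanity check of this characterization, here is a toy computation on the ${n}$-cycle (a 2-regular graph); the example and helper code are mine, not from the post, and assume numpy:

```python
import numpy as np

# Normalized Laplacian L = I - A/d of the n-cycle (a 2-regular graph).
n = 8
A = np.zeros((n, n))
for v in range(n):
    A[v, (v + 1) % n] = A[v, (v - 1) % n] = 1
d = 2
L = np.eye(n) - A / d

# Eigenvalues of L, sorted; lambda_2 is the second smallest.
lambda2 = np.sort(np.linalg.eigvalsh(L))[1]

# Variational characterization: minimize x^T L x / ||x||^2 over x ⟂ 1.
# Random search over the orthogonal complement never beats lambda_2.
rng = np.random.default_rng(0)
best = np.inf
for _ in range(2000):
    x = rng.standard_normal(n)
    x -= x.mean()          # project onto the space orthogonal to the all-ones vector
    best = min(best, x @ L @ x / (x @ x))

print(lambda2)             # 1 - cos(2*pi/8) ≈ 0.2929
print(best >= lambda2 - 1e-9)
```

For the cycle the eigenvalues of ${A}$ are ${2\cos(2\pi k/n)}$, so ${\lambda_2 = 1 - \cos(2\pi/n)}$, which the computation confirms.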

The normalized edge expansion of ${G}$ is defined as

$\displaystyle \phi(G) = \min_{S : |S| \leq |V|/2} \ \frac{ cut(S) }{d|S|}$

where ${cut(S)}$ denotes the number of edges with one endpoint in ${S}$ and one endpoint outside ${S}$. We have talked about these quantities several times, so we will just jump to the fact that the following Cheeger inequalities hold:

$\displaystyle \frac {\lambda_2 }2 \leq \phi(G) \leq \sqrt {2 \lambda_2}$
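Both inequalities can be verified by brute force on a small graph. The following sketch (my own toy example, assuming numpy) does this on the 3-dimensional hypercube, where the easy direction happens to be tight:

```python
import numpy as np
from itertools import combinations

# Brute-force check of the Cheeger inequalities on the 3-dimensional
# hypercube Q_3: 3-regular on 8 vertices, u ~ v iff they differ in one bit.
n, d = 8, 3
A = np.zeros((n, n))
for u in range(n):
    for b in range(3):
        A[u, u ^ (1 << b)] = 1
L = np.eye(n) - A / d
lambda2 = np.sort(np.linalg.eigvalsh(L))[1]

edges = [(u, v) for u in range(n) for v in range(u + 1, n) if A[u, v]]

def cut(S):
    # number of edges with exactly one endpoint in S
    return sum(1 for u, v in edges if (u in S) != (v in S))

# phi(G): minimize cut(S) / (d |S|) over all S with |S| <= n/2
phi = min(cut(set(S)) / (d * len(S))
          for k in range(1, n // 2 + 1)
          for S in combinations(range(n), k))

print(lambda2, phi)   # 2/3 and 1/3: lambda_2 / 2 = phi here
print(lambda2 / 2 <= phi + 1e-9 and phi <= np.sqrt(2 * lambda2) + 1e-9)
```

On ${Q_3}$ the optimal cut splits the cube into two facets (4 cut edges, ${\phi = 1/3}$) and ${\lambda_2 = 2/3}$, so the lower bound ${\lambda_2/2 \leq \phi}$ holds with equality.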

Those inequalities are called Cheeger inequalities because the upper bound is the discrete analog of a result of Cheeger concerning Riemannian manifolds. There were previous posts about this here and here, and for our current purposes it will be enough to recall that, roughly speaking, a Riemannian manifold defines a space on which we can define real-valued and complex-valued functions, and such functions can be integrated and differentiated. Subsets ${S}$ of a Riemannian manifold have, if they are measurable, a well defined volume, and their boundary ${\partial S}$ has a well defined lower-dimensional volume; it is possible to define a Cheeger constant of a manifold in a way that is syntactically analogous to the definition of edge expansion:

$\displaystyle \phi (M) = \min_{S : vol(S) \leq vol(M)/2} \ \ \ \frac{vol(\partial S)}{vol(S)}$

It is also possible to define a Laplacian operator ${L}$ on smooth functions ${f: M \rightarrow {\mathbb C}}$, and if ${\lambda_2}$ is the second smallest eigenvalue of ${L}$ we have the characterization

$\displaystyle \lambda_2 = \min_{f : M \rightarrow {\mathbb R} \ \ {\rm smooth}, \int_M f = 0} \ \ \frac{ \langle Lf , f\rangle}{\langle f,f\rangle} = \min_{f : M \rightarrow {\mathbb R} \ \ {\rm smooth}, \int_M f = 0} \ \ \frac{ \int_M ||\nabla f||^2}{\int_M f^2}$

and Cheeger proved

$\displaystyle \phi(M) \leq 2 \sqrt{\lambda_2 }$

which is syntactically almost the same inequality and, actually, has a syntactically very similar proof (the extra ${\sqrt 2}$ factor comes from the fact that things are normalized slightly differently).

The “dictionary” between finite graphs and compact manifolds is that vertices correspond to points, degree corresponds to dimensionality, adjacent vertices ${v}$ and ${w}$ correspond to “infinitesimally close” points along one dimension, the set of edges in the cut ${(S,V-S)}$ corresponds to the boundary ${\partial S}$ of a set ${S}$, “volume” corresponds to number of edges (both when we think of the volume of the boundary and of the volume of a set), vectors ${x\in {\mathbb R}^V}$ correspond to smooth functions ${f: M \rightarrow {\mathbb R}}$, and the collection of values of ${x}$ at the neighbors of a vertex ${u}$, that is ${\{ x_v : (u,v) \in E \}}$, corresponds to the gradient ${\nabla f(u)}$ of a function at the point ${u}$.

Having made this long premise, the point of this post is that the inequality

$\displaystyle \phi(G) \geq \frac {\lambda_2}2 \ ,$

which is the easy direction to show in the graph case, does not hold for manifolds.

The easy proof for graphs is that if ${(S,V-S)}$ is the cut that realizes the minimum edge expansion, we can take ${x = {\bf 1}_S}$, or rather the projection of the above vector on the space orthogonal to ${{\bf 1}}$, and a two-line calculation gives the inequality.
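Spelled out (writing ${n = |V|}$; this is my reconstruction of the standard calculation, not a quote from any particular source), the two lines are:

```latex
% Let S realize phi(G), with |S| <= n/2, and set x = 1_S - (|S|/n) 1,
% the projection of the indicator of S orthogonal to the all-ones vector.
\begin{align*}
\sum_{(u,v)\in E} (x_u - x_v)^2 &= cut(S)
  && \text{(the shift by $|S|/n$ cancels in the differences)}\\
d\,\|x\|^2 &= d\left( |S|\Big(1-\tfrac{|S|}{n}\Big)^2
  + (n-|S|)\tfrac{|S|^2}{n^2} \right)
  = d\,\frac{|S|(n-|S|)}{n} \ \geq\ \frac{d\,|S|}{2}
  && \text{(using $|S| \leq n/2$)}
\end{align*}
```

so that ${\lambda_2 \leq \frac{cut(S)}{d |S|(n-|S|)/n} \leq \frac{2 \, cut(S)}{d|S|} = 2 \phi(G)}$.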

If, however, ${S}$ is a subset of points of the manifold that realizes the Cheeger constant, we cannot define ${f(\cdot)}$ to be the indicator of ${S}$, because the function would not be smooth and its gradient would be undefined.

We could think of rescuing the proof by taking a smoothed version of the indicator of ${S}$, something that goes to 1 on one side and to 0 on the other side of the cut, as a function of the distance from the boundary. The “quadratic form” of such a function, however, would depend not just on the area of the boundary of ${S}$, but also on how quickly the volume of the subset of the manifold at distance ${\delta}$ from the boundary of ${S}$ grows as a function of ${\delta}$.

If the Ricci curvature of the manifold is negative, this volume grows very quickly, and the quadratic form of the smoothed function is very bad. The graph analog of this would be a graph made of two expanders joined by a bridge edge. The problem is that the analogy between graphs and manifolds breaks down here: “infinitesimal distance” in the manifold corresponds to any ${o(\log n)}$ distance in the graph, and although the bridge edge is a sparse cut in the graph, it is crossed by a lot of paths of length ${o(\log n)}$. At least, this is the best intuition that I have been able to build about what breaks down in the analogy between graphs and manifolds.

If the Ricci curvature of the manifold is non-negative and the dimension is bounded, there is however an inequality in manifolds that goes in the direction of the “easy graph Cheeger inequality,” and it has the form

$\displaystyle \lambda_2 \leq 10 \phi^2 (M)$

This is the Buser inequality for manifolds of non-negative Ricci curvature, and note how strong it is: combined with the Cheeger inequality above, it says that ${\phi(M)}$ and ${\sqrt{\lambda_2}}$ approximate each other up to the constant factor ${2\sqrt{10}}$.

If the Ricci curvature ${R}$ is negative, and ${n}$ is the dimension of the manifold, then one has the inequality

$\displaystyle \lambda_2 \leq 2 \sqrt{|R| \cdot (n-1)} \cdot \phi(M) + 10 \phi^2 (M)$

Given the importance that curvature has in the study of manifolds, there has been considerable interest in defining a notion of curvature for graphs. For example, curvature relates to how quickly balls around a point grow with the radius of the ball, a concept that is of great importance in graphs as well; curvature is a locally defined concept that has global consequences, and in graph property testing one is precisely interested in understanding global properties based on local ones; and a Buser-type inequality in graphs would be very interesting because it would provide a class of graphs for which Fiedler’s algorithm provides a constant-factor approximation for sparsest cut.

There have been multiple attempts at defining notions of curvature for graphs, the main ones being Ollivier curvature and Bakry-Emery curvature. Each captures some but not all of the useful properties of Ricci curvature in manifolds. My (admittedly, poorly informed) intuition is that curvature in manifolds is defined “locally” but, as we mentioned above, “local” in graphs can have multiple meanings depending on the distance scale, and it is difficult to come up with a clean and usable definition that talks about multiple distance scales.

Specifically to the point of the Buser inequality, Bauer et al. and Klartag et al. prove Buser inequalities for graphs with respect to the Bakry-Emery definition. Because Cayley graphs of Abelian groups happen to have curvature 0 according to this definition, their work gives the following result:

Theorem 1 (Buser inequality for Cayley graphs of Abelian groups) If ${G=(V,E)}$ is a ${d}$-regular Cayley graph of an Abelian group, ${\lambda_2}$ is the second smallest normalized Laplacian eigenvalue of ${G}$ and ${\phi(G)}$ is the normalized edge expansion of ${G}$, we have

$\displaystyle \lambda_2 \leq O(d) \cdot \phi^2 (G)$

(Klartag et al. state the result as above; Bauer et al. state it only for certain groups, but I think that their proof applies to all Abelian groups)
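A minimal numeric illustration (my own, not from the papers): the ${n}$-cycle is the Cayley graph of ${{\mathbb Z}_n}$ with generators ${\{+1,-1\}}$, a 2-regular Abelian Cayley graph, with known values ${\lambda_2 = 1-\cos(2\pi/n)}$ and ${\phi = 2/n}$ (cut into two arcs of ${n/2}$ vertices: 2 cut edges, ${d|S| = n}$).

```python
import numpy as np

# Sanity check of the Buser-type bound lambda_2 <= O(d) * phi^2
# on the cycle C_n, the Cayley graph of Z_n with generators {+1, -1}.
d = 2
ratios = []
for n in [16, 64, 256, 1024]:
    lambda2 = 1 - np.cos(2 * np.pi / n)   # known second eigenvalue of C_n
    phi = 2 / n                            # known edge expansion of C_n
    ratios.append(lambda2 / (d * phi ** 2))

# The ratio lambda_2 / (d * phi^2) stays bounded (it tends to pi^2/4 ≈ 2.47),
# consistent with the theorem.
print(ratios)
```

Note that ${\lambda_2 \approx 2\pi^2/n^2}$ here while ${\phi \approx 2/n}$, so the two sides of the Cheeger inequality are quadratically far apart, and it is really the Buser-type bound that is doing the work.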

In particular, the above statement implies that if we have a ${d}$-regular Abelian Cayley graph then ${\sqrt {\lambda_2}}$ provides an approximation of the sparsest cut up to a multiplicative error ${O(\sqrt d)}$ and that Fiedler’s algorithm is an ${O(\sqrt d)}$ approximate algorithm for sparsest cut in Abelian Cayley graphs.
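For concreteness, here is a sketch of Fiedler’s algorithm (the spectral sweep cut): sort the vertices by their value in the eigenvector of ${\lambda_2}$ and return the best prefix cut. The function name and the cycle example are mine; the ${\sqrt{2\lambda_2}}$ guarantee is the one from the proof of Cheeger’s inequality.

```python
import numpy as np

# A minimal sketch of Fiedler's algorithm (spectral sweep cut).
# The analysis of Cheeger's inequality guarantees the returned cut
# has expansion at most sqrt(2 * lambda_2).
def fiedler_sweep(A, d):
    n = A.shape[0]
    L = np.eye(n) - A / d
    eigvals, eigvecs = np.linalg.eigh(L)
    x = eigvecs[:, 1]                 # eigenvector of lambda_2
    order = np.argsort(x)             # sort vertices by eigenvector value
    best_phi, best_S = np.inf, None
    in_S = np.zeros(n, dtype=bool)
    for k in range(n // 2):           # sweep over prefixes of size <= n/2
        in_S[order[k]] = True
        cut = sum(A[u, v] for u in range(n) for v in range(n)
                  if in_S[u] and not in_S[v])
        phi = cut / (d * (k + 1))
        if phi < best_phi:
            best_phi, best_S = phi, set(order[: k + 1])
    return best_phi, best_S

# Run it on the 20-cycle (Abelian Cayley graph, d = 2).
n, d = 20, 2
A = np.zeros((n, n))
for v in range(n):
    A[v, (v + 1) % n] = A[v, (v - 1) % n] = 1
phi_sweep, S = fiedler_sweep(A, d)
lambda2 = np.sort(np.linalg.eigvalsh(np.eye(n) - A / d))[1]
print(phi_sweep <= np.sqrt(2 * lambda2) + 1e-9)   # Cheeger guarantee holds
```

On the cycle the Fiedler eigenvector is a shifted cosine, so the sweep recovers an arc cut, and the theorem above says that on any Abelian Cayley graph the sweep cut is within ${O(\sqrt d)}$ of optimal.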

At some point in 2015 (or maybe 2014, during the Simons program on spectral graph theory?), I read the paper of Bauer et al. and wondered about a few questions. First of all, is there a way to prove the above theorem without using any notion of curvature?

Secondly, the theorem implies that Fiedler’s algorithm has a good approximation ratio, but it does so in a very unusual way: we have ${\lambda_2}$, which is a continuous relaxation of the sparsest cut problem; we have Fiedler’s algorithm which, as analyzed via Cheeger’s inequality, provides a somewhat bad rounding of that relaxation; and, finally, the Buser inequality tells us that the relaxation has a very poor integrality gap on all Abelian Cayley graphs, so that even though the rounding seems bad it is actually good compared to the integral optimum. There should be an underlying reason for all this, meaning some other relaxation of sparsest cut whose integrality gap is genuinely at most ${O(\sqrt d)}$.

I mentioned these questions to Shayan Oveis Gharan, and we were able to prove that if ${ARV(G)}$ is the value of the Arora-Rao-Vazirani (or Goemans-Linial) semidefinite programming relaxation of sparsest cut, then, for all ${d}$-regular Abelian Cayley graphs, we have

$\displaystyle \lambda_2 \leq O(d) \cdot (ARV(G))^2 \leq O(d) \cdot \phi^2 (G)$

which proves the above theorem and, together with the Cheeger inequality ${\lambda_2 \geq \phi^2(G)/2}$, also shows

$\displaystyle \phi (G) \leq O(\sqrt d) \cdot ARV(G)$

that is, the ARV relaxation has integrality gap at most ${O(\sqrt d)}$ on Abelian Cayley graphs. I will discuss the proof (which is a sort of “sum-of-squares version” of the proof of Bauer et al.) in the next post.