In this post we will construct a “spectral sparsifier” of a given hypergraph in a way that is similar to how Spielman and Srivastava construct spectral graph sparsifiers. We will assign a probability to each hyperedge, we will sample each hyperedge with probability , and we will weigh it by if selected. We will then bound the “spectral error” of this construction in terms of the supremum of a Gaussian process using Talagrand’s comparison inequality and finally bound the supremum of the Gaussian process (which will involve matrices) using matrix Chernoff bounds. This is joint work with Nikhil Bansal and Ola Svensson.

**1. Hypergraph Sparsifiers **

An (undirected) hypergraph is a collection of subsets , called hyperedges. A graph is the special case in which each set has cardinality 2. For simplicity we will talk only about -uniform hypergraphs, that is hypergraphs in which all hyperedges have the same cardinality (the arguments below would work also in the non-uniform case in which all hyperedges have cardinality *at most* ). Hyperedges may have weights.

If is a subset of vertices, a hyperedge is *cut* by if has non-empty intersection with both and . We call the number of hyperedges of cut by , or the total weight of such hyperedges if is a weighted hypergraph.

We can then generalize the notion of Benczur-Karger sparsification, and say that is a cut sparsifier of with error if and have the same vertex set and

Kogan and Krauthgamer show that every -uniform hypergraph admits a cut sparsifier of error with only weighted hyperedges. They are able to extend the argument of Benczur and Karger to hypergraphs, including arguing that there are few sparse cuts. They assign a probability to each hyperedge, sample it with that probability, weighing it if selected, and then use a union bound and Chernoff bounds to argue that all cuts are preserved.

Anand Louis has introduced a notion of hypergraph Laplacian in the following way. The Laplacian quadratic form of a hypergraph is a function defined as

where is the weight of the hyperedge , or 1 if the hypergraph is unweighted.

This definition has the motivation that it recovers the Laplacian quadratic form in the case of graphs, and that if is the indicator of a set then . Furthermore, one can define “eigenvalues” and “eigenvectors” of this hypergraph Laplacian and recover a Cheeger inequality and even higher-order Cheeger inequalities.

So it seems interesting to consider the following notion of sparsification: a hypergraph is a spectral sparsifier of with error if and have the same vertex set and

where, as before, the convention is that .

Soma and Yoshida studied this question and gave a construction with hyperedges. In the rest of this post we will discuss the construction by Nikhil Bansal, Ola Svensson and me, which uses hyperedges.

**2. Choosing Probabilities **

Given a hypergraph , we can construct a multigraph by taking each hyperedge of and then constructing, in , a clique between the vertices of . Thus, in , the edge is repeated as many times (possibly, zero times) as the number of hyperedges of that contain both and . Another way to think about it is that the Laplacian of is given by

where the inner sum is over the unordered pairs . This graph relates in several interesting ways with . For example, if is a subset of vertices and is a hyperedge that is cut by , then between and of the edges of derived from are cut by . If is not cut by , then none of the edges derived from is cut by , so we have

and, with a bit more work, it is possible to prove similar bounds for the Laplacian quadratic forms

This means that if we sparsify , say with the Spielman-Srivastava construction, we obtain information about , up to multiplicative error . Now suppose that, as we sample edges of to sparsify it in the Spielman-Srivastava way, we will also pick hyperedges of (for example we pick if at least one of its corresponding edges in is picked), and we weigh them by the inverse of the probability of being selected. Then we may hope that if the Spielman-Srivastava sparsification of is tuned to achieve error , the hypergraph that we obtain will have error at most . Indeed, this is roughly what happens, and we will be able to prove it by showing that the error in the hypergraph sparsification is dominated by the error in the “Gaussian version” of Spielman-Srivastava described in the previous post.

So we are going to assign to each hyperedge a probability

where the factor will be chosen later to get the construction to work and is the effective resistance of the edge of .

In fact, as before, it will be helpful to have probabilities that are non-positive powers of two, so we will choose to be a power of two such that

We have

Another fact that will become useful later (it will save us a factor of the order of in the number of hyperedges in the construction) is that

**3. A Discrete Random Process **

Our construction of a hypergraph sparsifier will be to select each independently with probability and, if selected, weigh it by . Our goal is to find an upper bound in probability, or even in expectation, on

Recalling that , we will study

Because if we can show that the above quantity is at most , then, for every ,

as desired.

If we define

then we are interested in the supremum in of

where is a random variable that is equal to with probability and is equal to with probability .

As before, we will do the construction in rounds. If the smallest probability is , then we will have rounds. We start with and then, at round , we take all hyperedges such that and, independently for each such hyperedge, we either delete it or we double its weight (with probability 1/2 in each case). The final hypergraph is distributed like . We have the processes

where the random variables are Rademacher, the weights are either 0 or , and is the set of edges that are “active” in round , that is, the set of hyperedges such that .

For each hypergraph , we will also consider its associated graph , obtained by replacing each hyperedge with a clique. The Laplacian of is

We have the following lemma

Lemma 1For every outcome of the first rounds such that , there is a probability at least over the randomness of the -th round thatand that

This means that we can take and apply the above lemma inductively to argue that we have a high probability that , and so we get a hypergraph spectral sparsifier with error and hyperedges.

To prove the Lemma we will define the Gaussian process

where the are Gaussian. Notice the two differences between and : we replaced the Rademacher choices with Gaussian choices, and we replaced

with

Furthermore, we are doing a random choice for each pair within each hyperedge instead of just one random choice per hyperedge.

Fact 2The random processes and are -dominated by .

*Proof:* For every we have

and

To complete the proof, we argue that

To verify the above inequality, assume , otherwise the argument will be symmetric, and call the maximizer for and the maximizer for . Then

It remains to estimate the expected sup

and the diameter

where, recall, .

With the usual change of variable we have

where, as before, we use the notation , and .

By matrix Chernoff bounds for Gaussian sums of matrices,

Recall that if is a hyperedge that is active at round , then and , and we also have

so that we have

The last term of (1) is the spectral norm of , which is at most 2 by the assumption of the Lemma.

Collecting all the pieces, we have proved that

We also need to bound the diameter of , that is

under the usual change of basis, for every of length 1 we want to bound the square root of

where is either zero or and we can continue our chain of inequalities with

where we used again the assumption of the Lemma .

To prove the last part of the lemma, we have to prove that with high probability

We see that

as noted before, each is either zero or , so

and the final claim follows from matrix Chernoff bounds.

Now we can put everything together. Applying our lemma inductively, we can say that with high probability

and

provided that we choose at least at absolute constant times .

In particular, we can choose to be an absolute constant times and have

which is the same as

So, with high probability, is a spectral sparsifier of with error , and it has hyperedges

**4. Some Additional Thoughts **

The dependence on is definitely not optimal, particularly because when is order of we know from the work of Soma and Yoshida that we can do better. One place where we seem to lose is that, although we know

we are only able to show that

is -dominated by

rather than, as we could have hoped, -dominated. The difficulty is that the bound

is sometimes tight. In order to do better, it seems necessary to do something a bit differently from what we do here.

The effective resistance of an edge in a graph can be written as

with the convention that . So it seems reasonable that a good definition of effective resistance for an hyperedge in a hypergraph would be

One can argue that these “effective resistances” add up to , but perhaps they add up to ? If we sample according to those “effective resitances,” can we apply generic chaining directly to without having to rely on a Gaussian process on matrices, for which we have Chernoff bounds?

Apologies if I missed this, but is it clear why the matrix Chernoff bound approach that works for regular matrices fails for hypergraphs? I see that at one point you use the fact that certain matrices in question are rank-1 in the case of regular graphs, and this would not be true for the case of hypergraphs, but maybe there’s something else also?

Matrix Chernoff bounds allow you to bound the spectral norm of a random matrix.

After a change of basis, the condition that a randomly subsampled graph sparsifies another becomes the question of bounding the spectral norm of a random matrix, to which matrix Chernoff bounds apply (this is Spielman-Srivastava).

The condition that a randomly subsampled hypergraph sparsifies another is the question of how large the supremum of a certain quantity can be, and that supremum is not the spectral norm of a matrix. Talagrand’s comparison inequality allows us to say that the supremum of that quantity is upper bounded by the spectral norm of a certain random matrix, provided that certain condition holds. We verify the conditions and then use matrix Chernoff bounds to upper the spectral norm of the random matrix.