In the previous post, we saw that the *sparsest cut* of a $d$-regular graph $G=(V,E)$ with $n$ vertices (which is within a factor of 2 of the *edge expansion* of the graph) is defined as

$$\phi(G) \ := \ \min_{\emptyset \neq S \subsetneq V} \ \frac{E(S, V \setminus S)}{\frac{d}{n} \cdot |S| \cdot |V \setminus S|} \ \ \ \ (1)$$

and can also be written as

$$\phi(G) \ = \ \min_{x \in \{0,1\}^V \text{ non-constant}} \ \frac{\sum_{(i,j) \in E} |x_i - x_j|}{\frac{d}{n} \sum_{i < j} |x_i - x_j|}$$

while the *eigenvalue gap* $1 - \frac{\lambda_2}{d}$ of $G$, where $\lambda_2$ is the second largest eigenvalue of the adjacency matrix of $G$, can be written as

$$1 - \frac{\lambda_2}{d} \ = \ \min_{x \in {\mathbb R}^V \text{ non-constant}} \ \frac{\sum_{(i,j) \in E} |x_i - x_j|^2}{\frac{d}{n} \sum_{i < j} |x_i - x_j|^2} \ \ \ \ (2)$$

and so it can be seen as a *relaxation* of $\phi(G)$, considering that

$$|x_i - x_j| \ = \ |x_i - x_j|^2$$

when the $x_i$ are boolean.
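As a quick numerical sanity check of this relaxation (a sketch of mine in numpy, not part of the derivation above), we can compute both quantities on the 6-cycle, a 2-regular graph small enough that the sparsest cut can be found by brute force:

```python
import itertools
import numpy as np

# Example graph (my choice, not from the post): the 6-cycle, d-regular with d = 2.
n, d = 6, 2
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1

# Eigenvalue gap 1 - lambda_2/d, with lambda_2 the second largest
# eigenvalue of the adjacency matrix.
lam = np.sort(np.linalg.eigvalsh(A))[::-1]
gap = 1 - lam[1] / d

# Sparsest cut by brute force: minimize E(S, V-S) / ((d/n) |S| |V-S|)
# over all nonempty proper subsets S; the number of edges leaving S is
# the quadratic form of the Laplacian on the indicator vector of S.
L = np.diag(A.sum(axis=1)) - A
best = float("inf")
for r in range(1, n):
    for S in itertools.combinations(range(n), r):
        s = np.zeros(n)
        s[list(S)] = 1.0
        cut = s @ L @ s
        best = min(best, cut / ((d / n) * r * (n - r)))
phi = best

print(gap, phi)   # the eigenvalue gap lower-bounds the sparsest cut
```

On the 6-cycle the eigenvalue gap works out to $1/2$ while the sparsest cut is $2/3$ (achieved by cutting the cycle into two arcs of three vertices), consistent with the relaxation.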

Let us now take a broader look at efficiently computable continuous relaxations of sparsest cut, and see that the eigenvalue gap is a *semidefinite programming* relaxation of sparsest cut, that by adding the *triangle inequality* to it we derive the Arora-Rao-Vazirani relaxation, and that by removing the semidefinite constraint from the ARV relaxation we get the Leighton-Rao relaxation.

We first prove that (2) is equivalent to

$$\min_{m, \ x_1,\ldots,x_n \in {\mathbb R}^m \text{ not all equal}} \ \frac{\sum_{(i,j) \in E} \| x_i - x_j \|^2}{\frac{d}{n} \sum_{i < j} \| x_i - x_j \|^2} \ \ \ \ (3)$$

The difference is that instead of optimizing over all possible ways to assign a real number $x_i$ to each vertex $i$, we are now optimizing over all possible ways to assign a vector $x_i \in {\mathbb R}^m$ to each vertex $i$. The absolute value $|x_i - x_j|$ is then replaced by the Euclidean distance $\| x_i - x_j \|$. The right-hand-side of (3) is clearly a relaxation of the right-hand-side of (2), so we just need to establish that, in the right-hand-side of (3), it is optimal to take 1-dimensional vectors. For every vector solution $x_1,\ldots,x_n$, writing $x_i(k)$ for the $k$-th coordinate of $x_i$, we see that

$$\frac{\sum_{(i,j) \in E} \| x_i - x_j \|^2}{\frac{d}{n} \sum_{i<j} \| x_i - x_j \|^2} \ = \ \frac{\sum_{k=1}^m \sum_{(i,j) \in E} |x_i(k) - x_j(k)|^2}{\sum_{k=1}^m \frac{d}{n} \sum_{i<j} |x_i(k) - x_j(k)|^2} \ \geq \ \min_{k} \ \frac{\sum_{(i,j) \in E} |x_i(k) - x_j(k)|^2}{\frac{d}{n} \sum_{i<j} |x_i(k) - x_j(k)|^2}$$

where the last inequality follows from

$$\frac{a_1 + \cdots + a_m}{b_1 + \cdots + b_m} \ \geq \ \min_k \ \frac{a_k}{b_k} \ \ \ \ (4)$$

valid for non-negative $a_k$ and positive $b_k$, which, in turn, has a one-line proof:

$$\sum_k a_k \ = \ \sum_k b_k \cdot \frac{a_k}{b_k} \ \geq \ \sum_k b_k \cdot \min_h \frac{a_h}{b_h} \ = \ \left( \min_h \frac{a_h}{b_h} \right) \cdot \sum_k b_k$$
The formulation (3) is equivalent to the following *semidefinite program*:

$$\begin{array}{ll} \text{minimize} & \sum_{(i,j) \in E} \| x_i - x_j \|^2 \\ \text{subject to} & \frac{d}{n} \sum_{i < j} \| x_i - x_j \|^2 = 1 \\ & x_1,\ldots,x_n \in {\mathbb R}^m \end{array} \ \ \ \ (5)$$

An optimization problem is a *semidefinite program* if every variable is allowed to be an arbitrary vector, and both the objective function and the constraints are *linear functions of inner products of pairs of variables*. In our case,

$$\| x_i - x_j \|^2 \ = \ \langle x_i, x_i \rangle + \langle x_j, x_j \rangle - 2 \langle x_i, x_j \rangle$$

is a linear combination of inner products, and so the formulation (5) is a semidefinite program. Another way to define semidefinite programming (indeed, the definition that is usually given first, deriving the one in terms of inner products as an equivalent characterization) is that a semidefinite program is an optimization problem in which the variables are the entries of a positive semidefinite matrix $X$, and the objective function and the constraints are linear in the entries of $X$. A real matrix $X$ is positive semidefinite if it is symmetric and all its eigenvalues are non-negative. We can see that an $n \times n$ matrix $X$ is positive semidefinite if and only if there are vectors $x_1,\ldots,x_n$ such that $X_{ij} = \langle x_i, x_j \rangle$.

(The proof is simple: if $X$ is real and symmetric, then all its eigenvalues $\lambda_1,\ldots,\lambda_n$ are real, and there is an orthonormal system of eigenvectors $v_1,\ldots,v_n$ such that

$$X \ = \ \sum_k \lambda_k v_k v_k^T$$

so that

$$X_{ij} \ = \ \sum_k \lambda_k v_k(i) v_k(j)$$

If the eigenvalues are non-negative, we can let $x_i := \left( \sqrt{\lambda_1} \cdot v_1(i), \ldots, \sqrt{\lambda_n} \cdot v_n(i) \right)$ and have $X_{ij} = \langle x_i, x_j \rangle$.

On the other hand, if there are vectors $x_1,\ldots,x_n$ such that $X_{ij} = \langle x_i, x_j \rangle$, let $\lambda$ be any eigenvalue of $X$ and $v$ be a length-1 eigenvector of $\lambda$. Then:

$$\lambda \ = \ v^T X v \ = \ \sum_{i,j} v(i) v(j) \langle x_i, x_j \rangle \ = \ \left\| \sum_i v(i) \cdot x_i \right\|^2 \ \geq \ 0$$

which completes the characterization.)
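Both directions of this characterization are easy to check numerically; the following sketch (mine, not from the post, using numpy on a randomly generated PSD matrix) builds the vectors $x_i$ from the eigendecomposition exactly as above and verifies that their inner products recover the entries of $X$:

```python
import numpy as np

# Illustration (my example, not from the post) of the Gram characterization:
# X is PSD  <=>  X_ij = <x_i, x_j> for some vectors x_1, ..., x_n.
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
X = B.T @ B                        # PSD by construction

# Forward direction: eigendecompose and set
# x_i = (sqrt(lambda_1) v_1(i), ..., sqrt(lambda_n) v_n(i)).
lam, V = np.linalg.eigh(X)         # symmetric => real eigenvalues, orthonormal eigenvectors
vecs = V * np.sqrt(np.clip(lam, 0.0, None))   # row i of `vecs` is the vector x_i

# The inner products <x_i, x_j> recover X entrywise.
print(np.allclose(vecs @ vecs.T, X))
```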

So we have gotten to the point where we can see (5) as a *semidefinite programming* relaxation of sparsest cut. An interesting property of semidefinite programs is that, like linear programs, they admit a *dual*. From a minimization (primal) semidefinite program, we can write a maximization (dual) semidefinite program such that the cost of any feasible solution for the dual is a lower bound to the cost of any feasible solution for the primal; furthermore, under mild conditions, the two programs have the same optimum cost. Returning to our motivating question of certifying the expansion of a given graph, any feasible solution to the dual of (5) gives us a lower bound to the expansion of the graph.

We shall have to suffer a bit longer before we can write the dual of (5). It is best to first move to a formulation in terms of entries of semidefinite matrices. A few calculations show that (5) can be written as

$$\begin{array}{ll} \text{minimize} & L(G) \bullet X \\ \text{subject to} & \frac{d}{n} \cdot L(K_n) \bullet X = 1 \\ & X \succeq 0 \end{array} \ \ \ \ (6)$$

where we write $X \succeq 0$ to mean that the matrix $X$ is positive semidefinite (we shall also write $A \succeq B$ when $A - B$ is positive semidefinite), we write $A \bullet B$ to mean $\sum_{i,j} A_{ij} B_{ij}$, $L(G)$ is the *Laplacian* of $G$, that is the matrix $dI - A$ where $A$ is the adjacency matrix of $G$, and $L(K_n) = nI - J$ is the Laplacian of the clique on $n$ vertices ($J$ being the all-ones matrix).

From this, we look up in Wikipedia the rules to form a semidefinite dual, and we get

$$\begin{array}{ll} \text{maximize} & y \\ \text{subject to} & L(G) \succeq y \cdot \frac{d}{n} \cdot L(K_n) \end{array} \ \ \ \ (7)$$
It is possible to make sense of (7) independently of all the above discussion, that is, to see from first principles that the cost of any feasible solution to (7) is a lower bound to the sparsest cut of .

A feasible solution is a value $y$ and a positive semidefinite matrix $Y$ such that $L(G) = y \cdot \frac{d}{n} \cdot L(K_n) + Y$. Equivalently, it is a value $y$ and vectors $z_1,\ldots,z_n$ such that

$$\left( L(G) - y \cdot \frac{d}{n} \cdot L(K_n) \right)_{ij} \ = \ \langle z_i, z_j \rangle$$

Take any cut $(S, V \setminus S)$ in $G$, and let $1_S$ be the indicator vector of $S$. Then

$$E(S, V \setminus S) \ = \ \sum_{(i,j) \in E} |1_S(i) - 1_S(j)|^2 \ = \ 1_S^T L(G) 1_S$$

and

$$\frac{d}{n} \cdot |S| \cdot |V \setminus S| \ = \ \frac{d}{n} \sum_{i < j} |1_S(i) - 1_S(j)|^2 \ = \ \frac{d}{n} \cdot 1_S^T L(K_n) 1_S$$

but

$$1_S^T L(G) 1_S \ - \ y \cdot \frac{d}{n} \cdot 1_S^T L(K_n) 1_S \ = \ 1_S^T \left( L(G) - y \cdot \frac{d}{n} \cdot L(K_n) \right) 1_S \ \geq \ 0$$

and so the cost of the cut is at least $y$. To justify the last inequality above, we need to prove that if $M$ is positive semidefinite then

$$c^T M c \ \geq \ 0 \ \ \text{ for every } c$$

Indeed, writing $M_{ij} = \langle z_i, z_j \rangle$,

$$c^T M c \ = \ \sum_{i,j} c_i c_j \langle z_i, z_j \rangle \ = \ \left\| \sum_i c_i \cdot z_i \right\|^2 \ \geq \ 0$$
We can also see, from first principles, that if $\lambda_1 = d \geq \lambda_2 \geq \cdots \geq \lambda_n$ are the eigenvalues of the adjacency matrix $A$ of $G$ and $v_1,\ldots,v_n$ is a corresponding orthonormal system of eigenvectors, we can find a feasible solution to (7) of cost $1 - \frac{\lambda_2}{d}$.

We defined $\lambda_2$ as the second largest eigenvalue of the adjacency matrix $A$, where we assume that $G$ is $d$-regular. We can write

$$A \ = \ \sum_k \lambda_k v_k v_k^T$$

And we are almost there, because $L(K_n) = nI - J$, and it remains to prove that the matrix

$$L(G) - \left( 1 - \frac{\lambda_2}{d} \right) \cdot \frac{d}{n} \cdot L(K_n) \ = \ dI - A - \frac{d - \lambda_2}{n} \cdot (nI - J)$$

is positive semidefinite. This we can see by noting that $v_1,\ldots,v_n$ are eigenvectors of this matrix, with eigenvalues $0, 0, \lambda_2 - \lambda_3, \ldots, \lambda_2 - \lambda_n$, all of which are non-negative. Alternatively, we could construct vectors $z_1,\ldots,z_n$ such that the entries of the matrix are the inner products $\langle z_i, z_j \rangle$.
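This last claim is easy to verify numerically; here is a sketch of mine (again on the 6-cycle, an example not from the post) that builds the certificate matrix and checks that its smallest eigenvalue is non-negative up to rounding:

```python
import numpy as np

# Check (my example, not from the post) that the spectral dual certificate
# L(G) - (1 - lambda_2/d) * (d/n) * L(K_n) is PSD, on the 6-cycle.
n, d = 6, 2
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1

lam2 = np.sort(np.linalg.eigvalsh(A))[-2]     # second largest adjacency eigenvalue
L_G = d * np.eye(n) - A                       # Laplacian of G
L_K = n * np.eye(n) - np.ones((n, n))         # Laplacian of the clique K_n

M = L_G - (1 - lam2 / d) * (d / n) * L_K
print(np.linalg.eigvalsh(M).min())            # no negative eigenvalues, up to rounding
```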

Essentially, what is going on is that we are “embedding” a scaled version of the complete graph into our graph. The edge expansion and sparsest cut of the complete graph are known, and the semidefinite constraint in (7) is telling us that the difference between our graph and the scaled complete graph can only add to the sparsest cut, so the sparsest cut of is at least the known sparsest cut of the scaled complete graph.

Why did we go through all this trouble? Now that we understand the spectral gap inside out, we can derive the Arora-Rao-Vazirani relaxation (ARV) of sparsest cut by just adding the “triangle inequalities” to (5), and we can derive the Leighton-Rao relaxation by removing the semidefinite constraints from ARV. In the dual of Leighton-Rao, we again “embed” a complete graph in our graph, by realizing every edge as a flow. In the dual of ARV, we will find both flows and spectral considerations.

[to be continued]

Luca, nice post. I anxiously await your interpretation of the dual of the ARV relaxation, which I still find a bit confounding. In particular, it involves the following kind of question: Consider a graph with positive and negative edge weights, and let L(G) be the Laplacian of G. When is L(G) a PSD matrix? (Certainly if there are no negative edge weights.)

To see how this relates to your post, consider e.g. the complete graph with edge weights all -eps for some small eps > 0. Now take an expander graph on the same set of nodes and add 1 to all the expander edges. This graph has lots of negative edge weights, but the Laplacian still manages to be PSD as long as eps > 0 is chosen to be a small enough constant. (This way of generating dual solutions is what ARV calls “expander flows.”)

I don’t think I intended to put that ridiculous smiley face at the end of my comment.

James, it seems to me that you are describing the feasible solutions to (7), the dual of the vanilla SDP without triangle inequalities, rather than solutions to the dual of ARV.

The point of (7) is that if $G$ and $H$ are (adjacency matrices of) graphs such that $L(G)-L(H)$, or, which is the same, $L(G-H)$, has no negative eigenvalue, then $\phi(G) \geq \phi(H)$. So we can prove that the sparsest cut of $G$ is at least $y$ by showing that if we subtract $y \cdot \frac{d}{n}$ from every entry of the adjacency matrix of $G$, then all the eigenvalues of the Laplacian are still non-negative. Then the sparsest cut of $G$ is at least the sparsest cut of the graph with $y \cdot \frac{d}{n}$ in each entry, and this is $y$ by my normalization.

In Leighton-Rao, the dual exploits the fact that if $G$ and $H$ are graphs such that $H(i,j)$ units of flow can be routed in $G$ from $i$ to $j$ for each $(i,j)$, then $\phi(G) \geq \phi(H)$.

As I understand it, the dual of ARV combines the two facts; to prove that $\phi(G) \geq y$, we construct a matrix $H$ such that $H(i,j)$ flow can be routed from $i$ to $j$ in $G$, hence $\phi(G) \geq \phi(H)$, and such that if we subtract $y \cdot \frac{d}{n}$ from each entry of $H$ the Laplacian of the result has non-negative eigenvalues, hence $\phi(H) \geq y$.

I am finding it quite difficult, however, to reverse-engineer the ARV primal as the dual of what I just described. (I spent a day convincing myself that the dual of the Leighton-Rao relaxation is indeed multi-commodity flow, which is supposed to be obvious, so I am a bit slow with this kind of thing.)

Ah, never mind, you were talking about subtracting the complete graph only as an example.

About the emoticon: in an earlier version of the post there was an equation, number 8, and "(8)", if not formatted inside a latex environment, looked like this: (8)

If you go to Settings->Writing->Formatting, you can turn off the auto-emoticon conversion.

In the dual of ARV, you basically look at a sort of residual flow graph. In a standard (Leighton-Rao style) multi-flow, this will always have positive edge weights because, by the constraints, you cannot over-congest an edge, and you cannot undersatisfy a demand. In the dual of ARV, you can do both: The weight of a pair (i,j) is

w(i,j) = Flow(i,j) – D + Cap(i,j) – Congestion(i,j)

here D is the demand, and it’s what the dual objective is trying to maximize (trying to send D units of flow from every i to every j). Cap(i,j) is the capacity of edge (i,j) (in the unweighted case, this is just 1 for edges, and 0 for other pairs). Congestion(i,j) is the amount of flow going through the edge (i,j). In a standard all-pairs flow, you would require that Flow(i,j) \geq D, and Cap(i,j) \geq Congestion(i,j), but in the ARV dual, you have a weaker notion of flow, which is that Laplacian(W) is PSD, where W = (w(i,j)) is the matrix of weights. Here, the degree of a node i is

deg(i) = \sum_{j} w(i,j)

and then Laplacian(W) = deg(W) – W, where deg(W) is the diagonal degree matrix. I think of W as the “error” matrix of the flow. In a standard flow, it would have all positive entries, while in a dual-ARV-flow, you only require Laplacian(W) to be PSD.

What you describe in your comment is more like an expander flow, where you actually have a feasible flow on some subset of the demands. The dual seems to be more powerful than this, however, able to send a flow which undersatisfies demands and overcongests edges as long as the “total error” in being a flow is PSD (rather, its Laplacian is PSD). (Maybe it’s not more powerful–that would be fantastically interesting, and I think it would reduce the Khot-Vishnoi paper to about 3 lines!)

About reverse-engineering the primal, do you just want to prove weak duality like you did for the 2nd eigenvalue (i.e. show that a solution to the ARV SDP would yield an upper bound on the amount of "PSD flow" you can send)?

To do this, it helps to look at the dual-ARV program as a flow program (where the flow variables f_{ijk} are multipliers of triangle inequalities in the primal). Since triangle inequalities imply path inequalities, the only flow variables you need are f_{ijk}, i.e. flow from i to k can only be sent along two-hop paths i-j-k. (See how ARV writes the dual in their paper, but note that the factor 2 written there in lines (30) and (35) is wrong, and should be deleted.)

Now let {x_1, …, x_n} be a primal solution, let x = (x_1, …, x_n) (a vector of vectors) and consider the dual constraint:

x^T L(W) x \geq 0. (**)

This is some inequality involving the values f_{ijk}, Con(i,j) (which can be defined in terms of the flow variables), Cap(i,j), and D (the objective value). In order to see that it gives an upper bound on D, you want to make *all* the flow variables equal to 0. If you could do this, and then read off the above inequality (**), you would get an upper bound on D which is like (sum_{i,j} Cap(i,j) |x_i-x_j|^2)/(sum_{i,j} |x_i-x_j|^2), i.e. exactly the primal objective.

How can you make the flow variables all equal to 0, while maintaining the inequality (**)? This is where the triangle inequalities on the vectors {x_1, …, x_n} come in. Basically, you check that when you reduce f_{ijk} to 0, you lose in some terms, but you gain in others where Con(i,j) and Con(j,k) are reduced. By the squared-triangle inequality, you make a positive net gain, hence you can make all the flow variables 0.

Pingback: Big Red Bits » Duality Theorem for Semidefinite Programming