Scribed by Rachel Lawrence
In which we introduce semidefinite programming and apply it to Max Cut.
1. Overview
We begin with an introduction to Semidefinite Programming (SDP). We will then see that, using SDP, we can find a cut with the same kind of near-optimal performance for Max Cut in random graphs as we got from the greedy algorithm — that is,
$$ cut > \frac{|E|}{2} + \Omega(n\cdot\sqrt{d}) $$
in random graphs $G_{n, d/n}$. More generally, we will prove that you can always find a cut at least this large in the case that $G$ is triangle-free and with maximum vertex degree $\leq d$, which will imply the bound in random graphs. We will also see how to use SDP to certify an upper bound:
$$ max\ cut < \frac{|E|}{2} + O(n\cdot \sqrt{d}) $$
with high probability in $G_{n, d/n}$.
Methods using SDP will become particularly helpful in future lectures when we consider planted-solution models instead of fully random graphs: greedy algorithms will fail on some analogous problems where methods using SDP can succeed.
2. Semidefinite Programming
Semidefinite Programming (SDP) is a form of convex optimization, similar to linear programming but with the addition of a constraint stating that, if the variables in the linear program are considered as entries in a matrix, that matrix is positive semidefinite. To formalize this, we begin by recalling some basic facts from linear algebra.
2.1. Linear algebra review
Definition 1 (Positive Semidefinite) A matrix $M \in {\mathbb R}^{n \times n}$ is positive semidefinite (abbreviated PSD and written $M \succeq 0$) if it is symmetric and all its eigenvalues are non-negative.
We will also make use of the following facts from linear algebra:
- If $M \in {\mathbb R}^{n \times n}$ is a symmetric matrix, then all the eigenvalues of $M$ are real, and, if we call $\lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n$ the eigenvalues of $M$ with repetition, we have

$$ M = \sum_{i=1}^n \lambda_i {\bf v}_i {\bf v}_i^T $$

where the ${\bf v}_i$ are orthonormal eigenvectors of the $\lambda_i$.
- The smallest eigenvalue of $M$ has the characterization

$$ \lambda_1 = \min_{{\bf x} \neq {\bf 0}} \frac{{\bf x}^T M {\bf x}}{\| {\bf x} \|^2} $$

and the optimization problem on the right-hand side is solvable in polynomial time up to arbitrarily good accuracy.
This gives us the following lemmas:
Lemma 2 $M \succeq 0$ if and only if for every vector ${\bf x}$ we have ${\bf x}^T M {\bf x} \geq 0$.
Proof: From part (2) above, the smallest eigenvalue of $M$ is given by

$$ \lambda_1 = \min_{{\bf x} \neq {\bf 0}} \frac{{\bf x}^T M {\bf x}}{\| {\bf x} \|^2} $$

Noting that we always have $\| {\bf x} \|^2 > 0$ for ${\bf x} \neq {\bf 0}$, we see that $\lambda_1 \geq 0$ if and only if the numerator ${\bf x}^T M {\bf x}$ on the right is always non-negative. □
Lemma 3 If $M_1 \succeq 0$ and $M_2 \succeq 0$, then $M_1 + M_2 \succeq 0$.

Proof: For every vector ${\bf x}$, we have ${\bf x}^T (M_1 + M_2) {\bf x} = {\bf x}^T M_1 {\bf x} + {\bf x}^T M_2 {\bf x} \geq 0$. By Lemma 2, this implies $M_1 + M_2 \succeq 0$. □
Lemma 4 If $M \succeq 0$ and $a \geq 0$, then $aM \succeq 0$.

Proof: For every vector ${\bf x}$, we have ${\bf x}^T (aM) {\bf x} = a \cdot {\bf x}^T M {\bf x} \geq 0$. By Lemma 2, this implies $aM \succeq 0$. □
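These facts are easy to explore numerically. Below is a minimal sketch in Python (not part of the lecture; numpy and the helper name `is_psd` are our own choices) that checks Definition 1 via the eigenvalues and illustrates the Rayleigh-quotient characterization of $\lambda_1$:

```python
import numpy as np

def is_psd(M, tol=1e-9):
    """Definition 1: symmetric, with all eigenvalues non-negative."""
    return np.allclose(M, M.T) and np.linalg.eigvalsh(M).min() >= -tol

M = np.array([[2.0, -1.0],
              [-1.0, 2.0]])            # symmetric, eigenvalues 1 and 3
print(is_psd(M))                       # True

# Characterization of the smallest eigenvalue: lambda_1 = min_x x^T M x / ||x||^2.
# Sampling random directions never produces a Rayleigh quotient below lambda_1.
lam_min = np.linalg.eigvalsh(M).min()
xs = np.random.randn(100000, 2)
rayleigh = np.einsum('ni,ij,nj->n', xs, M, xs) / (xs ** 2).sum(axis=1)
print(lam_min, rayleigh.min())         # rayleigh.min() is close to lam_min = 1.0
```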
2.2. Formulation of SDP
With these characterizations in mind, we define a semidefinite program as an optimization program in which we have $n^2$ real variables $X_{i,j}$, with $1 \leq i, j \leq n$, and we want to maximize, or minimize, a linear function of the variables such that linear constraints over the variables are satisfied (so far this is the same as a linear program) and subject to the additional constraint that the matrix $X = (X_{i,j})$ is PSD. Thus, a typical semidefinite program (SDP) looks like

$$ \begin{array}{ll} \text{maximize} & \sum_{i,j} C_{i,j} X_{i,j} \\ \text{subject to} & \sum_{i,j} A^{(k)}_{i,j} X_{i,j} \leq b_k \qquad k = 1, \ldots, m \\ & X \succeq 0 \end{array} $$

where the matrices $C, A^{(1)}, \ldots, A^{(m)}$ and the scalars $b_1, \ldots, b_m$ are given, and the entries of $X$ are the variables over which we are optimizing.
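As a concrete illustration (not from the lecture), here is how this standard form can be written with the cvxpy modeling library; the data $C$, $A^{(k)}$, $b_k$ below are arbitrary placeholders, and a trace bound is added only to keep the random toy instance bounded:

```python
import cvxpy as cp
import numpy as np

n, m = 4, 3
rng = np.random.default_rng(0)
sym = lambda B: (B + B.T) / 2
C = sym(rng.standard_normal((n, n)))                  # objective matrix
A = [sym(rng.standard_normal((n, n))) for _ in range(m)]
b = rng.random(m) + 1.0                               # positive right-hand sides

X = cp.Variable((n, n), PSD=True)                     # the PSD constraint X >= 0
constraints = [cp.trace(A[k] @ X) <= b[k] for k in range(m)]
constraints.append(cp.trace(X) <= 1.0)                # keeps this toy instance bounded
prob = cp.Problem(cp.Maximize(cp.trace(C @ X)), constraints)
prob.solve()                                          # solved up to numerical accuracy
print(prob.value, np.linalg.eigvalsh(X.value).min())  # optimum; min eigenvalue ~ 0 or more
```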
We will also use the following alternative characterization of PSD matrices:
Lemma 5 A matrix $M \in {\mathbb R}^{n \times n}$ is PSD if and only if there is a collection of vectors ${\bf x}_1, \ldots, {\bf x}_n$ such that, for every $i, j$, we have $M_{i,j} = \langle {\bf x}_i, {\bf x}_j \rangle$.
Proof: Suppose that $M$ and ${\bf x}_1, \ldots, {\bf x}_n$ are such that $M_{i,j} = \langle {\bf x}_i, {\bf x}_j \rangle$ for all $i$ and $j$. Then $M$ is PSD because for every vector ${\bf z}$ we have

$$ {\bf z}^T M {\bf z} = \sum_{i,j} z_i z_j M_{i,j} = \sum_{i,j} z_i z_j \langle {\bf x}_i, {\bf x}_j \rangle = \left\| \sum_i z_i {\bf x}_i \right\|^2 \geq 0 $$

Conversely, if $M$ is PSD and we write it as

$$ M = \sum_k \lambda_k {\bf v}^{(k)} ({\bf v}^{(k)})^T $$

we have

$$ M_{i,j} = \sum_k \lambda_k v^{(k)}_i v^{(k)}_j $$

and we see that we can define $n$ vectors ${\bf x}_1, \ldots, {\bf x}_n$ by setting

$$ x_{i,k} := \sqrt{\lambda_k} \cdot v^{(k)}_i $$

(well defined because the $\lambda_k$ are non-negative) and we do have the property that

$$ M_{i,j} = \langle {\bf x}_i, {\bf x}_j \rangle $$

□
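The construction in the second half of the proof can be carried out numerically; a short sketch (our own illustration, using numpy's eigendecomposition):

```python
import numpy as np

def gram_vectors(M):
    """Given a PSD matrix M, return a matrix whose row i is the vector x_i
    from Lemma 5, built as x_{i,k} = sqrt(lambda_k) * v^(k)_i."""
    lam, V = np.linalg.eigh(M)           # columns of V: orthonormal eigenvectors
    lam = np.clip(lam, 0.0, None)        # clip tiny negative roundoff
    return V * np.sqrt(lam)

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])               # PSD: eigenvalues 1 and 3
X = gram_vectors(M)
print(np.allclose(X @ X.T, M))           # True: M_ij = <x_i, x_j>
```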
Lemma 5 leads to the following equivalent formulation of the SDP optimization problem:

$$ \begin{array}{ll} \text{maximize} & \sum_{i,j} C_{i,j} \langle {\bf x}_i, {\bf x}_j \rangle \\ \text{subject to} & \sum_{i,j} A^{(k)}_{i,j} \langle {\bf x}_i, {\bf x}_j \rangle \leq b_k \qquad k = 1, \ldots, m \end{array} $$

where our variables are vectors ${\bf x}_1, \ldots, {\bf x}_n$. This is the statement of the optimization problem that we will most commonly use.
2.3. Polynomial time solvability
From Lemmas 3 and 4, we recall that if $M_1$ and $M_2$ are two matrices such that $M_1 \succeq 0$ and $M_2 \succeq 0$, and if $a \geq 0$ is a scalar, then $M_1 + M_2 \succeq 0$ and $a M_1 \succeq 0$. This means that the set of PSD matrices is a convex subset of ${\mathbb R}^{n \times n}$, and that the above optimization problem is a convex problem.
Using the ellipsoid algorithm, one can solve in polynomial time (up to arbitrarily good accuracy) any optimization problem in which one wants to optimize a linear function over a convex feasible region, provided that one has a separation oracle for the feasible region: that is, an algorithm that, given a point,
- Checks whether it is feasible and, if not,
- Constructs an inequality that is satisfied by all feasible points but not satisfied by the given point.
In order to construct a separation oracle for an SDP, it is enough to solve the following problem: given a matrix $M$, decide if it is PSD or not and, if not, construct a linear inequality in the entries $X_{i,j}$ that is satisfied by all PSD matrices but that is not satisfied by $M$. In order to do so, recall that the smallest eigenvalue of $M$ is

$$ \min_{{\bf x} \neq {\bf 0}} \frac{{\bf x}^T M {\bf x}}{\| {\bf x} \|^2} $$

and that the above minimization problem is solvable in polynomial time (up to arbitrarily good accuracy). If the above optimization problem has a non-negative optimum, then $M$ is PSD. If it has a negative optimum, achieved by a vector ${\bf x}^*$, then the matrix is not PSD, and the inequality

$$ \sum_{i,j} X_{i,j} x^*_i x^*_j \geq 0 $$

is satisfied by all PSD matrices $X = (X_{i,j})$ but fails for $X = M$. Thus we have a separation oracle and we can solve SDPs in polynomial time up to arbitrarily good accuracy.
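In code, the oracle amounts to a single eigendecomposition; a minimal numpy sketch (our own, not the ellipsoid method itself):

```python
import numpy as np

def psd_separation_oracle(M):
    """Return None if M is PSD; otherwise a vector x* defining the inequality
    sum_ij X_ij x*_i x*_j >= 0, valid for all PSD X but violated by X = M."""
    lam, V = np.linalg.eigh((M + M.T) / 2)   # eigenvalues in ascending order
    if lam[0] >= 0:
        return None                           # M is PSD: nothing to separate
    return V[:, 0]                            # eigenvector of the negative eigenvalue

M = np.array([[1.0, 2.0],
              [2.0, 1.0]])                    # eigenvalues 3 and -1: not PSD
x = psd_separation_oracle(M)
print(x @ M @ x)                              # -1.0 < 0: the inequality fails for M
```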
3. SDP Relaxation of Max Cut and Random Hyperplane Rounding
The Max Cut problem in a given graph $G = (V, E)$ has the following equivalent characterization, as a quadratic optimization problem over real variables $x_1, \ldots, x_n$, where $V = \{ 1, \ldots, n \}$:

$$ {\rm maxcut}(G) = \max \sum_{(i,j) \in E} \frac 14 (x_i - x_j)^2 $$
$$ \text{subject to } x_i^2 = 1 \quad \forall i \in V $$

We can interpret this as associating every vertex $i$ with a value $x_i \in \{ -1, +1 \}$, so that the cut edges are those with one vertex of value $+1$ and one of value $-1$: each cut edge contributes $1$ to the sum, and each uncut edge contributes $0$.
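To see the equivalence concretely, one can evaluate the quadratic objective over all $\pm 1$ assignments of a small graph; a brute-force sketch (exponential time, purely illustrative):

```python
from itertools import product

# A 4-cycle: the bipartition {0, 2} vs {1, 3} cuts all four edges.
E = [(0, 1), (1, 2), (2, 3), (3, 0)]
n = 4

best = max(
    sum((x[i] - x[j]) ** 2 / 4 for (i, j) in E)  # 1 per cut edge, 0 per uncut edge
    for x in product([-1, 1], repeat=n)           # all assignments with x_i^2 = 1
)
print(best)  # 4.0 = maxcut of the 4-cycle
```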
While quadratic optimization is NP-hard, we can instead use a relaxation to a polynomial-time solvable problem. We note that any quadratic optimization problem has a natural relaxation to an SDP, in which we relax real variables to take vector values and we change multiplication to inner product:

$$ \max \sum_{(i,j) \in E} \frac 14 \| {\bf x}_i - {\bf x}_j \|^2 $$
$$ \text{subject to } \| {\bf x}_i \|^2 = 1 \quad \forall i \in V $$
Figure 1: The hyperplane through the origin defines a cut partitioning the vertices into sets $S$ and $V - S$.
Solving the above SDP, which is doable in polynomial time up to arbitrarily good accuracy, gives us a unit vector ${\bf x}_i$ for each vertex $i$. A simple way to convert this collection to a cut $(S, V-S)$ is to take a random hyperplane through the origin, and then define $S$ to be the set of vertices $i$ such that ${\bf x}_i$ is above the hyperplane. Equivalently, we pick a random vector ${\bf g}$ according to a rotation-invariant distribution, for example a Gaussian distribution, and let $S$ be the set of vertices $i$ such that $\langle {\bf g}, {\bf x}_i \rangle \geq 0$.
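Here is a compact sketch of the whole pipeline, with cvxpy for the SDP and the Gaussian rounding described above (the 5-cycle instance and all names are our own illustration):

```python
import cvxpy as cp
import numpy as np

n = 5
E = [(i, (i + 1) % n) for i in range(n)]        # a 5-cycle

# Matrix form of the relaxation: X_ij plays the role of <x_i, x_j>, so the
# objective is the sum over edges of (X_ii + X_jj - 2 X_ij)/4, and the
# constraint ||x_i||^2 = 1 becomes X_ii = 1.
X = cp.Variable((n, n), PSD=True)
objective = sum((X[i, i] + X[j, j] - 2 * X[i, j]) / 4 for (i, j) in E)
prob = cp.Problem(cp.Maximize(objective), [cp.diag(X) == 1])
prob.solve()

# Recover unit vectors x_i with X_ij = <x_i, x_j>, as in Lemma 5.
lam, V = np.linalg.eigh(X.value)
vecs = V * np.sqrt(np.clip(lam, 0.0, None))     # row i is x_i

g = np.random.default_rng(1).standard_normal(n) # random Gaussian vector
S = {i for i in range(n) if vecs[i] @ g >= 0}   # one side of the random hyperplane
cut = sum(1 for (i, j) in E if (i in S) != (j in S))
print(prob.value, cut)                          # SDP value ~4.52; best cut of C5 is 4
```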
Let $(i,j)$ be an edge: one sees that if $\theta$ is the angle between ${\bf x}_i$ and ${\bf x}_j$, then the probability that $(i,j)$ is cut is proportional to $\theta$:

$$ \mathop{\mathbb P} [ (i,j) \mbox{ is cut } ] = \frac {\theta}{\pi} $$

and the contribution of $(i,j)$ to the cost function is

$$ \frac 14 \| {\bf x}_i - {\bf x}_j \|^2 = \frac 12 - \frac 12 \langle {\bf x}_i , {\bf x}_j \rangle = \frac 12 - \frac 12 \cos \theta $$
Some calculus shows that for every $0 \leq \theta \leq \pi$ we have

$$ \frac{\theta}{\pi} \geq .878 \cdot \left( \frac 12 - \frac 12 \cos \theta \right) $$
and so

$$ \mathop{\mathbb E} [ \mbox{number of edges cut by } (S,V-S) ] \geq .878 \cdot \sum_{(i,j) \in E} \frac 14 \| {\bf x}_i - {\bf x}_j \|^2 \geq .878 \cdot {\rm maxcut}(G) $$

so we have a polynomial-time approximation algorithm with worst-case approximation guarantee $.878$.
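The constant $.878$ is a slightly rounded-down value of $\min_{0 < \theta \leq \pi} \frac{\theta / \pi}{\frac 12 - \frac 12 \cos \theta} \approx 0.8786$, which is easy to confirm numerically; a quick sketch (our own illustration):

```python
import numpy as np

theta = np.linspace(1e-6, np.pi, 1_000_000)
ratio = (theta / np.pi) / ((1 - np.cos(theta)) / 2)
i = ratio.argmin()
print(ratio[i], theta[i])   # ~0.8786, attained near theta ~ 2.33 radians
```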
Next time, we will see how the SDP relaxation behaves on random graphs, but first let us see how it behaves on a large class of graphs.
4. Max Cut in Bounded-Degree Triangle-Free Graphs
Theorem 6 If $G = (V,E)$ is a triangle-free graph in which every vertex has degree at most $d$, then

$$ {\rm maxcut}(G) \geq \frac{|E|}{2} + \Omega\left( \frac 1{\sqrt d} \right) \cdot |E| $$
Proof: Consider the following feasible solution for the SDP: we associate to each node $i$ an $n$-dimensional vector ${\bf x}_i$ such that $x_{i,i} = \frac 1{\sqrt 2}$, $x_{i,j} = - \frac 1{\sqrt{2 \deg(i)}}$ if $(i,j) \in E$, and $x_{i,j} = 0$ otherwise. We immediately see that $\| {\bf x}_i \|^2 = \frac 12 + \deg(i) \cdot \frac 1{2 \deg(i)} = 1$ for every $i$ and so the solution is feasible.
For example, if we have a graph such that vertex 1 is adjacent to vertices 3 and 5, then $\deg(1) = 2$ and the vector ${\bf x}_1$ has the following coordinates:

| | 1 | 2 | 3 | 4 | 5 | $\cdots$ | $n$ |
|---|---|---|---|---|---|---|---|
| ${\bf x}_1$ | $\frac 1{\sqrt 2}$ | $0$ | $-\frac 12$ | $0$ | $-\frac 12$ | $\cdots$ | $0$ |

(Similarly, each ${\bf x}_i$ has $\frac 1{\sqrt 2}$ in its own coordinate $i$ and $-\frac 1{\sqrt{2\deg(i)}}$ in the coordinates of its neighbors.)
Let us transform this SDP solution into a cut $(S, V-S)$ using a random hyperplane. We see that, for every edge $(i,j) \in E$, we have

$$ \langle {\bf x}_i, {\bf x}_j \rangle = - \frac 1{2 \sqrt{\deg(i)}} - \frac 1{2 \sqrt{\deg(j)}} \leq - \frac 1{\sqrt d} $$

because, $G$ being triangle-free, $i$ and $j$ have no common neighbor, and so the only nonzero terms of the inner product come from coordinates $i$ and $j$. The probability that $(i,j)$ is cut by $(S, V-S)$ is

$$ \frac{\theta_{ij}}{\pi}, \quad \text{where } \cos \theta_{ij} = \langle {\bf x}_i, {\bf x}_j \rangle \leq - \frac 1{\sqrt d} $$

and

$$ \frac{\arccos\left( - \frac 1{\sqrt d} \right)}{\pi} \geq \frac 12 + \Omega\left( \frac 1{\sqrt d} \right) $$

so that the expected number of cut edges is at least $\frac{|E|}{2} + \Omega\left( \frac 1{\sqrt d} \right) \cdot |E|$. □
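As a sanity check (our own illustration, not part of the lecture), the feasible solution from the proof can be built explicitly for a small triangle-free graph, here the 5-cycle with $d = 2$:

```python
import numpy as np

n, d = 5, 2
E = [(i, (i + 1) % n) for i in range(n)]        # C5: triangle-free, 2-regular
adj = {i: [] for i in range(n)}
for (a, b) in E:
    adj[a].append(b)
    adj[b].append(a)

# x_{i,i} = 1/sqrt(2); x_{i,j} = -1/sqrt(2 deg(i)) for neighbors j; 0 otherwise.
X = np.zeros((n, n))
for i in range(n):
    X[i, i] = 1 / np.sqrt(2)
    for j in adj[i]:
        X[i, j] = -1 / np.sqrt(2 * len(adj[i]))

print(np.allclose((X ** 2).sum(axis=1), 1))      # True: ||x_i||^2 = 1, so feasible
for (i, j) in E:
    # i and j share no neighbor, so only coordinates i and j contribute:
    # <x_i, x_j> = -1/(2 sqrt(deg i)) - 1/(2 sqrt(deg j)) <= -1/sqrt(d).
    assert X[i] @ X[j] <= -1 / np.sqrt(d) + 1e-9
print(X[0] @ X[1])                               # -1/sqrt(2) ~ -0.707 here
```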