Beyond Worst Case Analysis: Lecture 5

Scribed by Haaris Khan

In which we study the SDP relaxation of Max Cut in random graphs.

1. Quick Review of Chernoff Bounds

Suppose {X_1, \ldots, X_n} are mutually independent random variables taking values in {\{0, 1\}}. Let {X := \sum_{i=1}^{n} X_i}. The Chernoff bounds state the following:

1. For all {\epsilon} such that {0 \leq \epsilon \leq 1},

\displaystyle  \mathop{\mathbb P}(\vert X - \mathop{\mathbb E}[X] \vert > \epsilon \cdot \mathop{\mathbb E}[X]) \leq \exp(-\Omega(\epsilon^2 \cdot \mathop{\mathbb E}[X]))

2. For all {t > 1},

\displaystyle  \mathop{\mathbb P} (\vert X - \mathop{\mathbb E}[X] \vert \geq t \cdot \mathop{\mathbb E}[X]) \leq \exp(-\Omega((t\log(t)) \cdot \mathop{\mathbb E}[X]))

3. When we do not know {\mathop{\mathbb E}[X]}, we can use the following additive bound:

\displaystyle  \mathop{\mathbb P}(\vert X - \mathop{\mathbb E}[X] \vert \geq \epsilon \cdot n) \leq \exp(- \Omega(\epsilon^2 \cdot n))
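As a sanity check (this simulation is my own addition, not part of the lecture), one can compare the empirical tail probability of a sum of independent coin flips with the explicit two-sided Hoeffding bound {2\exp(-2\epsilon^2 n)}, a concrete instance of the additive form above:

```python
import numpy as np

# Empirical check of the additive bound: X = sum of n independent fair {0,1} coins,
# so E[X] = n/2, and P(|X - E[X]| >= eps*n) should decay like exp(-Omega(eps^2 * n)).
rng = np.random.default_rng(0)
n, eps, trials = 1000, 0.05, 5000
X = rng.binomial(1, 0.5, size=(trials, n)).sum(axis=1)
empirical = np.mean(np.abs(X - n / 2) >= eps * n)
hoeffding = 2 * np.exp(-2 * eps**2 * n)      # explicit two-sided Hoeffding bound
print(f"empirical tail = {empirical:.2e}, Hoeffding bound = {hoeffding:.2e}")
```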

2. Cutting a Near-Optimal Number of Edges in {G_{n,p}} Via SDP Rounding

Consider {G_{n, p}} where {p > \frac{\log(n)}{n}}, and let {d := np} denote the expected degree. We show that with probability {1 - o(1)} the maximum degree is {O(d)}.

  • Fix v
  • For some constant c,

    \displaystyle \mathop{\mathbb P}(\textrm{v has degree} > c \cdot d) \leq \mathop{\mathbb P}(\vert deg(v) - \mathop{\mathbb E}[deg(v)] \vert > (c - 1) \cdot \mathop{\mathbb E}[deg(v)])

    \displaystyle \leq \exp(- \Omega((c - 1)\log(c - 1) \cdot d)) \textrm{ (by Chernoff Bounds)}

    \displaystyle  \leq \exp(-\Omega((c - 1)\log(c - 1) \cdot \log(n)))

    \displaystyle  \leq \frac{1}{n^2}, \textrm{ for some choice of constant c}

So {\mathop{\mathbb P}(\exists \text{ } v \textrm{ with degree } > c \cdot d) \leq n \cdot \frac{1}{n^2} \leq \frac{1}{n}}
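The degree bound can be checked empirically; here is a minimal sketch (my own, with arbitrary parameter choices) that samples the adjacency matrix of {G_{n,p}} and compares the maximum degree with {d = np}:

```python
import numpy as np

# Sample G_{n,p} with p just above log(n)/n and compare the max degree to d = np.
rng = np.random.default_rng(1)
n = 2000
p = 2 * np.log(n) / n
d = n * p
upper = np.triu(rng.random((n, n)) < p, k=1)   # upper triangle: each edge present w.p. p
A = upper | upper.T                            # symmetric 0/1 adjacency matrix
degrees = A.sum(axis=1)
print(f"d = {d:.1f}, max degree = {degrees.max()}, ratio = {degrees.max() / d:.2f}")
```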

Next, we bound the number of vertices that participate in a triangle. Recall that here we assume {d = o(n^{\frac{1}{3}})}, i.e. {p = o(n^{-\frac{2}{3}})}.

\displaystyle \mathop{\mathbb E}[\textrm{number of vertices in triangles}] = n \cdot \mathop{\mathbb P}(\textrm{a fixed vertex } v \textrm{ participates in a triangle})

If a vertex {v} participates in a triangle, there are {\binom{n - 1}{2}} ways of choosing the other two vertices of that triangle, and each such triangle is present with probability {p^3}. So the expected number of vertices in triangles can be bounded by

\displaystyle  \mathop{\mathbb E}[\textrm{number of vertices in triangles}] \leq n \cdot p^3 \cdot \binom{n - 1}{2}

\displaystyle  \leq n^3 \cdot p^3

\displaystyle  = o(n) \textrm{ if } p = o \left(\frac{1}{n^{\frac{2}{3}}}\right), \textrm{ } d = o(n^{\frac{1}{3}})

So with {1 - o(1)} probability,

  • All vertices have degree {O(d)}
  • {o(n)} vertices participate in triangles.
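Both facts can be checked on a sample; the sketch below (my own addition, using the cube of the adjacency matrix to detect triangles) counts the vertices lying on at least one triangle when {p = o(n^{-2/3})}:

```python
import numpy as np

# Count the vertices of G_{n,p} that lie on at least one triangle.
rng = np.random.default_rng(2)
n = 2000
p = 1.0 / n**0.75                               # p = o(n^{-2/3}), i.e. d = np = o(n^{1/3})
upper = np.triu(rng.random((n, n)) < p, k=1)
A = (upper | upper.T).astype(float)
# (A^3)[v, v] counts closed walks of length 3 through v, which is nonzero
# exactly when v participates in a triangle.
closed_walks = np.diag(A @ A @ A)
print(f"vertices on a triangle: {int((closed_walks > 0).sum())} out of n = {n}")
```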

3. Eigenvalue Computations and SDP

Problems like finding the largest or smallest eigenvalue of a symmetric matrix can be solved using SDP.

Let {M} be symmetric and let {\lambda_{\max}} be its largest eigenvalue: {\lambda_{\max} = \max_{\boldsymbol{x} \neq 0} \frac{\boldsymbol{x}^T M \boldsymbol{x}}{\|\boldsymbol{x} \|^2}}. We can formulate this as a quadratic program:

\displaystyle  \begin{array}{rcl}  \max & & \sum_{i, j} M_{i, j} x_i x_j \\ {\rm s.t.} \\ && \sum_i x_i^2 = 1 \\ \end{array}
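As a numerical illustration (my own sketch; it uses numpy's eigensolver and a shifted power iteration as a stand-in for solving the quadratic program), the value of this program coincides with {\lambda_{\max}}:

```python
import numpy as np

# max_x x^T M x / ||x||^2 over nonzero x equals lambda_max(M) for symmetric M.
rng = np.random.default_rng(3)
m = 50
B = rng.standard_normal((m, m))
M = (B + B.T) / 2                        # random symmetric matrix
lam_max = np.linalg.eigvalsh(M)[-1]      # eigvalsh returns eigenvalues in ascending order

# Shifted power iteration as a simple "QP solver": the shift (a Gershgorin bound)
# makes M + shift*I positive semidefinite, so the iteration converges to the
# eigenvector of the largest eigenvalue of M.
shift = np.abs(M).sum(axis=1).max()
x = rng.standard_normal(m)
for _ in range(2000):
    x = (M + shift * np.eye(m)) @ x
    x /= np.linalg.norm(x)
print(f"lambda_max = {lam_max:.6f}, Rayleigh quotient = {x @ M @ x / (x @ x):.6f}")
```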

We showed previously that we can relax a Quadratic Program to SDP:

\displaystyle  \begin{array}{rcl}  \max & & \sum_{i, j} M_{i, j} \langle \boldsymbol{x_i}, \boldsymbol{x_j} \rangle \\ {\rm s.t.} \\ && \sum_i \|\boldsymbol{x_i}\|^2 = 1 \\ \end{array}

In fact, the two are equivalent. To show this, we must show that every solution of the QP yields a solution of the SDP of the same cost, and that every vector solution of the SDP can be converted into a QP solution of at least the same cost.

Proving that a QP solution is valid for the SDP: trivial. Any solution {x} of our quadratic program is also a solution of our SDP, since the SDP is a relaxation of the problem; hence the optimum of the QP is at most the optimum of the SDP.

Proving that an SDP solution yields a QP solution of at least the same value: consider a vector solution {\boldsymbol{x_1}, \ldots, \boldsymbol{x_n}} of cost {c}. We note that our SDP can be transformed into the equivalent unconstrained optimization problem

\displaystyle  \max_{\boldsymbol{x_1}, \ldots, \boldsymbol{x_n}} \textrm{ } \frac{\sum_{i, j} M_{i, j} \langle \boldsymbol{x_i}, \boldsymbol{x_j} \rangle}{\sum_i \|\boldsymbol{x_i}\|^2}

The cost {c} is the value of this ratio at our solution. Writing {x_i^{(k)}} for the {k}-th coordinate of {\boldsymbol{x_i}}, we have

\displaystyle  c = \frac{\sum_{i, j} M_{i, j} \sum_k x_i^{(k)} x_j^{(k)}}{\sum_i \sum_k (x_i^{(k)})^2}

\displaystyle  \leq \max_k \frac{\sum_{i, j} M_{i, j} x_i^{(k)} x_j^{(k)}}{\sum_i (x_i^{(k)})^2}

That is, using only the {k}-th coordinates of the vectors gives a one-dimensional solution, and the {k} that maximizes its value achieves cost at least {c}.

Here we used the following inequality:

\displaystyle \frac{a_1 + \ldots + a_m}{b_1 + \ldots + b_m} \leq \max_{k = 1, \ldots, m} \frac{a_k}{b_k}, \qquad \textrm{where each } b_k > 0

Proof:

\displaystyle \sum_i a_i = \sum_i b_i \cdot \frac{a_i}{b_i} \leq \sum_i b_i \cdot \max_k \frac{a_k}{b_k}

\displaystyle  = \max_k \text{ } \frac{a_k}{b_k} \cdot \sum_i b_i
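A quick numerical check of this argument (my own sketch): take any vector solution, compute its SDP cost {c}, and verify that some coordinate {k} yields a one-dimensional solution with Rayleigh quotient at least {c}.

```python
import numpy as np

# SDP -> QP direction: from a vector solution x_1, ..., x_n of cost c, some
# coordinate k gives a one-dimensional solution of value at least c.
rng = np.random.default_rng(4)
n, dim = 30, 8
B = rng.standard_normal((n, n))
M = (B + B.T) / 2
X = rng.standard_normal((n, dim))
X /= np.linalg.norm(X)                   # normalize so that sum_i ||x_i||^2 = 1
c = np.sum(M * (X @ X.T))                # SDP cost: sum_{i,j} M_ij <x_i, x_j>
# Rayleigh quotient of each coordinate column X[:, k].
values = [X[:, k] @ M @ X[:, k] / (X[:, k] @ X[:, k]) for k in range(dim)]
print(f"SDP cost c = {c:.4f}, best coordinate value = {max(values):.4f}")
assert max(values) >= c - 1e-9
```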

4. SDP Max-Cut: Spectral Norm as an SDP Certificate

Consider the SDP relaxation of Max-Cut on Graph {G}:

\displaystyle  \begin{array}{rcl}  \max & & \sum_{(i, j) \in E} \frac{1}{4} \|\boldsymbol{X_i} - \boldsymbol{X_j}\|^2 \\ {\rm s.t.} \\ && \forall v \in V, \|\boldsymbol{X_v}\|^2 = 1 \\ \end{array}

Let the optimum value of this SDP be {SDPMaxCut(G)}. Clearly, {MaxCut(G) \leq SDPMaxCut(G)}. Since the constraints give {\|\boldsymbol{X_i} - \boldsymbol{X_j}\|^2 = \|\boldsymbol{X_i}\|^2 + \|\boldsymbol{X_j}\|^2 - 2 \langle \boldsymbol{X_i}, \boldsymbol{X_j} \rangle = 2 - 2 \langle \boldsymbol{X_i}, \boldsymbol{X_j} \rangle}, we can rewrite the objective of our SDP as

\displaystyle  \sum_{(i, j) \in E} \frac{1}{2} - \frac{1}{2} \langle \boldsymbol{X_i}, \boldsymbol{X_j} \rangle

So our new optimization problem is

\displaystyle  \begin{array}{rcl}  \max & & \frac{\vert E \vert}{2} - \sum_{(i, j) \in E} \frac{1}{2} \langle \boldsymbol{X_i}, \boldsymbol{X_j} \rangle \\ {\rm s.t.} \\ && \forall i \in V, \| \boldsymbol{X_i} \|^2 = 1 \\ \end{array}

We can relax the constraint to {\sum_{i \in V} \| \boldsymbol{X_i} \|^2 = n}. Relaxing the constraint yields an optimization problem whose optimum is at least as large as that of the stricter program; call it {SDP'MaxCut(G)}:

\displaystyle  \begin{array}{rcl}  \max & & \frac{\vert E \vert}{2} - \sum_{(i, j) \in E} \frac{1}{2} \langle \boldsymbol{X_i}, \boldsymbol{X_j} \rangle \\ {\rm s.t.} \\ && \sum_v \|\boldsymbol{X_v}\|^2 = n \\ \end{array}

Clearly, we have the following inequalities: { MaxCut(G) \leq SDPMaxCut(G) \leq SDP'MaxCut(G) }. Using {\sum_{(i,j) \in E} \langle \boldsymbol{X_i}, \boldsymbol{X_j} \rangle = \frac{1}{2} \sum_{i, j} A_{i, j} \langle \boldsymbol{X_i}, \boldsymbol{X_j} \rangle} and the constraint {\sum_v \|\boldsymbol{X_v}\|^2 = n}, we can rewrite {SDP'MaxCut(G)} as the unconstrained problem (the ratio below is invariant under scaling, so the constraint can be dropped)

\displaystyle  \max_{\boldsymbol{X_1}, \ldots, \boldsymbol{X_n}} \textrm{ } \frac{\vert E \vert}{2} + \frac{n}{4} \cdot \frac{\sum_{i, j} (-A_{i, j}) \langle \boldsymbol{X_i}, \boldsymbol{X_j} \rangle}{\sum_i \|\boldsymbol{X_i}\|^2}

Note that the maximum of the ratio above is exactly the largest eigenvalue of {-A}, by the equivalence shown in the previous section, so

\displaystyle  SDP'MaxCut(G) = \frac{\vert E \vert}{2} + \frac{n}{4} \cdot \lambda_{\max}(-A)

For every graph {G} with adjacency matrix {A} and every {0 \leq p \leq 1}, since {pJ} is positive semidefinite we have {\lambda_{\max}(-A) \leq \lambda_{\max}(pJ - A)}, and therefore

\displaystyle  MaxCut(G) \leq SDPMaxCut(G) \leq \frac{\vert E \vert}{2} + \frac{n}{4} \cdot \lambda_{\max}(-A)

\displaystyle  \leq \frac{\vert E \vert}{2} + \frac{n}{4} \cdot \lambda_{\max}(pJ - A)

\displaystyle  \leq \frac{\vert E \vert}{2} + \frac{n}{4} \cdot \|pJ - A\|

Recall from previous lectures that for {p > \frac{\log(n)}{n}}, the adjacency matrix {A} of a graph sampled from {G_{n, p}} satisfies {\|pJ - A\| \leq O(\sqrt{np})} with high probability. This implies that {SDPMaxCut(G) \leq \frac{\vert E \vert}{2} + O(n \cdot \sqrt{d})}. In other words, {SDPMaxCut(G)} gives, in polynomial time, a valid upper bound on {MaxCut(G)}, and for random graphs this bound is {\frac{\vert E \vert}{2} + O(n \cdot \sqrt{d})}.
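Here is a small sketch (my own; the graph size, edge probability, and the random-cut baseline are arbitrary choices) that computes the spectral certificate {\frac{\vert E \vert}{2} + \frac{n}{4}\|pJ - A\|} on a sample from {G_{n,p}} and compares it with a random cut and with the trivial bound {\vert E \vert}:

```python
import numpy as np

# Spectral certificate |E|/2 + (n/4)*||pJ - A|| for Max Cut on G_{n,p}.
rng = np.random.default_rng(5)
n = 1000
p = 5 * np.log(n) / n
upper = np.triu(rng.random((n, n)) < p, k=1)
A = (upper | upper.T).astype(float)
num_edges = int(upper.sum())

spectral_norm = np.linalg.norm(p * np.ones((n, n)) - A, ord=2)   # ||pJ - A||
certificate = num_edges / 2 + (n / 4) * spectral_norm

# A uniformly random cut cuts about half the edges, so MaxCut(G) is at least about |E|/2.
side = rng.integers(0, 2, size=n)
random_cut = int(((side[:, None] != side[None, :]) * A).sum() // 2)

print(f"|E| = {num_edges}, random cut = {random_cut}, "
      f"spectral certificate = {certificate:.1f}")
```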

5. Trace and Eigenvalues

Suppose matrix {M} is symmetric with eigenvalues {\lambda_1, \ldots, \lambda_n}. The following are true:

  • The eigenvalues of {M^k} are {\lambda_1^k, \ldots, \lambda_n^k}
  • {trace(M) := \sum_{i} M_{i, i}}, and for symmetric {M}, {trace(M) = \sum_i \lambda_i}

Then {trace(M^{2k}) = \lambda_1^{2k} + \ldots + \lambda_n^{2k}}, and since every term is nonnegative,

\displaystyle  (\max_i \vert \lambda_i \vert)^{2k} \leq trace(M^{2k}) \leq n \cdot (\max_i \vert \lambda_i \vert)^{2k}

Also, taking {2k}-th roots,

\displaystyle  \|M\| \leq (trace(M^{2k}))^{\frac{1}{2k}} \leq n^{\frac{1}{2k}} \cdot \|M\|
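A quick numerical check of this sandwich (my own sketch, for a random symmetric matrix and a small value of {k}):

```python
import numpy as np

# Check ||M|| <= trace(M^{2k})^{1/(2k)} <= n^{1/(2k)} * ||M|| on a random symmetric M.
rng = np.random.default_rng(6)
n, k = 40, 5
B = rng.standard_normal((n, n))
M = (B + B.T) / 2
spec_norm = np.max(np.abs(np.linalg.eigvalsh(M)))            # ||M|| = max_i |lambda_i|
trace_root = np.trace(np.linalg.matrix_power(M, 2 * k)) ** (1 / (2 * k))
print(f"||M|| = {spec_norm:.4f}, trace(M^2k)^(1/2k) = {trace_root:.4f}, "
      f"n^(1/2k)*||M|| = {n ** (1 / (2 * k)) * spec_norm:.4f}")
```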

The entry {(M^k)_{i, j}} is a sum over all {k}-step walks from {i} to {j} (not necessarily simple paths) of the products of the entries of {M} along the walk; for a 0/1 adjacency matrix this is exactly the number of such walks:

\displaystyle  (M^k)_{i, j} = \sum_{a_1, \ldots, a_{k-1}} M_{i, a_1} M_{a_1, a_2} \cdots M_{a_{k-1}, j}

Our goal is to bound {\|M\|}. Since the trace of {M^{2k}} equals both the sum of its diagonal entries (a sum over closed walks of length {2k}) and the sum {\sum_i \lambda_i^{2k}}, counting such walks gives an upper bound on {\|M\|} for symmetric {M}.
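To make the walk-counting interpretation concrete, the sketch below (my own addition) compares {(M^k)_{i,j}} computed by matrix powering with a brute-force enumeration of all {k}-step walks, for a small 0/1 adjacency matrix and {k = 3}:

```python
import numpy as np
from itertools import product

# (M^k)_{i,j} is the sum over all k-step walks i -> a_1 -> ... -> a_{k-1} -> j of the
# products of the entries of M along the walk; for a 0/1 adjacency matrix it is the
# number of such walks.
rng = np.random.default_rng(7)
n = 6
upper = np.triu((rng.random((n, n)) < 0.4).astype(int), k=1)
M = upper + upper.T                                    # small symmetric 0/1 matrix

i, j = 0, 5
via_power = np.linalg.matrix_power(M, 3)[i, j]         # k = 3 steps
via_walks = sum(M[i, a] * M[a, b] * M[b, j] for a, b in product(range(n), repeat=2))
print(f"(M^3)[{i},{j}] = {via_power}, brute-force walk count = {via_walks}")
```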
