Beyond Worst Case Analysis: Lecture 3

Scribed by Keyhan Vakil

In which we complete the study of Independent Set and Max Cut in {G_{n,p}} random graphs.

1. Maximum Independent Set

Last time we proved an upper bound of {O\left( \frac 1p \log np \right)} to the probable value of the maximum independent set in a {G_{n,p}} random graph. This bound also holds if {p} is a function of {n}. There is a simple greedy algorithm which can be shown to achieve an independent set of size {\Omega(n/d)}, where {d} is the average degree of the graph. For a {G_{n,p}} random graph, this gives us an independent set of size {\Omega(1/p)}. However, we will see how to specialize this analysis to sparse {G_{n,p}} random graphs, and close the remaining gap between the probable value and the output of the greedy algorithm.

Consider the greedy algorithm below.

  • {S:= \emptyset}
  • for each {v\in V}
    • if {v} has no neighbors in {S} then {S:= S \cup \{ v \}}
  • return {S}
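In code, the loop is a few lines. Here is a minimal Python sketch, not from the notes (the function names and the adjacency-set representation are our own choices):

```python
import random

def sample_gnp(n, p, seed=None):
    """Sample a G_{n,p} graph, represented as a list of adjacency sets."""
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:  # each edge appears independently with probability p
                adj[u].add(v)
                adj[v].add(u)
    return adj

def greedy_independent_set(adj):
    """Scan vertices in order; add each vertex with no neighbor already in S."""
    S = set()
    for v in range(len(adj)):
        if S.isdisjoint(adj[v]):  # v has no neighbors in S
            S.add(v)
    return S
```

On a sample from {G_{n,p}} with, say, {n = 2000} and {p = 0.01}, the returned set should have size on the order of {\frac 1p \ln pn}.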

1.1. First attempt

We might try to model our analysis of this algorithm based on our discussion from Lecture~2.

To wit, let {R} be the set of vertices not in {S} which have no neighbors in {S}. Let {R_i} be the size of {R} when {S} contains {i} vertices. If {R_k = 0}, then our algorithm outputs an independent set of size {k}. Therefore we can determine the expected size of the algorithm’s output (up to a constant factor) by determining {k} such that {\mathop{\mathbb E}[R_k] = O(1)}.

Now we determine {\mathop{\mathbb E}[R_{i+1} \mid R_i]}. Each vertex of {R} is connected to the newly added {(i+1)}th vertex independently with probability {p}, so of the {R_i} vertices, we expect a {(1-p)} fraction to remain unconnected to all the vertices in {S}. This gives us that {\mathop{\mathbb E}[R_{i+1} \mid R_i] = (1-p)R_i}, and by induction {\mathop{\mathbb E}[R_k] = (1-p)^k n}.

Let {k} be such that {\mathop{\mathbb E}[R_k] = 1}. Then:

\displaystyle  \mathop{\mathbb E}[R_k] = (1-p)^k n = 1 \implies k = \log_{\frac 1{1-p}} n \approx \frac 1p \ln n

We conclude that our independent set has expected size {\Theta(\frac1p \log n)}. However if we take {p = \Theta(1/n)}, that would lead us to believe that we could get an independent set of size {\Theta(n \log n)} in a graph with only {n} vertices, which is impossible.

The error is that {\mathop{\mathbb E}[R_{i+1} \mid R_i]} should be {(1-p)(R_i - 1)}, not {(1-p)R_i}. Note that once we add the {(i+1)}th vertex to {S}, it can no longer be in {R} by definition. When {p} is a constant, the difference is negligible, but when {p} is small then the difference becomes more significant.
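In fact, treating the corrected recurrence as an equality and ignoring boundary effects (a heuristic calculation, not part of the original analysis), it can be solved exactly:

\displaystyle  \mathop{\mathbb E}[R_k] = (1-p)^k \left( n + \frac{1-p}p \right) - \frac{1-p}p .

Setting this to {1} gives {k \approx \log_{\frac 1{1-p}} (1+pn) \approx \frac 1p \ln pn}, which already suggests the correct answer for sparse graphs.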

It is possible to make this calculation rigorous, but the result is less elegant. Instead, we will now present a different analysis, which will also let us reason about higher moments.

1.2. Analysis of the greedy algorithm

To analyze the algorithm, consider the following random variables: let {t_i} be the number of for-loop iterations between the time the {i}-th element is added to {S} and the time the {(i+1)}-th element is added to {S}. We leave {t_i} undefined if the algorithm terminates with a set {S} of size less than {i+1}. Thus the size of the independent set found by the algorithm is the largest {i} such that {t_{i-1}} is defined.

Consider the following slightly different probabilistic process: in addition to our graph over {n} vertices {\{1,\ldots , n \}}, we also consider a countably infinite number of other vertices {n+1,n+2,\ldots}. We sample an infinite super-graph of our graph over this larger vertex set, so that each possible edge has probability {p} of being generated.

We continue to run the greedy algorithm for every vertex of this infinite graph, and we call {t_i} the (now, always defined) number of for-loop iterations between the {i}-th and the {(i+1)}-th time that we add a node to {S}. In this revised definition, the size of the independent set found by the algorithm in our actual graph is the largest {k} such that {t_0 + t_1 + \cdots + t_{k-1} \leq n}.

Now we will reason about the distribution of {t_i}. Say that we have {i} vertices in {S} and we are trying to determine if we should add some vertex {v} to {S}. Note that the probability of {v} being disconnected from all of {S} is {(1-p)^i}. So we add a vertex at each iteration with probability {(1-p)^i}, which shows that {t_i} is geometrically distributed with success probability {(1-p)^i}.

Based on this, we can find the expected value and variance of our sum from before.
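Here we use the standard moments of a geometric random variable: if {t} is geometric with success probability {q}, then {\mathop{\mathbb E}[t] = 1/q} and {\mathop{\bf Var}[t] = (1-q)/q^2 \leq 1/q^2}, so that

\displaystyle  \mathop{\mathbb E}[t_i] = \frac 1{(1-p)^i} , \qquad \mathop{\bf Var}[t_i] \leq \frac 1{(1-p)^{2i}} .

Summing over {i},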

\displaystyle \mathop{\mathbb E} \left[ t_0 + t_1 + \cdots t_{k-1} \right] = \frac { \frac 1 {(1-p)^k} - 1 }{\frac 1 {1-p} - 1} \leq \frac { \frac 1 {(1-p)^k}}{\frac 1 {1-p} - 1} = \frac 1 {p\cdot (1-p)^{k-1}}

and likewise

\displaystyle  \begin{array}{rcl}  \mathop{\bf Var}[t_0 + t_1 + \cdots t_{k-1}] & \leq & \sum_{i=0}^{k-1} \frac 1 {(1-p)^{2i}} \\ &= & \frac { \frac 1 {(1-p)^{2k}} - 1 }{\frac 1 {(1-p)^2} - 1} \\ & \leq & \frac 1 {(1 - (1-p)^2 ) \cdot (1-p)^{2k-2 } } \\ & \leq & \frac 1 {p \cdot (1-p)^{2k - 2} } \\ & = & p \cdot \left( \frac 1 {p \cdot (1-p)^{k-1}} \right)^2 , \end{array}

which is {p} times the square of our upper bound on the expectation.

We want to choose {k} so that the sum is at most {n} with high probability. Let

\displaystyle  k = \log_{\frac {1}{1-p}} \frac {pn}2 \approx \frac 1p \ln pn .

This makes the expected value of the sum at most {n/2} and its standard deviation at most {\sqrt{p} \cdot n / 2}. Thus, if {p(n) \rightarrow 0}, then by Chebyshev's inequality the greedy algorithm has a {1-o(1)} probability of finding an independent set of size {\Omega( p^{-1} \log pn ) = \Omega\left( \frac nd \log d \right)}, where {d := np} is a measure of the average degree.
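Because each {t_i} is an independent geometric random variable, the conclusion can be sanity-checked without constructing any graph. A small numpy sketch (a simulation under the assumptions above; the names are ours):

```python
import numpy as np

def greedy_size_via_geometrics(n, p, rng):
    """Largest k with t_0 + ... + t_{k-1} <= n, where t_i is
    geometric with success probability (1-p)^i."""
    total, k = 0, 0
    while True:
        total += rng.geometric((1 - p) ** k)  # sample t_k
        if total > n:
            return k
        k += 1

rng = np.random.default_rng(0)
n, d = 100_000, 20.0
p = d / n
sizes = [greedy_size_via_geometrics(n, p, rng) for _ in range(20)]
print(np.mean(sizes), np.log(p * n) / p)  # empirical size vs (1/p) ln(pn)
```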

1.3. Certifiable upper bound

We now derive a polynomial time computable upper bound certificate for maximum independent set in {G_{n,p}}. We use the following lemma without proof. Note its similarity to Lemma~2 from Lecture~1.

Lemma 1 If {p = p(n) \ge \frac {\log n}n}, {G} is sampled from {G_{n,p}}, {A} is the adjacency matrix of {G}, and {J} is the matrix of all ones, then there is a {1-o(1)} probability that

\displaystyle  \lVert A - p J \rVert \leq O( \sqrt {pn })

Since {A - pJ} is a real symmetric matrix, its spectral norm can be computed as:

\displaystyle  \lVert A - pJ \rVert = \max_{{\bf x} \neq {\bf 0}} \frac{|{\bf x}^T(A - pJ){\bf x}|}{{\bf x}^T {\bf x}} \;.

If {S} is an independent set of size {k}, then {{\bf 1}_S^T A {\bf 1}_S = 0}, {{\bf 1}_S^T J {\bf 1}_S = k^2}, and {{\bf 1}_S^T {\bf 1}_S = k}, so that

\displaystyle  \begin{array}{rcl}  \lVert A - pJ \rVert &\geq & \frac{|{\bf 1}_S^T(A - pJ){\bf 1}_S|}{{\bf 1}_S^T {\bf 1}_S} \\ &= & pk. \end{array}

This bound holds for any independent set, so it also holds for the largest one. If we denote by {\alpha(G)} the size of the largest independent set in {G}, we have that

\displaystyle  \alpha(G) \leq \frac 1p \lVert A - p J \rVert .

For a {G_{n,p}} random graph, the above upper bound is {O(\sqrt{n/p}) = O(n/\sqrt d)} with high probability.
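Computing this certificate requires only one spectral-norm computation. A minimal numpy sketch (the function name is ours; for a real symmetric matrix, the spectral norm equals the largest eigenvalue in absolute value):

```python
import numpy as np

def independent_set_certificate(A, p):
    """Certified upper bound: alpha(G) <= (1/p) * ||A - pJ||."""
    n = A.shape[0]
    M = A - p * np.ones((n, n))  # A - pJ, real symmetric
    spectral_norm = np.max(np.abs(np.linalg.eigvalsh(M)))
    return spectral_norm / p
```

Note that the inequality {\alpha(G) \leq \frac 1p \lVert A - p J \rVert} holds for every graph; only the quality of the bound, {O(\sqrt{n/p})}, relies on the graph being random.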

2. Max Cut

We will now reconsider Max Cut for the general case {G_{n,p}}. In Lecture~2, we dealt with the special case of {p=\frac12}. Unlike maximum independent set, our arguments for the case {p=\frac12} apply to Max Cut without much modification.

2.1. High probability upper bound

Let {G} be a random graph from {G_{n,p}}, and define {d := pn} as a measure of its average degree. We will prove that the size of a maximum cut of {G} is at most {dn/4 + O(\sqrt d n)} with high probability. The proof of this statement is nearly identical to the version in Lecture~2, where it was presented for the case {p=\frac12}. The expected value of a cut {(S, V-S)} is {p \cdot |S| \cdot |V-S| \le dn / 4}. By a Chernoff bound, the probability that any particular cut exceeds its expectation by an additive term {\epsilon dn} is at most {e^{-\Omega(\epsilon^2 dn)}}. Taking {\epsilon = c/\sqrt{d}} for a suitable constant {c} and taking a union bound over all {2^n} possible cuts {S}, we get that with probability {1 - 2^{-\Omega(n)}} every cut has value at most {dn / 4 + O(\sqrt d n)}.

2.2. Greedy algorithm

Consider the greedy algorithm

  • {A:= \emptyset}
  • {B:= \emptyset}
  • for each {v\in V}
    • if {v} has more neighbors in {B} than in {A} then {A:= A \cup \{ v \}}
    • else {B:= B \cup \{ v\}}
  • return {(A,B)}.
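Here is a minimal Python sketch of this algorithm, not from the notes (we reuse the adjacency-set representation of the earlier independent set sketch):

```python
def greedy_max_cut(adj):
    """Place each vertex on the side opposite to the majority of its
    already-placed neighbors, cutting at least half of its back-edges."""
    A, B = set(), set()
    for v in range(len(adj)):
        if len(adj[v] & B) > len(adj[v] & A):  # more placed neighbors in B
            A.add(v)
        else:
            B.add(v)
    return A, B
```

On a sample from {G_{n,d/n}}, counting the edges between the two returned sides should give roughly {dn/4} plus a gain of order {n \sqrt d}, as the analysis below shows.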

Label {V = \{ 1,\ldots,n \}}. Let {A_i} and {B_i} be the sets {A} and {B} when vertex {i} is considered in the for-loop. For the purpose of the analysis, we delay the random decisions in {G} until a vertex is considered. In particular, we delay the choice of which of the vertices {1, 2, \ldots, i - 1} are neighbors of {i} until vertex {i} is considered. Note that no edge needs to be examined twice, and so we can treat each edge as an independent biased coin flip.

Let {a_i} and {b_i} be the number of neighbors of {i} in {A_i} and {B_i} respectively. Since {\max(a_i, b_i) = \frac12 (a_i + b_i) + \frac12 |a_i - b_i|}, the quantity {\frac12 \sum_i |a_i - b_i|} is the gain our algorithm achieves over cutting half the edges.

Now {|a_i - b_i|} has expectation {\Omega( \sqrt {pi} )} and variance {O(pi)}. Adding over all {i}, the total gain has mean {\Omega( n \sqrt{pn} )} and variance {O(pn^2)}, so the gain is {\Omega( n \sqrt {pn}) = \Omega( n \sqrt d)} with probability {1-o(1)}. The value of cutting half the edges is approximately {dn / 4}. This gives a final value of {dn/4 + \Omega(n\sqrt d)} w.h.p., as stated.

2.3. Certifiable upper bound

Again, we will derive a certifiable upper bound by looking at the spectral norm. If {(S,V-S)} is a cut with value {\frac {dn}4 + C}, then we have

\displaystyle  {\bf 1}_S^T A {\bf 1}_{V-S} = \frac {dn}4 + C

\displaystyle  {\bf 1}_S^T p J {\bf 1}_{V-S} = p \cdot |S| \cdot |V-S| \leq p \cdot \frac {n^2} 4 = \frac {dn}4

\displaystyle  \lVert {\bf 1}_S \rVert \cdot \lVert {\bf 1}_{V-S} \rVert = \sqrt { |S| \cdot |V-S| } \leq \sqrt { \frac {n^2}4 }

so

\displaystyle  C \leq {\bf 1}_S^T (A - pJ) {\bf 1}_{V-S} \leq \lVert A - pJ \rVert \cdot \lVert {\bf 1}_S \rVert \cdot \lVert {\bf 1}_{V-S} \rVert \leq \frac n2 \cdot \lVert A - pJ \rVert .

This means that, in every graph, the maximum cut is upper bounded by

\displaystyle  \frac {dn}4 + \frac n2 \left\lVert A - \frac dn J \right\rVert

which if {d \ge \log n} is with high probability at most {\frac {dn}4 + O( n \sqrt d)} (by Lemma~1).
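Like the independent set certificate, this bound is computable in polynomial time from one spectral norm; a numpy sketch under the same assumptions as before:

```python
import numpy as np

def max_cut_certificate(A, d):
    """Certified upper bound: maxcut(G) <= dn/4 + (n/2) * ||A - (d/n)J||."""
    n = A.shape[0]
    M = A - (d / n) * np.ones((n, n))  # A - (d/n)J, real symmetric
    return d * n / 4 + (n / 2) * np.max(np.abs(np.linalg.eigvalsh(M)))
```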

3. Conclusion

We conclude with the following table, which summarizes our results for a random graph sampled from {G_{n, d/n}}.

Problem | Probable Value | Greedy Algorithm | Certifiable Upper Bound
Independent Set | {O\left(\frac nd \log d\right)} | {\Omega\left(\frac nd \log d\right)} w.h.p. | {O\left(\frac n{\sqrt{d}} \right)} w.h.p.*
Max Cut | {\frac{dn}4 + O(n \sqrt d)} | {\frac {dn}4 + \Omega(n \sqrt d)} w.h.p. | {\frac {dn} 4 + O(n \sqrt d)} w.h.p.*

* Note that both certifiable upper bounds require {d \ge \log n}.

Both greedy algorithms perform very well in comparison to the probable value. In Max~Cut, our greedy algorithm is particularly strong, matching our certifiable upper bound up to a lower order term. This supports one of our major theses: while greedy algorithms exhibit poor worst-case performance, they tend to do well over our given distribution.


Beyond Worst Case Analysis: Lecture 2

Scribe: Mahshid Montazer

In this lecture, we study the Max Cut problem in random graphs. We compute the probable value of its optimal solution, we give a greedy algorithm which is nearly optimal on random graphs, and we compute a polynomial time upper bound certificate for it using linear algebra methods. We also study the Maximum Independent Set problem in random graphs, and we compute an upper bound on the probable value of its optimal solution.

1. Max Cut

Definition 1 Max Cut: In an unweighted graph {G=(V,E)}, a cut is a partition of its vertices into two sets {V_1} and {V_2}. Let {E(V_1, V_2)} denote the size of the cut {(V_1, V_2)}, that is, the number of edges with one endpoint in {V_1} and one endpoint in {V_2}. Max Cut is the problem of finding a cut of largest size.

To give a clear example, in every bipartite graph, the bipartition is a maximum cut, since it cuts all of the edges. It is easy to show that the size of the maximum cut is at least half of the number of edges. One question that arises is how much more than half of the edges we can cut. The answer is: in random graphs, not much more. We prove this claim in the following section.

2. Probable Value of Max Cut Optimal Solution

In this section, we compute the probable value of Max Cut optimal solution in random graphs. Our result is for samples of {G_{n,\frac{1}{2}}}, but the analysis will generalize to {G_{n,p}}.

Lemma 2 For every fixed cut {(S,V-S)}, {\mathop{\mathbb E} [E(S, V\setminus S)] \leq \frac{n^2}{8}}.

Proof: {\mathop{\mathbb E} [E(S, V\setminus S)] = \frac{1}{2} \left\vert S \right\vert \left\vert V\setminus S \right\vert \leq \frac{1}{2} \cdot \frac{n^2}{4} = \frac{n^2}{8}.} \Box

Lemma 3 {\mathop{\mathbb P} [E(S, V\setminus S) \geq \frac{n^2}{8} + \epsilon \frac{n^2}{4}] \leq e^{-\Omega(\epsilon^2 n^2)}} where {0 \leq \epsilon \leq \frac{1}{2}}.

Proof: The proof follows by applying a Chernoff bound to {E(S, V\setminus S)}, which is a sum of independent 0/1 random variables whose expectation is bounded in Lemma 2. \Box

Lemma 4 There is a constant {c>0} such that

\displaystyle  \mathop{\mathbb P} [\exists (S,V \setminus S) \mid E(S,V \setminus S) \geq \frac{n^2}{8} + \epsilon \frac{n^2}{4}] \leq 2^{-n}

where {\epsilon = \frac{c}{\sqrt{n}}} and the probability is taken over the choice of {G=(V,E)} from the distribution {G_{n,\frac 12 }}.

Proof:

\displaystyle  \begin{array}{rcl}  \mathop{\mathbb P} [\exists (S,V \setminus S) \mid E(S,V \setminus S) \geq \frac{n^2}{8} + \epsilon \frac{n^2}{4}] & \leq & 2^n \cdot e^ {-\Omega(\epsilon^2 n^2)} \\  & \leq & 2^{-n}. \end{array}

for an appropriate choice of {c}. \Box

The above lemma clearly leads us to the following theorem.

Theorem 5 There is a constant {c} such that w.h.p. Max Cut in {G_{n,\frac{1}{2}}} is of size at most {\frac{n^2}{8} + c \cdot n^{1.5}.}

Thus, we showed that in {G_{n,1/2}}, the probable value of Max Cut is at most {\frac{n^2}{8} + c \cdot n^{1.5}}.

3. Greedy Algorithm for Max Cut

Consider the following greedy algorithm for Max Cut:

  • {A \leftarrow \emptyset , B \leftarrow \emptyset}
  • for {v \in V}
    • if {v} has more neighbors in {A} than in {B}, then {B \leftarrow B \cup \{v\}}
    • else {A \leftarrow A \cup \{v\}}
  • return {A} and {B}

The above algorithm can be applied to any graph, but we will analyze it on random graphs. A naive analysis guarantees that the greedy algorithm cuts at least half of the edges, giving an approximation ratio of 2. The reason is that at each step, we cut at least half of the edges between the current vertex and the previously processed vertices. A more careful analysis shows that the algorithm is near-optimal for random graphs. Below, we prove our claim for {G_{n,\frac{1}{2}}}.

Lemma 6 With high probability over the choice of {G} from {G_{n,\frac{1}{2}}}, the greedy algorithm finds a cut of size {\frac {n^2}8 + \Omega(n^{1.5})}.

Proof: Let {G=(V,E) \sim G_{n,\frac{1}{2}}} be the given graph and let {v_1, v_2 , \cdots , v_n} be the order in which we process the vertices. Note that at the time of processing {v_i} {(1 \leq i \leq n)}, we do not need to know the edges that connect {v_i} to any vertex {v_j} {(j>i)}. Let {a_i = |A|} and {b_i = |B|} be the sizes of the sets {A} and {B} before processing {v_i}. Although {G} is given before we run the algorithm, for the sake of the analysis we can assume that we are building it on the go, while processing each of the vertices. Remember that each edge of the graph exists independently with probability {\frac{1}{2}}. For deciding where to put {v_i}, we generate {a_i} random bits and call their sum {X_i}, and we generate {b_i} random bits and call their sum {Y_i}. We put {v_i} in set {A} (respectively, {B}) if {X_i \leq Y_i} (respectively, {Y_i < X_i}). Note that the more balanced {A} and {B} get, the worse it gets for the analysis. Since vertex {v_i} contributes {\max(X_i, Y_i) = \frac12 (X_i + Y_i) + \frac12 |X_i - Y_i|} edges to the cut, the number of edges the algorithm cuts beyond half of the edges is:

\displaystyle \frac12 \sum_{1\leq i \leq n} {|X_i-Y_i|} = E(A, B) -\frac{|E|}{2}.

Since {X_i} is a sum of {a_i} unbiased random bits and {Y_i} is a sum of {b_i} unbiased random bits, with {a_i + b_i = i-1}, we have

\displaystyle \mathop{\mathbb E}[X_i-Y_i] = \frac{a_i-b_i}{2}.

Note that

\displaystyle \mathop{\mathbb E}[|X_i - Y_i|] = \Omega(\sqrt{i})

and

\displaystyle \mathop{\bf Var}(|X_i - Y_i|) = O(i).

Thus, {\sum_{1\leq i \leq n} {|X_i-Y_i|}} has mean {\Omega(n^{1.5})} and standard deviation {O(n)}, so by Chebyshev's inequality, with probability {1-o(1)} we have:

\displaystyle  \sum_{1\leq i \leq n} {|X_i-Y_i|} \geq \Omega(n^{1.5}).

\displaystyle \Rightarrow E(A,B) \geq \frac{n^2}{8} + \Omega(n^{1.5}).

\Box

4. Polynomial Time Upper Bound for Max Cut

In this section, we find polynomial time upper bound certificates for Max Cut in random graphs using linear algebra techniques.

Lemma 7 Let {G=(V,E)} be a graph, {A} be its adjacency matrix, {J} be the matrix all whose entries are 1 and {(S, V\setminus S)} be the Max Cut of {G}. Then

\displaystyle  E(S, V \setminus S) \leq \frac{n^2}{8} + \frac n2 || A - J/2 ||

Proof: we have:

\displaystyle  \begin{array}{rcl}  E(S, V \setminus S) - \frac{n^2}{8} & \leq & {\bf 1}^T_S \cdot ( A - J/2) \cdot {\bf 1}_{V\setminus S} \\ & \leq & || A - J/2 || \cdot || {\bf 1}_S || \cdot ||{\bf 1}_{V\setminus S} || \\ & \leq & || A - J/2 || \cdot \sqrt{|S|} \cdot \sqrt{|V \setminus S|} \\ & \leq & || A - J/2 || \cdot \frac{n}{2}\\ \end{array}

\Box

Recall that, with high probability over the choice of a graph {G} from {G_{n,\frac 12}}, if {A} is the adjacency matrix of {G} then {||A - J/2|| \leq O(\sqrt n)}.

We conclude that, with high probability over the choice of {G} from {G_{n,\frac 12}}, we can find in polynomial time a certificate that the Max Cut optimum of {G} is at most {\frac {n^2} 8 + O(n^{1.5})}.

5. Maximum Independent Set

In this section, we discuss the Maximum Independent Set problem for {G_{n,p}} (especially {G_{n,\frac{1}{2}}}) and we show its close connection with the Max Clique problem. Finally, we compute an upper bound on the probable value of its optimal solution.

Definition 8 Maximum Independent Set: In a graph {G(V,E)}, an independent set is a set of vertices that are mutually disconnected. A Maximum Independent Set in {G} is an independent set of largest possible size. The Maximum Independent Set problem is the problem of finding such a set.

Note that the Maximum Independent Set in {G_{n,p}} corresponds to the Maximum Clique in {G_{n,1-p}}. Thus, for {p = \frac{1}{2}}, everything that we argued for Max Clique is usable for Maximum Independent Set as well.

We now compute an upper bound on the probable value of the optimal solution of Maximum Independent Set in {G_{n,p}}.

Fix a set {S \subset V} of size {k}. We have

\displaystyle \mathop{\mathbb P} [S \text{ is an independent set in } G] = (1-p)^{\binom{k}{2}}

where the probability is over the choice of {G\sim G_{n,p}}.

The following lemma holds.

Lemma 9 {\mathop{\mathbb P}[\exists \text{ Independent Set of size } k] \leq e^{-\frac{k}{2} \left( (k-1) \cdot \ln{\frac{1}{1-p}} - 2\ln{\frac{en}{k}} \right) }}

Proof:

\displaystyle  \begin{array}{rcl}  \mathop{\mathbb P}[\exists \text{ Independent Set of size } k] & \leq & \mathop{\mathbb E}[\text{\#Independent Sets of size } k]\nonumber \\ &= & \binom{n}{k} \cdot (1-p)^{\binom{k}{2}} \nonumber \\ & \leq & \left(\frac{en}{k} \right)^k \cdot (1-p)^{\frac{k^2}{2} - \frac{k}{2}} \nonumber \\ & =& e^{k \cdot \ln{\frac{en}{k}} - \left( \frac{k^2}{2} - \frac k2 \right) \cdot \ln{\frac{1}{1-p}}} \nonumber \\ & = & e^{-\frac{k}{2} \left( (k-1) \cdot \ln{\frac{1}{1-p}} - 2\ln{\frac{en}{k}} \right)}.  \end{array}

\Box

Now, what is the smallest value of {k} for which the above bound guarantees that, with high probability, there is no independent set of size {k}? Note that the bound of Lemma 9 goes to 0 when {k \geq 2\log_\frac{1}{1-p} \frac{en}{k} +2}.

A sufficient condition for {k \geq 2\log_\frac{1}{1-p} \frac{en}{k} +2} is to have {k = 2\log_\frac{1}{1-p} (en) +2}, showing that with high probability the maximum independent set in {G_{n,p}} is at most {O\left ( \log_\frac{1}{1-p} n \right) = O \left ( \frac 1p \log n \right)}. A more careful bound is that we can have {k \geq 2\log_\frac{1}{1-p} \frac{en}{k} +2} provided, say, {k \geq 3 \log_{\frac 1{1-p}} np + 100}, and so with high probability the maximum independent set in {G_{n,p}} is at most {O \left ( \frac 1p \log p n \right)}. If we call {d=pn}, then the bound is {O \left ( \frac nd \log d \right)}.
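For example, specializing to {p = \frac 12}, where {\log_{\frac 1{1-p}} = \log_2}, the bound gives

\displaystyle  k = 2\log_2 n + O(1) ,

recovering the classical fact that with high probability the maximum independent set (equivalently, the maximum clique) of a {G_{n,\frac 12}} random graph has size at most {2\log_2 n + O(1)}.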