Scribed by Keyhan Vakil
In which we complete the study of Independent Set and Max Cut in random graphs.
1. Maximum Independent Set
Last time we proved an upper bound of to the probable value of the maximum independent set in a random graph. This bound also holds if is a function of . There is a simple greedy algorithm which can be shown to achieve an independent set of size where is the average degree of the graph. For a random graph, this gives us an independent of size . However we will see how to specialize this analysis to sparse random graphs, and close the remaining gap between the probable value and the greedy algorithm.
Consider the greedy algorithm below.
- for each
- if has no neighbors in then
1.1. First attempt
We might try to model our analysis of this algorithm based on our discussion from Lecture~2.
To wit, let be the set of vertices not in which have no neighbors in . Let be the size of when contains vertices. If , then our algorithm outputs an independent set of size . Therefore we can determine the expected size of the algorithm’s output (up to a constant factor) by determining such that .
Now we determine . A proportion of vertices are connected to the th vertex in expectation. Of the vertices, we expect that of them will remain unconnected to all the vertices in . This gives us that , and by induction .
Let be such that . Then:
We conclude that our independent set has expected size . However if we take , that would lead us to believe that we could get an independent set of size in a graph with only vertices, which is impossible.
The error is that should be , not . Note that once we add the th vertex to , it can no longer be in by definition. When is a constant, the difference is negligible, but when is small then the difference becomes more significant.
It is possible to salvage this analysis, but the result is less elegant. Instead we will now present a different analysis, which will also let us conclude more about higher moments as well.
1.2. Analysis of the greedy algorithm
To analyze the algorithm, consider the following random variables: let be the number of for-loop iterations between the time the -th element is added to and the time the -th element is added to . We leave undefined if the algorithm terminates with a set of size less than . Thus the size of the independent set found by the algorithm is the largest such that is defined. Consider the following slightly different probabilistic process: in addition to our graph over vertices , we also consider a countably infinite number of other vertices . We sample an infinite super-graph of our graph over this larger vertex set, so that each possible edge has probability of being generated.
We continue to run the greedy algorithm for every vertex of this infinite graph, and we call the (now, always defined) number of for-loop iterations between the -th and the -th time that we add a node to . In this revised definition, the size of the independent set found by algorithm in our actual graph is the largest such that .
Now we will reason about the distribution of . Say that we have vertices in and we are trying to determine if we should add some vertex to . Note that the probability of being disconnected from all of is . So we add a vertex at each iteration with probability , which shows that is geometrically distributed with success probability .
Based on this, we can find the expected value and variance of our sum from before
We want to choose so that the sum is at most with high probability. Let
This makes the expected value of the sum and the standard deviation . Thus, if sufficiently fast, the greedy algorithm has a probability of finding an independent set of size , where is a measure of the average degree.
1.3. Certifiable upper bound
We now derive a polynomial time computable upper bound certificate for maximum independent set in . We use the following lemma without proof. Note its similarity to Lemma~2 from Lecture~1.
Lemma 1 If , is sampled from , is the adjacency matrix of , and is the matrix of all ones, then there is a probability that
Since is a real symmetric matrix its spectral norm can be computed as:
If is an independent set of size , then , , and , so that
This bound holds for any independent set, so it also holds for the largest one. If we denote by the size of the largest independent set in , we have that
For a random graph, the above upper bound is with high probability.
2. Max Cut
We will now reconsider Max Cut for the general case . In Lecture~2, we dealt with the special case of . Unlike maximum independent set, our arguments for the case apply to Max Cut without much modification.
2.1. High probability upper bound
Let be a random graph from , and define as a measure of its average degree. We will prove that the size of a maximum cut of is at most with high probability. The proof of this statement is nearly identical to the version in Lecture~2, where it was presented for the case . We know that the expected value of a cut is . By a Chernoff bound, the probability that any particular cut exceeds expectation by an additive factor of is exponentially decreasing by a factor of . By taking and taking a union bound over all possible cuts , we have that our expected cut has value at most with probability .
2.2. Greedy algorithm
Consider the greedy algorithm
- for each
- if has more neighbors in than in then
- return .
Label . Let and be the sets and when vertex is considered in the for-loop. For the purpose of analysis, we delay the random decisions in until a vertex is considered. In particular, we delay the choice of which of is a neighbor until is vertex is considered. Note that no edge needs to be considered twice, and so we can treat each one as an independent biased coin flip.
Let and be the neighbors of in and respectively. We can show that , and so is the gain our algorithm achieves over cutting half the edges.
Now has expectation and variance . Adding over all , the sum of the differences has mean and variance . This gives us an expected gain of with probability. The value of cutting half the edges is approximately . This gives a final value of w.h.p. as stated.
2.3. Certifiable upper bound
Again, we will derive a certifiable upper bound by looking at the spectral norm. If is a cut with value , then we have
This means that, in every graph, the maximum cut is upper bounded by
which if is with high probability at most (by Lemma~1).
We conclude with the following table, which summarizes our results for a random graph sampled from .
|Problem||Expected Value||Greedy Algorithm||Certifiable Upper Bound|
* Note that both certifiable upper bounds require .
Both greedy algorithms perform very well in comparison to the probable value. In Max~Cut, our greedy algorithm is particularly strong, matching our certifiable upper bound up to a lower order term. This supports one of our major theses: while greedy algorithms exhibit poor worst-case performance, they tend to do well over our given distribution.