# Finally, a joy

In Rome we have an expression, mai una gioia (literally, “never (a moment of) joy”) that applies well to the present times. Yesterday, there was, finally, something to be joyous about: the announcement that two of my heroes, Laszlo Lovasz and Avi Wigderson, will share the 2021 Abel Prize, one of the highest honors of mathematics.

The reader can find a very good article about them on Quanta Magazine.

Instead of talking about their greatest accomplishment, here I would like to recall two beautiful and somewhat related results, that admit a short treatment.

Avi’s first paper was on the 3-coloring problem. He described a polynomial time algorithm that, given a 3-colorable graph ${G}$, finds a valid coloring of ${G}$ with ${O(\sqrt n)}$ colors, where ${n}$ is the number of vertices. The starting point is that if every vertex in ${G}$ had degree ${\leq \sqrt n -1}$ then there is an easy way to color the graph with ${\sqrt n}$ colors (look at the vertices in any order, choose for each vertex a color not already used for the neighbors that have already been considered).

What if there is some vertex ${v}$ of degree ${\geq \sqrt n}$? Well, and this is the idea of the paper, the neighbors of ${v}$ must induce a bipartite subgraph, so we can color the neighbors of ${v}$ with 2 colors in a way that is consistent with all the edges between neighbors of ${v}$, then remove the neighbors of ${v}$ and continue with the rest of the graph, committing to never use those two colors again. Every time we do this, we consume two colors but we get rid of ${\geq \sqrt n}$ vertices, so this costs us at most ${2\sqrt n}$ colors. When we are done, all remaining vertices have degree ${\leq \sqrt n-1}$ and we just need ${\sqrt n}$ more colors as observed above.

In total, we used at most ${3\sqrt n}$ colors.

This remained the state of the art for approximate graph coloring for a few years, until Avrim Blum improved the bound for 3-colorable graphs to ${O(n^{0.4})}$ and then to ${O(n^{3/8})}$ colors, using more sophisticated combinatorial arguments. Later, Karger, Motwani and Sudan devised an approximation algorithm that uses ${\tilde O(n^{1/4})}$ colors to color 3-colorable graphs, and is based on a semidefinite programming relaxation of the coloring problem, which is related to the Lovasz theta function. It is still open whether there is a polynomial time algorithm that uses ${n^{o(1)}}$ colors.

The Lovasz theta function ${\theta (G)}$ of a graph ${G}$ is a convex relaxation of the maximum independent set problem in ${G}$. It is, or it can be formulated as, a semidefinite programming relaxation of the maximum independent set problem, and so it can be computed (up to arbitrarily good accuracy) in polynomial time.

Lovasz defined the theta function to study the Shannon capacity of a graph. This is a question that arises when one wants to find error-correcting codes for a channel that may introduce errors. Suppose that the channel carries an element of an alphabet ${V}$ of size ${n=|V|}$, and that certain pairs of alphabet elements can be “confused” by the channel, meaning that for certain pairs ${\{ a,b \}}$ the channel may output ${b}$ given ${a}$ or viceversa. If we construct the graph ${G}$ whose vertex set is ${V}$ and whose edges are the pairs of “confusable” pairs, then an independent set in this graph gives a set of length-1 codewords that are not confusable, and ${\log_2 \alpha(G)}$, where ${\alpha(G)}$ is the size of a maximal independent set in ${G}$, is the number of bits that we can transmit in an errorless way by making one use of the channel. If we can access the channel ${k}$ times, then an errorless code is an independent set in ${G^k}$, the graph that has vertex set ${V^k}$ and such that ${((a_1,\ldots,a_k),(b_1,\ldots,b_k))}$ is an edge if and only if all the pairs ${(a_i,b_i)}$ are edges of ${G}$. The number of bits of information per channel use that we can send is $\displaystyle \frac 1k \log \alpha(G^k)$

and so we are interested in $\displaystyle \lim_{k\rightarrow \infty} (\alpha (G^k))^{1/k}$

Nicely, we have $\displaystyle \theta (G^k) = (\theta(G))^k$

so we have $\displaystyle \theta(G) \geq \lim_{k\rightarrow \infty} (\alpha (G^k))^{1/k}$

and the theta function provides an upper bound to the Shannon capacity of a graph.

If we compute ${\theta (\bar G)}$, where ${\bar G}$ is the complement of the graph ${G}$ (that is, the graph whose edges are all the non-edges of ${G}$), then we see that ${\theta(\bar G)}$ is a relaxation of the maximum clique problem in ${G}$. Remarkably, ${\theta(\bar G)}$ is also a lower bound to the chromatic number of ${G}$, and so it is sandwiched between the clique number and the chromatic number. Don Knuth has used this fact as a starting point for a wonderful 49-page treatise on convex relaxations of the chromatic number and the clique number; here we will see a quick proof of this fact.

As you can see in Knuth’s survey, there are several equivalent ways to define the theta function. Below we define ${\theta(\bar G)}$ as a natural semidefinite programming relaxation of the maximum clique problem (we have a variable ${x_0}$ plus one variable ${x_v}$ for every vertex ${v}$; all variables are vectors). $\displaystyle \begin{array}{lr} \max \sum_{v\in V} || x_v||^2\\ s.t.\\ ||x_0||^2 = 1\\ \langle x_0,x_v \rangle = ||x_v||^2 & \forall v\in V\\ \langle x_u,x_v \rangle = 0 & \forall u\neq v: (u,v) \not\in E\\ \end{array}$

To see that it is a relaxation, given a clique ${K}$, let ${x_0}$ be any unit vector, and let ${x_v = x_0}$ for ${v\in K}$ and ${x_v = {\bf 0}}$ for ${v\not\in K}$. We satisfy all constraints and the objective function is ${|K|}$.

If we write the semidefinite dual of the above relaxation, we can see that we get a minimization problem which is a relaxation of the chromatic number problem (and which is equivalent to the relaxation used by Karger, Motwani and Sudan to improve on Avi’s result on 3-coloring), so that the “sandwich theorem” follows from weak duality.

The reader should try to write down the semidefinite dual and reason about why it relaxes the chromatic number problem. Below, we collapse the construction of the dual, the way of embedding colorings into dual solutions, and the proof of weak duality for semidefinite programming, into a short but mystifying “sum of squares” proof.

Suppose that ${V = \{ v_1,\ldots,v_n \}}$ and let ${x_0,x_{v_1},\ldots,x_{v_n}}$ be a feasible solution to the max clique relaxation defined above. Suppose that the graph is ${k}$-colorable and let ${(C_1,\ldots,C_k)}$ be a partition of ${V}$ into color classes. We have $\displaystyle 0 \leq \sum_{i=1}^k \left \| -x_0 + \sum_{v\in C_i} x_v \right \|^2$ $\displaystyle = \sum_{i=1}^k \left( || x_0||^2 - 2 \sum_{v\in C_i} \langle x_0 , x_v \rangle + \sum_{u,v \in C_i} \langle x_u, x_v\rangle \right)$ $\displaystyle = \sum_{i=1}^k \left( 1 - \sum_{v\in C_i} || x_v||^2 \right)$ $\displaystyle = k - \sum_{v\in V} ||x_v||^2$

so that the objective function always satisfies $\displaystyle \sum_{v\in V} ||x_v||^2 \leq k$

In particular, this is true for an optimal solution ${x_0,x_{v_1},\ldots,x_{v_k}}$ and for an optimal coloring, so we have that ${\theta(\bar G)}$ is upper bounded by the chromatic number of ${G}$ as promised.

Above, we used various properties of feasible solutions, such as ${||x_0||^2 = 1}$, ${\langle x_0, x_v \rangle = ||x_v||^2}$ and $\displaystyle \sum_{u,v \in C_i } \langle x_u , x_v \rangle = \sum_{v\in C_i} || x_v||^2$

which follows from the fact that each color class is an independent set, and so all inner products ${\langle x_u, x_v \rangle}$ for ${u,v \in C_i, u\neq v}$ have to be equal to zero, because ${(u,v) \not\in E}$.