What does it mean when it’s hard to find hard instances?

[In the provincial spirit of Italian newspapers, which often have headlines like “Typhoon in South-East Asia causes widespread destruction; what are the consequences for Italian exports?”, and of men who overhear discussions about women’s issues and say things like “yes, but men have issues too,” I am going to comment on how Babai’s announcement affects me and the kind of problems I work on.]

If someone had told me last week: “a quasi-polynomial time algorithm has been found for a major open problem for which only a slightly subexponential algorithm was known before,” I would have immediately thought Unique Games!

Before Babai’s announcement, Graph Isomorphism had certain interesting properties in common with problems such as Factoring, Discrete Log, and Approximate Closest Vector (for approximation ratios of order \sqrt{n} or more): no polynomial time algorithm is known; non-trivial algorithms that are much faster than brute force are known; and NP-completeness is considered unlikely, because each of these problems belongs to either NP \cap coNP or NP \cap coAM.

But there is an important difference: there are simple distributions of inputs on which Factoring, Discrete Log, and Closest Vector approximation are believed to be hard on average, and if one proposes an efficiently implementable algorithm for such problems, it can be immediately shown that it does not work. (Or, if it works, it’s already a breakthrough, even without a rigorous analysis.)

In the case of Graph Isomorphism, however, it is easy to come up with simple algorithms for which it is very difficult to find counterexamples, and there are algorithms that are rigorously proved to work on certain distributions of random graphs. Now we know that there are in fact no hard instances at all, but, even before, if we believed that Graph Isomorphism was hard, we had to believe that the hard instances were rare and strange, rather than common.
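One concrete example of such a simple algorithm is color refinement (the 1-dimensional Weisfeiler-Leman method): iteratively recolor each vertex by the multiset of its neighbors’ colors, and declare two graphs distinguishable if their stable color histograms differ. The sketch below is my own minimal illustration, not code from any of the papers discussed; it succeeds on almost all random graphs but is fooled, for instance, by pairs of non-isomorphic regular graphs.

```python
from collections import Counter

def refined_histograms(graphs):
    """Color refinement (1-dim Weisfeiler-Leman) run on the disjoint
    union of the given graphs, so final color names are shared across
    graphs; returns one color histogram per input graph."""
    union, parts, offset = [], [], 0
    for adj in graphs:
        union.extend([u + offset for u in nbrs] for nbrs in adj)
        parts.append(range(offset, offset + len(adj)))
        offset += len(adj)
    colors = [0] * offset
    for _ in range(offset):
        # new color of v = (old color of v, multiset of neighbor colors)
        sigs = [(colors[v], tuple(sorted(colors[u] for u in union[v])))
                for v in range(offset)]
        relabel = {s: i for i, s in enumerate(sorted(set(sigs)))}
        new = [relabel[s] for s in sigs]
        if new == colors:  # refinement has stabilized
            break
        colors = new
    return [Counter(colors[v] for v in part) for part in parts]

def maybe_isomorphic(adj1, adj2):
    """False means provably non-isomorphic; True means the heuristic
    cannot tell the graphs apart (they may or may not be isomorphic)."""
    h1, h2 = refined_histograms([adj1, adj2])
    return h1 == h2

# A triangle and a path on 3 vertices have different degree sequences,
# so one round of refinement already separates them.
triangle = [[1, 2], [0, 2], [0, 1]]
path = [[1], [0, 2], [1]]
print(maybe_isomorphic(triangle, path))  # False
```

Counterexamples exist (any two non-isomorphic strongly regular graphs with the same parameters defeat it), but they have to be constructed with care; random graphs are almost always distinguished, which is exactly the phenomenon described above.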

It is also worth pointing out that, using Levin’s theory of average-case complexity, one can show that if any problem at all in NP is hard under any samplable distribution, then for every NP-complete problem we can find a samplable distribution under which the problem is hard. And, in “practice,” natural NP-complete problems do have simple distributions that seem to generate hard instances.

What about Small-set Expansion, Unique Games, and Unique-Games-Hard problems not known to be NP-hard, like O(1)-approximation of Sparsest Cut? We don’t know of any distribution for which it is plausible to conjecture that such problems are hard, and we have algorithms (Lasserre relaxations of constant degree) with no known counterexample. Many simple distributions of instances are rigorously solved by known algorithms. So, if we want to believe the Unique Games conjecture, we have to believe that there are hard instances, but they are rare and strange.

I am sure that it is possible, under standard assumptions, to construct an artificial problem L in NP that is in average-case-P according to Levin’s definition but not in P. Such a problem would not be polynomial time solvable, but it would be easy to solve on average under any samplable distribution and, intuitively, it would be a problem that is hard even though hard instances are rare and strange.

But can a natural problem in NP exhibit this behavior? Now that Graph Isomorphism is not a plausible example any more, I am inclined to believe (until the next surprise) that no natural problem has this behavior, and my guess concerning the Unique Games conjecture is going to be that it is false (or “morally false,” in the sense that a quasipolynomial time algorithm exists) until someone comes up with a distribution of Unique Games instances that is plausibly hard on average and that, in particular, exhibits integrality gaps for Lasserre relaxations (even just experimentally).

On the Unique Games Conjecture (part 2)

[I am preparing a survey talk on Unique Games for a mathematics conference, and writing a survey paper for a booklet that will be distributed at the conference. My first instinct was to write a one-line paper that would simply refer to Subhash’s own excellent survey paper. Fearing that I might come off as lazy, I am instead writing my own paper. Here are some fragments. Comments are very welcome.]

1. Why is the Unique Games Conjecture Useful?

In a previous post we stated the Unique Games Conjecture and made the following informal claim, here rephrased in abbreviated form:

To reduce Label Cover to a graph optimization problem like Max Cut, we map variables to collections of vertices and we map equations to collections of edges; then we show how to “encode” assignments to variables as 2-colorings of vertices which cut a {\geq c_1} fraction of edges, and finally (this is the hardest part of the argument) we show that given a 2-coloring that cuts a {\geq c_2} fraction of edges, then

  1. the given 2-coloring must be somewhat “close” to a 2-coloring coming from the encoding of an assignment and
  2. if we “decode” the given 2-coloring to an assignment to the variables, such an assignment satisfies a noticeable fraction of equations.

Starting our reduction from a Unique Game instead of a Label Cover problem, we only need to prove (1) above, and (2) more or less follows for free.

To verify this claim, we “axiomatize” the properties of a reduction that only achieves (1): we describe a reduction mapping a single variable to a graph, such that assignments to the variable are mapped to good cuts, and somewhat good cuts can be mapped back to assignments to the variable. The reader can then go back to our analysis of the Max Cut inapproximability proof in the previous post, and see that the properties below are sufficient to implement the reduction.


On the Unique Games Conjecture (part 1)

[I am preparing a survey talk on Unique Games for a mathematics conference, and writing a survey paper for a booklet that will be distributed at the conference. My first instinct was to write a one-line paper that would simply refer to Subhash’s own excellent survey paper. Fearing that I might come off as lazy, I am instead writing my own paper. Here is part 1 of some fragments. Comments are very welcome.]

Khot formulated the Unique Games Conjecture in a remarkably influential 2002 paper. In the subsequent eight years, the conjecture has motivated and enabled a large body of work on the computational complexity of approximating combinatorial optimization problems (the original context of the conjecture) and on the quality of approximation provided by “semidefinite programming” convex relaxations (a somewhat unexpected byproduct). Old and new questions in analysis, probability and geometry have played a key role in this development. Representative questions are:

  • The central limit theorem explains what happens when we sum several independent random variables. What happens if, instead of summing them, we apply a low degree polynomial function to them?
  • In Gaussian space, what is the body of a given volume with the smallest boundary?
  • What are the balanced boolean functions whose value is most likely to be the same when evaluated at two random correlated inputs?
  • What conditions on the Fourier spectrum of a function of several variables imply that the function essentially depends on only a few variables?
  • With what distortion is it possible to embed various classes of finite metric spaces into L1?


Visualizing Unique Games

Every year, at the AMS Joint Mathematics Meeting, there is a “Current Events” session, with four talks on new and exciting developments in math. Some of the talks are given by speakers who are knowledgeable about the development they are talking about, but unrelated to it.

I think of this as an excellent idea, and in fact I have sometimes fantasized about a theory seminar in which, as a rule, the speakers can only talk about other people’s work. (This has remained a fantasy because it would be too hard to recruit the speakers.) Indeed, at the Berkeley theory lunch, I used to encourage such talks, and I rarely spoke about my own work.

This is all to say that at the 2011 meeting one talk will be about Unique Games (according to my calculations, this means that computer science theory accounts for 1/4 of what’s new and exciting in math, yay!) and I have been invited to deliver it.

In the next three weeks, I am supposed to submit a picture (or a description of it, which will be realized by the AMS graphic designers) that illustrates the topic of my talk.

What picture would you use? I am thinking of a drawing of the Khot-Vishnoi graph or a drawing of the graph used to reduce unique games to Max Cut. A face picture of Subhash would also make sense, but I don’t think it’s what they are looking for. (See here for the graphics accompanying past talks.)


Two Announced Breakthroughs

I was at a conference last week, at which I heard two pieces of mathematical gossip.

One was that Arora, Barak and Steurer have developed an algorithm that, given a Unique Games instance in which a 1-\epsilon fraction of constraints are satisfiable, finds an assignment satisfying a constant fraction of constraints in time 2^{n^{poly(\epsilon)}}. The result is now officially announced in a (public) paper by Arora, Impagliazzo, Matthews and Steurer, which presents a slower algorithm running in time 2^{\alpha n}, where \alpha = exp(-1/\epsilon).
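To fix notation, in the standard formulation a unique game asks for labels in \{0,\dots,k-1\} for the vertices of a graph, where each constraint is a permutation of the alphabet, and the goal is to satisfy as many constraints as possible. The toy instance below is my own illustration (not taken from the papers); it just makes precise what “an assignment satisfying a fraction of constraints” means.

```python
# A toy unique game over alphabet {0, 1, 2}: a constraint (u, v, pi)
# is satisfied by a labeling when labels[v] == pi[labels[u]].
# This instance is a 3-cycle whose constraints all say "add 1 mod 3".
k = 3
constraints = [
    (0, 1, [1, 2, 0]),
    (1, 2, [1, 2, 0]),
    (2, 0, [1, 2, 0]),
]

def satisfied_fraction(labels, constraints):
    """Fraction of permutation constraints met by the given labeling."""
    good = sum(labels[v] == pi[labels[u]] for u, v, pi in constraints)
    return good / len(constraints)

print(satisfied_fraction([0, 1, 2], constraints))  # 1.0
print(satisfied_fraction([0, 0, 0], constraints))  # 0.0
```

The defining feature is visible here: once one vertex’s label is fixed, each constraint determines its neighbor’s label uniquely, which is what makes almost-satisfiable instances seem like they should be easy, and is why the conjectured hardness is so delicate.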

I suppose that the next targets are now the approximation problems for which the only known hardness is via unique games. Is there a subexponential algorithm achieving \log\log n approximation for sparsest cut or 1.99 approximation for Vertex Cover?

The other news is on the Polynomial Ruzsa-Freiman conjecture, one of the main open problems in additive combinatorics. Apologies in advance to readers if I get some details wrong. In the special case of \mathbb{F}_2, the conjecture is that if F: \mathbb{F}_2^n \rightarrow \mathbb{F}_2^m is any function such that

\mathbb{P}_{x,y,z} [ F(x) + F(y) + F(z) = F(x+y+z) ] \geq \epsilon

then there is a matrix M and a vector b such that

\mathbb{P}_{x} [ F(x) = Mx + b ] \geq \epsilon^{O(1)}

where the probability in the conclusion is independent of n,m and is polynomial in \epsilon. Various proofs were known achieving a bound of exp(-poly(1/\epsilon)). The first proof, due to Samorodnitsky, achieves, I believe, a bound of exp(-O(1/\epsilon^2)), while the results from this paper should imply a bound of exp(-\tilde O(1/\epsilon)), where we use the notation \tilde O(n) := O(n \log n).
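The quantity in the hypothesis is easy to play with numerically. The sketch below (my own illustration, not from the paper) exhaustively computes \mathbb{P}_{x,y,z}[F(x)+F(y)+F(z)=F(x+y+z)] for small n,m: a linear map satisfies the identity with probability 1, while a generic function satisfies it only for a small fraction of triples (degenerate triples with two equal arguments always satisfy it, so the probability never drops to 0).

```python
import itertools, random

n = m = 4
points = list(itertools.product((0, 1), repeat=n))

def xor(a, b):
    # coordinate-wise addition over F_2
    return tuple(u ^ v for u, v in zip(a, b))

def agreement(F):
    """Exhaustive P_{x,y,z}[F(x)+F(y)+F(z) = F(x+y+z)] over F_2^n."""
    good = sum(
        xor(xor(F[x], F[y]), F[z]) == F[xor(xor(x, y), z)]
        for x in points for y in points for z in points
    )
    return good / len(points) ** 3

random.seed(0)
M = [[random.randint(0, 1) for _ in range(n)] for _ in range(m)]
def linear(x):
    # matrix-vector product over F_2
    return tuple(sum(M[i][j] & x[j] for j in range(n)) % 2 for i in range(m))

F_linear = {x: linear(x) for x in points}
F_random = {x: tuple(random.randint(0, 1) for _ in range(m)) for x in points}

print(agreement(F_linear))  # 1.0: linear maps always satisfy the identity
print(agreement(F_random))  # well below 1 for a typical non-linear function
```

The conjecture asserts the converse direction in polynomial dependence: noticeable agreement forces noticeable correlation with a genuinely affine map.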

At the conference, Ben Green announced a result of Schoen implying a subexponential bound of 1/2^{2^{\sqrt{\log 1/\epsilon}}}.

The proof begins with the standard step of applying a theorem of Ruzsa to find a subset A\subseteq \mathbb{F}_2^n such that |A| \geq \epsilon^{O(1)} 2^n, and F on A is a “Freiman 16-homomorphism,” meaning that for every 32 elements of A such that

a_1 + \cdots + a_{16} = a_{17} + \cdots + a_{32}

we have

F(a_1) + \cdots + F(a_{16}) = F(a_{17}) + \cdots + F(a_{32})

The choice of 16 is just whatever makes the rest of the proof work; the theorem of Ruzsa works for any arbitrarily large constant. Then we consider the set B := 8A of all the elements that can be written as a_1 + \cdots + a_8 with a_i \in A, and we define a function F' on 8A by setting

F'(a_1 + \cdots + a_8) := F(a_1) + \cdots + F(a_8)

which is a well-posed definition because F is a Freiman 16-homomorphism. (It would have been sufficient if F had been an 8-homomorphism.) Note that for every b_1,b_2,b_3,b_4 \in B such that b_1 + b_2 = b_3 + b_4 we have

F'(b_1) + F'(b_2) = F'(b_3) + F'(b_4)

Now the result of Schoen implies that if A is a subset of \mathbb{F}_2^n of size 2^n/K, then there is a subspace V of dimension n - 2^{O(\sqrt{\log K})} such that V is entirely contained in 8A.

Previously, it was known how to get a subspace of dimension n - \tilde O(K), by adapting a technique of Chang (see this survey paper).

Note that F' is now a linear map on V, and that it agrees with F on V. (I cannot reconstruct how this last step follows, however — maybe that claim is that F agrees with F' on a poly(\epsilon) fraction of elements of V? ) It now just remains to extend, arbitrarily, F' to a linear map over the entire space.

Survey on Hardness of Approximation

In 2004 I wrote a survey on hardness of approximation as a book chapter for a book on approximation algorithms. I have just prepared a revised version for a new edition of the book.

While it would have been preferable to rethink the organization and start from scratch, because of time constraints I was only able to add sections on integrality gaps and on unique games, and to add references to more recent work (e.g. the combinatorial proof of the PCP theorem, the new 2-query PCPs, the tight results on minimizing congestion in directed networks, and so on).

If your (or your friend’s) important results are not cited, do let me know. The deadline to submit the chapter has long passed, but the threats from the editor haven’t yet escalated to the point where I feel that I have to submit it or else.