Average Case Complexity News

Greetings from Taiwan, a tropical country with democracy, free press, and rule of law, in which the trains run on time, and the food is awesome. Also people are friendly, drivers don’t honk, and the “close doors” buttons in elevators actually work. Talk about exceptionalism! On November 24, In Theory endorses voting No on 10, No on 11, No on 12, Yes on 13, Yes on 14 and Yes on 15.

More on Taiwan in a later post. Today I would like to give a couple of updates on the survey paper on average-case complexity theory that Andrej Bogdanov and I wrote in 2006: apologies to the readers for a number of missing references, and news on the front of worst-case to average-case reductions.

1. Addendum and missing references

In Section 2 and 3, we discuss Levin’s proof of the analog of NP-completeness for the theory of average-case complexity. Levin proves that there is a decision problem in NP, let’s call it {C}, such that if there is an “average polynomial time” algorithm for {C} with respect to the uniform distribution, then for every decision problem {L} in NP and for every distribution {\mu} of instances that is “polynomial time computable” in a certain technical sense, then {L} also admits an “average polynomial time” algorithm with respect to the distribution {\mu}.

To prove such a result, we want to take an instance {x} sampled from {\mu} and transform it into one (or more) instance {y}, such that if we feed {y} into an “average polynomial time” algorithm for problem {C} then

  1. the algorithm will run also in “average polynomial time” and
  2. from the answer of the algorithm we can decide whether {x\in L} or not.

This is different from the Cook-Levin theorem in that we need the instance {y} produced by the reduction to be approximately uniformly distributed, or else we are not guaranteed that the algorithm for {C} that runs in “average polynomial time” with respect to the uniform distribution will also run in “average polynomial time” in whatever is the distribution of {y}.

The key idea in Levin’s proof is to use compression: for the class of “polynomial time computable” distributions {\mu} that he considers, there exist essentially optimal compression algorithms, such that if {x} is sampled from {\mu} and then mapped to a compressed (but efficiently invertible) representation {x'}, then {x'} is “uniformly distributed.” Then one has to map {x'} to {y} in a way that does not mess up the distribution too much. (Above, “uniformly distributed” is in quotes because {x'} will be a string of variable length, so the sense in which it is approximately uniformly distributed has to be defined carefully.)

In order to turn this intuition into a proof, one has to give a precise definition of “average polynomial time” and of reductions between distributional problems, and prove that reductions preserve the existence of average polynomial time algorithms. These matters are quite subtle, because the first definitions that come to mind don’t quite work well, and the definitions that do work well are a bit complicated.

The paper with Andrej lacks references to several important works that clarified, simplified, and generalized Levin’s original presentation. In particular, our definition of average polynomial time, and the proof that it is preserved by reductions, is due to Russell Impagliazzo, in his famous “five worlds” paper. The presentation of the completeness result follows notes by Oded Goldreich. As in Oded’s notes, we present the completeness of a Bounded Halting problem, which we erroneously attribute to Levin, while it is due to Gurevich. After one proves the existence of one complete problem, one would like to see a series of reductions to natural problems, establishing more completeness results, that is, one would like the average-case complexity analog of Karp’s paper! This has not happened yet, although there have been a number of important average-case completeness results, which we also fail to cite.

We have submitted to the publisher an addendum to correct all the above omissions. We have already privately apologized to the authors of the above works, but we would also like to publicly apologize to the readers of the survey.

2. Worst-case to average-case reductions

In Section 7, Andrej and I discuss the problem of constructing “worst-case to average-case reductions” for NP-complete problems. Ideally, we would like to “collapse” Levin’s theory to the standard theory of NP-completeness, and to prove that if {P\neq NP}, or something morally equivalent like {NP \not\subseteq BPP}, then Levin’s complete problem {C} does not admit an average polynomial time algorithm with respect to the uniform distribution (same for other complete distributional problems). More ambitiously, one may hope to base symmetric-key cryptography, or even public-key cryptography, on the assumption that {NP \not\subseteq BPP}, which, short of settling the {P} versus {NP} question, would be the soundest possible basis on which to base cryptography. (One would also get high confidence that the resulting cryptosystems are quantum-resistant!)

Indeed, there are problems, like the Permanent and (in a limited sense) the Discrete Log problem in which one can take a worst-case instance {x}, and map into to a sequence of instances {y_1,\ldots,y_k}, each of which is uniformly distributed and such that solving the problem on the {y_i} gives a solution for {x}. Thus, an average polynomial time algorithm for such problems, which are called random self-reducible, gives a worst-case (randomized) polynomial time algorithm.

Feigenbaum and Fortnow, unfortunately, prove that if a problem in NP is random self reducible (or even randomly reducible, in a similar sense, to another problem in NP), then it is, essentially, in coNP, and so such techniques cannot apply to NP-complete problem.

In 2003, Andrej Bogdanov and I generalized this to show that if one has any non-adaptive reduction from a problem {L} to a distributional problem {\langle A, \mu \rangle} where {A} is in {NP} and {\mu} is samplable, then {L} is in {coNP/poly} and thus unlikely to be NP-complete. Our definition of reduction allows any non-adaptive procedure that, given an oracle that solves {A} in average polynomial time with respect to {\mu}, is able to solve {L} in worst-case polynomial time.

It remains open whether such a result holds for adaptive reductions, and my guess is that it does. I actually had a much stronger guess, namely that for every NP problem {A} and every samplable distribution {\mu}, solving {A} well on average with respect to {\mu} is doable with a computational power in the ballpark of {NP \cap coNP}, perhaps in randomized polynomial time with oracle access to a language in {NP/poly \cap coNP/poly}. If so there would be a fundamental gap between the complexity of solving NP-complete problems in the worst case versus the average case.

A recent, very exciting paper by Shuichi Hirahara, shows that my guess was wrong, and that there are distributional problems in NP with respect to samplable distributions, whose complexity is fundamentally higher than {NP \cap coNP}.

Hirahara considers an NP problem called MINKT, which is the problem of computing a resource-bounded version of Kolmogorov complexity. He shows that if MINKT has an average polynomial time algorithm, then it has a ZPP worst-case algorithm (the latter is an approximation algorithm with additive error). Furthermore, under reasonable assumptions, MINKT is not in coNP, or coNP/poly, or in any morally equivalent class.

So did Hirahara come up with an adaptive reduction, in order to get around the result of Andrej and myself? No, remarkably he comes up with an argument that is not a reduction! (More modestly, he says that he comes up with a non-black-box reduction.)

To clarify terminology, I prefer to call a reduction (what most other people prefer to call a “black box” reduction, in this setting) a way of showing that, given an oracle for a problem {B} we show how to solve problem {A}; hence an algorithm for {B} implies an algorithm for {A}.

This seems to be the only possible way of showing that if we have an algorithm for {B} then we have an algorithm for {A}, but consider the following example. Say that {S} is a complete problem for {\Sigma_2}, the second level of the polynomial time hierarchy; we know that if there is a polynomial time algorithm for 3SAT then there is an algorithm for {S}, however the proof does not involve a (black box, if you must) reduction from {S} to 3SAT. In fact, we do not know how to solve {S} in polynomial time given an oracle for 3SAT and, assuming that the polynomial time hierarchy does not collapse, there is no way to do that!

So how do we prove that if 3SAT is in polynomial time then {S} is in polynomial time? We take the {\Sigma_2} “algorithm” for {S}, which is an NP machine with oracle an NP machine, we replace calls to the oracle with executions of the 3SAT algorithm, and now we have a regular NP machine, which we can reduce to 3SAT. Note that, in the first step, we had to use the code of the algorithm for 3SAT, and use the assumption that it runs in polynomial time.

Hirahara’s argument, likewise, after assuming that MINKT has an average polynomial time algorithm, uses the code of the algorithm and the assumption about efficiency.

At this point, I do not have a good sense of the computational power that is necessary or sufficient to solve NP distributional problems well on average, but Hirahara’s paper makes me think it’s conceivable that one could base average-case hardness on {NP\not\subseteq BPP}, a possibility that I previously considered hopeless.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s