Average Case Complexity News

Greetings from Taiwan, a tropical country with democracy, free press, and rule of law, in which the trains run on time, and the food is awesome. Also people are friendly, drivers don’t honk, and the “close doors” buttons in elevators actually work. Talk about exceptionalism! On November 24, In Theory endorses voting No on 10, No on 11, No on 12, Yes on 13, Yes on 14 and Yes on 15.

More on Taiwan in a later post. Today I would like to give a couple of updates on the survey paper on average-case complexity theory that Andrej Bogdanov and I wrote in 2006: apologies to the readers for a number of missing references, and news on the front of worst-case to average-case reductions.

On the Necessity of Enumerating All Programs

In a tutorial on average-case complexity that I gave at FOCS 2008, I mentioned four of my favorite open questions in the theory. I have some progress to report on one of the questions (#3 in this post), and on two related problems.

The question is whether a certain exponential loss is necessary in Levin’s proof that there are problems that are “NP-complete on average.” The answer is yes, it is necessary, unless the polynomial hierarchy collapses. The two related problems are whether related exponential losses in the analysis of Levin’s universal search algorithm and of Levin’s universal one-way function are also necessary. The answers are, respectively, yes unless P = NP, and yes relative to an oracle.

This is one of those now fashionable papers whose arguments are “obvious in retrospect.” (I would like to claim, however, that the proof of the result for Levin’s average-case completeness, while obvious in hindsight, is not so obvious without hindsight.) In all three cases, the exponential loss is in terms of the code length of certain programs. The idea in each case is to construct programs that contain in their code information about a specific instance of a problem of interest.

Average-case complexity

What do we do with an NP-complete (optimization) problem of practical relevance? We strongly believe that there is no polynomial time exact algorithm, but what are the next best alternatives? Garey and Johnson, in 1978, list various possible ways to “cope” with NP-completeness, including looking for approximate algorithms and for algorithms that are efficient on average. Of course, when a problem is really hard, even these options may be intractable, and it would be good to know when this is the case.

Before the development of the PCP machinery, complexity theorists had almost no tools to study the complexity of approximation. When it comes to studying the average-case complexity of natural problems in NP, unfortunately, we are still at the stage of pre-PCP studies of approximation.

Indeed, it has been quite difficult even to lay out the ground rules of the study of average-case complexity. Even defining such basic notions as “computational problem,” “efficient algorithm,” “reduction,” and “complete problem” in the setting of average-case complexity leads to surprisingly subtle difficulties.

Much of this foundational work was laid out in a famously succinct paper of Levin’s, with important extensions appearing in a paper of Ben-David and others and in a paper of Impagliazzo and Levin.

The main results and insights in this field as of 1995 are summarized in a paper by Russell Impagliazzo that is perhaps the best survey paper in theoretical computer science ever written.

Andrej Bogdanov and I have written a more recent survey on the subject. We are in the process of revising it for journal publication, and we will be very grateful to anybody pointing out errors and omissions (especially in the references).

So, given that we have no tools yet to study the average-case complexity of natural problems, what exactly do we talk about for 58 pages?

There are, indeed, a number of interesting results and insights.

  1. To begin with, if we have a language L and a distribution D over inputs, how do we define the notion that an algorithm A is “efficient on average” for L with respect to D? The simplest definition is to require the expected running time to be polynomial. Indeed, all expected polynomial time algorithms satisfy the intuitive notion of being “efficient on average,” but there are algorithms of expected superpolynomial running time that are also intuitively “typically efficient.” Suppose, for example, that for every n our algorithm takes time n^2 on all inputs of length n, except one exceptional input on which the algorithm takes time 2^{2n}. Levin’s definition better captures the intuitive notion and includes such “pathological” cases. In Levin’s definition, an algorithm is “polynomial on average” if, after running for t(n) steps, the algorithm can solve all but a poly(n)/(t(n))^c fraction of inputs of length n, for some c>0. (A worked form of this definition is given right after this list.)
  2. Now that we have a notion of efficiency, the next step is to have the average-case analog of an NP-complete problem. Consider the “bounded halting” problem BH where, on input a pair (M,x) where M is a non-deterministic Turing machine and x is a string of length n, we ask whether M can accept x in at most, say, n^2 steps. It’s clear that BH is NP-complete. If L is a language in NP, and M is the non-deterministic machine that decides L, then the reduction from L to BH essentially maps x into (M,x). (We have to add some padding if M takes more than quadratic time to decide L, but the details are trivial.)

    Now let D be a distribution. Consider the reduction from L to BH that maps x into (M’,C(x)), where C is an encoding function and M’ is a machine that, on input y, first finds the x such that C(x)=y and then simulates M on input x. As long as C is injective and polynomial time computable this is still a valid reduction. Levin shows that, if D comes from a certain class of “polynomial-time computable distributions,” then we can find a C such that C(x) is “almost uniformly” distributed when x is sampled from D. This means that the reduction maps a random input from D into a pair (M’,C(x)) that is “almost uniformly” distributed. (Technically, the resulting distribution has very high min-entropy or, which is the same, is “dominated” by the uniform distribution; the domination condition is spelled out after this list.) This means that an algorithm that is good-on-average for BH under the uniform distribution yields an algorithm that is good-on-average for L under the distribution D. Recall that L was an arbitrary problem in NP and D was an arbitrary “polynomial time computable” distribution.

  3. If there is a good-on-average algorithm for BH under the uniform distribution, then there is a good-on-average algorithm for the search version of BH as well. (In the search version, we want to find an accepting computation of M if one exists.) The equivalence is trivial in the worst-case setting, but it requires an ingenious proof due to Ben-David et al. in the average-case setting. (A sketch of the easy worst-case argument appears after this list.)
  4. If there is a good-on-average algorithm for BH under the uniform distribution, then there is a good-on-average algorithm for every problem L in NP under any polynomial time samplable distribution. This is a result of Impagliazzo and Levin, and it is a proper generalization of the completeness result above. Samplable distributions are more general than computable distributions and capture essentially all “natural” distributions. (Both notions are recalled after this list.)
  5. If there is a good-on-average algorithm for BH under the uniform distribution, can we say that there are worst-case efficient algorithms for all problems in NP? Not if the connection is made with a non-adaptive reduction. This is a result that Andrej and I proved, generalizing earlier work of Feigenbaum and Fortnow. Can we at least say that there are good-on-average algorithms for all L in NP and all distributions D? Such a conclusion implies worst-case efficient algorithms for NP, so the answer is again probably not.
  6. So it is probably impossible to equate the worst-case complexity and the average-case complexity of NP-complete problems via reductions. What about different “degrees” of average-case complexity? If we define good-on-average algorithms to always run in polynomial time and to make a bounded number of mistakes, we can quantify average-case complexity by the fraction of inputs on which the best polynomial time algorithm makes mistakes. O’Donnell shows the equivalence of strong and weak assumptions about the existence of average-case-hard problems in NP. Ryan’s proof brilliantly generalizes Yao’s XOR Lemma. (A rough statement of the XOR Lemma is given after this list.)
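
To make item 1 concrete, here is one standard way to write Levin’s tail-bound condition; this is a paraphrase in my own notation (with D_n the input distribution on length-n inputs and t_A(x) the running time of A on x), not a quote from Levin’s paper:

\[
\Pr_{x \sim D_n}\bigl[\, t_A(x) \ge t \,\bigr] \;\le\; \frac{\mathrm{poly}(n)}{t^{c}} \qquad \text{for every } t \text{ and some fixed } c>0 .
\]

For the pathological example in item 1, under the uniform distribution, the fraction of inputs of length n on which the algorithm runs for more than t steps is at most 2^{-n} once t \ge n^2 (and the bound is trivially true for smaller t), and it is zero once t \ge 2^{2n}; since 2^{-n} \le n/t^{1/2} for all t \le 2^{2n}, the condition holds with c = 1/2, even though the expected running time is roughly 2^n. Up to the choice of constants and polynomial factors, the tail-bound form is equivalent to asking that \mathbb{E}_{x \sim D_n}[\, t_A(x)^{\epsilon} \,] = O(n) for some \epsilon > 0.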
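
The “domination” condition mentioned parenthetically in item 2 can be written as follows (again in my notation: D’ is the distribution of the instance (M’,C(x)) produced by the reduction when x \sim D_n, and U is the uniform distribution on instances of the appropriate length):

\[
D'(y) \;\le\; \mathrm{poly}(|y|)\cdot U(y) \qquad \text{for every instance } y .
\]

Domination is exactly what lets average-case easiness flow backwards through the reduction: if an algorithm for BH errs on a set S of instances with U(S) \le \delta, then the reduced instance lands in S with probability at most \mathrm{poly}(n)\cdot\delta over x \sim D_n, so we get an algorithm for L that errs on at most a \mathrm{poly}(n)\cdot\delta fraction of inputs drawn from D_n.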
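
As promised in item 3, here is why search reduces to decision trivially in the worst case: with a decision oracle one recovers an accepting computation one nondeterministic choice at a time. The Python below is only an illustrative sketch; the oracle accepts_with_prefix is hypothetical (in an actual reduction each such query would itself be rewritten as a BH instance by hard-wiring the prefix into a modified machine), and the average-case version genuinely needs the different argument of Ben-David et al.

    # Illustrative worst-case search-to-decision sketch for BH.
    # `accepts_with_prefix(M, x, t, prefix)` is a hypothetical decision oracle:
    # "does M accept x within t steps via a sequence of nondeterministic
    # choices that extends `prefix`?"
    def find_accepting_choices(M, x, t, accepts_with_prefix):
        """Recover a sequence of nondeterministic choices that makes M accept x."""
        if not accepts_with_prefix(M, x, t, prefix=[]):
            return None                    # M does not accept x within t steps
        prefix = []
        for _ in range(t):                 # at most t choices are made in t steps
            if accepts_with_prefix(M, x, t, prefix=prefix + [0]):
                prefix.append(0)           # some accepting run continues with choice 0
            elif accepts_with_prefix(M, x, t, prefix=prefix + [1]):
                prefix.append(1)           # otherwise some accepting run continues with 1
            else:
                break                      # no accepting run makes another choice: done
        return prefix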
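
The two classes of distributions that appear in items 2 and 4 are, roughly, the following (I am paraphrasing the standard definitions). An ensemble {D_n} is polynomial-time computable if its cumulative probability can be computed in time polynomial in n, and it is polynomial-time samplable if it is the output distribution of a polynomial-time sampler:

\[
F_n(x) = \Pr_{y \sim D_n}\bigl[\, y \preceq x \,\bigr] \ \text{computable in time } \mathrm{poly}(n)
\qquad \text{vs.} \qquad
S(1^n) \sim D_n \ \text{for some probabilistic poly-time } S ,
\]

where \preceq is the lexicographic order on strings. Every computable ensemble is samplable, roughly by picking a uniformly random number r and binary-searching for the x whose cumulative interval contains r, while the converse is believed to fail; this is why the Impagliazzo–Levin result in item 4 is a proper generalization of Levin’s completeness result.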
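
Finally, since item 6 invokes it: Yao’s XOR Lemma concerns the function

\[
f^{\oplus k}(x_1,\dots,x_k) \;=\; f(x_1)\oplus\cdots\oplus f(x_k),
\]

and says, roughly (I am suppressing the exact size and advantage parameters), that if every circuit of size s fails to compute f on at least a \delta fraction of inputs, then every circuit of somewhat smaller size agrees with f^{\oplus k} on at most a 1/2 + \epsilon fraction of inputs, where \epsilon decays exponentially in k. The difficulty in the NP setting, which O’Donnell’s proof overcomes, is that the XOR of k copies of an NP problem is not obviously in NP, so the combining function has to be chosen more carefully.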