# FOCS 2008 Tutorial

Last week I gave a tutorial on average-case complexity at FOCS’08.

I have now posted online the slides, along with links to some survey papers.

In my talk I discussed the following four open problems, which are among my favorite.

1. Is it possible to base the existence of average-case hard problems in NP on the existence of worst-case hard problems in NP? (Say, on the assumption that $NP \not\subseteq BPP$.)

The evidence is mixed; work by Ajtai and others has shown the existence of distributional problems in NP which are hard-on-average provided that certain related $NP$ problems are hard in the worst case. The latter problems, however, are in $NP \cap coNP$ and are unlikely to be $NP$-complete. Furthermore, Feigenbaum and Fortnow show that if one proves that the worst case solution of a problem $A$ in $NP$ reduces to the average-case solution of a problem $B$ in $NP$ under a samplable distribution $D$, and the reduction has a certain special form, then $B$ is in “$NP \cap coNP$.” (There are scare quotes because $B$ is actually in $NP \cap coAM/log$, which is “essentially the same” as $NP \cap coNP$.) The Feigenbaum-Fortnow result has been generalized by Bogdanov and me to hold for any “non-adaptive” reduction, but the following question remains open.

Let $A,B$ be problems in $NP$. Suppose there is a polynomial-time reduction $R$ and a polynomial $q$ with the following property: For every oracle $O$ that agrees with $B$ on a $1- \frac 1 {q(n)}$ fraction of inputs of length $n$, $R^O$ correctly solves $A$ on every input.

Then $A$ is in $NP \cap coNP/poly$, or, more broadly, $A$ is in a class that cannot contain $NP$-complete problems unless the polynomial hierarchy collapses

2. In the hardness amplification problem we are given (or we assume the existence of) a problem that is hard on average in a weak sense (meaning that every efficient algorithm makes at least a $\frac {1} {poly(n)}$ fraction of mistakes) and we want to define a new problem (possibly, a very artificial one) that is hard on average in a much stronger sense. (Every efficient algorithm makes at least a $\frac {1}{2} - \frac {1}{poly(n)}$ fraction of mistakes, and is not noticeably better than the trivial algorithm that outputs a random guess and makes a $\frac 12$ fraction of mistakes.) O’Donnell and Healy, Vadhan and Viola prove such a result for problems in $NP$; their reductions are non-uniform, and hence the result applies to the notion of “efficient algorithm” given by polynomial size circuits. A uniform reduction, that gives amplification against probabilistic algorithms, appears in a work of mine, but it only amplifies from a $\frac {1}{poly(n)}$ fraction of mistakes to a $\frac {1}{2} - \frac {1}{polylog (n)}$ fraction of mistakes.

The following question remains open.

For every problem $W$ in $NP$ and polynomials $p,q$ there is a problem $S$ in $NP$ such that the following happens. If there is a randomized polynomial time algorithm $A$ that solves $S$ on a $\geq \frac 12 - \frac 1 {p(n)}$ fraction of inputs of length $n$, then there is a randomized polynomial time algorithm $B$ that solves $W$ on a $\geq 1 - \frac 1 {q(n)}$ fraction of the inputs of length $n$.

3. Is there a more effective proof of Levin’s completeness result?

Levin proves that there is an $NP$ problem $C$ such that for every $NP$ problem $A$ and every “computable” distribution $D$, there is a reduction $R$ from $\langle A,D\rangle$ to $\langle C,Uniform \rangle$. The reduction preserves average-case solvability in the sense that the distribution $R(D)$ is dominated by the uniform distribution. (Every element in the range of $R(D)$ has probability at most polynomially larger than in the uniform distribution.) There is, however, a loss in the domination factor which is exponential in the length of the non-deterministic program that puts $A$ in $NP$.

Is this necessary?

I leave it as an open-ended question, but I would consider a negative result in a relativized world to be a satisfactory negative answer. In particular, I believe that it is possible to construct an oracle $O$ relative to which for every fixed problem $C$ in $NP$ it is not possible to construct a polynomial-time reduction $R$ such that given a non-deterministic circuit $N$ deciding the language $A$, and a circuit $S$ “computing” the distribution $D$, the mapping $x \rightarrow R(N,S,x)$ is a reduction from $\langle A,D\rangle$ to $\langle C,Uniform \rangle$ such that $R(N,S,D)$ is dominated by the uniform distribution with a domination factor depending polynomially in the sizes of the circuits $N$ and $S$.

4. Find more reductions between distributional problems

We have very few completeness results in Levin’s theory, but also very few reductions between problems that are not known to be complete. I think it would be very interesting to discover more such reductions. The following question is due to Russell Impagliazzo.

Prove that if there is an efficient algorithm that solves well on average SAT on random instances with $n$ variables and $12n$ clauses, then there is an efficient algorithm that solves well on average SAT on random instances with $n$ variables and $10 n$ clauses.

In this setting, “solving well on average” means “certifying the unsatisfiability of a $1 - \frac {1}{poly(n)}$ fraction of the formulas.” Note that the converse is trivial: if we have an algorithm that certifies the unsatisfiability of formulas with $10 n$ clauses, then given $12 n$ random clauses we can drop the last $2n$ of them and pass the rest to the algorithm; if the algorithm certifies the unsatisfiability of this subformula, we are done.