In a tutorial on average-case complexity that I gave at FOCS 2008, I mentioned four of my favorite open questions in the theory. I have some progress to report on one of the questions (#3 in this post), and on two related problems.
The question is whether a certain exponential loss is necessary in Levin’s proof that there are problems that are “NP-complete on average.” The answer is yes, it is necessary, unless the polynomial hierarchy collapses. The two related problems are whether related exponential losses in the analysis of Levin’s universal search algorithm and of Levin’s universal one-way function are also necessary. The answers are, respectively, yes unless $P=NP$, and yes relative to an oracle.
This is one of those now fashionable papers whose arguments are “obvious in retrospect.” (I would like to claim, however, that the proof of the result for Levin’s average-case completeness, while obvious in hindsight, is not so obvious without hindsight.) In all three cases, the exponential loss is in terms of the code length of certain programs. The idea in each case is to construct programs that contain in their code information about a specific instance of a problem of interest.
1. Levin’s Universal Search
It is simpler to start the discussion from Levin’s universal search algorithm. Levin, in his 1973 paper independently discovering NP-completeness, suggested the following procedure to solve NP search problems: look for solutions in order of increasing resource-bounded Kolmogorov complexity. In other words, try all possible algorithms, with a time-out, for all possible time-outs, with a rather smart schedule to choose algorithms and time-outs.
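Here is a minimal, purely illustrative Python sketch of the dovetailing schedule (this is not Levin’s actual algorithm: instead of enumerating all programs, it interleaves a fixed list of candidate solvers, each written as a generator that does one “step” of work per iteration; all names, the clause encoding, and the interface are invented for the example):

```python
# Toy sketch of the dovetailing schedule behind Levin's universal search.
# Illustrative only: real universal search enumerates *all* programs and gives
# the i-th program roughly a 2**(-i) share of the total running time.

from itertools import count

def satisfies(formula, assignment):
    # clauses are lists of signed, 1-based variable indices
    return all(any(assignment[abs(l) - 1] == (l > 0) for l in clause)
               for clause in formula)

def brute_force(formula, n_vars):
    """Candidate solver: try all assignments, yielding control after each one."""
    for a in range(2 ** n_vars):
        assignment = [(a >> i) & 1 for i in range(n_vars)]
        if satisfies(formula, assignment):
            yield assignment
            return
        yield None  # one "step" of work, no answer yet

def universal_search(formula, n_vars, solvers):
    """In phase k, the i-th solver gets a budget of 2**(k-i) steps.  Never halts
    if no solver ever finds an assignment (e.g., on unsatisfiable formulas),
    just like the procedure described in the post."""
    runs = [s(formula, n_vars) for s in solvers]
    for k in count(0):
        for i, run in enumerate(runs):
            if run is None or i > k:
                continue
            for _ in range(2 ** (k - i)):
                try:
                    out = next(run)
                except StopIteration:
                    runs[i] = None
                    break
                if out is not None and satisfies(formula, out):
                    return out

# (x1 OR NOT x2) AND (x2 OR x3)
print(universal_search([[1, -2], [2, 3]], 3, [brute_force]))
```

A program appearing at position $i$ in the enumeration gets roughly a $2^{-i}$ share of the total running time, which is where the exponential-in-code-length constant in the guarantee below comes from.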
In the concrete case of 3SAT, Levin’s algorithm has the property that for every algorithm $A$ there exists a constant $c_A$ such that if $A$ finds a satisfying assignment for a formula $\varphi$ in time $t$, then Levin’s algorithm finds a satisfying assignment for $\varphi$ in time at most

$c_A \cdot ( t + poly(|\varphi|) ) \ \ \ \ \ (1)$

That is, for every algorithm $A$, the running time of Levin’s algorithm is at most a constant factor slower than $A$ on every input. The value of $c_A$, however, is exponential in the length of the code of $A$.
Could we hope to have a universal search algorithm for 3SAT that, for every algorithm $A$ and satisfiable $\varphi$ as above, has running time upper bounded by a polynomial in the length of the code of $A$, in $|\varphi|$, and in $t$?
There is a very easy argument (which, as far as I know, has not been observed before) showing that such a “fully polynomial” universal search algorithm would imply P=NP.
Suppose such an algorithm existed, call it $FPS$ (for “fully polynomial search”); I claim that, for every satisfiable formula $\varphi$, $FPS(\varphi)$ outputs a satisfying assignment for $\varphi$ in time polynomial in $|\varphi|$. ($FPS$ may not terminate on unsatisfiable formulas, but the fact that it finds satisfying assignments, when they exist, in polynomial time suffices to conclude that $P=NP$.) To prove the claim, let $\varphi$ be a satisfiable formula, and consider the algorithm $A_\varphi$ that, on input a formula $\psi$, does the following: if $\psi = \varphi$, then $A_\varphi$ outputs a satisfying assignment for $\varphi$ that is explicitly specified as part of the code of $A_\varphi$; if $\psi \neq \varphi$, then $A_\varphi$ performs a brute-force search for a satisfying assignment for $\psi$, and it outputs the first one it finds, if any.
The length of the code of $A_\varphi$ is linear in $|\varphi|$, and the running time of $A_\varphi$ on input $\varphi$ is also linear in $|\varphi|$, and so $FPS$ must output a satisfying assignment for $\varphi$ in polynomial time.
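To make the construction concrete, here is a hedged Python sketch of $A_\varphi$ for one particular choice of $\varphi$, using the same toy encoding of formulas as in the sketch above (the formula and its hard-coded satisfying assignment are invented for illustration):

```python
# Sketch of A_phi: the target formula phi and one of its satisfying assignments
# are written directly into the program, so the code length is O(|phi|).
# On input phi it answers immediately; on any other formula it falls back to
# brute force, so it is a correct (if slow) 3SAT solver on every input.

PHI = [[1, -2], [2, 3]]          # the fixed satisfiable formula phi
PHI_ASSIGNMENT = [1, 1, 0]       # a satisfying assignment, hard-coded

def A_phi(psi, n_vars):
    if psi == PHI:
        return PHI_ASSIGNMENT    # linear-time special case for phi itself
    for a in range(2 ** n_vars): # brute-force search on every other input
        assignment = [(a >> i) & 1 for i in range(n_vars)]
        if all(any(assignment[abs(l) - 1] == (l > 0) for l in clause)
               for clause in psi):
            return assignment
    return None                  # psi is unsatisfiable
```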
Indeed, even this completely trivial proof is overkill: we could have let $A_\varphi$ just output a satisfying assignment of $\varphi$ regardless of the input $\psi$. The definition of $A_\varphi$ given above, however, rules out even universal search algorithms whose performance is guaranteed only against algorithms that are provably correct on all inputs. Hutter has studied such algorithms, and proved that the exponential dependency on the length of the code of $A$ (and, in his case, the exponential dependency on the length of the proof of correctness of $A$ and of the proof establishing time bounds on $A$) can be pushed into an additive term, rather than being a factor that multiplies the running time of $A$, as in (1). The above construction, however, shows that the additive term cannot be polynomial in the code length and proof length.
There are two other famous results of Levin which involve exponential complexity in the length of the code of certain algorithms, and, in those cases, the necessity of the exponential dependency is a somewhat less trivial question.
2. Levin’s Universal One-Way Function
Quite related to his universal search algorithm is Levin’s proof that there exists a universal one-way function, that is, a polynomial time computable function that is one-way provided that one-way functions exist.
Levin’s idea is the following. Define the function $U$ such that for every algorithm $M$ and input $x$ we have

$U(M,x) = (M, M(x))$

where $M(x)$ is the output of $M$ on input $x$ if the computation of $M$ on $x$ terminates within, say, $|x|^2$ steps, and $M(x)$ is, say, a string of $|x|$ zeroes otherwise. We call such a function $U$ because it is essentially a (time-bounded) universal Turing machine.
Now it is easy to see that if one-way functions exist, then there exists a one-way function $f$ computable in time at most $n^2$ on inputs of length $n$. Call this function $f$ and let $M_f$ be the algorithm that computes $f$ in time at most $n^2$. Then the mapping $x \to U(M_f, x)$ is one-way (in fact, it is essentially the mapping $x \to f(x)$), and it accounts for a $2^{-|M_f|}$ fraction of the inputs of $U$. This implies that $U$ is a weak one-way function.
Finally, define $\hat U$ to be the function

$\hat U(z_1,\ldots,z_n) := (U(z_1),\ldots,U(z_n))$

where an input of length $n^2$ has been parsed into $n$ strings of length $n$. By Yao’s amplification of hardness for one-way functions, if $U$ is a weak one-way function then $\hat U$ is a one-way function in the standard sense.
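As a purely illustrative sketch of the structure of $U$ and $\hat U$ (in Python, with a “program” being a string of Python source defining a function $f$, and with the crucial $|x|^2$ step bound omitted for simplicity, so this is not a faithful implementation):

```python
# Toy illustration of the *structure* of Levin's universal one-way function.
# Simplifications: a "program" is a string of Python source defining f, the
# pair (program, x) stands for the parsed input of U, and we omit the step
# bound (the real construction cuts computations off after about |x|^2 steps).

def U(program, x):
    """Output the program together with (a stand-in for) its time-bounded output."""
    env = {}
    try:
        exec(program, env)       # define f from the program text
        y = env["f"](x)          # real construction: simulate for at most |x|**2 steps
    except Exception:
        y = "0" * len(x)         # default output when the computation "fails"
    return (program, y)

def U_hat(pairs):
    """Apply U to each block independently; by Yao's amplification this turns
    the weakly one-way U into a standard one-way function (assuming one-way
    functions exist at all)."""
    return tuple(U(p, x) for (p, x) in pairs)

# Example with a toy (certainly not one-way) program, purely for illustration.
prog = "def f(x): return ''.join(sorted(x))"
print(U_hat([(prog, "theory"), (prog, "levin")]))
```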
Where is the exponential dependency here?
This is a subtle but important point that I learnt from Charles Rackoff. Suppose that we have a polynomial time algorithm that inverts $\hat U$ on an inverse polynomial fraction of inputs. Let $f$ be any polynomial time computable function. Then, by using such an inverter and the reduction outlined before, we can invert $f$ on, say, half of the inputs, in time that is polynomial in the length of the input, but exponential in the length of the program of $f$.
Now, the strength of a universal one-way function is the fact that one can use it as a primitive and claim that either it is secure or otherwise all one-way functions can be inverted and thus no cryptography is possible. But if we assume that Levin’s function is not one-way, the inverter that we get for, say, AES has complexity much higher than brute-force search. In general, if we have a finite function whose code length is comparable to its input length (as is the case for functions constructed using $S$-boxes and other ad hoc tables of values) then no non-trivial attack on such a function would follow from breaking Levin’s function.
It would be much more desirable to construct a universal one-way function $g$ with the property that if there is a polynomial time algorithm that inverts $g$ on an inverse polynomial fraction of inputs, then we can use such an inverter to invert any other polynomial time computable function $f$ on an inverse polynomial fraction of inputs, in time polynomial in the length of the input and in the length of the code of $f$.

We show that there is an oracle relative to which one-way functions exist but there is no algorithm that is able to invert every polynomial time computable function $f$ on an inverse polynomial fraction of inputs in time polynomial in the code length of $f$ and in the length of the given inputs.
3. Levin’s Average-Case Complete Problem
The exponential dependency in Levin’s average-case complexity is even harder to explain.
Perhaps it helps to start from Cook’s theorem that 3SAT is NP-complete in the standard worst-case complexity setting. Cook proves that for every language $L$ in $NP$ there is a polynomial time computable reduction function $f$ such that for every string $x$ we have

$x \in L \Leftrightarrow f(x) \in 3SAT$

So if we have a worst-case polynomial time algorithm for solving 3SAT, we also get a worst-case polynomial time algorithm for $L$, which works as follows: on input $x$, apply the 3SAT algorithm to $f(x)$.
An important point is that such reductions not only exist, but are also explicitly and efficiently given if one has a description of $L$ in the form of a non-deterministic polynomial time algorithm that decides $L$. Indeed, Cook’s theorem shows the existence of a single function $C(\cdot,\cdot,\cdot)$ such that if $N$ is a non-deterministic algorithm that decides a language $L$ in time $p(n)$, then the mapping

$x \to C(N, x, p(|x|))$

is a reduction from $L$ to 3SAT. Furthermore, $C$ is computable in time polynomial in $|N|$, in $|x|$ and in $p(|x|)$.
Note that, in order to establish the NP-completeness of 3SAT, it would have been perfectly adequate if $C$ had been computable in time exponential or doubly exponential in $|N|$, as long as it is polynomial in $|x|$ and in $p(|x|)$. This hypothetical result would, however, be much less satisfying, because in practice it would not be true that “an efficient algorithm for 3SAT implies an efficient algorithm for each $L$ in $NP$,” unless we restrict ourselves to languages $L$ that have extremely short descriptions.
In the theory of average-case complexity, instead of languages we study distributional problems, that is, pairs $(L,D)$ where $L$ is a language and $D$ is a distribution over the inputs. A reduction from $(L_1,D_1)$ to $(L_2,D_2)$ is a polynomial time computable function $f$ such that $x \in L_1$ iff $f(x) \in L_2$, as in worst-case complexity theory, but with the additional property that the distribution of $f(x)$, for $x$ sampled from $D_1$, is “polynomially dominated” by $D_2$. This means that if we look at any set $B$ of “bad” instances that has probability $\epsilon$ according to $D_2$, then if we sample a random $x$ according to $D_1$, the probability that $f(x)$ lands in $B$ is at most $\epsilon \cdot poly(|x|)$.
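In symbols, one common way of writing the domination requirement (the exact quantifiers and the choice of the polynomial vary between presentations, so take this as an assumed paraphrase rather than the official definition) is that there is a fixed polynomial $q$ such that, for every input length $n$ and every set $B$,

$\Pr_{x \sim D_1, |x|=n}\left[ f(x) \in B \right] \leq q(n) \cdot \Pr_{y \sim D_2}\left[ y \in B \right]$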
The point of this definition is that we would like to say the following: if there is a good average-case algorithm $A$ for $(L_2,D_2)$, then there is also a good average-case algorithm for $(L_1,D_1)$, which works as follows: on input $x$, run algorithm $A$ on $f(x)$. The good-on-average algorithm $A$, however, might provide incorrect answers (or time out, depending on the flavor of the definition) on some inputs, which have small probability mass according to $D_2$. The domination property guarantees that the distribution of $f(x)$, with $x$ sampled from $D_1$, is not too likely to land in this set of bad inputs for $A$.
Consider now the non-deterministic bounded-halting problem $BH$ defined as follows: on input $(N,x,r)$, where $N$ is a non-deterministic algorithm and $x$ and $r$ are bit strings, is there a computation path of $N$ that accepts $x$ within at most $|r|$ steps? Let $\mathcal{U}$ be the uniform distribution (ensemble) over bit strings.
Here is a proof that the distributional problem $(BH, \mathcal{U})$ is average-case complete for all distributional problems of the form $(L, \mathcal{U})$, where $L$ is in $NP$: to reduce $(L, \mathcal{U})$ to $(BH, \mathcal{U})$, use the (randomized) reduction

$x \to (N, x, r)$

where $N$ is a non-deterministic Turing machine that decides $L$ in time $p(n)$ for some polynomial $p$, and $r$ is a randomly chosen string of length $p(|x|)$.

It is clearly true that, for all $x$ and all choices of $r$,

$x \in L \Leftrightarrow (N,x,r) \in BH$

and the domination property holds with a loss in probability which is about $2^{-|N|}$.
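To see where the $2^{-|N|}$ comes from (a back-of-the-envelope calculation under the encoding assumed above, in which the output is the fixed prefix $N$ followed by uniformly random bits): for every set $B$ of instances,

$\Pr_{x,r}\left[ (N,x,r) \in B \right] = \Pr_{z \sim \mathcal{U}}\left[ z \in B \mid z \text{ starts with } N \right] \leq 2^{|N|} \cdot \Pr_{z \sim \mathcal{U}}\left[ z \in B \right]$

so the reduction’s output distribution exceeds the uniform one by a factor of at most $2^{|N|}$.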
For reasons that would take some time to elucidate, this loss in the domination factor is roughly equivalent to having a factor of $2^{|N|}$ in the running time of the reduction, and this is a rather undesirable feature of the reduction.
(Compare with the hypothetical discussion above of a proof of Cook’s theorem in which the reduction from a generic $L$ in $NP$ to 3SAT takes time exponential in the description of the non-deterministic algorithm for $L$.)
Could we have a loss in probability which is only polynomially small in $|N|$ and in $|x|$? (We call a reduction with such parameters a “fully polynomial” reduction.)
Suppose that we had a worst-case to average-case reduction within NP, from, say, 3SAT to a distributional problem $(L, \mathcal{U})$. That is, a randomized transformation $R$ that, starting from a 3SAT instance $\varphi$, produces a distribution $R(\varphi)$ that is dominated by the uniform distribution and such that $\varphi$ is satisfiable if and only if $R(\varphi) \in L$. Then we would also have a “fully polynomial” reduction proving that $(L, \mathcal{U})$ is complete on average for $(NP, \mathcal{U})$. To reduce $(L', \mathcal{U})$ to $(L, \mathcal{U})$ we would first reduce from $L'$ to 3SAT using Cook’s theorem, and then we would reduce from worst-case 3SAT to $(L, \mathcal{U})$ using the hypothetical worst-case to average-case reduction. We can check that the domination parameter would be polynomial in the length of the code of the non-deterministic algorithm defining $L'$.
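Spelled out with the notation used above (where $N'$ is the non-deterministic algorithm defining $L'$, $p'$ is its time bound, and $C$ is Cook’s function), the composed reduction would be

$x \mapsto R\left( C(N', x, p'(|x|)) \right)$

and its domination parameter is the one guaranteed by $R$, which is polynomial in the length of the 3SAT instance $C(N', x, p'(|x|))$ and hence polynomial in $|N'|$ and in $|x|$, i.e., fully polynomial.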
The observation that I consider the main point of the paper is that the converse is true. If we had a completeness result in $(NP, \mathcal{U})$ via a fully polynomial reduction, then we would also have a worst-case to average-case reduction.
Here is the proof. Suppose a distributional problem $(L, \mathcal{U})$ were complete for $(NP, \mathcal{U})$ under fully polynomial reductions. We want to show that there is a worst-case to average-case reduction from 3SAT to $(L, \mathcal{U})$. Toward this goal, given a formula $\varphi$, consider the language $L_\varphi$ recognized by the non-deterministic algorithm $N_\varphi$ that, on every input $x$, accepts if and only if $\varphi$ is satisfiable; thus $L_\varphi$ is either the set of all strings or the empty set, depending on whether $\varphi$ is satisfiable or not. Note that the running time and the code length of $N_\varphi$ are polynomial in $|\varphi|$. We assumed that $(L, \mathcal{U})$ is complete for $(NP, \mathcal{U})$ via a fully polynomial (possibly randomized) reduction $F$, so the mapping

$x \to F(N_\varphi, x)$

is computable in time polynomial in $|\varphi|$ and in $|x|$ and, for a random $x$, its output is dominated by the uniform distribution with a domination parameter polynomial in $|\varphi|$ and in $|x|$. To finish the argument, notice that for every $x$ we have

$\varphi \in 3SAT \Leftrightarrow x \in L_\varphi \Leftrightarrow F(N_\varphi, x) \in L$

and so the mapping that, given $\varphi$, outputs $F(N_\varphi, x)$ for a random $x$ is a worst-case to average-case reduction from 3SAT to $(L, \mathcal{U})$.
In past work with Andrej Bogdanov, we showed that such a worst-case to average-case reduction would imply that the polynomial hierarchy collapses. In fact, this is true even if the worst-case to average-case reduction is a “truth-table reduction.” (A polynomial time reduction that is allowed to make non-adaptive queries to the algorithm for the target problem.)
With the above argument, it follows that a fully polynomial completeness result proved with a truth-table reduction would also imply the collapse of the polynomial hierarchy.
Could there be a fully polynomial completeness result via a “Cook reduction”? (A polynomial time reduction that is allowed to make arbitrary, possibly adaptive, oracle queries to the algorithm for the target problem.) This is exactly the same question as whether there is an adaptive worst-case to average-case reduction within NP, which remains open.
Thanks!
One question just to clarify my own understanding. In the first section, about “Levin’s Universal Search”, to make FPS terminate on unsatisfiable instances, can’t A_phi for unsatisfiable phi just terminate, outputting “unsatisfiable”, if the input psi is equal to phi, and do brute force search otherwise?
This means that the P=NP consequence would follow even from the existence of a weaker type of FPS, which is allowed to not terminate on unsatisfiable formulas, but which is required to be competitive only against algorithms that terminate on every input.
Also, without loss of generality, one could make universal search algorithms terminate on all inputs by outputting “unsatisfiable” after about $2^{O(|\varphi|)}$ steps, because we know that if the formula is satisfiable then the universal search algorithm would output a satisfying assignment within $2^{O(|\varphi|)}$ steps.
nice post!
Does anyone know if something is known about “universal one-way permutations”? Seems like such a natural question, but Google has zero hits on that term. The best I can come up with is a universal OWP assuming NP=co-NP: define something like $f(z_1,\ldots,z_m) = u(z_1)\ldots u(z_m)$ where $u(z)$ parses $z$ as $a,b,c$ and outputs $a,a(b),c$ if $c$ is a “witness” that $a$ is a circuit that computes a permutation, and $u(z)=z$ otherwise (not being a permutation is in NP, the witness being a pair of distinct $x,x'$ s.t. $a(x)=a(x')$; thus being a permutation is in co-NP).
Hi Krzysztof, that’s a very good question. Note that your proposal doesn’t work because your function equals the identity on all but an exponentially small fraction of inputs.
My guess is that there is an oracle relative to which a universal one-way permutation does not exist, according to the following definition:
A “universal one-way permutation” is a polynomial time computable permutation $p$ and a reduction $R$ such that for every machine $M$ that computes a permutation and every function $A$ that is the inverse of the permutation $p$ on at least half of the inputs, we have that $R^A(M, M(x)) = x$ for at least half of the $x$’s. The reduction $R$ is required to run in time polynomial in $|x|$ but arbitrarily large in $|M|$.
Right, I missed the exponential loss, now I see why you have to argue about TMs not circuits, which brings up another question: can you construct a universal OWF in the non-uniform setting? Seems uniformity is a very crucial property here.
But using circuits is not the only problem with the construction for OWPs. You could hope that the following works: parse z,|z|=n as (a,b,c) where |a|=|c|=log(n) and output a(b) iff c is a witness that the TM a is a bijection on inputs of length n-2log(n) and running in time poly(n). The problem here is that there’s no reason such a short witness should exist, i.e. the length of the witness can in general be polynomial (not just logarithmic) in n.