# The Impagliazzo Hard-Core-Set Theorem

The Impagliazzo hard-core set theorem is one of the bits of magic of complexity theory. Say you have a function $g:\{ 0, 1 \}^n \rightarrow \{ 0,1\}$ such that every efficient algorithm makes errors at least $1\%$ of the times when computing $g$ on a random input. (We’ll think of $g$ as exhibiting a weak form of average-case complexity.) Clearly, different algorithms will fail on a different $1\%$ of the inputs, and it seems that, intuitively, there should be functions for which no particular input is harder than any particular other input, per se. It’s just that whenever you try to come up with an algorithm, some set of mistakes, dependent on the algorithmic technique, will arise.

As a good example, think of the process of generating $g$ at random, by deciding for every input $x$ to set $g(x)=1$ with probability $99\%$ and $g(x)=0$ with probability $1\%$. (Make the choices independently for different inputs.) With very high probability, every efficient algorithm fails with probability at least about $1\%$, but, if we look at every efficiently recognizable large set $H$, we see that $g$ takes the value 1 on approximately $99\%$ of the elements of $H$, and so the trivial algorithm that always outputs 1 has a pretty good success probability.

Consider, however, the set $H$ of size $\frac {2}{100} 2^n$ that you get by taking the $\approx \frac{1}{100} 2^n$ inputs $x$ such that $g(x)=0$ plus a random sample of $\frac{1}{100} 2^n$ inputs $x$ such that $g(x)=1$. Then we can see that no efficient algorithm can compute $g$ on much better than $50\%$ of the inputs of $H$. This is the highest form of average-case complexity for a boolean function: on such a set $H$ no algorithm does much better in computing $g$ than an algorithm that makes a random guess.

The Impagliazzo hard-core theorem states that it is always possible to find such a set $H$ where the average-case hardness is “concentrated.” Specifically, it states that if every efficient algorithm fails to compute $g$ on a $\geq \delta$ fraction of inputs, then there is a set $H$ of size $\geq \delta 2^n$ such that every efficient algorithm fails to compute $g$ on at least a $\frac 12 - \epsilon$ fraction of the elements of $H$. This is true for every $\epsilon,\delta$, and if “efficient” is quantified as “circuits of size $s$” in the premise, then “efficient” is quantified as “circuits of size $poly(\epsilon,\delta) \cdot s$” in the conclusion.

The example of the biased random function given above implies that, if one wants to prove the theorem for arbitrary $g$, then the set $H$ cannot be efficiently computable itself. (The example does not forbid, however, that $H$ be efficiently computable given oracle access to $g$, or that a random element of $H$ be samplable given a sampler for the distribution $(x,g(x))$ for uniform $x$.)

A number of proofs of the hard core theorem are known, and connections have been found with the process of boosting in learning theory and with the construction and the decoding of certain error-correcting codes. Here is a precise statement.

Impagliazzo Hard-Core Set Theorem
Let $g:\{0,1\}^n \rightarrow \{0,1\}$ be a boolean function, $s$ be a size parameter, $\epsilon,\delta>0$ be given. Then there is a $c(\epsilon,\delta) = poly(1/\epsilon,1/\delta)$ such that the following happens.

Suppose that for every function $f:\{0,1\}^n \rightarrow \{0,1\}$ computable by a circuit of size $\leq c\cdot s$ we have

$\displaystyle Pr_{x \in \{0,1\}^n} [ f(x) = g(x) ] \leq 1-\delta$

Then there is a set $H$ of size $\geq \delta 2^n$ such that for every function $f$ computable by a circuit of size $\leq s$ we have

$\displaystyle Pr_{x\in H} [ f(x) = g(x) ] \leq \frac 12 + \epsilon$

Using the “finitary ergodic theoretic” approach of iterative partitioning, we (Omer Reingold, Madhur Tulsiani, Salil Vadhan and I) are able to prove the following variant.

Impagliazzo Hard-Core Set Theorem, “Constructive Version”
Let $g:\{0,1\}^n \rightarrow \{0,1\}$ be a boolean function, $s$ be a size parameter, $\epsilon,\delta>0$ be given. Then there is a $c(\epsilon,\delta) = exp(poly(1/\epsilon,1/\delta))$ such that the following happens.

Suppose that for every function $f:\{0,1\}^n \rightarrow \{0,1\}$ computable by a circuit of size $\leq c\cdot s$ we have

$\displaystyle Pr_{x \in \{0,1\}^n} [ f(x) = g(x) ] \leq 1-\delta$

Then there is a set $H$ such that: (i) $H$ is recognizable by circuits of size $\leq c\cdot s$; (ii) $|H| \geq \delta 2^n$, and in fact the number of $x$ in $H$ such that $g(x)=0$ is at least $\frac 12 \delta 2^n$, and so is the number of $x$ in $H$ such that $g(x)=1$; and (iii) for every $f$ computable by a circuit of size $\leq s$,

$\displaystyle Pr_{x\in H} [ g(x) = f(x) ] \leq max \{ Pr_{x\in H}[ g(x) = 0] , Pr_{x\in H} [g(x)=1] \} + \epsilon$

The difference is that $H$ is now an efficiently recognizable set (which is good), but we are not able to derive the same strong average-case complexity of $g$ in $H$ (which, as discussed as the beginning, is impossible in general). Instead of proving that a “random guess algorithm” is near-optimal on $H$, we prove that a “fixed answer algorithm” is near-optimal on $H$. That is, instead of saying that no algorithm can do better than a random guess, we say that no algorithm can do better than either always outputting 0 or always outputting 1. Note that this conclusion is meaningless if $g$ is, say, always equal to 1 on $H$, but in our construction we have that $g$ is not exceedingly biased on $H$, and if $\epsilon < \delta/2$, say, then the conclusion is quite non-trivial.

One can also find a set $H'$ with the same type of average-case complexity as in the original Impagliazzo result by putting into $H'$ a $\frac 12 \delta 2^n$ size sample of elements $x$ of $H$ such that $g(x)=0$ and an equal size sample of elements of $H$ such that $g$ equals 1. (Alternatively, put in $H'$ all the elements of $H$ on which $g$ achieves the minority value of $g$ in $H$, then add a random sample of as many elements achieving the majority value.) Then we recover the original statement except that $c(\epsilon,\delta)$ is exponential instead of polynomial. [Update: constructing $H'$ is somewhat more complicated than we originally thought, the details are in the paper.]

Coming up next, the proof of the “constructive hard core set theorem” and my attempt at explaining what the techniques have to do with “finitary ergodic theory.”

## 13 thoughts on “The Impagliazzo Hard-Core-Set Theorem”

1. The first time I learned about hardcore sets, my first instinct found it strange that the interest was in guaranteeing that H is big. That is to say, if there were a hardcore set theorem that said that one could get a hardcore H that was at most XX in size, then that would say something like “these very small number of inputs alone already start to make the problem hard”. So, is there any lower bound (or a reason why it’s not interesting) on how big a set needs to be before it could possibly be hardcore? H can’t be too small else its answers could all be hardcoded…but is there any higher lower bound?

2. Hi Luca,

Thanks for the very interesting sequence of blogposts.

One question: Why is it that the size of the hardcore set H (both in Impagliazzo’s proof and your proof) is \delta 2^n and not twice that size? In fact, your example with the random g (which is 1 with prob 0.99), the size of H is 2\delta 2^n.

3. Jelani: if you have a proof with a large H, you can always also get a smaller H with more or less the same hardness, via random sampling, provided the size of the smaller H is slightly more than the logarithm of the number of algorithms allowed by your definition of “efficiency.” This is more or less tight, because otherwise, as you say, all the right answers could be encoded in the algorithm.

The reason why a larger H is interesting is that if the density of H is about 2\delta, then H itself accounts for almost all the errors of an algorithm that makes a delta fraction errors. So not only H is hard, but {0,1}^n – H is “easy,” so we really identify THE hard instances of the problem.

Prahladh: indeed having 2\delta is better and tighter. The proofs in Impagliazzo’s paper, and our proof, only give delta. Some later refinements of Impagliazzo’s proof, and an optimization of our proof (which, unfortunately, loses the ‘constructive’ statement given above) give the optimal 2\delta.

4. Could you either indicate the refinement that achieves 2\delta or a reference for the same.

Thanks
-ph

5. Hi Prahladh, this is one of those things that “everybody knows” but that I had never looked up.

The stronger statement appears asLemma 4 in Ryan’s paper on amplification of hardness in NP

http://www.cs.cmu.edu/~odonnell/papers/hardness.pdf

The result is attributed to the Klivans-Servedio paper on boosting and hard core sets that I link to above, but in the conference version they do not have an explicit statement to this effect.

Here is my guess of how it goes. Find a set H of size delta 2^n which is hard core. Any algorithm on H cannot make more than a 1/2 + eps fraction of errors, otherwise the complement of the algorithm would be too good. So every algorithm makes at least about 1-delta/2 fraction of errors in {0,1}^n – H. Now apply the hard-core lemma again, this time thinking of g as a function whose domain is {0,1}^n – H. Now we find H’ of density at least about delta/2 that is also hard core. We can keep going for a little while, and then take the union of these sets.

6. Thanks Luca for the explanation.

If I am right (your argument as well as the stmt in Ryan’s paper) achieves (2-\eps)\delta 2^n sized hardcore sets for every \eps and not exactly 2\delta 2^n. I just find it odd that all the straightforward proofs give \delta 2^n, and one needs to do some additional work to get close to 2\delta 2^n, which is the number one would expect in the first case.

Thanks,
-ph

7. Very interesting. What I’m most curious about is (a) whether the construction of H from f is uniform (this would seem unlikely) (b) if not, how much additional non-uniformity is required to get ckts for H from ckts for f.

8. Rahul, the reduction works in the other direction, you assume that for every H you have a function f that has some advantage in predicting g, and from that you want to construct a function that does very well in predicting g over the entire range of inputs. Typically the final function is of the form h(f1,…,fk), where f1,…,fk are functions that do well on certain sets H1,…,Hk, and h is a threshold function. (In our proof, h is arbitrary, possibly of circuit complexity 2^k)

9. Thanks, Luca. I guess what I don’t understand is that the sort of proof you describe only seems to show existence of a hard-core H; it’s unclear to me how the upper bound on ckt size of H is obtained (I’m referring throughout to your new constructive version). But maybe a future post on the proof will clarify matters…

10. Sorry to be a bother but who is the picture of on the previous page? The reason I ask is, its not of Russell.