# The Impagliazzo Hard-Core Lemma for the Mathematician

At the conference I was at last week, I was asked by a mathematician about the exact statement of the Impagliazzo Hard-Core Lemma, and whether I could state it without saying “algorithm.”

So I gave the following statement:

Theorem 1 (Impagliazzo Hard-Core Lemma — Basic Version) Let ${\cal F}$ be a class of boolean functions ${f: \{ 0,1 \}^n \rightarrow \{ -1,1\}}$, let ${g: \{ 0,1 \}^n \rightarrow \{ -1,1\}}$ be an arbitrary function, and let ${\epsilon,\delta>0}$ be arbitrary parameters.

Then at least one of the following conditions is true:

1. ${g}$ is globally well approximated by a simple composition of a few functions from ${\cal F}$.

That is, there are functions ${f_1,\ldots,f_k \in \cal F}$, ${k = \epsilon^{-O(1)} \cdot \delta^{-O(1)}}$ and a “simple” function ${h: \{ 0,1 \}^k \rightarrow \{ 0,1 \}}$ such that

$\displaystyle \mathop{\mathbb P}_{x \in \{ 0,1 \}^n} [ g(x) = h(f_1(x),\ldots,f_k(x)) ] \geq 1-\delta$

2. ${g}$ is locally almost orthogonal to ${\cal F}$.

That is there is a subset ${H\subseteq \{ 0,1 \}^n}$ of density at least ${\delta}$ such that for every ${f\in \cal F}$,

$\displaystyle \left| \mathop{\mathbb E}_{x\in H} f(x)g(x) \right| := \left| \frac 1{|H|} \sum_{x\in H} f(x) g(x) \right| \leq \epsilon$

The theorem has been strengthened in a couple of ways in work of Klivans and Servedio and Holenstein. The “complexity” parameter ${k}$ needs to be only ${O(\epsilon^{-2} \log \delta^{-1})}$ which is conjectured (or known?) to be tight, and the density of the set ${H}$ can be made ${2\delta}$, which is tight. In all proofs ${h}$ is really simple, being more or less a threshold function applied to the sum of the ${f_i}$.

Stating the result in this way (which is, of course, standard) raises a few questions that I find rather interesting.

Lower Bounds on ${k}$?

I don’t know if a lower bound on ${k}$ is known, but I think it’s helpful to see why the statement would be false if we tried to approximate ${g}$ with one function from ${\cal F}$ in the first case.

Take ${g}$ to be a random function, pick an arbitrary partition of ${\{ 0,1 \}^n}$ into three subsets ${A_1,A_2,A_3}$, and define ${{\cal F} = \{ f_1,f_2,f_3 \}}$ such that ${f_i = g}$ on ${\{ 0,1 \}^n - A_i}$ and ${f_i= - g}$ on ${A_i}$. Pick ${\delta = .3}$. Now we see that there is no ${f_i}$ that agrees with ${g}$ on a ${\geq 1-\delta}$ fraction of inputs. On the other hand, no matter how we pick a subset ${H}$ of density ${.3}$ of ${\{ 0,1 \}^n}$, there is at least a function ${f_i}$ that agrees with ${g}$ on ${H}$ on at least a ${2/3}$ fraction of inputs. (Note that the sets ${H\cap A_i}$ are a partition of ${H}$, and at least one of the three must have size less than ${|H|/3}$.)

If I am not mistaken, this argument yields a lower bound ${\Omega(\log 1/\delta)}$ for constant ${\epsilon}$. How does one argue an ${\Omega(\epsilon^{-2})}$ lower bound?

From Boolean Functions to Bounded Functions

When one abstracts complexity theory results as we have just done, as results about classes of boolean functions, usually the same proofs also give more general results about bounded functions, which correspond to probabilistic algorithms.

A scaling argument suggests that the analog of the Lemma for bounded functions would be that for every set of ${\cal F}$ of functions of ${f: \{ 0,1 \}^n \rightarrow [-1,1]}$, every function ${g: \{ 0,1 \}^n \rightarrow [-1,1]}$, and every parameters ${\epsilon, \delta >0}$, at least one of the following holds:

1. there are functions ${f_1,\ldots,f_k \in \cal F}$ and a combining function ${h}$ such that

$\displaystyle || g - h(f_1(\cdot),\ldots,f_k(\cdot)) || _2^2 = \mathop{\mathbb E}_{x \in \{ 0,1 \}^n} | g(x) - h(f_1(x),\ldots,f_k(x)) |^2 \leq \delta$

2. there is a subset ${H\subseteq \{ 0,1 \}^n}$ of density at least ${\Omega( \delta )}$ such that for every ${f\in \cal F}$,

$\displaystyle \left| \mathop{\mathbb E}_{x\in H} f(x)g(x) \right| := \left| \frac 1{|H|} \sum_{x\in H} f(x) g(x) \right| \leq \epsilon$

But this turns out to be false. Take ${g}$ to be a random function ${g: \{ 0,1 \}^n \rightarrow \{ 1/2 , 1 \}}$ and take ${\cal F}$ to be any class of functions and ${\epsilon,\delta}$ to be parameters such that (1) is not true even for large ${k}$ (for example, take ${\cal F}$ to be class of functions computable by ${n^3}$ size circuits and ${\epsilon,\delta}$ to be inverse polynomial in ${n}$); now note that taking ${g}$ has inner product at least ${1/2}$ with the constant-one function on every subset ${H}$.

Is there a nice way to generalize the Hard-Core Lemma to bounded functions? (Partial answer in the next section.)

Explicit ${H}$

When formulating complexity-theoretic results in this abstract form, I often find it useful to see what happens if I instantiate the class of functions that are meant to represent efficient algorithms with the class of Fourier characters of ${{\mathbb F}_2^n}$. Typically, one obtains a basic fact from Fourier analysis, but with an unusual proof that makes use of no properties of the characters other than they are a family of functions, which can be nice to see. Sometimes, one obtains a statement that is not trivial, and presenting the proof in this special case can be a nice way to introduce a complexity-theoretic result to mathematicians.

Here, if you try this experiment, you get something that is not standard, but also not very meaningful: that a boolean functions is either well approximating by a function of a few characters (that is, a function that is constant on the cosets of a rather large subspace), or else there is a large subset on which the function has very low correlation with all characters. The first case is an interesting one, but the second one sounds strange, because we don’t know anything about the set.

It would be nicer if we could say: either ${g}$ is well approximated by a function of few characters, or there is a subspace ${H}$ such that ${g}$ is nearly orthogonal to all characters when restricted to ${H}$. This would correspond to a version of the Hard-Core Lemma in which the characteristic function of the set ${H}$ is itself a simple combinations of functions from ${\cal F}$. Unfortunately, this stronger statement is false in general, and it is even false in the special case of the Fourier analysis application: take ${g}$ to be 1 with probability ${3/4}$ and ${-1}$ with probability ${1/4}$, and you see that it is not approximable, but also that the constant-one character has large correlation in all large subspaces.

Maybe we could say that either ${g}$ is globally approximable, or else there is a large subspace ${H}$ such that ${g}$ restricted to ${H}$ has low correlation with all non-constant characters.

In fact, I think that the following statement is true via the same density-increment argument we have in this paper with Reingold, Tulsiani and Vadhan, where we prove a slightly different version:

Theorem 2 (Impagliazzo Hard-Core Lemma — Constructive ${H}$) Let ${\cal F}$ be a class of boolean functions ${f: \{ 0,1 \}^n \rightarrow \{ -1,1\}}$, let ${g: \{ 0,1 \}^n \rightarrow \{ -1,1\}}$ be an arbitrary function, and let ${\epsilon,\delta>0}$ be arbitrary parameters.

Then at least one of the following conditions is true:

1. ${g}$ is globally well approximated by a simple composition of a few functions from ${\cal F}$.

That is, there are functions ${f_1,\ldots,f_k \in \cal F}$, ${k = \epsilon^{-O(1)} \cdot \delta^{-O(1)}}$ and a function ${h: \{ 0,1 \}^k \rightarrow \{ 0,1 \}}$ such that

$\displaystyle \mathop{\mathbb P}_{x \in \{ 0,1 \}^n} [ g(x) = h(f_1(x),\ldots,f_k(x)) ] \geq 1-\delta$

2. ${g}$ is locally almost orthogonal to ${\cal F}$ after a shift.

That is there is a subset ${H\subseteq \{ 0,1 \}^n}$ of density at least ${\Omega(\delta)}$, such that ${1_H(x) = h(f_1(x),\ldots,f_k(x))}$ for ${k = \epsilon^{-O(1)} \cdot \delta^{-O(1)}}$, ${f_1,\ldots,f_k \in \cal F}$, and for function ${h: \{ 0,1 \}^k \rightarrow \{ 0,1 \}}$, such that for every ${f\in \cal F}$,

$\displaystyle \left| \mathop{\mathbb E}_{x\in H} f(x)\cdot \left( g(x) - \mathop{\mathbb E}_{x\in H} g(x) \right) \right| \leq \epsilon$

The only difference between the above statement and the one in the RTTV paper, Lemma 3.1 is that there Case (2) has a weaker form equivalent to

$\displaystyle \left| \mathop{\mathbb E}_{x\in H} f(x)\cdot g(x) \right| \leq \left| \mathop{\mathbb E}_{x\in H} g(x) \right| + \epsilon$

but, unless I made a mistake in the calculations I just did, the same proof works for the Case (2) in the theorem stated above.

In fact I think that this statement would also work for general bounded functions, reformulating Case (1) to require ${\ell^2_2}$ distance at most ${\delta}$ instead of agreement at least ${1-\delta}$.

While ${k}$ is polynomial in ${\epsilon^{-1}}$ and ${\delta^{-1}}$, the function ${h}$ might have exponential complexity in ${k}$ in the only proof I know of of Theorem 2, so the results discussed in this section have limited applicability in complexity theory. Is it possible to prove a constructive hard-core lemma in which the combining function has polynomial circuit complexity in ${\epsilon^{-1}}$ and ${\delta^{-1}}$?

Holenstein proves a constructive hard-core lemma in which ${H}$ is samplable if the distribution ${(x,g(x))}$ is samplable, and in a paper with Tulsiani and Vadhan we show that the characteristic function of ${H}$ can be made efficiently computable given oracle access to ${g}$, but the question I would like to raise is whether, as in the above statement, one can have ${H}$ fully polynomially computable for an arbitrary ${g}$ to which we have no access of any kind, but with the “shifted inner product” condition in Case (2).

## 6 thoughts on “The Impagliazzo Hard-Core Lemma for the Mathematician”

1. How about the question of finding an explicit set f_1,..,f_k (and h), given that the second condition does not hold (i.e., for every dense enough set H, there is some f \in F, which has \epsilon advantage in predicting better than random)?

Can’t Boosting be thought of as a constructive method for doing this (given an oracle for picking an \epsilon-good f for any H)?

2. Not really, because boosting gives you a way to sample H if you can sample from (x,g(x)), which is the (useful!) setup in which Holenstein’s proof is constructive.

I would like the notion of constructivity in which the characteristic function (or the pointwise probability of elements of H, in the non-flat case) is efficiently computable with no access to g whatsoever. This is provably impossible for the standard definition of hard-core set, but it’s plausible for weaker definitions, and true for weaker definitions such as the “shifted low correlation” in this post or “no advantage over constant functions” in the RTTV paper, but with exponential complexity in $\epsilon$ and $\delta$.

3. Hi Luca,

Freund’s boosting-by-majority paper has a lower bound of 1/epsilon^2 log(1/delta) on the number of boosting stages. It probably can be adapted to the hard-core lemma setting.

Also you seem to suggest that Holenstein’s hard-core set lemma is more constructive than say Impagliazzo’s original proof (I might have also seen this claim in Barak-Hardt-Kale paper). I wonder in which sense it is more constructive.
As far as I understand in either of those proofs H is efficiently samplable given random samples from (x,g(x)). This is also true about any hard-core set construction obtained from an efficient boosting algorithm (such as those given by Klivans and Servedio). Just to avoid confusion I’m referring to Impagliazzo’s construction based on incremental updates and not Nisan’s Min-max based proof of existence (also found in Impagliazzo’s paper).

4. Hi Vitaly, that’s a good point, probably Impagliazzo’s iterative proof too can be made constructive in the sense that H is samplable if (x,g(x)) is samplable. Holenstein was the first to make this claim explicit