At the conference I was at last week, I was asked by a mathematician about the exact statement of the Impagliazzo Hard-Core Lemma, and whether I could state it without saying “algorithm.”
So I gave the following statement:
Theorem 1 (Impagliazzo Hard-Core Lemma — Basic Version) Let
be a class of boolean functions
, let
be an arbitrary function, and let
be arbitrary parameters.
Then at least one of the following conditions is true:
is globally well approximated by a simple composition of a few functions from
.
That is, there are functions
,
and a “simple” function
such that
is locally almost orthogonal to
.
That is there is a subset
of density at least
such that for every
,
The theorem has been strengthened in a couple of ways in work of Klivans and Servedio and Holenstein. The “complexity” parameter needs to be only
which is conjectured (or known?) to be tight, and the density of the set
can be made
, which is tight. In all proofs
is really simple, being more or less a threshold function applied to the sum of the
.
Stating the result in this way (which is, of course, standard) raises a few questions that I find rather interesting.
Lower Bounds on ?
I don’t know if a lower bound on is known, but I think it’s helpful to see why the statement would be false if we tried to approximate
with one function from
in the first case.
Take to be a random function, pick an arbitrary partition of
into three subsets
, and define
such that
on
and
on
. Pick
. Now we see that there is no
that agrees with
on a
fraction of inputs. On the other hand, no matter how we pick a subset
of density
of
, there is at least a function
that agrees with
on
on at least a
fraction of inputs. (Note that the sets
are a partition of
, and at least one of the three must have size less than
.)
If I am not mistaken, this argument yields a lower bound for constant
. How does one argue an
lower bound?
From Boolean Functions to Bounded Functions
When one abstracts complexity theory results as we have just done, as results about classes of boolean functions, usually the same proofs also give more general results about bounded functions, which correspond to probabilistic algorithms.
A scaling argument suggests that the analog of the Lemma for bounded functions would be that for every set of of functions of
, every function
, and every parameters
, at least one of the following holds:
- there are functions
and a combining function
such that
- there is a subset
of density at least
such that for every
,
But this turns out to be false. Take to be a random function
and take
to be any class of functions and
to be parameters such that (1) is not true even for large
(for example, take
to be class of functions computable by
size circuits and
to be inverse polynomial in
); now note that taking
has inner product at least
with the constant-one function on every subset
.
Is there a nice way to generalize the Hard-Core Lemma to bounded functions? (Partial answer in the next section.)
Explicit
When formulating complexity-theoretic results in this abstract form, I often find it useful to see what happens if I instantiate the class of functions that are meant to represent efficient algorithms with the class of Fourier characters of . Typically, one obtains a basic fact from Fourier analysis, but with an unusual proof that makes use of no properties of the characters other than they are a family of functions, which can be nice to see. Sometimes, one obtains a statement that is not trivial, and presenting the proof in this special case can be a nice way to introduce a complexity-theoretic result to mathematicians.
Here, if you try this experiment, you get something that is not standard, but also not very meaningful: that a boolean functions is either well approximating by a function of a few characters (that is, a function that is constant on the cosets of a rather large subspace), or else there is a large subset on which the function has very low correlation with all characters. The first case is an interesting one, but the second one sounds strange, because we don’t know anything about the set.
It would be nicer if we could say: either is well approximated by a function of few characters, or there is a subspace
such that
is nearly orthogonal to all characters when restricted to
. This would correspond to a version of the Hard-Core Lemma in which the characteristic function of the set
is itself a simple combinations of functions from
. Unfortunately, this stronger statement is false in general, and it is even false in the special case of the Fourier analysis application: take
to be 1 with probability
and
with probability
, and you see that it is not approximable, but also that the constant-one character has large correlation in all large subspaces.
Maybe we could say that either is globally approximable, or else there is a large subspace
such that
restricted to
has low correlation with all non-constant characters.
In fact, I think that the following statement is true via the same density-increment argument we have in this paper with Reingold, Tulsiani and Vadhan, where we prove a slightly different version:
Theorem 2 (Impagliazzo Hard-Core Lemma — Constructive
) Let
be a class of boolean functions
, let
be an arbitrary function, and let
be arbitrary parameters.
Then at least one of the following conditions is true:
is globally well approximated by a simple composition of a few functions from
.
That is, there are functions
,
and a function
such that
is locally almost orthogonal to
after a shift.
That is there is a subset
of density at least
, such that
for
,
, and for function
, such that for every
,
The only difference between the above statement and the one in the RTTV paper, Lemma 3.1 is that there Case (2) has a weaker form equivalent to
but, unless I made a mistake in the calculations I just did, the same proof works for the Case (2) in the theorem stated above.
In fact I think that this statement would also work for general bounded functions, reformulating Case (1) to require distance at most
instead of agreement at least
.
While is polynomial in
and
, the function
might have exponential complexity in
in the only proof I know of of Theorem 2, so the results discussed in this section have limited applicability in complexity theory. Is it possible to prove a constructive hard-core lemma in which the combining function has polynomial circuit complexity in
and
?
Holenstein proves a constructive hard-core lemma in which is samplable if the distribution
is samplable, and in a paper with Tulsiani and Vadhan we show that the characteristic function of
can be made efficiently computable given oracle access to
, but the question I would like to raise is whether, as in the above statement, one can have
fully polynomially computable for an arbitrary
to which we have no access of any kind, but with the “shifted inner product” condition in Case (2).

6 comments
Comments feed for this article
March 12, 2010 at 5:39 pm
Hoeteck
I believe Lu, Tsai and Wu gave a lower bound on k in
“On the Complexity of Hard-Core Set Constructions”, ICALP ’07
http://www.springerlink.com/content/e884424t02lw0572/
March 13, 2010 at 9:59 am
anon
How about the question of finding an explicit set f_1,..,f_k (and h), given that the second condition does not hold (i.e., for every dense enough set H, there is some f \in F, which has \epsilon advantage in predicting better than random)?
Can’t Boosting be thought of as a constructive method for doing this (given an oracle for picking an \epsilon-good f for any H)?
March 13, 2010 at 10:04 am
anon above
(Except that Boosting works with distributions that are not flat over H…)
March 13, 2010 at 11:55 am
luca
Not really, because boosting gives you a way to sample H if you can sample from (x,g(x)), which is the (useful!) setup in which Holenstein’s proof is constructive.
I would like the notion of constructivity in which the characteristic function (or the pointwise probability of elements of H, in the non-flat case) is efficiently computable with no access to g whatsoever. This is provably impossible for the standard definition of hard-core set, but it’s plausible for weaker definitions, and true for weaker definitions such as the “shifted low correlation” in this post or “no advantage over constant functions” in the RTTV paper, but with exponential complexity in
and
.
March 13, 2010 at 9:34 pm
Vitaly
Hi Luca,
Freund’s boosting-by-majority paper has a lower bound of 1/epsilon^2 log(1/delta) on the number of boosting stages. It probably can be adapted to the hard-core lemma setting.
Also you seem to suggest that Holenstein’s hard-core set lemma is more constructive than say Impagliazzo’s original proof (I might have also seen this claim in Barak-Hardt-Kale paper). I wonder in which sense it is more constructive.
As far as I understand in either of those proofs H is efficiently samplable given random samples from (x,g(x)). This is also true about any hard-core set construction obtained from an efficient boosting algorithm (such as those given by Klivans and Servedio). Just to avoid confusion I’m referring to Impagliazzo’s construction based on incremental updates and not Nisan’s Min-max based proof of existence (also found in Impagliazzo’s paper).
March 14, 2010 at 12:07 pm
luca
Hi Vitaly, that’s a good point, probably Impagliazzo’s iterative proof too can be made constructive in the sense that H is samplable if (x,g(x)) is samplable. Holenstein was the first to make this claim explicit