In the last post, I stated the following generalization of the weak regularity lemma:
Theorem (Low Complexity Approximation, TTV Version) Let be a probability space,
a bounded function,
a collection of bounded functions
, and
an approximation parameter.
Then there is a function such that
-
has low complexity relative to
: there are
functions
and coefficients
such that
-
is
-indistinguishable from
by
, that is,
(Last time, I mentioned that our proof handled only boolean functions ; now we can handle arbitrary bounded functions, and with an “energy-decrease” style proof, this will appear in the next online revision of the paper.)
This seems to be a useful tool, limited only by one’s creativity in choosing the functions and then making use of the properties of
.
As already discussed,
- if one takes
to be the edges of a complete graph, and
the set of indicator variables of cuts, then the existence of
gives the weak regularity lemma of Frieze and Kannan; and
- if one takes
to be the set of circuits of size at most
, and normalizes
and
to be probability distributions, one gets that for every probability distribution
of high entropy there is a (non-uniformly) efficiently samplable and computable distribution
that is indistinguishable from
by circuits of size
.
In this post I’ll show how to use it to give short proofs of the Hardcore Lemma of Impagliazzo and the Dense Model Theorem of Green, Tao and Ziegler. Both proofs also have, at least in hindsight, a sense of “inevitability,” meaning that given the Low-Complexity Approximation Theorem, and given what we want to prove, both proofs get to the point in a most economical and natural way.
- The Impagliazzo Hardcore Lemma. We have already mentioned that if
is “hard-on-average” for
, then
cannot be an approximation in the sense of being close to
on most inputs. What, then, about the points on which
and
differ? They form an Impagliazzo Hardcore Set for
, as described next.
Let
be a function that is weakly hard on average for a class of algorithms
. Suppose, specifically, that for every algorithm
of complexity
relative to
we have
and, more generally, for fractional
, we have
Then, construct an approximating function
of complexity
relative to
and such that
and
are
-indistinguishable by
. Note that, even though
is “indistinguishable” from
, it is also “far” from
, as in (1).
Define a probability distribution
that assigns to point
a probability proportional to
. (If
were boolean, this would be the uniform distribution over points on which
and
differ.) Then this distribution is
-dense in the uniform distribution, meaning that every point has probability at most
. Observe also that we have
for every
, because
and
have the same sign and
, so we have
and so
is a hardcore distribution, because the above expression is equivalent to
- The Dense Model Theorem. Suppose that
is a pseudorandom set with respect to functions that have bounded complexity relative to
, and let
be a dense subset of
,
.
To find a dense model of
, we take
to be the characteristic function of
, and we let
be the low-complexity approximation, but using the uniform distribution on
as
.. Now suppose for simplicity that
is boolean, and that
is the set of inputs of
on which
is 1. We want to argue that
is a dense model of
. By assuming without loss of generality that
contains the all-one function, we get from the indistinguishability of
and
that
and from the pseudorandomness of
we have
and so
and
is indeed dense.
For the indistinguishability of
and
, take any function
, and observe that
where we use both the indistinguishability of
and
under distribution
, and the fact that the distributions
and
are indistinguishable by functions of bounded complexity.
This proof is appealingly intuitive, in the sense that if we expect
to be indistinguishable from a large set, then when we try to approximate the characteristic function of
we will end up with a low complexity function that is spread around much of
, thus defining a dense model. It also shows that “relative” versions of the Regularity Lemma, such as the Regularity Lemma for subgraphs of an expander, may be derived from the regular Lemma by the above argument. A disadvantage of the argument is that it does not establish the stronger form of the Dense Model Theorem suggested by Impagliazzo, in which there is no set
, but we require
to have the “pseudo-density” requirement that for every low-complexity bounded function
,
which follows immediately if
has density
in a pseudorandom set
, but that is a seemingly weaker property. (The relative Regularity Lemma in graphs had long be known to hold under such a pseudo-density assumption.)

3 comments
Comments feed for this article
December 10, 2008 at 3:59 pm
Mohammad
Luca, For reproof of “The Impagliazzo Hardcore Lemma”, it seems that the distribution that you get is actually 2\delta dense which is the stronger form of the Lemma by Holenstein.
But, the issue is that you have a bit stronger assumption. If boolean functions fail on \delta fraction, it does not mean that the fractional ones fail (will be far) by at least 2 \delta. Or maybe I am missing something…
Russel showed that the dense model theorem reduces to strong HC lemma. Now it seems the bounded decomposition is the right way to look at the theorems which proves both of the other ones easily (and it is not clear if it can be easily reduced to HC lemma)
December 10, 2008 at 6:13 pm
luca
Hi Mohammad I think you are right, now I am not sure why we thought we had density delta.
As for your other other question, if
E_x | g(x) – h(x) | = \delta
then there is a threshold t such that defining
h_t(x) = 1 iff h(x) \geq t
we have
Pr_x [ g(x) \neq h_t(x) ] \leq \delta/2
because, for every point x, we have
Pr_t [ g(x) \neq h_t(x) ] = (1/2) * |g(x)-h(x)|
(the threshold is picked uniformly between -1 and 1,
and there is only an interval of length |g(x)-h(x)|
which leads to g being different from h_t; note that
here we must assume that g is boolean)
And a threshold applied to h is just a threshold applied
to a sum of f_i, so it still has the same low complexity.
Indeed, in retrospect, I think that if we combine the “boosting”
proof of the low-complexity approximation with this proof,
we get exactly Holenstein’s proof of the hard-core lemma.
July 22, 2011 at 3:43 pm
s
but isn’t |g-h| in the setting that g is {-1, 1} boolean not a valid probability measure, since |g-h| = 2 if g =1, h = -1. so the measure you need is actually |g-h|/2 implying that the measure is only \delta dense?