In a previous post, I described abstract forms of the weak regularity lemma, in which we start from an arbitrary bounded function $g: X \rightarrow [-1,1]$ and an arbitrary set $F$ of “structured” bounded functions $f: X \rightarrow [-1,1]$, and we construct an “approximating” function $h$ that has “low complexity” relative to $F$ and is “indistinguishable” from $g$ by functions from $F$. We had two proofs, both generalizations of arguments due to Frieze and Kannan: in one proof, $h$ is not bounded, but its complexity is $O(\epsilon^{-2})$, where $\epsilon$ is the approximation parameter; in the second proof, $h$ is bounded, but it has exponential complexity $2^{\mathrm{poly}(1/\epsilon)}$ in the approximation parameter.
In a new paper with Madhur Tulsiani and Salil Vadhan, we give a new proof that produces an approximating function which is, at the same time, bounded and of low complexity. This has a number of applications, which I will describe in a future post.
(Note that, in the statement below, $F$ is required to contain boolean functions, rather than bounded ones. It is possible, however, to “reduce” the bounded case to the boolean case via the following observation: if $F$ is a family of bounded functions $f: X \rightarrow [-1,1]$, $F'$ is the family of boolean threshold functions $\mathbf{1}[ f(x) \geq t ]$ for $f \in F$ and $t \in [-1,1]$, and $g$ and $h$ are $\epsilon$-indistinguishable according to $F'$, then they are also $O(\epsilon)$-indistinguishable according to $F$.)
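Here is a minimal sketch of one way such a reduction can go (my own bookkeeping, assuming $F'$ contains the 0/1 threshold functions $\mathbf{1}[f \geq t]$ together with the constant function $1$; the paper may package the family slightly differently). Any bounded $f: X \rightarrow [-1,1]$ satisfies

$$ f(x) = -1 + \int_{-1}^{1} \mathbf{1}[ f(x) \geq t ] \, dt , $$

so, by linearity of expectation,

$$ \left| \mathop{\mathbb E} [ f \cdot (g-h) ] \right| \leq \left| \mathop{\mathbb E} [ g-h ] \right| + \int_{-1}^{1} \left| \mathop{\mathbb E} \left[ \mathbf{1}[f \geq t] \cdot (g-h) \right] \right| \, dt \leq \epsilon + 2\epsilon = O(\epsilon) . $$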
Theorem (Low Complexity Approximation, TTV Version) Let $(X,\mu)$ be a probability space, $g: X \rightarrow [-1,1]$ a bounded function, $F$ a collection of boolean functions $f: X \rightarrow \{-1,1\}$, and $\epsilon > 0$ an approximation parameter.
Then there is a function $h: X \rightarrow [-1,1]$ such that

- $h$ has low complexity relative to $F$: there are $k = O(\epsilon^{-2})$ functions $f_1,\ldots,f_k \in F$ and coefficients $c_1,\ldots,c_k$ such that

$$ h(x) = \min \left\{ 1, \max \left\{ -1 , \sum_{i=1}^k c_i f_i(x) \right\} \right\} $$

- $h$ is $\epsilon$-indistinguishable from $g$ by $F$: for every $f \in F$,

$$ \left| \mathop{\mathbb E}_{x \sim \mu} [ f(x) g(x) ] - \mathop{\mathbb E}_{x \sim \mu} [ f(x) h(x) ] \right| \leq \epsilon $$

That is, $h$ is simply a linear combination of functions from $F$, whose value is truncated to be between $-1$ and $1$.
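To fix intuition, here is a small Python sketch of the two objects in the theorem (my own illustration, not from the paper; the helper names `truncate`, `h_values` and `advantage` are hypothetical), with the finite domain $X$ represented by arrays of function values and a weight vector for $\mu$:

```python
import numpy as np

def truncate(v):
    """Clip values to the interval [-1, 1]."""
    return np.clip(v, -1.0, 1.0)

def h_values(coeffs, f_value_arrays):
    """h = truncation of sum_i c_i * f_i, with each f_i given as an array of +/-1 values."""
    return truncate(sum(c * fv for c, fv in zip(coeffs, f_value_arrays)))

def advantage(f_vals, g_vals, h_vals, mu):
    """|E_mu[f*g] - E_mu[f*h]|: how well the boolean distinguisher f tells g and h apart."""
    return abs(float(np.dot(mu, f_vals * (g_vals - h_vals))))

# toy usage on a 4-point domain with the uniform distribution
mu = np.full(4, 0.25)
g = np.array([0.3, -0.7, 1.0, -0.2])
f1 = np.array([1.0, -1.0, 1.0, -1.0])
h = h_values([0.5], [f1])
print(advantage(f1, g, h, mu))
```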
Recall that, when we do not require $h$ to be bounded, $h$ can be constructed via the following algorithm:

Algorithm FK
- $h := 0$;
- while there is $f \in F$ such that $\mathop{\mathbb E}_x [ f(x) \cdot (g(x) - h(x)) ] \geq \epsilon$
  - $h := h + \gamma f$
- return $h$
And the analysis shows that, if we call $h_t$ the function $h$ after $t$ steps and $E_t := \mathop{\mathbb E} [ (g - h_t)^2 ]$ its “energy,” then $E_{t+1} \leq E_t - 2\gamma\epsilon + \gamma^2$. Setting $\gamma := \epsilon$ shows that the algorithm stops within $\epsilon^{-2}$ steps, because the energy starts at $\mathop{\mathbb E}[g^2] \leq 1$, decreases by at least $\epsilon^2$ at each step, and cannot become negative.
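Here is a short Python sketch of Algorithm FK over a finite domain (again my own illustration; the representation of $F$ as a list of $\pm 1$-valued arrays and the parameter names are my choices, not the paper's):

```python
import numpy as np

def algorithm_fk(g, F, mu, eps, gamma=None, max_steps=100000):
    """Frieze-Kannan style approximation: returns an (unbounded) linear combination h
    of functions from F such that no f in F has correlation >= eps with g - h.
    g: array of values in [-1,1]; F: list of arrays with values in {-1,+1};
    mu: probability weights over the finite domain."""
    gamma = eps if gamma is None else gamma   # the analysis sets gamma = eps
    h = np.zeros_like(g, dtype=float)
    for _ in range(max_steps):
        # look for a distinguisher: f with E_mu[f * (g - h)] >= eps
        correlations = [float(np.dot(mu, f * (g - h))) for f in F]
        i = int(np.argmax(correlations))
        if correlations[i] < eps:
            return h   # no function in F distinguishes g from h
        # the energy E[(g-h)^2] drops by at least 2*gamma*eps - gamma^2 = eps^2 per step,
        # so this loop runs at most 1/eps^2 times
        h = h + gamma * F[i]
    return h
```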
Our proof proceeds by doing exactly the same, but making sure at every step that $h_t$ is bounded.

Algorithm TTV
- $h_1 := 0$; $t := 1$;
- while there is $f_t \in F$ such that $\mathop{\mathbb E}_x [ f_t(x) \cdot (g(x) - h_t(x)) ] \geq \epsilon$
  - $h_{t+1} := \min \{ 1, \max \{ -1 , \gamma \cdot ( f_1 + \cdots + f_t ) \} \}$; $t := t+1$
- return $h_t$
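And a matching Python sketch of Algorithm TTV, under the same hypothetical array representation as in the sketch of Algorithm FK above; the only change is that the function shown to the distinguishers is the truncation of the running linear combination:

```python
import numpy as np

def algorithm_ttv(g, F, mu, eps, gamma=None, max_steps=100000):
    """TTV approximation: returns h, the truncation to [-1,1] of gamma*(f_1 + ... + f_k)
    for functions f_i in F, such that no f in F has correlation >= eps with g - h.
    g: array of values in [-1,1]; F: list of arrays with values in {-1,+1};
    mu: probability weights over the finite domain."""
    gamma = eps / 2 if gamma is None else gamma   # the analysis takes gamma proportional to eps
    running_sum = np.zeros_like(g, dtype=float)   # untruncated combination gamma * (f_1 + ... + f_t)
    h = np.clip(running_sum, -1.0, 1.0)           # h_t is the truncation of the running combination
    for _ in range(max_steps):
        # look for a distinguisher f_t with E_mu[f_t * (g - h_t)] >= eps
        correlations = [float(np.dot(mu, f * (g - h))) for f in F]
        i = int(np.argmax(correlations))
        if correlations[i] < eps:
            return h   # bounded, and eps-indistinguishable from g by every f in F
        running_sum = running_sum + gamma * F[i]
        h = np.clip(running_sum, -1.0, 1.0)
    return h
```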
The problem with the old analysis is that now we could conceivably have a step in which $h_{t+1}(x) = h_t(x)$ for all $x$, and so $h_{t+1} = h_t$, and thus $\mathop{\mathbb E}[(g-h_{t+1})^2] = \mathop{\mathbb E}[(g-h_t)^2]$, and we have no energy decrease. To get to such a state, however, there must have been about $1/\gamma$ steps in which $h_t$ changed in the same way as in the Frieze-Kannan case. I think there should be a sort of “amortized analysis” that would “charge” the lack of change of
$h_t$ at certain steps to other steps in which it indeed changes, and establish that, for every time step $t$,

$$ \mathop{\mathbb E} [ (g - h_t)^2 ] \leq \mathop{\mathbb E} [ g^2 ] - \Omega( t \cdot \epsilon \gamma ) $$
Unfortunately I don’t know how to construct such a proof. Our proof, instead, follows step by step Impagliazzo’s proof of the Impagliazzo Hardcore Set Lemma, which employs a very similar algorithm (in a quite different context). As shown by Klivans and Servedio, the algorithm in Impagliazzo’s proof can be seen as a boosting algorithm in learning theory (and every other known boosting algorithm can be used to prove Impagliazzo’s Hardcore Lemma); this is why we think of our proof as a “boosting” proof.
I must confess, I have never really understood Impagliazzo’s proof, and so I can’t say I really understand ours, other than being able to reproduce the steps.
The idea is to consider the quantity $f_t(x) \cdot (g(x) - h_t(x))$, which depends on $t$ and $x$, and to see how it behaves summed over $t$, for a fixed $x$, and how it behaves for a fixed $t$, summed over $x$.
Suppose the algorithm is still running after $T$ steps. Then, because of the way we define the termination condition, we have, for every $t \leq T$,

$$ \mathop{\mathbb E}_x [ f_t(x) \cdot (g(x) - h_t(x)) ] \geq \epsilon \ \ \ \ \ (1) $$

and the crux of the proof is to show that for every $x$ we have

$$ \sum_{t=1}^T f_t(x) \cdot (g(x) - h_t(x)) \leq O \left( \frac 1 \gamma + \gamma T \right) \ \ \ \ \ (2) $$
So if we sum (1) over $t$ and average (2) over $x$, we get

$$ \epsilon T \leq \sum_{t=1}^T \mathop{\mathbb E}_x [ f_t(x) \cdot (g(x) - h_t(x)) ] = \mathop{\mathbb E}_x \left[ \sum_{t=1}^T f_t(x) \cdot (g(x) - h_t(x)) \right] \leq O \left( \frac 1 \gamma + \gamma T \right) $$

and setting $\gamma$ proportional to $\epsilon$ gives $T = O(\epsilon^{-2})$.
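To spell out the arithmetic in this last step, writing $c$ for whatever constant inequality (2) actually hides: if $\epsilon T \leq c ( 1/\gamma + \gamma T )$ and we choose $\gamma := \epsilon / 2c$, then

$$ \epsilon T \leq \frac{2c^2}{\epsilon} + \frac{\epsilon T}{2} , \qquad \mbox{and hence} \qquad T \leq \frac{4 c^2}{\epsilon^2} = O(\epsilon^{-2}) . $$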
Inequality (2), the main step of the proof, is the part for which I have little intuition. One breaks the summation into groups of time steps, depending on the value of $h_t(x)$; there are $O(1/\gamma)$ groups (because the value of $h_t(x)$ changes by discrete increments of $\gamma$, and is between $-1$ and $1$) and each one is shown to contribute $O(1 + \gamma k)$, where $k$ is the number of time steps in the group.
It is perhaps instructive to translate the Frieze-Kannan proof to this set-up. In the Frieze-Kannan algorithm, where $h_{t+1} = h_t + \gamma f_t$ with no truncation, we have

$$ f_t(x) \cdot (g(x) - h_t(x)) = \frac{ (g(x) - h_t(x))^2 - (g(x) - h_{t+1}(x))^2 }{2\gamma} + \frac \gamma 2 $$

and so

$$ \sum_{t=1}^T f_t(x) \cdot (g(x) - h_t(x)) \leq \frac{1}{2\gamma} + \frac{\gamma T}{2} $$

which is analogous to our inequality (2).
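For completeness, here is one way to check the two displayed facts (this is just my own verification of the algebra, using that $h_{t+1} = h_t + \gamma f_t$ and $f_t(x)^2 = 1$):

$$ (g - h_t)^2 - (g - h_{t+1})^2 = (g - h_t)^2 - (g - h_t - \gamma f_t)^2 = 2 \gamma f_t (g - h_t) - \gamma^2 , $$

so dividing by $2\gamma$ and rearranging gives the first identity; summing it over $t = 1,\ldots,T$, the first term telescopes to $\frac{(g - h_1)^2 - (g - h_{T+1})^2}{2\gamma} \leq \frac{g^2}{2\gamma} \leq \frac{1}{2\gamma}$ (since $h_1 = 0$ and $|g(x)| \leq 1$), which gives the second bound.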