In a previous post, I described abstract forms of the weak regularity lemma, in which we start from an arbitrary bounded function $g: X \rightarrow [-1,1]$ and an arbitrary set $F$ of “structured” bounded functions $f: X \rightarrow [-1,1]$, and we construct an “approximatng” function $h$ that has “low complexity” relative to $F$ and is “indistinguishable” from $g$ by functions from $F$. We had two proofs, both generalizations of arguments due to Frieze and Kannan: in one proof $h$ is not bounded, but its complexity is $O(\epsilon^{-2})$, where $\epsilon$ is the approximation parameter; in the second proof, $h$ is bounded but it has exponential complexity $exp(O(\epsilon^{-2})$ in the approximation parameter.

In a new paper by Madhur Tulsiani, Salil Vadhan, and I, we give a new proof that gives an approximating function that, at the same time, is bounded and has low complexity. This has a number of applications, which I will describe in a future post.

(Note that, in the statement below, $F$ is required to contain Boolean functions, rather than bounded ones. It is possible, however, to “reduce” the bounded case to the boolean case via the following observation: if $F$ is a family of bounded functions, $F'$ is the family of boolean functions $\{ 1_{f(x) \geq t} \ : \ f\in F, t\in [-1,1] \}$, and $g$ and $h$ are $\epsilon$-indistinguishable according to $F'$, then they are also $\epsilon$-indistinguishable according to $F$.)

Theorem (Low Complexity Approximation, TTV Version) Let $(X,\mu)$ be a probability space, $g:X \rightarrow [-1,1]$ a bounded function, $F$ a collection of boolean functions $f:X \rightarrow \{ -1,1\}$, and $\epsilon$ an approximation parameter.

Then there is a function $h: X \rightarrow [-1,1]$ such that

• $h$ has low complexity relative to $F$: there are $k= O( \epsilon^{-2})$ functions $f_i\in F$ and coefficients $c_i$ such that

$\displaystyle h(x) := \max \{ -1 , \min \{ 1, \sum_{i=1}^k c_i f_i (x) \}\}$

• $h$ is $\epsilon$-indistinguishable from $g$ by $F$, that is,

$\forall f.F \ \ \left| {\mathbb E}_{x \sim \mu} f(x) g(x) - {\mathbb E}_{x \sim \mu} f(x) h(x) \right| \leq \epsilon$

That is, $h$ is simply a linear combination of functions from $F$, whose value is truncated to be between $-1$ and $1$.

Recall that, when we do not require $h$ to be bounded, $h$ can be constructed via the following algorithm:

Algorithm FK

• $h_0 :=$ 0; $t= 0$
• while $\exists f_{t+1} \in F$ such that ${\mathbb E} f_{t+1} \cdot (g-h_t) \geq \epsilon$
• $h_{t+1} := \sum_{i=1}^{t+1} \gamma f_{i}$
• $t:=t+1$
• return $h_t$

And the analysis shows that, if we call $\Delta_t := g - h_t$, then ${\mathbb E} \Delta_t^2 - \Delta_{t+1}^2 \geq 2\epsilon\gamma - \gamma^2$. Setting $\gamma := \epsilon$ shows that the algorithm stops within $\epsilon^{-2}$ steps.

Our proof proceeds by doing exactly the same, but making sure at every step that $h$ is bounded.

Algorithm TTV

• $h_0 :=$ 0; $t= 0$
• while $\exists f_{t+1} \in F$ such that ${\mathbb E} f_{t+1} \cdot (g-h_t) \geq \epsilon$
• $h_{t+1} := \max \{ -1, \min \{ 1, \sum_{i=1}^{t+1} \gamma f_i \} \}$
• $t:=t+1$
• return $h_t$

The problem with the old analysis is that now we could conceivably have a step in which $\sum_{i=1}^t \gamma f_i (x) > 1+\gamma$ for all $x$, and so $h_t = h_{t+1}$, and thus $\Delta_t = \Delta_{t+1}$, and we have no energy decrease. To get to such a state, however, there must have been about $1/\gamma$ steps in which $h$ changed in the same way as in the Frieze-Kannan case. I think there should be a sort of “amortized analysis” that would “charge” the lack of change of $h$ at certain steps to other steps in which it indeed changes, and establish that, for every time step $T$,

${\mathbb E} \Delta_0^2 - \Delta_T^2 \geq \Omega(T \cdot \gamma \cdot (\epsilon - O(\gamma))$

Unfortunately I don’t know how to construct such a proof. Our proof, instead, follows step by step Impagliazzo’s proof of the Impagliazzo Hardcore Set Lemma, which employs a very similar algorithm (in a quite different context). As shown by Klivans and Servedio, the algorithm in Impagliazzo’s proof can be seen as a boosting algorithm in learning theory (and every other known boosting algorithm can be used to prove Impagliazzo’s Hardcore Lemma); this is why we think of our proof as a “boosting” proof.

I must confess, I have never really understood Impagliazzo’s proof, and so I can’t say I really understand ours, other than being able to reproduce the steps.

The idea is to consider the quantity $f_{t+1}(x) \Delta_{t}(x)$, which depends on $t$ and $x$, and to see how it behaves summed over $t$, for a fixed $x$, and how it behaves for a fixed $t$, summed over $x$.

Suppose the algorithm is still running after $T$ steps. Then, because of the way we define the termination condition, we have, for every $t \in \{ 0,\ldots, T-1\}$

$\displaystyle (1) \ \ \ {\mathbb E}_x f_{t+1}(x) (g(x)-h_t(x)) \geq \epsilon$

and the crux of the proof is to show that for every $x$ we have

$(2) \displaystyle \ \ \ \sum_{i=0}^{T-1} f_{t+1}(x) (g(x)-h_t(x)) \leq \frac \gamma 2 \cdot T + \frac 4 {\gamma}$

So if we sum (1) over $t$ and average (2) over $x$, we get

$\displaystyle \epsilon T \leq \frac \gamma 2 \cdot T + \frac 4 {\gamma}$

and setting $\gamma := \epsilon$ gives $T \leq 8 \epsilon^{-2}$.

Inequality (2), the main step of the proof, is the part for which I have little intuition. One breaks the summation into groups of time steps, depending on the value of $\Delta_t$; there are $2/\gamma$ groups (because the value of $h_t$ changes by discrete increments of $\gamma$, and is between $-1$ and $+1$) and each one is shown to contribute $O(\gamma T') + O(1)$, where $T'$ is the number of time steps in the group.

It is perhaps instructive to translate the Frieze-Kannan proof to this set-up. In the Frieze-Kannan algorithm, we have

$\displaystyle f_{t+1}(x) (g(x)-h_t(x)) = \frac 1 {2\gamma} \cdot \left( \Delta_{t}^2 - \Delta^2_{t+1} +\gamma^2 f_{t+1}^2 \right)$

and so

$\begin{array}{ll} \displaystyle \sum_{t=0}^{T-1} f_{t+1}(x) (g(x)-h_t(x)) & = \displaystyle \frac 1{2\gamma} (\Delta_0^2 - \Delta_{T}^2) + \frac {\gamma}2 \cdot \sum_{t=1}^{T} f_t^2(x) \\ & \leq \displaystyle \frac 1 {2\gamma} + \frac \gamma 2 \cdot T \end{array}$

which is analogous to our inequality (2).