Scribed by Jonah Sherman
Summary
Today we prove the Goldreich-Levin theorem.
1. Goldreich-Levin Theorem
We use the notation

$$\langle x, r \rangle := \sum_{i=1}^{n} x_i r_i \bmod 2 \qquad\qquad (1)$$

for strings $x, r \in \{0,1\}^n$.
Theorem 1 (Goldreich and Levin) Let $f: \{0,1\}^n \rightarrow \{0,1\}^n$ be a permutation computable in time $T$. Suppose that $A$ is an algorithm of complexity $t$ such that

$$\Pr_{x,r} [ A(f(x), r) = \langle x, r \rangle ] \geq \frac{1}{2} + \epsilon \qquad\qquad (2)$$

Then there is an algorithm $A'$ of complexity at most $O((t + T) \cdot \epsilon^{-4} \cdot n \log n)$ such that

$$\Pr_{x} [ A'(f(x)) = x ] \geq \frac{\epsilon}{4} \qquad\qquad (3)$$
Last time we proved the following partial result.
Lemma 2 (Goldreich-Levin Algorithm — Weak Version) Suppose we have access to a function $H: \{0,1\}^n \rightarrow \{0,1\}$ such that, for some unknown $x$, we have

$$\Pr_{r \sim \{0,1\}^n} [ H(r) = \langle x, r \rangle ] \geq \frac{7}{8} + \epsilon \qquad\qquad (4)$$

where $x \in \{0,1\}^n$ is an unknown string.

Then there is an algorithm GLW that runs in time $O(n^2 \epsilon^{-2} \log n)$ and makes $O(n \epsilon^{-2} \log n)$ oracle queries into $H$ and, with probability at least $1 - \frac{1}{n}$, outputs $x$.
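The bit-by-bit recovery behind the weak version can be sketched in Python. This is a toy sketch, not the lecture's pseudocode: the names `glw` and `inner` and the sample count are illustrative assumptions, and `H(r + e_i)` is realized as `H` applied to `r` with bit `i` flipped (addition of strings is bitwise XOR).

```python
import random

def inner(x, r):
    # inner product mod 2 of two bit tuples
    return sum(a & b for a, b in zip(x, r)) % 2

def glw(H, n, samples=200):
    # Recover bit x_i as the majority vote of H(r + e_i) XOR H(r)
    # over random r: when H agrees with <x,.> well above a 3/4
    # fraction of inputs, both queries are simultaneously correct
    # for most r, and then H(r + e_i) XOR H(r) = <x, e_i> = x_i.
    x = []
    for i in range(n):
        votes = 0
        for _ in range(samples):
            r = tuple(random.randrange(2) for _ in range(n))
            r_i = tuple(b ^ (j == i) for j, b in enumerate(r))  # r + e_i
            votes += H(r_i) ^ H(r)
        x.append(int(2 * votes > samples))
    return tuple(x)
```

For instance, if `H` computes the parity with a fixed `x` on a $\frac{15}{16}$ fraction of inputs, each vote is correct with probability at least $\frac{7}{8}$, and the majority recovers every bit with high probability.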
This gave us a proof of a variant of the Goldreich-Levin Theorem in which the right-hand side in (2) was $\frac{7}{8} + \epsilon$. We could tweak the proof of Lemma 2 so that the right-hand side of (4) is $\frac{3}{4} + \epsilon$, leading to proving a variant of the Goldreich-Levin Theorem in which the right-hand side in (2) is also $\frac{3}{4} + \epsilon$.

We need, however, the full Goldreich-Levin Theorem in order to construct a pseudorandom generator, and so it seems that we have to prove a strengthening of Lemma 2 in which the right-hand side in (4) is $\frac{1}{2} + \epsilon$.
Unfortunately, such a stronger version of Lemma 2 is just false: for any two different $x \neq x'$ we can construct an $H$ such that

$$\Pr_r [ H(r) = \langle x, r \rangle ] \geq \frac{3}{4}$$

and

$$\Pr_r [ H(r) = \langle x', r \rangle ] \geq \frac{3}{4}$$

(for example, set $H(r) := \langle x, r \rangle$ on the half of the inputs where $\langle x, r \rangle = \langle x', r \rangle$, and split the remaining inputs evenly between the two parities), so no algorithm can be guaranteed to find $x$ given an arbitrary function $H$ such that $\Pr_r[H(r) = \langle x, r \rangle] \geq \frac{1}{2} + \epsilon$, because $x$ need not be uniquely defined by $H$.
We can, however, prove the following:
Lemma 3 (Goldreich-Levin Algorithm) Suppose we have access to a function $H: \{0,1\}^n \rightarrow \{0,1\}$ such that, for some unknown $x$, we have

$$\Pr_{r \sim \{0,1\}^n} [ H(r) = \langle x, r \rangle ] \geq \frac{1}{2} + \epsilon \qquad\qquad (5)$$

where $x \in \{0,1\}^n$ is an unknown string, and $\epsilon$ is given.

Then there is an algorithm GL that runs in time $O(n^2 \epsilon^{-4} \log n)$, makes $O(n \epsilon^{-4} \log n)$ oracle queries into $H$, and outputs a set $L \subseteq \{0,1\}^n$ such that $|L| = O(\epsilon^{-2})$ and, with probability at least $\frac{1}{2}$, $x \in L$.
The Goldreich-Levin algorithm has other interpretations (an algorithm that learns the Fourier coefficients of $H$, an algorithm that decodes the Hadamard code in sub-linear time) and various applications outside cryptography.
The Goldreich-Levin Theorem is an easy consequence of Lemma 3. Let $A'$ take input $y = f(x)$ and then run the algorithm of Lemma 3 with $H(r) := A(y, r)$ and parameter $\frac{\epsilon}{2}$, yielding a list $L$. $A'$ then checks whether $f(x') = y$ for any $x' \in L$, and outputs $x'$ if one is found.
From the assumption that

$$\Pr_{x,r} [ A(f(x), r) = \langle x, r \rangle ] \geq \frac{1}{2} + \epsilon$$

it follows by Markov's inequality (see Lemma 9 in the last lecture) that

$$\Pr_x \left[ \Pr_r [ A(f(x), r) = \langle x, r \rangle ] \geq \frac{1}{2} + \frac{\epsilon}{2} \right] \geq \frac{\epsilon}{2}$$

Let us call an $x$ such that $\Pr_r[A(f(x), r) = \langle x, r \rangle] \geq \frac{1}{2} + \frac{\epsilon}{2}$ a good $x$. If we pick $x$ at random and give $f(x)$ to the above algorithm, there is a probability at least $\frac{\epsilon}{2}$ that $x$ is good and, if so, there is a probability at least $\frac{1}{2}$ that $x$ is in the list. Therefore, there is a probability at least $\frac{\epsilon}{4}$ that the algorithm inverts $f$, where the probability is over the choices of $x$ and over the internal randomness of the algorithm.
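For completeness, the Markov step can be verified directly. Writing $p(x) := \Pr_r[A(f(x), r) = \langle x, r \rangle]$ and $q := \Pr_x\left[p(x) \geq \frac{1}{2} + \frac{\epsilon}{2}\right]$, and using $p(x) \leq 1$:

```latex
\frac{1}{2} + \epsilon
  \;\le\; \mathop{\mathbb{E}}_x [p(x)]
  \;\le\; q \cdot 1 + (1 - q) \cdot \left( \frac{1}{2} + \frac{\epsilon}{2} \right)
  \;\le\; q + \frac{1}{2} + \frac{\epsilon}{2}
```

so $q \geq \frac{\epsilon}{2}$: at least an $\frac{\epsilon}{2}$ fraction of the $x$'s are good.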
2. The Goldreich-Levin Algorithm
In this section we prove Lemma 3.
We are given an oracle $H$ such that $H(r) = \langle x, r \rangle$ for at least a $\frac{1}{2} + \epsilon$ fraction of the $r$'s. Our goal will be to use $H$ to simulate an oracle $H'$ that has agreement at least $\frac{15}{16}$ with $\langle x, \cdot \rangle$, so that we can use the algorithm GLW of Lemma 2 from the previous section to find $x$. We perform this "reduction" by "guessing" the value of $\langle x, \cdot \rangle$ at a few points.
We first choose $k$ random points $t_1, \ldots, t_k \in \{0,1\}^n$, where $k = O(\epsilon^{-2})$. (Here and below, $+$ denotes bitwise XOR of strings in $\{0,1\}^n$.) For the moment, let us suppose that we have "magically" obtained the values $\langle x, t_1 \rangle, \ldots, \langle x, t_k \rangle$. Then define $H'(r)$ as the majority value of:

$$\langle x, t_j \rangle \oplus H(r + t_j), \qquad j = 1, \ldots, k \qquad\qquad (6)$$

For each $j$, the above expression equals $\langle x, r \rangle$ with probability at least $\frac{1}{2} + \epsilon$ (over the choice of $t_j$), and by choosing $k = O(\epsilon^{-2})$ we can ensure that, for every $r$,

$$\Pr_{t_1, \ldots, t_k} [ H'(r) \neq \langle x, r \rangle ] \leq \frac{1}{64}$$

from which it follows, by Markov's inequality, that

$$\Pr_{t_1, \ldots, t_k} \left[ \Pr_r [ H'(r) = \langle x, r \rangle ] \geq \frac{15}{16} \right] \geq \frac{3}{4}$$
Consider the following algorithm.
- Algorithm GL-First-Attempt
- pick $t_1, \ldots, t_k \in \{0,1\}^n$, where $k = O(\epsilon^{-2})$
- for all $b_1, \ldots, b_k \in \{0,1\}$:
- define $H'(r)$ as majority of: $b_j \oplus H(r + t_j)$, $j = 1, \ldots, k$
- apply Algorithm GLW to $H'$
- add result to list
- return list
The idea behind this program is that we do not in fact know the values $\langle x, t_j \rangle$, but we can "guess" them by considering all $2^k$ choices for the bits $b_1, \ldots, b_k$. If $H$ agrees with $\langle x, \cdot \rangle$ on at least a $\frac{1}{2} + \epsilon$ fraction of the $r$'s, then there is a probability at least $\frac{3}{4}$ that, in the iteration in which the guesses are correct (that is, $b_j = \langle x, t_j \rangle$ for all $j$), we invoke algorithm GLW with a simulated oracle $H'$ that has agreement at least $\frac{15}{16}$ with $\langle x, \cdot \rangle$. Therefore, the final list contains $x$ with probability at least $\frac{3}{4} - \frac{1}{n} \geq \frac{1}{2}$.

The obvious problem with this algorithm is that its running time is exponential in $k = O(\epsilon^{-2})$, and the resulting list may contain up to $2^{O(\epsilon^{-2})}$ entries, exponentially more than the $O(\epsilon^{-2})$ bound promised by the Lemma.
To overcome these problems, consider the following similar algorithm.
- Algorithm GL
- pick $t_1, \ldots, t_k \in \{0,1\}^n$, where $k$ is such that $2^k - 1 \geq 16 \epsilon^{-2}$, i.e. $k = O(\log \frac{1}{\epsilon})$
- define $t_S := \sum_{j \in S} t_j$ for each non-empty $S \subseteq \{1, \ldots, k\}$
- for all $b_1, \ldots, b_k \in \{0,1\}$:
- define $b_S := \sum_{j \in S} b_j \bmod 2$ for each non-empty $S \subseteq \{1, \ldots, k\}$
- define $H'(r)$ as majority of: $b_S \oplus H(r + t_S)$, over non-empty $S \subseteq \{1, \ldots, k\}$
- apply Algorithm GLW to $H'$
- add result to list
- return list
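The loop structure above can be sketched in Python. This is a self-contained toy sketch under assumed small parameters: the names `gl`, `glw`, `inner`, `xor` and the sample counts are illustrative, and the weak decoder is inlined so that the sketch runs on its own.

```python
import random
from itertools import combinations, product

def inner(x, r):
    # inner product mod 2
    return sum(a & b for a, b in zip(x, r)) % 2

def xor(u, v):
    # bitwise XOR of two bit tuples (the "+" of the notes)
    return tuple(a ^ b for a, b in zip(u, v))

def glw(H, n, samples=60):
    # weak decoder: recover x_i as the majority of H(r + e_i) XOR H(r);
    # needs H to agree with <x,.> well above a 3/4 fraction of inputs
    x = []
    for i in range(n):
        votes = 0
        for _ in range(samples):
            r = tuple(random.randrange(2) for _ in range(n))
            r_i = tuple(b ^ (j == i) for j, b in enumerate(r))
            votes += H(r_i) ^ H(r)
        x.append(int(2 * votes > samples))
    return tuple(x)

def gl(H, n, k):
    # list decoder: guess <x, t_j> for k random t_j, expand the guesses
    # to the 2^k - 1 pairwise-independent points t_S, amplify H into H2
    # by a majority vote over the subsets S, then run the weak decoder
    t = [tuple(random.randrange(2) for _ in range(n)) for _ in range(k)]
    subsets = [S for size in range(1, k + 1)
               for S in combinations(range(k), size)]
    tS = {}
    for S in subsets:
        v = (0,) * n
        for j in S:
            v = xor(v, t[j])
        tS[S] = v
    out = set()
    for guess in product((0, 1), repeat=k):  # all 2^k choices of b_1..b_k
        bS = {S: sum(guess[j] for j in S) % 2 for S in subsets}
        def H2(r):
            votes = sum(bS[S] ^ H(xor(r, tS[S])) for S in subsets)
            return int(2 * votes > len(subsets))
        out.add(glw(H2, n))
    return out  # at most 2^k candidate strings
```

In the iteration where every guess is correct, `H2` is the amplified oracle $H'$ of the analysis below, and the returned set plays the role of the list $L$ of Lemma 3.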
Let us now see why this algorithm works. First we define, for any nonempty $S \subseteq \{1, \ldots, k\}$,

$$t_S := \sum_{j \in S} t_j$$

Then, since $t_1, \ldots, t_k$ are random and independent, it follows that, for any $S \neq S'$, the strings $t_S$ and $t_{S'}$ are independent and uniformly distributed; that is, the $t_S$ are pairwise independent. Now consider an $x$ such that $H(r)$ and $\langle x, r \rangle$ agree on a $\frac{1}{2} + \epsilon$ fraction of the values of $r$. Then for the choice of $b_1, \ldots, b_k$ where $b_j = \langle x, t_j \rangle$ for all $j$ we have that

$$b_S = \langle x, t_S \rangle$$

for every non-empty $S$. In such a case, for every $r$ and every $S$, there is a probability at least $\frac{1}{2} + \epsilon$ over the choices of the $t_j$ that

$$b_S \oplus H(r + t_S) = \langle x, r \rangle$$

and these events are pairwise independent. Note the following simple lemma.
Lemma 4 Let $X_1, \ldots, X_m$ be a set of pairwise independent 0/1 random variables, each of which is $1$ with probability at least $\frac{1}{2} + \epsilon$. Then

$$\Pr \left[ \sum_{i=1}^{m} X_i \leq \frac{m}{2} \right] \leq \frac{1}{4 \epsilon^2 m}$$

Proof: Let $X := \sum_i X_i$, so that $\mathbb{E}[X] \geq \left( \frac{1}{2} + \epsilon \right) m$. The variance of a 0/1 random variable is at most $\frac{1}{4}$, and, because of pairwise independence, $\mathrm{Var}[X] = \sum_i \mathrm{Var}[X_i] \leq \frac{m}{4}$.

We then have, by Chebyshev's inequality,

$$\Pr \left[ X \leq \frac{m}{2} \right] \leq \Pr \left[ \, | X - \mathbb{E}[X] | \geq \epsilon m \, \right] \leq \frac{\mathrm{Var}[X]}{\epsilon^2 m^2} \leq \frac{1}{4 \epsilon^2 m} \qquad \Box$$
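Lemma 4 can be sanity-checked numerically in the exact setting where it is used: the votes $b_S \oplus H(r + t_S)$ are pairwise independent over the choice of the $t_j$, each correct with probability equal to the agreement of $H$ with $\langle x, \cdot \rangle$. The following toy Python check (the helper name `majority_error_rate` and all parameters are illustrative assumptions) uses an $H$ with agreement exactly $\frac{3}{4}$, so the lemma bounds the majority's error by $\frac{1}{4 \epsilon^2 m}$ with $\epsilon = \frac{1}{4}$ and $m = 2^k - 1 = 15$.

```python
import random
from itertools import combinations

def inner(x, r):
    return sum(a & b for a, b in zip(x, r)) % 2

def xor(u, v):
    return tuple(a ^ b for a, b in zip(u, v))

def majority_error_rate(n=8, k=4, trials=2000):
    # Estimate how often the majority of the 2^k - 1 pairwise-independent
    # votes b_S XOR H(r + t_S) disagrees with <x, r>, for a fixed r,
    # over fresh choices of t_1..t_k.  Lemma 4 bounds this probability
    # by 1 / (4 * eps^2 * m) with eps = 1/4 and m = 2^k - 1 here.
    x = tuple(random.randrange(2) for _ in range(n))
    def H(r):  # agrees with <x,.> on exactly a 3/4 fraction of inputs
        return inner(x, r) ^ (1 if int(''.join(map(str, r)), 2) % 4 == 0 else 0)
    subsets = [S for size in range(1, k + 1)
               for S in combinations(range(k), size)]
    r = tuple(random.randrange(2) for _ in range(n))
    wrong = 0
    for _ in range(trials):
        t = [tuple(random.randrange(2) for _ in range(n)) for _ in range(k)]
        votes = 0
        for S in subsets:
            tS = (0,) * n
            for j in S:
                tS = xor(tS, t[j])
            # vote with the correct guesses b_S = <x, t_S>
            votes += inner(x, tS) ^ H(xor(r, tS)) == inner(x, r)
        wrong += votes <= len(subsets) // 2
    return wrong / trials
```

The empirical error rate comes out far below the Chebyshev bound of $\frac{1}{4 \cdot (1/4)^2 \cdot 15} \approx 0.27$, as expected, since Lemma 4 only uses pairwise independence.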
Lemma 4 allows us to upper-bound the probability that the majority operation used to compute $H'(r)$ gives the wrong answer. Combining this with our earlier observation that the votes $b_S \oplus H(r + t_S)$ are pairwise independent, we see that choosing $k$ such that $2^k - 1 \geq 16 \epsilon^{-2}$ suffices to ensure that, for every $r$, $\Pr[H'(r) \neq \langle x, r \rangle] \leq \frac{1}{64}$, and hence that $H'$ and $\langle x, \cdot \rangle$ have agreement at least $\frac{15}{16}$ with probability at least $\frac{3}{4}$. Thus we can use Algorithm GLW to obtain $x$ with high probability. Choosing $k$ as above ensures that the list generated is of length at most $2^k = O(\epsilon^{-2})$, and the running time is then $O(n^2 \epsilon^{-4} \log n)$ with $O(n \epsilon^{-4} \log n)$ oracle accesses, due to the $O(\epsilon^{-2})$ iterations of Algorithm GLW, each of which makes $O(n \log n)$ oracle accesses (its agreement parameter now being the constant $\frac{1}{16}$), and to the fact that one evaluation of $H'$ requires $O(\epsilon^{-2})$ evaluations of $H$.