Scribed by Anupam Prakash

Summary

Today we finish the analysis of a construction of a pseudorandom permutation (block cipher) given a pseudorandom function.

1. The Luby-Rackoff Construction

Recall that if {F:\{ 0,1 \}^m \rightarrow \{ 0,1 \}^m} is a function, then we define the Feistel permutation {D_F: \{ 0,1 \}^{2m} \rightarrow \{ 0,1 \}^{2m}} associated with {F} as

\displaystyle   D_F(x,y) := y, x\oplus F(y) \ \ \ \ \ (1)

Let {F: \{ 0,1 \}^k \times \{ 0,1 \}^m \rightarrow \{ 0,1 \}^m} be a pseudorandom function, we define the following function {P: \{ 0,1 \}^{4k} \times \{ 0,1 \}^{2m} \rightarrow \{ 0,1 \}^{2m}}: given a key {\overline K (K_1,\ldots,K_4)} and an input {x},

\displaystyle   P_{\overline K} (x) := D_{F_{K_4}} ( D_{F_{K_3}} ( D_{F_{K_2}} (D_{F_{K_1}} (x ))))\ \ \ \ \ (2)

If {\overline F = F_1,F_2,F_3,F_4} are four functions, then {P_{\overline F}} is the same as the above construction but using the functions {F_i}:

\displaystyle   P_{\overline F} (x) := D_{F_{4}} ( D_{F_{3}} ( D_{F_{2}} (D_{F_{1}} (x ))))\ \ \ \ \ (3)

If {A} is an oracle algorithm, we define as {S(A)} the probabilistic process in which we run a simulation of {A} in which we reply to each query with a random answer.

2. Today’s Proof

The proof of the following result is what was missing from yesterday’s analysis.

Lemma 1 For every non-repeating algorithm {A} of complexity {\leq t} we have

\displaystyle  \left| \mathop{\mathbb P}_{\overline {F} } [ A^{P_{\overline{R}}, P^{-1} _{\overline{R}} } () =1 ] - \mathop{\mathbb P} [ S(A) = 1] \right|

\displaystyle  \leq \frac{t^2}{ 2\cdot 2^{2m}} + \frac {t^2}{ 2^{m}}

Proof: The transcript of {A}‘s computation consists of all the oracle queries made by {A}. The notation {(x,y,0)} represents a query to the {\pi} oracle at point {x} while {(x,y,1)} is a query made to the {\pi^{-1}} oracle at {y}. The set {T} consists of all valid transcripts for computations where the output of {A} is 1 while {T^{'} \subset T} consists of transcripts in {T} consistent with {\pi} being a permutation.

We write the difference in the probability of {A} outputting 1 when given oracles {(P_{\overline{R}}, P_{\overline{R}}^{-1})} and when given a random oracle as in {S(A)} as a sum over transcripts in {T}.

\displaystyle  \begin{array}{c} \left| \mathop{\mathbb P}_{\overline {F} } [ A^{P_{\overline{R}}, P^{-1} _{\overline{R}} } () =1 ] - \mathop{\mathbb P} [ S(A) = 1] \right| \\ = \left| \sum_{\tau \in T} \left(\mathop{\mathbb P}_{\overline {F} } [ A^{P_{\overline{R}}, P^{-1} _{\overline{R}} } () \leftarrow \tau ] - \mathop{\mathbb P} [ S(A) \leftarrow \tau]\right) \right| \end{array} \ \ \ \ \ (4)

We split the sum over {T} into a sum over {T^{'}} and {T\setminus T^{'}} and bound both the terms individually. We first handle the simpler case of the sum over {T\setminus T^{'}}.

\displaystyle   \begin{array}{rcl} && \displaystyle \left| \sum_{\tau \in T\setminus T^{'}} \left(\mathop{\mathbb P}_{\overline {F} } [ A^{P_{\overline{R}}, P^{-1} _{\overline{R}} } () \leftarrow \tau ] - \mathop{\mathbb P} [ S(A) \leftarrow \tau] \right) \right| \\ & =& \displaystyle \left| \sum_{\tau \in T\setminus T^{'}} \left(\mathop{\mathbb P} [ S(A) \leftarrow \tau] \right) \right| \\ & \leq & \displaystyle \frac{t^{2}}{2.2^{2m}} \end{array} \ \ \ \ \ (5)

The first equality holds as a transcript obtained by running {A} using the oracle {(P_{\overline{R}}, P_{\overline{R}}^{-1})} is always consistent with a permutation. The transcript generated by querying an oracle is inconsistent with a permutation iff. points {x,y} with {f(x)=f(y)} are queried. {S(A)} makes at most {t} queries to an oracle that answers every query with an independently chosen random string from {\{ 0, 1\}^{2m}}. The probability of having a repetition is at most {(\sum_{i=1}^{t-1} i)/2^{2m} \leq t^{2}/2^{2m+1}}.

Bounding the sum over transcripts in {T^{'}} will require looking into the workings of the construction. Fix a transcript {\tau \in T^{'}} given by {(x_{i}, y_{i}, b_{i}), 1\leq i \leq q}, with the number of queries {q \leq t}. Each {x_{i}} can be written as {(L_{i}^{0}, R_{i}^{0})} for strings {L_{i}^{0}, R_{i}^{0}} of length {m} corresponding to the left and right parts of {x_{i}}. The string {x_{i}} goes through {4} iterations of {D} using the function {F_{k}, 1\leq k\leq 4} for the {k}th iteration. The output of the construction after iteration {k}, {0 \leq k \leq 4} for input {x_{i}} is denoted by {(L_{i}^{k}, R_{i}^{k})}.

Functions {F_{1},F_{4}} are said to be good for the transcript {\tau} if the multisets {\{R^{1}_{1}, R^{1}_{2}, \cdots, R^{1}_{q}\}} and {\{L^{3}_{1}, L^{3}_{2}, \cdots, L^{3}_{q}\}} do not contain any repetitions. We bound the probability of {F_{1}} being bad for {\tau} by analyzing what happens when {R^{1}_{i}=R^{1}_{j}} for some {i,j}:

\displaystyle  R^{1}_{i} = L^{0}_{i} \oplus F_{1}( R^{0}_{i})

\displaystyle  R^{1}_{j} = L^{0}_{j} \oplus F_{1}( R^{0}_{j})

\displaystyle  0 = L^{0}_{i} \oplus L^{0}_{j} \oplus F_{1}( R^{0}_{i}) \oplus F_{1}( R^{0}_{j})  \ \ \ \ \ (6)

The algorithm {A} does not repeat queries so we have {(L_{i}^{0}, R_{i}^{0}) \neq (L_{j}^{0}, R_{j}^{0})}. We observe that {R_{i}^{0} \neq R_{j}^{0}} as equality together with equation (6) above would yield {x_{i}=x_{j}}. This shows that equation (6) holds only if {F_{1}( R^{0}_{j})= s \oplus F_{1}( R^{0}_{i}) }, for a fixed {s} and distinct strings {R^{0}_{i}} and {R^{0}_{j}}. This happens with probability {1/2^{m}} as the function {F_{1}} takes values from {\{ 0, 1\}^{m}} independently and uniformly at random. Applying the union bound for all pairs {i,j},

\displaystyle   Pr_{F_{1}}[ \exists i,j \in [q], \; \; R^{1}_{i}=R^{1}_{j} ] \leq \frac{t^{2}}{2^{m+1}} \ \ \ \ \ (7)

We use a similar argument to bound the probability of {F_{4}} being bad. If {L^{3}_{i}=L^{3}_{j}} for some {i,j} we would have:

\displaystyle  \begin{array}{rcl}  L^{3}_{i} &=& R^{4}_{i} \oplus F_{4}( L^{4}_{i}) \\ L^{3}_{j} &=& R^{4}_{j} \oplus F_{4}( L^{4}_{j}) \end{array}

\displaystyle  0 = R^{4}_{i} \oplus R^{4}_{j} \oplus F_{4}( L^{4}_{i}) \oplus F_{4}( L^{4}_{j})  \ \ \ \ \ (8)

The algorithm {A} does not repeat queries so we have {(L_{i}^{4}, R_{i}^{4}) \neq (L_{j}^{4}, R_{j}^{4})}. We observe that {L_{i}^{4} \neq L_{j}^{4}} as equality together with equation (8) above would yield {y_{i}=y_{j}}. This shows that equation (8) holds only if {F_{4}( L^{4}_{j})= s^{'} \oplus F_{4}( L^{4}_{i}) }, for a fixed string {s^{'}} and distinct strings {L^{4}_{i}} and {L^{4}_{j}}. This happens with probability {1/2^{m}} as the function {F_{4}} takes values from {\{ 0, 1\}^{m}} independently and uniformly at random. Applying the union bound for all pairs {i,j},

\displaystyle   Pr_{F_{4}}[ \exists i,j\in [q], \; \; L^{3}_{i}=L^{3}_{j} ] \leq \frac{t^{2}}{2^{m+1}} \ \ \ \ \ (9)

Equations (7) and (9) together imply that

\displaystyle   Pr_{F_{1}, F_{4}}[ F_{1}, F_{4} \text{ not good for transcript } \tau ] \leq \frac{t^{2}}{2^{m}} \ \ \ \ \ (10)

Continuing the analysis, we fix good functions {F_{1}}, {F_{4}} and the transcript {\tau}. We will show that the probability of obtaining {\tau} as a transcript in this case is the same as the probability of obtaining {\tau} for a run of {S(A)}. Let {\tau= (x_{i}, y_{i}, b_{i}), 1 \leq i \leq q \leq t}. We calculate the probability of obtaining {y_{i}} on query {x_{i}} over the choice of {F_{2}} and {F_{3}} .

The values of the input {x_{i}} are in bijection with pairs {(L_{i}^{1}, R_{i}^{1})} while the values of the output {y_{i}} are in bijection with pairs {(L_{i}^{3}, R_{i}^{3})}, after fixing {F_{1}} and {F_{4}}. We have the relations (from (1)(3)):

\displaystyle  \begin{array}{rcl}  L_{i}^{3} = R_{i}^{2} = L_{i}^{1} \oplus F_{2}( R_{i}^{1} ) \\ R_{i}^{3} = L_{i}^{2} \oplus F_{3}( R_{i}^{2}) = R_{i}^{1} \oplus F_{3}( L_{i}^{3} ) \end{array}

These relations imply that {(x_{i}, y_{i})} can be an input output pair if and only if we have {F_{2}( R_{i}^{1}), F_{3}( L_{i}^{3})= (L_{i}^{3} \oplus L_{i}^{1}, R_{i}^{3} \oplus R_{i}^{1})}. Since {F_{2}} and {F_{3}} are random functions with range {\{0,1\}^{m}}, the pair {(x_{i}, y_{i})} occurs with probability {2^{-2m}}. The values {R^{1}_{i}} and {L^{3}_{i}, (i \in [q])} are distinct because the functions {F_{1}} and {F_{4}} are good. This makes the occurrence of {(x_{i}, y_{i})} independent from the occurrence of {(x_{j}, y_{j})} for {i \neq j}. We conclude that the probability of obtaining the transcript {\tau} equals {2^{-2mq}}.

The probability of obtaining transcript {\tau} equals {2^{-2mq}} in the simulation {S(A)} as every query is answered by an independent random number from {\{0,1\}^{2m}}. Hence,

\displaystyle  \begin{array}{rcl} && \displaystyle \left| \sum_{\tau \in T^{'}} \left(\mathop{\mathbb P}_{\overline {F} } [ A^{P_{\overline{R}}, P^{-1} _{\overline{R}} } () \leftarrow \tau ] - \mathop{\mathbb P} [ S(A) \leftarrow \tau] \right) \right| \\ &\leq & \displaystyle \left| \sum_{\tau \in T^{'}} \mathop{\mathbb P}_{F_{2}, F_{3} } \left[ A^{P_{\overline{R}}, P^{-1} _{\overline{R}} } () \leftarrow \tau | F_{1}, F_{4} \text { not good for } \tau \right] \right| \\ & \leq & \displaystyle \frac{t^{2}}{2^{m}} \left | \sum_{\tau \in T^{'}} \mathop{\mathbb P}_{F_{2}, F_{3} } [ A^{P_{\overline{R}}, P^{-1} _{\overline{R}} } () \leftarrow \tau] \right | \\ & \leq & \displaystyle \frac{t^{2}}{2^{m}} \end{array}  \ \ \ \ \ (11)

The statement of the lemma follows by adding equations (5) and (11) and using the triangle inequality. \Box

This concludes the analysis of the Luby-Rackoff scheme for constructing pseudorandom permutations from a family of pseudorandom functions.

About these ads