*In which we encounter for the first time message indistinguishability and semantic security*

In the last lecture we saw that

- all classical encryption schemes which allow the encryption of arbitrarily long messages have fatal flaws;
- it is possible to encrypt with perfect security using one-time pad, but the scheme can be used only once, and the key has to be as long as the message;
- if one wants perfect security, one needs a key as long as the total length of all messages that are going to be sent.

Our goal for the next few lectures will be to study schemes that allow the sending of messages that are essentially arbitrarily long, using a fixed key, and having a security that is essentially as good as the perfect security of one-time pad.

Today we introduce a notion of security (*semantic security*) that is extremely strong. When it is met *there is no point for an adversary to eavesdrop the channel*, regardless of what messages are being sent, of what she already knows about the message, and what goal she is trying to accomplish.

First, let us fix the model in which we are going to work. For the time being, we are going to be very modest, and we shall only try to construct an encryption scheme that, like one-time pad, is designed for only one use. We just want the key to be reasonably short and the message to be of reasonably large length.

We shall also restrict ourselves to *passive* adversaries, meaning that Eve is able to see the communication between Alice and Bob, but she cannot inject her own messages in the channel, and she cannot prevent messages from being delivered.

The definition of *correctness* for an *encryption scheme* is straightforward.

Definition 1[Symmetric-Key Encryption Scheme -- Finite case] A symmetric-key encryption scheme with key-length , plain-text length and ciphertext-length is is a pair of probabilistic algorithms , such that , , and for every key and every message ,

where the probability is taken over the randomness of the algorithms

Definition 2[Symmetric-Key Encryption Scheme -- Variable Key Length Case] A symmetric-key encryption scheme with variable key-length is is a pair of polynomial-time probabilistic algorithms and a function , such that for everysecurity parameter, for every key and every message ,

(Although it is redundant to give the algorithm the parameter , we do so because this will emphasize the similarity with the public-key setting that we shall study later.) It will be more tricky to satisfactorily formalize the notion of *security*.

One super-strong notion of security, which is true for the one-time pad is the following:

Definition 3[Perfect Security] A symmetric-key encryption scheme isperfectly secureif, for every two messages , the distributions and are identical, where we consider the distribution over the randomness of the algorithm and over the choice of .

The informal discussion from the previous lecture gives a hint to how to solve the following

Exercise 4Prove that if is perfectly secure, then .

Before we move on, let us observe two limitations that will be present in any possible definitions of security involving a key of length which is much smaller than the message length. Eve can always employ one of the following two trivial attacks:

- In time , Eve can enumerate all keys and produce a list of plaintexts, one of which is correct. Further considerations can help her prune the list, and in certain attack models (which we shall consider later), she can figure out with near certainty which one is right, recover the key and totally break the system.
- Eve can make a random guess of what the key is, and be correct with probability .

Already if , however, neither line of attack is worrisome for Alice and Bob. Even if Eve has access to the fastest of super-computers, Alice and Bob will be long dead of old age before Eve is done with the enumeration of all keys; and Alice and Bob are going to both going to be stricken by lighting, and then both hit by meteors, with much higher probability than . The point, however, is that *any* definition of security will have to involve a bound on Eve’s running time, and allow for a low probability of break. If the bound on Eve’s running time is enormous, and the bound on the probability of a break is minuscule, then the definition is as satisfactory as if the former was infinite and the latter was zero.

All the definitions that we shall consider involve a bound on the complexity of Eve, which means that we need to fix a model of computation to measure this complexity. We shall use (non-uniform) *circuit complexity* to measure the complexity of Eve, that is measure the number of gates in a boolean circuit implementing Eve’s functionality.

If you are not familiar with circuit complexity, the following other convention is essentially equivalent: we measure the running time of Eve (for example on a RAM, a model of computation that captures the way standard computers work) and we *add the length of the program* that Eve is running. The reason is that, without this convention, we would never be able to talk about the complexity of computing finite functions. Every function of an 128-bit input, for example, is very efficiently computable by a program which is nothing but a series of if-then-elses.

Finally we come to our first definition of security:

Definition 5[Message Indistinguishability -- concrete version] We say that an encryption scheme is message indistinguishable if for every two messages , and for every boolean function of complexity , we have

where the probability is taken over the randomness of and the choice of .

(Typical parameters that are considered in practice are and .)

When we have a family of ciphers that allow varying key lengths, the following asymptotic definition is standard.

Definition 6[Negligible functions] A function isnegligibleif for every polynomial and for every sufficiently large

Definition 7[Message Indistinguishability -- asymptotic definition] We say that a variable key length encryption scheme is message indistinguishable if for every polynomial there is a negligible function such that for all sufficiently large the scheme is -message indistinguishable when the security parameter is .

The motivation for the asymptotic definition, is that we take polynomial time to be an upper bound to the amount of steps that any efficient computation can take, and to the “number of events” that can take place. This is why we bound Eve’s running time by a polynomial. The motivation for the definition of negligible functions is that if an event happens with negligible probability, then the expected number of experiments that it takes for the event to happen is superpolynomial, so it will “never” happen. Of course, in practice, we would want the security parameters of a variable-key scheme to be exponential, rather than merely super-polynomial.

Why do we use message indistinguishability as a formalization of security?

A first observation is that if we take and put no limit on , then message-indistinguishability becomes perfect security, so at least we are dealing with a notion whose “limit” is perfect security.

A more convincing explanation is that message indistinguishability is equivalent to *semantic security*, a notion that we describe below and that, intuitively, says that Eve might as well not look at the channel.

What does it mean that “Eve might as well not look at the channel”? Let us summarize Eve’s information and goals. Alice has a message that is sent to Bob over the channel. The message comes from some distribution (for example, it is written in English, in a certain style, it is about a certain subject, and so on), and let’s assume that Eve knows . Eve might also know more about the specific message being sent, because of a variety of reasons; call the information that Eve has about the message. Finally, Eve has a goal in eavesdropping the conversation, which is to learn some information about the message. Perhaps she wants to reconstruct the message in its entirety, but it could also be that she is only interested in a single bit. (Does contain the string “I hate Eve”? Is a confidential report stating that company is going to miss its earning estimate? And so on.)

Why is Eve even bothering tapping the channel? Because via some cryptoanalytic algorithm , which runs in a reasonable amount of time, she thinks she has a good chance to accomplish. But the probability of accomplishing her goal would have been essentially the same *without* tapping the channel, then there is no point.

Definition 8[Semantic Security -- Concrete definition] An encryption scheme is semantically secure if for every distribution over messages, every functions and (of arbitrary complexity) and every function of complexity , there is a function of complexity such that

Think, as before, of and , and suppose is quite small (so that a computation of complexity can be performed in a few seconds or less), and notice how the above definition captures the previous informal discussion.

Now let’s see that semantic security is *equivalent* to message indistinguishability.

Lemma 9[Semantic Security Implies Message Indistinguishability] If is semantically secure, then it is message indistinguishable.

Note that semantic security implies message indistinguishability *regardless of the overhead parameter* .

*Proof:* We prove that if is *not* message indistinguishable then it is *not* semantically secure regardless of how large is .

If is not message indistinguishable, then there are two messages and an algorithm of complexity such that

Pick a bit uniformly at random in ; then we have

And now take to be , to be the distribution for a random , and define , and to be empty. Then

On the other hand, for every , regardless of complexity

and so we contradict semantic security. ◻

Lemma 10[Message Indistinguishability Implies Semantic Security] If is message indistinguishable and has complexity , then is semantically secure, where is the maximum length of over .

*Proof:* Fix a distribution , an information function , a goal function , and a cryptoanalytic algorithm of complexity .

Take , so that the complexity of is equal to the complexity of plus the complexity of .

For every message , we have

Otherwise defining would contradict the indistinguishabiltiy.

Averaging over in

and so

◻

It is also possible to define an asymptotic version of semantic security, and to show that it is equivalent to the asymptotic version of message indistinguishability.

Definition 11[Semantic Security -- Asymptotic Definition] An encryption scheme is semantically secure if for every polynomial there exists a polynomial and a negligible function such that is semantically secure for all sufficiently large .

Exercise 12Prove that a variable key-length encryption scheme is asymptotically semantically secure if and only if it is asymptotically message indistinguishable,

## 9 comments

Comments feed for this article

June 14, 2011 at 10:13 pm

SomWouldn’t it be easier to understand Definition 8, if we take the same adversary A in both instances ? In one case, A gets I(m) and Enc(k,m) and in the other case it gets I(m) and a random string in C (the set of cipher texts). Due to this change, the overhead parameter o will also not be required. Am I missing any subtle issue here ?

On the other hand, does this change make it more difficult to prove the equivalence between Indistinguishability and Semantic security ?

June 16, 2011 at 4:03 pm

lucaSom: if you use this definition, then it is very easy to prove equivalence with message indistinguishability, in fact your definition is almost the same as message indistinguishability. Indeed, implicitly, the proof of equivalence beween semantic security and message indistinguishability goes by the chain of implications

message indistinguishability -> your definition -> semantic security

by following the reverse directions in a proof by contradiction. (That is, a violation of semantic security can be made into a violation in which the adversaries are the same, which leads to a violation of message indistinguishability.)

December 29, 2011 at 5:46 pm

QuoraIs it safe to use the hash of the plain text as a key for encryption?…Semantic security is relevant to symmetric encryption. The idea is simply that your adversary should not be able to distinguish between the encryption of two chosen plaintexts. Check out Luca Trevesian’s lecture notes talking about semantic security f…

August 31, 2012 at 9:43 pm

LetaExcellent blog here! Also your site loads up fast! What web host are

you using? Can I get your affiliate link to your host?

I wish my site loaded up as fast as yours lol

March 6, 2013 at 11:05 pm

www.maidbrigade.comI do not even understand how I ended up here, however I assumed this submit was once great.

I don’t recognize who you are but certainly you are going to a famous blogger in case you aren’t already.

Cheers!

March 19, 2013 at 8:11 am

Working to formalize indistinguishability |[...] power, and the crypto underlying SSL/TLS wilts under that assumption. In contrast, the notion of semantic security actually does apply, yet it is not broad enough. Semantic security essentially gives an adversary [...]

March 28, 2013 at 11:59 am

CrystalAnd when such basic principles are disturbed,

neither does the IMD, the Eurogroup of the EU and IMF

did to paphos car hire today is poised to announce what president

Clifton E. There are a lot of money and are an informed

buyer. Budget Singapore renters will benefit from this tech whether or

not your personal or family car is that it makes you dizzy.

How about the services while hiring the car.

But it’s a little harder to find cinema or theatre that doesn’t come with a ticket price.

July 30, 2013 at 9:37 pm

AndreaHi Luca,

very nice lecture, thanks for posting!

There is one thing I do not understand in the proof: message indistinguishability -> semantic security

What do you mean by “averaging over M” at the end?

Why do we need that?

April 6, 2014 at 10:17 am

TravisIncredible! This blog looks exactly like my old one! It’s on a completely different topic but it has pretty much the

same page layout and design. Outstanding choice of colors!