*In which we prove properties of expander graphs.*

**1. Quasirandomness of Expander Graphs **

Recall that if is a -regular graph, and is its adjacency matrix, then, if we call the eigenvalues of with repetitions, we are interested in the parameter , and we have

where is the matrix with a one in each entry, and is the matrix norm .

Our fist result today is to show that, when is small, the graph has the following *quasirandomness* property: for every two disjoint sets , the number of edges between and is close to what we would expect in a random graph of average degree , that is, approximately .

For two (possibly overlapping) sets of vertices , we define to be the number of edges with one endpoint in and one endpoint in , with edges having both endpoints in , if any, counted twice.

Lemma 1 (Expander Mixing Lemma)Let be a -regular graph, and let and be two disjoint subsets of vertices. Then

*Proof:* We have

and

so

Note that, for every disjoint , we have , and so the right-hand side in the expander mixing lemma is at most , which is a small fraction of the total number of edges if is small compared to .

**2. Random Walks in Expanders **

A *-step random walk* is the probabilistic process in which we start at a vertex, then we pick uniformly at random one of the edges incident on the vertices and we move to the other endpoint of the edge, and then repeat this process times.

If is the normalized adjacency matrix of an undirected regular graph , then is the probability that, in one step, a random walk started at reaches . This is why the normalized adjacency matrix of a regular graph is also called its *transition matrix*.

Suppose that we start a random walk at a vertex chosen according to a probability distribution , which we think of as a vector such that for every and . After taking one step, the probability of being at vertex is , which means that the probability distribution after one step is described by the vector , and because of the symmetric of , this is the same as .

Iterating the above reasoning, we see that, after a -step random walk whose initial vertex is chosen according to distribution , the last vertex reached by the walk is distributed according to .

The parameter of is equal to , and so if has a parameter bounded away from , and if is large enough, we have that the parameter of is very small, and so is close to in matrix norm. If was actually equal to , then would be equal to the uniform distribution, for every distribution . We would thus expect to be close to the uniform distribution for large enough .

Before formalizing the above intuition, we need to fix a good measure of distance for distributions. If we think of distributions as vectors, then a possible notion of distance between two distributions is the Euclidean distance between the corresponding vectors. This definition, however, has various shortcoming and, in particular, can assign small distance to distributions that are intuitively very different. For example, suppose that and are distributions that are uniform over a set , and over the complement of , respectively, where is a set of size . Then all the entries of are and so , which is vanishingly small even though distributions over disjoint supports should be considered as maximally different distributions.

A very good measure is the *total variation distance*, defined as

that is, as the maximum over all events of the difference between the probability of the event happening with respect to one distribution and the probability of it happening with respect to the other distribution. This measure is usually called *statistical distance* in computer science. It is easy to check that the total variation distance between and is precisely . Distributions with disjoint support have total variation distance 1, which is largest possible.

Lemma 2 (Mixing Time of Random Walks in Expanders)Let be a regular graph, and be its normalized adjacency matrix. Then for every distribution over the vertices and every , we havewhere is the uniform distribution.

In particular, if , then , where is an absolute constant.

*Proof:* Let be the normalized adjacency matrix of a clique with self-loops. Then, for every distribution , we have . Recall also that .

We have

The last result that we discussed today is one more instantiation of the general phenomenon that “if is small then a result that is true for the clique is true, within some approximation, for .”

Suppose that we take a -step random walk in a regular graph starting from a uniformly distributed initial vertex. If is a clique with self-loops, then the sequence of vertices encountered in the random walk is a sequence of independent, uniformly distributed, vertices. In particular, if is a bounded function, the Chernoff-Hoeffding bounds tell us that the empirical average of over the points of the random walk is very close to the true average of , except with very small probability, that is, if we denote by the set of vertices encountered in the random walk, we have

where . A corresponding Chernoff-Hoeffding bound can be proved for the case in which the random walk is taken over a regular graph such that is small.

Lemma 3 (Chernoff-Hoeffding Bound for Random Walks in Expanders)Let be a regular graph, and the distribution of -tuples constructed by sampling independently, and then performing a -step random walk starting at . Let be any bounded function. Then

We will not prove the above result, but we briefly discuss one of its many applications.

Suppose that we have a polynomial-time probabilistic algorithm that, on inputs of length , uses random bits and then outputs the correct answer with probability, say, at least . One standard way to reduce the error probability is to run the algorithm times, using independent randomness each time, and then take the answer that comes out a majority of the times. (This is for problems in which we want to compute a function exactly; in combinatorial optimization we would run the algorithm times and take the best solutions, and in an application in which the algorithm performs an approximate function evaluation we would run the algorithm times and take the median. The reasoning that follows for the case of exact function computation can be applied to the other settings as well.)

On average, the number of iterations of the algorithms that give a correct answer is , and the cases in which the majority is erroneous correspond to cases in which the number of iterations giving a correct answer is . This means that the case in which the modified algorithm makes a mistake correspond to the case in which the empirical average of independent 0/1 random variables deviates from its expectation by more than , which can happen with probability at most , which becomes vanishingly small for large .

This approach uses random bits. Suppose, instead, that we consider the following algorithm: pick random strings for the algorithm by performing a -step random walk in an expander graph of degree with vertices and such that , and then take the majority answer. A calculation using the Chernoff bound for expander graphs show that the error probability is , and it is achieved using only random bits instead of .