Beyond Worst-Case Analysis: Lecture 6

Scribed by Theo McKenzie

In which we study the spectrum of random graphs.

1. Overview

When attempting to find in polynomial time an upper bound certificate on the max cut and maximum independent set of a graph, we have used the following property. If {G\sim G_{n,\frac{1}{2}}}, then with high probability {\|A-\mathop{\mathbb E} (A)\|\leq O(\sqrt{n})}, where {\|\cdot\|} is the spectral norm. Generally, if {G\sim G_{n,p}} and {p>\frac{\log n}{n}} then w.h.p.

\displaystyle \|A-\mathop{\mathbb E} (A)\|\leq O(\sqrt{np}).

Today we will prove how to obtain the bound in Proposition 1 with an extra term of {\log n}, as well as show an outline of the method of finding the bound in Proposition 1. We will also show how when {p} is small this bound breaks down, namely how when {p=\Theta(\frac 1 n)},

\displaystyle \|A-\mathop{\mathbb E} (A)\|\geq\Omega \left(\sqrt{\frac{\log n}{\log\log n}} \right).

2. Introducing the Trace

Henceforth {M_{ij}^k} signifies {(M^k)_{ij}}. Take {M} symmetric and real. All eigenvalues of this matrix are real, and we can enumerate them {\lambda_1,\lambda_2,\ldots,\lambda_n} such that {|\lambda_1|\geq|\lambda_2|\geq\ldots\geq |\lambda_n|}.

The trace {\textnormal{Tr}(A)} is defined to be {\textnormal{Tr}(A)=\sum_{i=1}^n A_{ii}} where {A} is an {n\times n} matrix.

Moreover we know that \textnormal{Tr}{(A)=\sum_{1=1}^n \lambda_i}. If we take {k} large and even, the eigenvalues of {M^k} are {|\lambda_1|^k\geq|\lambda_2|^k\geq\ldots\geq |\lambda_n|^k}. Therefore we have

\displaystyle  \sum_i M^k_{ii}=\text{Tr}(M^k)=\sum_i |\lambda_i|^k\geq |\lambda_1|^k=\|M\|^k.

Moreover we have

\displaystyle \textnormal{Tr}(M^k)^{\frac{1}{k}}=\left(\sum_i |\lambda_i|^k\right)^\frac 1 k\leq n^{\frac{1}{k}}|\lambda_1|= n^{\frac{1}{k}}\|M\|.

This gives us an estimation of the norm, {\|M\|\leq (\sum_i M^k_{ii})^\frac{1}{k}\leq n^{\frac1 k}\|M\|}, which for {k>\log n} gives a constant factor approximation of {\|M\|}.

3. Using the Trace to Bound the Spectral Norm

Assume that {G\sim G_{n,\frac{1}{2}}} and {A} is the adjacency matrix of {G}. We will prove the following. {\displaystyle{\mathop{\mathbb E}_{G\sim G_{n,\frac{1}{2}}}}(\textnormal{Tr}((A-\mathop{\mathbb E} (A))^k))} is bounded above by {2^{O(k)}n^{1+k/2}k^{k/2}}. If {k>\log n}, by taking the {k}th root we achieve a bound of {O(\sqrt{n\log n})} on {\|A-\mathop{\mathbb E} (A)\|}.

3.1. Expected Value of Matrix Entries

First, we examine the matrix {M=2A-2\mathop{\mathbb E} (A)}. We have {M_{ii}=0} and {M_{ij}\in\{\pm1\}} with equal probability of each when {i\neq j}. Moreover {M_{ij}=M_{ji}}. If {i\neq j,\mathop{\mathbb E}( M_{ij}^k)=0} if {k} is odd and {\mathop{\mathbb E} (M_{ij}^k)=1} for {k} even.

{\mathop{\mathbb E} (\sum_i M^k_{ii})=n\mathop{\mathbb E} (M^k_{11})} by the linearity of expectation and symmetry between the entries. We evalute {M^k_{11}}.

\displaystyle M^k_{11}=\sum_{\{i_1,\ldots,i_{k-1}\}} M_{1i_1}M_{i_1i_2}\cdots M_{i_{k-1,1}}

where {{i_1,\ldots i_{k-1}}} represents the intermediate steps on a “path” between vertices that starts at 1 and returns to 1. For example, {M_{11}^2=\sum M_{1i}M_{i1}}. Note that we can repeat edges in these paths. By the linearity of expectation

\displaystyle  \mathop{\mathbb E} (M_{11}^k)=\sum_{\{i_1,\ldots,i_{k-1}\}}\mathop{\mathbb E} (M_{1i_1}M_{i_1i_2}\cdots M_{i_{k-1,1}}).

If any pair {\{i,j\}} occurs {\ell} times in the sequence of pairs {\{ 1,i_1\}, \{ i_1, i_2 \}, \ldots, \{ i_{k-1}, 1 \}}, where {\ell} is odd, then as the value of this term is independent from all other terms and {\mathop{\mathbb E} M^\ell_{ij}=0} for odd {\ell}, then {\mathop{\mathbb E} (M_{1i_1}M_{i_1i_2}\cdots M_{i_{k-1}1})=0}. If all pairs occur an even number of times, their product’s expectation is 1. Therefore {\mathop{\mathbb E} (M_{11}^k)} is the number of sequences {i_1,\ldots,i_{k-1}\in V^{k-1}} such that, in the sequence of pairs {\{ 1,i_1\}, \{ i_1, i_2 \}, \ldots, \{ i_{k-1}, 1 \}}, each pair occurs an even number of times.

3.2. Encoding argument

In order to give an upper bound on the number of such sequences, we will show how to encode a sequence where there are {m} distinct edges. In the sequence {i_1,\ldots i_{k-1}}, the element {i_j} is represented either as {(0,i_j)}, which takes {1 + \log n} bits, if {i_ji} appears for the first time in the sequence at location {j}, and as {(0,\ell)} otherwise, where {\ell < j} is such that {i_\ell = i_j}, which requires {1 + \log k} bits. Notice that, if {i_j} occurs for the first time at location {j}, then the pair {\{ i_{j-1}, i_j \}} also occurs for the first time at the location {j-1} and {j}. Thus the number of times that we encounter a vertex for the first time is at most the number of distinct edges. If we have {t} distinct vertices (other than vertex 1), then we are using {k + t \log n + (k-t) \log k}; for {k<n}, this value increases with {t}, but we have {t \leq m \leq k/2} (because every edge has to appear an even number of times and so there can be at most {k/2} distinct edges. This means that we use at most {k + \frac k2 \log n + \frac k2 \log k} bits in the encoding. The number of strings that can be encoded using at most {L} bits is {2^{L+1} }. If we assume {k<n}, we have the bound {\mathop{\mathbb E}(M_{11}^k)\leq k^\frac{k}{2}n^{\frac{k}{2}}2^{k+1}}, meaning

\displaystyle \textnormal{Tr}(M)=n\mathop{\mathbb E}( M_{11}^k)\leq n^{1+\frac{k}{2}}2^{k+1} k^\frac{k}{2}.

Therefore using suitable {k} and {t} we achieve our bound on {\|M\|}. For example, choose {k=\log n} and {t=10\sqrt{n}\sqrt{\log n}}. We use Markov’s inequality to obtain

\displaystyle  \mathbf{P}(\|M\|>t)=\mathbf{P}(\|M\|^k>t^k)\leq \frac{\mathop{\mathbb E}\|M\|^k}{t^k}\leq\left(\frac{2n^{\frac{1}{k}}\sqrt n \sqrt k }{t}\right)^k\leq e^{-\Omega(\log n)}\rightarrow 0.

4. Tightening the Bound

To obtain the sharper bound of {O(\sqrt n)}, we need to count the number of pairs more sharply and remove the {k^{\frac{k}{2}}} term, namely improve the way we talk about repetitions. Here we give an outline for how to find a tighter bound.

The worst case in the above analysis is when the number of distinct vertices (not counting vertex {1}) is maximal, which is {k/2}. In that case, the number of distinct “edges” {\{ i_j, i_{j+1} \}} is {k/2}, and they must form a connected graph over {1 + k/2} vertices, that is, they have to form a tree. Furthermore, each edges is repeated exactly twice in the closed walk, otherwise we would not have enough distinct edges to connect {1+k/2} distinct vertices.

If the pairs form a tree, then the only way we can have closed walk in which every edge is repeated twice is that the closed walk is a depth-first visit of the tree. In this case, we can improve our encoding in the following way. In a depth-first visit of a tree only two events are possible at each step: either we discover a new vertex, or we backtrack on the edge between the current node and the parent node. Thus we only need to pay {1 + \log n} bits to encode a new node in the sequence and {1} bit to encode an already seen node, and we obtain a bound of {2^{\frac{k}{2}+\frac k 2 \log n + \frac k 2}= 2^kn^\frac k 2}. By taking the {k}th root we obtain a bound on {\|M\|} of {O(\sqrt n)}.

5. Generalizing to any {p}

Now assume {G\sim G_{n,p}} and {A} is the adjacency matrix of {G}. We also assume {p<\frac{1}{2}}. We define

\displaystyle M=A-\mathop{\mathbb E}(A).

In this matrix {M_{ii}=0} and if {i\neq j, M_{i,j}=1-p} with probability {p} and {-p} with probability {1-p}. Therefore {\mathop{\mathbb E} (M_{ij})=0, \mathop{\mathbb E} (M_{ij}^2)=p-p^2\leq p}. In fact, {\mathop{\mathbb E} (M_{ij}^k)\leq p} for all {k\geq 1}.

From this we see we need to sum over sequences such that the multiset has each pair occuring at least two times, as if any pair occurs once, the expectation is {0}.

Therefore the bound is

\displaystyle  \mathop{\mathbb E} (M_{11}^k)\leq \sum_{i_1,\ldots i_{k-1}} p^\ell

where {\ell} is the number of distinct pairs and the sum is taken over multisets where each pair occurs at least twice. For large {\ell}, the number of sequences where each pair occurs at least twice with {\ell} distinct pairs is approximately {2^{O(\ell)}n^\ell}. This would give us

\displaystyle  \sum_{i_1,\ldots i_{k-1}}p^\ell=\sum_\ell p^\ell 2^{O(\ell)}n^\ell\leq O(p^\frac{k}{2}2^{O(k)}n^{\frac{k}{2}})

so the bound on {\|M\|} is {O(\sqrt{np})}. However, the bound on the number of sequences with {\ell} distict pairs breaks down when {\ell} is much smaller than {k}. In a full proof much more complicated calculations must be done.

6. Problems with sparse graphs

If {p=\Theta(\frac{1}{n})} , then {\|A-\mathop{\mathbb E}(A)\|\geq \Omega\left(\sqrt\frac{\log n}{\log\log n}\right)} w.h.p.

This breaks down the nice bound we obtained in section 5. This follows from the irregularity of sparse graphs. There will be isolated vertices and vertices with degree much higher than average.

Lemma 1 If {p=\Theta(\frac{1}{n})} then w.h.p. the highest degree vertex of {G} is of order {\Theta\left(\frac{\log n}{\log \log n}\right)}.

If G has a node of degree {\geq d}, then, for every {p< \frac 1 {4\sqrt d}}, {\lambda_{\max} (A-pJ) \geq\Omega(\sqrt d)}. This implies that {\forall 0< p < . \frac 1 {4\sqrt d}, \ \|A-pJ\|\geq \Omega(\sqrt d)}.

Proof: We have

\displaystyle  \lambda_{\max}(A-pJ)=\max_{{\bf x}}\frac{{\bf x}^T(A-pJ){\bf x}}{\|{\bf x}\|^2}

where the maximum is taken over all nonzero vectors {{\bf x}}. Call {v} a node of degree {\geq d} and call {d} of its neighbors {u_1,\ldots, u_d}.

Consider the vector {{\bf x}} such that {x_v=1, x_{u_i}=\frac{1}{\sqrt d}} and {x_w=0} for other vertices {w}. We have

\displaystyle {\bf x}^TA{\bf x}\geq 2 \sqrt d

\displaystyle  {\bf x}^TpJ{\bf x}=p\cdot \left(\sum x_i\right)^2=p\cdot (1+\sqrt d)^2\leq 4pd

\displaystyle  || {\bf x}||^2=2

Therefore if {p\leq \frac 1{4\sqrt d}},

\displaystyle \frac{{\bf x}^T(A-pJ){\bf x}}{\|{\bf x}\|^2}\geq \sqrt d - \frac 12 \sqrt d=\Omega(\sqrt d)

yielding the desired bound.

\Box

Theorem 4 proceeds immediately from Proposition 5 and Lemma 1.

Advertisements

The Alon-Boppana Theorem

Let {G=(V,E)} be a {d}-regular graph, and let

\displaystyle  d= \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n

be the eigenvalues of the adjacency matrix of {A} counted with multiplicities and sorted in descending order.

How good can the spectral expansion of {G} be?

1. Simple Bounds

The simplest bound comes from a trace method. We have

\displaystyle  trace(A^2) = \sum_i \lambda_i^2

by using one definition of the trace and

\displaystyle  trace(A^2) = \sum_{v,v} (A^2)_{v,v} \geq dn

using the other definition and observing that {(A^2)_{v,v}} counts the paths that go from {v} to {v} in two steps, of which there are at least {d}: follow an edge to a neighbor of {v}, then follow the same edge back. (There could be more if {G} has multiple edges or self-loops.)

So we have

\displaystyle  dn \leq d^2 + \sum_{i=2,\ldots,n} \lambda_i ^2

and so

\displaystyle  \max_{i=2,\ldots,n} |\lambda_i | \geq \sqrt d \cdot \sqrt{\frac {n-d}{n-1}}

The condition {d \leq n(1-\epsilon)} is necessary to get lower bounds of {\Omega(\sqrt d)}; in the clique, for example, we have {d=n-1} and {\lambda_2 = \lambda_n = -1}.

A trace argument does not give us a lower bound on {\lambda_2}, and in fact it is possible to have {\lambda_2=0} and {d= n/2}, for example in the bipartite complete graph.

If the diameter of {G} is at least 4, it is easy to see that {\lambda_2 \geq \sqrt d}. Let {a,b} be two vertices at distance 4. Define a vector {x} as follows: {x_a = 1}, {x_v = 1/\sqrt d} if {v} is a neighbor of {a}, {x_b=-1} and {x_v = - 1/\sqrt d} if {v} is a neighbor of {b}. Note that there cannot be any edge between a neighbor of {a} and a neighbor of {b}. Then we see that {||x||^2 = 4}, that {x^T A x \geq 4\sqrt d} (because there are {2d} edges, each counted twice, that give a contribution of {1/\sqrt d} to {\sum_{u,v} A_{uv} x_u x_v}) and that {x} is orthogonal to {(1,\ldots,1)}.

2. Nilli’s Proof of the Alon-Boppana Theorem

Nilli’s proof of the Alon-Boppana theorem gives

\displaystyle  \lambda_2 \geq 2 \sqrt{ d-1 } - O \left( \frac {\sqrt{d-1}}{diam(G)-4} \right)

where {diam(G) \geq \frac {\log |V|}{\log d-1}} is the diameter of {G}. This means that if one has a family of (constant) degree-{d} graphs, and every graph in the family satisfies {\lambda_2 \leq \lambda}, then one must have {\lambda \geq 2 \sqrt{d -1}}. This is why families of Ramanujan graphs, in which {\lambda_2 \leq 2 \sqrt{d-1}}, are special, and so hard to construct, or even to prove existence of.

Friedman proves a stronger bound, in which the error term goes down with the square of the diameter. Friedman’s proof is the one presented in the Hoory-Linial-Wigderson survey. I like Nilli’s proof, even if it is a bit messier than Friedman’s, because it starts off with something that clearly is going to work, but the first two or three ways you try to establish the bound don’t work (believe me, I tried, because I didn’t see why some steps in the proof had to be that way), but eventually you find the right way to break up the estimate and it works.

So here is Nilli’s proof. Continue reading