In which we generalize the notion of normalized Laplacian to irregular graphs, we extend the basic spectral graph theory results from last lecture to irregular graphs, and we prove the easy direction of Cheeger’s inequalities.
1. Irregular Graphs
Let be an undirected graph, not necessarily regular. We will assume that every vertex has non-zero degree. We would like to define a normalized Laplacian matrix associated to
so that the properties we proved last time are true: that the multiplicity of 0 as an eigenvalue is equal to the number of connected components of
, that the largest eigenvalue is at most 2, and that it is 2 if and only if (a connected component of) the graph is bipartite.
In order to have a matrix such that zero is the smallest eigenvalue, and such that multiplicity of zero is the number of connected component, we want a matrix such that the numerator of the Rayleigh quotient is (a multiple of)
and the matrix such that
is the above expression is the matrix
, where
is the diagonal matrix such that
, the degree of
. The matrix
is called the Laplacian matrix of
. Note that there is no fixed constant upper bound to the largest eigenvalue of
; for example, if
is a
-regular bipartite graph, the largest eigenvalue is
, as proved in the last lecture.
Some calculations shows that the right analog of the normalization that we did in the regular case (in which we divided by the degree ) would be to have a matrix whose Rayleigh quotient is
and it’s clear that the above expression is at most 2 for every , and it is possible to find an
for which the above expression is 2 if and only if
has a bipartite connected component.
Unfortunately, there is no matrix whose Rayleigh quotient equals (1), because the denominator of a Rayleigh quotient is, by definition, regardless of the matrix.
One way to work around this problem would be to give a more general form of the variational characterization of eigenvalues, in which we have an arbitrary inner product , and the Rayleigh quotient is defined as
.
Here we will proceed in a way that is essentially equivalent, but without introducing this additional definitional framework.
The point is that, if we look at the Rayleigh quotient of the vector , where
is the diagonal matrix such that
, then the denominator will indeed be
, and that we can find a matrix
such that the numerator of the Rayleigh quotient of
is
, so that the Rayleigh quotient
is indeed the expression in (1).
This matrix is called the normalized Laplacian of
and, by the above observation, it has to be
. Note that, in a
-regular graph, we get
, consistent with our definition from the last lecture.
Now the point is that the mapping is linear and bijective, so it maps the set of all possible vectors to the set of all possible vectors, and it maps a
-dimensional space to a (possibly different)
-dimensional space.
If we let be the eigenvalues of
, counting repetitions, the variational characterization gives us
and
from which we have that and that the multiplicity of zero is equal to the number of connected components.
We also have
from which we see that and that
if and only if one of the connected components of
is bipartite.
2. Edge Expansion, Fiedler’s Algorithm and Cheeger’s Inequalities
We will now return, for simplicity, to the regular case.
If is an undirected
-regular graph, and
is a set of vertices, we call the quantity
the edge expansion of . The quantity
is the average fraction of neighbors outside of
for a random element of
, and it compares the actual number of edges crossing the cut
with the trivial upper bound
.
We define the expansion of a cut as
The edge expansion of the graph is defined as
(Note: it is common in the literature to use the notation to refer to the quantity that we call
.)
Finding cuts of small expansion is a problem with several applications. It is an open question if there is a polynomial-time approximation with a constant-factor approximation ratio; a positive answer would refute the “small-set expansion conjecture” which is closely related to the unique games conjecture.
The following algorithm was proposed by Fiedler, and it works well in practice when is the eigenvector of
.
- Input: graph
, vector
- Sort the vertices according the values
, and let
be the sorted order
- Find a
that minimizes
, and output such a cut
- Sort the vertices according the values
Note that Fiedler’s algorithm can be implemented in time , because it takes time
to sort the vertices, and the cut of minimal expansion that respects the sorted order can be found in time
. (To see that this is the case, consider that, in order to find such a cut, we just need to compute the numbers
for each
. We see that
is equal to the degree of
, and that, given
, the value of
can be computed by just adding to
the number of neighbors of
in
, and subtracting the number of neighbors of
in
, on operation that can be done in time
. Thus the total running time is of the order of
, that is,
.)
We will prove the following result
Theorem 1 (Cheeger’s Inequalities) Let
be an undirected regular graph and
be the eigenvalues of the normalized Laplacian, with repetitions, then
Furthermore, if
is the cut found by Fiedler’s algorithm given the eigenvector of
, then
Note that, from the furthermore part of the Theorem, it follows that, if is the cut found by Fiedler’s algorithm given an eigenvector of
, we have
which is a worst-case guarantee of the quality of the solution found by the algorithm.
3. Proof that
Let be a set of vertices such that
. Recall that for every set
, we have that expansion of
is the same as the Rayleigh quotient of the indicator vector
. (The indicator vector of a set
is the 0/1 vector
whose
-th coordinate is 1 if and only if
.) So we have
also recall that, from the variational characterization of eigenvalues, we have
We will prove the inequality by showing that all the vectors in the 2-dimensional space
of linear combinations of the orthogonal vectors
have Rayleigh quotient at most
. This is a consequence of the following useful fact.
Lemma 2 Let
and
be two orthogonal vectors, and let
be a positive semidefinite matrix. Then
Proof: Let be the eigenvalues of
and
be a corresponding basis of eigenvectors. Let us write
and
.
The Rayleigh quotient of is
In the first inequality, we used orthogonality of and
to derive
and we used the Cauchy-Schwarz inequality
.
4. First Part of the Analysis of Fiedler’s Algorithm
The vector is an eigenvector for 0, which is the smallest eigenvalue of the normalized Laplacian of
, and so, from the variational characterization of eigenvalues, we have that
and that any eigenvector of
is a minimizer of the above expression. We will prove that
and that the Furthermore part of Theorem 1 is true by showing the following stronger result:
Lemma 3 Let
be a vector orthogonal to
and let
be the cut found by Fiedler’s algorithm given
. Then
This stronger form is useful, because often one runs Fiedler’s algorithm on an approximate eigenvector, and Lemma 3 shows that one gets a guarantee on the quality of the resulting cut that does not require to be an eigenvector, as long as its Rayleigh quotient is small.
We divide the proof of Lemma 3 into two parts: we analyze the performance of the algorithm given a vector that, instead of being orthogonal to
, has the property of having non-negative entries and at most
non-zero entries, and we show that analyzing the performance of the algorithm on vectors of the former type reduces to analyzing the performance on vectors of the latter type.
Lemma 4 Let
be a vector with non-negative entries. Then there is a
such that
Lemma 5 Let
be orthogonal to
. Then there is a vector
with at most
non-zero entries such that
Furthermore, for every
, the cut
is one of the cuts considered by Fiedler’s algorithm on input
.
Let us quickly see how to prove Lemma 3 given Lemma 4 and Lemma 5. Let be a vector orthogonal to
, and let
be the cut found by Fiedler’s algorithm given
. Let
be the non-negative vector with at most
positive entries and such that
as promised by Lemma 5. Let
be a threshold such that
as promised by Lemma 5. The set contains at most
vertices, and the cut
is one of the cuts considered by Fiedler’s algorithm on input
, and so
We will prove Lemma 4 next time. We conclude this lecture with a proof of Lemma~5.
Proof: (Of Lemma 5) First we observe that, for every constant ,
because the numerator of and the numerator of
are the same, and the denominator of
is
.
Let be the median value of the entries of
, and call
. Then we have
, and the median of the entries of
is zero, meaning that
has at most
positive entries and at most
negative entries. We will refer to the vertices
such that
as the positive vertices, and the vertices
such that
as the negative vertices.
We write
where if
is positive and
otherwise; similarly,
if
is negative, and
otherwise. Note that
and
are orthogonal, non-negative, and each of them has at most
nonzero entries. Note also that, for every
, the cut defined by the set
is one of the cuts considered by Fiedler’s algorithm on input
, because it is the cut
Similarly, for every , the cut defined by the set
is one of the cuts considered by Fiedler’s algorithm on input
, because it is the cut
It remains to show that at least one of or
has Rayleigh quotient smaller than or equal to the Rayleigh quotient of
(and, hence, of
). We claim that
The only step that we need to justify is that for every edge we have
If is an edge between two non-positive vertices, or between two non-negative vertices, then the left-hand side and the right-hand side are clearly equal. If it is an edge between a positive vertex
and a negative vertex
, then the left-hand side is equal to
, and the right-hand side is equal to
.
Typo: In the PDF file, pg 6, during the proof that lemma 4 and lemma 5 implies lemma 3. You use lemma 5 and lemma 5 again. In the second time it should be lemma 4.