In which we review linear algebra prerequisites.
The following background from linear algebra will be sufficient for the sake of this course: to know what is an eigenvalue and an eigenvector, to know that real symmetric matrices have real eigenvalues and their real eigenvectors are orthogonal, and to know the variational characterization of eigenvalues.
1. Basic Definitions
If is a complex number, then we let denote its conjugate. Note that a complex number is real if and only if . If is a matrix, then denotes the conjugate transpose of , that is, . If the entries of are real, then , where is the transpose of , that is, the matrix such that .
We say that a matrix is Hermitian if . In particular, real symmetric matrices are Hermitian.
If are two vectors, then their inner product is defined as
Notice that, by definition, we have and . Note also that, for two matrices , we have , and that for every matrix and every two vectors , , we have
If is a square matrix, is a scalar, is a non-zero vector and we have
then we say that is an eigenvalue of and that is eigenvector of corresponding to the eigenvalue .
2. The Spectral Theorem
We want to prove
Theorem 1 (Spectral Theorem) Let be a symmetric matrix with real-valued entries, then there are real numbers (not necessarily distinct) and orthonormal real vectors , such that is an eigenvector of .
Assuming the fundamental theorem of algebra (that every polynomial has a complex root) and basic properties of the determinant, the cleanest proof of the spectral theorem is to proceed by induction on , and to show that must have a real eigenvalue with a real eigenvector , and to show that maps vectors orthogonal to to vectors orthogonal to . Then one applies the inductive hypothesis to restricted to the -dimensional space of vectors orthogonal to and one recovers the remaining eigenvalues and eigenvectors.
The cleanest way to formalize the above proof is to give all definitions and results in terms of linear operators where is an arbitrary vector space over the reals. This way, however, we would be giving several definitions that we would never use in the future, so, instead, the inductive proof will use a somewhat inelegant change of basis to pass from to an matrix .
We begin by showing that a real symmetric matrix has real eigenvalues and eigenvectors.
We begin by noting that every matrix has a complex eigenvalue.
Lemma 3 For every matrix , there is an eigenvalue and an eigenvector such that .
Proof: Note that is an eigenvalue for if and only if
which is true if and only if the rows of are not linearly independent, which is true if and only if
Now note that the mapping is a univariate polynomial of degree in , and so it must have a root by the fundamental theorem of algebra.
Next we show that if is real and symmetric, then its eigenvalues are real.
Lemma 4 If is Hermitian, then, for every and ,
Lemma 5 If is Hermitian, then all the eigenvalues of are real.
Proof: Let be an Hermitian matrix and let be a scalar and be a non-zero vector such that . We will show that , which implies that is a real number.
We note that
and by the fact that , we have .
In order to prove Theorem 2, it remains to argue that, for a real eigenvalue of a real symmetric matrix, we can find a real eigenvector.
Proof: } Let be a real symmetric matrix, then has a real eigenvalue and a (possibly complex valued) eigenvector , where and are real vectors. We have
from which (recalling that the entries of and the scalar are real) it follows that and that ; since and cannot both be zero, it follows that has a real eigenvector.
We are now ready to prove the spectral theorem
Proof: We proceed by induction on . The case is trivial.
Assume that the statement is true for dimension . Let be a real eigenvalue of and be a real eigenvector .
Now we claim that for every vector that is orthogonal to , then is also orthogonal to . Indeed, the inner product of and is
Let be the -dimensional subspace of that contains all the vectors orthogonal to . We want to apply the inductive hypothesis to restricted to ; we cannot literally do that, because the theorem is not stated in terms of arbitrary linear operators over vector spaces, so we will need to do that by fixing an appropriate basis for .
let be a matrix that computes a bijective map from to . (If is an orthonormal basis for , then is just the matrix whose columns are the vectors .) Let also be the matrix such that, for every , . (We can set where is as described above.) We apply the inductive hypothesis to the matrix
and we find eigenvalues and orthonormal eigenvectors for .
For every , we have
Since is orthogonal to , it follows that is also orthogonal to , and so , so we have
and, defining , we have
Finally, we observe that the vectors are orthogonal. By construction, is orthogonal to , and, for every , we have that
We have not verified that the vectors have norm 1 (which is true), but we can scale them to have norm 1 if not.
3. Variational Characterization of Eigenvalues
We conclude these notes with the variational characterization of eigenvalues for real symmetric matrices.
Theorem 6 Let be a symmetric matrix, and be the eigenvalues of in non-increasing order. Then
The quantity is called the Rayleigh quotient of with respect to , and we will denote it by .
Proof: Let be orthonormal eigenvectors of the eigenvalues , as promised by the spectral theorem. Consider the -dimensional space spanned by . For every vector in such a space, the numerator of the Rayleigh quotient is
The denominator is clearly , and so . This shows that
For the other direction, let be any -dimensional space: we will show that must contain a vector of Rayleigh quotient . Let be the span of ; since has dimension and has dimension , they must have some non-zero vector in common. Let be one such vector, and let us write . The numerator of the Rayleigh quotient of is
and the denominator is , so .
We have the following easy consequence.
Furthermore, every minimizer is an eigenvector of .
Proof: The identity is the case of the previous theorem. For the furthermore part, let be the list of eigenvalues of in non-decreasing order, and be corresponding eigenvectors. If is any vector, then
If , then for every such that , that is, is a linear combination of eigenvectors of , and hence it is an eigenvector of .
Fact 8 If is the largest eigenvalue of a real symmetric matrix , then
Furthermore, every maximizer is an eigenvector of .
Proof: Apply Fact 7 to the matrix .
Fact 9 If is the smallest eigenvalue of a real symmetric matrix , and is an eigenvector of , then
Furthermore, every minimizer is an eigenvector of .
Proof: A more conceptual proof would be to consider the restriction of to the space orthogonal to , and then apply Fact 7 to such a linear operator. But, since we have not developed the theory for general linear operators, we would need to explicitly reduce to an -dimensional case via a projection operator as in the proof of the spectral theorem.
Instead, we will give a more hands-on proof. Let be the list of eigenvalues of , with multiplicities, and be orthonormal vectors as given by the spectral theorem. We may assume that , possibly by changing the orthonormal basis of the eigenspace of . For every vector orthogonal to , its Rayleigh quotient is
and the minimum is achieved by vectors such that for every , that is, for vectors which are linear combinations of the eigenvectors of , and so every minimizer is an eigenvector of .