Section EE Eigenvalues and Eigenvectors

In this section, we will define the eigenvalues and eigenvectors of a matrix, and see how to compute them. More theoretical properties will be taken up in the next section.

Subsection EEM Eigenvalues and Eigenvectors of a Matrix

We start with the principal definition for this chapter.

Definition EEM. Eigenvalues and Eigenvectors of a Matrix.

Suppose that \(A\) is a square matrix of size \(n\text{,}\) \(\vect{x}\neq\zerovector\) is a vector in \(\complex{n}\text{,}\) and \(\lambda\) is a scalar in \(\complexes\text{.}\) Then we say \(\vect{x}\) is an eigenvector of \(A\) with eigenvalue \(\lambda\) if

\begin{equation*} A\vect{x}=\lambda\vect{x}\text{.} \end{equation*}

Before going any further, perhaps we should convince you that such things ever happen at all. Understand the next example, but do not concern yourself with where the pieces come from. We will have methods soon enough to be able to discover these eigenvectors ourselves.

Consider the matrix

\begin{equation*} A= \begin{bmatrix} 204 & 98 & -26 & -10\\ -280 & -134 & 36 & 14\\ 716 & 348 & -90 & -36\\ -472 & -232 & 60 & 28 \end{bmatrix} \end{equation*}

and the vectors

\begin{align*} \vect{x}=\colvector{1\\-1\\2\\5}&& % ev=4 \vect{y}=\colvector{-3\\4\\-10\\4}&& % ev = 0 \vect{z}=\colvector{-3\\7\\0\\8}&& % ev=2 \vect{w}=\colvector{1\\-1\\4\\0} % ev=2\text{.} \end{align*}

Then

\begin{equation*} A\vect{x}= \begin{bmatrix} 204 & 98 & -26 & -10\\ -280 & -134 & 36 & 14\\ 716 & 348 & -90 & -36\\ -472 & -232 & 60 & 28 \end{bmatrix} \colvector{1\\-1\\2\\5}= \colvector{4\\-4\\8\\20}= 4\colvector{1\\-1\\2\\5}=4\vect{x} \end{equation*}

so \(\vect{x}\) is an eigenvector of \(A\) with eigenvalue \(\lambda=4\text{.}\)

Also,

\begin{equation*} A\vect{y}= \begin{bmatrix} 204 & 98 & -26 & -10\\ -280 & -134 & 36 & 14\\ 716 & 348 & -90 & -36\\ -472 & -232 & 60 & 28 \end{bmatrix} \colvector{-3\\4\\-10\\4}= \colvector{0\\0\\0\\0}= 0\colvector{-3\\4\\-10\\4}=0\vect{y} \end{equation*}

so \(\vect{y}\) is an eigenvector of \(A\) with eigenvalue \(\lambda=0\text{.}\)

Also,

\begin{equation*} A\vect{z}= \begin{bmatrix} 204 & 98 & -26 & -10\\ -280 & -134 & 36 & 14\\ 716 & 348 & -90 & -36\\ -472 & -232 & 60 & 28 \end{bmatrix} \colvector{-3\\7\\0\\8}= \colvector{-6\\14\\0\\16}= 2\colvector{-3\\7\\0\\8}=2\vect{z} \end{equation*}

so \(\vect{z}\) is an eigenvector of \(A\) with eigenvalue \(\lambda=2\text{.}\)

Also,

\begin{equation*} A\vect{w}= \begin{bmatrix} 204 & 98 & -26 & -10\\ -280 & -134 & 36 & 14\\ 716 & 348 & -90 & -36\\ -472 & -232 & 60 & 28 \end{bmatrix} \colvector{1\\-1\\4\\0}= \colvector{2\\-2\\8\\0}= 2\colvector{1\\-1\\4\\0}=2\vect{w} \end{equation*}

so \(\vect{w}\) is an eigenvector of \(A\) with eigenvalue \(\lambda=2\text{.}\)

So we have demonstrated four eigenvectors of \(A\text{.}\) Are there more? Yes, any nonzero scalar multiple of an eigenvector is again an eigenvector. In this example, set \(\vect{u}=30\vect{x}\text{.}\) Then

\begin{align*} A\vect{u} &=A(30\vect{x})\\ &=30A\vect{x}&& \knowl{./knowl/theorem-MMSMM.html}{\text{Theorem MMSMM}}\\ &=30(4\vect{x})&& \knowl{./knowl/definition-EEM.html}{\text{Definition EEM}}\\ &=4(30\vect{x})&& \knowl{./knowl/property-SMAM.html}{\text{Property SMAM}}\\ &=4\vect{u} \end{align*}

so that \(\vect{u}\) is also an eigenvector of \(A\) for the same eigenvalue, \(\lambda=4\text{.}\)

The vectors \(\vect{z}\) and \(\vect{w}\) are both eigenvectors of \(A\) for the same eigenvalue \(\lambda=2\text{,}\) yet this is not as simple as the two vectors just being scalar multiples of each other (they are not). Look what happens when we add them together, to form \(\vect{v}=\vect{z}+\vect{w}\text{,}\) and multiply by \(A\text{,}\)

\begin{align*} A\vect{v} &=A(\vect{z}+\vect{w})\\ &=A\vect{z}+A\vect{w}&& \knowl{./knowl/theorem-MMDAA.html}{\text{Theorem MMDAA}}\\ &=2\vect{z}+2\vect{w}&& \knowl{./knowl/definition-EEM.html}{\text{Definition EEM}}\\ &=2(\vect{z}+\vect{w})&& \knowl{./knowl/property-DVAC.html}{\text{Property DVAC}}\\ &=2\vect{v} \end{align*}

so that \(\vect{v}\) is also an eigenvector of \(A\) for the eigenvalue \(\lambda=2\text{.}\) So it would appear that the set of eigenvectors that are associated with a fixed eigenvalue is closed under the vector space operations of \(\complex{n}\text{.}\) Hmmm.

The vector \(\vect{y}\) is an eigenvector of \(A\) for the eigenvalue \(\lambda=0\text{,}\) so we can use Theorem ZSSM to write \(A\vect{y}=0\vect{y}=\zerovector\text{.}\) But this also means that \(\vect{y}\in\nsp{A}\text{.}\) There would appear to be a connection here also.

Sage can compute eigenvalues and eigenvectors of matrices. We will see shortly that there are subtleties involved with using these routines, but here is a quick example to start. These two commands should be enough to get you started with this chapter.

For a square matrix, the methods .eigenvalues() and .eigenvectors_right() will produce what you expect, though the format of the eigenvector output requires some explanation. Here is Example SEE from the start of this chapter.
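
As a sketch of what these commands look like (the Sage cell itself is not reproduced here, and the precise ordering and formatting of the output may vary between Sage versions), we can enter the matrix of Example SEE and ask for its eigenvalues and eigenvectors.

A = matrix(QQ, [[204, 98, -26, -10],
                [-280, -134, 36, 14],
                [716, 348, -90, -36],
                [-472, -232, 60, 28]])
A.eigenvalues()           # a list containing 4, 0, and 2 (twice)
A.eigenvectors_right()    # triples: (eigenvalue, list of eigenvectors, multiplicity)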

The three eigenvalues we know are included in the output of eigenvalues(), though for some reason the eigenvalue \(\lambda=2\) shows up twice.

The output of the eigenvectors_right() method is a list of triples. Each triple begins with an eigenvalue. This is followed by a list of eigenvectors for that eigenvalue. Notice the first eigenvector is identical to the one we described in Example SEE. The eigenvector for \(\lambda=0\) is different, but is just a scalar multiple of the one from Example SEE. For \(\lambda=2\text{,}\) we now get two eigenvectors, and neither looks like either of the ones from Example SEE. (Hint: try writing each of the eigenvectors for \(\lambda=2\) from the example as linear combinations of the two eigenvectors provided by Sage.) An explanation of the third part of each triple (an integer) will have to wait, though it can be optionally suppressed if desired.

One cautionary note: The word lambda has a special purpose in Sage, so do not try to use this as a name for your eigenvalues.

Example SEE hints at a number of intriguing properties, and there are many more. We will explore the general properties of eigenvalues and eigenvectors in Section PEE, but in this section we will concern ourselves with the question of actually computing eigenvalues and eigenvectors. First we need a bit of background material on polynomials and matrices.

Subsection PM Polynomials and Matrices

A polynomial is a combination of powers, multiplication by scalar coefficients, and addition (with subtraction just being the inverse of addition). We never have occasion to divide when computing the value of a polynomial. So it is with matrices. We can add and subtract matrices, we can multiply matrices by scalars, and we can form powers of square matrices by repeated applications of matrix multiplication. We do not normally divide matrices (though sometimes we can multiply by an inverse). If a matrix is square, all the operations constituting a polynomial will preserve the size of the matrix. So it is natural to consider evaluating a polynomial with a matrix, effectively replacing the variable of the polynomial by a matrix. We will demonstrate with an example.

Let

\begin{align*} p(x)=14+19x-3x^2-7x^3+x^4&& D=\begin{bmatrix}-1 & 3 & 2\\1 & 0 & -2\\-3 & 1 & 1\end{bmatrix} \end{align*}

and we will compute \(p(D)\text{.}\)

First, the necessary powers of \(D\text{.}\) Notice that \(D^0\) is defined to be the multiplicative identity, \(I_3\text{,}\) as will be the case in general. We have

\begin{align*} D^0&=I_3=\begin{bmatrix}1 & 0 & 0\\0 & 1 & 0\\0 & 0 & 1\end{bmatrix}\\ D^1&=D=\begin{bmatrix}-1 & 3 & 2\\1 & 0 & -2\\-3 & 1 & 1\end{bmatrix}\\ D^2&=DD^1= \begin{bmatrix}-1 & 3 & 2\\1 & 0 & -2\\-3 & 1 & 1\end{bmatrix} \begin{bmatrix}-1 & 3 & 2\\1 & 0 & -2\\-3 & 1 & 1\end{bmatrix} = \begin{bmatrix}-2 & -1 & -6\\5 & 1 & 0\\1 & -8 & -7\end{bmatrix}\\ D^3&=DD^2= \begin{bmatrix}-1 & 3 & 2\\1 & 0 & -2\\-3 & 1 & 1\end{bmatrix} \begin{bmatrix}-2 & -1 & -6\\5 & 1 & 0\\1 & -8 & -7\end{bmatrix} = \begin{bmatrix}19 & -12 & -8\\-4 & 15 & 8\\12 & -4 & 11\end{bmatrix}\\ D^4&=DD^3= \begin{bmatrix}-1 & 3 & 2\\1 & 0 & -2\\-3 & 1 & 1\end{bmatrix} \begin{bmatrix}19 & -12 & -8\\-4 & 15 & 8\\12 & -4 & 11\end{bmatrix} = \begin{bmatrix}-7 & 49 & 54\\-5 & -4 & -30\\-49 & 47 & 43\end{bmatrix}\text{.} \end{align*}

Then

\begin{align*} p(D)&=14+19D-3D^2-7D^3+D^4\\ &= 14\begin{bmatrix}1 & 0 & 0\\0 & 1 & 0\\0 & 0 & 1\end{bmatrix} +19\begin{bmatrix}-1 & 3 & 2\\1 & 0 & -2\\-3 & 1 & 1\end{bmatrix} -3\begin{bmatrix}-2 & -1 & -6\\5 & 1 & 0\\1 & -8 & -7\end{bmatrix}\\ &\quad\quad -7\begin{bmatrix}19 & -12 & -8\\-4 & 15 & 8\\12 & -4 & 11\end{bmatrix} +\begin{bmatrix}-7 & 49 & 54\\-5 & -4 & -30\\-49 & 47 & 43\end{bmatrix}\\ &= \begin{bmatrix} -139 & 193 & 166\\ 27 & -98 & -124\\ -193 & 118 & 20 \end{bmatrix}\text{.} \end{align*}

Notice that \(p(x)\) factors as

\begin{equation*} p(x)=14+19x-3x^2-7x^3+x^4=(x-2)(x-7)(x+1)^2\text{.} \end{equation*}

Because \(D\) commutes with itself (\(DD=DD\)), we can use distributivity of matrix multiplication across matrix addition (Theorem MMDAA) without being careful with any of the matrix products, and just as easily evaluate \(p(D)\) using the factored form of \(p(x)\text{,}\)

\begin{align*} p(D)&=14+19D-3D^2-7D^3+D^4=(D-2I_3)(D-7I_3)(D+I_3)^2\\ &= \begin{bmatrix} -3 & 3 & 2\\ 1 & -2 & -2\\ -3 & 1 & -1 \end{bmatrix}\, \begin{bmatrix} -8 & 3 & 2\\ 1 & -7 & -2\\ -3 & 1 & -6 \end{bmatrix}\, \begin{bmatrix} 0 & 3 & 2\\ 1 & 1 & -2\\ -3 & 1 & 2 \end{bmatrix}^2\\ &= \begin{bmatrix} -139 & 193 & 166\\ 27 & -98 & -124\\ -193 & 118 & 20 \end{bmatrix}\text{.} \end{align*}

This example is not meant to be too profound. It is meant to show you that it is natural to evaluate a polynomial with a matrix, and that the factored form of the polynomial is as good as (or maybe better than) the expanded form. And do not forget that constant terms in polynomials are really multiples of the identity matrix when we are evaluating the polynomial with a matrix.
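
If you want to check a computation like this with Sage, a minimal sketch using the matrix \(D\) above might look like the following; note how the constant term becomes a multiple of the identity matrix, and how both forms of \(p(D)\) produce the same matrix.

D = matrix(QQ, [[-1, 3, 2], [1, 0, -2], [-3, 1, 1]])
I3 = identity_matrix(QQ, 3)
expanded = 14*I3 + 19*D - 3*D^2 - 7*D^3 + D^4    # constant term 14 becomes 14*I3
factored = (D - 2*I3)*(D - 7*I3)*(D + I3)^2
expanded == factored                              # True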

One simple fact about two different polynomials of the same matrix will spare us some effort in later proofs (especially in Section ME).

We need concrete versions of the polynomials, so let

\begin{align*} p(x) &= a_0+a_1x+a_2x^2+\cdots+a_nx^n&q(x) &= b_0+b_1x+b_2x^2+\cdots+b_mx^m\text{.} \end{align*}

There is very little mysterious about this proof; it mostly requires basic properties of matrix operations. The desired commutativity really boils down to the observation that \(A^iA^j = A^{i+j} = A^jA^i\text{,}\) which is used in the middle of the following chain of matrix equalities.

\begin{align*} p(A)q(A) &= \left(\sum_{i=0}^{n}a_iA^i\right)\left(\sum_{j=0}^{m}b_jA^j\right)\\ &= \sum_{i=0}^{n}\left(a_iA^i\sum_{j=0}^{m}b_jA^j\right)&&\knowl{./knowl/theorem-MMDAA.html}{\text{Theorem MMDAA}}\\ &= \sum_{i=0}^{n}\sum_{j=0}^{m}a_iA^ib_jA^j&&\knowl{./knowl/theorem-MMDAA.html}{\text{Theorem MMDAA}}\\ &= \sum_{i=0}^{n}\sum_{j=0}^{m}a_ib_jA^iA^j&&\knowl{./knowl/property-SMAM.html}{\text{Property SMAM}}\\ &= \sum_{i=0}^{n}\sum_{j=0}^{m}a_ib_jA^{i+j}\\ &= \sum_{j=0}^{m}\sum_{i=0}^{n}a_ib_jA^{i+j}&&\knowl{./knowl/property-CM.html}{\text{Property CM}}\\ &= \sum_{j=0}^{m}\sum_{i=0}^{n}a_ib_jA^jA^i&&\\ &= \sum_{j=0}^{m}\sum_{i=0}^{n}b_jA^ja_iA^i&&\knowl{./knowl/property-SMAM.html}{\text{Property SMAM}}\\ &= \sum_{j=0}^{m}\left(b_jA^j\sum_{i=0}^{n}a_iA^i\right)&&\knowl{./knowl/theorem-MMDAA.html}{\text{Theorem MMDAA}}\\ &= \left(\sum_{j=0}^{m}b_jA^j\right)\left(\sum_{i=0}^{n}a_iA^i\right)&&\knowl{./knowl/theorem-MMDAA.html}{\text{Theorem MMDAA}}\\ &= q(A)p(A)\text{.} \end{align*}
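
A quick numerical check of this result is easy to set up in Sage. Here is a sketch with a matrix and two polynomials chosen arbitrarily for illustration (they are not from the text); any other choices should behave the same way.

A = matrix(QQ, [[-1, 3, 2], [1, 0, -2], [-3, 1, 1]])   # any square matrix will do
I3 = identity_matrix(QQ, 3)
pA = 2*I3 - 3*A + A^2      # p(x) = 2 - 3x + x^2, evaluated at A
qA = 5*I3 + A^3            # q(x) = 5 + x^3, evaluated at A
pA*qA == qA*pA             # True, as the result guarantees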

Subsection EEE Existence of Eigenvalues and Eigenvectors

Before we embark on computing eigenvalues and eigenvectors, we will prove that every matrix has at least one eigenvalue (and an eigenvector to go with it). Later, in Theorem MNEM, we will determine the maximum number of eigenvalues a matrix may have.

The next theorem follows a similar result from Axler's Linear Algebra Done Right. It is key for our ability to defer Chapter D until later. Employed with proofs by induction (Proof Technique I), we can later establish more comprehensive results about all of the eigenvalues of a matrix.

Suppose that \(A\) has size \(n\text{,}\) and choose \(\vect{x}\) as any nonzero vector from \(\complex{n}\text{.}\) (Notice how much latitude we have in our choice of \(\vect{x}\text{.}\) Only the zero vector is off-limits.) Consider the set

\begin{equation*} S=\set{\vect{x},\,A\vect{x},\,A^2\vect{x},\,A^3\vect{x},\,\ldots,\,A^n\vect{x}}\text{.} \end{equation*}

This is a set of \(n+1\) vectors from \(\complex{n}\text{,}\) so by Theorem MVSLD, \(S\) is linearly dependent. Let \(a_0,\,a_1,\,a_2,\,\ldots,\,a_n\) be a collection of \(n+1\) scalars from \(\complexes\text{,}\) not all zero, that provide a relation of linear dependence on \(S\text{.}\) In other words,

\begin{equation*} a_0\vect{x}+a_1A\vect{x}+a_2A^2\vect{x}+a_3A^3\vect{x}+\cdots+a_nA^n\vect{x}=\zerovector\text{.} \end{equation*}

Some of the \(a_i\) are nonzero. Suppose that just \(a_0\neq 0\text{,}\) and \(a_1=a_2=a_3=\cdots=a_n=0\text{.}\) Then \(a_0\vect{x}=\zerovector\) and by Theorem SMEZV, either \(a_0=0\) or \(\vect{x}=\zerovector\text{,}\) which are both contradictions. So \(a_i\neq 0\) for some \(i\geq 1\text{.}\) Let \(m\) be the largest integer such that \(a_m\neq 0\text{.}\) From this discussion we know that \(m\geq 1\text{.}\) We can also assume that \(a_m=1\text{,}\) for if not, replace each \(a_i\) by \(a_i/a_m\) to obtain scalars that serve equally well in providing a relation of linear dependence on \(S\text{.}\)

Define the polynomial

\begin{equation*} p(x)=a_0+a_1x+a_2x^2+a_3x^3+\cdots+a_{m-1}x^{m-1}+x^m\text{.} \end{equation*}

Because we have consistently used \(\complexes\) as our set of scalars (rather than \({\mathbb R}\)), we know that we can factor \(p(x)\) into linear factors of the form \((x-b_i)\text{,}\) where \(b_i\in\complexes\text{.}\) So there are scalars, \(\scalarlist{b}{m}\text{,}\) from \(\complexes\) so that,

\begin{equation*} p(x)=(x-b_m)(x-b_{m-1})\cdots(x-b_3)(x-b_2)(x-b_1)\text{.} \end{equation*}

Put it all together and

\begin{align*} \zerovector&=a_0\vect{x}+a_1A\vect{x}+a_2A^2\vect{x}+\cdots+a_nA^n\vect{x}\\ &=a_0\vect{x}+a_1A\vect{x}+a_2A^2\vect{x}+\cdots+a_{m-1}A^{m-1}\vect{x}+A^m\vect{x}&& a_i=0\text{ for }i\gt m\\ &=\left(a_0I_n+a_1A+a_2A^2+\cdots+a_{m-1}A^{m-1}+A^m\right)\vect{x}&& \knowl{./knowl/theorem-MMDAA.html}{\text{Theorem MMDAA}}\\ &=p(A)\vect{x}&& \text{Definition of }p(x)\\ &=(A-b_mI_n)(A-b_{m-1}I_n)\cdots(A-b_2I_n)(A-b_1I_n)\vect{x}\text{.} \end{align*}

Let \(k\) be the smallest integer such that

\begin{equation*} (A-b_kI_n)(A-b_{k-1}I_n)\cdots(A-b_2I_n)(A-b_1I_n)\vect{x}=\zerovector\text{.} \end{equation*}

From the preceding equation, we know that \(k\leq m\text{.}\) Define the vector \(\vect{z}\) by

\begin{equation*} \vect{z}=(A-b_{k-1}I_n)\cdots(A-b_2I_n)(A-b_1I_n)\vect{x}\text{.} \end{equation*}

Notice that by the definition of \(k\text{,}\) the vector \(\vect{z}\) must be nonzero. In the case where \(k=1\text{,}\) we understand that \(\vect{z}\) is defined by \(\vect{z}=\vect{x}\text{,}\) and \(\vect{z}\) is still nonzero. Now

\begin{equation*} (A-b_kI_n)\vect{z}=(A-b_kI_n)(A-b_{k-1}I_n)\cdots(A-b_3I_n)(A-b_2I_n)(A-b_1I_n)\vect{x}=\zerovector \end{equation*}

which allows us to write

\begin{align*} A\vect{z} &=(A+\zeromatrix)\vect{z}&& \knowl{./knowl/property-ZM.html}{\text{Property ZM}}\\ &=(A-b_kI_n+b_kI_n)\vect{z}&& \knowl{./knowl/property-AIM.html}{\text{Property AIM}}\\ &=(A-b_kI_n)\vect{z}+b_kI_n\vect{z}&& \knowl{./knowl/theorem-MMDAA.html}{\text{Theorem MMDAA}}\\ &=\zerovector+b_kI_n\vect{z}&& \text{Defining property of }\vect{z}\\ &=b_kI_n\vect{z}&& \knowl{./knowl/property-ZM.html}{\text{Property ZM}}\\ &=b_k\vect{z}&& \knowl{./knowl/theorem-MMIM.html}{\text{Theorem MMIM}}\text{.} \end{align*}

Since \(\vect{z}\neq\zerovector\text{,}\) this equation says that \(\vect{z}\) is an eigenvector of \(A\) for the eigenvalue \(\lambda=b_k\) (Definition EEM), so we have shown that any square matrix \(A\) does have at least one eigenvalue.

The proof of Theorem EMHE is constructive (it contains an unambiguous procedure that leads to an eigenvalue), but it is not meant to be practical. We will illustrate the theorem with an example, the purpose being to provide a companion for studying the proof and not to suggest this is the best procedure for computing an eigenvalue.

This example illustrates the proof of Theorem EMHE, and so will employ the same notation as the proof — look there for full explanations. It is not meant to be a final example of a reasonable computational approach to finding eigenvalues and eigenvectors. OK, warnings in place, here we go.

Consider the matrix \(A\text{,}\) and choose the vector \(\vect{x}\text{,}\)

\begin{align*} A&=\begin{bmatrix} -7 & -1 & 11 & 0 & -4\\ 4 & 1 & 0 & 2 & 0\\ -10 & -1 & 14 & 0 & -4\\ 8 & 2 & -15 & -1 & 5\\ -10 & -1 & 16 & 0 & -6 \end{bmatrix} & \vect{x}&=\colvector{3\\0\\3\\-5\\4}\text{.} \end{align*}

It is important to notice that the choice of \(\vect{x}\) could be anything, so long as it is not the zero vector. We have not chosen \(\vect{x}\) totally at random, but so as to make our illustration of the theorem as general as possible. You could replicate this example with your own choice and the computations are guaranteed to be reasonable, provided you have a computational tool that will factor a fifth degree polynomial for you.

The set

\begin{align*} S&=\set{\vect{x},\,A\vect{x},\,A^2\vect{x},\,A^3\vect{x},\,A^4\vect{x},\,A^5\vect{x}}\\ &= \set{ \colvector{3\\0\\3\\-5\\4},\, \colvector{-4\\2\\-4\\4\\-6},\, \colvector{6\\-6\\6\\-2\\10},\, \colvector{-10\\14\\-10\\-2\\-18},\, \colvector{18\\-30\\18\\10\\34},\, \colvector{-34\\62\\-34\\-26\\-66} } \end{align*}

is guaranteed to be linearly dependent, as it has six vectors from \(\complex{5}\) (Theorem MVSLD).

We will search for a nontrivial relation of linear dependence by solving a homogeneous system of equations whose coefficient matrix has the vectors of \(S\) as columns through row operations,

\begin{equation*} \begin{bmatrix} 3 & -4 & 6 & -10 & 18 & -34\\ 0 & 2 & -6 & 14 & -30 & 62\\ 3 & -4 & 6 & -10 & 18 & -34\\ -5 & 4 & -2 & -2 & 10 & -26\\ 4 & -6 & 10 & -18 & 34 & -66 \end{bmatrix} \rref \begin{bmatrix} \leading{1} & 0 & -2 & 6 & -14 & 30\\ 0 & \leading{1} & -3 & 7 & -15 & 31\\ 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}\text{.} \end{equation*}

There are four free variables for describing solutions to this homogeneous system, so we have our pick of solutions. The most expedient choice would be to set \(x_3=1\) and \(x_4=x_5=x_6=0\text{.}\) However, we will again opt to maximize the generality of our illustration of Theorem EMHE and choose \(x_3=-8\text{,}\) \(x_4=-3\text{,}\) \(x_5=1\) and \(x_6=0\text{.}\) This leads to a solution with \(x_1=16\) and \(x_2=12\text{.}\)

This relation of linear dependence then says that

\begin{align*} \zerovector&=16\vect{x}+12A\vect{x}-8A^2\vect{x}-3A^3\vect{x}+A^4\vect{x}+0A^5\vect{x}\\ \zerovector&=\left(16+12A-8A^2-3A^3+A^4\right)\vect{x}\text{.} \end{align*}

So we define \(p(x)=16+12x-8x^2-3x^3+x^4\text{,}\) and as advertised in the proof of Theorem EMHE, we have a polynomial of degree \(m=4\geq 1\) such that \(p(A)\vect{x}=\zerovector\text{.}\) Now we need to factor \(p(x)\) over \(\complexes\text{.}\) If you made your own choice of \(\vect{x}\) at the start, this is where you might have a fifth degree polynomial, and where you might need to use a computational tool to find roots and factors. We have

\begin{equation*} p(x)=16+12x-8x^2-3x^3+x^4=(x-4)(x+2)(x-2)(x+1)\text{.} \end{equation*}
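
If Sage is your computational tool of choice, the factoring step might be carried out with a sketch like this (the ordering of the factors in the output may differ).

x = polygen(QQ, 'x')
p = 16 + 12*x - 8*x^2 - 3*x^3 + x^4
p.factor()    # (x - 4) * (x - 2) * (x + 1) * (x + 2), perhaps in another order
p.roots()     # the roots 4, 2, -2, -1, each with multiplicity 1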

So we know that

\begin{equation*} \zerovector=p(A)\vect{x}=(A-4I_5)(A+2I_5)(A-2I_5)(A+1I_5)\vect{x}\text{.} \end{equation*}

We apply one factor at a time, until we get the zero vector, so as to determine the value of \(k\) described in the proof of Theorem EMHE,

\begin{align*} (A+1I_5)\vect{x}&= \begin{bmatrix} -6 & -1 & 11 & 0 & -4\\ 4 & 2 & 0 & 2 & 0\\ -10 & -1 & 15 & 0 & -4\\ 8 & 2 & -15 & 0 & 5\\ -10 & -1 & 16 & 0 & -5 \end{bmatrix} \colvector{3\\0\\3\\-5\\4} = \colvector{-1\\2\\-1\\-1\\-2}\\ (A-2I_5)(A+1I_5)\vect{x}&= \begin{bmatrix} -9 & -1 & 11 & 0 & -4\\ 4 & -1 & 0 & 2 & 0\\ -10 & -1 & 12 & 0 & -4\\ 8 & 2 & -15 & -3 & 5\\ -10 & -1 & 16 & 0 & -8 \end{bmatrix} \colvector{-1\\2\\-1\\-1\\-2} = \colvector{4\\-8\\4\\4\\8}\\ (A+2I_5)(A-2I_5)(A+1I_5)\vect{x}&= \begin{bmatrix} -5 & -1 & 11 & 0 & -4\\ 4 & 3 & 0 & 2 & 0\\ -10 & -1 & 16 & 0 & -4\\ 8 & 2 & -15 & 1 & 5\\ -10 & -1 & 16 & 0 & -4 \end{bmatrix} \colvector{4\\-8\\4\\4\\8} = \colvector{0\\0\\0\\0\\0}\text{.} \end{align*}

So \(k=3\) and

\begin{equation*} \vect{z}=(A-2I_5)(A+1I_5)\vect{x}=\colvector{4\\-8\\4\\4\\8} \end{equation*}

is an eigenvector of \(A\) for the eigenvalue \(\lambda=-2\text{,}\) as you can check by doing the computation \(A\vect{z}\text{.}\)
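
This factor-by-factor computation is easy to replicate in Sage. Here is a sketch; the comments record what was found above.

A = matrix(QQ, [[-7, -1, 11, 0, -4], [4, 1, 0, 2, 0], [-10, -1, 14, 0, -4],
                [8, 2, -15, -1, 5], [-10, -1, 16, 0, -6]])
x = vector(QQ, [3, 0, 3, -5, 4])
I5 = identity_matrix(QQ, 5)
v1 = (A + 1*I5)*x        # nonzero
v2 = (A - 2*I5)*v1       # nonzero; this is the eigenvector z
v3 = (A + 2*I5)*v2       # the zero vector, so k = 3
A*v2 == -2*v2            # True: z is an eigenvector for the eigenvalue -2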

Using the same choice of \(\vect{x}\) will lead to the same polynomial \(p(x)\text{,}\) but we can apply the factors in a different order.

\begin{align*} (A+2I_5)\vect{x}&= \begin{bmatrix} -5 & -1 & 11 & 0 & -4\\ 4 & 3 & 0 & 2 & 0\\ -10 & -1 & 16 & 0 & -4\\ 8 & 2 & -15 & 1 & 5\\ -10 & -1 & 16 & 0 & -4 \end{bmatrix} \colvector{3\\0\\3\\-5\\4} = \colvector{2\\2\\2\\-6\\2}\\ (A+1I_5)(A+2I_5)\vect{x}&= \begin{bmatrix} -6 & -1 & 11 & 0 & -4\\ 4 & 2 & 0 & 2 & 0\\ -10 & -1 & 15 & 0 & -4\\ 8 & 2 & -15 & 0 & 5\\ -10 & -1 & 16 & 0 & -5 \end{bmatrix} \colvector{2\\2\\2\\-6\\2} = \colvector{0\\0\\0\\0\\0} \end{align*}

Now \(k=2\) and

\begin{equation*} \vect{z}=(A+2I_5)\vect{x}=\colvector{2\\2\\2\\-6\\2} \end{equation*}

is an eigenvector of \(A\) for the eigenvalue \(\lambda=-1\text{,}\) as you can check by doing the computation \(A\vect{z}\text{.}\)

Theorem EMHE guarantees the existence of an eigenvalue, and the constructive proof (see Proof Technique C) provides a procedure for finding an eigenvalue, provided we can factor a polynomial partway through. But it does not seem to offer much control. We choose (guess?) any vector \(\vect{x}\text{,}\) and the set of linearly dependent vectors created via powers of the matrix could have many different choices (guesses?) for relations of linear dependence leading to many different polynomials. In the extreme you could be unlucky (or lucky?) and end up with a degree 1 polynomial. For example, choose

\begin{equation*} \vect{x}=\colvector{2\\1\\3\\-3\\3} \end{equation*}

and observe that the sequence of vectors \(A^i\vect{x}\) begins with

\begin{align*} A^0\vect{x} &= \colvector{2\\1\\3\\-3\\3} & A^1\vect{x} &= \colvector{6\\3\\9\\-9\\9} \end{align*}

which is already a linearly dependent set. So a possible polynomial is \(p(x)=x-3\text{,}\) and we recognize that \(\vect{x}\) happens to be an eigenvector of \(A\text{!}\) If we want to find some of the other eigenvalues of \(A\) we need to start over with another choice of \(\vect{x}\text{.}\) (Why start over? Compute a few more \(A^i\vect{x}\) to see what happens.)

So the goal of this chapter is to make eigenvalues less mysterious, but knowing that they actually exist is a good start.

If you work through this example with your own choice of the vector \(\vect{x}\) (strongly recommended) then the eigenvalue you will find may be different, but will be in the set \(\set{3,\,0,\,1,\,-1,\,-2}\text{.}\) See Exercise EE.M60 for a suggested starting vector.

Subsection CEE Computing Eigenvalues and Eigenvectors

It may already be clear from Theorem EMHE and Example CASE that for an eigenvalue \(\lambda\) the matrix \(A-\lambda I_n\) plays an important role. Here is the formal version as a theorem.

Both directions of the proof can be accomplished with the following chain of equivalences,

\begin{align*} A\vect{x}=\lambda\vect{x} &\iff A\vect{x}=\lambda I_n\vect{x}&&\knowl{./knowl/theorem-MMIM.html}{\text{Theorem MMIM}}\\ &\iff A\vect{x}-\lambda I_n\vect{x}=\zerovector&&\knowl{./knowl/property-AIM.html}{\text{Property AIM}}\\ &\iff \left(A-\lambda I_n\right)\vect{x}=\zerovector&&\knowl{./knowl/theorem-MMDAA.html}{\text{Theorem MMDAA}}\\ &\iff \vect{x}\text{ is a solution to }\homosystem{A-\lambda I_n}&&\knowl{./knowl/theorem-SLEMM.html}{\text{Theorem SLEMM}}\text{.} \end{align*}
(⇒) 

Suppose \(\lambda\) is an eigenvalue of \(A\) with eigenvector \(\vect{x}\neq\zerovector\text{.}\) Then the equivalences above and Definition NM show that \(A-\lambda I_n\) is singular.

(⇐) 

Suppose that \(A-\lambda I_n\) is singular. Then Definition NM guarantees the existence of \(\vect{x}\text{,}\) a nonzero solution to \(\homosystem{A-\lambda I_n}\text{.}\) Then the equivalences above show that \(\vect{x}\) is an eigenvector of \(A\) with eigenvalue \(\lambda\text{.}\)

So, for an eigenvalue \(\lambda\) and associated eigenvector \(\vect{x}\neq\zerovector\text{,}\) the vector \(\vect{x}\) will be a nonzero element of the null space of \(A-\lambda I_n\text{,}\) while the matrix \(A-\lambda I_n\) will be singular.

Let us now turn our attention to the computation of eigenvectors.

Definition EM. Eigenspace of a Matrix.

Suppose that \(A\) is a square matrix and \(\lambda\) is an eigenvalue of \(A\text{.}\) Then the eigenspace of \(A\) for \(\lambda\text{,}\) \(\eigenspace{A}{\lambda}\text{,}\) is the set of all the eigenvectors of \(A\) for \(\lambda\text{,}\) together with the inclusion of the zero vector.

Example SEE hinted that the set of eigenvectors for a single eigenvalue might have some closure properties, and with the addition of the one vector that is never an eigenvector, \(\zerovector\text{,}\) we indeed get a whole subspace.

We will check the three conditions of Theorem TSS. First, Definition EM explicitly includes the zero vector in \(\eigenspace{A}{\lambda}\text{,}\) so the set is nonempty.

Suppose that \(\vect{x},\,\vect{y}\in\eigenspace{A}{\lambda}\text{,}\) that is, \(\vect{x}\) and \(\vect{y}\) are two eigenvectors of \(A\) for \(\lambda\text{.}\) Then

\begin{align*} A\left(\vect{x}+\vect{y}\right)&=A\vect{x}+A\vect{y}&& \knowl{./knowl/theorem-MMDAA.html}{\text{Theorem MMDAA}}\\ &=\lambda\vect{x}+\lambda\vect{y}&& \knowl{./knowl/definition-EEM.html}{\text{Definition EEM}}\\ &=\lambda\left(\vect{x}+\vect{y}\right)&& \knowl{./knowl/property-DVAC.html}{\text{Property DVAC}}\text{.} \end{align*}

So either \(\vect{x}+\vect{y}=\zerovector\text{,}\) or \(\vect{x}+\vect{y}\) is an eigenvector of \(A\) for \(\lambda\) (Definition EEM). So, in either event, \(\vect{x}+\vect{y}\in\eigenspace{A}{\lambda}\text{,}\) and we have additive closure.

Suppose that \(\alpha\in\complexes\text{,}\) and that \(\vect{x}\in\eigenspace{A}{\lambda}\text{,}\) that is, \(\vect{x}\) is an eigenvector of \(A\) for \(\lambda\text{.}\) Then

\begin{align*} A\left(\alpha\vect{x}\right)&=\alpha\left(A\vect{x}\right)&& \knowl{./knowl/theorem-MMSMM.html}{\text{Theorem MMSMM}}\\ &=\alpha\lambda\vect{x}&& \knowl{./knowl/definition-EEM.html}{\text{Definition EEM}}\\ &=\lambda\left(\alpha\vect{x}\right)&& \knowl{./knowl/property-SMAC.html}{\text{Property SMAC}}\text{.} \end{align*}

So either \(\alpha\vect{x}=\zerovector\text{,}\) or \(\alpha\vect{x}\) is an eigenvector of \(A\) for \(\lambda\) (Definition EEM). So, in either event, \(\alpha\vect{x}\in\eigenspace{A}{\lambda}\text{,}\) and we have scalar closure.

With the three conditions of Theorem TSS met, we know \(\eigenspace{A}{\lambda}\) is a subspace.

Theorem EMS tells us that an eigenspace is a subspace (and hence a vector space in its own right). Our next theorem tells us how to quickly construct this subspace.

The conclusion of this theorem is an equality of sets, so normally we would follow the advice of Definition SE. However, in this case we can construct a sequence of equivalences which will together provide the two subset inclusions we need. First, notice that \(\zerovector\in\eigenspace{A}{\lambda}\) by Definition EM and \(\zerovector\in\nsp{A-\lambda I_n}\) by Theorem HSC. Now consider any nonzero vector \(\vect{x}\in\complex{n}\text{,}\)

\begin{align*} \vect{x}\in\eigenspace{A}{\lambda}&\iff A\vect{x}=\lambda\vect{x}&& \knowl{./knowl/definition-EM.html}{\text{Definition EM}}\\ &\iff A\vect{x}-\lambda\vect{x}=\zerovector&& \knowl{./knowl/property-AIC.html}{\text{Property AIC}}\\ &\iff A\vect{x}-\lambda I_n\vect{x}=\zerovector&& \knowl{./knowl/theorem-MMIM.html}{\text{Theorem MMIM}}\\ &\iff\left(A-\lambda I_n\right)\vect{x}=\zerovector&& \knowl{./knowl/theorem-MMDAA.html}{\text{Theorem MMDAA}}\\ &\iff\vect{x}\in\nsp{A-\lambda I_n}&& \knowl{./knowl/definition-NSM.html}{\text{Definition NSM}}\text{.} \end{align*}

Since Theorem EMNS describes the set of all the eigenvectors of \(A\) as a null space we can use techniques such as Theorem BNS to provide concise descriptions of eigenspaces. Theorem EMNS also provides a trivial proof for Theorem EMS.

Consider

\begin{equation*} F= \begin{bmatrix} -13 & -8 & -4\\ 12 & 7 & 4\\ 24 & 16 & 7 \end{bmatrix}\text{.} \end{equation*}

Following Theorem EMHE we choose \(\vect{x}=\vect{e}_1\text{,}\) the first column of the identity matrix of size \(3\) (Definition SUV), and compute the powers \(F^i\vect{e}_1\text{,}\) \(0\leq i\leq 2\text{,}\) which is the fewest powers to produce a linearly dependent set. We have

\begin{equation*} \left\lbrack F^0\vect{e}_1|F^1\vect{e}_1|F^2\vect{e}_1\right\rbrack = \begin{bmatrix} 1 & -13 & -23 \\ 0 & 12 & 24 \\ 0 & 24 & 48 \end{bmatrix} \rref \begin{bmatrix} 1 & 0 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}\text{.} \end{equation*}

So we have a relation of linear dependence on the columns

\begin{equation*} (-3)F^0\vect{e}_1 + (-2)F^1\vect{e}_1 + 1F^2\vect{e}_1 = \zerovector \end{equation*}

and therefore potential eigenvalues come as roots of the polynomial

\begin{equation*} p(x) = -3 - 2x + x^2 = (x+1)(x-3)\text{.} \end{equation*}

Rather than following Theorem EMHE we can simply proceed to look for eigenvectors of \(F\) for each root of \(p(x)\text{.}\) Theorem ESM implies that a root will be an eigenvalue and we will find a nontrivial eigenspace, or a root will not be an eigenvalue and we will not find a nontrivial eigenspace.

We will now take each root in turn and attempt to compute an eigenspace. To do this, we row-reduce the matrix \(F-\lambda I_3\) in order to determine solutions to the homogeneous system \(\homosystem{F-\lambda I_3}\) and then express the eigenspace as the null space of \(F-\lambda I_3\) (Theorem EMNS). Theorem BNS then tells us how to write the null space as the span of a basis. We have

\begin{align*} \lambda&=3&F-3I_3&= \begin{bmatrix} -16 & -8 & -4\\ 12 & 4 & 4\\ 24 & 16 & 4 \end{bmatrix} \rref \begin{bmatrix} \leading{1} & 0 & \frac{1}{2}\\ 0 & \leading{1} & -\frac{1}{2}\\ 0 & 0 & 0 \end{bmatrix}\\ &&\eigenspace{F}{3}&=\nsp{F-3I_3} =\spn{\set{\colvector{-\frac{1}{2}\\\frac{1}{2}\\1}}} =\spn{\set{\colvector{-1\\1\\2}}}\\ \lambda&=-1&F+1I_3&= \begin{bmatrix} -12 & -8 & -4\\ 12 & 8 & 4\\ 24 & 16 & 8 \end{bmatrix} \rref \begin{bmatrix} \leading{1} & \frac{2}{3} & \frac{1}{3}\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{bmatrix}\\ &&\eigenspace{F}{-1}&=\nsp{F+1I_3} =\spn{\set{\colvector{-\frac{2}{3}\\1\\0},\,\colvector{-\frac{1}{3}\\0\\1}}} =\spn{\set{\colvector{-2\\3\\0},\,\colvector{-1\\0\\3}}}\text{.} \end{align*}

So each root of \(p(x)\) leads to a nontrivial eigenspace and is therefore an eigenvalue. Could \(F\) have other eigenvalues? Yes, it could, since we do not yet have enough theorems at our disposal to be certain these are the only eigenvalues. If you try different vectors for the choice of \(\vect{x}\text{,}\) such as \(\vect{e}_2\) or \(\vect{e}_3\text{,}\) and many, many others, the procedure above leads to the exact same polynomial. So you are unlikely to find more. To find fewer eigenvalues than we did, you would need to have an unfortunate accident like innocently choosing an actual eigenvector for \(\vect{x}\text{,}\) but then with one or two more different guesses you would find the same two eigenvalues we have.
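
In Sage, each of these eigenspace computations is a null space (kernel) computation. A sketch, using the basis='pivot' keyword option (mentioned again later in this section) so that the bases resemble those of Theorem BNS:

F = matrix(QQ, [[-13, -8, -4], [12, 7, 4], [24, 16, 7]])
I3 = identity_matrix(QQ, 3)
(F - 3*I3).right_kernel(basis='pivot')   # dimension 1; compare with the span above
(F + 1*I3).right_kernel(basis='pivot')   # dimension 2; compare with the span above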

Consider the \(5\times 5\) matrix

\begin{equation*} A = \begin{bmatrix} 21 & -3 & 42 & 45 & -21 \\ -39 & 0 & -80 & -78 & 40 \\ 23 & 1 & 45 & 43 & -24 \\ -31 & 0 & -61 & -60 & 32 \\ 6 & -1 & 13 & 14 & -6 \end{bmatrix}\text{.} \end{equation*}

Following Theorem EMHE we set \(\vect{x}=\vect{e}_1\) and compute \(A^i\vect{x}\text{.}\)

\begin{align*} A^0\vect{x} &= \colvector{1 \\ 0 \\ 0 \\ 0 \\ 0} & A^1\vect{x} &= \colvector{21 \\ -39 \\ 23 \\ -31 \\ 6} & A^2\vect{x} &= \colvector{3 \\ -1 \\ 2 \\ -2 \\ -6} & A^3\vect{x} &= \colvector{186 \\ -361 \\ 216 \\ -287 \\ 53} & A^4\vect{x} &= \colvector{33 \\ -28 \\ 24 \\ -26 \\ -51} & A^5\vect{x} &= \colvector{1686 \\ -3219 \\ 1917 \\ -2559 \\ 480} \end{align*}

and a relation of linear dependence on all six vectors is

\begin{equation*} -3A^0\vect{x} - 27A^1\vect{x} - 6A^3\vect{x} + A^5\vect{x} = \zerovector \end{equation*}

so candidates for eigenvalues are roots of \(p(x) = x^5 - 6x^3 - 27x - 3\text{.}\) Any other \(\vect{x}\) you might try will also lead to this polynomial.

You should know the quadratic equation for finding roots of a degree \(2\) polynomial. Maybe you also have experience with the similar cubic and quartic equations. And perhaps you have been told there is no quintic equation. This means there are polynomials of degree \(5\) whose roots cannot be expressed simply with arithmetic and taking roots (square, cube, …). This polynomial is one of them. It does have five roots (three real, two complex)—we just cannot write them down simply. In practice we would use computer programs to find close approximations. Finding eigenvalues can be hard. Or impossible.
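
For instance, a computer algebra system can approximate the five roots numerically even though it cannot express them exactly in terms of radicals. A sketch in Sage:

x = polygen(QQ, 'x')
p = x^5 - 6*x^3 - 27*x - 3
p.factor()      # no factorization into smaller-degree rational factors
p.roots(CDF)    # five floating-point approximations: three real, two complex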

If you want to learn more about the insolvability of the quintic, take an abstract algebra course that includes Galois Theory as one of its topics.

Subsection ECEE Examples of Computing Eigenvalues and Eigenvectors

In this subsection we will illustrate some more concrete examples of determining the eigenvalues (and eigenvectors) of matrices. We do not yet have all the theorems we can bring to bear, but it will help to see some examples. We begin with a definition we will use regularly, and demonstrate in these examples.

Definition GME. Geometric Multiplicity of an Eigenvalue.

Suppose that \(A\) is a square matrix and \(\lambda\) is an eigenvalue of \(A\text{.}\) Then the geometric multiplicity of \(\lambda\text{,}\) \(\geomult{A}{\lambda}\text{,}\) is the dimension of the eigenspace \(\eigenspace{A}{\lambda}\text{.}\)

Consider the square matrix \(C\text{,}\) of size \(4\text{.}\)

\begin{equation*} C = \begin{bmatrix} 0 & 0 & 0 & 2 \\ -1 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & -2 & 0 \end{bmatrix}\text{.} \end{equation*}

We will determine its eigenvalues by the method of Theorem EMHE. We can arrive at an informed choice of \(\vect{x}\) by looking at powers of \(C\text{.}\)

\begin{align*} C^0 &= \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} & C^1 &= \begin{bmatrix} 0 & 0 & 0 & 2 \\ -1 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & -2 & 0 \end{bmatrix} & C^2 &= \begin{bmatrix} 0 & 0 & -4 & 0 \\ 0 & 0 & 0 & -2 \\ -4 & 0 & 0 & 0 \\ 0 & -8 & 0 & 0 \end{bmatrix}\\ C^3 &= \begin{bmatrix} 0 & -16 & 0 & 0 \\ 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & -8 \\ 8 & 0 & 0 & 0 \end{bmatrix} & C^4 &= \begin{bmatrix} 16 & 0 & 0 & 0 \\ 0 & 16 & 0 & 0 \\ 0 & 0 & 16 & 0 \\ 0 & 0 & 0 & 16 \end{bmatrix} \end{align*}

The first columns of the first four powers can be seen to form a linearly independent set, with a little thought given to the organization of the zero and nonzero entries.

\begin{equation*} \set{C^0\vect{e}_1,\,C^1\vect{e}_1,\,C^2\vect{e}_1,\,C^3\vect{e}_1,\,C^4\vect{e}_1} = \set{\colvector{1\\0\\0\\0},\,\colvector{0\\-1\\0\\0},\,\colvector{0\\0\\-4\\0},\,\colvector{0\\0\\0\\8},\,\colvector{16\\0\\0\\0}} \end{equation*}

We know that adding in the first column of the final power creates a linearly dependent set, but even better, a relation of linear dependence can be formed with just the first and last vectors. So we use \(\vect{x}=\vect{e}_1\text{.}\) (This discussion applies equally well to the second columns, third columns, or fourth columns of the powers of \(C\text{,}\) and they all lead to the same polynomial.)

\begin{equation*} -16C^0\vect{e}_1 + C^4\vect{e}_1 = \zerovector \end{equation*}

So the polynomial is \(p(x)=x^4-16\text{.}\) This polynomial does have four roots, but we need to use complex numbers to express them. Factoring yields

\begin{equation*} x^4 - 16 = (x^2 - 4)(x^2 + 4) = (x - 2)(x + 2)(x - 2i)(x + 2i)\text{.} \end{equation*}

So potential eigenvalues are \(\lambda = 2, -2, 2i, -2i\text{.}\)

While Theorem EMHE provides a construction for eigenvectors (as a way to establish that a polynomial's root is really an eigenvalue), as a practical matter eigenvectors and eigenspaces are more easily (and comprehensively) determined via Theorem EMNS. We will compute \(\nsp{C-\lambda I_4}\) for each root of \(p(x)\) and a nontrivial null space will (a) indicate we have actually found an eigenvalue, and (b) give the eigenspace.

\begin{align*} \lambda&= 2 & C- 2 I_4&= \begin{bmatrix} -2 & 0 & 0 & 2 \\ -1 & -2 & 0 & 0 \\ 0 & 4 & -2 & 0 \\ 0 & 0 & -2 & -2 \end{bmatrix} \rref \begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & \frac{1}{2} \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}\\ & &\eigenspace{C}{2}&=\nsp{C-2 I_4} =\spn{\set{\colvector{1 \\ -1/2 \\ -1 \\ 1}}} =\spn{\set{\colvector{2 \\ -1 \\ -2 \\ 2}}}\\ \lambda&= -2 & C-(-2)I_4&= \begin{bmatrix} 2 & 0 & 0 & 2 \\ -1 & 2 & 0 & 0 \\ 0 & 4 & 2 & 0 \\ 0 & 0 & -2 & 2 \end{bmatrix} \rref \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & \frac{1}{2} \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix}\\ & &\eigenspace{C}{-2}&=\nsp{C-(-2)I_4} =\spn{\set{\colvector{-1 \\ -1/2 \\ 1 \\ 1}}} =\spn{\set{\colvector{-2 \\ -1 \\ 2 \\2}}}\\ \lambda&= 2i & C- 2i I_4&= \begin{bmatrix} -2 i & 0 & 0 & 2 \\ -1 & -2 i & 0 & 0 \\ 0 & 4 & -2 i & 0 \\ 0 & 0 & -2 & -2 i \end{bmatrix} \rref \begin{bmatrix} 1 & 0 & 0 & i \\ 0 & 1 & 0 & -\frac{1}{2} \\ 0 & 0 & 1 & i \\ 0 & 0 & 0 & 0 \end{bmatrix}\\ & &\eigenspace{C}{2i}&=\nsp{C-2i I_4} =\spn{\set{\colvector{-i \\1/2 \\ -i \\ 1}}} =\spn{\set{\colvector{-2i \\ 1 \\ -2i \\ 2}}}\\ \lambda&= -2i & C- (-2i)I_4&= \begin{bmatrix} 2 i & 0 & 0 & 2 \\ -1 & 2 i & 0 & 0 \\ 0 & 4 & 2 i & 0 \\ 0 & 0 & -2 & 2 i \end{bmatrix} \rref \begin{bmatrix} 1 & 0 & 0 & -i \\ 0 & 1 & 0 & -\frac{1}{2} \\ 0 & 0 & 1 & -i \\ 0 & 0 & 0 & 0 \end{bmatrix}\\ & &\eigenspace{C}{-2i}&=\nsp{C-(-2i)I_4} =\spn{\set{\colvector{i \\ 1/2 \\ i \\ 1}}} =\spn{\set{\colvector{2i \\ 1 \\ 2i \\ 2}}} \end{align*}

So we have four distinct eigenvalues, each with an eigenspace of dimension \(1\text{.}\) Thus the geometric multiplicity of each eigenvalue is \(1\text{.}\) For example, we write \(\geomult{C}{-2}=1\text{.}\) We are not yet able to say if we have found all of the eigenvalues of \(C\text{,}\) but soon Theorem MNEM will help us out.
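
Over the complex numbers, Sage carries out the same null space computations using QQbar, the field of algebraic numbers. Here is a sketch confirming that each of the four eigenspaces has dimension \(1\text{.}\)

C = matrix(QQbar, [[0, 0, 0, 2], [-1, 0, 0, 0], [0, 4, 0, 0], [0, 0, -2, 0]])
I4 = identity_matrix(QQbar, 4)
i = QQbar(-1).sqrt()       # the imaginary unit as an algebraic number
for lam in [2, -2, 2*i, -2*i]:
    print(lam, (C - lam*I4).right_kernel().dimension())   # each dimension is 1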

See Exercise EE.T50 for a generalization of this example.

Our second example belongs to a class of matrices that are very important for a more advanced topic known as rational canonical form. A polynomial is known as monic if the coefficient of its highest degree term is \(1\text{.}\)

Definition CMP. Companion Matrix of a Polynomial.

Given a monic polynomial \(p(x) = x^n + a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + \cdots + a_1x + a_0\text{,}\) the companion matrix of \(p(x)\) is the \(n\times n\) matrix

\begin{equation*} C=\begin{bmatrix} 0 & 0 & \dots & 0 & 0 & -a_0 \\ 1 & 0 & \dots & 0 & 0 & -a_1 \\ 0 & 1 & \dots & 0 & 0 & -a_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\ 0 & 0 & \dots & 1 & 0 & -a_{n-2} \\ 0 & 0 & \dots & 0 & 1 & -a_{n-1} \end{bmatrix}\text{.} \end{equation*}

In other words, the entries of the subdiagonal are all one, the entries of the final column are the negatives of the coefficients of the polynomial (starting with the constant term), and every other entry is zero.

You may see this matrix called the Frobenius companion matrix, and in some situations it is easier to work with the transpose, so then the coefficients are across the bottom row.
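
Sage has a built-in companion_matrix() constructor, but its conventions for where the coefficients land may differ from Definition CMP, so here is a small hypothetical helper (the function name and interface are ours, not part of Sage) that follows the definition literally.

def companion_from_monic(coeffs):
    # coeffs = [a_0, a_1, ..., a_{n-1}] for p(x) = x^n + a_{n-1}x^{n-1} + ... + a_1*x + a_0
    n = len(coeffs)
    C = matrix(QQ, n, n)         # start with the zero matrix
    for i in range(1, n):
        C[i, i-1] = 1            # ones on the subdiagonal
    for i in range(n):
        C[i, n-1] = -coeffs[i]   # final column: negated coefficients, constant term first
    return C

companion_from_monic([24, -20, -14, 21, -8])   # the matrix of the next example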

Consider the companion matrix of \(p(x)=x^5 - 8x^4 + 21x^3 - 14x^2 - 20x + 24\text{,}\)

\begin{equation*} C =\begin{bmatrix} 0 & 0 & 0 & 0 & -24 \\ 1 & 0 & 0 & 0 & 20 \\ 0 & 1 & 0 & 0 & 14 \\ 0 & 0 & 1 & 0 & -21 \\ 0 & 0 & 0 & 1 & 8 \end{bmatrix}\text{.} \end{equation*}

We will find eigenvalues of \(C\) using the techniques of Theorem EMHE. But rather than blindly choosing a vector \(\vect{x}\) to be multiplied by the powers of \(C\text{,}\) we list the first six powers of \(C\text{.}\)

\begin{align*} C^0&=\begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} & C^1&=\begin{bmatrix} 0 & 0 & 0 & 0 & -24 \\ 1 & 0 & 0 & 0 & 20 \\ 0 & 1 & 0 & 0 & 14 \\ 0 & 0 & 1 & 0 & -21 \\ 0 & 0 & 0 & 1 & 8 \end{bmatrix}\\ C^2&=\begin{bmatrix} 0 & 0 & 0 & -24 & -192 \\ 0 & 0 & 0 & 20 & 136 \\ 1 & 0 & 0 & 14 & 132 \\ 0 & 1 & 0 & -21 & -154 \\ 0 & 0 & 1 & 8 & 43 \end{bmatrix} & C^3&=\begin{bmatrix} 0 & 0 & -24 & -192 & -1032 \\ 0 & 0 & 20 & 136 & 668 \\ 0 & 0 & 14 & 132 & 738 \\ 1 & 0 & -21 & -154 & -771 \\ 0 & 1 & 8 & 43 & 190 \end{bmatrix}\\ C^4&=\begin{bmatrix} 0 & -24 & -192 & -1032 & -4560 \\ 0 & 20 & 136 & 668 & 2768 \\ 0 & 14 & 132 & 738 & 3328 \\ 0 & -21 & -154 & -771 & -3252 \\ 1 & 8 & 43 & 190 & 749 \end{bmatrix} & C^5&=\begin{bmatrix} -24 & -192 & -1032 & -4560 & -17976 \\ 20 & 136 & 668 & 2768 & 10420 \\ 14 & 132 & 738 & 3328 & 13254 \\ -21 & -154 & -771 & -3252 & -12401 \\ 8 & 43 & 190 & 749 & 2740 \end{bmatrix} \end{align*}

Notice how the column with the coefficients of the polynomial marches across the matrix from right to left as the powers increase, landing in the first column of \(C^5\text{.}\) And the first column of each lesser power is a different vector from the standard basis of \(\complex{5}\) (Definition SUV). So we judiciously choose to use \(\vect{x}=\vect{e}_1\) and pick off the first column of each power. So we have

\begin{equation*} \set{C^0\vect{e}_1,\,C^1\vect{e}_1,\,C^2\vect{e}_1,\,C^3\vect{e}_1,\,C^4\vect{e}_1,\,C^5\vect{e}_1} = \set{ \colvector{1\\0\\0\\0\\0},\, \colvector{0\\1\\0\\0\\0},\, \colvector{0\\0\\1\\0\\0},\, \colvector{0\\0\\0\\1\\0},\, \colvector{0\\0\\0\\0\\1},\, \colvector{-24\\20\\14\\-21\\8} } \end{equation*}

We of course expected a linearly dependent set, but now the relation of linear dependence is obvious.

\begin{equation*} 24C^0\vect{e}_1 + (-20)C^1\vect{e}_1 + (-14)C^2\vect{e}_1 + (21)C^3\vect{e}_1 + (-8)C^4\vect{e}_1 + C^5\vect{e}_1 = \zerovector \end{equation*}

and the polynomial, much to our surprise, is

\begin{equation*} x^5 - 8x^4 + 21x^3 - 14x^2 - 20x + 24\text{.} \end{equation*}

Factor the polynomial as \((x+1)(x-2)^3(x-3)\text{,}\) so we have three candidates for eigenvalues. (Note that Theorem EMHE only guarantees that one of the roots of this polynomial must be an eigenvalue.)

As before, we attempt to construct an eigenspace for each root via Theorem EMNS, and when we see the eigenspace is nontrivial we have succeeded, and know the root is actually an eigenvalue.

\begin{align*} \lambda&=-1 & C-(-1)I_5&= \begin{bmatrix} 1 & 0 & 0 & 0 & -24 \\ 1 & 1 & 0 & 0 & 20 \\ 0 & 1 & 1 & 0 & 14 \\ 0 & 0 & 1 & 1 & -21 \\ 0 & 0 & 0 & 1 & 9 \end{bmatrix} \rref \begin{bmatrix} \leading{1} & 0 & 0 & 0 & -24 \\ 0 & \leading{1} & 0 & 0 & 44 \\ 0 & 0 & \leading{1} & 0 & -30 \\ 0 & 0 & 0 & \leading{1} & 9 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}\\ & &\eigenspace{C}{-1}&=\nsp{C-(-1)I_5} =\spn{\set{\colvector{24\\-44\\30\\-9\\1}}}\\ \lambda&=2 & C-2 I_5&= \begin{bmatrix} -2 & 0 & 0 & 0 & -24 \\ 1 & -2 & 0 & 0 & 20 \\ 0 & 1 & -2 & 0 & 14 \\ 0 & 0 & 1 & -2 & -21 \\ 0 & 0 & 0 & 1 & 6 \end{bmatrix} \rref \begin{bmatrix} \leading{1} & 0 & 0 & 0 & 12 \\ 0 & \leading{1} & 0 & 0 & -4 \\ 0 & 0 & \leading{1} & 0 & -9 \\ 0 & 0 & 0 & \leading{1} & 6 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}\\ & &\eigenspace{C}{2}&=\nsp{C-2 I_5} =\spn{\set{\colvector{-12\\4\\9\\-6\\1}}}\\ \lambda&=3 & C-3 I_5&= \begin{bmatrix} -3 & 0 & 0 & 0 & -24 \\ 1 & -3 & 0 & 0 & 20 \\ 0 & 1 & -3 & 0 & 14 \\ 0 & 0 & 1 & -3 & -21 \\ 0 & 0 & 0 & 1 & 5 \end{bmatrix} \rref \begin{bmatrix} \leading{1} & 0 & 0 & 0 & 8 \\ 0 & \leading{1} & 0 & 0 & -4 \\ 0 & 0 & \leading{1} & 0 & -6 \\ 0 & 0 & 0 & \leading{1} & 5 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}\\ & &\eigenspace{C}{3}&=\nsp{C-3 I_5} =\spn{\set{\colvector{-8\\4\\6\\-5\\1}}} \end{align*}

So we see that each root of the polynomial used to define the companion matrix is an eigenvalue and each eigenspace has dimension \(1\text{,}\) so the geometric multiplicity of each eigenvalue is \(1\text{.}\) Note that we are certain of these three eigenvalues, and their eigenspaces, but we do not know if this matrix might still have other eigenvalues. Our next theorem will rectify that.

There are a lot of patterns in the previous example, which look like they might not be tied to the actual choice of the polynomial \(p(x)\text{.}\) Here's one you might miss. For each \(\lambda\text{,}\) build a monic polynomial of degree \(4\) from the negatives of the entries of the final column of the reduced row-echelon version of \(C-\lambda I_5\text{.}\) For example, when \(\lambda=3\text{,}\) build \(q(x)=x^4 - 5x^3 + 6x^2 + 4x - 8\text{.}\) Now multiply this polynomial by \(x-\lambda\text{.}\) What is \((x-3)q(x)\text{?}\) Hmmm, unlikely to be a coincidence. See Exercise EE.T60 to exploit this observation.
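
A quick check of the observation in Sage (a sketch, using the polynomials named above):

x = polygen(QQ, 'x')
q = x^4 - 5*x^3 + 6*x^2 + 4*x - 8                # built from the lambda = 3 column
p = x^5 - 8*x^4 + 21*x^3 - 14*x^2 - 20*x + 24    # the polynomial defining C
(x - 3)*q == p                                   # True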

The next theorem is a more general version of the example.

Suppose \(C\) has size \(n\) and denote \(p(x)=x^n + a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + \cdots + a_1x + a_0\text{.}\) We will employ Theorem ESM, and so consider the matrix \(C - \lambda I_n\text{,}\) where \(\lambda\) is not necessarily an eigenvalue. To determine when this matrix is nonsingular, we will find a row-equivalent matrix in reduced row-echelon form. While we have the well-known procedure in Theorem REMEF, for this proof a different sequence of row operations will be quicker and easier. All of the ones on the subdiagonal are very close to being leading ones, so we will not disturb them. Later we will move them into place as leading ones.

First perform a sequence of \(n-1\) row operations \(\rowopadd{\lambda}{i}{i-1}\text{,}\) starting with \(i=n\) and ending with \(i=2\text{.}\) Notice that this will convert the diagonal, except the final element, into zeros, and not disturb any other zeros or ones. We do need to carefully track the changes to entries in the final column. We list each entry in turn after it changes, due to the effect of each row operation:

\begin{align*} &-\lambda-a_{n-1}&&\text{Row }n\\ &\lambda(-\lambda - a_{n-1}) - a_{n-2} = -\lambda^2 - \lambda a_{n-1} - a_{n-2}&&\text{Row }n-1\\ &\lambda(-\lambda^2 - \lambda a_{n-1} - a_{n-2}) - a_{n-3} = -\lambda^3 - \lambda^2 a_{n-1} - \lambda a_{n-2} - a_{n-3}&&\text{Row }n-2\\ &\quad\vdots\\ &\lambda\left(-\lambda^i - \sum_{k=1}^i \lambda^{i-k} a_{n-k}\right) - a_{n-i-1} = -\lambda^{i+1} - \sum_{k=1}^{i+1} \lambda^{i+1-k} a_{n-k}&&\text{Row }n-i\\ &\quad\vdots\\ &-\lambda^{n} - \sum_{k=1}^{n} \lambda^{n-k} a_{n-k}&&\text{Row }n-(n-1) = \text{Row }1 \end{align*}

and the key observation is that the entry of the first row, in the last column, is \(-p(\lambda)\text{.}\)

Now use \(n-1\) row operations \(\rowopswap{i}{i+1}\text{,}\) for \(1\leq i\leq n-1\text{,}\) to move the first row down to become the last row. Note that this has the effect of making the subdiagonal ones into leading ones, and the first \(n-1\) columns are now pivot columns. Call this row-equivalent matrix \(B\text{.}\)

(⇐) 

If \(\lambda\) is a root of \(p(x)\text{,}\) \(p(\lambda)=0\text{,}\) and the last row is a zero row, so then \(B\) is in reduced row-echelon form, and the final column is not a pivot column. So \(C - \lambda I_n\) does not row-reduce to the identity matrix, and hence is singular (Theorem NMRRI). Then by Theorem ESM, \(\lambda\) is an eigenvalue.

(⇒) 

We will prove the contrapositive (Proof Technique CP): if \(\lambda\) is not a root of \(p(x)\text{,}\) then \(\lambda\) is not an eigenvalue of \(C\text{.}\) If \(p(\lambda)\) is not zero, then a row operation can be used to scale the last entry of the last row to one. Then \(n-1\) more row operations can add multiples of this last row to every other row and form a pivot column in the last column. So \(C - \lambda I_n\) is row-equivalent to the identity matrix, and by Theorem NMRRI is nonsingular. Then Theorem ESM tells us that \(\lambda\) is not an eigenvalue.

A continuing theme is that finding eigenvalues can be tricky. But once you have eigenvalues, a computation of a null space (Theorem EMNS) is a straightforward way to get eigenvectors, or the definition (Definition EEM) may be employed. The latter is used in Exercise EE.T60 to determine the eigenvectors of a companion matrix. So look at its solution if you want to know what they are.

We will not have a great need for companion matrices in this course, but you may see them again in a subsequent course. Another broad class of matrices whose eigenvalues may be determined is the upper triangular matrices. They, and their eigenvalues, will be very important for the remainder of this course (and also subsequent courses!).

Suppose \(a\) is an entry on the diagonal of \(T\text{.}\) Then \(T - a I_n\) has a zero on the diagonal, and by Theorem NTM is singular. Then Theorem ESM says \(a\) is an eigenvalue of \(T\text{.}\)

Suppose \(\lambda\) is an eigenvalue of \(T\text{.}\) Then by Theorem ESM, \(T - \lambda I_n\) is singular, and Theorem NTM implies that \(T - \lambda I_n\) has a zero on the diagonal. But this will only happen if \(T\) has a diagonal entry equal to \(\lambda\text{.}\)
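
A quick illustration in Sage, with an upper triangular matrix chosen arbitrarily for this purpose (it does not come from the text):

T = matrix(QQ, [[3, 1, -2], [0, -1, 5], [0, 0, 3]])
T.eigenvalues()    # 3, -1, 3 in some order: exactly the diagonal entries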

We can now give a more careful explanation about eigenvalues in Sage. Sage will compute the characteristic polynomial of a matrix, with amazing ease (in other words, quite quickly, even for large matrices). The two matrix methods .charpoly() and .characteristic_polynomial() do exactly the same thing. We will use the longer name just to be more readable, you may prefer the shorter.

We now can appreciate a very fundamental obstacle to determining the eigenvalues of a matrix, which is a theme that will run through any advanced study of linear algebra. Study this example carefully before reading the discussion that follows.

We know by Theorem EMRCP that to compute eigenvalues, we need the roots of the characteristic polynomial, and from basic algebra, we know these correspond to linear factors. However, with our matrix defined with entries from QQ, the factorization of the characteristic polynomial does not “leave” that number system, only factoring “far enough” to retain factors with rational coefficients. The solutions to \(x^2 - 2 = 0\) are somewhat obvious (\(\pm\sqrt{2}\approx\pm 1.414213\)), but the roots of the cubic factor are more obscure.

But then we have QQbar to the rescue. Since this number system contains the roots of every possible polynomial with integer coefficients, we can totally factor any characteristic polynomial that results from a matrix with entries from QQbar. A common situation will be to begin with a matrix having rational entries, yet the matrix has a characteristic polynomial with roots that are complex numbers.

We can demonstrate this behavior with the extend keyword option, which tells Sage whether or not to expand the number system to contain the eigenvalues.

For matrices with entries from QQ, the default behavior is to extend to QQbar when necessary. But for other number systems, you may need to explicitly use the extend=True option.
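
Here is a small stand-in example (this particular matrix is ours, not one from the text) whose characteristic polynomial contains the factor \(x^2-2\text{,}\) so its eigenvalues do not live in QQ.

B = matrix(QQ, [[0, 2], [1, 0]])
B.characteristic_polynomial().factor()   # x^2 - 2, which does not split over the rationals
B.eigenvalues()                          # default: extends to QQbar, giving +/- sqrt(2)
B.eigenvalues(extend=False)              # restricted to QQ, so no eigenvalues are reported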

From a factorization of the characteristic polynomial, we can see the algebraic multiplicity of each eigenvalue as the second entry of the each pair returned in the list. We demonstrate with Example SEE, extending to QQbar, which is not strictly necessary for this simple matrix.

One more example, which illustrates the behavior when we use floating-point approximations as entries (in other words, we use CDF as our number system). This is Example EMMS4, both as an exact matrix with entries from QQbar and as an approximate matrix with entries from CDF.

So, we see \(\lambda=2\) as an eigenvalue with algebraic multiplicity 3, while the numerical results contain three complex numbers, each very, very close to 2. The approximate nature of these eigenvalues may be disturbing (or alarming). However, their computation, as floating-point numbers, can be incredibly fast with sophisticated algorithms allowing the analysis of huge matrices with millions of entries. And perhaps your original matrix includes data from an experiment, and is not even exact in the first place. Designing and analyzing algorithms to perform these computations quickly and accurately is part of the field known as numerical linear algebra.

One cautionary note: Sage uses a definition of the characteristic polynomial slightly different than ours, namely \(\detname{xI_n-A}\text{.}\) This has the advantage that the \(x^n\) term always has a positive one as the leading coefficient. For even values of \(n\) the two definitions create the identical polynomial, and for odd values of \(n\text{,}\) the two polynomials differ only by a multiple of \(-1\text{.}\) The reason this is not very critical is that Theorem EMRCP is true in either case, and this is a principal use of the characteristic polynomial. Our definition is more amenable to computations by hand.

There are three ways to get eigenvectors in Sage. For each eigenvalue, the method .eigenvectors_right() will return a list of eigenvectors that is a basis for the associated eigenspace. The method .eigenspaces_right() will return an eigenspace (in other words, a vector space, rather than a list of vectors) for each eigenvalue. There are also eigenmatrix methods which we will describe at the end of the chapter in Sage MD.

The matrix method .eigenvectors_right() (or equivalently the matrix method .right_eigenvectors()) produces a list of triples, one triple per eigenvalue. Each triple has an eigenvalue, a list, and then the algebraic multiplicity of the eigenvalue. The list contains vectors forming a basis for the eigenspace. Notice that the length of the list of eigenvectors will be the geometric multiplicity (and there is no easier way to get this information).

Note that this is a good place to practice burrowing down into Sage output that is full of lists (and lists of lists). See if you can extract just the second eigenvector for \(\lambda=3\) using a single statement. Or perhaps try obtaining the geometric multiplicity of \(\lambda=-2i\text{.}\) Notice, too, that Sage has automatically upgraded to QQbar to get the complex eigenvalues.

The matrix method .eigenspaces_right() (equal to .right_eigenspaces()) produces a list of pairs, one pair per eigenvalue. Each pair has an eigenvalue, followed by the eigenvalue's eigenspace. Notice that the basis matrix of the eigenspace may not have the same eigenvectors you might get from other methods. Similar to the eigenvectors method, the dimension of the eigenspace will yield the geometric multiplicity (and there is no easier way to get this information). If you need the algebraic multiplicities, you can supply the keyword option algebraic_multiplicity=True to get back triples with the algebraic multiplicity in the third entry of the triple. We will recycle the example above, and not demonstrate the algebraic multiplicity option. (We have formatted the one-row basis matrices over QQbar across several lines.)

Notice how the output includes a subspace of dimension two over the rationals, and two subspaces of dimension one over the algebraic numbers.

The upcoming Subsection EE.ECEE has many examples, which mostly reflect techniques that might be possible to verify by hand. Here is the same matrix as above, analyzed in a similar way. Practicing the examples in this subsection, either directly with the higher-level Sage commands, and/or with more primitive commands (as below) would be an extremely good exercise at this point.

Notice how we changed the number system to the algebraic numbers before working with the complex eigenvalues. Also, we are using the basis='pivot' keyword option so that bases for the eigenspaces look more like the bases described in Theorem BNS.

By now, it should be clear why we keep using the “right” variants of these methods. Eigenvectors can be defined “on the right”, \(A\vect{x}=\lambda\vect{x}\) as we have done, or “on the left,” \(\vect{x}A=\lambda\vect{x}\text{.}\) So use the “right” versions of the eigenvalue and eigenvector commands to stay consistent with the text. Recognize, too, that eigenspaces may be computed with different bases than those given in the text (typically like those for null spaces with the basis='echelon' option).

Why does the .eigenvalues() method not come in left and right versions? The upcoming Theorem ETM can be used to show that the two versions would have identical output, so there is no need.
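
A quick illustration of that claim, with a small matrix of our own choosing:

    A = matrix(QQ, [[2, 1], [0, 2]])
    A.eigenvalues() == A.transpose().eigenvalues()    # True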

Reading Questions EE Reading Questions

1.

Suppose \(A\) is the \(2\times 2\) matrix

\begin{equation*} A=\begin{bmatrix} -5 & 8\\-4 & 7 \end{bmatrix}\text{.} \end{equation*}

Find the eigenvalues of \(A\) by using the techniques of Theorem EMHE. Demonstrate that you have not simply consulted Sage (or similar).

2.

For each eigenvalue of \(A\text{,}\) find the corresponding eigenspace. Again, demonstrate that you have not simply consulted Sage (or similar).

3.

For the polynomial \(p(x)=3x^2-x+2\) and \(A\) from above, compute \(p(A)\text{.}\)

Exercises EE Exercises

C19.

Find the eigenvalues, eigenspaces, and geometric multiplicities for the matrix below. It is possible to do all these computations by hand, and it would be instructive to do so.

\begin{equation*} C= \begin{bmatrix} -1 & 2 \\ -6 & 6 \end{bmatrix}\text{.} \end{equation*}
Solution

We follow Theorem EMHE to obtain some candidates for eigenvalues. We use \(\vect{x}=\colvector{2\\1}\text{,}\) with no particular justification. You can try any nonzero vector. We need the powers \(C^i\vect{x}\text{.}\)

\begin{align*} C^0\vect{x} &= \colvector{2\\1} & C^1\vect{x} &= \colvector{0\\-6} & C^2\vect{x} &= \colvector{-12\\-36}\text{.} \end{align*}

As a set, the first two vectors are linearly independent, and as expected, the full set of three is linearly dependent, with the following nontrivial relation of linear dependence:

\begin{equation*} 6C^0\vect{x} - 5C^1\vect{x} + C^2\vect{x} = \zerovector\text{.} \end{equation*}

So candidate eigenvalues are the roots of \(p(x)=x^2-5x+6=(x-2)(x-3)\text{.}\) By computing nontrivial eigenspaces for both \(\lambda=2\) and \(\lambda=3\text{,}\) we will see that each is an eigenvalue.

To obtain the eigenspaces, construct the appropriate singular matrices and find expressions for the null spaces of these matrices.

\begin{align*} \lambda&=2\\ C-(2)I_2&= \begin{bmatrix} -3 & 2\\ -6 & 4 \end{bmatrix} \rref \begin{bmatrix} \leading{1} & -\frac{2}{3}\\ 0 & 0 \end{bmatrix}\\ \eigenspace{C}{2}&=\nsp{C-(2)I_2}= \spn{\set{\colvector{\frac{2}{3}\\1}}} = \spn{\set{\colvector{2\\3}}} \end{align*}
\begin{align*} \lambda&=3\\ C-(3)I_2&= \begin{bmatrix} -4 & 2\\ -6 & 3 \end{bmatrix} \rref \begin{bmatrix} \leading{1} & -\frac{1}{2}\\ 0 & 0 \end{bmatrix}\\ \eigenspace{C}{3}&=\nsp{C-(3)I_2}= \spn{\set{\colvector{\frac{1}{2}\\1}}} = \spn{\set{\colvector{1\\2}}}\text{.} \end{align*}

Each eigenspace has a single basis vector, so the dimensions are both \(1\) and the geometric multiplicities are \(\geomult{C}{2}=1\) and \(\geomult{C}{3}=1\text{.}\)
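
As a check, each basis vector behaves as Definition EEM requires,

\begin{align*} C\colvector{2\\3}&=\colvector{(-1)(2)+2(3)\\(-6)(2)+6(3)}=\colvector{4\\6}=2\colvector{2\\3} & C\colvector{1\\2}&=\colvector{(-1)(1)+2(2)\\(-6)(1)+6(2)}=\colvector{3\\6}=3\colvector{1\\2}\text{.} \end{align*}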

C21.

The matrix \(A\) below has \(\lambda=2\) as an eigenvalue. Find the geometric multiplicity of \(\lambda=2\) using your calculator only for row-reducing matrices.

\begin{equation*} A= \begin{bmatrix} 18 & -15 & 33 & -15\\ -4 & 8 & -6 & 6\\ -9 & 9 & -16 & 9\\ 5 & -6 & 9 & -4 \end{bmatrix} \end{equation*}
Solution

If \(\lambda=2\) is an eigenvalue of \(A\text{,}\) the matrix \(A-2I_4\) will be singular, and its null space will be the eigenspace of \(A\) for \(\lambda=2\text{.}\) So we form this matrix and row-reduce,

\begin{equation*} A-2I_4= \begin{bmatrix} 16 & -15 & 33 & -15\\ -4 & 6 & -6 & 6\\ -9 & 9 & -18 & 9\\ 5 & -6 & 9 & -6 \end{bmatrix} \rref \begin{bmatrix} \leading{1} & 0 & 3 & 0\\ 0 & \leading{1} & 1 & 1\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 \end{bmatrix}\text{.} \end{equation*}

With two free variables, we know a basis of the null space (Theorem BNS) will contain two vectors. Thus the null space of \(A-2I_4\) has dimension two, and so the eigenspace of \(A\) for \(\lambda=2\) has dimension two also (Theorem EMNS), \(\geomult{A}{2}=2\text{.}\)
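
If you prefer to let Sage serve as the calculator for the row reduction, a minimal sketch using the entries of \(A\) above is:

    A = matrix(QQ, [[18, -15, 33, -15], [-4, 8, -6, 6], [-9, 9, -16, 9], [5, -6, 9, -4]])
    (A - 2*identity_matrix(QQ, 4)).rref()
    (A - 2*identity_matrix(QQ, 4)).right_kernel().dimension()    # 2, the geometric multiplicity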

C22.

Without using a calculator, find the eigenvalues of the matrix \(B\text{.}\)

\begin{equation*} B= \begin{bmatrix} 2 & -1 \\ 1 & 1 \end{bmatrix} \end{equation*}
C27.

For the matrix

\begin{align*} A = \begin{bmatrix} 0 & 4 & -1 & 1\\ -2 & 6 & -1 & 1\\ -2 & 8 & -1 & -1\\ -2 & 8 & -3 & 1 \end{bmatrix}, \end{align*}

find the eigenvalues and corresponding eigenspaces of \(A\text{.}\)

Solution

Following Theorem EMHE, we choose a simple nonzero vector, with no particular justification.

\begin{equation*} \vect{x}=\colvector{1\\0\\1\\0} \end{equation*}

The products \(A^i\vect{x}\) are

\begin{align*} A^0\vect{x} &= \colvector{1 \\ 0 \\ 1 \\ 0} & A^1\vect{x} &= \colvector{-1 \\ -3 \\ -3 \\ -5} & A^2\vect{x} &= \colvector{-14 \\ -18 \\ -14 \\ -18} & A^3\vect{x} &= \colvector{-76 \\ -84 \\ -84 \\ -92} & A^4\vect{x} &= \colvector{-344 \\ -360 \\ -344 \\ -360}\text{.} \end{align*}

We know the full set of five vectors is linearly dependent, but in fact the first four vectors already admit a nontrivial relation of linear dependence:

\begin{equation*} 16A^0\vect{x} - 4A^1\vect{x} - 4A^2\vect{x} + A^3\vect{x} = \zerovector\text{.} \end{equation*}

So candidate eigenvalues are roots of \(p(x)=x^3-4x^2-4x+16=(x-2)(x+2)(x-4)\text{,}\) namely \(\lambda = -2\text{,}\) \(\lambda = 2\) and \(\lambda = 4\text{.}\) The eigenspaces are

\begin{align*} \eigenspace{A}{-2}&= \spn{\set{\colvector{0\\0\\1\\1}}}\\ \eigenspace{A}{2} &= \spn{\set{\colvector{1\\1\\2\\0},\colvector{3\\1\\0\\2}}}\\ \eigenspace{A}{4} &= \spn{\set{\colvector{1\\1\\1\\1}}}\text{.} \end{align*}
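
As a spot check, the basis vector listed for \(\lambda=4\) satisfies Definition EEM directly,

\begin{equation*} A\colvector{1\\1\\1\\1}=\colvector{0+4-1+1\\-2+6-1+1\\-2+8-1-1\\-2+8-3+1}=\colvector{4\\4\\4\\4}=4\colvector{1\\1\\1\\1}\text{.} \end{equation*}
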
M60.

Repeat Example CASE by choosing \(\vect{x}=\colvector{0\\8\\2\\1\\2}\) and then arrive at an eigenvalue and eigenvector of the matrix \(A\text{.}\)

Solution

Form the matrix \(C\) whose columns are \(\vect{x},\,A\vect{x},\,A^2\vect{x},\,A^3\vect{x},\,A^4\vect{x},\,A^5\vect{x}\) and row-reduce the matrix,

\begin{equation*} \begin{bmatrix} 0 & 6 & 32 & 102 & 320 & 966 \\ 8 & 10 & 24 & 58 & 168 & 490 \\ 2 & 12 & 50 & 156 & 482 & 1452 \\ 1 & -5 & -47 & -149 & -479 & -1445 \\ 2 & 12 & 50 & 156 & 482 & 1452 \end{bmatrix} \rref \begin{bmatrix} \leading{1} & 0 & 0 & -3 & -9 & -30 \\ 0 & \leading{1} & 0 & 1 & 0 & 1 \\ 0 & 0 & \leading{1} & 3 & 10 & 30 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}\text{.} \end{equation*}

The simplest possible relation of linear dependence on the columns of \(C\) comes from using scalars \(\alpha_4=1\) and \(\alpha_5=\alpha_6=0\) for the free variables in a solution to \(\homosystem{C}\text{.}\) The remainder of this solution is \(\alpha_1=3\text{,}\) \(\alpha_2=-1\text{,}\) \(\alpha_3=-3\text{.}\) This solution gives rise to the polynomial

\begin{equation*} p(x)=3-x-3x^2+x^3=(x-3)(x-1)(x+1) \end{equation*}

which then has the property that \(p(A)\vect{x}=\zerovector\text{.}\)

No matter how you choose to order the factors of \(p(x)\text{,}\) the value of \(k\) (in the language of Theorem EMHE and Example CASE) is \(k=2\text{.}\) For each of the three possibilities, we list the resulting eigenvector, \(\vect{z}\text{,}\) and the associated eigenvalue,

\begin{align*} \vect{z} &= (A-3I_5)(A-I_5)\vect{x}=\colvector{8\\8\\8\\-24\\8} & \lambda&=-1\\ \vect{z} &= (A-3I_5)(A+I_5)\vect{x}=\colvector{20\\-20\\20\\-40\\20} & \lambda&= 1\\ \vect{z} &= (A+I_5)(A-I_5)\vect{x} =\colvector{32\\16\\48\\-48\\48} & \lambda&= 3\text{.} \end{align*}

These three possibilities for your solution depend on which factor of \(p(x)\) is left out of the product; that missing factor determines the eigenvalue. Note that each of these eigenvectors could be simplified by an appropriate scalar multiple, but what we have shown here is the actual vector obtained by the product specified in the theorem.

T10.

A matrix \(A\) is idempotent if \(A^2=A\text{.}\) Show that the only possible eigenvalues of an idempotent matrix are \(\lambda=0\) and \(\lambda=1\text{.}\) Then give an example of a matrix that is idempotent and has both of these two values as eigenvalues.

Solution

Suppose that \(\lambda\) is an eigenvalue of \(A\text{.}\) Then there is an eigenvector \(\vect{x}\text{,}\) such that \(A\vect{x}=\lambda\vect{x}\text{.}\) We have,

\begin{align*} \lambda\vect{x}&=A\vect{x}&& \vect{x}\text{ eigenvector of }A\\ &=A^2\vect{x}&& A\text{ is idempotent}\\ &=A(A\vect{x})\\ &=A(\lambda\vect{x})&& \vect{x}\text{ eigenvector of }A\\ &=\lambda(A\vect{x})&& \knowl{./knowl/theorem-MMSMM.html}{\text{Theorem MMSMM}}\\ &=\lambda(\lambda\vect{x})&& \vect{x}\text{ eigenvector of }A\\ &=\lambda^2\vect{x}\text{.} \end{align*}

From this we get

\begin{align*} \zerovector&=\lambda^2\vect{x}-\lambda\vect{x}\\ &=(\lambda^2-\lambda)\vect{x}&& \knowl{./knowl/property-DSAC.html}{\text{Property DSAC}}\text{.} \end{align*}

Since \(\vect{x}\) is an eigenvector, it is nonzero, and Theorem SMEZV leaves us with the conclusion that \(\lambda^2-\lambda=0\text{,}\) and the solutions to this quadratic polynomial equation in \(\lambda\) are \(\lambda=0\) and \(\lambda=1\text{.}\)

The matrix

\begin{equation*} \begin{bmatrix} 1&0\\0&0 \end{bmatrix} \end{equation*}

is idempotent (check this!) and since it is a diagonal matrix, its eigenvalues are the diagonal entries, \(\lambda=0\) and \(\lambda=1\text{,}\) so each of these possible values for an eigenvalue of an idempotent matrix actually occurs as an eigenvalue of some idempotent matrix. Thus we cannot state any stronger conclusion about the eigenvalues of an idempotent matrix, and this result is the “best possible.”
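
If you would like an electronic confirmation as well, a minimal Sage check is:

    B = matrix(QQ, [[1, 0], [0, 0]])
    B^2 == B           # True, so B is idempotent
    B.eigenvalues()    # both 0 and 1 appear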

T20.

Suppose that \(\lambda\) and \(\rho\) are two different eigenvalues of the square matrix \(A\text{.}\) Prove that the intersection of the eigenspaces for these two eigenvalues is trivial. That is, \(\eigenspace{A}{\lambda}\cap\eigenspace{A}{\rho}=\set{\zerovector}\text{.}\)

Solution

This problem asks you to prove that two sets are equal, so use Definition SE.

First show that \(\set{\zerovector}\subseteq\eigenspace{A}{\lambda}\cap\eigenspace{A}{\rho}\text{.}\) Choose \(\vect{x}\in\set{\zerovector}\text{.}\) Then \(\vect{x}=\zerovector\text{.}\) Eigenspaces are subspaces (Theorem EMS), so both \(\eigenspace{A}{\lambda}\) and \(\eigenspace{A}{\rho}\) contain the zero vector, and therefore \(\vect{x}\in\eigenspace{A}{\lambda}\cap\eigenspace{A}{\rho}\) (Definition SI).

To show that \(\eigenspace{A}{\lambda}\cap\eigenspace{A}{\rho}\subseteq\set{\zerovector}\text{,}\) suppose that \(\vect{x}\in\eigenspace{A}{\lambda}\cap\eigenspace{A}{\rho}\text{.}\) Then \(\vect{x}\) is an eigenvector of \(A\) for both \(\lambda\) and \(\rho\) (Definition SI) and so

\begin{align*} \vect{x} &=1\vect{x}&& \knowl{./knowl/property-O.html}{\text{Property O}}\\ &=\frac{1}{\lambda-\rho}\left(\lambda-\rho\right)\vect{x}&&\lambda\neq\rho,\ \lambda-\rho\neq 0\\ &=\frac{1}{\lambda-\rho}\left(\lambda\vect{x}-\rho\vect{x}\right)&& \knowl{./knowl/property-DSAC.html}{\text{Property DSAC}}\\ &=\frac{1}{\lambda-\rho}\left(A\vect{x}-A\vect{x}\right)&& \vect{x}\text{ eigenvector of }A\text{ for }\lambda,\,\rho\\ &=\frac{1}{\lambda-\rho}\left(\zerovector\right)\\ &=\zerovector&& \knowl{./knowl/theorem-ZVSM.html}{\text{Theorem ZVSM}}\text{.} \end{align*}

So \(\vect{x}=\zerovector\text{,}\) and trivially, \(\vect{x}\in\set{\zerovector}\text{.}\)

T40.

The proof of Theorem ECM turns on determining when \(C - \lambda I_n\) is nonsingular. Provide another proof that determines when this matrix is nonsingular by determining when its columns are linearly independent, and then applying Theorem NMLIC.

Solution

Denote the polynomial of a companion matrix \(C\) by \(p(x)=x^n + a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + \cdots + a_1x + a_0\text{.}\) Suppose further that \(\scalarlist{c}{n}\) are the scalars for a relation of linear dependence (trivial, or not) on the columns of \(C - \lambda I_n\text{.}\) We will determine scalar equations from the entries of the vector equality that is the relation of linear dependence. From the first entry we have

\begin{equation*} 0 = c_1(-\lambda) + c_n(-a_0) = -c_1\lambda - c_na_0\text{.} \end{equation*}

From entries \(2\leq i\leq n-1\) we have

\begin{equation*} 0 = c_{i-1} + c_i(-\lambda) + c_n(-a_{i-1}) = c_{i-1} - c_i\lambda - c_na_{i-1}\text{.} \end{equation*}

Finally, from the last entry we have

\begin{equation*} 0 = c_{n-1} + c_n(-a_{n-1} -\lambda) = c_{n-1} - c_na_{n-1} - c_n\lambda\text{.} \end{equation*}

Notice that the last equation rearranges to match the previous equation for the case \(i=n\text{,}\) so only the first equation is exceptional. Later, we will need to recognize that if we set a value for \(c_n\text{,}\) then the equation with \(i=n\) will determine a value for \(c_{n-1}\text{.}\) Then the equation with \(i=n-1\) will determine a value for \(c_{n-2}\text{.}\) Continuing this way, using the equations down through the second one, all of the scalars are determined by the value of \(c_n\text{.}\) The resulting values may, or may not, make the first equation true and provide a relation of linear dependence. But notice that if \(c_n = 0\text{,}\) then all the other \(c_i\) are determined to be \(c_i=0\text{,}\) \(1\leq i\leq n\text{,}\) and the first equation is also satisfied, providing a trivial relation of linear dependence.

When is the first equation true? Let's investigate the consequences of the first equation, by repeatedly massaging the expression with the other equations, consistently replacing \(c_{i-1}\) in favor of \(c_i\) and \(c_n\text{.}\)

\begin{align*} 0 &= -c_1\lambda - c_na_0\\ &= -(c_2\lambda + c_na_{1})\lambda - c_na_0&&i=2\\ &= -c_2\lambda^2 - c_n\left(a_{1}\lambda + a_0\right)\\ &= -(c_3\lambda + c_na_{2})\lambda^2 - c_n\left(a_{1}\lambda + a_0\right)&&i=3\\ &= -c_3\lambda^3 - c_n\left(a_{2}\lambda^2 + a_{1}\lambda + a_0\right)\\ &\quad\vdots\\ &= -(c_{n-1}\lambda + c_na_{n-2})\lambda^{n-2} - c_n\left(a_{n-3}\lambda^{n-3} + \dots + a_{1}\lambda + a_0\right)&&i=n-1\\ &= -c_{n-1}\lambda^{n-1} - c_n\left(a_{n-2}\lambda^{n-2} + a_{n-3}\lambda^{n-3} + \dots + a_{1}\lambda + a_0\right)\\ &= -(c_{n}\lambda + c_na_{n-1})\lambda^{n-1} - c_n\left(a_{n-2}\lambda^{n-2} + a_{n-3}\lambda^{n-3} + \dots + a_{1}\lambda + a_0\right)&&i=n\\ &= -c_{n}\lambda^{n} - c_na_{n-1}\lambda^{n-1} - c_n\left(a_{n-2}\lambda^{n-2} + a_{n-3}\lambda^{n-3} + \dots + a_{1}\lambda + a_0\right)\\ &= - c_n\left(\lambda^{n} + a_{n-1}\lambda^{n-1} + a_{n-2}\lambda^{n-2} + a_{n-3}\lambda^{n-3} + \dots + a_{1}\lambda + a_0\right)\\ &= - c_np(\lambda)\text{.} \end{align*}

Suppose \(\lambda\) is not a root of \(p(x)\text{.}\) Then the previous equation implies that \(c_n=0\text{,}\) and as observed earlier all the scalars in the relation of linear dependence must be zero. By Definition LI, the columns of \(C - \lambda I_n\) are linearly independent, so by Theorem NMLIC the matrix is nonsingular, and Theorem ESM says \(\lambda\) is not an eigenvalue of \(C\text{.}\)

For the converse suppose \(\lambda\) is a root of \(p(x)\text{,}\) so \(p(\lambda)=0\) and we can make the first equation true with any choice of \(c_n\text{,}\) and particularly, the consensus nonzero choice of \(c_n=1\text{.}\) This leads to a nontrivial relation of linear dependence on the columns of \(C - \lambda I_n\text{,}\) so by Theorem NMLIC the matrix is singular, and Theorem ESM says \(\lambda\) is an eigenvalue of \(C\text{.}\)

T50.

In Example ESM4 we found the eigenvalues and eigenvectors of a matrix of size \(4\text{.}\) This guided exercise will generalize that problem, so comparing the more symbolic expressions here with the more numerical expressions in the example can be helpful.

Define a size \(n\) matrix \(C\) with \(n\) nonzero entries specified as \(\scalarlist{c}{n}\)

\begin{equation*} C = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 & c_n \\ c_1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & c_2 & 0 & \cdots & 0 & 0 \\ 0 & 0 & c_3 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 0 & \cdots & c_{n-1} & 0 \end{bmatrix}\text{.} \end{equation*}
(a)

Find a general expression for \(C^k\vect{e}_1\) for \(0\leq k\leq n\text{.}\) Do not compute a general expression for the powers of the matrix. Each item requested is a column vector, and can be found from the previous vector with a matrix-vector product (Definition MVP). Do use induction, Proof Technique I. The case of \(k=n\) is exceptional.

Solution

Our solution for each part of this problem does not include every last detail. In particular, it is still an exercise to use the right notation with the right indices to create a clean proof.

The nonzero entries of \(C\text{,}\) excepting the entry in column \(n\text{,}\) can be described as \(\matrixentry{C}{i+1,i}=c_i\text{.}\) Then the vector \(C^k\vect{e}_1\) is a vector with a single nonzero entry, in entry \(k+1\text{,}\) that equals \(c_1c_2\cdots c_{k}\text{,}\) where the empty product (the case \(k=0\)) is understood to be \(1\text{.}\)

With the right notation and Definition MVP, a proof by Proof Technique I will suffice for \(1\leq k\leq n-1\text{.}\) For \(k=n\text{,}\) the single nonzero entry is in the first entry and equals \(c_1c_2\cdots c_n\text{.}\)

(b)

Prove that the first \(n\) vectors in the previous part are linearly independent.

Solution

The pattern of zero entries and the single nonzero entry makes this relatively straightforward.

(c)

The \(n+1\) vectors from the first part must be linearly dependent. Find a relation of linear dependence on these vectors and in the style of Example ESM4 find a polynomial whose roots might be eigenvalues.

Solution

Since the first \(n\) powers are linearly independent, a relation of linear dependence must use a nonzero multiple of the vector \(C^n\vect{e}_1\text{.}\)

\begin{equation*} -(c_1c_2\cdots c_n)C^0\vect{e}_1 + 1C^n\vect{e}_1 = \zerovector \end{equation*}

So the polynomial is \(p(x) = x^n - c_1c_2\cdots c_n\text{.}\)

(d)

Let \(\rho\) be a root of the polynomial from the previous part. Explain why \(\rho\) is nonzero. Then prove that \(\rho\) is an eigenvalue by demonstrating that \(\vect{x}\) is an eigenvector of \(C\) for \(\rho\text{.}\)

\begin{equation*} \vect{x}=\colvector{ \rho^{-1} \\ c_1\rho^{-2} \\ c_1c_2\rho^{-3} \\ \vdots \\ c_1c_2\cdots c_{n-1}\rho^{-n} } \end{equation*}
Solution

Again, getting the notation correct is key. We check that \(C\vect{x}=\rho\vect{x}\) in accordance with Definition EEM. For \(2\leq i\leq n\)

\begin{align*} \vectorentry{C\vect{x}}{i} & = \sum_{k=1}^n \matrixentry{C}{ik}\vectorentry{\vect{x}}{k} = \matrixentry{C}{i,i-1} \vectorentry{\vect{x}}{i-1} = c_{i-1}\left(c_1c_2\cdots c_{i-2}\rho^{-(i-1)}\right)\\ & = c_1c_2\cdots c_{i-1}\rho^{-i+1} = \rho\left(c_1c_2\cdots c_{i-1}\rho^{-i}\right) = \rho\vectorentry{\vect{x}}{i} = \vectorentry{\rho\vect{x}}{i} \end{align*}

so, along with a separate argument for \(i=1\text{,}\) Definition CVE gives \(C\vect{x}=\rho\vect{x}\text{.}\)

Note that \(p(x) = x^n - c_1c_2\cdots c_n\) has \(n\) distinct (complex) roots, which will be relevant for the upcoming Theorem EDELI and Theorem DED.
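
A small numerical instance of this construction can be reassuring. Here \(n\) and the \(c_i\) are chosen purely for convenience, so that \(\rho\) may be taken to be a rational root.

    # n = 3 and c_1 = c_2 = c_3 = 2, so p(x) = x^3 - 8 and rho = 2 is a root
    c = [2, 2, 2]
    n = len(c)
    C = matrix(QQ, n, n, lambda i, j: c[j] if i == (j + 1) % n else 0)
    rho = 2
    x = vector(QQ, [prod(c[:k]) / rho^(k + 1) for k in range(n)])
    C*x == rho*x    # True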

T60.

The proof of Theorem ECM finds a matrix in reduced row-echelon form that is row-equivalent to \(C - \lambda I_n\) for a companion matrix \(C\text{.}\) So if \(\lambda\) is an eigenvalue, this matrix can be used to find the eigenspace of \(C\) for \(\lambda\text{.}\) Study the comments right after Example ECM5 to help you form a spanning set for \(\eigenspace{C}{\lambda}\text{,}\) and prove that it is correct.

Solution

The proof of Theorem ECM shows that \(C-\lambda I_n\) has nullity one, so it is enough to fashion a single eigenvector of \(C\) for \(\lambda\) in order to have a spanning set for the eigenspace. Let \(n\) be the size of \(C\) and the degree of \(p(x)\text{.}\)

To this end, since \(\lambda\) is an eigenvalue, and thus a root of \(p(x)\text{,}\) the expression \(x-\lambda\) is a factor of \(p(x)\text{.}\) Define a new polynomial of degree \(n-1\) by

\begin{align*} p(x)&=x^n + a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + \cdots + a_1x + a_0\\ &=\left(x-\lambda\right)\left(x^{n-1} + b_{n-2}x^{n-2} + b_{n-3}x^{n-3} + \cdots + b_1x + b_0\right)\text{.} \end{align*}

Expanding this final expression and equating coefficients of the two monic polynomials of degree \(n\text{,}\) we have the scalar equations

\begin{align*} a_0 &= -\lambda b_0 & a_i&=b_{i-1} - \lambda b_i, \quad 1\leq i\leq n-1 \end{align*}

where it is convenient to set \(b_{n-1}=1\text{,}\) the leading coefficient of the second factor.

We will use the coefficients of this new polynomial to fashion a putative (nonzero) eigenvector of \(C\) for \(\lambda\text{,}\)

\begin{equation*} \vect{x}=\colvector{b_0\\b_1\\b_2\\\vdots\\b_{n-2}\\1}\text{.} \end{equation*}

It remains to verify that \(\vect{x}\) is indeed an eigenvector. We compute,

\begin{align*} \vectorentry{C\vect{x}}{1} &= -a_0 = \lambda b_0 = \vectorentry{\lambda\vect{x}}{1}\\ \vectorentry{C\vect{x}}{i} &= b_{i-2} - a_{i-1} = b_{i-2} - \left(b_{i-2} - \lambda b_{i-1}\right) = \lambda b_{i-1} = \vectorentry{\lambda\vect{x}}{i},\quad 2\leq i\leq n \end{align*}

establishing via Definition CVE that \(C\vect{x}=\lambda\vect{x}\) and so \(\vect{x}\) is an eigenvector of \(C\) for \(\lambda\text{.}\)