
Section 3.2 Nilpotent Linear Transformations

We will discover that nilpotent linear transformations are the essential obstacle in a non-diagonalizable linear transformation. So we will study them carefully, both as an object of inherent mathematical interest and as the object at the heart of the argument that leads to a pleasing canonical form for any linear transformation. Once we understand these linear transformations thoroughly, we will be able to easily analyze the structure of any linear transformation.

Subsection 3.2.1 Nilpotent Linear Transformations

Definition 3.2.1. Nilpotent Linear Transformation.

Suppose that \(\ltdefn{T}{V}{V}\) is a linear transformation such that there is an integer \(p\gt 0\) such that \(\lteval{T^p}{\vect{v}}=\zerovector\) for every \(\vect{v}\in V\text{.}\) The smallest \(p\) for which this condition is met is called the index of \(T\text{.}\)

Of course, the linear transformation \(T\) defined by \(\lteval{T}{\vect{v}}=\zerovector\) will qualify as nilpotent of index \(1\text{.}\) But are there others? Yes, of course.

Recall that our definitions and theorems are being stated for linear transformations on abstract vector spaces, while our examples will work with square matrices (and use the same terms interchangeably). In this case, to demonstrate the existence of nontrivial nilpotent linear transformations, we desire a matrix such that some power of the matrix is the zero matrix. Consider powers of a \(6\times 6\) matrix \(A\text{,}\)

\begin{align*} A&=\begin{bmatrix} -3 & 3 & -2 & 5 & 0 & -5 \\ -3 & 5 & -3 & 4 & 3 & -9 \\ -3 & 4 & -2 & 6 & -4 & -3 \\ -3 & 3 & -2 & 5 & 0 & -5 \\ -3 & 3 & -2 & 4 & 2 & -6 \\ -2 & 3 & -2 & 2 & 4 & -7 \end{bmatrix}\\ \end{align*}

and compute powers of \(A\text{,}\)

\begin{align*} A^2&=\begin{bmatrix} 1 & -2 & 1 & 0 & -3 & 4 \\ 0 & -2 & 1 & 1 & -3 & 4 \\ 3 & 0 & 0 & -3 & 0 & 0 \\ 1 & -2 & 1 & 0 & -3 & 4 \\ 0 & -2 & 1 & 1 & -3 & 4 \\ -1 & -2 & 1 & 2 & -3 & 4 \end{bmatrix}\\ A^3&=\begin{bmatrix} 1 & 0 & 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & -1 & 0 & 0 \end{bmatrix}\\ A^4&=\begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \end{align*}

Thus we can say that \(A\) is nilpotent of index 4.
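If you would like to reproduce this computation, here is a minimal sketch using SymPy (our choice of tool, not one used by the text) with exact integer arithmetic; the matrix is just \(A\) entered by hand.

```python
from sympy import Matrix, zeros

# The matrix A from above, entered row by row.
A = Matrix(6, 6, [-3, 3, -2, 5, 0, -5,   -3, 5, -3, 4, 3, -9,
                  -3, 4, -2, 6, -4, -3,  -3, 3, -2, 5, 0, -5,
                  -3, 3, -2, 4, 2, -6,   -2, 3, -2, 2, 4, -7])

print(A**3 == zeros(6, 6))  # False: the third power is not yet zero
print(A**4 == zeros(6, 6))  # True: so A is nilpotent of index 4
```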

Because it will presage some upcoming theorems, we will record some extra information about the eigenvalues and eigenvectors of \(A\) here. \(A\) has just one eigenvalue, \(\lambda=0\text{,}\) with algebraic multiplicity \(6\) and geometric multiplicity \(2\text{.}\) The eigenspace for this eigenvalue is

\begin{equation*} \eigenspace{A}{0}= \spn{ \colvector{2 \\ 2 \\ 5 \\ 2 \\ 1 \\ 0},\, \colvector{-1 \\ -1 \\ -5 \\ -1 \\ 0 \\ 1} } \end{equation*}

If there were degrees of singularity, we might say this matrix was very singular, since zero is an eigenvalue with maximum algebraic multiplicity (Theorem SMZE, Theorem ME). Notice too that \(A\) is “far” from being diagonalizable (Theorem DMFE).
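The multiplicities reported here can be confirmed in the same way; a short sketch, again assuming SymPy and the hand-entered matrix.

```python
from sympy import Matrix, symbols

A = Matrix(6, 6, [-3, 3, -2, 5, 0, -5,   -3, 5, -3, 4, 3, -9,
                  -3, 4, -2, 6, -4, -3,  -3, 3, -2, 5, 0, -5,
                  -3, 3, -2, 4, 2, -6,   -2, 3, -2, 2, 4, -7])

x = symbols('x')
print(A.charpoly(x).as_expr())  # x**6: zero is the only eigenvalue, algebraic multiplicity 6
print(len(A.nullspace()))       # 2: the geometric multiplicity of the eigenvalue zero
```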

With the existence of nontrivial nilpotent matrices settled, let's look at another example.

Consider the matrix

\begin{align*} B&= \begin{bmatrix} -1 & 1 & -1 & 4 & -3 & -1 \\ 1 & 1 & -1 & 2 & -3 & -1 \\ -9 & 10 & -5 & 9 & 5 & -15 \\ -1 & 1 & -1 & 4 & -3 & -1 \\ 1 & -1 & 0 & 2 & -4 & 2 \\ 4 & -3 & 1 & -1 & -5 & 5 \end{bmatrix}\\ \end{align*}

and compute the second power of \(B\text{,}\)

\begin{align*} B^2&=\begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \end{align*}

So \(B\) is nilpotent of index 2.

Again, the only eigenvalue of \(B\) is zero, with algebraic multiplicity \(6\text{.}\) The geometric multiplicity of the eigenvalue is \(3\text{,}\) as seen in the eigenspace,

\begin{equation*} \eigenspace{B}{0}=\spn{ \colvector{1 \\ 3 \\ 6 \\ 1 \\ 0 \\ 0},\, \colvector{0 \\ -4 \\ -7 \\ 0 \\ 1 \\ 0},\, \colvector{0 \\ 2 \\ 1 \\ 0 \\ 0 \\ 1} } \end{equation*}

Again, Theorem DMFE tells us that \(B\) is far from being diagonalizable.

On a first encounter with the definition of a nilpotent matrix, you might wonder if such a thing were possible at all. That a high power of a nonzero object could be zero is so different from our experience with scalars that it seems quite unnatural. Hopefully the two previous examples were somewhat surprising. But we have seen that matrix algebra does not always behave the way we expect (Example MMNC), and we also now recognize matrix products not just as arithmetic, but as function composition (Theorem MRCLT). With a couple of examples completed, we turn to some general properties.

Let \(\vect{x}\) be an eigenvector of \(T\) for the eigenvalue \(\lambda\text{,}\) and suppose that \(T\) is nilpotent with index \(p\text{.}\) Applying \(T\) repeatedly to the relation \(\lteval{T}{\vect{x}}=\lambda\vect{x}\) gives \(\lteval{T^p}{\vect{x}}=\lambda^p\vect{x}\text{,}\) so

\begin{equation*} \zerovector=\lteval{T^p}{\vect{x}}=\lambda^p\vect{x} \end{equation*}

Because \(\vect{x}\) is an eigenvector, it is nonzero, and therefore Theorem SMEZV tells us that \(\lambda^p=0\) and so \(\lambda=0\text{.}\)

Paraphrasing, all of the eigenvalues of a nilpotent linear transformation are zero. So in particular, the characteristic polynomial of a nilpotent linear transformation, \(T\text{,}\) on a vector space of dimension \(n\text{,}\) is simply \(\charpoly{T}{x}=(x-0)^n=x^n\text{.}\)

The next theorem is not critical for what follows, but it will explain our interest in nilpotent linear transformations. More specifically, it is the first step in backing up the assertion that nilpotent linear transformations are the essential obstacle in a non-diagonalizable linear transformation. While it is not obvious from the statement of the theorem, it says that a nilpotent linear transformation is not diagonalizable, unless it is trivially so.

(⇐) We start with the easy direction. Let \(n=\dimension{V}\text{.}\) The linear transformation \(\ltdefn{Z}{V}{V}\) defined by \(\lteval{Z}{\vect{v}}=\zerovector\) for all \(\vect{v}\in V\) is nilpotent of index \(p=1\) and a matrix representation relative to any basis of \(V\) is the \(n\times n\) zero matrix, \(\zeromatrix\text{.}\) Quite obviously, the zero matrix is a diagonal matrix (Definition DIM) and hence \(Z\) is diagonalizable (Definition DZM).

(⇒) Assume now that \(T\) is diagonalizable, so \(\geomult{T}{\lambda}=\algmult{T}{\lambda}\) for every eigenvalue \(\lambda\) (Theorem DMFE). By Theorem 3.2.4, \(T\) has only one eigenvalue (zero), which therefore must have algebraic multiplicity \(n\) (Theorem NEM). So the geometric multiplicity of zero will be \(n\) as well, \(\geomult{T}{0}=n\text{.}\)

Let \(B\) be a basis for the eigenspace \(\eigenspace{T}{0}\text{.}\) Then \(B\) is a linearly independent subset of \(V\) of size \(n\text{,}\) and thus a basis of \(V\text{.}\) For any \(\vect{x}\in B\) we have

\begin{equation*} \lteval{T}{\vect{x}}=0\vect{x}=\zerovector \end{equation*}

So \(T\) is identically zero on the basis \(B\) of \(V\text{,}\) and since the action of a linear transformation on a basis determines all of the values of the linear transformation (Theorem LTDB), it is easy to see that \(\lteval{T}{\vect{v}}=\zerovector\) for every \(\vect{v}\in V\text{.}\)

So, other than one trivial case (the zero linear transformation), every nilpotent linear transformation is not diagonalizable. It remains to see what is so “essential” about this broad class of non-diagonalizable linear transformations.
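For matrices, this dichotomy is easy to observe computationally. Here is a small sketch, assuming SymPy's is_diagonalizable method, comparing a nonzero nilpotent matrix with the zero matrix.

```python
from sympy import Matrix, zeros

N = Matrix([[0, 1], [0, 0]])   # nonzero and nilpotent of index 2
Z = zeros(2, 2)                # the zero matrix, nilpotent of index 1

print(N.is_diagonalizable())   # False: a nonzero nilpotent matrix is not diagonalizable
print(Z.is_diagonalizable())   # True: the one trivial case
```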

Subsection 3.2.2 Powers of Kernels of Nilpotent Linear Transformations

We return to our discussion of kernels of powers of linear transformations, now specializing to nilpotent linear transformations. We reprise Theorem 3.1.1, gaining just a little more precision in the conclusion.

Since \(T^p=0\) it follows that \(T^{p+j}=0\) for all \(j\geq 0\) and thus \(\krn{T^{p+j}}=V\) for \(j\geq 0\text{.}\) So the value of \(m\) guaranteed by Theorem 3.1.1 is at most \(p\text{.}\) The only remaining aspect of our conclusion that does not follow from Theorem 3.1.1 is that \(m=p\text{.}\) To see this, we must show that \(\krn{T^k} \subsetneq\krn{T^{k+1}}\) for \(0\leq k\leq p-1\text{.}\) If \(\krn{T^k}=\krn{T^{k+1}}\) for some \(k\lt p\text{,}\) then \(\krn{T^k}=\krn{T^p}=V\text{.}\) This implies that \(T^k=0\text{,}\) violating the fact that \(T\) has index \(p\text{.}\) So the smallest value of \(m\) is indeed \(p\text{,}\) and we learn that \(p\leq n\text{.}\)

The structure of the kernels of powers of nilpotent linear transformations will be crucial to what follows. But immediately we can see a practical benefit. Suppose we are confronted with the question of whether or not an \(n\times n\) matrix, \(A\text{,}\) is nilpotent. If we don't quickly find a low power that equals the zero matrix, when do we stop trying higher and higher powers? Theorem 3.2.6 gives us the answer: if we don't see a zero matrix by the time we finish computing \(A^n\text{,}\) then it is never going to happen. We will now take a look at one example of Theorem 3.2.6 in action.
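This stopping rule translates directly into a procedure. The sketch below is ours (the helper name nilpotency_index is not from the text); it computes successive powers and gives up once it passes the \(n\)th power, as Theorem 3.2.6 allows.

```python
from sympy import Matrix, eye, zeros

def nilpotency_index(A):
    """Return the index of the square matrix A if it is nilpotent, else None."""
    n = A.rows
    P = eye(n)
    for k in range(1, n + 1):
        P = P * A                 # P is now A^k
        if P == zeros(n, n):
            return k              # the first power to vanish is the index
    return None                   # no power up to A^n vanished, so A is not nilpotent

# A quick test on a small matrix that is nilpotent of index 2.
print(nilpotency_index(Matrix([[0, 1], [0, 0]])))  # 2
```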

We will recycle the nilpotent matrix \(A\) of index 4 from Example 3.2.2. We now know that if the matrix had not been nilpotent, we would have needed to examine only the first 6 powers of \(A\) to discover that. We list bases for the null spaces of the powers of \(A\text{.}\) (Notice how we are using null spaces for matrices interchangeably with kernels of linear transformations; see Theorem KNSI for justification.)

\begin{align*} \nsp{A}&=\nsp{ \begin{bmatrix} -3 & 3 & -2 & 5 & 0 & -5 \\ -3 & 5 & -3 & 4 & 3 & -9 \\ -3 & 4 & -2 & 6 & -4 & -3 \\ -3 & 3 & -2 & 5 & 0 & -5 \\ -3 & 3 & -2 & 4 & 2 & -6 \\ -2 & 3 & -2 & 2 & 4 & -7 \end{bmatrix}} =\spn{\set{ \colvector{2 \\ 2 \\ 5 \\ 2 \\ 1 \\ 0},\, \colvector{-1 \\ -1 \\ -5 \\ -1 \\ 0 \\ 1} }}\\ \nsp{A^2}&=\nsp{ \begin{bmatrix} 1 & -2 & 1 & 0 & -3 & 4 \\ 0 & -2 & 1 & 1 & -3 & 4 \\ 3 & 0 & 0 & -3 & 0 & 0 \\ 1 & -2 & 1 & 0 & -3 & 4 \\ 0 & -2 & 1 & 1 & -3 & 4 \\ -1 & -2 & 1 & 2 & -3 & 4 \end{bmatrix}} =\spn{\set{ \colvector{0 \\ 1 \\ 2 \\ 0 \\ 0 \\ 0},\, \colvector{2 \\ 1 \\ 0 \\ 2 \\ 0 \\ 0},\, \colvector{0 \\ -3 \\ 0 \\ 0 \\ 2 \\ 0},\, \colvector{0 \\ 2 \\ 0 \\ 0 \\ 0 \\ 1} }}\\ \nsp{A^3}&= \nsp{ \begin{bmatrix} 1 & 0 & 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & -1 & 0 & 0 \end{bmatrix}} =\spn{\set{ \colvector{0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0},\, \colvector{0 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0},\, \colvector{1 \\ 0 \\ 0 \\ 1 \\ 0 \\ 0},\, \colvector{0 \\ 0 \\ 0 \\ 0 \\ 1 \\ 0},\, \colvector{0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1} }}\\ \nsp{A^4}&= \nsp{ \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}} =\spn{\set{ \colvector{1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0},\, \colvector{0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0},\, \colvector{0 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0},\, \colvector{0 \\ 0 \\ 0 \\ 1 \\ 0 \\ 0},\, \colvector{0 \\ 0 \\ 0 \\ 0 \\ 1 \\ 0},\, \colvector{0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1} }} \end{align*}

With the exception of some convenience scaling of the basis vectors in \(\nsp{A^2}\text{,}\) these are exactly the basis vectors described in Theorem BNS. We can see that the dimension of \(\nsp{A}\) equals the geometric multiplicity of the zero eigenvalue. Why is this not an accident? We can see the dimensions of the kernels consistently increasing, and we can see that \(\nsp{A^4}=\complex{6}\text{.}\) But Theorem 3.2.6 says a little more. Each successive kernel should be a superset of the previous one. We ought to be able to begin with a basis of \(\nsp{A}\) and extend it to a basis of \(\nsp{A^2}\text{.}\) Then we should be able to extend a basis of \(\nsp{A^2}\) into a basis of \(\nsp{A^3}\text{,}\) all with repeated applications of Theorem ELIS. Verify the following,

\begin{align*} \nsp{A}&= \spn{\set{ \colvector{2 \\ 2 \\ 5 \\ 2 \\ 1 \\ 0},\, \colvector{-1 \\ -1 \\ -5 \\ -1 \\ 0 \\ 1} }}\\ \nsp{A^2}&=\spn{\set{ \colvector{2 \\ 2 \\ 5 \\ 2 \\ 1 \\ 0},\, \colvector{-1 \\ -1 \\ -5 \\ -1 \\ 0 \\ 1},\, \colvector{0 \\ -3 \\ 0 \\ 0 \\ 2 \\ 0},\, \colvector{0 \\ 2 \\ 0 \\ 0 \\ 0 \\ 1} }}\\ \nsp{A^3}&= \spn{\set{ \colvector{2 \\ 2 \\ 5 \\ 2 \\ 1 \\ 0},\, \colvector{-1 \\ -1 \\ -5 \\ -1 \\ 0 \\ 1},\, \colvector{0 \\ -3 \\ 0 \\ 0 \\ 2 \\ 0},\, \colvector{0 \\ 2 \\ 0 \\ 0 \\ 0 \\ 1},\, \colvector{0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1} }}\\ \nsp{A^4}&= \spn{\set{ \colvector{2 \\ 2 \\ 5 \\ 2 \\ 1 \\ 0},\, \colvector{-1 \\ -1 \\ -5 \\ -1 \\ 0 \\ 1},\, \colvector{0 \\ -3 \\ 0 \\ 0 \\ 2 \\ 0},\, \colvector{0 \\ 2 \\ 0 \\ 0 \\ 0 \\ 1},\, \colvector{0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1},\, \colvector{0 \\ 0 \\ 0 \\ 1 \\ 0 \\ 0} }} \end{align*}

Do not be concerned at the moment about how these bases were constructed since we are not describing the applications of Theorem ELIS here. Do verify carefully for each alleged basis that, (1) it is a superset of the basis for the previous kernel, (2) the basis vectors really are members of the kernel of the associated power of \(A\text{,}\) (3) the basis is a linearly independent set, (4) the size of the basis is equal to the size of the basis found previously for each kernel. With these verifications, you will know that we have successfully demonstrated what Theorem 3.2.6 guarantees.
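If you would rather let a machine do the bookkeeping, the sketch below (assuming SymPy, with the vectors entered exactly as listed above) carries out verifications (2), (3) and (4); verification (1) holds by construction, since each claimed basis is a prefix of the final list of six vectors.

```python
from sympy import Matrix, zeros

A = Matrix(6, 6, [-3, 3, -2, 5, 0, -5,   -3, 5, -3, 4, 3, -9,
                  -3, 4, -2, 6, -4, -3,  -3, 3, -2, 5, 0, -5,
                  -3, 3, -2, 4, 2, -6,   -2, 3, -2, 2, 4, -7])

# The six vectors, in the order they are appended to the chain of bases above.
vecs = [Matrix([2, 2, 5, 2, 1, 0]), Matrix([-1, -1, -5, -1, 0, 1]),
        Matrix([0, -3, 0, 0, 2, 0]), Matrix([0, 2, 0, 0, 0, 1]),
        Matrix([0, 0, 0, 0, 0, 1]),  Matrix([0, 0, 0, 1, 0, 0])]

# The claimed basis of N(A^k) is the first sizes[k-1] vectors of the list.
sizes = [2, 4, 5, 6]
for k, size in enumerate(sizes, start=1):
    M = Matrix.hstack(*vecs[:size])              # basis vectors as columns
    assert (A**k) * M == zeros(6, size)          # (2) each vector lies in N(A^k)
    assert M.rank() == size                      # (3) the set is linearly independent
    assert size == 6 - (A**k).rank()             # (4) the size equals the nullity of A^k
print("all verifications pass")
```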

Subsection 3.2.3 Restrictions to Generalized Eigenspaces

We have seen that we can decompose the domain of a linear transformation into a direct sum of generalized eigenspaces (Theorem 3.1.10). And we know that we can then easily obtain a basis that leads to a block diagonal matrix representation. The blocks of this matrix representation are matrix representations of restrictions to the generalized eigenspaces (for example, Example 3.1.12). And the next theorem tells us that these restrictions, adjusted slightly, provide us with a broad class of nilpotent linear transformations.

Notice first that every subspace of \(V\) is invariant with respect to \(I_V\text{,}\) so \(I_{\geneigenspace{T}{\lambda}}=\restrict{I_V}{\geneigenspace{T}{\lambda}}\text{.}\) Let \(n=\dimension{V}\) and choose \(\vect{v}\in\geneigenspace{T}{\lambda}\text{.}\) Then with an application of Theorem 3.1.6,

\begin{equation*} \lteval{\left(\restrict{T}{\geneigenspace{T}{\lambda}}-\lambda I_{\geneigenspace{T}{\lambda}}\right)^n}{\vect{v}} =\lteval{\left(T-\lambda I_V\right)^n}{\vect{v}} =\zerovector \end{equation*}

So by Definition NLT, \(\restrict{T}{\geneigenspace{T}{\lambda}}-\lambda I_{\geneigenspace{T}{\lambda}}\) is nilpotent.

The proof of Theorem 3.2.8 shows that the index of the linear transformation \(\restrict{T}{\geneigenspace{T}{\lambda}}-\lambda I_{\geneigenspace{T}{\lambda}}\) is less than or equal to the dimension of \(V\text{.}\) In fact, it must be less than or equal to the dimension of the domain, \(\geneigenspace{T}{\lambda}\text{.}\) In any event, the exact value of this index will be of some interest, so we define it now. Notice that this is a property of the eigenvalue \(\lambda\text{.}\) In many ways it is similar to the algebraic and geometric multiplicities of an eigenvalue (Definition AME, Definition GME).

Definition 3.2.9. Index of an Eigenvalue.

Suppose \(\ltdefn{T}{V}{V}\) is a linear transformation with eigenvalue \(\lambda\text{.}\) Then the index of \(\lambda\text{,}\) \(\indx{T}{\lambda}\text{,}\) is the index of the nilpotent linear transformation \(\restrict{T}{\geneigenspace{T}{\lambda}}-\lambda I_{\geneigenspace{T}{\lambda}}\text{.}\)

In Example 3.1.9 we computed the generalized eigenspaces of the linear transformation \(\ltdefn{S}{\complex{6}}{\complex{6}}\) defined by \(\lteval{S}{\vect{x}}=B\vect{x}\) where

\begin{equation*} B=\begin{bmatrix} 2 & -4 & 25 & -54 & 90 & -37 \\ 2 & -3 & 4 & -16 & 26 & -8 \\ 2 & -3 & 4 & -15 & 24 & -7 \\ 10 & -18 & 6 & -36 & 51 & -2 \\ 8 & -14 & 0 & -21 & 28 & 4 \\ 5 & -7 & -6 & -7 & 8 & 7 \end{bmatrix} \end{equation*}

The generalized eigenspace \(\geneigenspace{S}{3}\) has dimension \(2\text{,}\) while \(\geneigenspace{S}{-1}\) has dimension \(4\text{.}\) We will investigate each thoroughly in turn, with the intent being to illustrate Theorem 3.2.8. Many of our computations will be repeats of those done in Example 3.1.12.

For \(U=\geneigenspace{S}{3}\) we compute a matrix representation of \(\restrict{S}{U}\) using the basis found in Example 3.1.9,

\begin{equation*} D=\set{\vect{u}_1,\,\vect{u}_2}=\set{\colvector{4\\1\\1\\2\\1\\0},\,\colvector{-5\\-1\\-1\\-1\\0\\1}} \end{equation*}

Since \(D\) has size 2, we obtain a \(2\times 2\) matrix representation from

\begin{align*} \vectrep{D}{\lteval{\restrict{S}{U}}{\vect{u}_1}} &=\vectrep{D}{\colvector{11\\3\\3\\7\\4\\1}} =\vectrep{D}{4\vect{u}_1+\vect{u}_2} =\colvector{4\\1}\\ \vectrep{D}{\lteval{\restrict{S}{U}}{\vect{u}_2}} &=\vectrep{D}{\colvector{-14\\-3\\-3\\-4\\-1\\2}} =\vectrep{D}{(-1)\vect{u}_1+2\vect{u}_2} =\colvector{-1\\2} \end{align*}

Thus

\begin{equation*} M=\matrixrep{\restrict{S}{U}}{U}{U}=\begin{bmatrix} 4 & -1 \\ 1 & 2 \end{bmatrix} \end{equation*}

Now we can illustrate Theorem 3.2.8 with powers of the matrix representation (rather than the restriction itself),

\begin{align*} M-3I_2&= \begin{bmatrix}1 & -1 \\ 1 & -1\end{bmatrix}\\ \left(M-3I_2\right)^2&= \begin{bmatrix}0 & 0 \\ 0 & 0\end{bmatrix} \end{align*}

So \(M-3I_2\) is a nilpotent matrix of index 2 (meaning that \(\restrict{S}{U}-3I_U\) is a nilpotent linear transformation of index 2) and according to Definition 3.2.9 we say \(\indx{S}{3}=2\text{.}\)

For \(W=\geneigenspace{S}{-1}\) we compute a matrix representation of \(\restrict{S}{W}\) using the basis found in Example 3.1.9,

\begin{equation*} E=\set{\vect{w}_1,\,\vect{w}_2,\,\vect{w}_3,\,\vect{w}_4} =\set{ \colvector{5\\3\\1\\0\\0\\0},\, \colvector{-2\\-3\\0\\1\\0\\0},\, \colvector{4\\5\\0\\0\\1\\0},\, \colvector{-5\\-3\\0\\0\\0\\1} } \end{equation*}

Since \(E\) has size 4, we obtain a \(4\times 4\) matrix representation (Definition MR) from

\begin{align*} \vectrep{E}{\lteval{\restrict{S}{W}}{\vect{w}_1}} &=\vectrep{E}{\colvector{23\\5\\5\\2\\-2\\-2}} =\vectrep{E}{ 5\vect{w}_1+ 2\vect{w}_2+ (-2)\vect{w}_3+ (-2)\vect{w}_4 } =\colvector{5\\2\\-2\\-2}\\ \vectrep{E}{\lteval{\restrict{S}{W}}{\vect{w}_2}} &=\vectrep{E}{\colvector{-46\\-11\\-10\\-2\\5\\4}} =\vectrep{E}{ (-10)\vect{w}_1+ (-2)\vect{w}_2+ 5\vect{w}_3+ 4\vect{w}_4 } =\colvector{-10\\-2\\5\\4}\\ \vectrep{E}{\lteval{\restrict{S}{W}}{\vect{w}_3}} &=\vectrep{E}{\colvector{78\\19\\17\\1\\-10\\-7}} =\vectrep{E}{ 17\vect{w}_1+ \vect{w}_2+ (-10)\vect{w}_3+ (-7)\vect{w}_4 } =\colvector{17\\1\\-10\\-7}\\ \vectrep{E}{\lteval{\restrict{S}{W}}{\vect{w}_4}} &=\vectrep{E}{\colvector{-35\\-9\\-8\\2\\6\\3}} =\vectrep{E}{ (-8)\vect{w}_1+ 2\vect{w}_2+ 6\vect{w}_3+ 3\vect{w}_4 } =\colvector{-8\\2\\6\\3} \end{align*}

Thus

\begin{equation*} N=\matrixrep{\restrict{S}{W}}{W}{W} = \begin{bmatrix} 5 & -10 & 17 & -8 \\ 2 & -2 & 1 & 2 \\ -2 & 5 & -10 & 6 \\ -2 & 4 & -7 & 3 \end{bmatrix} \end{equation*}

Now we can illustrate Theorem 3.2.8 with powers of the matrix representation (rather than the restriction itself),

\begin{align*} N-(-1)I_4&=\begin{bmatrix} 6 & -10 & 17 & -8 \\ 2 & -1 & 1 & 2 \\ -2 & 5 & -9 & 6 \\ -2 & 4 & -7 & 4 \end{bmatrix}\\ \left(N-(-1)I_4\right)^2&=\begin{bmatrix} -2 & 3 & -5 & 2 \\ 4 & -6 & 10 & -4 \\ 4 & -6 & 10 & -4 \\ 2 & -3 & 5 & -2 \end{bmatrix}\\ \left(N-(-1)I_4\right)^3&=\begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \end{align*}

So \(N-(-1)I_4\) is a nilpotent matrix of index 3 (meaning that \(\restrict{S}{W}-(-1)I_W\) is a nilpotent linear transformation of index 3) and according to Definition 3.2.9 we say \(\indx{S}{-1}=3\text{.}\)
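The same computation can be organized as a small procedure. In the sketch below (the helper name eigenvalue_index is ours), the index of an eigenvalue \(\lambda\) is found as the first power \(k\) at which the kernel of \(\left(B-\lambda I\right)^{k}\) stops growing.

```python
from sympy import Matrix, eye

B = Matrix(6, 6, [2, -4, 25, -54, 90, -37,   2, -3, 4, -16, 26, -8,
                  2, -3, 4, -15, 24, -7,     10, -18, 6, -36, 51, -2,
                  8, -14, 0, -21, 28, 4,     5, -7, -6, -7, 8, 7])

def eigenvalue_index(B, lam):
    """Index of lam: the first k where the kernel of (B - lam*I)^k stops growing.

    Assumes lam really is an eigenvalue of the square matrix B.
    """
    n = B.rows
    C = B - lam * eye(n)
    for k in range(1, n + 1):
        if (C**k).rank() == (C**(k + 1)).rank():   # equal ranks mean equal kernels
            return k

print(eigenvalue_index(B, 3))    # 2, matching the computation with M - 3I_2 above
print(eigenvalue_index(B, -1))   # 3, matching the computation with N - (-1)I_4
```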

Notice that if we were to take the union of the two bases of the generalized eigenspaces, we would have a basis for \(\complex{6}\text{.}\) Then a matrix representation of \(S\) relative to this basis would be the same block diagonal matrix we found in Example 3.1.12, except that we now understand each of these blocks as being very close to a nilpotent matrix.

Subsection 3.2.4 Jordan Blocks

We conclude this section about nilpotent linear transformations with an infinite family of nilpotent matrices and a doubly-infinite family of nearly nilpotent matrices.

Definition 3.2.11. Jordan Block.

Given the scalar \(\lambda\in\complexes\text{,}\) the Jordan block \(\jordan{n}{\lambda}\) is the \(n\times n\) matrix defined by

\begin{equation*} \matrixentry{\jordan{n}{\lambda}}{ij}=\begin{cases} \lambda & i=j\\ 1 & j=i+1\\ 0 & \text{otherwise} \end{cases} \end{equation*}

A simple example of a Jordan block,

\begin{equation*} \jordan{4}{5}=\begin{bmatrix} 5 & 1 & 0 & 0\\ 0 & 5 & 1 & 0\\ 0 & 0 & 5 & 1\\ 0 & 0 & 0 & 5 \end{bmatrix} \end{equation*}
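Jordan blocks are simple enough to generate directly from the definition. A minimal sketch, assuming SymPy; the helper name jordan_block is ours.

```python
from sympy import Matrix

def jordan_block(n, lam):
    """The n x n Jordan block J_n(lam): lam on the diagonal, ones on the superdiagonal."""
    return Matrix(n, n, lambda i, j: lam if i == j else (1 if j == i + 1 else 0))

print(jordan_block(4, 5))   # reproduces the matrix J_4(5) displayed above
```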

We will return to general Jordan blocks later, but in this section we are only interested in Jordan blocks where \(\lambda=0\text{.}\) (But notice that \(\jordan{n}{\lambda}-\lambda I_n=\jordan{n}{0}\text{.}\)) Here is an example of why we are specializing in the \(\lambda=0\) case now.

Consider

\begin{align*} \jordan{5}{0}&=\begin{bmatrix} 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}\\ \end{align*}

and compute powers,

\begin{align*} \left(\jordan{5}{0}\right)^2&=\begin{bmatrix} 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}\\ \left(\jordan{5}{0}\right)^3&=\begin{bmatrix} 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}\\ \left(\jordan{5}{0}\right)^4&=\begin{bmatrix} 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}\\ \left(\jordan{5}{0}\right)^5&=\begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \end{align*}

So \(\jordan{5}{0}\) is nilpotent of index \(5\text{.}\) As before, we record some information about the eigenvalues and eigenvectors of this matrix. The only eigenvalue is zero, with algebraic multiplicity 5, the maximum possible (Theorem ME). The geometric multiplicity of this eigenvalue is just 1, the minimum possible (Theorem ME), as seen in the eigenspace,

\begin{equation*} \eigenspace{\jordan{5}{0}}{0}=\spn{\colvector{1 \\ 0 \\ 0 \\ 0 \\ 0}} \end{equation*}

There should not be any real surprises in this example. We can watch the ones in the powers of \(\jordan{5}{0}\) slowly march off to the upper-right hand corner of the powers. Or we can watch the columns of the identity matrix march right, falling off the edge as they go. In some vague way, the eigenvalues and eigenvectors of this matrix are equally extreme.
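A quick computational check of this behavior, assuming SymPy, with \(\jordan{5}{0}\) built directly from Definition 3.2.11.

```python
from sympy import Matrix, zeros

# J_5(0): ones on the superdiagonal, zeros elsewhere.
J = Matrix(5, 5, lambda i, j: 1 if j == i + 1 else 0)

for k in range(1, 6):
    print(k, J**k == zeros(5, 5))   # False for k = 1, 2, 3, 4 and True for k = 5
print(len(J.nullspace()))           # 1: the geometric multiplicity of the eigenvalue zero
```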

We can form combinations of Jordan blocks to build a variety of nilpotent matrices. Simply create a block diagonal matrix, where each block is a Jordan block.

Consider the matrix

\begin{align*} C&=\begin{bmatrix} \jordan{3}{0} & \zeromatrix & \zeromatrix \\ \zeromatrix & \jordan{3}{0} & \zeromatrix \\ \zeromatrix & \zeromatrix & \jordan{2}{0} \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}\\ \end{align*}

and compute powers,

\begin{align*} C^2&=\begin{bmatrix} 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}\\ C^3&=\begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \end{align*}

So \(C\) is nilpotent of index 3. You should notice how block diagonal matrices behave in products (much like diagonal matrices) and that it was the largest Jordan block that determined the index of this combination. All eight eigenvalues are zero, and each of the three Jordan blocks contributes one eigenvector to a basis for the eigenspace, resulting in zero having a geometric multiplicity of 3.
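Here is a sketch of this kind of construction, assuming SymPy; it rebuilds the matrix \(C\) above from Jordan blocks, then reads off the index and the geometric multiplicity of the zero eigenvalue.

```python
from sympy import Matrix, diag, zeros

def jordan_block(n, lam=0):
    """The n x n Jordan block J_n(lam), built from the definition."""
    return Matrix(n, n, lambda i, j: lam if i == j else (1 if j == i + 1 else 0))

# Block diagonal matrix with Jordan blocks J_3(0), J_3(0), J_2(0) on the diagonal.
C = diag(jordan_block(3), jordan_block(3), jordan_block(2))

print(C**2 == zeros(8, 8), C**3 == zeros(8, 8))  # False True: the index is 3, the largest block size
print(len(C.nullspace()))                        # 3: one eigenvector per Jordan block
```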

Since nilpotent matrices only have zero as an eigenvalue (Theorem 3.2.4), the algebraic multiplicity will be the maximum possible. However, by creating block diagonal matrices with Jordan blocks on the diagonal you should be able to attain any desired geometric multiplicity for this lone eigenvalue. Likewise, the size of the largest Jordan block employed will determine the index of the matrix. So nilpotent matrices with various combinations of index, geometric multiplicity and algebraic multiplicity are easy to manufacture. The predictable properties of block diagonal matrices in matrix products and eigenvector computations, along with the next theorem, make this possible. You might find Example NJB5 a useful companion to this proof.

We need to establish that a specific matrix is nilpotent of a specified index. The first column of \(\jordan{n}{0}\) is the zero vector, and the remaining \(n-1\) columns are the standard unit vectors \(\vect{e}_i\text{,}\) \(1\leq i\leq n-1\) (Definition SUV), which are also the first \(n-1\) columns of the size \(n\) identity matrix \(I_n\text{.}\) As shorthand, write \(J=\jordan{n}{0}\text{.}\)

\begin{equation*} J=\left[\zerovector\left|\vect{e}_1\right.\left|\vect{e}_2\right.\left|\vect{e}_3\right.\left|\dots\right.\left|\vect{e}_{n-1}\right.\right] \end{equation*}

We will use the definition of matrix multiplication (Definition MM), together with a proof by induction, to study the powers of \(J\text{.}\) Our claim is that

\begin{equation*} J^k= \left[\zerovector\left|\zerovector\right.\left|\dots\right.\left|\zerovector\right.\left|\vect{e}_1\right.\left|\vect{e}_2\right.\left|\dots\right.\left|\vect{e}_{n-k}\right.\right]\text{ for }0\leq k\leq n \end{equation*}

For the base case, \(k=0\text{,}\) the definition \(J^0=I_n\) establishes the claim.

For the induction step, first note that \(J\vect{e}_1=\zerovector\) and \(J\vect{e}_i=\vect{e}_{i-1}\) for \(2\leq i\leq n\text{.}\) Then, assuming the claim is true for \(k\text{,}\) we examine the \(k+1\) case,

\begin{align*} J^{k+1}&=JJ^k\\ &=J\left[\zerovector\left|\zerovector\right.\left|\dots\right.\left|\zerovector\right.\left|\vect{e}_1\right.\left|\vect{e}_2\right.\left|\dots\right.\left|\vect{e}_{n-k}\right.\right]\\ &=\left[J\zerovector\left|J\zerovector\right.\left|\dots\right.\left|J\zerovector\right.\left|J\vect{e}_1\right.\left|J\vect{e}_2\right.\left|\dots\right.\left|J\vect{e}_{n-k}\right.\right]\\ &=\left[\zerovector\left|\zerovector\right.\left|\dots\right.\left|\zerovector\right.\left|\zerovector\right.\left|\vect{e}_1\right.\left|\vect{e}_2\right.\left|\dots\right.\left|\vect{e}_{n-k-1}\right.\right]\\ &=\left[\zerovector\left|\zerovector\right.\left|\dots\right.\left|\zerovector\right.\left|\vect{e}_1\right.\left|\vect{e}_2\right.\left|\dots\right.\left|\vect{e}_{n-(k+1)}\right.\right] \end{align*}

This concludes the induction.

So \(J^k\) has a nonzero entry (a one) in row \(n-k\) and column \(n\text{,}\) for \(0\leq k\leq n-1\text{,}\) and is therefore a nonzero matrix. However,

\begin{equation*} J^n=\left[\zerovector\left|\zerovector\right.\left|\dots\right.\left|\zerovector\right.\right]=\zeromatrix \end{equation*}

Thus, by Definition 3.2.1, \(J\) is nilpotent of index \(n\text{.}\)