
Section IS Invariant Subspaces

We have already seen examples of invariant subspaces though we have not described them as such, and we will soon see that they are a critical concept for subsequent ideas about eigenvalues and eigenvectors.

Subsection IS Invariant Subspaces

Definition IS. Invariant Subspace.

Suppose \(A\) is a square matrix of size \(n\text{.}\) Then a subspace \(U\subseteq\complex{n}\) is an invariant subspace for \(A\) if \(A\vect{u}\in U\) for every \(\vect{u}\in U\text{.}\)

We do not have any special notation for an invariant subspace, so it is important to recognize that an invariant subspace is always defined relative to a particular matrix. When the matrix is clear from context, we may simply say \(U\) is an invariant subspace, and when it is not, we may say that \(U\) is \(A\)-invariant.

If \(\vect{x}\in\eigenspace{A}{\lambda}\) then \(A\vect{x}=\lambda\vect{x}\text{,}\) and since an eigenspace is closed under scalar multiplication, \(A\vect{x}=\lambda\vect{x}\in\eigenspace{A}{\lambda}\text{.}\) Thus, \(\eigenspace{A}{\lambda}\) is an \(A\)-invariant subspace, and this is part of our motivation for considering invariant subspaces now. Other general examples include the column space and null space of a square matrix (see Exercise IS.T10 and Exercise IS.T11).

As is our habit, we begin with an example that demonstrates the existence of invariant subspaces, while leaving other questions unanswered for the moment. We will return later to understand how this example was constructed, but for now, just understand how we check the existence of the invariant subspaces.

Example TIS.

Let

\begin{equation*} A=\begin{bmatrix} -8 & 6 & -15 & 9 \\ -8 & 14 & -10 & 18 \\ 1 & 1 & 3 & 0 \\ 3 & -8 & 2 & -11 \end{bmatrix} \end{equation*}

Define (with zero motivation),

\begin{align*} \vect{w}_1&=\colvector{-7\\-2\\3\\0} & \vect{w}_2&=\colvector{-1\\-2\\0\\1} \end{align*}

and set \(W=\spn{\set{\vect{w}_1,\,\vect{w}_2}}\text{.}\) We verify that \(W\) is an \(A\)-invariant subspace. By the definition of \(W\text{,}\) any vector chosen from \(W\) can be written as a linear combination of \(\vect{w}_1\) and \(\vect{w}_2\text{.}\) Suppose that \(\vect{w}\in W\text{,}\) and then check the details of the following verification,

\begin{align*} A\vect{w}&=A\left(a_1\vect{w}_1+a_2\vect{w}_2\right)&&\knowl{./knowl/definition-SS.html}{\text{Definition SS}}\\ &=a_1A\vect{w}_1+a_2A\vect{w}_2&&\knowl{./knowl/theorem-LTLC.html}{\text{Theorem LTLC}}\\ &=a_1\colvector{-1\\-2\\0\\1}+a_2\colvector{5\\-2\\-3\\2}\\ &=a_1\vect{w}_2+a_2\left((-1)\vect{w}_1+2\vect{w}_2\right)\\ &=(-a_2)\vect{w}_1+(a_1+2a_2)\vect{w}_2\in W \end{align*}

So, by Definition IS, \(W\) is an invariant subspace for \(A\text{.}\)

In an entirely similar manner we construct another invariant subspace for \(A\text{.}\) With zero motivation, define

\begin{align*} \vect{x}_1&=\colvector{-3\\-1\\1\\0} & \vect{x}_2&=\colvector{0\\-1\\0\\1} \end{align*}

and set \(X=\spn{\set{\vect{x}_1,\,\vect{x}_2}}\text{.}\) We verify that \(X\) is an invariant subspace for \(A\text{.}\) By the definition of \(X\text{,}\) any vector chosen from \(X\) can be written as a linear combination of \(\vect{x}_1\) and \(\vect{x}_2\text{.}\) Suppose that \(\vect{x}\in X\text{,}\) and then check the details of the following verification,

\begin{align*} A\vect{x}&=A\left(b_1\vect{x}_1+b_2\vect{x}_2\right)&&\knowl{./knowl/definition-SS.html}{\text{Definition SS}}\\ &=b_1A\vect{x}_1+b_2A\vect{x}_2&&\knowl{./knowl/theorem-LTLC.html}{\text{Theorem LTLC}}\\ &=b_1\colvector{3\\0\\-1\\1}+b_2\colvector{3\\4\\-1\\-3}\\ &=b_1\left((-1)\vect{x}_1+\vect{x}_2\right)+b_2\left((-1)\vect{x}_1+(-3)\vect{x}_2\right)\\ &=(-b_1-b_2)\vect{x}_1+(b_1-3b_2)\vect{x}_2\in X \end{align*}

So, by Definition IS, \(X\) is an invariant subspace for \(A\text{.}\)

There is a bit of magic in each of these verifications where the two matrix-vector products just happen to equal linear combinations of the two vectors. But this is the essential nature of an invariant subspace. We will have a peek under the hood at the end of the section in Example TGE, and it will not look so magical after all.
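For readers who want a machine check of these verifications, here is a minimal sketch using SymPy (our choice of tool; any exact-arithmetic system would do, and the helper in_span below is ours, not standard notation). It confirms that \(A\) carries each basis vector of \(W\) and of \(X\) back into the corresponding span, which by the calculation above is enough to establish invariance.

from sympy import Matrix

A = Matrix([[-8, 6, -15, 9],
            [-8, 14, -10, 18],
            [1, 1, 3, 0],
            [3, -8, 2, -11]])
w1 = Matrix([-7, -2, 3, 0]); w2 = Matrix([-1, -2, 0, 1])
x1 = Matrix([-3, -1, 1, 0]); x2 = Matrix([0, -1, 0, 1])

def in_span(basis, v):
    # v lies in the span of the basis vectors exactly when appending v
    # as a new column does not increase the rank
    B = Matrix.hstack(*basis)
    return B.rank() == Matrix.hstack(B, v).rank()

# W = <w1, w2> and X = <x1, x2> are A-invariant: A maps each basis vector back in
print(all(in_span([w1, w2], A * v) for v in (w1, w2)))   # True
print(all(in_span([x1, x2], A * v) for v in (x1, x2)))   # True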

Subsection NSP Null Spaces of Powers

Given a square matrix \(A\text{,}\) its null space is always an invariant subspace, since if we choose \(\vect{z}\in\nsp{A}\) then \(A\vect{z}=\zerovector\in\nsp{A}\text{.}\) More compactly, we can always say: \(\nsp{A}\) is an \(A\)-invariant subspace. Perhaps we should record this as a theorem, but we have a much more general result.

That more general result is Theorem NSPIS: for any polynomial \(p(x)\text{,}\) the null space \(\nsp{p(A)}\) is an invariant subspace for \(A\text{.}\) To see this, suppose that \(\vect{z}\in\nsp{p(A)}\text{.}\) Observe that, trivially, \(A = q(A)\) for the polynomial \(q(x) = x\text{.}\) Then, with an application of Theorem PMC, we have

\begin{equation*} p(A)\left(A\vect{z}\right) = A\left(p(A)\vect{z}\right) = A\zerovector = \zerovector \end{equation*}

which establishes that \(A\vect{z}\in\nsp{p(A)}\text{,}\) and so \(\nsp{p(A)}\) is an invariant subspace for \(A\) by Definition IS.

If \(\lambda\) is an eigenvalue of \(A\text{,}\) then applying this result with \(p(x)=x-\lambda\) shows that \(\nsp{A-\lambda I_n} = \eigenspace{A}{\lambda}\) is an \(A\)-invariant subspace, an observation we already made at the start of this section.
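As a quick computational illustration of Theorem NSPIS (a sketch of ours, using SymPy; the choice of polynomial is arbitrary), we can form \(p(A)\) for \(p(x)=(x-1)(x+2)\) with the matrix \(A\) of Example TIS and confirm that \(A\) carries a basis of \(\nsp{p(A)}\) back into \(\nsp{p(A)}\text{.}\)

from sympy import Matrix, eye, zeros

A = Matrix([[-8, 6, -15, 9],
            [-8, 14, -10, 18],
            [1, 1, 3, 0],
            [3, -8, 2, -11]])
pA = (A - eye(4)) * (A + 2 * eye(4))      # p(A) for p(x) = (x - 1)(x + 2)

basis = pA.nullspace()                    # a basis of N(p(A))
print(all(pA * (A * z) == zeros(4, 1) for z in basis))   # True: A*z stays in N(p(A))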

Often we will use a special case of Theorem NSPIS with \(p(x)=x^k\) to see that \(\nsp{A^k}\) is an invariant subspace for \(A\text{.}\) These invariant subspaces have even more structure. Theorem NSPM says that for a square matrix \(A\) of size \(n\) the null spaces of its powers are nested, \(\set{\zerovector}=\nsp{A^0}\subseteq\nsp{A^1}\subseteq\nsp{A^2}\subseteq\cdots\text{,}\) and that there is an integer \(m\leq n\) such that \(\nsp{A^k}=\nsp{A^m}\) for all \(k\geq m\text{.}\)

There are several items to verify in the conclusion as stated. First, we show that \(\nsp{A^k}\subseteq\nsp{A^{k+1}}\) for any \(k\text{.}\) Choose \(\vect{z}\in\nsp{A^k}\text{.}\) Then

\begin{equation*} A^{k+1}\vect{z} = A\left(A^k\vect{z}\right) = A\zerovector = \zerovector \end{equation*}

so \(\vect{z}\in\nsp{A^{k+1}}\text{,}\) and by Definition SSET, \(\nsp{A^k}\subseteq\nsp{A^{k+1}}\text{.}\)

Second, we demonstrate the existence of a power \(m\) where consecutive powers result in equal null spaces, and as a by-product we will obtain the condition \(m\leq n\text{.}\) Suppose to the contrary that every inclusion is strict, so that

\begin{equation*} \set{\zerovector} = \nsp{A^0} \subsetneq\nsp{A^1} \subsetneq\nsp{A^2} \subsetneq\cdots \subsetneq\nsp{A^{n-1}} \subsetneq\nsp{A^n} \subsetneq\nsp{A^{n+1}} \subsetneq\cdots\text{.} \end{equation*}

Since \(\nsp{A^k}\subsetneq\nsp{A^{k+1}}\text{,}\) Theorem PSSD implies that the dimensions of the null spaces (nullities) are related by \(\nullity{A^{k+1}}\geq\nullity{A^k}+1\text{.}\) Repeated application of this observation yields

\begin{equation*} \nullity{A^{n+1}} \geq\nullity{A^n}+1 \geq\nullity{A^{n-1}}+2 \geq\nullity{A^{n-2}}+3 \geq\cdots \geq\nullity{A^{0}}+(n+1) =n+1\text{.} \end{equation*}

As \(\nsp{A^{n+1}}\) is a subspace of \(\complex{n}\text{,}\) which has dimension \(n\text{,}\) its dimension cannot exceed \(n\text{,}\) and we have a contradiction.

The contradiction yields the existence of some integer \(k\leq n\) such that \(\nsp{A^k}=\nsp{A^{k+1}}\text{,}\) so we can define \(m\) to be the smallest such integer with this property. Thus \(m\leq k\leq n\text{.}\)

It remains to show that once two consecutive null spaces are equal, then all of the remaining null spaces are equal. More formally, if \(\nsp{A^m}=\nsp{A^{m+1}}\text{,}\) then \(\nsp{A^m}=\nsp{A^{m+j}}\) for all \(j\geq 1\text{.}\) The proof is by induction on \(j\text{.}\) The base case (\(j=1\)) is precisely our defining property for \(m\text{.}\)

For the induction step, our hypothesis is that \(\nsp{A^m}=\nsp{A^{m+j}}\text{.}\) We want to establish that \(\nsp{A^m}=\nsp{A^{m+j+1}}\text{.}\) At the outset of this proof we showed that consecutive null spaces are nested, so repeated application gives \(\nsp{A^m}\subseteq\nsp{A^{m+j+1}}\text{.}\) So we need only show the subset inclusion in the opposite direction. To wit, choose \(\vect{z}\in\nsp{A^{m+j+1}}\text{.}\) Then

\begin{equation*} A^{m+j}\left(A\vect{z}\right) = A^{m+j+1}\vect{z} = \zerovector \end{equation*}

so \(A\vect{z}\in\nsp{A^{m+j}}\) and thus, by the induction hypothesis, \(A\vect{z}\in\nsp{A^m}\text{.}\) Thus

\begin{equation*} A^{m+1}\vect{z} = A^{m}\left(A\vect{z}\right) = \zerovector \end{equation*}

so \(\vect{z}\in\nsp{A^{m+1}}=\nsp{A^m}\text{,}\) as desired.

So, summarizing informally, null spaces of powers of a matrix “grow bigger” as the powers increase, but at some power (\(m \leq n\)) they “top out” and “never ever get any bigger.”
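To watch this behavior numerically (a small sketch of ours, using SymPy), take \(B=A-I_4\) for the matrix \(A\) of Example TIS, where \(\lambda=1\) is an eigenvalue, and compute the nullities of successive powers of \(B\text{.}\)

from sympy import Matrix, eye

A = Matrix([[-8, 6, -15, 9],
            [-8, 14, -10, 18],
            [1, 1, 3, 0],
            [3, -8, 2, -11]])
B = A - eye(4)

# nullity of B^k for k = 0, 1, ..., 5: the values increase and then stabilize
print([4 - (B**k).rank() for k in range(6)])   # [0, 1, 2, 2, 2, 2], so m = 2 here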

Subsection GE Generalized Eigenspaces

Every eigenspace is a null space (Theorem EMNS), so by Theorem NSPIS, it will come as no surprise that eigenspaces are invariant subspaces. But we can also see this directly from Definition EEM. If \(\vect{x}\in\eigenspace{A}{\lambda}\text{,}\) then \(A\vect{x} = \lambda\vect{x} \in \eigenspace{A}{\lambda}\) by the closure of scalar multiplication (Property SC). But as before, we will not record this as a theorem, since we will define a more general version of an eigenspace.

To fully understand the relationships between matrices, eigenvalues, and eigenvectors, it is necessary to go beyond “plain” eigenvectors and study a set of generalized eigenvectors known as a generalized eigenspace.

Definition GES. Generalized Eigenspace.

Suppose \(A\) is a square matrix of size \(n\) with an eigenvalue \(\lambda\text{.}\) Define the generalized eigenspace of \(A\) for \(\lambda\) as

\begin{equation*} \geneigenspace{A}{\lambda}=\setparts{\vect{x}}{\left(A-\lambda I_n\right)^k\vect{x}=\zerovector\text{ for some }k\geq 0}\text{.} \end{equation*}

The nonzero vectors in \(\geneigenspace{A}{\lambda}\) are known as generalized eigenvectors.

The generalized eigenspace sounds like a subspace, but a suggestive name is not a proof, so this needs to be verified. That makes for a nice exercise (see Exercise IS.T20), though we will also get the result from a theorem that tells us even more. The generalized eigenspace is also an invariant subspace for \(A\text{.}\) That too is an exercise (see Exercise IS.T20), since our next theorem provides this result as well.

The “for some \(k\geq 0\)” clause of the definition suggests a never-ending search for generalized eigenvectors. When do we know that we have checked null spaces of sufficiently large powers of \(A - \lambda I_n\text{?}\) While this seemingly loose definition can make some proofs easier, when we want to actually compute a generalized eigenspace it will be better to have something more concrete and definite. Theorem GENS provides exactly that: the single power \(n\text{,}\) the size of the matrix, always suffices.

To establish the set equality \(\geneigenspace{A}{\lambda}=\nsp{\left(A-\lambda I_n\right)^n}\) of Theorem GENS, first suppose that \(\vect{x}\in\geneigenspace{A}{\lambda}\text{.}\) Then there is an integer \(k\) such that \(\left(A-\lambda I_n\right)^k\vect{x}=\zerovector\text{.}\) This is equivalent to the statement that \(\vect{x}\in\nsp{\left(A-\lambda I_n\right)^k}\text{.}\) No matter what the value of \(k\) is, Theorem NSPM gives

\begin{equation*} \vect{x}\in\nsp{\left(A-\lambda I_n\right)^k}\subseteq\nsp{\left(A-\lambda I_n\right)^n}. \end{equation*}

So, \(\geneigenspace{A}{\lambda}\subseteq\nsp{\left(A-\lambda I_n\right)^n}\text{.}\)

For the opposite inclusion, suppose \(\vect{y}\in\nsp{\left(A-\lambda I_n\right)^n}\text{.}\) Then \(\left(A-\lambda I_n\right)^n\vect{y}=\zerovector\text{,}\) so \(\vect{y}\in\geneigenspace{A}{\lambda}\) and thus \(\nsp{\left(A-\lambda I_n\right)^n}\subseteq\geneigenspace{A}{\lambda}\text{.}\) So by Definition SE we have the desired equality of sets.

As promised, the set equality of Theorem GENS gives us two interesting results almost for free.

First, Theorem GENS tells us that \(\geneigenspace{A}{\lambda}\) is a null space, and so by Theorem NSMS it is a subspace of \(\complex{n}\text{.}\)

Second, for the polynomial \(p(x) = (x-\lambda)^n\text{,}\) Theorem GENS translates to \(\geneigenspace{A}{\lambda} = \nsp{p(A)}\text{,}\) which by Theorem NSPIS is an invariant subspace for \(A\text{.}\) Together, these two observations are the content of Theorem GEIS: the generalized eigenspace \(\geneigenspace{A}{\lambda}\) is an \(A\)-invariant subspace of \(\complex{n}\text{.}\)
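Theorem GENS also gives a direct computational recipe: a generalized eigenspace is the null space of the \(n\)-th power of \(A-\lambda I_n\text{.}\) Here is a minimal sketch of ours (using SymPy, with a small \(2\times 2\) matrix chosen purely for illustration) where the generalized eigenspace is strictly larger than the traditional eigenspace.

from sympy import Matrix, eye

def generalized_eigenspace(A, lam):
    # basis of G_A(lambda) = N((A - lambda*I)^n), per Theorem GENS
    n = A.rows
    return ((A - lam * eye(n)) ** n).nullspace()

A = Matrix([[1, 1],
            [0, 1]])                         # lambda = 1 is the only eigenvalue
print(len((A - eye(2)).nullspace()))         # 1: the eigenspace E_A(1) is a line
print(len(generalized_eigenspace(A, 1)))     # 2: G_A(1) is all of C^2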

The generalized eigenspace will figure prominently in the remainder of this chapter, and so too will its dimension, which we now give a name. Together with the geometric multiplicity, the dimension of the eigenspace \(\eigenspace{A}{\lambda}\text{,}\) we then have two types of multiplicities. We will understand the choice of names as we learn more about them.

Definition AME. Algebraic Multiplicity of an Eigenvalue.

Suppose that \(A\) is a square matrix and \(\lambda\) is an eigenvalue of \(A\text{.}\) Then the algebraic multiplicity of \(\lambda\text{,}\) \(\algmult{A}{\lambda}\text{,}\) is the dimension of the generalized eigenspace \(\geneigenspace{A}{\lambda}\text{.}\)

As dimensions of nested subspaces, the two multiplicities have a simple relationship and overall bounds: \(1\leq\geomult{A}{\lambda}\leq\algmult{A}{\lambda}\leq n\text{.}\) We verify each piece in turn.

Since \(\lambda\) is an eigenvalue of \(A\text{,}\) there is an eigenvector \(\vect{x}\) of \(A\) for \(\lambda\text{.}\) Then the linearly independent set \(\set{\vect{x}}\) lies in \(\eigenspace{A}{\lambda}\) and can be extended to a basis of \(\eigenspace{A}{\lambda}\) (Theorem ELIS), so \(\geomult{A}{\lambda}\geq 1\text{.}\)

Every “plain” eigenvector meets the condition of Definition GES, so \(\eigenspace{A}{\lambda}\subseteq\geneigenspace{A}{\lambda}\text{.}\) Applying Theorem PSSD and the definitions of the two multiplicities, we have \(\geomult{A}{\lambda}\leq\algmult{A}{\lambda}\text{.}\)

The generalized eigenspace \(\geneigenspace{A}{\lambda}\) is a subspace of \(\complex{n}\) by Theorem GEIS. Since \(\complex{n}\) has dimension \(n\) (Theorem DCM), Theorem PSSD yields \(\algmult{A}{\lambda}\leq n\text{.}\)
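For a concrete check of these bounds (a sketch of ours, using SymPy), we can compute both multiplicities for the matrix \(A\) of Example TIS, whose eigenvalues are \(-2\) and \(1\text{.}\)

from sympy import Matrix, eye

A = Matrix([[-8, 6, -15, 9],
            [-8, 14, -10, 18],
            [1, 1, 3, 0],
            [3, -8, 2, -11]])
n = A.rows

for lam in (-2, 1):
    B = A - lam * eye(n)
    gamma = n - B.rank()          # geometric multiplicity: dimension of E_A(lambda)
    alpha = n - (B**n).rank()     # algebraic multiplicity: dimension of G_A(lambda)
    print(lam, gamma, alpha)      # (-2, 1, 2) and (1, 1, 2): so 1 <= gamma <= alpha <= 4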

Example TGE.

In Example TIS we presented two invariant subspaces of

\begin{equation*} A=\begin{bmatrix} -8 & 6 & -15 & 9 \\ -8 & 14 & -10 & 18 \\ 1 & 1 & 3 & 0 \\ 3 & -8 & 2 & -11 \end{bmatrix}\text{.} \end{equation*}

We show here that these two invariant subspaces are the generalized eigenspaces of the matrix, illustrating Theorem GEIS.

In Exercise IS.C10 you will determine that the eigenvalues of \(A\) are \(\lambda = -2\) and \(\lambda = 1\text{.}\) For both eigenvalues

\begin{equation*} \dimension{\nsp{\left(A-\lambda I_4\right)^2}} = \dimension{\nsp{\left(A-\lambda I_4\right)^3}} = \dimension{\nsp{\left(A-\lambda I_4\right)^4}} = 2 \end{equation*}

as can be seen by the reduced row-echelon form of \(\left(A-\lambda I_4\right)^i\) having two zero rows in each case. So the algebraic multiplicity of each eigenvalue is \(\algmult{A}{\lambda} = 2\text{.}\)
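These equal dimensions are easy to confirm by machine (a sketch of ours, using SymPy): the nullities of the second, third, and fourth powers of \(A-\lambda I_4\) agree for both eigenvalues.

from sympy import Matrix, eye

A = Matrix([[-8, 6, -15, 9],
            [-8, 14, -10, 18],
            [1, 1, 3, 0],
            [3, -8, 2, -11]])

for lam in (-2, 1):
    B = A - lam * eye(4)
    print(lam, [4 - (B**k).rank() for k in (2, 3, 4)])   # both lines report [2, 2, 2]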

We exhibit the reduced row-echelon forms necessary for determining the generalized eigenspaces, and use the names of the subspaces from Example TIS. (Note that we could use the second powers, but only after we had determined that the nullity of the third power was identical to the nullity of the second power.)

\begin{align*} \left(A-(-2)I_4\right)^4 &\rref \begin{bmatrix} \leading{1} & 0 & 3 & 0 \\ 0 & \leading{1} & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} & X = \geneigenspace{A}{-2} &= \spn{\set{\colvector{-3 \\ -1 \\ 1 \\ 0},\,\colvector{0 \\ -1 \\ 0 \\ 1}}}\\ \left(A-1I_4\right)^4 &\rref \begin{bmatrix} \leading{1} & 0 & \frac{7}{3} & 1 \\ 0 & \leading{1} & \frac{2}{3} & 2 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} & W = \geneigenspace{A}{1} &= \spn{\set{\colvector{-7 \\ -2 \\ 3 \\ 0},\,\colvector{-1 \\ -2 \\ 0 \\ 1}}}\text{.} \end{align*}

Can you express the traditional eigenvectors you found in Example TIS as linear combinations of the basis vectors found here for the generalized eigenspace?

Form the union of the two bases for \(X\) and \(W\) to create a set with four vectors. Prove that this set is a basis of \(\complex{4}\text{.}\) Hmmmmm. See Theorem GEB.
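A numerical spot-check of that last claim (a sketch of ours, using SymPy; it suggests, but does not replace, the requested proof): place the four basis vectors as the columns of a matrix and compute its rank.

from sympy import Matrix

cols = [Matrix([-3, -1, 1, 0]), Matrix([0, -1, 0, 1]),    # basis of X = G_A(-2)
        Matrix([-7, -2, 3, 0]), Matrix([-1, -2, 0, 1])]   # basis of W = G_A(1)
print(Matrix.hstack(*cols).rank())   # 4: the four vectors form a basis of C^4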

In Example ESMS3 we found eigenvalues of

\begin{equation*} F= \begin{bmatrix} -13 & -8 & -4\\ 12 & 7 & 4\\ 24 & 16 & 7 \end{bmatrix} \end{equation*}

to be \(\lambda=3\) and \(\lambda=-1\text{,}\) along with their traditional eigenspaces. Here we will compute their generalized eigenspaces. By Theorem GENS, we know we need to compute the null space of the third power of \(F-\lambda I_3\) for each eigenvalue.

We have

\begin{align*} \lambda&=3&\left(F-3I_3\right)^3&= \begin{bmatrix} -256 & -128 & -64 \\ 192 & 64 & 64 \\ 384 & 256 & 64 \end{bmatrix} \rref \begin{bmatrix} \leading{1} & 0 & \frac{1}{2}\\ 0 & \leading{1} & -\frac{1}{2}\\ 0 & 0 & 0 \end{bmatrix}\\ &&\geneigenspace{F}{3}&=\nsp{\left(F-3I_3\right)^3} =\spn{\set{\colvector{-\frac{1}{2}\\\frac{1}{2}\\1}}} =\spn{\set{\colvector{-1\\1\\2}}}\\ \lambda&=-1&\left(F+1I_3\right)^3&= \begin{bmatrix} -192 & -128 & -64 \\ 192 & 128 & 64 \\ 384 & 256 & 128 \end{bmatrix} \rref \begin{bmatrix} \leading{1} & \frac{2}{3} & \frac{1}{3}\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{bmatrix}\\ &&\geneigenspace{F}{-1}&=\nsp{\left(F+1I_3\right)^3} =\spn{\set{\colvector{-\frac{2}{3}\\1\\0},\,\colvector{-\frac{1}{3}\\0\\1}}} =\spn{\set{\colvector{-2\\3\\0},\,\colvector{-1\\0\\3}}}\text{.} \end{align*}

If you review Example ESMS3, you will see that this example is not very satisfying. Despite row-reducing new matrices (the third powers), the reduced row-echelon forms are identical, and thus the generalized eigenspaces are identical to the traditional eigenspaces. Note that this is definitely not always the case, as demonstrated by the matrix in Example TGE. While not very exciting here, this is actually a very interesting example of the upcoming Theorem DMFE.

We record the dimensions of the generalized eigenspaces as the algebraic multiplicities

\begin{align*} \algmult{F}{3}&=1 & \algmult{F}{-1}&=2\text{.} \end{align*}
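These multiplicities can be double-checked directly from Theorem GENS (a sketch of ours, using SymPy): count the basis vectors of the null space of the cubed matrix for each eigenvalue.

from sympy import Matrix, eye

F = Matrix([[-13, -8, -4],
            [12, 7, 4],
            [24, 16, 7]])

for lam in (3, -1):
    basis = ((F - lam * eye(3)) ** 3).nullspace()
    print(lam, len(basis))   # lambda = 3 gives 1 basis vector, lambda = -1 gives 2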


Reading Questions IS Reading Questions

1.

Given the matrix \(A\) and the subspace \(W\text{,}\) provide enough computations, in the spirit of Example TIS, to convince someone that \(W\) is an \(A\)-invariant subspace.

\begin{align*} A & = \begin{bmatrix} 20 & -69 & 54 \\ 6 & -22 & 18 \\ 1 & -5 & 5 \end{bmatrix} & W &= \spn{\set{ \colvector{ 1 \\ 1 \\ 1 },\, \colvector{ 4 \\ 1 \\ 0 } }}\text{.} \end{align*}
2.

For the matrix \(A\) of the previous question, compute the generalized eigenspace for \(\lambda=2\text{,}\) \(\geneigenspace{A}{2}\text{.}\)

3.

Explain why the word “generalized” is a good choice for part of the name of \(\geneigenspace{A}{\lambda}\text{.}\)

Exercises IS Exercises

C10.

Compute the eigenvalues of the matrix \(A\) from Example TIS,

\begin{equation*} A=\begin{bmatrix} -8 & 6 & -15 & 9 \\ -8 & 14 & -10 & 18 \\ 1 & 1 & 3 & 0 \\ 3 & -8 & 2 & -11 \end{bmatrix}\text{.} \end{equation*}

Additionally, compute an eigenvector for each eigenvalue so you can answer the question at the end of Example TGE.

T10.

Suppose \(A\) is a square matrix. Prove that the column space of \(A\text{,}\) \(\csp{A}\text{,}\) is an invariant subspace of \(A\text{.}\)

Solution

For any vector \(\vect{x}\text{,}\) the matrix-vector product \(A\vect{x}\) is a linear combination of the columns of \(A\text{,}\) and hence an element of the column space. So if \(\vect{x}\in\csp{A}\text{,}\) then \(A\vect{x}\in\csp{A}\text{,}\) making \(\csp{A}\) an invariant subspace of \(A\text{.}\)

T11.

Suppose \(A\) is a square matrix. Prove that the null space of \(A\text{,}\) \(\nsp{A}\text{,}\) is an invariant subspace of \(A\text{.}\)

Solution

For any vector \(\vect{x}\in\nsp{A}\text{,}\) the matrix-vector product \(A\vect{x}\) is the zero vector, which is an element of any subspace. So if \(\vect{x}\in\nsp{A}\text{,}\) then \(A\vect{x}=\zerovector\in\nsp{A}\text{,}\) making \(\nsp{A}\) an invariant subspace of \(A\text{.}\)

T30.

Theorem NSPM showed that null spaces of powers of a matrix get larger as the power increases, but at some point the null spaces become equal and stay that way. Prove that the opposite is true for column spaces. In other words, for a square matrix \(A\) of size \(n\text{,}\) prove that \(\csp{A^k}\supseteq\csp{A^{k+1}}\) for all \(k\geq 0\text{.}\)

Solution

Grab \(\vect{x}\in\csp{A^{k+1}}\text{.}\) Saying that \(\vect{x}\) is a linear combination of the columns of \(A^{k+1}\) can be expressed by saying that there is a vector \(\vect{y}\) such that \(\vect{x}=A^{k+1}\vect{y}\text{.}\) Then we can write

\begin{equation*} \vect{x} =A^{k+1}\vect{y} = \left(A^kA\right)\vect{y} = A^k\left(A\vect{y}\right)\text{.} \end{equation*}

Since \(A\vect{y}\) is a column vector, we can read this equation to say that \(\vect{x}\) is a linear combination of the columns of \(A^k\text{.}\) In other words, \(\vect{x}\in\csp{A^k}\text{.}\)

T31.

Continue Exercise IS.T30 and complete the analogy with Theorem NSPM by showing that the column spaces of powers “bottom out.” More carefully, there is an integer \(m\) such that \(\csp{A^m} = \csp{A^k}\) for all \(k\geq m\text{,}\) and \(m\leq n\text{.}\)

Solution

We claim that the \(m\) requested in this problem is the same as the \(m\) guaranteed by Theorem NSPM. Theorem RPNC says that the rank (dimension of the column space) plus the nullity (dimension of the null space) will add to \(n\) (the number of columns). So when the null spaces top out, the nullities become constant, and then so too the ranks become constant. Combined with the containments \(\csp{A^{k+1}}\subseteq\csp{A^k}\) from Exercise IS.T30, an application of Theorem EDYES then shows that the column spaces are equal as sets.
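The conclusions of these last two exercises can also be observed numerically (a sketch of ours, using SymPy): for \(B=A-I_4\text{,}\) with \(A\) as in Example TIS, the ranks of successive powers of \(B\) decrease and then bottom out, mirroring the nullities of Theorem NSPM via Theorem RPNC.

from sympy import Matrix, eye

A = Matrix([[-8, 6, -15, 9],
            [-8, 14, -10, 18],
            [1, 1, 3, 0],
            [3, -8, 2, -11]])
B = A - eye(4)

# rank of B^k for k = 0, 1, ..., 5: the column spaces shrink and then stay put
print([(B**k).rank() for k in range(6)])   # [4, 3, 2, 2, 2, 2]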