From A First Course in Linear Algebra
Version 2.30
© 2004.
Licensed under the GNU Free Documentation License.
http://linear.ups.edu/
We have seen that linear transformations whose domain and codomain are
vector spaces of column vectors have a close relationship with matrices
(Theorem MBLT, Theorem MLTCV). In this section, we will extend the
relationship between matrices and linear transformations to the setting of linear
transformations between abstract vector spaces.
Definition MR
Matrix Representation
Suppose that T : U → V is a linear transformation, B = \left \{{u}_{1}, {u}_{2}, {u}_{3}, …, {u}_{n}\right \} is a basis for U of size n, and C is a basis for V of size m. Then the matrix representation of T relative to B and C is the m × n matrix,
{M}_{B,C}^{T } = \left [\left .{ρ}_{C}\left (T\left ({u}_{1}\right )\right )\right |\left .{ρ}_{C}\left (T\left ({u}_{2}\right )\right )\right |\left .{ρ}_{C}\left (T\left ({u}_{3}\right )\right )\right |…\left |{ρ}_{C}\left (T\left ({u}_{n}\right )\right )\right .\right ]
(This definition contains Notation MR.) △
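For transformations between spaces of column vectors, Definition MR can be sketched numerically: column j of the representation is the coordinate vector of T(u_j) relative to C, found by solving a linear system. The transformation T and the bases B and C below are hypothetical choices made only for illustration, and the helper name matrix_representation is not part of the text.

```python
import numpy as np

def matrix_representation(T, B, C):
    """Return the matrix whose j-th column is rho_C(T(u_j)) for u_j in B.

    Assumes the vectors of C are column vectors forming a basis, so the
    matrix with those vectors as columns is square and nonsingular.
    """
    C_mat = np.column_stack(C)
    # Coordinatizing relative to C means solving C_mat @ coords = T(u).
    columns = [np.linalg.solve(C_mat, T(u)) for u in B]
    return np.column_stack(columns)

def T(x):
    # Illustrative linear transformation on vectors with two entries.
    return np.array([2 * x[0] - x[1], 3 * x[0] + 2 * x[1]], dtype=float)

# Hypothetical bases of the domain and codomain.
B = [np.array([1.0, 1.0]), np.array([1.0, 2.0])]
C = [np.array([1.0, 0.0]), np.array([1.0, 1.0])]

M = matrix_representation(T, B, C)
print(M)
```

Solving a system for each column mirrors the remark in Example OLTTR that a system of equations is involved almost every time a vector is coordinatized.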
Example OLTTR
One linear transformation, three representations
Consider the linear transformation
S : {P}_{3} → {M}_{22},\quad S\left (a + bx + c{x}^{2} + d{x}^{3}\right ) = \left [\array{ 3a + 7b − 2c − 5d & 8a + 14b − 2c − 11d \cr −4a − 8b + 2c + 6d & 12a + 22b − 4c − 17d } \right ]
First, we build a representation relative to the bases,
We evaluate S with each element of the basis for the domain, B, and coordinatize the result relative to the vectors in the basis for the codomain, C. Notice here how we take elements of vector spaces and decompose them into linear combinations of basis elements as the key step in constructing coordinatizations of vectors. There is a system of equations involved almost every time, but we will omit these details since this should be a routine exercise at this stage.
Thus, employing Definition MR
{M}_{B,C}^{S} = \left [\array{ −90 & −72 & 114 & −220 \cr 37 & 29 & −46 & 91 \cr −40 & −34 & 54 & −96 \cr 4 & 3 & −5 & 10 } \right ]
Often we use “nice” bases to build matrix representations and the work involved is much easier. Suppose we take bases
The evaluation of S at the elements of D is easy and coordinatization relative to E can be done on sight,
So the matrix representation of S relative to D and E is
{M}_{D,E}^{S} = \left [\array{ 3 & 7 & −2 & −5 \cr 8 & 14 & −2 & −11 \cr −4 & −8 & 2 & 6 \cr 12 & 22 & −4 & −17 } \right ]
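The claim that this representation can be read off "on sight" can be checked mechanically. The sketch below assumes that D is the monomial basis {1, x, x^2, x^3} and that E consists of the four 2 × 2 matrices with a single entry equal to 1, ordered row by row. That choice is an assumption, picked because it is consistent with the entries of the matrix above.

```python
import numpy as np

def S(a, b, c, d):
    """Evaluate S(a + bx + cx^2 + dx^3) as a 2x2 integer array."""
    return np.array([[3*a + 7*b - 2*c - 5*d, 8*a + 14*b - 2*c - 11*d],
                     [-4*a - 8*b + 2*c + 6*d, 12*a + 22*b - 4*c - 17*d]])

# Each row of the identity holds the coefficients of one basis polynomial
# 1, x, x^2, x^3 (assumed basis D).
basis_coefficients = np.eye(4, dtype=int)

# Coordinatizing relative to the assumed basis E just flattens the 2x2
# output row by row, so each column needs no system of equations.
M_DE = np.column_stack([S(*e).reshape(-1) for e in basis_coefficients])
print(M_DE)
```

With these assumed bases the rows of the representation are simply the coefficient lists of the four entries in the formula for S.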
One more time, but now let’s use bases
and evaluate S with the elements of F, then coordinatize the results relative to G,
So we arrive at an especially economical matrix representation,
{M}_{F,G}^{S} = \left [\array{ 2 & 0 & 0 & 0 \cr 0 & −1 & 0 & 0 \cr 0 & 0 & 1 & 0 \cr 0 & 0 & 0 & 0 } \right ]
We may choose to use whatever terms we want when we make a definition. Some are arbitrary, while others make sense, but only in light of subsequent theorems. Matrix representation is in the latter category. We begin with a linear transformation and produce a matrix. So what? Here’s the theorem that justifies the term “matrix representation.”
Theorem FTMR
Fundamental Theorem of Matrix Representation
Suppose that T : U → V is a linear transformation, B is a basis for U, C is a basis for V and {M}_{B,C}^{T } is the matrix representation of T relative to B and C. Then, for any u ∈ U,
{ρ}_{C}\left (T\left (u\right )\right ) = {M}_{B,C}^{T }\left ({ρ}_{B}\left (u\right )\right )
or equivalently
T\left (u\right ) = {ρ}_{C}^{−1}\left ({M}_{B,C}^{T }\left ({ρ}_{B}\left (u\right )\right )\right )
Proof Let B = \left \{{u}_{1}, {u}_{2}, {u}_{3}, …, {u}_{n}\right \} be the basis of U. Since u ∈ U, there are scalars {a}_{1}, {a}_{2}, {a}_{3}, …, {a}_{n} such that
u = {a}_{1}{u}_{1} + {a}_{2}{u}_{2} + {a}_{3}{u}_{3} + ⋯ + {a}_{n}{u}_{n}
Then,
The alternative conclusion is obtained as
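The two computations in this proof can be written out as follows. This is a reconstruction from Definition MR, the linearity of T, Theorem VRLT and Definition MVP, offered as one plausible way to fill in the steps rather than as the original display. The last step of the first chain uses the fact that {ρ}_{B}\left (u\right ) is the column vector of the scalars {a}_{1}, …, {a}_{n}.

```latex
\begin{align*}
\rho_C\left(T(u)\right)
  &= \rho_C\left(T\left(a_1u_1 + a_2u_2 + \cdots + a_nu_n\right)\right) \\
  &= \rho_C\left(a_1T(u_1) + a_2T(u_2) + \cdots + a_nT(u_n)\right)
     && \text{$T$ is linear} \\
  &= a_1\rho_C\left(T(u_1)\right) + a_2\rho_C\left(T(u_2)\right) + \cdots
     + a_n\rho_C\left(T(u_n)\right)
     && \text{Theorem VRLT} \\
  &= \left[\rho_C(T(u_1)) \mid \rho_C(T(u_2)) \mid \cdots \mid \rho_C(T(u_n))\right]
     \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}
     && \text{Definition MVP} \\
  &= M^{T}_{B,C}\,\rho_B(u)
     && \text{Definition MR} \\[1ex]
T(u)
  &= \rho_C^{-1}\left(\rho_C\left(T(u)\right)\right)
   = \rho_C^{-1}\left(M^{T}_{B,C}\,\rho_B(u)\right)
\end{align*}
```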
This theorem says that we can apply T to u and coordinatize the result relative to C in V , or we can first coordinatize u relative to B in U, then multiply by the matrix representation. Either way, the result is the same. So the effect of a linear transformation can always be accomplished by a matrix-vector product (Definition MVP). That’s important enough to say again. The effect of a linear transformation is a matrix-vector product.
The alternative conclusion of this result might be even more striking. It says that to effect a linear transformation (T) of a vector (u), coordinatize the input (with {ρ}_{B}), do a matrix-vector product (with {M}_{B,C}^{T }), and un-coordinatize the result (with {ρ}_{C}^{−1}). So, absent some bookkeeping about vector representations, a linear transformation is a matrix. To adjust the diagram, we “reverse” the arrow on the right, which means inverting the vector representation {ρ}_{C} on V . Now we can go directly across the top of the diagram, computing the linear transformation between the abstract vector spaces. Or, we can go around the other three sides, using vector representation, a matrix-vector product, followed by un-coordinatization.
Here’s an example to illustrate how the “action” of a linear transformation can be effected by matrix multiplication.
Example ALTMM
A linear transformation as matrix multiplication
In Example OLTTR we found three representations of the linear transformation S. In this example, we will compute a single output of S in four different ways. First “normally,” then three times over using Theorem FTMR.
Choose p(x) = 3 − x + 2{x}^{2} − 5{x}^{3}, for no particular reason. Then the straightforward application of S to p(x) yields
Now use the representation of S relative to the bases B and C and Theorem FTMR. Note that we will employ the following linear combination in moving from the second line to the third,
OK, last time, now with the bases F and G. The coordinatizations will take some work this time, but the matrix-vector product (Definition MVP) (which is the actual action of the linear transformation) will be especially easy, given the diagonal nature of the matrix representation, {M}_{F,G}^{S}. Here we go,
This example is not meant to necessarily illustrate that any one of these four computations is simpler than the others. Instead, it is meant to illustrate the many different ways we can arrive at the same result, with the last three all employing a matrix representation to effect the linear transformation. ⊠
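The three steps of Theorem FTMR (coordinatize, multiply, un-coordinatize) can be sketched for the representation {M}_{D,E}^{S} and the polynomial p(x) = 3 − x + 2{x}^{2} − 5{x}^{3}. As an assumption, D is taken to be the monomial basis and E the four single-entry 2 × 2 matrices ordered row by row, which agrees with the entries of {M}_{D,E}^{S}. The helper names rho_D and rho_E_inverse are illustrative.

```python
import numpy as np

# Matrix representation of S relative to D and E, from Example OLTTR.
M_DE = np.array([[3, 7, -2, -5],
                 [8, 14, -2, -11],
                 [-4, -8, 2, 6],
                 [12, 22, -4, -17]])

def S(a, b, c, d):
    """Direct evaluation of S(a + bx + cx^2 + dx^3) as a 2x2 array."""
    return np.array([[3*a + 7*b - 2*c - 5*d, 8*a + 14*b - 2*c - 11*d],
                     [-4*a - 8*b + 2*c + 6*d, 12*a + 22*b - 4*c - 17*d]])

def rho_D(coefficients):
    # Coordinatize: relative to the assumed monomial basis, the coordinate
    # vector is just the list of coefficients.
    return np.array(coefficients)

def rho_E_inverse(column):
    # Un-coordinatize: rebuild the 2x2 matrix row by row (assumed basis E).
    return column.reshape(2, 2)

p = (3, -1, 2, -5)                      # 3 - x + 2x^2 - 5x^3
direct = S(*p)                          # apply S "normally"
via_matrix = rho_E_inverse(M_DE @ rho_D(p))

print(direct)
print(np.array_equal(direct, via_matrix))
```

Both routes give the same 2 × 2 matrix, which is the content of the alternative conclusion of Theorem FTMR for this one input.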
We will use Theorem FTMR frequently in the next few sections. A typical application will feel like the linear transformation T “commutes” with a vector representation, {ρ}_{C}, and as it does the transformation morphs into a matrix, {M}_{B,C}^{T }, while the vector representation changes to a new basis, {ρ}_{B}. Or vice-versa.
In Subsection LT.NLTFO we built new linear transformations from other linear transformations. Sums, scalar multiples and compositions. These new linear transformations will have matrix representations as well. How do the new matrix representations relate to the old matrix representations? Here are the three theorems.
Theorem MRSLT
Matrix Representation of a Sum of Linear Transformations
Suppose that T : U → V and S : U → V are linear transformations, B is a basis of U and C is a basis of V . Then
{M}_{B,C}^{T+S} = {M}_{B,C}^{T } + {M}_{B,C}^{S}
Proof Let x be any vector in {ℂ}^{n}. Define u ∈ U by u = {ρ}_{B}^{−1}\left (x\right ), so x = {ρ}_{B}\left (u\right ). Then,
Since the matrices {M}_{B,C}^{T+S} and {M}_{B,C}^{T } + {M}_{ B,C}^{S} have equal matrix-vector products for every vector in {ℂ}^{n}, by Theorem EMMVP they are equal matrices. (Now would be a good time to double-back and study the proof of Theorem EMMVP. You did promise to come back to this theorem sometime, didn’t you?) ■
Theorem MRMLT
Matrix Representation of a Multiple of a Linear Transformation
Suppose that T : U → V is a linear transformation, α ∈ ℂ, B is a basis of U and C is a basis of V . Then
{M}_{B,C}^{αT } = α{M}_{B,C}^{T }
Proof Let x be any vector in {ℂ}^{n}. Define u ∈ U by u = {ρ}_{B}^{−1}\left (x\right ), so x = {ρ}_{B}\left (u\right ). Then,
Since the matrices {M}_{B,C}^{αT } and α{M}_{B,C}^{T } have equal matrix-vector products for every vector in {ℂ}^{n}, by Theorem EMMVP they are equal matrices. ■
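Theorem MRSLT and Theorem MRMLT can be spot-checked numerically for transformations of column vectors. In the sketch below the matrices A_T and A_S, the bases stored in B_mat and C_mat, and the scalar alpha are all hypothetical values chosen for illustration. A numerical check on one example is a sanity test, not a proof.

```python
import numpy as np

# Hypothetical data: T(x) = A_T x and S(x) = A_S x on column vectors.
A_T = np.array([[2.0, -1.0], [3.0, 2.0]])
A_S = np.array([[0.0, 4.0], [1.0, -2.0]])
# Basis vectors of the domain and codomain, stored as columns.
B_mat = np.array([[1.0, 1.0], [1.0, 2.0]])
C_mat = np.array([[1.0, 1.0], [0.0, 1.0]])
alpha = 3.0

def rep(A):
    """Matrix representation of x -> A x relative to the bases B and C.

    Column j solves C_mat @ column = A @ (j-th basis vector of B),
    which is the coordinatization rho_C(T(u_j)) of Definition MR.
    """
    return np.linalg.solve(C_mat, A @ B_mat)

# Representation of a sum versus sum of representations.
sum_ok = np.allclose(rep(A_T + A_S), rep(A_T) + rep(A_S))
# Representation of a scalar multiple versus multiple of the representation.
multiple_ok = np.allclose(rep(alpha * A_T), alpha * rep(A_T))
print(sum_ok, multiple_ok)
```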
With U of dimension n and V of dimension m, the vector space of all linear transformations from U to V is now isomorphic to the vector space of all m × n matrices.
Theorem MRCLT
Matrix Representation of a Composition of Linear Transformations
Suppose that T : U → V and S : V → W are linear transformations, B is a basis of U, C is a basis of V , and D is a basis of W. Then
{M}_{B,D}^{S∘T } = {M}_{C,D}^{S}{M}_{B,C}^{T }
Proof Let x be any vector in {ℂ}^{n}. Define u ∈ U by u = {ρ}_{B}^{−1}\left (x\right ), so x = {ρ}_{B}\left (u\right ). Then,
Since the matrices {M}_{B,D}^{S∘T } and {M}_{C,D}^{S}{M}_{ B,C}^{T } have equal matrix-vector products for every vector in {ℂ}^{n}, by Theorem EMMVP they are equal matrices. ■
This is the second great surprise of introductory linear algebra. Matrices are linear transformations (functions, really), and matrix multiplication is function composition! We can form the composition of two linear transformations, then form the matrix representation of the result. Or we can form the matrix representation of each linear transformation separately, then multiply the two representations together via Definition MM. In either case, we arrive at the same result.
Example MPMR
Matrix product of matrix representations
Consider the two linear transformations,
and bases for {ℂ}^{2}, {P}_{2} and {M}_{22} (respectively),
Begin by computing the new linear transformation that is the composition of T and S (Definition LTC, Theorem CLTLT), \left (S ∘ T\right ): {ℂ}^{2} → {M}_{ 22},
Now compute the matrix representations (Definition MR) for each of these three linear transformations (T, S, S ∘ T), relative to the appropriate bases. First for T,
So we have the matrix representation of T,
{M}_{B,C}^{T } = \left [\array{ 28 & 33 \cr 28 & 32 \cr −9 & −11 } \right ]
Now, a representation of S,
So we have the matrix representation of S,
{M}_{C,D}^{S} = \left [\array{ −11 & 26 & 34 \cr −21 & 51 & 67 \cr 0 & 0 & 1 \cr 17 & −38 & −46 } \right ]
Finally, a representation of S ∘ T,
So we have the matrix representation of S ∘ T,
{M}_{B,D}^{S∘T } = \left [\array{ 114 & 95 \cr 237 & 202 \cr −9 & −11 \cr −174 & −149 } \right ]
Now, we are all set to verify the conclusion of Theorem MRCLT,
We have intentionally used non-standard bases. If you were to choose “nice” bases for the three vector spaces, then the result of the theorem might be rather transparent. But this would still be a worthwhile exercise — give it a go. ⊠
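The conclusion of Theorem MRCLT for this example can be confirmed by multiplying the two representations and comparing with the representation of the composition. The sketch uses only the three matrices stated in Example MPMR; the variable names are illustrative.

```python
import numpy as np

# Representations from Example MPMR.
M_T = np.array([[28, 33],
                [28, 32],
                [-9, -11]])              # T relative to B and C
M_S = np.array([[-11, 26, 34],
                [-21, 51, 67],
                [0, 0, 1],
                [17, -38, -46]])         # S relative to C and D
M_ST = np.array([[114, 95],
                 [237, 202],
                 [-9, -11],
                 [-174, -149]])          # S∘T relative to B and D

# Theorem MRCLT: the representation of the composition is the product,
# with the representation of S on the left.
product = M_S @ M_T
print(product)
print(np.array_equal(product, M_ST))
```

The order of the factors matters: M_S is 4 × 3 and M_T is 3 × 2, so only the product M_S M_T is defined, matching the order in which S ∘ T applies T first.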
A diagram, similar to ones we have seen earlier, might make the importance of this theorem clearer,
One of our goals in the first part of this book is to make the definition of matrix multiplication (Definition MVP, Definition MM) seem as natural as possible. However, many are brought up with an entry-by-entry description of matrix multiplication (Theorem EMP) as the definition of matrix multiplication, and then theorems about columns of matrices and linear combinations follow from that definition. With this unmotivated definition, the realization that matrix multiplication is function composition is quite remarkable. It is an interesting exercise to begin with the question, “What is the matrix representation of the composition of two linear transformations?” and then, without using any theorems about matrix multiplication, finally arrive at the entry-by-entry description of matrix multiplication. Try it yourself (Exercise MR.T80).
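The idea that matrix multiplication is function composition can be illustrated by building the matrix of a composition column by column, from the images of the standard unit vectors, and comparing it with the entry-by-entry formula. The matrices A and Bm below are hypothetical, and standard bases are assumed throughout.

```python
import numpy as np

# Hypothetical linear transformations given by matrices.
A = np.array([[1, 2], [0, 1], [3, -1]])    # T maps 2-vectors to 3-vectors
Bm = np.array([[2, 0, 1], [1, 1, 0]])      # S maps 3-vectors to 2-vectors

def T(x):
    return A @ x

def S(y):
    return Bm @ y

# Matrix of the composition S∘T relative to standard bases:
# column j is the image of the j-th standard unit vector.
unit_vectors = np.eye(2, dtype=int)
composition = np.column_stack([S(T(e)) for e in unit_vectors])

# Entry-by-entry description: entry (i, j) is a sum of products of
# row i of Bm with column j of A.
entrywise = np.array([[sum(Bm[i, k] * A[k, j] for k in range(3))
                       for j in range(2)]
                      for i in range(2)])

print(composition)
print(np.array_equal(composition, entrywise))
```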
It will not be a surprise to discover that the kernel and range of a linear transformation are closely related to the null space and column space of the transformation’s matrix representation. Perhaps this idea has been bouncing around in your head already, even before seeing the definition of a matrix representation. However, with a formal definition of a matrix representation (Definition MR), and a fundamental theorem to go with it (Theorem FTMR) we can be formal about the relationship, using the idea of isomorphic vector spaces (Definition IVS). Here are the twin theorems.
Theorem KNSI
Kernel and Null Space Isomorphism
Suppose that T : U → V is a linear transformation, B is a basis for U of size n, and C is a basis for V . Then the kernel of T is isomorphic to the null space of {M}_{B,C}^{T },
K\left (T\right ) ≅ N\left ({M}_{B,C}^{T }\right )
Proof To establish that two vector spaces are isomorphic, we must find an isomorphism between them, an invertible linear transformation (Definition IVS). The kernel of the linear transformation T, K\kern -1.95872pt \left (T\right ), is a subspace of U, while the null space of the matrix representation, N\kern -1.95872pt \left ({M}_{B,C}^{T }\right ) is a subspace of {ℂ}^{n}. The function {ρ}_{B} is defined as a function from U to {ℂ}^{n}, but we can just as well employ the definition of {ρ}_{B} as a function from K\kern -1.95872pt \left (T\right ) to N\kern -1.95872pt \left ({M}_{B,C}^{T }\right ).
We must first ensure that if we choose an input for {ρ}_{B} from K\left (T\right ), then the output will be an element of N\left ({M}_{B,C}^{T }\right ). So suppose that u ∈ K\left (T\right ). Then
This says that {ρ}_{B}\left (u\right ) ∈N\kern -1.95872pt \left ({M}_{B,C}^{T }\right ), as desired.
The restriction in the size of the domain and codomain of {ρ}_{B} will not affect the fact that {ρ}_{B} is a linear transformation (Theorem VRLT), nor will it affect the fact that {ρ}_{B} is injective (Theorem VRI). Something must be done though to verify that {ρ}_{B} is surjective. To this end, appeal to the definition of surjective (Definition SLT), and suppose that we have an element of the codomain, x ∈ N\left ({M}_{B,C}^{T }\right ) ⊆ {ℂ}^{n} and we wish to find an element of the domain with x as its image. We now show that the desired element of the domain is u = {ρ}_{B}^{−1}\left (x\right ). First, verify that u ∈ K\left (T\right ),
With {ρ}_{B} demonstrated to be an injective and surjective linear transformation from K\kern -1.95872pt \left (T\right ) to N\kern -1.95872pt \left ({M}_{B,C}^{T }\right ), Theorem ILTIS tells us {ρ}_{B} is invertible, and so by Definition IVS, we say K\kern -1.95872pt \left (T\right ) and N\kern -1.95872pt \left ({M}_{B,C}^{T }\right ) are isomorphic. ■
Example KVMR
Kernel via matrix representation
Consider the kernel of the linear transformation
T : {M}_{22} → {P}_{2},\quad T\left (\left [\array{ a & b \cr c & d } \right ]\right ) = (2a−b+c−5d)+(a+4b+5c+2d)x+(3a−2b+c−8d){x}^{2}
We will begin with a matrix representation of T relative to the bases for {M}_{22} and {P}_{2} (respectively),
Then,
So the matrix representation of T (relative to B and C) is
{M}_{B,C}^{T } = \left [\array{ 2 & −24 & 5 & −8 \cr 0 & 8 & 0 & 4 \cr −2 & −26 & −5 & −17 } \right ]
We know from Theorem KNSI that the kernel of the linear transformation T is isomorphic to the null space of the matrix representation {M}_{B,C}^{T } and by studying the proof of Theorem KNSI we learn that {ρ}_{B} is an isomorphism between these two subspaces. Rather than trying to compute the kernel of T using definitions and techniques from Chapter LT we will instead analyze the null space of {M}_{B,C}^{T } using techniques from way back in Chapter V. First row-reduce {M}_{B,C}^{T },
\left [\array{ 2 & −24 & 5 & −8 \cr 0 & 8 & 0 & 4 \cr −2 & −26 & −5 & −17 } \right ]\mathop{\longrightarrow}\limits_{}^{\text{RREF}}\left [\array{ \text{1} & 0 & {5\over 2} & 2 \cr 0 & \text{1} & 0 & {1\over 2} \cr 0 & 0 & 0 & 0 } \right ]
So, by Theorem BNS, a basis for N\left ({M}_{B,C}^{T }\right ) is
\left \{\left [\array{ −{5\over 2} \cr 0 \cr 1 \cr 0 } \right ], \left [\array{ −2 \cr −{1\over 2} \cr 0 \cr 1 } \right ]\right \}
We can now convert this basis of N\kern -1.95872pt \left ({M}_{B,C}^{T }\right ) into a basis of K\kern -1.95872pt \left (T\right ) by applying {ρ}_{B}^{−1} to each element of the basis,
So the set
\left \{\left [\array{ −{3\over 2} & −3 \cr {5\over 2} & {1\over 2} } \right ], \left [\array{ −{1\over 2} & −{1\over 2} \cr {1\over 2} & 0 } \right ]\right \}
is a basis for K\left (T\right ). Just for fun, you might evaluate T with each of these two basis vectors and verify that the output is the zero polynomial (Exercise MR.C10). ⊠
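The two claims of Example KVMR can be checked with exact rational arithmetic: the two column vectors lie in the null space of the matrix representation, and the two 2 × 2 matrices are sent to the zero polynomial by T. In the sketch below the coefficient of x in T is taken to be a + 4b + 5c + 2d, the reading that is consistent with the matrix representation and the kernel basis stated in the example. Polynomials are stored as tuples of coefficients, an illustrative choice.

```python
from fractions import Fraction as F

# Matrix representation of T relative to B and C (Example KVMR).
M = [[2, -24, 5, -8],
     [0, 8, 0, 4],
     [-2, -26, -5, -17]]

# Basis of the null space obtained from the reduced row-echelon form.
null_basis = [[F(-5, 2), 0, 1, 0],
              [-2, F(-1, 2), 0, 1]]

# Matrix-vector products M z, computed entry by entry.
products = [[sum(m * z for m, z in zip(row, v)) for row in M]
            for v in null_basis]

def T(a, b, c, d):
    """T applied to the matrix [[a, b], [c, d]]; returns the coefficients
    of 1, x, x^2 in the output polynomial."""
    return (2*a - b + c - 5*d, a + 4*b + 5*c + 2*d, 3*a - 2*b + c - 8*d)

# Basis of the kernel of T from the example, entries listed as (a, b, c, d).
kernel_basis = [(F(-3, 2), -3, F(5, 2), F(1, 2)),
                (F(-1, 2), F(-1, 2), F(1, 2), 0)]
images = [T(*k) for k in kernel_basis]

print(products)
print(images)
```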
An entirely similar result applies to the range of a linear transformation and the column space of a matrix representation of the linear transformation.
Theorem RCSI
Range and Column Space Isomorphism
Suppose that T : U → V is a linear transformation, B is a basis for U of size n, and C is a basis for V of size m. Then the range of T is isomorphic to the column space of {M}_{B,C}^{T },
ℛ\left (T\right ) ≅ C\left ({M}_{B,C}^{T }\right )
Proof To establish that two vector spaces are isomorphic, we must find an isomorphism between them, an invertible linear transformation (Definition IVS). The range of the linear transformation T, ℛ\kern -1.95872pt \left (T\right ), is a subspace of V , while the column space of the matrix representation, C\kern -1.95872pt \left ({M}_{B,C}^{T }\right ) is a subspace of {ℂ}^{m}. The function {ρ}_{C} is defined as a function from V to {ℂ}^{m}, but we can just as well employ the definition of {ρ}_{C} as a function from ℛ\kern -1.95872pt \left (T\right ) to C\kern -1.95872pt \left ({M}_{B,C}^{T }\right ).
We must first ensure that if we choose an input for {ρ}_{C} from ℛ\left (T\right ), then the output will be an element of C\left ({M}_{B,C}^{T }\right ). So suppose that v ∈ ℛ\left (T\right ). Then there is a vector u ∈ U, such that T\left (u\right ) = v. Consider
This says that {ρ}_{C}\left (v\right ) ∈C\kern -1.95872pt \left ({M}_{B,C}^{T }\right ), as desired.
The restriction in the size of the domain and codomain will not affect the fact that {ρ}_{C} is a linear transformation (Theorem VRLT), nor will it affect the fact that {ρ}_{C} is injective (Theorem VRI). Something must be done though to verify that {ρ}_{C} is surjective. This all gets a bit confusing, since the domain of our isomorphism is the range of the linear transformation, so think about your objects as you go. To establish that {ρ}_{C} is surjective, appeal to the definition of a surjective linear transformation (Definition SLT), and suppose that we have an element of the codomain, y ∈C\kern -1.95872pt \left ({M}_{B,C}^{T }\right ) ⊆ {ℂ}^{m} and we wish to find an element of the domain with y as its image. Since y ∈C\kern -1.95872pt \left ({M}_{B,C}^{T }\right ), there exists a vector, x ∈ {ℂ}^{n} with {M}_{B,C}^{T }x = y. We now show that the desired element of the domain is v = {ρ}_{C}^{−1}\left (y\right ). First, verify that v ∈ℛ\kern -1.95872pt \left (T\right ) by applying T to u = {ρ}_{B}^{−1}\left (x\right ),
With {ρ}_{C} demonstrated to be an injective and surjective linear transformation from ℛ\kern -1.95872pt \left (T\right ) to C\kern -1.95872pt \left ({M}_{B,C}^{T }\right ), Theorem ILTIS tells us {ρ}_{C} is invertible, and so by Definition IVS, we say ℛ\kern -1.95872pt \left (T\right ) and C\kern -1.95872pt \left ({M}_{B,C}^{T }\right ) are isomorphic. ■
Example RVMR
Range via matrix representation
In this example, we will recycle the linear transformation T and the bases B and C of Example KVMR but now we will compute the range of T,
T : {M}_{22} → {P}_{2},\quad T\left (\left [\array{ a & b \cr c & d } \right ]\right ) = (2a−b+c−5d)+(a+4b+5c+2d)x+(3a−2b+c−8d){x}^{2}
With bases B and C,
we obtain the matrix representation
{M}_{B,C}^{T } = \left [\array{ 2 & −24 & 5 & −8 \cr 0 & 8 & 0 & 4 \cr −2 & −26 & −5 & −17 } \right ]
We know from Theorem RCSI that the range of the linear transformation T is isomorphic to the column space of the matrix representation {M}_{B,C}^{T } and by studying the proof of Theorem RCSI we learn that {ρ}_{C} is an isomorphism between these subspaces. Notice that since the range is a subspace of the codomain, we will employ {ρ}_{C} as the isomorphism, rather than {ρ}_{B}, which was the correct choice for an isomorphism between the kernel and the null space in Example KVMR.
Rather than trying to compute the range of T using definitions and techniques from Chapter LT we will instead analyze the column space of {M}_{B,C}^{T } using techniques from way back in Chapter M. First row-reduce {\left ({M}_{B,C}^{T }\right )}^{t},
\left [\array{ 2 & 0 & −2 \cr −24 & 8 & −26 \cr 5 & 0 & −5 \cr −8 & 4 & −17 } \right ]\mathop{\longrightarrow}\limits_{}^{\text{RREF}}\left [\array{ \text{1} & 0 & −1 \cr 0 & \text{1} & −{25\over 4} \cr 0 & 0 & 0 \cr 0 & 0 & 0 } \right ]
Now employ Theorem CSRST and Theorem BRS (there are other methods we could choose here to compute the column space, such as Theorem BCS) to obtain the basis for C\kern -1.95872pt \left ({M}_{B,C}^{T }\right ),
\left \{\left [\array{ 1 \cr 0 \cr −1 } \right ], \left [\array{ 0 \cr 1 \cr −{25\over 4} } \right ]\right \}
We can now convert this basis of C\kern -1.95872pt \left ({M}_{B,C}^{T }\right ) into a basis of ℛ\kern -1.95872pt \left (T\right ) by applying {ρ}_{C}^{−1} to each element of the basis,
So the set
\left \{2 + x + 3{x}^{2}, {33\over 4} + 3x + {25\over 2}{x}^{2}\right \}
is a basis for ℛ\kern -1.95872pt \left (T\right ). ⊠
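The column space computation in Example RVMR can be cross-checked numerically. The sketch below computes the rank of the matrix representation and confirms that each of the two column vectors obtained from the row-reduction lies in the column space, by checking that appending the vector as an extra column does not raise the rank. It relies on floating-point rank estimation from NumPy, so it is a sanity check rather than an exact computation.

```python
import numpy as np

# Matrix representation of T relative to B and C (Example RVMR).
M = np.array([[2, -24, 5, -8],
              [0, 8, 0, 4],
              [-2, -26, -5, -17]], dtype=float)

rank = np.linalg.matrix_rank(M)

# Candidate basis vectors for the column space, read from the nonzero
# rows of the reduced row-echelon form of the transpose.
candidates = [np.array([1.0, 0.0, -1.0]),
              np.array([0.0, 1.0, -25 / 4])]

# A vector is in the column space exactly when appending it as a new
# column leaves the rank unchanged.
in_column_space = [np.linalg.matrix_rank(np.column_stack([M, v])) == rank
                   for v in candidates]

# Dimension of the null space, from the number of columns and the rank.
nullity = M.shape[1] - rank
print(rank, nullity, in_column_space)
```

The rank and nullity add up to 4, the number of columns, and they match the two-element bases found for the range here and for the kernel in Example KVMR.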
Theorem KNSI and Theorem RCSI can be viewed as further formal evidence for the Coordinatization Principle, though they are not direct consequences.
We have seen, both in theorems and in examples, that questions about linear transformations are often equivalent to questions about matrices. It is the matrix representation of a linear transformation that makes this idea precise. Here’s our final theorem that solidifies this connection.
Theorem IMR
Invertible Matrix Representations
Suppose that T : U → V is a linear transformation, B is a basis for U and C is a basis for V . Then T is an invertible linear transformation if and only if the matrix representation of T relative to B and C, {M}_{B,C}^{T }, is an invertible matrix. When T is invertible,
{M}_{C,B}^{{T}^{−1}} ={ \left ({M}_{B,C}^{T }\right )}^{−1}
Proof ( ⇒) Suppose T is invertible, so the inverse linear transformation {T}^{−1}: V → U exists (Definition IVLT). Both linear transformations have matrix representations relative to the bases of U and V , namely {M}_{B,C}^{T } and {M}_{C,B}^{{T}^{−1}} (Definition MR). Then
These two equations show that {M}_{B,C}^{T } and {M}_{C,B}^{{T}^{−1}} are inverse matrices (Definition MI) and establish that when T is invertible, then {M}_{C,B}^{{T}^{−1}} ={ \left ({M}_{ B,C}^{T }\right )}^{−1}.
( ⇐) Suppose now that {M}_{B,C}^{T } is an invertible matrix and hence nonsingular (Theorem NI). We compute the nullity of T,
So the kernel of T is trivial, and by Theorem KILT, T is injective.
We now compute the rank of T,
Since the dimension of the range of T equals the dimension of the codomain V , by Theorem EDYES, ℛ\kern -1.95872pt \left (T\right ) = V . Which says that T is surjective by Theorem RSLT.
Because T is both injective and surjective, by Theorem ILTIS, T is invertible. ■
By now, the connections between matrices and linear transformations should be starting to become more transparent, and you may have already recognized the invertibility of a linear transformation as being tantamount to the invertibility of its matrix representation. The next example shows how to apply this theorem to the problem of actually building a formula for the inverse of an invertible linear transformation.
Example ILTVR
Inverse of a linear transformation via a representation
Consider the linear transformation
R: {P}_{3} → {M}_{22},\quad R\left (a + bx + c{x}^{2} + d{x}^{3}\right ) = \left [\array{ a + b − c + 2d & 2a + 3b − 2c + 3d \cr a + b + 2d & −a + b + 2c − 5d } \right ]
If we wish to quickly find a formula for the inverse of R (presuming it exists), then choosing “nice” bases will work best. So build a matrix representation of R relative to the bases B and C,
Then,
So a representation of R is
{M}_{B,C}^{R} = \left [\array{ 1 & 1 & −1 & 2 \cr 2 & 3 & −2 & 3 \cr 1 & 1 & 0 & 2 \cr −1 & 1 & 2 & −5 } \right ]
The matrix {M}_{B,C}^{R} is invertible (as you can check) so we know for sure that R is invertible by Theorem IMR. Furthermore,
{M}_{C,B}^{{R}^{−1}} ={ \left ({M}_{B,C}^{R}\right )}^{−1} ={ \left [\array{ 1 & 1 & −1 & 2 \cr 2 & 3 & −2 & 3 \cr 1 & 1 & 0 & 2 \cr −1 & 1 & 2 & −5 } \right ]}^{−1} = \left [\array{ 20 & −7 & −2 & 3 \cr −8 & 3 & 1 & −1 \cr −1 & 0 & 1 & 0 \cr −6 & 2 & 1 & −1 } \right ]
We can use this representation of the inverse linear transformation, in concert with Theorem FTMR, to determine an explicit formula for the inverse itself,
You might look back at Example AIVLT, where we first witnessed the inverse of a linear transformation, and recognize that the inverse (S) was built using the method of Example ILTVR with a matrix representation of T.
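The method of Example ILTVR can be sketched in code: confirm that the two matrices are inverses, then use the inverse representation together with Theorem FTMR to evaluate the inverse transformation. The sketch assumes the "nice" bases are the monomial basis of {P}_{3} and the four single-entry 2 × 2 matrices ordered row by row, which is consistent with the entries of {M}_{B,C}^{R}. The function name R_inverse and the test polynomial are illustrative.

```python
import numpy as np

# Representation of R and its stated inverse (Example ILTVR).
M = np.array([[1, 1, -1, 2],
              [2, 3, -2, 3],
              [1, 1, 0, 2],
              [-1, 1, 2, -5]])
M_inv = np.array([[20, -7, -2, 3],
                  [-8, 3, 1, -1],
                  [-1, 0, 1, 0],
                  [-6, 2, 1, -1]])

identity = np.eye(4, dtype=int)
is_inverse = (np.array_equal(M @ M_inv, identity)
              and np.array_equal(M_inv @ M, identity))

def R(a, b, c, d):
    """R(a + bx + cx^2 + dx^3) as a 2x2 array."""
    return np.array([[a + b - c + 2*d, 2*a + 3*b - 2*c + 3*d],
                     [a + b + 2*d, -a + b + 2*c - 5*d]])

def R_inverse(matrix):
    """Coefficients (a, b, c, d) of the polynomial sent to `matrix` by R.

    Coordinatize the 2x2 matrix (flatten row by row, assumed basis C),
    multiply by the inverse representation, and read the result as
    coefficients relative to the assumed monomial basis B.
    """
    return M_inv @ matrix.reshape(-1)

coefficients = np.array([4, -2, 7, 1])      # 4 - 2x + 7x^2 + x^3
round_trip = R_inverse(R(*coefficients))
print(is_inverse, round_trip)
```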
Theorem IMILT
Invertible Matrices, Invertible Linear Transformation
Suppose that A is a square matrix of size n and T : {ℂ}^{n} → {ℂ}^{n} is the linear transformation defined by T\left (x\right ) = Ax. Then A is an invertible matrix if and only if T is an invertible linear transformation. □
Proof Choose bases B = C = \left \{{e}_{1}, {e}_{2}, {e}_{3}, …, {e}_{n}\right \} consisting of the standard unit vectors as a basis of {ℂ}^{n} (Theorem SUVB) and build a matrix representation of T relative to B and C. Then
So then the matrix representation of T, relative to B and C, is simply {M}_{B,C}^{T } = A. With this observation, the proof becomes a specialization of Theorem IMR,
This theorem may seem gratuitous. Why state such a special case of Theorem IMR? Because it adds another condition to our NMEx series of theorems, and in some ways it is the most fundamental expression of what it means for a matrix to be nonsingular — the associated linear transformation is invertible. This is our final update.
Theorem NME9
Nonsingular Matrix Equivalences, Round 9
Suppose that A is a square matrix of size n. The following are equivalent.
Proof By Theorem IMILT the new addition to this list is equivalent to the statement that A is invertible so we can expand Theorem NME8. ■
T : {ℂ}^{2} → {ℂ}^{2},\quad T\left (\left [\array{ {x}_{1} \cr {x}_{2} } \right ]\right ) = \left [\array{ 2{x}_{1} − {x}_{2} \cr 3{x}_{1} + 2{x}_{2} } \right ]
relative to the bases
C10 Example KVMR concludes with a basis for the kernel of the linear transformation T. Compute the value of T for each of these two basis vectors. Did you get what you expected?
Contributed by Robert Beezer
C20 Compute the matrix representation of T relative to the bases B and C.
Contributed by Robert Beezer Solution [1738]
C21 Find a matrix representation of the linear transformation T relative to the bases B and C.
Contributed by Robert Beezer Solution [1739]
C22 Let {S}_{22} be the vector space of 2 × 2 symmetric matrices. Build the matrix representation of the linear transformation T : {P}_{2} → {S}_{22} relative to the bases B and C and then use this matrix representation to compute T\left (3 + 5x − 2{x}^{2}\right ).
Contributed by Robert Beezer Solution [1739]
C25 Use a matrix representation to determine if the linear transformation T : {P}_{3} → {M}_{22} is surjective.
T\left (a + bx + c{x}^{2} + d{x}^{3}\right ) = \left [\array{ −a + 4b + c + 2d & 4a − b + 6c − d \cr a + 5b − 2c + 2d & a + 2c + 5d } \right ]
Contributed by Robert Beezer Solution [1741]
C30 Find bases for the kernel and range of the linear transformation S below.
S : {M}_{22} → {P}_{2},\quad S\left (\left [\array{ a & b \cr c & d } \right ]\right ) = (a+2b+5c−4d)+(3a−b+8c+2d)x+(a+b+4c−2d){x}^{2}
Contributed by Robert Beezer Solution [1743]
C40 Let {S}_{22} be the set of 2 × 2 symmetric matrices. Verify that the linear transformation R is invertible and find {R}^{−1}.
R: {S}_{22} → {P}_{2},\quad R\left (\left [\array{ a & b \cr b & c } \right ]\right ) = (a−b)+(2a−3b−2c)x+(a−b+c){x}^{2}
Contributed by Robert Beezer Solution [1745]
C41 Prove that the linear transformation S is invertible. Then find a formula for the inverse linear transformation, {S}^{−1}, by employing a matrix inverse.
S : {P}_{1} → {M}_{1,2},\quad S\left (a + bx\right ) = \left [\array{ 3a + b & 2a + b } \right ]
Contributed by Robert Beezer Solution [1747]
C42 The linear transformation R: {M}_{12} → {M}_{21} is invertible. Use a matrix representation to determine a formula for the inverse linear transformation {R}^{−1}: {M}_{ 21} → {M}_{12}.
R\left (\left [\array{ a & b } \right ]\right ) = \left [\array{ a + 3b \cr 4a + 11b } \right ]
Contributed by Robert Beezer Solution [1748]
C50 Use a matrix representation to find a basis for the range of the linear transformation L.
L: {M}_{22} → {P}_{2},\quad L\left (\left [\array{ a & b \cr c & d } \right ]\right ) = (a+2b+4c+d)+(3a+c−2d)x+(−a+b+3c+3d){x}^{2}
Contributed by Robert Beezer Solution [1750]
C51 Use a matrix representation to find a basis for the kernel of the linear transformation L.
L: {M}_{22} → {P}_{2},\quad L\left (\left [\array{ a & b \cr c & d } \right ]\right ) = (a+2b+4c+d)+(3a+c−2d)x+(−a+b+3c+3d){x}^{2}
Contributed by Robert Beezer
C52 Find a basis for the kernel of the linear transformation T : {P}_{2} → {M}_{22}.
T\left (a + bx + c{x}^{2}\right ) = \left [\array{ a + 2b − 2c & 2a + 2b \cr −a + b − 4c & 3a + 2b + 2c } \right ]
Contributed by Robert Beezer Solution [1752]
M20 The linear transformation D performs differentiation on polynomials. Use a matrix representation of D to find the rank and nullity of D.
D: {P}_{n} → {P}_{n},\quad D\left (p(x)\right ) = {p}^{′}(x)
Contributed by Robert Beezer Solution [1755]
M60 Suppose U and V are vector spaces and define a function Z : U → V by Z\left (u\right ) = {0}_{V } for every u ∈ U. Then Exercise IVLT.M60 asks you to formulate the theorem: Z is invertible if and only if U = \left \{{0}_{U}\right \} and V = \left \{{0}_{V }\right \}. What would a matrix representation of Z look like in this case? How does Theorem IMR read in this case?
Contributed by Robert Beezer
M80 In light of Theorem KNSI and Theorem MRCLT, write a short
comparison of Exercise MM.T40 with Exercise ILT.T15.
Contributed by Robert Beezer
M81 In light of Theorem RCSI and Theorem MRCLT, write a short
comparison of Exercise CRS.T40 with Exercise SLT.T15.
Contributed by Robert Beezer
M82 In light of Theorem MRCLT and Theorem IMR, write a short comparison
of Theorem SS and Theorem ICLT.
Contributed by Robert Beezer
M83 In light of Theorem MRCLT and Theorem IMR, write a short comparison
of Theorem NPNT and Exercise IVLT.T40.
Contributed by Robert Beezer
T20 Construct a new solution to Exercise B.T50 along the following outline. From the n × n matrix A, construct the linear transformation T : {ℂ}^{n} → {ℂ}^{n}, T\left (x\right ) = Ax. Use Theorem NI, Theorem IMILT and Theorem ILTIS to translate between the nonsingularity of A and the surjectivity/injectivity of T. Then apply Theorem ILTB and Theorem SLTB to connect these properties with bases.
Contributed by Robert Beezer Solution [1757]
T60 Create an entirely different proof of Theorem IMILT that relies on Definition IVLT to establish the invertibility of T, and that relies on Definition MI to establish the invertibility of A.
Contributed by Robert Beezer
T80 Suppose that T : U → V and S : V → W are linear transformations, and that B, C and D are bases for U, V , and W. Using only Definition MR define matrix representations for T and S. Using these two definitions, and Definition MR, derive a matrix representation for the composition S ∘ T in terms of the entries of the matrices {M}_{B,C}^{T } and {M}_{C,D}^{S}. Explain how you would use this result to motivate a definition for matrix multiplication that is strikingly similar to Theorem EMP.
Contributed by Robert Beezer Solution [1757]
C20 Contributed by Robert Beezer Statement [1729]
Apply Definition MR,
These four vectors are the columns of the matrix representation,
{M}_{B,C}^{T } = \left [\array{
1 & −4 & 5 & −3 \cr
−2 & 1 & −3 & 4 \cr
3 & 0 & 2 & −3 } \right ]
C21 Contributed by Robert Beezer Statement [1729]
Applying Definition MR,
So the resulting matrix representation is
C22 Contributed by Robert Beezer Statement [1730]
Input to T the vectors of
the basis B and coordinatize
the outputs relative to C,
Applying Definition MR we have the matrix representation
{M}_{B,C}^{T } = \left [\array{
2 & 1 & 2 \cr
1 & 4 & 3 \cr
1 & 1 & 0 } \right ]
To compute T\left (3 + 5x − 2{x}^{2}\right ) employ Theorem FTMR,
You can, of course, check your answer by evaluating T\left (3 + 5x − 2{x}^{2}\right ) directly.
C25 Contributed by Robert Beezer Statement [1731]
Choose bases B
and C
for the matrix representation,
Input to T the vectors of the basis B and coordinatize the outputs relative to C,
Applying Definition MR we have the matrix representation
{M}_{B,C}^{T } = \left [\array{
−1 & 4 & 1 & 2 \cr
4 & −1 & 6 & −1 \cr
1 & 5 & −2 & 2 \cr
1 & 0 & 2 & 5 } \right ]
Properties of this matrix representation will translate to properties of the linear transformation. The matrix representation is nonsingular since it row-reduces to the identity matrix (Theorem NMRRI) and therefore has a column space equal to {ℂ}^{4} (Theorem CNMB). The column space of the matrix representation is isomorphic to the range of the linear transformation (Theorem RCSI). So the range of T has dimension 4, equal to the dimension of the codomain {M}_{22}. By Theorem ROSLT, T is surjective.
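The nonsingularity claimed here can be double-checked by machine. The following is a minimal sketch, not part of the original solution: it uses a nonzero determinant (Theorem SMZD) as a stand-in for row-reducing to the identity, and the helper name `det` is our own.

```python
# Cofactor (Laplace) expansion along the first row.
# Slow in general, but perfectly adequate for a 4 x 4 example.
def det(m):
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

# Matrix representation of T from the solution to C25.
M = [
    [-1, 4, 1, 2],
    [4, -1, 6, -1],
    [1, 5, -2, 2],
    [1, 0, 2, 5],
]

d = det(M)
print(d)  # any nonzero value means M is nonsingular
```

The determinant works out to 538, which is nonzero, so the matrix representation is nonsingular and the argument for surjectivity goes through.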
C30 Contributed by Robert Beezer Statement [1731]
These subspaces will be easiest to construct by analyzing a matrix representation
of S.
Since we can use any matrix representation, we might as well use natural
bases that allow us to construct the matrix representation quickly and
easily,
then we can practically build the matrix representation on sight,
{M}_{B,C}^{S} = \left [\array{
1 & 2 & 5 & −4 \cr
3 & −1 & 8 & 2 \cr
1 & 1 & 4 & −2 } \right ]
The first step is to find bases for the null space and column space of the matrix representation. Row-reducing the matrix representation we find,
\left [\array{
\text{1} & 0 & 3 & 0 \cr
0 & \text{1} & 1 & −2 \cr
0 & 0 & 0 & 0 } \right ]
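For readers who want to confirm the row reduction, here is a small sketch in exact rational arithmetic. It also reads off a null space basis in the style of Theorem BNS and a column space basis in the style of Theorem BCS. The function names `rref` and `null_space_basis` are our own, and the output is at the level of the matrix only; converting to the kernel and range of S still requires the bases B and C chosen in this solution.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row-echelon form, list of pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for c in range(len(m[0])):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]      # scale to a leading 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def null_space_basis(rows):
    """One basis vector per free column, as in Theorem BNS."""
    m, pivots = rref(rows)
    n = len(m[0])
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for r, p in enumerate(pivots):
            v[p] = -m[r][f]
        basis.append(v)
    return basis

M = [[1, 2, 5, -4], [3, -1, 8, 2], [1, 1, 4, -2]]
R, pivots = rref(M)
null_basis = null_space_basis(M)
# Theorem BCS: the pivot columns of the original matrix span the column space.
col_basis = [[row[c] for row in M] for c in pivots]

print([[int(x) for x in v] for v in null_basis])
print(col_basis)
```

With this matrix the pivots fall in the first two columns, so the null space has dimension 2 and the column space has dimension 2.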
So by Theorem BNS and Theorem BCS, we have
Now, the proofs of Theorem KNSI and Theorem RCSI tell us that we can apply {ρ}_{B}^{−1} and {ρ}_{C}^{−1} (respectively) to “un-coordinatize” and get bases for the kernel and range of the linear transformation S itself,
C40 Contributed by Robert Beezer Statement [1732]
The analysis of R
will be easiest if we analyze a matrix representation of
R.
Since we can use any matrix representation, we might as well use natural
bases that allow us to construct the matrix representation quickly and
easily,
then we can practically build the matrix representation on sight,
{M}_{B,C}^{R} = \left [\array{
1 & −1 & 0 \cr
2 & −3 & −2 \cr
1 & −1 & 1 } \right ]
This matrix representation is invertible (it has a nonzero determinant of − 1, Theorem SMZD, Theorem NI) so Theorem IMR tells us that the linear transformation R is also invertible. To find a formula for {R}^{−1} we compute,
C41 Contributed by Robert Beezer Statement [1732]
First, build a matrix representation of
S
(Definition MR). We are free to choose whatever bases we wish, so we should
choose ones that are easy to work with, such as
The resulting matrix representation is then
{M}_{B,C}^{S} = \left [\array{
3 & 1 \cr
2 & 1 } \right ]
This matrix is invertible, since it has a nonzero determinant, so by Theorem IMR the linear transformation S is invertible. We can use the matrix inverse and Theorem IMR to find a formula for the inverse linear transformation,
C42 Contributed by Robert Beezer Statement [1733]
Choose bases B
and C
for {M}_{12}
and {M}_{21}
(respectively),
The resulting matrix representation is
{M}_{B,C}^{R} = \left [\array{
1 & 3 \cr
4 & 11 } \right ]
This matrix is invertible (its determinant is nonzero, Theorem SMZD), so by Theorem IMR, we can compute the matrix representation of {R}^{−1} with a matrix inverse (Theorem TTMI),
{M}_{C,B}^{{R}^{−1}} = {\left [\array{
1 & 3 \cr
4 & 11 } \right ]}^{−1} = \left [\array{
−11 & 3 \cr
4 & −1 } \right ]
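A quick sanity check on the inverse just computed: multiplying the two matrices in either order should give the identity matrix. This is a minimal sketch with a hand-rolled `matmul` helper (our own name, not from the text).

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

M = [[1, 3], [4, 11]]         # representation of R relative to B and C
M_inv = [[-11, 3], [4, -1]]   # claimed representation of R^{-1}

identity = [[1, 0], [0, 1]]
left = matmul(M, M_inv)
right = matmul(M_inv, M)
print(left == identity and right == identity)
```

Both products come out as the 2 × 2 identity, which confirms the inverse before it is fed into Theorem FTMR.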
To obtain a general formula for {R}^{−1}, use Theorem FTMR,
C50 Contributed by Robert Beezer Statement [1733]
As usual, build any matrix representation of
L, most
likely using “nice” bases, such as
Then the matrix representation (Definition MR) is,
{M}_{B,C}^{L} = \left [\array{
1 & 2 & 4 & 1 \cr
3 & 0 & 1 & −2 \cr
−1 & 1 & 3 & 3 } \right ]
Theorem RCSI tells us that we can compute the column space of the matrix representation, then use the isomorphism {ρ}_{C}^{−1} to convert the column space of the matrix representation into the range of the linear transformation. So we first analyze the matrix representation,
\left [\array{
1 & 2 & 4 & 1 \cr
3 & 0 & 1 & −2 \cr
−1 & 1 & 3 & 3 } \right ]\mathop{\longrightarrow}\limits_{}^{\text{RREF}}\left [\array{
\text{1} & 0 & 0 & −1 \cr
0 & \text{1} & 0 & −1 \cr
0 & 0 & \text{1} & 1 } \right ]
With three nonzero rows in the reduced row-echelon form of the matrix, we know the column space has dimension 3. Since {P}_{2} has dimension 3 (Theorem DP), the range must be all of {P}_{2}. So any basis of {P}_{2} would suffice as a basis for the range. For instance, C itself would be a correct answer.
A more laborious approach would be to use Theorem BCS and choose the first three columns of the matrix representation as a basis for the column space of the matrix representation. These could then be “un-coordinatized” with {ρ}_{C}^{−1} to yield a (“not nice”) basis for {P}_{2}.
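Both observations in this solution can be checked directly from the matrix representation. The sketch below is our own illustration: it confirms that the first three columns are linearly independent (nonzero 3 × 3 determinant) and that the fourth column is the combination of them recorded in the last column of the reduced row-echelon form. The helper `det3` is a hypothetical name.

```python
M = [[1, 2, 4, 1],
     [3, 0, 1, -2],
     [-1, 1, 3, 3]]

# Columns of the matrix representation, as lists.
cols = [[row[j] for row in M] for j in range(4)]

def det3(a, b, c):
    """Determinant of the 3 x 3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - c[1] * b[2])
            - b[0] * (a[1] * c[2] - c[1] * a[2])
            + c[0] * (a[1] * b[2] - b[1] * a[2]))

d = det3(cols[0], cols[1], cols[2])

# Last column of the RREF is (-1, -1, 1): column 4 = -col1 - col2 + col3.
combo = [-a - b + c for a, b, c in zip(cols[0], cols[1], cols[2])]

print(d, combo == cols[3])
```

A nonzero determinant means the three pivot columns already span a 3-dimensional space, which is why the range is all of {P}_{2}.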
C52 Contributed by Robert Beezer Statement [1734]
Choose bases B
and C
for the matrix representation,
Input to T the vectors of the basis B and coordinatize the outputs relative to C,
Applying Definition MR we have the matrix representation
{M}_{B,C}^{T } = \left [\array{
1 & 2 & −2 \cr
2 & 2 & 0 \cr
−1 & 1 & −4 \cr
3 & 2 & 2 } \right ]
The null space of the matrix representation is isomorphic (via {ρ}_{B}) to the kernel of the linear transformation (Theorem KNSI). So we compute the null space of the matrix representation by first row-reducing the matrix to,
\left [\array{
\text{1} & 0 & 2 \cr
0 & \text{1} & −2 \cr
0 & 0 & 0 \cr
0 & 0 & 0 } \right ]
Employing Theorem BNS we have
N\kern -1.95872pt \left ({M}_{B,C}^{T }\right ) = \left \langle \left \{\left [\array{
−2 \cr
2 \cr
1 } \right ]\right \}\right \rangle
We only need to uncoordinatize this one basis vector to get a basis for K\kern -1.95872pt \left (T\right ),
K\kern -1.95872pt \left (T\right ) = \left \langle \left \{{ρ}_{B}^{−1}\left (\left [\array{
−2 \cr
2 \cr
1 } \right ]\right )\right \}\right \rangle = \left \langle \left \{−2 + 2x + {x}^{2}\right \}\right \rangle
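As a check on the kernel, the matrix representation should send the coordinate vector above to the zero vector. The sketch below does that multiplication and then un-coordinatizes. It assumes B = {1, x, x^2}, which is what the displayed result −2 + 2x + x^2 implies; the helper `uncoordinatize` and its text output format are our own.

```python
M = [[1, 2, -2],
     [2, 2, 0],
     [-1, 1, -4],
     [3, 2, 2]]

v = [-2, 2, 1]  # basis vector for the null space of M

# Matrix-vector product: one dot product per row of M.
image = [sum(m * x for m, x in zip(row, v)) for row in M]

def uncoordinatize(coords):
    """rho_B^{-1} for the assumed basis B = {1, x, x^2}, rendered as text."""
    return " + ".join(f"({c})x^{k}" for k, c in enumerate(coords))

print(image)
print(uncoordinatize(v))
```

Every entry of the product is zero, so the vector really does lie in the null space, and the corresponding polynomial lies in the kernel of T.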
M20 Contributed by Robert Beezer Statement [1734]
Build a matrix representation (Definition MR) with the set
B = \left \{1,\kern 1.95872pt x,\kern 1.95872pt {x}^{2},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {x}^{n}\right \}
employed as a basis of both the domain and codomain. Then
and the resulting matrix representation is
{M}_{B,B}^{D} = \left [\array{
0 & 1 & 0 & 0 & \mathop{\mathop{…}}\kern 1.95872pt & 0 & 0 \cr
0 & 0 & 2 & 0 & \mathop{\mathop{…}}\kern 1.95872pt & 0 & 0 \cr
0 & 0 & 0 & 3 & \mathop{\mathop{…}}\kern 1.95872pt & 0 & 0 \cr
 & \mathop{\mathop{⋮}} & & & \mathrel{⋱} & & \mathop{\mathop{⋮}} \cr
0 & 0 & 0 & 0 & \mathop{\mathop{…}}\kern 1.95872pt & 0 & n \cr
0 & 0 & 0 & 0 & \mathop{\mathop{…}}\kern 1.95872pt & 0 & 0 } \right ]
This (n + 1) × (n + 1) matrix is very close to being in reduced row-echelon form. Multiply row i by {1\over i} , for 1 ≤ i ≤ n, to convert it to reduced row-echelon form. From this we can see that the matrix representation {M}_{B,B}^{D} has rank n and nullity 1. Applying Theorem RCSI and Theorem KNSI tells us that the linear transformation D will have the same values for the rank and nullity, as well.
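To make the pattern concrete, here is a minimal sketch that builds this matrix for a particular n (we pick n = 4 purely for illustration), counts its nonzero rows, and uses it to differentiate a sample polynomial through its coordinate vector. The function names and the sample polynomial are our own choices.

```python
def derivative_matrix(n):
    """(n+1) x (n+1) representation of D on P_n relative to B = {1, x, ..., x^n}."""
    size = n + 1
    m = [[0] * size for _ in range(size)]
    for j in range(1, size):
        # D(x^j) = j x^(j-1): column j has the single entry j in row j-1.
        m[j - 1][j] = j
    return m

def apply(m, coords):
    """Matrix-vector product on a coordinate vector."""
    return [sum(a * c for a, c in zip(row, coords)) for row in m]

n = 4
M = derivative_matrix(n)

# The matrix is already in echelon form, so rank = number of nonzero rows.
rank = sum(1 for row in M if any(row))
nullity = (n + 1) - rank

p = [7, 0, 3, 0, 2]   # coordinates of 7 + 3x^2 + 2x^4
dp = apply(M, p)      # coordinates of the derivative

print(rank, nullity, dp)
```

For n = 4 this gives rank 4 and nullity 1, and the coordinate vector of the derivative corresponds to 6x + 8x^3, matching ordinary differentiation.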
T20 Contributed by Robert Beezer Statement [1736]
Given the nonsingular n × n
matrix A, create the
linear transformation T : {ℂ}^{n} → {ℂ}^{n}
defined by T\left (x\right ) = Ax.
Then
T80 Contributed by Robert Beezer Statement [1736]
Suppose that B = \left \{{u}_{1},\kern 1.95872pt {u}_{2},\kern 1.95872pt {u}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {u}_{m}\right \},
C = \left \{{v}_{1},\kern 1.95872pt {v}_{2},\kern 1.95872pt {v}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {v}_{n}\right \} and
D = \left \{{w}_{1},\kern 1.95872pt {w}_{2},\kern 1.95872pt {w}_{3},\kern 1.95872pt \mathop{\mathop{…}},\kern 1.95872pt {w}_{p}\right \}. For
convenience, set M = {M}_{B,C}^{T },
{m}_{ij} ={ \left [M\right ]}_{ij},
1 ≤ i ≤ n,
1 ≤ j ≤ m, and
similarly, set N = {M}_{C,D}^{S},
{n}_{ij} ={ \left [N\right ]}_{ij},
1 ≤ i ≤ p,
1 ≤ j ≤ n.
We want to learn about the matrix representation of
S ∘ T : U → W relative
to B
and D.
We will examine a single (generic) entry of this representation.
This exercise is meant to explain why many authors take the formula in Theorem EMP as their definition of matrix multiplication, and why it is a natural choice when the proper motivation is in place. If we first defined matrix multiplication in the style of Theorem EMP, then the above argument, followed by a simple application of the definition of matrix equality (Definition ME), would yield Theorem MRCLT.
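The entry-by-entry formula can also be spot-checked numerically. The sketch below is only an illustration under simplifying assumptions: T and S are two hypothetical linear maps between spaces of column vectors, and all three bases are standard bases, so coordinatization does nothing. The representation of S ∘ T is built directly from Definition MR and compared with the sums of products of entries of the two individual representations.

```python
def T(x):
    """Hypothetical linear map from C^2 to C^3."""
    x1, x2 = x
    return [x1 + 2 * x2, 3 * x1, x2 - x1]

def S(y):
    """Hypothetical linear map from C^3 to C^2."""
    y1, y2, y3 = y
    return [y1 - y3, 2 * y2 + y3]

def representation(f, dim):
    """Definition MR with standard bases: column j is f(e_j)."""
    columns = []
    for j in range(dim):
        e = [1 if k == j else 0 for k in range(dim)]
        columns.append(f(e))
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]

M = representation(T, 2)                          # 3 x 2
N = representation(S, 3)                          # 2 x 3
composite = representation(lambda x: S(T(x)), 2)  # 2 x 2, built from S o T

# Entry (i, j) as a sum over k of [N]_ik [M]_kj, the pattern of Theorem EMP.
entrywise = [[sum(N[i][k] * M[k][j] for k in range(3)) for j in range(2)]
             for i in range(2)]

print(composite == entrywise)
```

A single example proves nothing, of course, but it shows the bookkeeping: the representation of the composition is assembled without ever invoking matrix multiplication, and it still agrees with the entry formula.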