
Section B Bases

A basis of a vector space is one of the most useful concepts in linear algebra. It often provides a concise, finite description of an infinite vector space.

Subsection B Bases

We now have all the tools in place to define a basis of a vector space.

Definition B. Basis.

Suppose \(V\) is a vector space. Then a subset \(S\subseteq V\) is a basis of \(V\) if it is linearly independent and spans \(V\text{.}\)

So, a basis is a linearly independent spanning set for a vector space. The requirement that the set spans \(V\) ensures that \(S\) has enough raw material to build \(V\text{,}\) while the linear independence requirement ensures that we do not have any more raw material than we need. As we shall see soon in Section D, a basis is a minimal spanning set.

You may have noticed that we used the term basis for some of the titles of previous theorems (e.g. Theorem BNS, Theorem BCS, Theorem BRS) and if you review each of these theorems you will see that their conclusions provide linearly independent spanning sets for sets that we now recognize as subspaces of \(\complex{m}\text{.}\) Examples associated with these theorems include Example NSLIL, Example CSOCD and Example IAS. As we will see, these three theorems will continue to be powerful tools, even in the setting of more general vector spaces.

Furthermore, the archetypes contain an abundance of bases. For each coefficient matrix of a system of equations, and for each archetype defined simply as a matrix, there is a basis for the null space, three bases for the column space, and a basis for the row space. For this reason, our subsequent examples will concentrate on bases for vector spaces other than \(\complex{m}\text{.}\)

Notice that Definition B does not preclude a vector space from having many bases, and this is the case, as hinted above by the statement that the archetypes contain three bases for the column space of a matrix. More generally, we can grab any basis for a vector space, multiply any one basis vector by a nonzero scalar and create a slightly different set that is still a basis. For “important” vector spaces, it will be convenient to have a collection of “nice” bases. When a vector space has a single particularly nice basis, it is sometimes called the standard basis though there is nothing precise enough about this term to allow us to define it formally — it is a question of style. Here are some nice bases for important vector spaces.

We must show that the set \(B\) is both linearly independent and a spanning set for \(\complex{m}\text{.}\) First, the vectors in \(B\) are, by Definition SUV, the columns of the identity matrix, which we know is nonsingular (since it row-reduces to the identity matrix, Theorem NMRRI). And the columns of a nonsingular matrix are linearly independent by Theorem NMLIC.

Suppose we grab an arbitrary vector from \(\complex{m}\text{,}\) say

\begin{equation*} \vect{v}=\colvector{v_1\\v_2\\v_3\\\vdots\\v_m}\text{.} \end{equation*}

Can we write \(\vect{v}\) as a linear combination of the vectors in \(B\text{?}\) Yes, and quite simply.

\begin{align*} \colvector{v_1\\v_2\\v_3\\\vdots\\v_m}&= v_1\colvector{1\\0\\0\\\vdots\\0}+ v_2\colvector{0\\1\\0\\\vdots\\0}+ v_3\colvector{0\\0\\1\\\vdots\\0}+ \cdots+ v_m\colvector{0\\0\\0\\\vdots\\1}\\ \vect{v}&=v_1\vect{e}_1+v_2\vect{e}_2+v_3\vect{e}_3+\cdots+v_m\vect{e}_m \end{align*}

This shows that \(\complex{m}\subseteq\spn{B}\text{,}\) which is sufficient to show that \(B\) is a spanning set for \(\complex{m}\text{.}\)

The vector space of polynomials with degree at most \(n\text{,}\) \(P_n\text{,}\) has the basis

\begin{equation*} B=\set{1,\,x,\,x^2,\,x^3,\,\ldots,\,x^n}\text{.} \end{equation*}

Another nice basis for \(P_n\) is

\begin{equation*} C=\set{1,\,1+x,\,1+x+x^2,\,1+x+x^2+x^3,\,\ldots,\,1+x+x^2+x^3+\cdots+x^n}\text{.} \end{equation*}

Checking that each of \(B\) and \(C\) is a linearly independent spanning set is a good exercise.
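As a hedged illustration of how the check for \(C\) can go, consider only the smallest interesting case, \(n=2\text{.}\) The scalars \(a_0,\,a_1,\,a_2\) and the coefficients \(c_0,\,c_1,\,c_2\) are names introduced just for this sketch. A relation of linear dependence on \(C\) in \(P_2\) reads

\begin{align*} 0+0x+0x^2&=a_0(1)+a_1(1+x)+a_2(1+x+x^2)\\ &=(a_0+a_1+a_2)+(a_1+a_2)x+a_2x^2\text{.} \end{align*}

Equating coefficients, starting with \(x^2\text{,}\) gives \(a_2=0\text{,}\) then \(a_1=0\text{,}\) then \(a_0=0\text{,}\) so the relation is trivial. For spanning, an arbitrary \(c_0+c_1x+c_2x^2\) is matched by choosing \(a_2=c_2\text{,}\) \(a_1=c_1-c_2\) and \(a_0=c_0-c_1\text{.}\) The same triangular pattern is one way to organize the argument for general \(n\text{.}\)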

In the vector space \(M_{mn}\) of matrices (Example VSM) define the matrices \(B_{k\ell}\text{,}\) \(1\leq k\leq m\text{,}\) \(1\leq\ell\leq n\) by

\begin{equation*} \matrixentry{B_{k\ell}}{ij}=\begin{cases} 1&\text{if }k=i,\,\ell=j\\ 0&\text{otherwise} \end{cases}\text{.} \end{equation*}

So these matrices have entries that are all zeros, with the exception of a lone entry that is one. The set of all \(mn\) of them,

\begin{equation*} B=\setparts{B_{k\ell}}{1\leq k\leq m,\ 1\leq\ell\leq n} \end{equation*}

forms a basis for \(M_{mn}\text{.}\) See Exercise B.M20.

The bases described above will often be convenient ones to work with. However, a basis need not look like a basis at first glance.

In Example SSP4 we showed that

\begin{equation*} S=\set{x-2,\,x^2-4x+4,\,x^3-6x^2+12x-8,\,x^4-8x^3+24x^2-32x+16} \end{equation*}

is a spanning set for \(W=\setparts{p(x)}{p\in P_4,\ p(2)=0}\text{.}\) We will now show that \(S\) is also linearly independent in \(W\text{.}\) Begin with a relation of linear dependence,

\begin{align*} 0+0x&+0x^2+0x^3+0x^4\\ &=\alpha_1\left(x-2\right)+\alpha_2\left(x^2-4x+4\right)+\alpha_3\left(x^3-6x^2+12x-8\right)\\ &\quad\quad +\alpha_4\left(x^4-8x^3+24x^2-32x+16\right)\\ &=\alpha_4x^4+ \left(\alpha_3-8\alpha_4\right)x^3+ \left(\alpha_2-6\alpha_3+24\alpha_4\right)x^2\\ &\quad\quad + \left(\alpha_1-4\alpha_2+12\alpha_3-32\alpha_4\right)x+ \left(-2\alpha_1+4\alpha_2-8\alpha_3+16\alpha_4\right)\text{.} \end{align*}

Equating coefficients (vector equality in \(P_4\)) gives the homogeneous system of five equations in four variables,

\begin{align*} \alpha_4&=0\\ \alpha_3-8\alpha_4&=0\\ \alpha_2-6\alpha_3+24\alpha_4&=0\\ \alpha_1-4\alpha_2+12\alpha_3-32\alpha_4&=0\\ -2\alpha_1+4\alpha_2-8\alpha_3+16\alpha_4&=0\text{.} \end{align*}

We form the coefficient matrix, and row-reduce to obtain a matrix in reduced row-echelon form

\begin{equation*} \begin{bmatrix} \leading{1}&0&0&0\\ 0&\leading{1}&0&0\\ 0&0&\leading{1}&0\\ 0&0&0&\leading{1}\\ 0&0&0&0 \end{bmatrix}\text{.} \end{equation*}

With only the trivial solution to this homogeneous system, we conclude that the only scalars that will form a relation of linear dependence are the trivial ones, and therefore the set \(S\) is linearly independent (Definition LI). Finally, \(S\) has earned the right to be called a basis for \(W\) (Definition B).
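For readers who would rather not row-reduce, here is one possible shortcut, offered as a sketch rather than as the method of the example. The first four equations of the system above are triangular, so working from the top down gives

\begin{align*} \alpha_4&=0\\ \alpha_3&=8\alpha_4=0\\ \alpha_2&=6\alpha_3-24\alpha_4=0\\ \alpha_1&=4\alpha_2-12\alpha_3+32\alpha_4=0 \end{align*}

and the fifth equation is then satisfied automatically. This agrees with the four pivot columns of the reduced row-echelon form.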

In Example SSM22 we discovered that

\begin{equation*} Q=\set{ \begin{bmatrix}-3&1\\0&0\end{bmatrix},\, \begin{bmatrix}1&0\\-4&1\end{bmatrix} } \end{equation*}

is a spanning set for the subspace

\begin{equation*} Z=\setparts{\begin{bmatrix}a&b\\c&d\end{bmatrix}}{a+3b-c-5d=0,\ -2a-6b+3c+14d=0} \end{equation*}

of the vector space of all \(2\times 2\) matrices, \(M_{22}\text{.}\) If we can also determine that \(Q\) is linearly independent in \(Z\) (or in \(M_{22}\)), then it will qualify as a basis for \(Z\text{.}\)

Let us begin with a relation of linear dependence.

\begin{align*} \begin{bmatrix}0&0\\0&0\end{bmatrix} &= \alpha_1\begin{bmatrix}-3&1\\0&0\end{bmatrix}+ \alpha_2\begin{bmatrix}1&0\\-4&1\end{bmatrix}\\ &=\begin{bmatrix} -3\alpha_1 +\alpha_2 & \alpha_1\\ -4\alpha_2 & \alpha_2 \end{bmatrix} \end{align*}

Using our definition of matrix equality (Definition ME) we equate entries and get a homogeneous system of four equations in two variables,

\begin{align*} -3\alpha_1 +\alpha_2&=0\\ \alpha_1&=0\\ -4\alpha_2&=0\\ \alpha_2&=0\text{.} \end{align*}

We could row-reduce the coefficient matrix of this homogeneous system, but it is not necessary. The second and fourth equations tell us that \(\alpha_1=0\text{,}\) \(\alpha_2=0\) is the only solution to this homogeneous system. This qualifies the set \(Q\) as being linearly independent, since the only relation of linear dependence is trivial (Definition LI). Therefore \(Q\) is a basis for \(Z\) (Definition B).

In Example LIC and Example SSC we determined that the set \(R=\set{(1,\,0),\,(6,\,3)}\) from the crazy vector space, \(C\) (Example CVS), is linearly independent and is a spanning set for \(C\text{.}\) By Definition B we see that \(R\) is a basis for \(C\text{.}\)

We have seen that several of the sets associated with a matrix are subspaces of vector spaces of column vectors. Specifically these are the null space (Theorem NSMS), column space (Theorem CSMS), row space (Theorem RSMS) and left null space (Theorem LNSMS). As subspaces they are vector spaces (Definition S) and it is natural to ask about bases for these vector spaces. Theorem BNS, Theorem BCS, and Theorem BRS each have conclusions that provide linearly independent spanning sets for (respectively) the null space, column space, and row space. Notice that each of these theorems contains the word “basis” in its title, even though we did not know the precise meaning of the word at the time. To find a basis for a left null space we can use the definition of this subspace as a null space (Definition LNS) and apply Theorem BNS. Alternatively, Theorem FS tells us that the left null space can be expressed as a row space, and we can then use Theorem BRS.

Theorem BS is another early result that provides a linearly independent spanning set (i.e. a basis) as its conclusion. If a vector space of column vectors can be expressed as a span of a set of column vectors, then Theorem BS can be employed in a straightforward manner to quickly yield a basis.

Subsection BSCV Bases for Spans of Column Vectors

We have seen several examples of bases in different vector spaces. In this subsection, and the next (Subsection B.BNM), we will consider building bases for \(\complex{m}\) and its subspaces.

Suppose we have a subspace of \(\complex{m}\) that is expressed as the span of a set of vectors, \(S\text{,}\) and \(S\) is not necessarily linearly independent, or perhaps not very attractive. Theorem REMRS says that row-equivalent matrices have identical row spaces, while Theorem BRS says the nonzero rows of a matrix in reduced row-echelon form are a basis for the row space. These theorems together give us a great computational tool for quickly finding a basis for a subspace that is expressed originally as a span.

When we first defined the span of a set of column vectors, in Example SCAD we looked at the set

\begin{equation*} W=\spn{\set{ \colvector{2\\-3\\1},\, \colvector{1\\4\\1},\, \colvector{7\\-5\\4},\, \colvector{-7\\-6\\-5} }} \end{equation*}

with an eye towards realizing \(W\) as the span of a smaller set. By building relations of linear dependence (though we did not know them by that name then) we were able to remove two vectors and write \(W\) as the span of the other two vectors. These two remaining vectors formed a linearly independent set, even though we did not know that at the time.

Now we know that \(W\) is a subspace and must have a basis. Consider the matrix, \(C\text{,}\) whose rows are the vectors in the spanning set for \(W\text{,}\)

\begin{equation*} C=\begin{bmatrix} 2 & -3 & 1\\ 1 & 4 & 1\\ 7 & -5 & 4\\ -7 & -6 & -5 \end{bmatrix}\text{.} \end{equation*}

Then, by Definition RSM, the row space of \(C\) will be \(W\text{,}\) \(\rsp{C}=W\text{.}\) Theorem BRS tells us that if we row-reduce \(C\text{,}\) the nonzero rows of the row-equivalent matrix in reduced row-echelon form will be a basis for \(\rsp{C}\text{,}\) and hence a basis for \(W\text{.}\) Let us do it: \(C\) row-reduces to

\begin{equation*} \begin{bmatrix} \leading{1} & 0 & \frac{7}{11}\\ 0 & \leading{1} & \frac{1}{11}\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{bmatrix}\text{.} \end{equation*}

If we convert the two nonzero rows to column vectors then we have a basis,

\begin{equation*} B=\set{\colvector{1\\0\\\frac{7}{11}},\,\colvector{0\\1\\\frac{1}{11}}} \end{equation*}

and

\begin{equation*} W=\spn{\set{\colvector{1\\0\\\frac{7}{11}},\,\colvector{0\\1\\\frac{1}{11}}}}\text{.} \end{equation*}

For aesthetic reasons, we might wish to multiply each vector in \(B\) by \(11\text{,}\) which will not change the spanning or linear independence properties of \(B\) as a basis. Then we can also write

\begin{equation*} W=\spn{\set{\colvector{11\\0\\7},\,\colvector{0\\11\\1}}}\text{.} \end{equation*}

Example IAS provides another example of this flavor, though now we can notice that \(X\) is a subspace, and that the resulting set of three vectors is a basis. This is such a powerful technique that we should do one more example.

In Example RSC5 we began with a set of \(n=4\) vectors from \(\complex{5}\text{,}\)

\begin{equation*} R=\set{\vect{v}_1,\,\vect{v}_2,\,\vect{v}_3,\,\vect{v}_4} = \set{ \colvector{1\\2\\-1\\3\\2},\, \colvector{2\\1\\3\\1\\2},\, \colvector{0\\-7\\6\\-11\\-2},\, \colvector{4\\1\\2\\1\\6} } \end{equation*}

and defined \(V=\spn{R}\text{.}\) Our goal in that problem was to find a relation of linear dependence on the vectors in \(R\text{,}\) solve the resulting equation for one of the vectors, and re-express \(V\) as the span of a set of three vectors.

Here is another way to accomplish something similar. The row space of the matrix

\begin{equation*} A=\begin{bmatrix} 1 & 2 & -1 & 3 & 2\\ 2 & 1 & 3 & 1 & 2\\ 0 & -7 & 6 & -11 & -2\\ 4 & 1 & 2 & 1 & 6 \end{bmatrix} \end{equation*}

is equal to \(\spn{R}\text{.}\) By Theorem BRS we can row-reduce this matrix, ignore any zero rows, and use the nonzero rows as column vectors that are a basis for the row space of \(A\text{.}\) Row-reducing \(A\) creates the matrix

\begin{equation*} \begin{bmatrix} 1 & 0 & 0 & -\frac{1}{17} & \frac{30}{17}\\ 0 & 1 & 0 & \frac{25}{17} & -\frac{2}{17}\\ 0 & 0 & 1 & -\frac{2}{17} & -\frac{8}{17}\\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}\text{.} \end{equation*}

So

\begin{equation*} \set{ \colvector{1\\0\\0\\-\frac{1}{17}\\\frac{30}{17}},\, \colvector{0\\1\\0\\\frac{25}{17}\\-\frac{2}{17}},\, \colvector{0\\0\\1\\-\frac{2}{17}\\-\frac{8}{17}} } \end{equation*}

is a basis for \(V\text{.}\) Our theorem tells us this is a basis, so there is no need to verify that the subspace spanned by three vectors (rather than four) is the identical subspace, and there is no need to verify that we have reached the limit in reducing the set, since the set of three vectors is guaranteed to be linearly independent.
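The row-reduction in this example can be reproduced mechanically. The following is a minimal sketch in plain Python with exact rational arithmetic, not the Sage code that accompanies the text; the function names `rref` and `row_space_basis` are introduced here only for illustration. It places the vectors of \(R\) in the rows of a matrix, row-reduces, and keeps the nonzero rows, as Theorem BRS suggests.

```python
from fractions import Fraction

def rref(rows):
    """Return the reduced row-echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivot_row = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        src = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if src is None:
            continue
        m[pivot_row], m[src] = m[src], m[pivot_row]
        # Scale the pivot row so that its leading entry is 1.
        lead = m[pivot_row][col]
        m[pivot_row] = [x / lead for x in m[pivot_row]]
        # Clear every other entry in the pivot column.
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == len(m):
            break
    return m

def row_space_basis(rows):
    """Nonzero rows of the reduced row-echelon form (compare Theorem BRS)."""
    return [row for row in rref(rows) if any(x != 0 for x in row)]

# Rows are the four vectors of R from the example.
A = [
    [1, 2, -1, 3, 2],
    [2, 1, 3, 1, 2],
    [0, -7, 6, -11, -2],
    [4, 1, 2, 1, 6],
]
basis = row_space_basis(A)
for vec in basis:
    print([str(x) for x in vec])
```

Using `Fraction` keeps entries such as \(-\frac{1}{17}\) exact, so the printed rows can be compared directly with the matrix displayed in the example.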

Every vector space in Sage has a basis — you can obtain this with the vector space method .basis(), and the result is a list of vectors. Another method for a vector space is .basis_matrix() which outputs a matrix whose rows are the vectors of a basis. Sometimes one form is more convenient than the other, but notice that the description of a vector space chooses to print the basis matrix (since its display is just a bit easier to read). A vector space typically has many bases (infinitely many), so which one does Sage use? You will notice that the basis matrices displayed are in reduced row-echelon form — this is the defining property of the basis chosen by Sage.

Here is Example RSB again as an example of how bases are provided in Sage.

Or perhaps, “under the bonnet” if you learned your English in the Commonwealth. This is the first in a series that aims to explain how our knowledge of linear algebra theory helps us understand the design, construction and informed use of Sage.

How does Sage determine if two vector spaces are equal? Especially since these are infinite sets? One approach would be to take a spanning set for the first vector space (maybe a minimal spanning set), and ask if each element of the spanning set is an element of the second vector space. If so, the first vector space is a subset of the second. Then we could turn it around, and determine if the second vector space is a subset of the first. By Definition SE, the two vector spaces would be equal if both subset tests succeeded.

However, each time we would test if an element of a spanning set lives in a second vector space, we would need to solve a linear system. So for two large vector spaces, this could take a noticeable amount of time. There is a better way, made possible by exploiting two important theorems.

For every vector space, Sage creates a basis that uniquely identifies the vector space. We could call this a canonical basis. By Theorem REMRS we can span the row space of a matrix by the rows of any row-equivalent matrix. So if we begin with a vector space described by any basis (or any spanning set, for that matter), we can make a matrix with these vectors as rows, and the vector space is now the row space of the matrix. Of all the possible row-equivalent matrices, which would you pick? Of course, the reduced row-echelon version is useful, and here it is critical to realize this version is unique (Theorem RREFU).

So for every vector space, Sage takes a spanning set, makes its vectors the rows of a matrix, row-reduces the matrix and tosses out the zero rows. The result is what Sage calls an echelonized basis. Now, two vector spaces are equal if, and only if, they have equal echelonized basis matrices. It takes some computation to form the echelonized basis, but once built, the comparison of two echelonized bases can proceed very quickly by perhaps just comparing entries of the echelonized basis matrices.
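The comparison described above can be imitated in a few lines. What follows is a sketch of the idea only, written in plain Python and making no claim about how Sage is actually implemented; the names `echelonized_basis` and `same_span` are chosen here for illustration, and the spanning sets are assumed to be nonempty lists of vectors of equal length. The data are the spanning set for \(W\) from Example RSB and the rescaled basis found there.

```python
from fractions import Fraction

def echelonized_basis(vectors):
    """Row-reduce the matrix whose rows are `vectors` and drop the zero rows."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    lead = 0  # index of the next pivot row
    for col in range(len(rows[0])):
        pivot = next((r for r in range(lead, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[lead], rows[pivot] = rows[pivot], rows[lead]
        scale = rows[lead][col]
        rows[lead] = [x / scale for x in rows[lead]]
        for r in range(len(rows)):
            if r != lead:
                f = rows[r][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[lead])]
        lead += 1
        if lead == len(rows):
            break
    # Rows past index `lead` are zero rows, so only the pivot rows are kept.
    return tuple(tuple(row) for row in rows[:lead])

def same_span(first, second):
    """Two spans are equal exactly when their echelonized bases agree."""
    return echelonized_basis(first) == echelonized_basis(second)

W_original = [(2, -3, 1), (1, 4, 1), (7, -5, 4), (-7, -6, -5)]
W_rescaled = [(11, 0, 7), (0, 11, 1)]
other_plane = [(1, 0, 0), (0, 1, 0)]

print(same_span(W_original, W_rescaled))
print(same_span(W_original, other_plane))
```

The first comparison reports equal spans and the second reports different ones. The uniqueness of reduced row-echelon form (Theorem RREFU) is what justifies comparing the two tuples entry by entry.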

You might create a vector space with a basis you prefer (a user basis), but Sage always has an echelonized basis at hand. If you do not specify some alternate basis, this is the basis Sage will create and provide for you. We can now continue a discussion we began back in Sage SSNS. We have consistently used the basis='pivot' keyword when we construct null spaces. This is because we initially prefer to see the basis described in Theorem BNS, rather than Sage's default basis, the echelonized version. But the echelonized version is always present and available.

Subsection BNM Bases and Nonsingular Matrices

A quick source of diverse bases for \(\complex{m}\) is the set of columns of a nonsingular matrix.

(⇒) 

Suppose that the columns of \(A\) are a basis for \(\complex{m}\text{.}\) Then Definition B says the set of columns is linearly independent. Theorem NMLIC then says that \(A\) is nonsingular.

(⇐) 

Suppose that \(A\) is nonsingular. Then by Theorem NMLIC this set of columns is linearly independent. Theorem CSNM says that for a nonsingular matrix, \(\csp{A}=\complex{m}\text{.}\) This is equivalent to saying that the columns of \(A\) are a spanning set for the vector space \(\complex{m}\text{.}\) As a linearly independent spanning set, the columns of \(A\) qualify as a basis for \(\complex{m}\) (Definition B).

Archetype K is the \(5\times 5\) matrix

\begin{equation*} K=\begin{bmatrix} 10 & 18 & 24 & 24 & -12 \\ 12 & -2 & -6 & 0 & -18 \\ -30 & -21 & -23 & -30 & 39 \\ 27 & 30 & 36 & 37 & -30 \\ 18 & 24 & 30 & 30 & -20 \end{bmatrix} \end{equation*}

which is row-equivalent to the \(5\times 5\) identity matrix \(I_5\text{.}\) So by Theorem NMRRI, \(K\) is nonsingular. Then Theorem CNMB says the set

\begin{equation*} \set{\colvector{10\\12\\-30\\27\\18},\,\colvector{18\\-2\\-21\\30\\24},\,\colvector{24\\-6\\-23\\36\\30},\,\colvector{24\\0\\-30\\37\\30},\,\colvector{-12\\-18\\39\\-30\\-20}} \end{equation*}

is a (novel) basis of \(\complex{5}\text{.}\)

Perhaps we should view the fact that the standard unit vectors are a basis (Theorem SUVB) as just a simple corollary of Theorem CNMB? (See Proof Technique LC.)

With a new equivalence for a nonsingular matrix, we can update our list of equivalences.

We can easily illustrate our latest equivalence for nonsingular matrices.

Subsection OBC Orthonormal Bases and Coordinates

We learned about orthogonal sets of vectors in \(\complex{m}\) back in Section O, and we also learned that orthogonal sets are automatically linearly independent (Theorem OSLI). When an orthogonal set also spans a subspace of \(\complex{m}\text{,}\) then the set is a basis. And when the set is orthonormal, then the set is an incredibly nice basis. We will back up this claim with a theorem, but first consider how you might manufacture such a set.

Suppose that \(W\) is a subspace of \(\complex{m}\) with basis \(B\text{.}\) Then \(B\) spans \(W\) and is a linearly independent set of nonzero vectors. We can apply the Gram-Schmidt Procedure (Theorem GSP) and obtain a linearly independent set \(T\) such that \(\spn{T}=\spn{B}=W\) and \(T\) is orthogonal. In other words, \(T\) is a basis for \(W\text{,}\) and is an orthogonal set. By scaling each vector of \(T\) to norm 1, we can convert \(T\) into an orthonormal set, without destroying the properties that make it a basis of \(W\text{.}\) In short, we can convert any basis into an orthonormal basis. Example GSTV, followed by Example ONTV, illustrates this process.

Unitary matrices (Definition UM) are another good source of orthonormal bases (and vice versa). Suppose that \(Q\) is a unitary matrix of size \(n\text{.}\) Then the \(n\) columns of \(Q\) form an orthonormal set (Theorem CUMOS) that is therefore linearly independent (Theorem OSLI). Since \(Q\) is invertible (Theorem UMI), we know \(Q\) is nonsingular (Theorem NI), and then the columns of \(Q\) span \(\complex{n}\) (Theorem CSNM). So the columns of a unitary matrix of size \(n\) are an orthonormal basis for \(\complex{n}\text{.}\)

Why all the fuss about orthonormal bases? Theorem VRRB told us that any vector in a vector space could be written, uniquely, as a linear combination of basis vectors. For an orthonormal basis, finding the scalars for this linear combination is extremely easy, and this is the content of the next theorem. Furthermore, with vectors written this way (as linear combinations of the elements of an orthonormal set) certain computations and analysis become much easier. Here is the promised theorem.

Because \(B\) is a basis of \(W\text{,}\) Theorem VRRB tells us that we can write \(\vect{w}\) uniquely as a linear combination of the vectors in \(B\text{.}\) So it is not this aspect of the conclusion that makes this theorem interesting. What is interesting is that the particular scalars are so easy to compute. No need to solve big systems of equations — just do an inner product of \(\vect{w}\) with \(\vect{v}_i\) to arrive at the coefficient of \(\vect{v}_i\) in the linear combination.

So begin the proof by writing \(\vect{w}\) as a linear combination of the vectors in \(B\text{,}\) using unknown scalars,

\begin{equation*} \vect{w}=\lincombo{a}{v}{p} \end{equation*}

and compute,

\begin{align*} \innerproduct{\vect{v}_i}{\vect{w}} &=\innerproduct{\vect{v}_i}{\sum_{k=1}^{p}a_k\vect{v}_k}&& \knowl{./knowl/theorem-VRRB.html}{\text{Theorem VRRB}}\\ &=\sum_{k=1}^{p}\innerproduct{\vect{v}_i}{a_k\vect{v}_k}&& \knowl{./knowl/theorem-IPVA.html}{\text{Theorem IPVA}}\\ &=\sum_{k=1}^{p}a_k\innerproduct{\vect{v}_i}{\vect{v}_k}&& \knowl{./knowl/theorem-IPSM.html}{\text{Theorem IPSM}}\\ &=a_i\innerproduct{\vect{v}_i}{\vect{v}_i}+ \sum_{\substack{k=1\\k\neq i}}^{p}a_k\innerproduct{\vect{v}_i}{\vect{v}_k}&& \knowl{./knowl/property-C.html}{\text{Property C}}\\ &=a_i(1)+\sum_{\substack{k=1\\k\neq i}}^{p}a_k(0)&& \knowl{./knowl/definition-ONS.html}{\text{Definition ONS}}\\ &=a_i\text{.} \end{align*}

So the (unique) scalars for the linear combination are indeed the inner products advertised in the conclusion of the theorem's statement.

The set

\begin{equation*} \set{\vect{x}_1,\,\vect{x}_2,\,\vect{x}_3,\,\vect{x}_4}= \set{ \colvector{1+i\\1\\1-i\\i},\, \colvector{1+5i\\6+5i\\-7-i\\1-6i},\, \colvector{-7+34i\\-8-23i\\-10+22i\\30+13i},\, \colvector{-2-4i\\6+i\\4+3i\\6-i} } \end{equation*}

was proposed, and partially verified, as an orthogonal set in Example AOS. Let us scale each vector to norm 1, so as to form an orthonormal set in \(\complex{4}\text{.}\) Then by Theorem OSLI the set will be linearly independent, and by Theorem NME5 the set will be a basis for \(\complex{4}\text{.}\) So, once scaled to norm 1, the adjusted set will be an orthonormal basis of \(\complex{4}\text{.}\) The norms are

\begin{align*} \norm{\vect{x}_1}=\sqrt{6}&& \norm{\vect{x}_2}=\sqrt{174}&& \norm{\vect{x}_3}=\sqrt{3451}&& \norm{\vect{x}_4}=\sqrt{119}\text{.} \end{align*}

So an orthonormal basis is

\begin{align*} B&= \set{\vect{v}_1,\,\vect{v}_2,\,\vect{v}_3,\,\vect{v}_4}\\ &=\set{ \frac{1}{\sqrt{6}}\colvector{1+i\\1\\1-i\\i},\, \frac{1}{\sqrt{174}}\colvector{1+5i\\6+5i\\-7-i\\1-6i},\, \frac{1}{\sqrt{3451}}\colvector{-7+34i\\-8-23i\\-10+22i\\30+13i},\, \frac{1}{\sqrt{119}}\colvector{-2-4i\\6+i\\4+3i\\6-i} }\text{.} \end{align*}

Now, to illustrate Theorem COB, choose any vector from \(\complex{4}\text{,}\) say \(\vect{w}=\colvector{2\\-3\\1\\4}\text{,}\) and compute

\begin{align*} \innerproduct{\vect{v}_1}{\vect{w}}&=\frac{-5i}{\sqrt{6}}& \innerproduct{\vect{v}_2}{\vect{w}}&=\frac{-19+30i}{\sqrt{174}}\\ \innerproduct{\vect{v}_3}{\vect{w}}&=\frac{120-211i}{\sqrt{3451}}& \innerproduct{\vect{v}_4}{\vect{w}}&=\frac{6+12i}{\sqrt{119}}\text{.} \end{align*}

Then Theorem COB guarantees that

\begin{align*} \colvector{2\\-3\\1\\4}&= \frac{-5i}{\sqrt{6}}\left(\frac{1}{\sqrt{6}}\colvector{1+i\\1\\1-i\\i}\right)+ \frac{-19+30i}{\sqrt{174}}\left(\frac{1}{\sqrt{174}}\colvector{1+5i\\6+5i\\-7-i\\1-6i}\right)\\ &\quad\quad+ \frac{120-211i}{\sqrt{3451}}\left(\frac{1}{\sqrt{3451}}\colvector{-7+34i\\-8-23i\\-10+22i\\30+13i}\right)+ \frac{6+12i}{\sqrt{119}}\left(\frac{1}{\sqrt{119}}\colvector{-2-4i\\6+i\\4+3i\\6-i}\right) \end{align*}

as you might want to check (if you have unlimited patience).
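That check is tedious by hand but quick numerically. Here is a minimal Python sketch of it, not the Sage session from the text. The helper `inner` and the variable names are introduced here, and the code assumes the inner product conjugates its first argument, which is the convention that reproduces the four values computed above. It works in floating point, so agreement is only up to rounding error.

```python
import math

def inner(u, v):
    """Inner product that conjugates the first argument (an assumption, see the lead-in)."""
    return sum(complex(a).conjugate() * complex(b) for a, b in zip(u, v))

# The four orthogonal vectors of the example, before scaling.
xs = [
    [complex(1, 1), complex(1, 0), complex(1, -1), complex(0, 1)],
    [complex(1, 5), complex(6, 5), complex(-7, -1), complex(1, -6)],
    [complex(-7, 34), complex(-8, -23), complex(-10, 22), complex(30, 13)],
    [complex(-2, -4), complex(6, 1), complex(4, 3), complex(6, -1)],
]

# Scale each vector to norm 1 to obtain the orthonormal basis.
norms = [math.sqrt(inner(x, x).real) for x in xs]
vs = [[entry / n for entry in x] for x, n in zip(xs, norms)]

w = [2, -3, 1, 4]

# Theorem COB: the coefficient of v_i is the inner product of v_i with w.
coords = [inner(v, w) for v in vs]

# Rebuild w from its coordinates and measure the largest discrepancy.
rebuilt = [sum(c * v[k] for c, v in zip(coords, vs)) for k in range(len(w))]
max_error = max(abs(r - t) for r, t in zip(rebuilt, w))
print(max_error < 1e-9)
```

No linear system is solved anywhere in this computation, which is the practical point of Theorem COB.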

A slightly less intimidating example follows, in three dimensions and with just real numbers.

The set

\begin{equation*} \set{\vect{x}_1,\,\vect{x}_2,\,\vect{x}_3} =\set{ \colvector{1\\2\\1},\, \colvector{-1\\0\\1},\, \colvector{2\\1\\1} } \end{equation*}

is a linearly independent set, which the Gram-Schmidt Procedure (Theorem GSP) converts to an orthogonal set, and which can then be converted to the orthonormal set,

\begin{equation*} B= \set{\vect{v}_1,\,\vect{v}_2,\,\vect{v}_3} =\set{ \frac{1}{\sqrt{6}}\colvector{1\\2\\1},\, \frac{1}{\sqrt{2}}\colvector{-1\\0\\1},\, \frac{1}{\sqrt{3}}\colvector{1\\-1\\1} } \end{equation*}

which is therefore an orthonormal basis of \(\complex{3}\text{.}\) With three vectors in \(\complex{3}\text{,}\) all with real number entries, the inner product (Definition IP) reduces to the usual “dot product” (or scalar product) and the orthogonal pairs of vectors can be interpreted as perpendicular pairs of directions. So the vectors in \(B\) serve as replacements for our usual 3-D axes, or the usual 3-D unit vectors \(\vec{i},\vec{j}\) and \(\vec{k}\text{.}\) We would like to decompose arbitrary vectors into “components” in the directions of each of these basis vectors. It is Theorem COB that tells us how to do this.

Suppose that we choose \(\vect{w}=\colvector{2\\-1\\5}\text{.}\) Compute

\begin{align*} \innerproduct{\vect{v}_1}{\vect{w}}=\frac{5}{\sqrt{6}}&& \innerproduct{\vect{v}_2}{\vect{w}}=\frac{3}{\sqrt{2}}&& \innerproduct{\vect{v}_3}{\vect{w}}=\frac{8}{\sqrt{3}} \end{align*}

then Theorem COB guarantees that

\begin{equation*} \colvector{2\\-1\\5}= \frac{5}{\sqrt{6}}\left(\frac{1}{\sqrt{6}}\colvector{1\\2\\1}\right)+ \frac{3}{\sqrt{2}}\left(\frac{1}{\sqrt{2}}\colvector{-1\\0\\1}\right)+ \frac{8}{\sqrt{3}}\left(\frac{1}{\sqrt{3}}\colvector{1\\-1\\1}\right) \end{equation*}

which you should be able to check easily, even if you do not have much patience.
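The Gram-Schmidt step that this example takes for granted can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the text's own code: the function names are introduced here, the entries are real so the inner product is the ordinary dot product, and the vectors are left unnormalized so that all arithmetic stays in exact fractions.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product, which is the inner product for real entries."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthogonalize a linearly independent list of real vectors (no normalization)."""
    ortho = []
    for x in vectors:
        u = [Fraction(a) for a in x]
        for q in ortho:
            # Subtract the component of x in the direction of q.
            coeff = dot(x, q) / dot(q, q)
            u = [a - coeff * b for a, b in zip(u, q)]
        ortho.append(u)
    return ortho

xs = [[1, 2, 1], [-1, 0, 1], [2, 1, 1]]
ortho = gram_schmidt(xs)

# Coefficient of each orthogonal vector u in the expansion of w is (w.u)/(u.u).
w = [2, -1, 5]
weights = [dot(w, u) / dot(u, u) for u in ortho]
rebuilt = [sum(c * u[k] for c, u in zip(weights, ortho)) for k in range(3)]

print([[str(a) for a in u] for u in ortho])
print([str(c) for c in weights])
```

The third vector comes out as \(\frac{2}{3}\) times the vector \(\colvector{1\\-1\\1}\) used in the example, so its weight is \(4\) here rather than \(\frac{8}{3}\text{.}\) Both describe the same component of \(\vect{w}\text{.}\)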

Not only do the columns of a unitary matrix form an orthonormal basis, but there is a deeper connection between orthonormal bases and unitary matrices. Informally, the next theorem says that if we transform each vector of an orthonormal basis by multiplying it by a unitary matrix, then the resulting set will be another orthonormal basis. And more remarkably, any matrix with this property must be unitary! As an equivalence (Proof Technique E) we could take this as our defining property of a unitary matrix, though it might not have the same utility as Definition UM.

(⇒) 

Assume \(A\) is a unitary matrix and establish several facts about \(C\text{.}\) First we check that \(C\) is an orthonormal set (Definition ONS). By Theorem UMPIP, for \(i\neq j\text{,}\)

\begin{align*} \innerproduct{A\vect{x}_i}{A\vect{x}_j}& =\innerproduct{\vect{x}_i}{\vect{x}_j}=0\text{.} \end{align*}

Similarly, Theorem UMPIP also gives, for \(1\leq i\leq n\text{,}\)

\begin{gather*} \norm{A\vect{x}_i}=\norm{\vect{x}_i}=1\text{.} \end{gather*}

As \(C\) is an orthogonal set (Definition OSV), Theorem OSLI yields the linear independence of \(C\text{.}\) Having established that the column vectors in \(C\) form a linearly independent set, a matrix whose columns are the vectors of \(C\) is nonsingular (Theorem NMLIC), and hence these vectors form a basis of \(\complex{n}\) by Theorem CNMB.

(⇐) 

Now assume that \(C\) is an orthonormal set. Let \(\vect{y}\) be an arbitrary vector from \(\complex{n}\text{.}\) Since \(B\) spans \(\complex{n}\text{,}\) there are scalars, \(\scalarlist{a}{n}\text{,}\) such that

\begin{align*} \vect{y}&=a_1\vect{x}_1+a_2\vect{x}_2+a_3\vect{x}_3+\cdots+a_n\vect{x}_n\text{.} \end{align*}

Now

\begin{align*} \adjoint{A}A\vect{y} &=\sum_{i=1}^{n}\innerproduct{\vect{x}_i}{\adjoint{A}A\vect{y}}\vect{x}_i&& \knowl{./knowl/theorem-COB.html}{\text{Theorem COB}}\\ &=\sum_{i=1}^{n}\innerproduct{\vect{x}_i}{\adjoint{A}A\sum_{j=1}^{n}a_j\vect{x}_j}\vect{x}_i&& \knowl{./knowl/definition-SSVS.html}{\text{Definition SSVS}}\\ &=\sum_{i=1}^{n}\innerproduct{\vect{x}_i}{\sum_{j=1}^{n}\adjoint{A}Aa_j\vect{x}_j}\vect{x}_i&& \knowl{./knowl/theorem-MMDAA.html}{\text{Theorem MMDAA}}\\ &=\sum_{i=1}^{n}\innerproduct{\vect{x}_i}{\sum_{j=1}^{n}a_j\adjoint{A}A\vect{x}_j}\vect{x}_i&& \knowl{./knowl/theorem-MMSMM.html}{\text{Theorem MMSMM}}\\ &=\sum_{i=1}^{n}\sum_{j=1}^{n}\innerproduct{\vect{x}_i}{a_j\adjoint{A}A\vect{x}_j}\vect{x}_i&& \knowl{./knowl/theorem-IPVA.html}{\text{Theorem IPVA}}\\ &=\sum_{i=1}^{n}\sum_{j=1}^{n}a_j\innerproduct{\vect{x}_i}{\adjoint{A}A\vect{x}_j}\vect{x}_i&& \knowl{./knowl/theorem-IPSM.html}{\text{Theorem IPSM}}\\ &=\sum_{i=1}^{n}\sum_{j=1}^{n}a_j\innerproduct{A\vect{x}_i}{A\vect{x}_j}\vect{x}_i&& \knowl{./knowl/theorem-AIP.html}{\text{Theorem AIP}}\\ &= \sum_{i=1}^{n}\sum_{\substack{j=1\\j\neq i}}^{n}a_j\innerproduct{A\vect{x}_i}{A\vect{x}_j}\vect{x}_i + \sum_{\ell=1}^{n}a_\ell\innerproduct{A\vect{x}_\ell}{A\vect{x}_\ell}\vect{x}_\ell&& \knowl{./knowl/property-C.html}{\text{Property C}}\\ &= \sum_{i=1}^{n}\sum_{\substack{j=1\\j\neq i}}^{n}a_j(0)\vect{x}_i + \sum_{\ell=1}^{n}a_\ell(1)\vect{x}_\ell&& \knowl{./knowl/definition-ONS.html}{\text{Definition ONS}}\\ &= \sum_{i=1}^{n}\sum_{\substack{j=1\\j\neq i}}^{n}\zerovector + \sum_{\ell=1}^{n}a_\ell\vect{x}_\ell&& \knowl{./knowl/theorem-ZSSM.html}{\text{Theorem ZSSM}}\\ &=\sum_{\ell=1}^{n}a_\ell\vect{x}_\ell&& \knowl{./knowl/property-Z.html}{\text{Property Z}}\\ &=\vect{y}\\ &=I_n\vect{y}&& \knowl{./knowl/theorem-MMIM.html}{\text{Theorem MMIM}}\text{.} \end{align*}

Since the choice of \(\vect{y}\) was arbitrary, Theorem EMMVP tells us that \(\adjoint{A}A=I_n\text{,}\) so \(A\) is unitary (Definition UM).
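A small numerical instance may make both directions of this equivalence concrete. The sketch below is plain Python with exact fractions. The matrices `A` and `N` and the basis `B` are made-up examples chosen here, with real entries so that the inner product reduces to the dot product, and the helper names are introduced for illustration only.

```python
from fractions import Fraction

def mat_vec(M, x):
    """Matrix-vector product, with the matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def dot(u, v):
    """Dot product, which is the inner product for real entries."""
    return sum(a * b for a, b in zip(u, v))

def is_orthonormal(vectors):
    """Every pair of distinct vectors has inner product 0 and every vector has norm 1."""
    return all(
        dot(u, v) == (1 if i == j else 0)
        for i, u in enumerate(vectors)
        for j, v in enumerate(vectors)
    )

# A real matrix with orthonormal columns, hence unitary.
A = [[Fraction(3, 5), Fraction(-4, 5)],
     [Fraction(4, 5), Fraction(3, 5)]]

# A nonsingular matrix that is not unitary.
N = [[1, 1],
     [0, 1]]

# An orthonormal basis of C^2 with real entries.
B = [[Fraction(4, 5), Fraction(3, 5)],
     [Fraction(-3, 5), Fraction(4, 5)]]

C_from_A = [mat_vec(A, x) for x in B]
C_from_N = [mat_vec(N, x) for x in B]

print(is_orthonormal(C_from_A))
print(is_orthonormal(C_from_N))
```

Multiplying by the unitary matrix keeps the set orthonormal, while the merely nonsingular matrix still produces a basis but not an orthonormal one.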

For vector spaces of column vectors, Sage can quickly determine the coordinates of a vector relative to a basis, as guaranteed by Theorem VRRB. We illustrate some new Sage commands with a simple example and then apply them to orthonormal bases. The vectors v1 and v2 are linearly independent and thus span a subspace with a basis of size 2. We first create this subspace and let Sage determine the basis, then we illustrate a new vector space method, .subspace_with_basis(), that allows us to specify the basis. (This method is very similar to .span_of_basis(), except it preserves a subspace relationship with the original vector space.) Notice how the description of the vector space makes it clear that W has a user-specified basis. Notice too that the actual subspace created is the same in both cases.

Now we manufacture a third vector in the subspace, and request a coordinatization in each vector space, which has the effect of using a different basis in each case. The vector space method .coordinate_vector(v) computes a vector whose entries express v as a linear combination of basis vectors. Verify for yourself in each case below that the components of the vector returned really give a linear combination of the basis vectors that equals v3.

Now we can construct a more complicated example using an orthonormal basis, specifically the one from Example CROB4, but we will compute over QQbar, the field of algebraic numbers. We form the four vectors of the orthonormal basis, install them as the basis of a vector space and then ask for the coordinates. Sage treats the square roots in the scalars as “symbolic” expressions, so we need to explicitly coerce them into QQbar before computing the scalar multiples.

Is this right? Our exact coordinates in the text are displayed differently, but we can check that they really are the same numbers:

With an orthonormal basis, we can illustrate Theorem CUMOS by making the four vectors the columns of a \(4\times 4\) matrix and verifying the result is a unitary matrix.
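The same kind of check can be sketched in plain Python on a smaller case. This is an illustration only: the two vectors `u1` and `u2` are a hypothetical orthonormal basis of \(\complex{2}\) (not the basis of Example CROB4), the helper names are ours, and the arithmetic is floating point, so the comparison uses a tolerance rather than exact equality.

```python
import math

def adjoint_times(A):
    """Return the product (adjoint of A) * A for a square matrix given as a list of rows."""
    n = len(A)
    return [[sum(A[k][i].conjugate() * A[k][j] for k in range(n))
             for j in range(n)]
            for i in range(n)]

def is_unitary(A, tol=1e-12):
    """Check numerically whether (adjoint of A) * A is the identity matrix."""
    P = adjoint_times(A)
    n = len(A)
    return all(abs(P[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

# Hypothetical orthonormal basis of C^2, used only for illustration.
s = 1 / math.sqrt(2)
u1 = [s, s * 1j]
u2 = [s, -s * 1j]

# Make the basis vectors the columns of a matrix.
A = [[u1[r], u2[r]] for r in range(2)]
print(is_unitary(A))
```

A matrix whose columns are not orthonormal, such as one with rows `[1, 1]` and `[0, 1]`, fails the same check.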

We will see coordinate vectors again, in a more formal setting, in Sage VR.

Reading Questions B Reading Questions

1.

The matrix below is nonsingular. What can you now say about its columns?

\begin{equation*} A=\begin{bmatrix} -3 & 0 & 1\\ 1 & 2 & 1\\ 5 & 1 & 6 \end{bmatrix} \end{equation*}
2.

Write the vector \(\vect{w}=\colvector{6\\6\\15}\) as a linear combination of the columns of the matrix \(A\) above. How many ways are there to answer this question?

3.

Why is an orthonormal basis desirable?

Exercises B Exercises

C10.

Find a basis for \(\spn{S}\text{,}\) where

\begin{align*} S &= \set{ \colvector{1\\3\\2\\1}, \colvector{1\\2\\1\\1}, \colvector{1\\1\\0\\1}, \colvector{1\\2\\2\\1}, \colvector{3\\4\\1\\3} }\text{.} \end{align*}
Solution

Theorem BS says that if we take these 5 vectors, put them into a matrix as columns, and row-reduce to discover the pivot columns, then the vectors of \(S\) with the same column indices will be linearly independent and span \(\spn{S}\text{,}\) and thus will form a basis of \(\spn{S}\text{.}\)

\begin{align*} \begin{bmatrix} 1 & 1 & 1 & 1 & 3\\ 3 & 2 & 1 & 2 & 4\\ 2 & 1 & 0 & 2 & 1\\ 1 & 1 & 1 & 1 & 3 \end{bmatrix} &\rref \begin{bmatrix} \leading{1} & 0 & -1 & 0 & -2\\ 0 & \leading{1} & 2 & 0 & 5\\ 0 & 0 & 0 & \leading{1} & 0\\ 0 & 0 & 0 & 0 &0 \end{bmatrix} \end{align*}

Since columns 1, 2, and 4 are pivot columns, the vectors that span \(\spn{S}\) are the first, second and fourth of the set, so a basis of \(\spn{S}\) is

\begin{align*} B &= \set{ \colvector{1\\3\\2\\1}, \colvector{1\\2\\1\\1}, \colvector{1\\2\\2\\1} }\text{.} \end{align*}
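
The pivot-column computation in this solution can be reproduced with exact rational arithmetic. The following is a minimal sketch in plain Python, assuming nothing beyond the five vectors of \(S\) given above; the function name `pivot_columns` is our own, and column indices are 0-based, so the pivot columns 1, 2 and 4 appear as 0, 1 and 3.

```python
from fractions import Fraction

def pivot_columns(rows):
    """Row-reduce a matrix (list of rows) exactly; return 0-based pivot column indices."""
    M = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column
        M[r], M[p] = M[p], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == len(M):
            break
    return pivots

# The five vectors of S, in the order given in the exercise.
S = [[1, 3, 2, 1], [1, 2, 1, 1], [1, 1, 0, 1], [1, 2, 2, 1], [3, 4, 1, 3]]

# Place the vectors of S in the columns of a matrix.
rows = [list(r) for r in zip(*S)]
pivots = pivot_columns(rows)
basis = [S[j] for j in pivots]
print(pivots)   # [0, 1, 3]
print(basis)    # [[1, 3, 2, 1], [1, 2, 1, 1], [1, 2, 2, 1]]
```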
C11.

Find a basis for the subspace \(W\) of \(\complex{4}\text{,}\)

\begin{align*} W &= \setparts{\colvector{a + b - 2c\\a + b - 2c + d\\ -2a + 2b + 4c - d\\ b + d}} {a, b, c, d \in\complexes}\text{.} \end{align*}
Solution

We can rewrite an arbitrary vector of \(W\) as

\begin{align*} \colvector{a + b - 2c\\ a + b - 2c + d\\ -2a + 2b + 4c - d\\ b + d} & = \colvector{a\\a\\-2a\\0} + \colvector{b\\b\\2b\\b} + \colvector{-2c\\-2c\\4c\\0} + \colvector{0\\d\\-d\\d}\\ &= a\colvector{1\\1\\-2\\0} + b\colvector{1\\1\\2\\1} + c\colvector{-2\\-2\\4\\0} + d\colvector{0\\1\\-1\\1} \end{align*}

Thus, we can write \(W\) as

\begin{align*} W &= \spn{\set{ \colvector{1\\1\\-2\\0}, \colvector{1\\1\\2\\1}, \colvector{-2\\-2\\4\\0}, \colvector{0\\1\\-1\\1} }}\text{.} \end{align*}

These four vectors span \(W\text{,}\) but we also need to determine if they are linearly independent (it turns out they are not). With an application of Theorem BS we can arrive at a basis employing three of these vectors,

\begin{align*} \begin{bmatrix} 1 & 1 & -2 & 0\\ 1 & 1 & -2 & 1\\ -2 & 2 & 4 & -1\\ 0 & 1 & 0 & 1 \end{bmatrix} &\rref \begin{bmatrix} \leading{1} & 0 & -2 & 0\\ 0 & \leading{1} & 0 & 0\\ 0 & 0 & 0 & \leading{1}\\ 0 & 0 & 0 &0 \end{bmatrix} \end{align*}

Thus, we have the following basis of \(W\text{,}\)

\begin{align*} B &= \set{ \colvector{1\\1\\-2\\0}, \colvector{1\\1\\2\\1}, \colvector{0\\1\\-1\\1} }\text{.} \end{align*}
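
As a check on the third column of the reduced matrix, the third spanning vector is a scalar multiple of the first, which is why it can be dropped without shrinking the span. Writing out that relation, and then regrouping an arbitrary element of \(W\) with the same scalars \(a\text{,}\) \(b\text{,}\) \(c\text{,}\) \(d\) as above, gives

\begin{align*} \colvector{-2\\-2\\4\\0} &= (-2)\colvector{1\\1\\-2\\0}\\ a\colvector{1\\1\\-2\\0} + b\colvector{1\\1\\2\\1} + c\colvector{-2\\-2\\4\\0} + d\colvector{0\\1\\-1\\1} &= (a-2c)\colvector{1\\1\\-2\\0} + b\colvector{1\\1\\2\\1} + d\colvector{0\\1\\-1\\1}\text{.} \end{align*}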
C12.

Find a basis for the vector space \(T\) of lower triangular \(3 \times 3\) matrices; that is, matrices of the form

\begin{align*} \begin{bmatrix} * & 0 & 0\\ * & * & 0\\ * & * & *\end{bmatrix} \end{align*}

where an asterisk represents any complex number.

Solution

Let \(A\) be an arbitrary element of the specified vector space \(T\text{.}\) Then there exist \(a\text{,}\) \(b\text{,}\) \(c\text{,}\) \(d\text{,}\) \(e\) and \(f\) so that

\begin{align*} A = \begin{bmatrix} a & 0 & 0\\ b & c & 0\\ d & e & f \end{bmatrix}\text{.} \end{align*}

Then

\begin{align*} A &= a\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0\\ 0 & 0 & 0 \end{bmatrix} + b\begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0\\ 0 & 0 & 0 \end{bmatrix} + c\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0\\ 0 & 0 & 0 \end{bmatrix} + d\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0\\ 1 & 0 & 0 \end{bmatrix} + e\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0\\ 0 & 1 & 0 \end{bmatrix} + f\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0\\ 0 & 0 & 1 \end{bmatrix} \end{align*}

Consider the set

\begin{align*} B &= \set{ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0\\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0\\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0\\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0\\ 1 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0\\ 0 & 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0\\ 0 & 0 & 1 \end{bmatrix} } \end{align*}

The six vectors in \(B\) span the vector space \(T\text{,}\) and we can check rather simply that they are also linearly independent. Thus, \(B\) is a basis of \(T\text{.}\)

C13.

Find a basis for the subspace \(Q\) of \(P_2\text{,}\) \(Q = \setparts{p(x) = a + bx + cx^2}{p(0) = 0}\text{.}\)

Solution

If \(p(0) = 0\text{,}\) then \(a + b(0) + c(0^2) = 0\text{,}\) so \(a = 0\text{.}\) Thus, we can write \(Q = \setparts{p(x) = bx + cx^2}{b, c\in\complexes}\text{.}\) A linearly independent set that spans \(Q\) is \(B=\set{x, x^2}\text{,}\) and this set forms a basis of \(Q\text{.}\)

C14.

Find a basis for the subspace \(R\) of \(P_2\text{,}\) \(R = \setparts{p(x) = a + bx + cx^2}{p'(0) = 0}\text{,}\) where \(p'\) denotes the derivative.

Solution

The derivative of \(p(x) = a + bx + cx^2\) is \(p^\prime(x) = b + 2cx\text{.}\) Thus, if \(p \in R\text{,}\) then \(p^\prime(0) = b + 2c(0) = 0\text{,}\) so we must have \(b = 0\text{.}\) We see that we can rewrite \(R\) as \(R = \setparts{p(x) = a + cx^2}{a, c\in\complexes}\text{.}\) A linearly independent set that spans \(R\) is \(B = \set{1,x^2}\text{,}\) and \(B\) is a basis of \(R\text{.}\)

C40.

From Example RSB, form an arbitrary (and nontrivial) linear combination of the four vectors in the original spanning set for \(W\text{.}\) So the result of this computation is of course an element of \(W\text{.}\) As such, this vector should be a linear combination of the basis vectors in \(B\text{.}\) Find the (unique) scalars that provide this linear combination. Repeat with another linear combination of the original four vectors.

Solution

An arbitrary linear combination is

\begin{equation*} \vect{y}= 3\colvector{2\\-3\\1}+ (-2)\colvector{1\\4\\1}+ 1\colvector{7\\-5\\4}+ (-2)\colvector{-7\\-6\\-5} = \colvector{25\\-10\\15}\text{.} \end{equation*}

(You probably used a different collection of scalars.) We want to write \(\vect{y}\) as a linear combination of

\begin{equation*} B=\set{\colvector{1\\0\\\frac{7}{11}},\,\colvector{0\\1\\\frac{1}{11}}} \end{equation*}

We could set this up as a vector equation with variables as scalars in a linear combination of the vectors in \(B\text{,}\) but since the first two slots of the vectors in \(B\) have such a nice pattern of zeros and ones, we can determine the necessary scalars easily and then double-check our answer with a computation in the third slot,

\begin{equation*} 25\colvector{1\\0\\\frac{7}{11}}+(-10)\colvector{0\\1\\\frac{1}{11}} = \colvector{25\\-10\\(25)\frac{7}{11}+(-10)\frac{1}{11}} = \colvector{25\\-10\\15}=\vect{y}\text{.} \end{equation*}

Notice how the uniqueness of these scalars arises. They are forced to be \(25\) and \(-10\text{.}\)
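
The arithmetic in this solution can be double-checked with exact fractions. The sketch below uses the four spanning vectors and the basis \(B\) exactly as they appear in the solution above; the helper `lincombo` and the second choice of scalars (all equal to 1) are our own, chosen only to carry out the "repeat" step of the exercise.

```python
from fractions import Fraction

def lincombo(scalars, vectors):
    """Return the linear combination sum(scalars[i] * vectors[i]) with exact arithmetic."""
    size = len(vectors[0])
    return [sum(Fraction(s) * v[i] for s, v in zip(scalars, vectors))
            for i in range(size)]

# Original spanning set and basis, as they appear in the solution.
spanning = [[2, -3, 1], [1, 4, 1], [7, -5, 4], [-7, -6, -5]]
B = [[1, 0, Fraction(7, 11)], [0, 1, Fraction(1, 11)]]

# First linear combination: the scalars used in the solution.
y = lincombo([3, -2, 1, -2], spanning)
# The zeros and ones in B force the scalars to be the first two entries of y.
coords = y[:2]
print([str(t) for t in y], [str(t) for t in coords])
assert lincombo(coords, B) == y

# A second linear combination, with scalars of our own choosing.
z = lincombo([1, 1, 1, 1], spanning)
assert lincombo(z[:2], B) == z
```

In both cases the third entry comes out right automatically, which is the double-check described above.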

C80.

Prove that \(\set{(1,\,2),\,(2,\,3)}\) is a basis for the crazy vector space \(C\) (Example CVS).

M20.

In Example BM provide the verifications (linear independence and spanning) to show that \(B\) is a basis of \(M_{mn}\text{.}\)

Solution

We need to establish the linear independence and spanning properties of the set

\begin{equation*} B=\setparts{B_{k\ell}}{1\leq k\leq m,\ 1\leq\ell\leq n} \end{equation*}

relative to the vector space \(M_{mn}\text{.}\)

This proof is more transparent if you write out individual matrices in the basis with lots of zeros and dots and a lone one. But we do not have room for that here, so we will use summation notation. Think carefully about each step, especially when the double summations seem to “disappear.” Begin with a relation of linear dependence, using double subscripts on the scalars to align with the basis elements.

\begin{equation*} \zeromatrix=\sum_{k=1}^{m}\sum_{\ell=1}^{n}\alpha_{k\ell}B_{k\ell} \end{equation*}

Now consider the entry in row \(i\) and column \(j\) for these equal matrices,

\begin{align*} 0 &=\matrixentry{\zeromatrix}{ij}&& \knowl{./knowl/definition-ZM.html}{\text{Definition ZM}}\\ &=\matrixentry{\sum_{k=1}^{m}\sum_{\ell=1}^{n}\alpha_{k\ell}B_{k\ell}}{ij}&& \knowl{./knowl/definition-ME.html}{\text{Definition ME}}\\ &=\sum_{k=1}^{m}\sum_{\ell=1}^{n}\matrixentry{\alpha_{k\ell}B_{k\ell}}{ij}&& \knowl{./knowl/definition-MA.html}{\text{Definition MA}}\\ &=\sum_{k=1}^{m}\sum_{\ell=1}^{n}\alpha_{k\ell}\matrixentry{B_{k\ell}}{ij}&& \knowl{./knowl/definition-MSM.html}{\text{Definition MSM}}\\ &=\alpha_{ij}\matrixentry{B_{ij}}{ij}&& \matrixentry{B_{k\ell}}{ij}=0\text{ when }(k,\ell)\neq(i,j)\\ &=\alpha_{ij}(1)&& \matrixentry{B_{ij}}{ij}=1\\ &=\alpha_{ij}\text{.} \end{align*}

Since \(i\) and \(j\) were arbitrary, we find that each scalar is zero and so \(B\) is linearly independent (Definition LI).

To establish the spanning property of \(B\) we need only show that an arbitrary matrix \(A\) can be written as a linear combination of the elements of \(B\text{.}\) So suppose that \(A\) is an arbitrary \(m\times n\) matrix and consider the matrix \(C\) defined as a linear combination of the elements of \(B\) by

\begin{equation*} C=\sum_{k=1}^{m}\sum_{\ell=1}^{n}\matrixentry{A}{k\ell}B_{k\ell} \end{equation*}

Then,

\begin{align*} \matrixentry{C}{ij} &=\matrixentry{\sum_{k=1}^{m}\sum_{\ell=1}^{n}\matrixentry{A}{k\ell}B_{k\ell}}{ij}&& \knowl{./knowl/definition-ME.html}{\text{Definition ME}}\\ &=\sum_{k=1}^{m}\sum_{\ell=1}^{n}\matrixentry{\matrixentry{A}{k\ell}B_{k\ell}}{ij}&& \knowl{./knowl/definition-MA.html}{\text{Definition MA}}\\ &=\sum_{k=1}^{m}\sum_{\ell=1}^{n}\matrixentry{A}{k\ell}\matrixentry{B_{k\ell}}{ij}&& \knowl{./knowl/definition-MSM.html}{\text{Definition MSM}}\\ &=\matrixentry{A}{ij}\matrixentry{B_{ij}}{ij}&& \matrixentry{B_{k\ell}}{ij}=0\text{ when }(k,\ell)\neq(i,j)\\ &=\matrixentry{A}{ij}(1)&& \matrixentry{B_{ij}}{ij}=1\\ &=\matrixentry{A}{ij} \end{align*}

So by Definition ME, \(A=C\text{,}\) and therefore \(A\in\spn{B}\text{.}\) By Definition B, the set \(B\) is a basis of the vector space \(M_{mn}\text{.}\)
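
For readers who would like to see the double sums at work on a concrete case, here is a small sketch in plain Python. It assumes, as in the computation above, that \(B_{k\ell}\) has a one in row \(k\text{,}\) column \(\ell\) and zeros elsewhere. The sample matrix `A` and the function names are hypothetical, and indices are 0-based rather than 1-based.

```python
def unit_matrix(m, n, k, l):
    """The m x n matrix with a single 1 in row k, column l (0-based) and zeros elsewhere."""
    return [[1 if (i, j) == (k, l) else 0 for j in range(n)] for i in range(m)]

def combine(A):
    """Form C = sum over k, l of [A]_{kl} * B_{kl}, accumulating entry by entry."""
    m, n = len(A), len(A[0])
    C = [[0] * n for _ in range(m)]
    for k in range(m):
        for l in range(n):
            Bkl = unit_matrix(m, n, k, l)
            for i in range(m):
                for j in range(n):
                    # Only (k, l) == (i, j) contributes a nonzero term.
                    C[i][j] += A[k][l] * Bkl[i][j]
    return C

# A hypothetical 2 x 3 matrix, used only to exercise the construction.
A = [[2, -1, 0], [5, 3, 7]]
C = combine(A)
print(C == A)
```

Every term of the double sum with \((k,\ell)\neq(i,j)\) contributes zero to entry \((i,j)\text{,}\) which is the "disappearing" summation in the proof.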

T50.

Theorem UMCOB says that unitary matrices are characterized as those matrices that “carry” orthonormal bases to orthonormal bases. This problem asks you to prove a similar result: nonsingular matrices are characterized as those matrices that “carry” bases to bases.

More precisely, suppose that \(A\) is a square matrix of size \(n\) and \(B=\set{\vectorlist{x}{n}}\) is a basis of \(\complex{n}\text{.}\) Prove that \(A\) is nonsingular if and only if \(C=\set{A\vect{x}_1,\,A\vect{x}_2,\,A\vect{x}_3,\,\dots,\,A\vect{x}_n}\) is a basis of \(\complex{n}\text{.}\) (See also Exercise PD.T33, Exercise MR.T20.)

Solution

Our first proof relies mostly on definitions of linear independence and spanning, which is a good exercise. The second proof is shorter and turns on a technical result from our work with matrix inverses, Theorem NPNF.

(⇒) 

Assume that \(A\) is nonsingular and prove that \(C\) is a basis of \(\complex{n}\text{.}\) First show that \(C\) is linearly independent. Work on a relation of linear dependence on \(C\text{,}\)

\begin{align*} \zerovector &= a_1A\vect{x}_1+ a_2A\vect{x}_2+ a_3A\vect{x}_3+ \cdots+ a_nA\vect{x}_n&& \knowl{./knowl/definition-RLD.html}{\text{Definition RLD}}\\ &= Aa_1\vect{x}_1+ Aa_2\vect{x}_2+ Aa_3\vect{x}_3+ \cdots+ Aa_n\vect{x}_n&& \knowl{./knowl/theorem-MMSMM.html}{\text{Theorem MMSMM}}\\ &= A\left( a_1\vect{x}_1+ a_2\vect{x}_2+ a_3\vect{x}_3+ \cdots+ a_n\vect{x}_n \right)&& \knowl{./knowl/theorem-MMDAA.html}{\text{Theorem MMDAA}} \end{align*}

Since \(A\) is nonsingular, Definition NM and Theorem SLEMM allow us to conclude that

\begin{align*} a_1\vect{x}_1+ a_2\vect{x}_2+ \cdots+ a_n\vect{x}_n &=\zerovector \end{align*}

But this is a relation of linear dependence on the linearly independent set \(B\text{,}\) so the scalars are trivial, \(a_1=a_2=a_3=\cdots=a_n=0\text{.}\) By Definition LI, the set \(C\) is linearly independent.

Now prove that \(C\) spans \(\complex{n}\text{.}\) Given an arbitrary vector \(\vect{y}\in\complex{n}\text{,}\) can it be expressed as a linear combination of the vectors in \(C\text{?}\) Since \(A\) is a nonsingular matrix we can define the vector \(\vect{w}\) to be the unique solution of the system \(\linearsystem{A}{\vect{y}}\) (Theorem NMUS). Since \(\vect{w}\in\complex{n}\) we can write \(\vect{w}\) as a linear combination of the vectors in the basis \(B\text{.}\) So there are scalars, \(\scalarlist{b}{n}\) such that

\begin{align*} \vect{w}&=\lincombo{b}{x}{n}\text{.} \end{align*}

Then,

\begin{align*} \vect{y} &=A\vect{w}&& \knowl{./knowl/theorem-SLEMM.html}{\text{Theorem SLEMM}}\\ &=A\left(\lincombo{b}{x}{n}\right)&& \knowl{./knowl/definition-SSVS.html}{\text{Definition SSVS}}\\ &= Ab_1\vect{x}_1+ Ab_2\vect{x}_2+ Ab_3\vect{x}_3+ \cdots+ Ab_n\vect{x}_n&& \knowl{./knowl/theorem-MMDAA.html}{\text{Theorem MMDAA}}\\ &= b_1A\vect{x}_1+ b_2A\vect{x}_2+ b_3A\vect{x}_3+ \cdots+ b_nA\vect{x}_n&& \knowl{./knowl/theorem-MMSMM.html}{\text{Theorem MMSMM}} \end{align*}

So we can write an arbitrary vector of \(\complex{n}\) as a linear combination of the elements of \(C\text{.}\) In other words, \(C\) spans \(\complex{n}\) (Definition SSVS). By Definition B, the set \(C\) is a basis for \(\complex{n}\text{.}\)

(⇐) 

Assume that \(C\) is a basis and prove that \(A\) is nonsingular. Let \(\vect{x}\) be a solution to the homogeneous system \(\homosystem{A}\text{.}\) Since \(B\) is a basis of \(\complex{n}\) there are scalars, \(\scalarlist{a}{n}\text{,}\) such that

\begin{align*} \vect{x}&=\lincombo{a}{x}{n} \end{align*}

Then

\begin{align*} \zerovector &=A\vect{x}&& \knowl{./knowl/theorem-SLEMM.html}{\text{Theorem SLEMM}}\\ &=A\left(\lincombo{a}{x}{n}\right)&& \knowl{./knowl/definition-SSVS.html}{\text{Definition SSVS}}\\ &= Aa_1\vect{x}_1+ Aa_2\vect{x}_2+ Aa_3\vect{x}_3+ \cdots+ Aa_n\vect{x}_n&& \knowl{./knowl/theorem-MMDAA.html}{\text{Theorem MMDAA}}\\ &= a_1A\vect{x}_1+ a_2A\vect{x}_2+ a_3A\vect{x}_3+ \cdots+ a_nA\vect{x}_n&& \knowl{./knowl/theorem-MMSMM.html}{\text{Theorem MMSMM}} \end{align*}

This is a relation of linear dependence on the linearly independent set \(C\text{,}\) so the scalars must all be zero, \(a_1=a_2=a_3=\cdots=a_n=0\text{.}\) Thus,

\begin{align*} \vect{x}&=\lincombo{a}{x}{n}=0\vect{x}_1+0\vect{x}_2+0\vect{x}_3+\cdots+0\vect{x}_n=\zerovector\text{.} \end{align*}

By Definition NM we see that \(A\) is nonsingular.

Now for a second proof. Take the vectors for \(B\) and use them as the columns of a matrix, \(G=\matrixcolumns{x}{n}\text{.}\) By Theorem CNMB, because we have the hypothesis that \(B\) is a basis of \(\complex{n}\text{,}\) \(G\) is a nonsingular matrix. Notice that the columns of \(AG\) are exactly the vectors in the set \(C\text{,}\) by Definition MM.

\begin{align*} A\text{ nonsingular} &\iff AG\text{ nonsingular}&& \knowl{./knowl/theorem-NPNF.html}{\text{Theorem NPNF}}\\ &\iff C\text{ basis for }\complex{n}&& \knowl{./knowl/theorem-CNMB.html}{\text{Theorem CNMB}} \end{align*}

That was easy!

T51.

Use the result of Exercise B.T50 to build a very concise proof of Theorem CNMB. (Hint: make a judicious choice for the basis \(B\text{.}\))

Solution

Choose \(B\) to be the set of standard unit vectors, a particularly nice basis of \(\complex{n}\) (Theorem SUVB). For a vector \(\vect{e}_j\) (Definition SUV) from this basis, what is \(A\vect{e}_j\text{?}\)