
Section 1.6 Projectors

When we multiply a vector by a matrix, we form a linear combination of the columns of the matrix. Said differently, the result of the product is in the column space of the matrix. So we can think of a matrix as moving a vector into a subspace, and we call that subspace the column space of the matrix \(\csp{A}\text{.}\) In the case of a linear transformation, we call this subspace the range, \(\rng{T}\text{,}\) or we might call it the image. A projector is a square matrix which moves vectors into a subspace (like any matrix can), but fixes vectors already in the subspace, so applying the projector a second time has no further effect. This property earns a projector the moniker idempotent. We will see that projectors have a variety of interesting properties.

Subsection 1.6.1 Oblique Projectors

Definition 1.6.1.

A square matrix \(P\) is a projector if \(P^2=P\text{.}\)
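To make the definition concrete, here is a minimal numerical sketch in Python with numpy (the library choice and the sample matrix are ours, not the text's). The matrix below is idempotent, so it is a projector.

import numpy as np

# A hypothetical 2x2 projector: it fixes span{(1,0)} and sends (1,-1) to zero.
P = np.array([[1.0, 1.0],
              [0.0, 0.0]])

# Definition 1.6.1: a projector satisfies P^2 = P.
assert np.allclose(P @ P, P)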

Lemma 1.6.2.

A projector fixes the vectors in its column space: if \(P\) is a projector and \(\vect{x}\in\csp{P}\text{,}\) then \(P\vect{x}=\vect{x}\text{.}\)

Since \(\vect{x}\in\csp{P}\text{,}\) there is a vector \(\vect{w}\) such that \(P\vect{w}=\vect{x}\text{.}\) Then

\begin{equation*} P\vect{x}-\vect{x}=P\left(P\vect{w}\right)-P\vect{w}=P^2\vect{w}-P\vect{w}=P\vect{w}-P\vect{w}=\zerovector. \end{equation*}

Lemma 1.6.3.

For a general vector, the difference between the vector and its image under a projector need not be the zero vector, but it will always be a vector in the null space of the projector. More precisely, if \(P\) is a projector of size \(n\text{,}\) then \(\nsp{P}=\setparts{P\vect{x}-\vect{x}}{\vect{x}\in\complex{n}}\text{.}\)

First, for any \(\vect{x}\in\complex{n}\text{,}\)

\begin{equation*} P(P\vect{x}-\vect{x})=P^2\vect{x}-P\vect{x}=P\vect{x}-P\vect{x}=\zerovector. \end{equation*}

This shows that \(\setparts{P\vect{x}-\vect{x}}{\vect{x}\in\complex{n}}\subseteq\nsp{P}\text{.}\) To establish the second half of the claimed set equality, suppose \(\vect{z}\in\nsp{P}\text{.}\) Then

\begin{equation*} \vect{z}=\zerovector - \left(-\vect{z}\right)=P\left(-\vect{z}\right)- \left(-\vect{z}\right) \end{equation*}

which establishes that \(\vect{z}\in\setparts{P\vect{x}-\vect{x}}{\vect{x}\in\complex{n}}\text{.}\)

When the null space of a projector has dimension one, it is easy to understand the choice of the term “projector”. Imagine the setting in three dimensions where the column space of the projector is a subspace of dimension two, which is physically a plane through the origin. Imagine some vector as an arrow from the origin, or as just the point that would be at the tip of the arrow. A light shines on the vector and casts a shadow onto the plane (either another arrow, or just a point). This shadow is the projection, the image of the projector. The shadow is itself unchanged by the projector, since shining the light on the vector that is the shadow will not move it. What direction does the light come from? What is the vector that describes the change from the vector to its shadow (projection)? For a vector \(\vect{x}\text{,}\) this direction is \(P\vect{x}-\vect{x}\text{,}\) an element of the null space of \(P\text{.}\) So if \(\nsp{P}\) has dimension one, then every vector is moved in the same direction, a multiple of a lone basis vector for \(\nsp{P}\text{.}\) This matches our assumptions about physical light from a distant source, with rays all moving parallel to each other. Here is a simple example of just this scenario.

Example 1.6.4.

Verify the following facts about the matrix \(P\) to understand that it is a projector and to understand its geometry.

\begin{equation*} P = \frac{1}{13} \begin{bmatrix} 11 & -3 & -5 \\ -4 & 7 & -10 \\ -2 & -3 & 8 \end{bmatrix} \end{equation*}
  1. \(P^2=P\)

  2. \(\csp{P}=\spn{\set{\colvector{1\\0\\\frac{-2}{5}},\,\colvector{0\\1\\\frac{-3}{5}}}}\)

  3. \(\nsp{P}=\spn{\set{\colvector{1\\2\\1}}}\)

So \(P\) sends every vector onto a two-dimensional subspace, with an equation we might write as \(2x+3y+5z=0\) in Cartesian coordinates, or which we might describe as the plane through the origin with normal vector \(\vect{n}=2\vec{i}+3\vec{j}+5\vec{k}\text{.}\) Vectors, or points, are always moved in the direction of the vector \(\vect{d}=\vec{i}+2\vec{j}+\vec{k}\text{;}\) this is the direction the light is shining. Checkpoint 1.6.5 asks you to experiment further.
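The facts above can be checked numerically. Here is a sketch in Python with numpy (an assumption of ours; any matrix tool would do, and the test vector is an arbitrary choice).

import numpy as np

P = np.array([[11, -3,  -5],
              [-4,  7, -10],
              [-2, -3,   8]]) / 13

assert np.allclose(P @ P, P)              # fact 1: P is idempotent

n = np.array([2, 3, 5])                   # normal vector of the plane 2x+3y+5z=0
assert np.allclose(n @ P, 0)              # every column of P lies in the plane

d = np.array([1, 2, 1])                   # basis vector of the null space
assert np.allclose(P @ d, 0)              # fact 3

x = np.array([3.0, 1.0, 4.0])             # an arbitrary vector
move = P @ x - x                          # the direction the vector is moved
assert np.allclose(np.cross(move, d), 0)  # the displacement is parallel to d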

Checkpoint 1.6.5.

Continue experimenting with Example 1.6.4 by constructing a vector not in the column space of \(P\text{.}\) Compute its image under \(P\) and verify that it is a linear combination of the basis vectors given in the example. Compute the direction your vector moved and verify that it is a scalar multiple of the basis vector for the null space given in the example. Finally, construct a new vector in the column space and verify that it is unmoved by \(P\text{.}\)

Given a projector, we can define a complementary projector, which has some interesting properties.

Definition 1.6.6.

Given a projector \(P\text{,}\) the complementary projector to \(P\) is \(I-P\text{.}\)

The next lemma justifies calling \(I-P\) a projector.

Lemma 1.6.7.

If \(P\) is a projector, then the complementary projector \(I-P\) is also a projector.

\begin{equation*} \left(I-P\right)^2=I^2-P-P+P^2=I-P-P+P=I-P \end{equation*}
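A quick numerical check of this computation, reusing the matrix of Example 1.6.4 (a sketch in Python with numpy):

import numpy as np

P = np.array([[11, -3, -5], [-4, 7, -10], [-2, -3, 8]]) / 13
C = np.eye(3) - P              # the complementary projector

assert np.allclose(C @ C, C)   # I - P is itself a projector
d = np.array([1, 2, 1])        # basis of N(P) from Example 1.6.4
assert np.allclose(C @ d, d)   # I - P fixes N(P), as the next lemma asserts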

Lemma 1.6.8.

The complementary projector to \(P\) projects onto the null space of \(P\text{.}\) More precisely, \(\csp{I-P}=\nsp{P}\) and \(\nsp{I-P}=\csp{P}\text{.}\)

First, suppose \(\vect{x}\in\nsp{P}\text{.}\) Then

\begin{equation*} \left(I-P\right)\vect{x}=I\vect{x}-P\vect{x}=\vect{x} \end{equation*}

demonstrating that \(\vect{x}\) is a linear combination of the columns of \(I-P\text{.}\) So \(\nsp{P}\subseteq\csp{I-P}\text{.}\)

Now, suppose \(\vect{x}\in\csp{I-P}\text{.}\) Then there is a vector \(\vect{w}\) such that \(\vect{x}=\left(I-P\right)\vect{w}\text{.}\) Then

\begin{equation*} P\vect{x} = P\left(I-P\right)\vect{w}=(P-P^2)\vect{w}=\zeromatrix\vect{w}=\zerovector. \end{equation*}

So \(\csp{I-P}\subseteq\nsp{P}\text{.}\)

To establish the second conclusion, replace the projector \(P\) in the first conclusion by the projector \(I-P\text{.}\)

Using these facts about complementary projectors we find a simple direct sum decomposition.

Theorem 1.6.9.

Suppose \(P\) is a projector of size \(n\text{.}\) Then \(\complex{n}=\csp{P}\ds\nsp{P}\text{.}\)

First, we show that \(\csp{P}\cap\nsp{P}=\set{\zerovector}\text{.}\) Suppose \(\vect{x}\in\csp{P}\cap\nsp{P}\text{.}\) Since \(\vect{x}\in\csp{P}\text{,}\) Lemma 1.6.8 implies that \(\vect{x}\in\nsp{I-P}\text{.}\) So

\begin{equation*} \vect{x}=\vect{x}-\zerovector=\vect{x}-P\vect{x}=\left(I-P\right)\vect{x}=\zerovector. \end{equation*}

Using Lemma 1.6.8 again, \(\nsp{P}=\csp{I-P}\text{.}\) We show that an arbitrary vector \(\vect{w}\in\complex{n}\) can be written as a sum of two vectors from the two column spaces,

\begin{equation*} \vect{w} = I\vect{w} - P\vect{w} + P\vect{w} = \left(I-P\right)\vect{w} + P\vect{w}. \end{equation*}

So \(\complex{n}\text{,}\) \(\csp{P}\text{,}\) and \(\nsp{P}\) meet the hypotheses of Theorem 1.2.5, allowing us to establish the direct sum.
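The decomposition in the proof is easy to carry out numerically. A sketch in Python with numpy, again using the matrix of Example 1.6.4 and a vector chosen for illustration:

import numpy as np

P = np.array([[11, -3, -5], [-4, 7, -10], [-2, -3, 8]]) / 13
w = np.array([1.0, -2.0, 7.0])

u = P @ w                      # the piece in the column space
v = (np.eye(3) - P) @ w        # the piece in the null space
assert np.allclose(u + v, w)   # the two pieces sum to w
assert np.allclose(P @ u, u)   # u is fixed by P, so u is in C(P)
assert np.allclose(P @ v, 0)   # v is sent to zero, so v is in N(P)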

Subsection 1.6.2 Orthogonal Projectors

The projectors of the previous subsection would be termed oblique projectors, since no assumption was made about the direction in which a vector is moved when projected. We remedy that situation now by defining an orthogonal projector to be a projector whose complementary subspace is orthogonal to the subspace the projector projects onto.

Definition 1.6.10.

A projector \(P\) is orthogonal if \(\nsp{P}=\per{\left(\csp{P}\right)}\text{.}\)

We know from Theorem 1.6.9 that for a projector \(P\text{,}\) \(\complex{n}=\csp{P}\ds\nsp{P}\text{.}\) We also know by Corollary 1.3.8 that for any \(m\times n\) matrix \(A\text{,}\) \(\complex{m}=\csp{A}\ds\per{\csp{A}}=\csp{A}\ds\nsp{\adjoint{A}}\text{.}\) So, superficially, we might expect orthogonal projectors to be Hermitian. And so it is.

Theorem 1.6.11.

A projector \(P\) is an orthogonal projector if and only if \(P\) is Hermitian.

Theorem HMIP says that a Hermitian matrix \(A\) is characterized by the property that \(\innerproduct{A\vect{x}}{\vect{y}}=\innerproduct{\vect{x}}{A\vect{y}}\) for every choice of the vectors \(\vect{x}, \vect{y}\text{.}\) We will use this result in both halves of the proof.

First, assume \(P\) is Hermitian and suppose that \(\vect{x}\in\nsp{P}\text{.}\) Then for any \(\vect{y}\in\csp{P}\text{,}\) there is a vector \(\vect{w}\) that allows us to write

\begin{equation*} \innerproduct{\vect{x}}{\vect{y}} =\innerproduct{\vect{x}}{P\vect{w}} =\innerproduct{P\vect{x}}{\vect{w}} =\innerproduct{\zerovector}{\vect{w}} =0. \end{equation*}

So \(\nsp{P}\subseteq\per{\csp{P}}.\)

Now suppose that \(\vect{x}\in\per{\csp{P}}\text{.}\) Consider,

\begin{equation*} \innerproduct{P\vect{x}}{P\vect{x}} =\innerproduct{P^2\vect{x}}{\vect{x}} =\innerproduct{P\vect{x}}{\vect{x}} =0. \end{equation*}

By Theorem PIP, we conclude that \(P\vect{x}=\zerovector\) and \(\vect{x}\in\nsp{P}\text{.}\) So \(\per{\csp{P}}\subseteq\nsp{P}\) and we have established the set equality of Definition 1.6.10.

Now assume \(P\) is an orthogonal projector, and let \(\vect{u},\vect{v}\in\complex{n}\) be any two vectors. Decompose each into two pieces, the first from the column space, the second from the null space, according to Theorem 1.6.9. So

\begin{align*} \vect{u}&=\vect{u}_1+\vect{u}_2&\vect{v}&=\vect{v}_1+\vect{v}_2 \end{align*}

with \(\vect{u}_1,\vect{v}_1\in\csp{P}\) and \(\vect{u}_2,\vect{v}_2\in\nsp{P}\text{.}\) Then

\begin{align*} \innerproduct{P\vect{u}}{\vect{v}} &=\innerproduct{P\vect{u}_1+P\vect{u}_2}{\vect{v}_1+\vect{v}_2} =\innerproduct{P\vect{u}_1}{\vect{v}_1+\vect{v}_2}\\ &=\innerproduct{\vect{u}_1}{\vect{v}_1+\vect{v}_2} =\innerproduct{\vect{u}_1}{\vect{v}_1}+\innerproduct{\vect{u}_1}{\vect{v}_2} =\innerproduct{\vect{u}_1}{\vect{v}_1}\\ \innerproduct{\vect{u}}{P\vect{v}} &=\innerproduct{\vect{u}_1+\vect{u}_2}{P\vect{v}_1+P\vect{v}_2} =\innerproduct{\vect{u}_1+\vect{u}_2}{P\vect{v}_1}\\ &=\innerproduct{\vect{u}_1+\vect{u}_2}{\vect{v}_1} =\innerproduct{\vect{u}_1}{\vect{v}_1}+\innerproduct{\vect{u}_2}{\vect{v}_1} =\innerproduct{\vect{u}_1}{\vect{v}_1} \end{align*}

Since \(\innerproduct{P\vect{u}}{\vect{v}}=\innerproduct{\vect{u}}{P\vect{v}}\) for all choices of \(\vect{u},\vect{v}\in\complex{n}\text{,}\) Theorem HMIP establishes that \(P\) is Hermitian.
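As a numerical illustration (a sketch in Python with numpy; the rank-one projector below anticipates the construction of the next few paragraphs), the oblique projector of Example 1.6.4 fails to be Hermitian, while an orthogonal projector onto a line is Hermitian, just as the theorem predicts.

import numpy as np

P_oblique = np.array([[11, -3, -5], [-4, 7, -10], [-2, -3, 8]]) / 13
assert not np.allclose(P_oblique, P_oblique.T)   # real case: Hermitian means symmetric

v = np.array([[1.0], [2.0], [2.0]])
P_line = (v @ v.T) / (v.T @ v)                   # orthogonal projector onto span{v}
assert np.allclose(P_line @ P_line, P_line)      # idempotent
assert np.allclose(P_line, P_line.T)             # and Hermitian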

There is an easy recipe for creating orthogonal projectors onto a given subspace. We will first informally motivate the construction, then give the formal proof. Suppose \(U\) is a subspace with a basis \(\vectorlist{u}{k}\) and let \(A\) be a matrix with these basis vectors as the columns. Let \(P\) denote the desired orthogonal projector, and consider its action on an arbitrary vector \(\vect{x}\text{.}\) To project onto \(U\text{,}\) we must have \(P\vect{x}\in\csp{A}\text{,}\) so there is a vector \(\vect{w}\) such that \(P\vect{x}=A\vect{w}\text{.}\) The orthogonality condition will be satisfied if \(P\vect{x}-\vect{x}\) is orthogonal to every vector of \(U\text{.}\) It is enough to require orthogonality to each basis vector of \(U\text{,}\) and hence to each column of \(A\text{.}\) So we have

\begin{align*} &\adjoint{A}\left(P\vect{x}-\vect{x}\right)=\zerovector\\ &\adjoint{A}A\vect{w}-\adjoint{A}\vect{x}=\zerovector\\ &\adjoint{A}A\vect{w}=\adjoint{A}\vect{x} \end{align*}

As \(A\) has full column rank (its columns are a basis of \(U\)), \(\adjoint{A}A\) is nonsingular, so we can employ its inverse to find

\begin{equation*} P\vect{x}=A\vect{w}=A\inverse{\left(\adjoint{A}A\right)}\adjoint{A}\vect{x} \end{equation*}

This suggests that \(P=A\inverse{\left(\adjoint{A}A\right)}\adjoint{A}\text{.}\) And so it is.

Theorem 1.6.12.

Suppose \(U\) is a subspace with basis \(\vectorlist{u}{k}\text{,}\) and let \(A\) be the matrix whose columns are these basis vectors. Then \(P=A\inverse{\left(\adjoint{A}A\right)}\adjoint{A}\) is the orthogonal projector onto \(U\text{.}\)

Because \(A\) is the leftmost term in the product for \(P\text{,}\) \(\csp{P}\subseteq\csp{A}\text{.}\) Because \(\inverse{\left(\adjoint{A}A\right)}\adjoint{A}\) has full row rank, and is therefore surjective, \(\csp{A}\subseteq\csp{P}\text{.}\) So the image of the projector is exactly \(U\text{.}\)

Now we verify that \(P\) is a projector.

\begin{align*} P^2=&\left(A\inverse{\left(\adjoint{A}A\right)}\adjoint{A}\right)\left(A\inverse{\left(\adjoint{A}A\right)}\adjoint{A}\right)\\ =&A\inverse{\left(\adjoint{A}A\right)}\left(\adjoint{A}A\right)\inverse{\left(\adjoint{A}A\right)}\adjoint{A}\\ =&A\inverse{\left(\adjoint{A}A\right)}\adjoint{A}\\ =&P \end{align*}

And lastly, we verify orthogonality of \(P\vect{x}-\vect{x}\) against the basis of \(U\text{.}\)

\begin{align*} \adjoint{A}\left(P\vect{x}-\vect{x}\right)&=\adjoint{A}A\inverse{\left(\adjoint{A}A\right)}\adjoint{A}\vect{x}-\adjoint{A}\vect{x}\\ &=\adjoint{A}\vect{x}-\adjoint{A}\vect{x}\\ &=\zerovector \end{align*}
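The recipe of the theorem translates directly into code. Here is a sketch in Python with numpy; the function name and the sample basis are ours, chosen for illustration.

import numpy as np

def orthogonal_projector(A):
    # P = A (A* A)^{-1} A*, where the columns of A are a basis of U
    return A @ np.linalg.inv(A.conj().T @ A) @ A.conj().T

A = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])   # basis of a plane in C^3
P = orthogonal_projector(A)

x = np.array([1.0, 2.0, -1.0])
assert np.allclose(P @ P, P)                     # P is a projector
assert np.allclose(A.conj().T @ (P @ x - x), 0)  # P x - x is orthogonal to U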

Suppose the basis vectors of \(U\) described in Theorem 1.6.12 form an orthonormal set, and in acknowledgment we denote the matrix with these vectors as columns by \(Q\text{.}\) Then the projector simplifies to \(P=Q\inverse{\left(\adjoint{Q}Q\right)}\adjoint{Q}=Q\adjoint{Q}\text{.}\) The other interesting special case is when \(U\) is 1-dimensional (a “line”). Then \(\adjoint{A}A\) is just the square of the norm of the lone basis vector. With this scalar moved out of the way, the remaining computation, \(A\adjoint{A}\text{,}\) is an outer product that results in a rank \(1\) matrix (as we would expect).
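Both special cases are easy to see numerically (a sketch in Python with numpy; the reduced QR factorization supplies an orthonormal basis for the column space, and the sample matrices are ours).

import numpy as np

A = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
Q, _ = np.linalg.qr(A)                   # columns of Q: orthonormal basis of C(A)
P1 = Q @ Q.conj().T                      # orthonormal case: P = Q Q*
P2 = A @ np.linalg.inv(A.conj().T @ A) @ A.conj().T
assert np.allclose(P1, P2)               # the two constructions agree

v = np.array([[1.0], [2.0], [2.0]])      # one-dimensional case
P_line = (v @ v.conj().T) / (v.conj().T @ v)
assert np.linalg.matrix_rank(P_line) == 1   # a rank 1 outer product, as expected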

Checkpoint 1.6.14.

Construct the orthogonal projector onto the line spanned by

\begin{equation*} \vect{v}=\colvector{4\\2\\1\\10}. \end{equation*}

Illustrate its use by projecting some vector not on the line, and verifying that the difference between the vector and its projection is orthogonal to the line.
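One possible computation for this checkpoint, sketched in Python with numpy (the test vector is an arbitrary choice of ours):

import numpy as np

v = np.array([[4.0], [2.0], [1.0], [10.0]])
P = (v @ v.T) / (v.T @ v)             # orthogonal projector onto span{v}

x = np.array([1.0, 0.0, 0.0, 0.0])    # a vector not on the line
proj = P @ x
assert np.isclose((x - proj) @ v.flatten(), 0)   # the difference is orthogonal to v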

Checkpoint 1.6.15.

Construct the orthogonal projector onto the subspace

\begin{equation*} U=\spn{\set{\colvector{1\\1\\1\\1}, \colvector{4\\2\\2\\5}}}. \end{equation*}

Illustrate its use by projecting some vector not in the subspace, and verifying that the difference between the vector and its projection is orthogonal to the subspace.
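A sketch of one solution in Python with numpy, using the recipe of Theorem 1.6.12 (the test vector is an arbitrary choice of ours):

import numpy as np

A = np.array([[1.0, 4.0], [1.0, 2.0], [1.0, 2.0], [1.0, 5.0]])  # basis of U as columns
P = A @ np.linalg.inv(A.T @ A) @ A.T

x = np.array([1.0, 0.0, 0.0, 0.0])        # a vector not in U
assert np.allclose(A.T @ (P @ x - x), 0)  # P x - x is orthogonal to U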

Checkpoint 1.6.16.

Redo Checkpoint 1.6.15, but first convert the basis for \(U\) to an orthonormal basis via the Gram-Schmidt process (Theorem GSP), and then use the simpler construction applicable to the case of an orthonormal basis.
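A sketch in Python with numpy, where the reduced QR factorization stands in for the Gram-Schmidt process (it produces an orthonormal basis for the same subspace):

import numpy as np

A = np.array([[1.0, 4.0], [1.0, 2.0], [1.0, 2.0], [1.0, 5.0]])
Q, _ = np.linalg.qr(A)        # orthonormal basis for U in the columns of Q
P = Q @ Q.T                   # the simpler construction for an orthonormal basis

P_check = A @ np.linalg.inv(A.T @ A) @ A.T   # the projector from Checkpoint 1.6.15
assert np.allclose(P, P_check)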