This article is a set of reading notes on *Linear algebra and its applications*.
Orthogonal projections
- Given a vector $\boldsymbol y$ and a subspace $W$ in $\mathbb R^n$, there is a vector $\hat{\boldsymbol y}$ in $W$ such that
  - (1) $\hat{\boldsymbol y}$ is the unique vector in $W$ for which $\boldsymbol y - \hat{\boldsymbol y}$ is orthogonal to $W$
  - (2) $\hat{\boldsymbol y}$ is the unique vector in $W$ closest to $\boldsymbol y$
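A minimal numerical sketch of these two properties (assuming NumPy; the orthogonal basis $\{\boldsymbol u_1, \boldsymbol u_2\}$ and the vector $\boldsymbol y$ below are illustrative choices):

```python
import numpy as np

# Illustrative orthogonal (not orthonormal) basis for a plane W in R^3.
u1 = np.array([2.0, 5.0, -1.0])
u2 = np.array([-2.0, 1.0, 1.0])   # u1 . u2 = -4 + 5 - 1 = 0
y = np.array([1.0, 2.0, 3.0])

# Project y onto each basis vector and add the results
# (this is formula (2) of the decomposition theorem below).
y_hat = (y @ u1) / (u1 @ u1) * u1 + (y @ u2) / (u2 @ u2) * u2
z = y - y_hat                     # the component of y in W_perp

print(y_hat)                      # [-0.4  2.   0.2]
print(z @ u1, z @ u2)             # both ~ 0: y - y_hat is orthogonal to W
```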
EXAMPLE 1
Let $\{\boldsymbol u_1,...,\boldsymbol u_5\}$ be an orthogonal basis for $\mathbb R^5$ and let
$$\boldsymbol y = c_1\boldsymbol u_1 + c_2\boldsymbol u_2 + c_3\boldsymbol u_3 + c_4\boldsymbol u_4 + c_5\boldsymbol u_5$$
Consider the subspace $W = Span\{\boldsymbol u_1,\boldsymbol u_2\}$, and write $\boldsymbol y$ as the sum of a vector $\boldsymbol z_1$ in $W$ and a vector $\boldsymbol z_2$ in $W^\perp$.
SOLUTION
- Write
$$\boldsymbol y = \underbrace{c_1\boldsymbol u_1 + c_2\boldsymbol u_2}_{\boldsymbol z_1} + \underbrace{c_3\boldsymbol u_3 + c_4\boldsymbol u_4 + c_5\boldsymbol u_5}_{\boldsymbol z_2}$$
Then $\boldsymbol z_1$ is in $W$, and $\boldsymbol z_2$ is in $W^\perp$, because $\boldsymbol z_2$ is a linear combination of $\boldsymbol u_3, \boldsymbol u_4, \boldsymbol u_5$, each of which is orthogonal to both $\boldsymbol u_1$ and $\boldsymbol u_2$, hence to every vector in $W$.
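A quick numerical check of this decomposition (a sketch assuming NumPy; a QR factorization is used to manufacture an orthonormal, hence orthogonal, basis of $\mathbb R^5$):

```python
import numpy as np

rng = np.random.default_rng(0)
# Columns of Q form an orthonormal (hence orthogonal) basis u1,...,u5 of R^5.
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
u = [Q[:, j] for j in range(5)]

y = rng.standard_normal(5)
c = [y @ u[j] for j in range(5)]              # coordinates of y in this basis

z1 = c[0] * u[0] + c[1] * u[1]                # piece in W = Span{u1, u2}
z2 = c[2] * u[2] + c[3] * u[3] + c[4] * u[4]  # piece in W_perp

print(np.allclose(y, z1 + z2))                             # True
print(np.isclose(z2 @ u[0], 0), np.isclose(z2 @ u[1], 0))  # True True
```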
- The next theorem shows that the decomposition $\boldsymbol y = \boldsymbol z_1 + \boldsymbol z_2$ in Example 1 can be computed without having an orthogonal basis for $\mathbb R^n$. It is enough to have an orthogonal basis only for $W$.

THE ORTHOGONAL DECOMPOSITION THEOREM (THEOREM 8)
Let $W$ be a subspace of $\mathbb R^n$. Then each $\boldsymbol y$ in $\mathbb R^n$ can be written uniquely in the form
$$\boldsymbol y = \hat{\boldsymbol y} + \boldsymbol z\tag{1}$$
where $\hat{\boldsymbol y}$ is in $W$ and $\boldsymbol z$ is in $W^\perp$. In fact, if $\{\boldsymbol u_1,...,\boldsymbol u_p\}$ is any orthogonal basis of $W$, then
$$\hat{\boldsymbol y} = \frac{\boldsymbol y\cdot\boldsymbol u_1}{\boldsymbol u_1\cdot\boldsymbol u_1}\boldsymbol u_1 + \cdots + \frac{\boldsymbol y\cdot\boldsymbol u_p}{\boldsymbol u_p\cdot\boldsymbol u_p}\boldsymbol u_p\tag{2}$$
and $\boldsymbol z = \boldsymbol y - \hat{\boldsymbol y}$.
- The vector $\hat{\boldsymbol y}$ in (1) is called the orthogonal projection of $\boldsymbol y$ onto $W$ and often is written as $proj_W\boldsymbol y$. When $W$ is a one-dimensional subspace, the formula for $\hat{\boldsymbol y}$ matches the formula given in Section 6.2.
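For reference, when $W = Span\{\boldsymbol u\}$ with $\boldsymbol u \neq \boldsymbol 0$, formula (2) has a single term and reduces to the Section 6.2 formula:
$$\hat{\boldsymbol y} = proj_W\boldsymbol y = \frac{\boldsymbol y\cdot\boldsymbol u}{\boldsymbol u\cdot\boldsymbol u}\boldsymbol u$$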
PROOF
We may assume that $W$ is not the zero subspace, for otherwise $W^\perp = \mathbb R^n$ and (1) is simply $\boldsymbol y = \boldsymbol 0 + \boldsymbol y$. The next section will show that any nonzero subspace of $\mathbb R^n$ has an orthogonal basis.
- Let $\{\boldsymbol u_1,...,\boldsymbol u_p\}$ be any orthogonal basis for $W$, and define $\hat{\boldsymbol y}$ by (2). Let $\boldsymbol z = \boldsymbol y - \hat{\boldsymbol y}$. Since $\boldsymbol u_1$ is orthogonal to $\boldsymbol u_2,...,\boldsymbol u_p$,
$$\boldsymbol z\cdot\boldsymbol u_1 = (\boldsymbol y - \hat{\boldsymbol y})\cdot\boldsymbol u_1 = \boldsymbol y\cdot\boldsymbol u_1 - \frac{\boldsymbol y\cdot\boldsymbol u_1}{\boldsymbol u_1\cdot\boldsymbol u_1}(\boldsymbol u_1\cdot\boldsymbol u_1) - 0 - \cdots - 0 = 0$$
Thus $\boldsymbol z$ is orthogonal to $\boldsymbol u_1$. Similarly, $\boldsymbol z$ is orthogonal to each $\boldsymbol u_j$ in the basis for $W$. Hence $\boldsymbol z$ is orthogonal to every vector in $W$. That is, $\boldsymbol z$ is in $W^\perp$.
- To show that the decomposition in (1) is unique, suppose $\boldsymbol y$ can also be written as $\boldsymbol y = \hat{\boldsymbol y}_1 + \boldsymbol z_1$, with $\hat{\boldsymbol y}_1$ in $W$ and $\boldsymbol z_1$ in $W^\perp$. Then $\hat{\boldsymbol y} + \boldsymbol z = \hat{\boldsymbol y}_1 + \boldsymbol z_1$, and so
$$\hat{\boldsymbol y} - \hat{\boldsymbol y}_1 = \boldsymbol z_1 - \boldsymbol z$$
This equality shows that the vector $\boldsymbol v = \hat{\boldsymbol y} - \hat{\boldsymbol y}_1$ is in $W$ and in $W^\perp$. Hence $\boldsymbol v\cdot\boldsymbol v = 0$, which shows that $\boldsymbol v = \boldsymbol 0$. This proves that $\hat{\boldsymbol y} = \hat{\boldsymbol y}_1$ and also $\boldsymbol z_1 = \boldsymbol z$.
EXERCISE
Suppose that $\{\boldsymbol u_1, \boldsymbol u_2\}$ is an orthogonal set of nonzero vectors in $\mathbb R^3$. How would you find an orthogonal basis of $\mathbb R^3$ that contains $\boldsymbol u_1$ and $\boldsymbol u_2$?
SOLUTION
- First, find a vector $\boldsymbol v$ in $\mathbb R^3$ that is not in the subspace $W$ spanned by $\boldsymbol u_1$ and $\boldsymbol u_2$. Let $\boldsymbol u_3 = \boldsymbol v - proj_W\boldsymbol v$; then $\{\boldsymbol u_1, \boldsymbol u_2, \boldsymbol u_3\}$ is an orthogonal basis.
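A sketch of this construction (assuming NumPy; $\boldsymbol u_1$, $\boldsymbol u_2$, and $\boldsymbol v$ are illustrative choices):

```python
import numpy as np

u1 = np.array([1.0, 0.0, 1.0])
u2 = np.array([1.0, 0.0, -1.0])   # u1 . u2 = 0, so {u1, u2} is orthogonal

# W = Span{u1, u2} is the xz-plane, so any vector with a nonzero
# second entry lies outside W.
v = np.array([0.0, 1.0, 1.0])

proj_W_v = (v @ u1) / (u1 @ u1) * u1 + (v @ u2) / (u2 @ u2) * u2
u3 = v - proj_W_v

print(u3)                  # [0. 1. 0.]
print(u3 @ u1, u3 @ u2)    # 0.0 0.0 -> {u1, u2, u3} is an orthogonal basis
```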
EXERCISE 23
Let $A$ be an $m \times n$ matrix. Prove that every vector $\boldsymbol x$ in $\mathbb R^n$ can be written in the form $\boldsymbol x = \boldsymbol p + \boldsymbol u$, where $\boldsymbol p$ is in $Row\ A$ and $\boldsymbol u$ is in $Nul\ A$. Also, show that if the equation $A\boldsymbol x = \boldsymbol b$ is consistent, then there is a unique $\boldsymbol p$ in $Row\ A$ such that $A\boldsymbol p = \boldsymbol b$.
SOLUTION
- By the Orthogonal Decomposition Theorem, each $\boldsymbol x$ in $\mathbb R^n$ can be written uniquely as $\boldsymbol x = \boldsymbol p + \boldsymbol u$, with $\boldsymbol p$ in $Row\ A$ and $\boldsymbol u$ in $(Row\ A)^\perp = Nul\ A$.
- Next, suppose that $A\boldsymbol x = \boldsymbol b$ is consistent. Let $\boldsymbol x$ be a solution, and write $\boldsymbol x = \boldsymbol p + \boldsymbol u$ as above. Then $A\boldsymbol p = A(\boldsymbol x - \boldsymbol u) = A\boldsymbol x - A\boldsymbol u = \boldsymbol b - \boldsymbol 0 = \boldsymbol b$, so the equation $A\boldsymbol x = \boldsymbol b$ has at least one solution $\boldsymbol p$ in $Row\ A$.
- Finally, suppose that $\boldsymbol p$ and $\boldsymbol p_1$ are both in $Row\ A$ and satisfy $A\boldsymbol x = \boldsymbol b$. Then $\boldsymbol p - \boldsymbol p_1$ is in $Nul\ A$ because
$$A(\boldsymbol p - \boldsymbol p_1) = A\boldsymbol p - A\boldsymbol p_1 = \boldsymbol b - \boldsymbol b = \boldsymbol 0$$
The equations $\boldsymbol p = \boldsymbol p_1 + (\boldsymbol p - \boldsymbol p_1)$ and $\boldsymbol p = \boldsymbol p + \boldsymbol 0$ both decompose $\boldsymbol p$ as the sum of a vector in $Row\ A$ and a vector in $(Row\ A)^\perp$. By the uniqueness of the orthogonal decomposition (Theorem 8), $\boldsymbol p_1 = \boldsymbol p$, so $\boldsymbol p$ is unique.
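A numerical sketch of this decomposition (assuming NumPy; $A$ and $\boldsymbol x$ are random, and an orthonormal basis for $Row\ A$ is read off the SVD):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))   # m x n with m < n, so Nul A is nonzero
x = rng.standard_normal(5)

# Rows of Vt[:r] form an orthonormal basis for Row A (r = rank A).
_, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))
Q = Vt[:r].T                      # n x r, orthonormal columns spanning Row A

p = Q @ Q.T @ x                   # projection of x onto Row A
u = x - p                         # lies in (Row A)_perp = Nul A

print(np.allclose(A @ u, 0))      # True: u is in Nul A
print(np.allclose(A @ p, A @ x))  # True: p satisfies A p = A x
```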
A Geometric Interpretation of the Orthogonal Projection
- When $W$ is a one-dimensional subspace, the formula (2) for $proj_W\boldsymbol y$ contains just one term. Thus, when $\dim W > 1$, each term in (2) is itself an orthogonal projection of $\boldsymbol y$ onto a one-dimensional subspace spanned by one of the $\boldsymbol u$'s in the basis for $W$. Figure 3 illustrates this when $W$ is a subspace of $\mathbb R^3$ spanned by $\boldsymbol u_1$ and $\boldsymbol u_2$.
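In the setting of Figure 3, with $L_1 = Span\{\boldsymbol u_1\}$ and $L_2 = Span\{\boldsymbol u_2\}$, formula (2) splits into two one-dimensional projections:
$$\hat{\boldsymbol y} = \frac{\boldsymbol y\cdot\boldsymbol u_1}{\boldsymbol u_1\cdot\boldsymbol u_1}\boldsymbol u_1 + \frac{\boldsymbol y\cdot\boldsymbol u_2}{\boldsymbol u_2\cdot\boldsymbol u_2}\boldsymbol u_2 = proj_{L_1}\boldsymbol y + proj_{L_2}\boldsymbol y$$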
Properties of Orthogonal Projections
- If $\boldsymbol y$ is in $W = Span\{\boldsymbol u_1,...,\boldsymbol u_p\}$, then $proj_W\boldsymbol y = \boldsymbol y$. This fact also follows from the next theorem.
The Best Approximation Theorem (Theorem 9)
Let $W$ be a subspace of $\mathbb R^n$, let $\boldsymbol y$ be any vector in $\mathbb R^n$, and let $\hat{\boldsymbol y}$ be the orthogonal projection of $\boldsymbol y$ onto $W$. Then $\hat{\boldsymbol y}$ is the closest point in $W$ to $\boldsymbol y$, in the sense that
$$\left\|\boldsymbol y - \hat{\boldsymbol y}\right\| < \left\|\boldsymbol y - \boldsymbol v\right\|\tag{3}$$
for all $\boldsymbol v$ in $W$ distinct from $\hat{\boldsymbol y}$.
- The vector $\hat{\boldsymbol y}$ in Theorem 9 is called the best approximation to $\boldsymbol y$ by elements of $W$.
- Later sections in the text will examine problems where a given $\boldsymbol y$ must be replaced, or approximated, by a vector $\boldsymbol v$ in some fixed subspace $W$. The distance $\left\|\boldsymbol y - \boldsymbol v\right\|$ can be regarded as the "error" of using $\boldsymbol v$ in place of $\boldsymbol y$. Theorem 9 says that this error is minimized when $\boldsymbol v = \hat{\boldsymbol y}$.
- Inequality (3) leads to a new proof that $\hat{\boldsymbol y}$ does not depend on the particular orthogonal basis used to compute it.
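A small numerical illustration of Theorem 9 (a sketch assuming NumPy, reusing illustrative vectors): no sampled point of $W$ gets closer to $\boldsymbol y$ than $\hat{\boldsymbol y}$.

```python
import numpy as np

rng = np.random.default_rng(2)
u1 = np.array([2.0, 5.0, -1.0])   # orthogonal basis for a plane W in R^3
u2 = np.array([-2.0, 1.0, 1.0])
y = np.array([1.0, 2.0, 3.0])

y_hat = (y @ u1) / (u1 @ u1) * u1 + (y @ u2) / (u2 @ u2) * u2
best = np.linalg.norm(y - y_hat)

# Sample many other points v = a*u1 + b*u2 of W and compare errors.
errs = [np.linalg.norm(y - (a * u1 + b * u2))
        for a, b in rng.standard_normal((1000, 2))]
print(best <= min(errs))          # True: no v in W beats y_hat
```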
PROOF
- Take $\boldsymbol v$ in $W$ distinct from $\hat{\boldsymbol y}$. See Figure 4. Then $\hat{\boldsymbol y} - \boldsymbol v$ is in $W$, so $\boldsymbol y - \hat{\boldsymbol y}$ is orthogonal to $\hat{\boldsymbol y} - \boldsymbol v$. Since
$$\boldsymbol y - \boldsymbol v = (\boldsymbol y - \hat{\boldsymbol y}) + (\hat{\boldsymbol y} - \boldsymbol v)$$
the Pythagorean Theorem gives
$$\left\|\boldsymbol y - \boldsymbol v\right\|^2 = \left\|\boldsymbol y - \hat{\boldsymbol y}\right\|^2 + \left\|\hat{\boldsymbol y} - \boldsymbol v\right\|^2$$
Now $\left\|\hat{\boldsymbol y} - \boldsymbol v\right\|^2 > 0$, and so inequality (3) follows immediately.
- The final theorem in this section shows how formula (2) for $proj_W\boldsymbol y$ is simplified when the basis for $W$ is an orthonormal set.

THEOREM 10
If $\{\boldsymbol u_1,...,\boldsymbol u_p\}$ is an orthonormal basis for a subspace $W$ of $\mathbb R^n$, then
$$proj_W\boldsymbol y = (\boldsymbol y\cdot\boldsymbol u_1)\boldsymbol u_1 + (\boldsymbol y\cdot\boldsymbol u_2)\boldsymbol u_2 + \cdots + (\boldsymbol y\cdot\boldsymbol u_p)\boldsymbol u_p$$
If $U = [\boldsymbol u_1\ \boldsymbol u_2\ \cdots\ \boldsymbol u_p]$, then
$$proj_W\boldsymbol y = UU^T\boldsymbol y\quad\text{for all }\boldsymbol y\text{ in }\mathbb R^n$$
- Suppose $U$ is an $n \times p$ matrix with orthonormal columns, and let $W$ be the column space of $U$. Then
$$U^TU\boldsymbol x = \boldsymbol x\quad\text{for all }\boldsymbol x\text{ in }\mathbb R^p$$
$$UU^T\boldsymbol y = proj_W\boldsymbol y\quad\text{for all }\boldsymbol y\text{ in }\mathbb R^n$$
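Both identities can be checked numerically (a sketch assuming NumPy; $U$'s orthonormal columns come from a QR factorization):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 5, 2
U, _ = np.linalg.qr(rng.standard_normal((n, p)))   # n x p orthonormal columns

x = rng.standard_normal(p)
y = rng.standard_normal(n)

print(np.allclose(U.T @ U @ x, x))       # U^T U = I_p, so U^T U x = x

# UU^T y agrees with the term-by-term projection formula of Theorem 10.
proj = sum((y @ U[:, j]) * U[:, j] for j in range(p))
print(np.allclose(U @ U.T @ y, proj))    # UU^T y = proj_W y
```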
EXAMPLE
Let $W$ be a subspace of $\mathbb R^n$. Let $\boldsymbol x$ and $\boldsymbol y$ be vectors in $\mathbb R^n$ and let $\boldsymbol z = \boldsymbol x + \boldsymbol y$. If $\boldsymbol u$ is the projection of $\boldsymbol x$ onto $W$ and $\boldsymbol v$ is the projection of $\boldsymbol y$ onto $W$, show that $\boldsymbol u + \boldsymbol v$ is the projection of $\boldsymbol z$ onto $W$.
SOLUTION
- Let $U$ be a matrix whose columns consist of an orthonormal basis for $W$. Then
$$\begin{aligned}proj_W\boldsymbol z &= UU^T\boldsymbol z \\&= UU^T(\boldsymbol x + \boldsymbol y)\\&= UU^T\boldsymbol x + UU^T\boldsymbol y \\&= proj_W\boldsymbol x + proj_W\boldsymbol y \\&= \boldsymbol u + \boldsymbol v\end{aligned}$$
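The same linearity can be verified numerically (a sketch assuming NumPy; $U$, $\boldsymbol x$, and $\boldsymbol y$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
U, _ = np.linalg.qr(rng.standard_normal((4, 2)))   # orthonormal basis for W

x = rng.standard_normal(4)
y = rng.standard_normal(4)
z = x + y

u = U @ U.T @ x    # projection of x onto W
v = U @ U.T @ y    # projection of y onto W

print(np.allclose(U @ U.T @ z, u + v))   # True: proj_W z = u + v
```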