These are reading notes for *Linear algebra and its applications*.
Inner product spaces
- Notions of length, distance, and orthogonality are often important in applications involving a vector space. For $\R^n$, these concepts were based on the properties of the inner product. For other spaces, we need analogues of the inner product with the same properties.
Definition of an Inner Product Space
- An inner product on a vector space $V$ assigns to each pair of vectors $\boldsymbol u,\boldsymbol v$ in $V$ a real number $<\boldsymbol u,\boldsymbol v>$ satisfying, for all $\boldsymbol u,\boldsymbol v,\boldsymbol w$ in $V$ and all scalars $c$: (1) $<\boldsymbol u,\boldsymbol v>=<\boldsymbol v,\boldsymbol u>$; (2) $<\boldsymbol u+\boldsymbol v,\boldsymbol w>=<\boldsymbol u,\boldsymbol w>+<\boldsymbol v,\boldsymbol w>$; (3) $<c\boldsymbol u,\boldsymbol v>=c<\boldsymbol u,\boldsymbol v>$; (4) $<\boldsymbol u,\boldsymbol u>\geq 0$, with $<\boldsymbol u,\boldsymbol u>=0$ if and only if $\boldsymbol u=\boldsymbol 0$. A vector space with an inner product is called an inner product space.
$<\boldsymbol v,\boldsymbol 0>=<\boldsymbol 0,\boldsymbol v>=0$
PROOF
$<\boldsymbol v,\boldsymbol 0>=<\boldsymbol v,0\,\boldsymbol 0>=0<\boldsymbol v,\boldsymbol 0>=0$, and $<\boldsymbol 0,\boldsymbol v>=0$ then follows by Axiom 1.
Common Inner Product Spaces
Weighted Least Squares
EXAMPLE 1
- Let $\boldsymbol u=(u_1,u_2)$ and $\boldsymbol v=(v_1,v_2)$ be vectors in $\R^2$. It can be shown that
$$<\boldsymbol u,\boldsymbol v>=4u_1v_1+5u_2v_2\qquad(1)$$
defines an inner product.
- Inner products similar to (1) can be defined on $\R^n$. They arise naturally in connection with "weighted least-squares" problems, in which weights are assigned to the various entries in the sum for the inner product in such a way that more importance is given to the more reliable measurements.
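As a quick numerical sketch (the helper names `weighted_inner` and `weighted_least_squares` are mine, not from the book), the inner product (1) is just a weighted dot product, and the same weighting idea turns an ordinary least-squares problem into a weighted one by scaling each equation by the square root of its weight:

```python
import numpy as np

def weighted_inner(u, v, w):
    """<u, v> = sum_i w_i * u_i * v_i, with all weights w_i > 0."""
    u, v, w = map(np.asarray, (u, v, w))
    return float(np.sum(w * u * v))

# The inner product (1), with weights 4 and 5:
u, v = np.array([1.0, 2.0]), np.array([3.0, -1.0])
print(weighted_inner(u, v, [4, 5]))      # 4*1*3 + 5*2*(-1) = 2.0

def weighted_least_squares(A, b, w):
    """Minimize sum_i w_i * (b_i - (A x)_i)^2 by scaling each row by sqrt(w_i)
    and solving the resulting ordinary least-squares problem."""
    s = np.sqrt(np.asarray(w, dtype=float))
    x, *_ = np.linalg.lstsq(A * s[:, None], b * s, rcond=None)
    return x
```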
An Inner Product Space of Polynomials
From now on, when an inner product space involves polynomials or other functions, we will write the functions in the familiar way, rather than use the boldface type for vectors. Nevertheless, it is important to remember that each function is a vector when it is treated as an element of a vector space.
EXAMPLE 2
- Let $t_0,\dots,t_n$ be distinct real numbers. For $p$ and $q$ in $\mathbb P^n$ (a vector space), define
$$<p,q>=p(t_0)q(t_0)+p(t_1)q(t_1)+\cdots+p(t_n)q(t_n)\qquad(2)$$
- Inner product Axioms 1–3 are readily checked.
- For Axiom 4, note that
$$<p,p>=[p(t_0)]^2+[p(t_1)]^2+\cdots+[p(t_n)]^2\geq 0$$
Also, $<\boldsymbol 0,\boldsymbol 0>=0$. (The boldface zero here denotes the zero polynomial, the zero vector in $\mathbb P^n$.) If $<p,p>=0$, then $p$ must vanish at $n+1$ points: $t_0,\dots,t_n$. This is possible only if $p$ is the zero polynomial, because the degree of $p$ is less than $n+1$. Thus (2) defines an inner product on $\mathbb P^n$.
This example is important; the next several examples all revolve around this inner product space of polynomials.
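A minimal sketch of the evaluation inner product (2) in code (representing a polynomial by its list of coefficients; the helper names are my own choices):

```python
import numpy as np

def eval_poly(coeffs, t):
    """Evaluate p(t) = coeffs[0] + coeffs[1]*t + ... + coeffs[k]*t^k at the points t."""
    t = np.asarray(t, dtype=float)
    return sum(c * t**k for k, c in enumerate(coeffs))

def inner(p, q, pts):
    """<p, q> = p(t_0)q(t_0) + ... + p(t_n)q(t_n) for distinct points t_0, ..., t_n."""
    return float(np.sum(eval_poly(p, pts) * eval_poly(q, pts)))

pts = [-2, -1, 0, 1, 2]        # five distinct points (enough for P^4)
p = [0, 1]                     # p(t) = t
q = [0, 0, 1]                  # q(t) = t^2
print(inner(p, q, pts))        # -8 - 1 + 0 + 1 + 8 = 0
print(inner(p, p, pts))        # 4 + 1 + 0 + 1 + 4 = 10 > 0 (Axiom 4)
```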
An Inner Product for $C[a, b]$
- Probably the most widely used inner product space for applications is the vector space $C[a, b]$ of all continuous functions on an interval $a\leq t\leq b$, with an inner product that we will describe.
- We begin by considering a polynomial $p$ and any integer $n$ larger than or equal to the degree of $p$. Then $p$ is in $\mathbb P^n$, and we may compute a "length" for $p$ using the inner product of Example 2 involving evaluation at $n+1$ points in $[a, b]$. However, this length of $p$ captures the behavior at only those $n+1$ points.
- We could use a much larger $n$, with many more points for the "evaluation" inner product. See Figure 4.
- Let us partition $[a, b]$ into $n+1$ subintervals of length $\Delta t=(b-a)/(n+1)$, and let $t_0,\dots,t_n$ be arbitrary points in these subintervals.
If $n$ is large, the inner product on $\mathbb P^n$ determined by $t_0,\dots,t_n$ will tend to give a large value to $<p,p>$, so we scale it down and divide by $n+1$. Observe that $1/(n+1)=\Delta t/(b-a)$, and define
$$<p,q>=\frac{1}{n+1}\sum_{j=0}^{n}p(t_j)q(t_j)=\frac{1}{b-a}\left[\sum_{j=0}^{n}p(t_j)q(t_j)\,\Delta t\right]$$
- Now, let $n$ increase without bound. Since polynomials $p$ and $q$ are continuous functions, the expression in brackets is a Riemann sum that approaches a definite integral, and we are led to consider the *average value* of $p(t)q(t)$ on the interval $[a, b]$:
$$\frac{1}{b-a}\int_a^b p(t)q(t)\,dt$$
This quantity is defined for polynomials of any degree (in fact, for all continuous functions), and it has all the properties of an inner product. The scale factor $1/(b-a)$ is inessential and is often omitted for simplicity.
- The inner product discussed above provides a more sophisticated approach to least-squares curve fitting.
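The passage from the scaled evaluation inner product to the integral can be checked numerically; a minimal sketch (assuming midpoints of the subintervals as the evaluation points; all names are mine):

```python
import numpy as np

def riemann_inner(p, q, a, b, n):
    """(1/(n+1)) * sum_j p(t_j) q(t_j), with t_j the midpoint of the j-th of
    n+1 equal subintervals of [a, b]."""
    dt = (b - a) / (n + 1)
    t = a + dt * (np.arange(n + 1) + 0.5)
    return float(np.mean(p(t) * q(t)))

def integral_inner(p, q, a, b, m=1_000_000):
    """Average value of p(t)q(t) on [a, b], approximated by a very fine midpoint sum."""
    return riemann_inner(p, q, a, b, m - 1)

p = lambda t: t            # p(t) = t
q = lambda t: t**2 + 1     # q(t) = t^2 + 1
a, b = 0.0, 2.0
for n in (4, 40, 400, 4000):
    print(n, riemann_inner(p, q, a, b, n))
print("limit:", integral_inner(p, q, a, b))   # (1/2) * ∫_0^2 (t^3 + t) dt = 3
```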
Lengths, Distances, and Orthogonality
- Let $V$ be an inner product space, with the inner product denoted by $<\boldsymbol u,\boldsymbol v>$. Just as in $\R^n$, we define the length, or norm, of a vector $\boldsymbol v$ to be the scalar
$$\|\boldsymbol v\|=\sqrt{<\boldsymbol v,\boldsymbol v>}$$
- A unit vector is one whose length is 1.
- The distance between $\boldsymbol u$ and $\boldsymbol v$ is $\|\boldsymbol u-\boldsymbol v\|$.
- Vectors $\boldsymbol u$ and $\boldsymbol v$ are orthogonal if $<\boldsymbol u,\boldsymbol v>=0$.
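These definitions carry over directly to code once an inner product is chosen; a small sketch (helper names mine) using the evaluation inner product at $-2,-1,0,1,2$ from Example 2:

```python
import numpy as np

PTS = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

def ip(p, q):
    """Evaluation inner product of two polynomials given as callables."""
    return float(np.sum(p(PTS) * q(PTS)))

def norm(p):
    return np.sqrt(ip(p, p))

def dist(p, q):
    return norm(lambda t: p(t) - q(t))

def orthogonal(p, q, tol=1e-12):
    return abs(ip(p, q)) < tol

one = lambda t: np.ones_like(t)   # the constant polynomial 1
t_  = lambda t: t                 # the polynomial t
print(norm(one))                  # sqrt(5)
print(dist(one, t_))              # sqrt(9 + 4 + 1 + 0 + 1) = sqrt(15)
print(orthogonal(one, t_))        # True, since -2 - 1 + 0 + 1 + 2 = 0
```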
The Gram–Schmidt Process
- The existence of orthogonal bases for finite-dimensional subspaces of an inner product space can be established by the Gram–Schmidt process, just as in $\R^n$.
- The orthogonal projection of a vector onto a subspace $W$ with an orthogonal basis can be constructed as usual. The projection does not depend on the choice of orthogonal basis, and it has the properties described in the Orthogonal Decomposition Theorem and the Best Approximation Theorem.
EXAMPLE 5
Let $V$ be $\mathbb P^4$ with the inner product in Example 2, involving evaluation of polynomials at $-2, -1, 0, 1$, and $2$, and view $\mathbb P^2$ as a subspace of $V$. Produce an orthogonal basis for $\mathbb P^2$ by applying the Gram–Schmidt process to the polynomials $1, t$, and $t^2$.
SOLUTION
- The inner product depends only on the values of a polynomial at $-2, -1, 0, 1, 2$, so we list the values of each polynomial as a vector in $\R^5$, underneath the name of the polynomial:
$$\begin{matrix}\text{Polynomial:} & 1 & t & t^2\\[2pt] \text{Vector of values:} & \begin{bmatrix}1\\1\\1\\1\\1\end{bmatrix} & \begin{bmatrix}-2\\-1\\0\\1\\2\end{bmatrix} & \begin{bmatrix}4\\1\\0\\1\\4\end{bmatrix}\end{matrix}\qquad(3)$$
- The inner product of two polynomials in $V$ equals the (standard) inner product of their corresponding vectors in $\R^5$. Observe that $t$ is orthogonal to the constant function $1$. So take $p_0(t)=1$ and $p_1(t)=t$. For $p_2$, use the vectors in $\R^5$ to compute the projection of $t^2$ onto $\mathrm{Span}\{p_0,p_1\}$:
$$<t^2,p_0>=4+1+0+1+4=10,\quad <p_0,p_0>=5,\quad <t^2,p_1>=-8-1+0+1+8=0$$
The orthogonal projection of $t^2$ onto $\mathrm{Span}\{p_0,p_1\}$ is $\frac{10}{5}p_0+0p_1$. Thus
$$p_2(t)=t^2-2p_0(t)=t^2-2$$
- An orthogonal basis for the subspace $\mathbb P^2$ of $V$ is $\{p_0,p_1,p_2\}=\{1,\ t,\ t^2-2\}$.
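Example 5 can be reproduced by running Gram–Schmidt directly on the value vectors in (3), since the inner product in $V$ is just the dot product of those vectors; a minimal sketch (names mine):

```python
import numpy as np

pts = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
vecs = [np.ones_like(pts), pts, pts**2]   # value vectors of 1, t, t^2 from (3)

def gram_schmidt(vs):
    """Orthogonalize a list of vectors (no normalization), as in the Gram-Schmidt process."""
    basis = []
    for v in vs:
        w = v - sum(((v @ b) / (b @ b)) * b for b in basis)
        basis.append(w)
    return basis

p0, p1, p2 = gram_schmidt(vecs)
print(p0)   # [1. 1. 1. 1. 1.]        -> p0(t) = 1
print(p1)   # [-2. -1.  0.  1.  2.]   -> p1(t) = t
print(p2)   # [ 2. -1. -2. -1.  2.]   -> p2(t) = t^2 - 2
```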
Best Approximation in Inner Product Spaces
- A common problem in applied mathematics involves a vector space $V$ whose elements are functions. The problem is to approximate a function $f$ in $V$ by a function $g$ from a specified subspace $W$ of $V$. The "closeness" of the approximation of $f$ depends on the way $\|f-g\|$ is defined. We will consider only the case in which the distance between $f$ and $g$ is determined by an inner product. In this case, the best approximation to $f$ by functions in $W$ is the orthogonal projection of $f$ onto the subspace $W$.
EXAMPLE 6
Let $V$ be $\mathbb P^4$ with the inner product in Example 5, and let $p_0, p_1$, and $p_2$ be the orthogonal basis found in Example 5 for the subspace $\mathbb P^2$. Find the best approximation to $p(t)=5-\frac{1}{2}t^4$ by polynomials in $\mathbb P^2$.
SOLUTION
- The values of $p_0, p_1$, and $p_2$ at the numbers $-2, -1, 0, 1$, and $2$ are listed as vectors in $\R^5$ in (3) above. The corresponding values for $p$ are $-3, 9/2, 5, 9/2$, and $-3$. Compute
$$<p,p_0>=8,\quad <p,p_1>=0,\quad <p,p_2>=-31,\quad <p_0,p_0>=5,\quad <p_2,p_2>=14$$
Then the best approximation in $V$ to $p$ by polynomials in $\mathbb P^2$ is
$$\hat p=\frac{<p,p_0>}{<p_0,p_0>}p_0+\frac{<p,p_1>}{<p_1,p_1>}p_1+\frac{<p,p_2>}{<p_2,p_2>}p_2=\frac{8}{5}p_0-\frac{31}{14}p_2=\frac{8}{5}-\frac{31}{14}(t^2-2)$$
- This polynomial is the closest to $p$ of all polynomials in $\mathbb P^2$, when the distance between polynomials is measured only at $-2, -1, 0, 1$, and $2$. See Figure 1.
- The polynomials $p_0, p_1$, and $p_2$ in Examples 5 and 6 belong to a class of polynomials that are referred to in statistics as *orthogonal polynomials*.
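Example 6 in code, continuing the sketch above: the best approximation is the orthogonal projection of the value vector of $p$ onto the span of the value vectors of $p_0, p_1, p_2$ (names mine):

```python
import numpy as np

pts = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
p0, p1, p2 = np.ones_like(pts), pts, pts**2 - 2   # orthogonal basis from Example 5
p = 5 - 0.5 * pts**4                              # values of p(t) = 5 - (1/2) t^4

# proj_W p = sum_i (<p, p_i>/<p_i, p_i>) p_i
phat = sum(((p @ b) / (b @ b)) * b for b in (p0, p1, p2))
print(p @ p0, p @ p1, p @ p2)   # 8.0  0.0  -31.0
print(phat)                     # values of (8/5) - (31/14)(t^2 - 2) at the five points
```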
The Linearity of an Orthogonal Projection
- In any inner product space, the mapping $\boldsymbol y\mapsto\dfrac{<\boldsymbol y,\boldsymbol u>}{<\boldsymbol u,\boldsymbol u>}\boldsymbol u$ is linear, for any nonzero $\boldsymbol u$.
- Similarly, if $\boldsymbol u_1,\dots,\boldsymbol u_p$ are any nonzero vectors, then the mapping
$$\boldsymbol y\mapsto\frac{<\boldsymbol y,\boldsymbol u_1>}{<\boldsymbol u_1,\boldsymbol u_1>}\boldsymbol u_1+\cdots+\frac{<\boldsymbol y,\boldsymbol u_p>}{<\boldsymbol u_p,\boldsymbol u_p>}\boldsymbol u_p$$
is a linear transformation. Thus, if $\{\boldsymbol u_1,\dots,\boldsymbol u_p\}$ is an orthogonal basis for a subspace $W$, then the mapping $\boldsymbol y\mapsto\mathrm{proj}_W\,\boldsymbol y$ is a linear transformation.
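In $\R^n$ with the dot product, this linearity means $\mathrm{proj}_W$ is given by a matrix; a minimal sketch (names mine) that builds the matrix from an orthogonal basis and checks linearity numerically:

```python
import numpy as np

def projection_matrix(basis):
    """Matrix of y -> proj_W y for an orthogonal basis u_1, ..., u_p of W:
    proj_W y = sum_i (<y, u_i>/<u_i, u_i>) u_i = (sum_i u_i u_i^T / <u_i, u_i>) y."""
    P = np.zeros((basis[0].size, basis[0].size))
    for u in basis:
        P += np.outer(u, u) / (u @ u)
    return P

u1 = np.array([1.0, 1.0, 0.0])
u2 = np.array([1.0, -1.0, 2.0])           # orthogonal to u1
P = projection_matrix([u1, u2])

rng = np.random.default_rng(0)
y, z, c = rng.normal(size=3), rng.normal(size=3), 2.7
# Linearity: proj(c y + z) = c proj(y) + proj(z)
print(np.allclose(P @ (c * y + z), c * (P @ y) + P @ z))   # True
```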
Two Inequalities
- Given a vector $\boldsymbol v$ in an inner product space $V$ and given a finite-dimensional subspace $W$, we may apply the Pythagorean Theorem to the orthogonal decomposition of $\boldsymbol v$ with respect to $W$ and obtain
$$\|\boldsymbol v\|^2=\|\mathrm{proj}_W\,\boldsymbol v\|^2+\|\boldsymbol v-\mathrm{proj}_W\,\boldsymbol v\|^2$$
See Figure 2. In particular, this shows that the norm of the projection of $\boldsymbol v$ onto $W$ does not exceed the norm of $\boldsymbol v$ itself. This simple observation leads to the following important inequality.
The Cauchy–Schwarz Inequality
- For all $\boldsymbol u,\boldsymbol v$ in $V$,
$$|<\boldsymbol u,\boldsymbol v>|\leq\|\boldsymbol u\|\,\|\boldsymbol v\|\qquad(4)$$
PROOF
- If $\boldsymbol u=\boldsymbol 0$, then both sides of (4) are zero, and hence the inequality is true in this case.
- If $\boldsymbol u\neq\boldsymbol 0$, let $W$ be the subspace spanned by $\boldsymbol u$. Recall that $\|c\boldsymbol u\|=|c|\,\|\boldsymbol u\|$ for any scalar $c$. Thus
$$\|\mathrm{proj}_W\,\boldsymbol v\|=\left\|\frac{<\boldsymbol v,\boldsymbol u>}{<\boldsymbol u,\boldsymbol u>}\boldsymbol u\right\|=\frac{|<\boldsymbol v,\boldsymbol u>|}{<\boldsymbol u,\boldsymbol u>}\|\boldsymbol u\|=\frac{|<\boldsymbol v,\boldsymbol u>|}{\|\boldsymbol u\|^2}\|\boldsymbol u\|=\frac{|<\boldsymbol v,\boldsymbol u>|}{\|\boldsymbol u\|}$$
Since $\|\mathrm{proj}_W\,\boldsymbol v\|\leq\|\boldsymbol v\|$, $(4)$ is proved.
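A quick numerical sanity check of inequality (4), using the evaluation inner product of Example 5 on random polynomials in $\mathbb P^4$ (a sketch; names mine):

```python
import numpy as np

pts = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

def ip(c, d):
    """Evaluation inner product for coefficient vectors c, d (c[k] multiplies t^k)."""
    return float(np.polyval(c[::-1], pts) @ np.polyval(d[::-1], pts))

rng = np.random.default_rng(0)
for _ in range(1000):
    c, d = rng.normal(size=5), rng.normal(size=5)   # random polynomials in P^4
    lhs = abs(ip(c, d))
    rhs = np.sqrt(ip(c, c)) * np.sqrt(ip(d, d))
    assert lhs <= rhs + 1e-9                        # |<u, v>| <= ||u|| ||v||
print("Cauchy-Schwarz inequality held in all 1000 trials")
```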
PROOF