These are reading notes on "Linear Algebra and Its Applications".
Hyperplanes
Hyperplanes play a special role in the geometry of $\R^n$ because they divide the space into two disjoint pieces, just as a plane separates $\R^3$ into two parts and a line cuts through $\R^2$. The key to working with hyperplanes is to use simple implicit descriptions, rather than the explicit or parametric representations of lines and planes used in the earlier work with affine sets.
An implicit equation of a line in $\R^2$ has the form $ax+by=d$. An implicit equation of a plane in $\R^3$ has the form $ax+by+cz=d$. Both equations describe the line or plane as the set of all points at which a linear expression (also called a linear functional (线性函数)) has a fixed value, $d$.
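For a linear functional $f$ on $\R^n$ and a scalar $d$, the symbol $[f:d]$ denotes the set of all $\boldsymbol x$ in $\R^n$ with $f(\boldsymbol x)=d$. If $f$ is a linear functional on $\R^n$, then the standard matrix of this linear transformation $f$ is a $1\times n$ matrix $A$, say $A=\begin{bmatrix} a_1 & a_2 & \cdots & a_n\end{bmatrix}$. So

$$[f:0]\ \text{is the same as}\ \{\boldsymbol x\in\R^n: A\boldsymbol x=0\}=\operatorname{Nul}A\qquad (1)$$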
If $f$ is a nonzero functional, then $\operatorname{rank}A=1$ and $\dim\operatorname{Nul}A=n-1$. Thus, the subspace $[f:0]$ has dimension $n-1$ and so is a hyperplane. Also, if $d$ is any number in $\R$, then

$$[f:d]\ \text{is the same as}\ \{\boldsymbol x\in\R^n: A\boldsymbol x=d\}\qquad (2)$$
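Recall that the set of solutions of $A\boldsymbol x=\boldsymbol b$ is obtained by translating the solution set of $A\boldsymbol x=\boldsymbol 0$ by any particular solution $\boldsymbol p$ of $A\boldsymbol x=\boldsymbol b$. Then

$$[f:d]=[f:0]+\boldsymbol p\quad\text{for any }\boldsymbol p\text{ in }[f:d]\qquad (3)$$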
Thus the sets $[f:d]$ are hyperplanes parallel to $[f:0]$. See Figure 1.
When $A$ is a $1\times n$ matrix, the equation $A\boldsymbol x=d$ may be written with an inner product $\boldsymbol n\cdot\boldsymbol x$, using $\boldsymbol n$ in $\R^n$ with the same entries as $A$. Thus, from (2),

$$[f:d]\ \text{is the same as}\ \{\boldsymbol x\in\R^n:\boldsymbol n\cdot\boldsymbol x=d\}\qquad (4)$$
Then $[f:0]=\{\boldsymbol x\in\R^n:\boldsymbol n\cdot\boldsymbol x=0\}$, which shows that $[f:0]$ is the orthogonal complement of the subspace spanned by $\boldsymbol n$. In the terminology of calculus and geometry for $\R^3$, $\boldsymbol n$ is called a normal vector (法向量) to $[f:0]$. (A "normal" vector in this sense need not have unit length.) Also, $\boldsymbol n$ is said to be normal to each parallel hyperplane $[f:d]$, even though $\boldsymbol n\cdot\boldsymbol x$ is not zero when $d\neq 0$.
Another name for $[f:d]$ is a level set (水平集) of $f$, and $\boldsymbol n$ is sometimes called the gradient (梯度) of $f$ when $f(\boldsymbol x)=\boldsymbol n\cdot\boldsymbol x$ for each $\boldsymbol x$.
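To make the level-set picture concrete, here is a minimal Python sketch. It assumes the sample functional $f(x,y)=2x+5y$ with normal $\boldsymbol n=(2,5)$, the level $d=12$, and the particular point $\boldsymbol p=(1,2)$; these values are chosen only for illustration. It checks that translating points of $[f:0]$ by $\boldsymbol p$ gives points of $[f:d]$.

```python
def dot(u, v):
    """Inner product of two vectors given as equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))

n = (2, 5)   # assumed normal vector, so f(x) = n . x = 2x + 5y
d = 12       # assumed level of the hyperplane [f:d]
p = (1, 2)   # assumed particular point; f(p) = 2*1 + 5*2 = 12

def f(x):
    """The linear functional f(x) = n . x."""
    return dot(n, x)

# Points of [f:0]: integer multiples of (5, -2), which is orthogonal to n.
level_zero = [(5 * t, -2 * t) for t in range(-3, 4)]

# Translate every point of [f:0] by p; each result should lie in [f:d].
level_d = [(x + p[0], y + p[1]) for (x, y) in level_zero]

print([f(x) for x in level_zero])
print([f(x) for x in level_d])
```

Every value in the first list is the level $0$ and every value in the second is the level $d$, which is the statement that $[f:d]$ is a translate of $[f:0]$.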
The next three examples show connections between implicit and explicit descriptions of hyperplanes.
EXAMPLE 4
In $\R^2$, give an explicit description of the line $x-4y=13$ in parametric vector form.
SOLUTION
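One way to reach a parametric vector form, sketched here as an assumption rather than as the book's worked solution: treat $y$ as a free variable, so $x=13+4y$ and $\begin{bmatrix}x\\y\end{bmatrix}=\begin{bmatrix}13\\0\end{bmatrix}+t\begin{bmatrix}4\\1\end{bmatrix}$ for $t\in\R$. The short Python check below uses the hypothetical names `p`, `q`, and `param` for the base point, the direction, and the parametrization.

```python
def on_line(point):
    """True if the point satisfies the implicit equation x - 4y = 13."""
    x, y = point
    return x - 4 * y == 13

p = (13, 0)  # particular solution: set y = 0
q = (4, 1)   # direction vector: solves the homogeneous equation x - 4y = 0

def param(t):
    """Point p + t*q on the line."""
    return (p[0] + t * q[0], p[1] + t * q[1])

points = [param(t) for t in range(-5, 6)]
print(all(on_line(pt) for pt in points))
```

Any other particular solution, or any nonzero multiple of the direction vector, gives an equally valid parametric description of the same line.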
EXAMPLE 5
Let $\boldsymbol v_1=\begin{bmatrix}1\\2\end{bmatrix}$ and $\boldsymbol v_2=\begin{bmatrix}6\\0\end{bmatrix}$, and let $L_1$ be the line through $\boldsymbol v_1$ and $\boldsymbol v_2$. Find a linear functional $f$ and a constant $d$ such that $L_1=[f:d]$.
SOLUTION
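The line $L_1$ is parallel to the translated line $L_0$ through $\boldsymbol v_2-\boldsymbol v_1$ and the origin. The defining equation for $L_0$ has the form

$$\begin{bmatrix} a & b\end{bmatrix}\begin{bmatrix} x\\ y\end{bmatrix}=0\quad\text{or}\quad\boldsymbol n\cdot\boldsymbol x=0,\quad\text{where }\boldsymbol n=\begin{bmatrix} a\\ b\end{bmatrix}\qquad (5)$$

Since $\boldsymbol n$ is orthogonal to the subspace $L_0$, which contains $\boldsymbol v_2-\boldsymbol v_1=\begin{bmatrix}5\\-2\end{bmatrix}$, we have

$$\boldsymbol n\cdot(\boldsymbol v_2-\boldsymbol v_1)=0,\quad\text{that is,}\quad 5a-2b=0$$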
By inspection, a solution is $\begin{bmatrix} a & b\end{bmatrix}=\begin{bmatrix} 2 & 5\end{bmatrix}$. Let $f(x,y)=2x+5y$. From (5), $L_0=[f:0]$, and $L_1=[f:d]$ for some $d$. Since $\boldsymbol v_1$ is on line $L_1$, $d=f(\boldsymbol v_1)=2(1)+5(2)=12$. Thus, the equation for $L_1$ is $2x+5y=12$.
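The computation in Example 5 can be sketched in Python. This is an illustration under assumptions: the points are taken to be $\boldsymbol v_1=(1,2)$ and $\boldsymbol v_2=(6,0)$, which are consistent with the result $2x+5y=12$, and the normal is obtained by rotating the direction vector by 90 degrees, a shortcut that works only in $\R^2$ and replaces the "by inspection" step. The function name `line_through` is hypothetical.

```python
def line_through(v1, v2):
    """Return (n, d) such that the line through v1 and v2 is {x : n . x = d}."""
    dx, dy = v2[0] - v1[0], v2[1] - v1[1]  # direction vector v2 - v1
    n = (-dy, dx)                          # rotate by 90 degrees: n . (dx, dy) = 0
    d = n[0] * v1[0] + n[1] * v1[1]        # d = f(v1), since v1 lies on the line
    return n, d

n, d = line_through((1, 2), (6, 0))
print(n, d)
```

Any nonzero scalar multiple of `n`, with `d` scaled by the same factor, describes the same line.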
EXAMPLE 6
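Let $\boldsymbol v_1=\begin{bmatrix}1\\1\\1\end{bmatrix}$, $\boldsymbol v_2=\begin{bmatrix}2\\-1\\4\end{bmatrix}$, and $\boldsymbol v_3=\begin{bmatrix}3\\1\\2\end{bmatrix}$. Find an implicit description $[f:d]$ of the plane $H_1$ that passes through $\boldsymbol v_1$, $\boldsymbol v_2$, and $\boldsymbol v_3$.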
SOLUTION
$H_1$ is parallel to a plane $H_0$ through the origin that contains the translated points

$$\boldsymbol v_2-\boldsymbol v_1=\begin{bmatrix}1\\-2\\3\end{bmatrix}\quad\text{and}\quad\boldsymbol v_3-\boldsymbol v_1=\begin{bmatrix}2\\0\\1\end{bmatrix}$$

Since these two points are linearly independent, $H_0=\operatorname{Span}\{\boldsymbol v_2-\boldsymbol v_1,\ \boldsymbol v_3-\boldsymbol v_1\}$. Let $\boldsymbol n=\begin{bmatrix} a\\b\\c\end{bmatrix}$ be the normal to $H_0$. Then $\boldsymbol v_2-\boldsymbol v_1$ and $\boldsymbol v_3-\boldsymbol v_1$ are each orthogonal to $\boldsymbol n$:

$$a-2b+3c=0\quad\text{and}\quad 2a+c=0$$
These two equations form a system whose augmented matrix can be row reduced:

$$\begin{bmatrix}1&-2&3&0\\2&0&1&0\end{bmatrix}\sim\begin{bmatrix}1&-2&3&0\\0&4&-5&0\end{bmatrix}\sim\begin{bmatrix}1&0&\tfrac12&0\\0&1&-\tfrac54&0\end{bmatrix}$$

Row operations yield $a=-\tfrac12 c$ and $b=\tfrac54 c$, with $c$ free. Set $c=4$, for instance. Then $\boldsymbol n=\begin{bmatrix}-2\\5\\4\end{bmatrix}$ and $H_0=[f:0]$, where $f(\boldsymbol x)=-2x_1+5x_2+4x_3$.
The parallel hyperplane $H_1$ is $[f:d]$. To find $d$, use the fact that $\boldsymbol v_1$ is in $H_1$, and compute $d=f(\boldsymbol v_1)=f(1,1,1)=7$.
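The procedure in Example 6 generalizes to higher dimensions. However, for the special case of $\R^3$, one can also use the cross-product formula (叉积公式) to compute $\boldsymbol n$, using a symbolic determinant as a mnemonic device:

$$\boldsymbol n=(\boldsymbol v_2-\boldsymbol v_1)\times(\boldsymbol v_3-\boldsymbol v_1)=\begin{vmatrix}1&2&\boldsymbol i\\-2&0&\boldsymbol j\\3&1&\boldsymbol k\end{vmatrix}=\begin{vmatrix}-2&0\\3&1\end{vmatrix}\boldsymbol i-\begin{vmatrix}1&2\\3&1\end{vmatrix}\boldsymbol j+\begin{vmatrix}1&2\\-2&0\end{vmatrix}\boldsymbol k=-2\boldsymbol i+5\boldsymbol j+4\boldsymbol k$$

If only the formula for $f$ is needed, the cross-product calculation may be written as an ordinary determinant:

$$f(x_1,x_2,x_3)=\begin{vmatrix}1&2&x_1\\-2&0&x_2\\3&1&x_3\end{vmatrix}=-2x_1+5x_2+4x_3$$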
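The cross-product route can be sketched in Python. This is an illustration under assumptions: the three points are taken to be $\boldsymbol v_1=(1,1,1)$, $\boldsymbol v_2=(2,-1,4)$, $\boldsymbol v_3=(3,1,2)$, which are consistent with the normal $(-2,5,4)$ and the value $d=7$ from Example 6, and the helper names `cross` and `plane_through` are hypothetical.

```python
def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def plane_through(v1, v2, v3):
    """Return (n, d) with the plane through v1, v2, v3 equal to {x : n . x = d}."""
    u = tuple(b - a for a, b in zip(v1, v2))   # v2 - v1
    w = tuple(b - a for a, b in zip(v1, v3))   # v3 - v1
    n = cross(u, w)                            # normal to Span{u, w}
    d = sum(ni * xi for ni, xi in zip(n, v1))  # d = f(v1)
    return n, d

n, d = plane_through((1, 1, 1), (2, -1, 4), (3, 1, 2))
print(n, d)
```

If the three points are collinear, `cross` returns the zero vector and no plane is determined, so a caller would need to check for that case.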
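THEOREM 11
A subset $H$ of $\R^n$ is a hyperplane if and only if $H=[f:d]$ for some nonzero linear functional $f$ and some scalar $d$ in $\R$. Thus, if $H$ is a hyperplane, there exist a nonzero vector $\boldsymbol n$ and a real number $d$ such that $H=\{\boldsymbol x:\boldsymbol n\cdot\boldsymbol x=d\}$.
PROOF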
Suppose that $H$ is a hyperplane, take $\boldsymbol p\in H$, and let $H_0=H-\boldsymbol p$. Then $H_0$ is an $(n-1)$-dimensional subspace. Next, take any point $\boldsymbol y$ that is not in $H_0$. By the Orthogonal Decomposition Theorem,

$$\boldsymbol y=\boldsymbol y_1+\boldsymbol n$$

where $\boldsymbol y_1$ is a vector in $H_0$ and $\boldsymbol n$ is orthogonal to every vector in $H_0$. Note that $\boldsymbol n\neq\boldsymbol 0$ because $\boldsymbol y$ is not in $H_0$. The function $f$ defined by

$$f(\boldsymbol x)=\boldsymbol n\cdot\boldsymbol x\quad\text{for }\boldsymbol x\in\R^n$$

is a linear functional, by properties of the inner product. Now, $[f:0]$ is a hyperplane that contains $H_0$, by construction of $\boldsymbol n$. Since both subspaces have dimension $n-1$, it follows that

$$H_0=[f:0]$$

Finally, let $d=f(\boldsymbol p)=\boldsymbol n\cdot\boldsymbol p$. Then, as in (3) shown earlier,

$$[f:d]=[f:0]+\boldsymbol p=H_0+\boldsymbol p=H$$

The converse statement that $[f:d]$ is a hyperplane follows from (1) and (3) above.
Many important applications of hyperplanes depend on the possibility of “separating” two sets by a hyperplane. The following terminology and notation will help to make this idea more precise.
Topology: 拓扑
open ball: 开球
A set is open: 开集
A set is closed: 闭集
A set is bounded: 有界集
A set is compact: 紧致集
EXERCISE 27
Give an example of a closed subset $S$ of $\R^2$ such that $\operatorname{conv}S$ is not closed.
SOLUTION
$$S=\{\boldsymbol p\mid\boldsymbol p=(x,y),\ y=1/x,\ x\geq 1/2\}$$
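A brief justification, offered as a sketch: every point of $S$ other than $(1/2,2)$ has $y<2$, so every point of $\operatorname{conv}S$ other than $(1/2,2)$ also has $y<2$. In particular $(1,2)$ is not in $\operatorname{conv}S$. Yet the segment from $(1/2,2)$ to $(M,1/M)$ passes through a point with $x=1$ whose height tends to $2$ as $M$ grows, so $(1,2)$ is a limit of points of $\operatorname{conv}S$. The Python check below computes those heights exactly; the function name `height_at_x1` is hypothetical.

```python
from fractions import Fraction

def height_at_x1(M):
    """y-coordinate of the point with x = 1 on the segment from (1/2, 2) to (M, 1/M)."""
    M = Fraction(M)
    t = (1 - Fraction(1, 2)) / (M - Fraction(1, 2))  # parameter where x equals 1
    return 2 + t * (1 / M - 2)                       # interpolate the y-coordinates

heights = [height_at_x1(M) for M in (10, 100, 1000, 10000)]
for M, y in zip((10, 100, 1000, 10000), heights):
    print(M, float(y))
```

The heights increase toward $2$ without ever reaching it, which is the behaviour the argument above relies on.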
EXERCISE 29
Prove that the open ball $B(\boldsymbol p,\delta)=\{\boldsymbol x:\|\boldsymbol x-\boldsymbol p\|<\delta\}$ is a convex set.
SOLUTION
[Hint: Use the Triangle Inequality.] (三角不等式)
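Following the hint, one possible write-up of the argument is sketched here. Take $\boldsymbol x,\boldsymbol y\in B(\boldsymbol p,\delta)$ and $0\leq t\leq 1$. Since $\boldsymbol p=(1-t)\boldsymbol p+t\boldsymbol p$, the Triangle Inequality gives

$$\begin{aligned}\|(1-t)\boldsymbol x+t\boldsymbol y-\boldsymbol p\|&=\|(1-t)(\boldsymbol x-\boldsymbol p)+t(\boldsymbol y-\boldsymbol p)\|\\&\leq(1-t)\|\boldsymbol x-\boldsymbol p\|+t\|\boldsymbol y-\boldsymbol p\|\\&<(1-t)\delta+t\delta=\delta\end{aligned}$$

The last inequality is strict because $1-t$ and $t$ are nonnegative and not both zero. Hence every point of the segment from $\boldsymbol x$ to $\boldsymbol y$ lies in $B(\boldsymbol p,\delta)$, so the ball is convex.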
EXAMPLE 7
Let $S$ be the set shown in Figure 3. Then the set $S$ is closed since it contains all its boundary points. The set $S$ is bounded since $S\subset B(\boldsymbol 0,3)$. Thus $S$ is also compact.
Notation: If $f$ is a linear functional, then $f(A)\leq d$ means $f(\boldsymbol x)\leq d$ for each $\boldsymbol x\in A$.
strictly separate: 严格分割
Notice that strict separation requires that the two sets be disjoint, while mere separation does not.
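THEOREM 13
Suppose $A$ and $B$ are nonempty compact sets. Then there exists a hyperplane that strictly separates $A$ and $B$ if and only if $(\operatorname{conv}A)\cap(\operatorname{conv}B)=\varnothing$.
PROOF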
Suppose that $(\operatorname{conv}A)\cap(\operatorname{conv}B)=\varnothing$. Since the convex hull of a compact set is compact, Theorem 12 ensures that there is a hyperplane $H$ that strictly separates $\operatorname{conv}A$ and $\operatorname{conv}B$. Clearly, $H$ also strictly separates the smaller sets $A$ and $B$.
Conversely, suppose the hyperplane $H=[f:d]$ strictly separates $A$ and $B$. Without loss of generality, assume that $f(A)<d$ and $f(B)>d$. Let $\boldsymbol x=c_1\boldsymbol x_1+\cdots+c_k\boldsymbol x_k$ be any convex combination of elements of $A$. Then

$$f(\boldsymbol x)=c_1f(\boldsymbol x_1)+\cdots+c_kf(\boldsymbol x_k)<c_1d+\cdots+c_kd=d$$

since $c_1+\cdots+c_k=1$. Thus $f(\operatorname{conv}A)<d$. Likewise, $f(\operatorname{conv}B)>d$, so $H=[f:d]$ strictly separates $\operatorname{conv}A$ and $\operatorname{conv}B$. By Theorem 12, $\operatorname{conv}A$ and $\operatorname{conv}B$ must be disjoint.
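The converse direction can be illustrated numerically. The sketch below assumes two small finite sets `A` and `B` in $\R^2$ and the hyperplane $x+y=4$; all of these are made-up sample data, and the helper names are hypothetical. It checks that the hyperplane strictly separates the sets and that a convex combination of points from each set stays on the same side.

```python
from fractions import Fraction

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def strictly_separates(n, d, A, B):
    """True if f(x) = n . x satisfies f(A) < d < f(B) or f(B) < d < f(A)."""
    fa = [dot(n, a) for a in A]
    fb = [dot(n, b) for b in B]
    return (max(fa) < d < min(fb)) or (max(fb) < d < min(fa))

def convex_combination(points, weights):
    """Weighted sum of points; weights must be nonnegative and sum to 1."""
    assert all(w >= 0 for w in weights) and sum(weights) == 1
    dim = len(points[0])
    return tuple(sum(w * p[i] for w, p in zip(weights, points)) for i in range(dim))

A = [(0, 0), (1, 0), (0, 1)]   # sample set with f(A) < d
B = [(3, 3), (4, 3), (3, 4)]   # sample set with f(B) > d
n, d = (1, 1), 4               # hyperplane [f:d] with f(x, y) = x + y

weights = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
xa = convex_combination(A, weights)
xb = convex_combination(B, weights)
print(strictly_separates(n, d, A, B), dot(n, xa) < d < dot(n, xb))
```

Checking a single convex combination is only a spot check; the inequality in the proof is what guarantees the result for all of them.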
EXERCISE 14
Let $F_1$ and $F_2$ be 4-dimensional flats in $\R^6$, and suppose that $F_1\cap F_2\neq\varnothing$. What are the possible dimensions of $F_1\cap F_2$?
SOLUTION
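The answer below is my own. The argument feels long-winded and not fully rigorous, so treat it as a reference only.
If you have a better solution, I would be glad to discuss it.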
Let $F_1=W_1+\boldsymbol p_1$ and $F_2=W_2+\boldsymbol p_2$, where $W_1,W_2$ are two 4-dimensional subspaces. Let $\{\boldsymbol a_1,\dots,\boldsymbol a_4\}$ and $\{\boldsymbol b_1,\dots,\boldsymbol b_4\}$ be bases of $W_1$ and $W_2$, respectively. Write $F_1\cap F_2=W+\boldsymbol p$ and let $\boldsymbol x\in W$. Then there exist $m_i,n_i\in\R$ ($1\leq i\leq4$) such that

$$\begin{gathered}\boldsymbol p_1+m_1\boldsymbol a_1+\cdots+m_4\boldsymbol a_4=\boldsymbol p_2+n_1\boldsymbol b_1+\cdots+n_4\boldsymbol b_4=\boldsymbol x+\boldsymbol p\\ \boldsymbol p_1-\boldsymbol p+m_1\boldsymbol a_1+\cdots+m_4\boldsymbol a_4=\boldsymbol p_2-\boldsymbol p+n_1\boldsymbol b_1+\cdots+n_4\boldsymbol b_4=\boldsymbol x\qquad (1)\end{gathered}$$
Notice that $\dim(F_1\cap F_2)=\dim W$, where $\dim W$ is the dimension of the set of all $\boldsymbol x$ that satisfy (1).
Since $\boldsymbol x$ can be $\boldsymbol 0$, there exist $m_i',n_i'\in\R$ ($1\leq i\leq4$) such that

$$\boldsymbol p_1+m_1'\boldsymbol a_1+\cdots+m_4'\boldsymbol a_4=\boldsymbol p_2+n_1'\boldsymbol b_1+\cdots+n_4'\boldsymbol b_4=\boldsymbol p\qquad (2)$$
From (2), we know that $\boldsymbol p_1-\boldsymbol p$ is a linear combination of $\{\boldsymbol a_1,\dots,\boldsymbol a_4\}$ and $\boldsymbol p_2-\boldsymbol p$ is a linear combination of $\{\boldsymbol b_1,\dots,\boldsymbol b_4\}$. Thus by (1), there exist $t_i,s_i\in\R$ ($1\leq i\leq4$) such that

$$\begin{gathered}t_1\boldsymbol a_1+\cdots+t_4\boldsymbol a_4=s_1\boldsymbol b_1+\cdots+s_4\boldsymbol b_4=\boldsymbol x\qquad (3)\\ t_1\boldsymbol a_1+\cdots+t_4\boldsymbol a_4-s_1\boldsymbol b_1-\cdots-s_4\boldsymbol b_4=\boldsymbol 0\qquad (4)\end{gathered}$$
Let $A=\begin{bmatrix}\boldsymbol a_1&\cdots&\boldsymbol a_4&\boldsymbol b_1&\cdots&\boldsymbol b_4\end{bmatrix}$. Then according to (4):

$$A\begin{bmatrix}t_1\\\vdots\\t_4\\-s_1\\\vdots\\-s_4\end{bmatrix}=\boldsymbol 0$$

The matrix $A$ is $6\times 8$. Its first four columns are linearly independent, so $\operatorname{rank}A\geq 4$, and it has only six rows, so $\operatorname{rank}A\leq 6$. Since $\dim\operatorname{Nul}A=8-\operatorname{rank}A$, it follows that $2\leq\dim\operatorname{Nul}A\leq 4$.
By (3), each $\boldsymbol x$ in $W$ determines the weights $t_1,\dots,t_4$ uniquely, because $\{\boldsymbol a_1,\dots,\boldsymbol a_4\}$ is a basis, and likewise determines $s_1,\dots,s_4$ uniquely. Conversely, each vector $(t_1,\dots,t_4,-s_1,\dots,-s_4)$ in $\operatorname{Nul}A$ gives a vector $\boldsymbol x=t_1\boldsymbol a_1+\cdots+t_4\boldsymbol a_4$ in $W$. This correspondence is linear and one-to-one, so $\dim W=\dim\operatorname{Nul}A$. Hence $\dim(F_1\cap F_2)=\dim\operatorname{Nul}A$ and $2\leq\dim(F_1\cap F_2)\leq4$. Each of the values 2, 3, and 4 is attained, for example with $W_1=\operatorname{Span}\{\boldsymbol e_1,\boldsymbol e_2,\boldsymbol e_3,\boldsymbol e_4\}$ and $W_2$ equal to $\operatorname{Span}\{\boldsymbol e_1,\boldsymbol e_2,\boldsymbol e_5,\boldsymbol e_6\}$, $\operatorname{Span}\{\boldsymbol e_1,\boldsymbol e_2,\boldsymbol e_3,\boldsymbol e_5\}$, or $W_1$ itself.
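The rank argument can be checked on concrete subspaces. The sketch below is an illustration under assumptions: it uses standard basis vectors of $\R^6$ as sample bases, a small exact Gaussian elimination written for this note (the names `rank`, `e`, and `intersection_dim` are hypothetical), and the identity $\dim\operatorname{Nul}A=8-\operatorname{rank}A$. The vectors are stored as rows, which is harmless because a matrix and its transpose have the same rank.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix given as a list of rows, by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Find a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def e(i, n=6):
    """Standard basis vector of R^n with a 1 in position i (0-based)."""
    return [1 if j == i else 0 for j in range(n)]

def intersection_dim(basis1, basis2):
    """dim(W1 ∩ W2) = number of basis vectors minus rank of the combined list."""
    return len(basis1) + len(basis2) - rank(basis1 + basis2)

W1 = [e(0), e(1), e(2), e(3)]
candidates = [
    [e(0), e(1), e(2), e(3)],  # W2 = W1
    [e(0), e(1), e(2), e(4)],  # shares three basis vectors with W1
    [e(0), e(1), e(4), e(5)],  # shares two basis vectors with W1
]
dims = [intersection_dim(W1, W2) for W2 in candidates]
print(dims)
```

The formula in `intersection_dim` is valid only when each input list is linearly independent, which holds for the sample bases used here.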