Note: these are study notes on Section 4.1 (Spaces and Subspaces) and Section 4.2 (Four Fundamental Subspaces) of *Matrix Analysis and Applied Linear Algebra*.
Vector Space

Defining a vector space involves four things: a nonempty set of vectors $\mathcal V$, a scalar field $\mathcal F$, and two algebraic operations, vector addition and scalar multiplication.
The set $\mathcal V$ is called a vector space over $\mathcal F$ when the vector addition and scalar multiplication operations satisfy the following properties:

- (A1) $\mathbf{x}+\mathbf{y}\in \mathcal V$ for all $\mathbf{x},\mathbf{y}\in \mathcal V$ (closure property for vector addition)
- (A2) $(\mathbf{x}+\mathbf y)+\mathbf z=\mathbf x+(\mathbf y+\mathbf z)$ for all $\mathbf x,\mathbf y,\mathbf z\in \mathcal V$
- (A3) $\mathbf{x}+\mathbf y=\mathbf y+\mathbf x$ for all $\mathbf x,\mathbf y\in \mathcal V$
- (A4) There is an element $\mathbf 0\in \mathcal V$ such that $\mathbf x+\mathbf 0=\mathbf x$ for all $\mathbf x\in \mathcal V$
- (A5) For each $\mathbf x\in \mathcal V$, there is an element $(-\mathbf x)\in \mathcal V$ such that $\mathbf x+(-\mathbf x)=\mathbf 0$
- (M1) $\alpha \mathbf x \in \mathcal V$ for all $\alpha \in \mathcal F$ and $\mathbf x\in \mathcal V$ (closure property for scalar multiplication)
- (M2) $(\alpha \beta)\mathbf x=\alpha (\beta \mathbf x)$ for all $\alpha,\beta \in \mathcal F$ and every $\mathbf x\in \mathcal V$
- (M3) $\alpha (\mathbf x+\mathbf y)=\alpha \mathbf x +\alpha \mathbf y$ for every $\alpha \in \mathcal F$ and all $\mathbf x,\mathbf y\in \mathcal V$
- (M4) $(\alpha+\beta)\mathbf x=\alpha \mathbf x+\beta \mathbf x$ for all $\alpha ,\beta \in \mathcal F$ and every $\mathbf x\in \mathcal V$
- (M5) $1\mathbf x=\mathbf x$ for all $\mathbf x \in \mathcal V$
Some examples of vector spaces:

- The set $\Re^{m \times n}$ of $m \times n$ real matrices is a vector space over $\Re$.
- The set $\mathcal{C}^{m \times n}$ of $m \times n$ complex matrices is a vector space over $\mathcal{C}$.
- If $n=1$ (or $m=1$), we have the coordinate spaces

  $$\Re^{m\times 1}=\left\{ \left(\begin{matrix} x_1\\x_2\\ \vdots\\x_m \end{matrix} \right),x_i\in \Re\right\}\text{ or } \mathcal{C}^{m\times 1}=\left\{ \left(\begin{matrix} x_1\\x_2\\ \vdots\\x_m \end{matrix} \right),x_i\in \mathcal{C}\right\}$$

- With function addition and scalar multiplication defined by

  $$(f+g)(x)=f(x)+g(x)\text{ and } (\alpha f)(x)=\alpha f(x)$$

  the following sets are vector spaces over $\Re$:
  - The set of functions mapping the interval $[0,1]$ into $\Re$
  - The set of all real-valued continuous functions defined on $[0,1]$
  - The set of real-valued functions that are differentiable on $[0,1]$
  - The set of all polynomials with real coefficients
- Consider the vector space $\Re^2$, and let $\mathcal L=\{(x,y)\mid y=\alpha x\}$ be a line through the origin. $\mathcal L$ is a subset of $\Re^2$, but $\mathcal L$ is a special kind of subset because $\mathcal L$ also satisfies the properties (A1)-(A5) and (M1)-(M5) that define a vector space.
The last example shows that a vector space may contain a "smaller" vector space, which leads to the definition of a subspace.

Subspace
Let $\mathcal S$ be a nonempty subset of a vector space $\mathcal V$ over $\mathcal F$, i.e., $\mathcal S \subseteq \mathcal V$.
If $\mathcal S$ is also a vector space over $\mathcal F$ using the same addition and scalar multiplication operations, then $\mathcal S$ is said to be a subspace of $\mathcal V$.

It is not necessary to check all 10 of the defining conditions in order to determine whether a subset is also a subspace; only the closure conditions (A1) and (M1) need to be considered. That is, a nonempty subset $\mathcal S$ of a vector space $\mathcal V$ is a subspace of $\mathcal V$ if and only if

- (A1) $\mathbf x,\mathbf y \in \mathcal S\Longrightarrow \mathbf x+\mathbf y\in \mathcal S$
- (M1) $\mathbf x\in \mathcal S\Longrightarrow \alpha \mathbf x\in \mathcal S$ for all $\alpha \in \mathcal F$
Why are these two conditions enough?

Because $\mathcal S$ is a subset of $\mathcal V$, it automatically inherits every condition except (A1), (A4), (A5), and (M1), and (A1) together with (M1) implies (A4) and (A5). By (M1), $(-\mathbf x)=(-1)\mathbf x\in \mathcal S$ for every $\mathbf x\in\mathcal S$, so (A5) holds. Since $\mathcal S$ is nonempty, pick any $\mathbf x\in\mathcal S$; then (A1) gives $\mathbf 0=\mathbf x+(-\mathbf x)\in \mathcal S$, so (A4) holds.
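The two closure conditions can be spot-checked numerically. The sketch below is only an illustration on a few hand-picked sample points (not a proof), and the helper names `on_line`, `add`, and `scale` are made up for this example. It compares the line $y=2x$ through the origin with the shifted line $y=2x+1$.

```python
def on_line(p, slope, intercept=0):
    """True if the point p = (x, y) lies on y = slope * x + intercept."""
    x, y = p
    return y == slope * x + intercept

def add(p, q):
    """Vector addition in R^2."""
    return (p[0] + q[0], p[1] + q[1])

def scale(a, p):
    """Scalar multiplication in R^2."""
    return (a * p[0], a * p[1])

# Two sample points on the line y = 2x (through the origin).
u, v = (1, 2), (3, 6)
closed_add = on_line(add(u, v), 2)        # (A1) holds for this pair
closed_scale = on_line(scale(-5, u), 2)   # (M1) holds for this scalar

# Two sample points on the shifted line y = 2x + 1.
s, t = (0, 1), (1, 3)
shifted_add = on_line(add(s, t), 2, 1)    # the sum (1, 4) leaves the line
shifted_zero = on_line((0, 0), 2, 1)      # the zero vector is not on the line

print(closed_add, closed_scale, shifted_add, shifted_zero)
```

The shifted line fails both checks, which matches the observation that a subspace must contain the zero vector.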
Some examples of subspaces:

- Given a vector space $\mathcal V$, the set $\mathcal Z=\{\mathbf 0\}$ containing only the zero vector is a subspace of $\mathcal V$. This subspace is called the trivial subspace.
- We have already observed that straight lines through the origin in $\Re^2$ are subspaces, but what about straight lines not through the origin? No: they cannot be subspaces because subspaces must contain the zero vector. Consequently, the only proper subspaces of $\Re^2$ are the trivial subspace and lines through the origin.
- Similarly, the only proper subspaces of $\Re^3$ are the trivial subspace, lines through the origin, and planes through the origin.
- Generalizing to higher dimensions: for a set of vectors $\mathcal S=\{ \mathbf v_1,\mathbf v_2,\cdots,\mathbf v_r\}$ from a vector space $\mathcal V$, the set of all possible linear combinations of the $\mathbf v_i$'s is denoted by

  $$span(\mathcal S)=\{\alpha_1 \mathbf v_1+\alpha_2 \mathbf v_2+\cdots+\alpha_r \mathbf v_r\mid\alpha _i \in \mathcal F \}$$

  and $span(\mathcal S)$ is a subspace of $\mathcal V$.

In fact, every subspace can be written in the form $span(\mathcal S)$, which motivates the following definition.

For a set of vectors $\mathcal S=\{ \mathbf v_1,\mathbf v_2,\cdots,\mathbf v_r\}$, the subspace

$$span(\mathcal S)=\{\alpha_1 \mathbf v_1+\alpha_2 \mathbf v_2+\cdots+\alpha_r \mathbf v_r\mid\alpha _i \in \mathcal F\}$$

generated by forming all linear combinations of vectors from $\mathcal S$ is called the space spanned by $\mathcal S$. If $\mathcal V$ is a vector space such that $\mathcal V=span(\mathcal S)$, we say $\mathcal S$ is a spanning set for $\mathcal V$. In other words, $\mathcal S$ spans $\mathcal V$ whenever each vector in $\mathcal V$ is a linear combination of vectors from $\mathcal S$.
Some examples:

- $\mathcal S=\left\{\left(\begin{matrix}1\\1\end{matrix}\right) ,\left(\begin{matrix}2\\2\end{matrix}\right)\right\}$ spans the line $y=x$ in $\Re^2$.
- The unit vectors $\{\mathbf e_1,\mathbf e_2,\cdots,\mathbf e_n\}$ in $\Re^n$ form a spanning set for $\Re^n$.
- For a set of vectors $\mathcal S=\{\mathbf a_1,\cdots,\mathbf a_n\}$ from a subspace $\mathcal V\subseteq \Re^{m\times 1}$, let $\mathbf A$ be the matrix containing the $\mathbf a_i$'s as its columns. $\mathcal S$ spans $\mathcal V$ iff for each $\mathbf b\in \mathcal V$ there corresponds a column $\mathbf x$ such that $\mathbf A \mathbf x=\mathbf b$, i.e., iff $\mathbf A \mathbf x=\mathbf b$ is a consistent system for every $\mathbf b\in \mathcal V$, or equivalently $rank[\mathbf A|\mathbf b]=rank(\mathbf A)$ for every $\mathbf b\in \mathcal V$.
- The finite set $\{1,x,x^2,\cdots,x^n\}$ spans the space of all polynomials such that $\mathrm{deg}~p(x)\le n$, and the infinite set $\{1,x,x^2,\cdots\}$ spans the space of all polynomials.
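The rank criterion in the third example can be tried out directly. The following is a minimal sketch using exact arithmetic from the standard library; the functions `rank` and `in_span` are hypothetical helpers written for this note, not part of any library.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix via forward elimination with exact fractions."""
    M = [[Fraction(v) for v in row] for row in rows]
    r = 0  # number of pivots found so far
    for c in range(len(M[0])):
        # Look for a nonzero entry in column c at or below row r.
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def in_span(vectors, b):
    """True if b is a linear combination of the given vectors,
    tested as rank[A|b] == rank(A) with the vectors as columns of A."""
    A = [list(row) for row in zip(*vectors)]
    Ab = [row + [bi] for row, bi in zip(A, b)]
    return rank(Ab) == rank(A)

S = [(1, 1), (2, 2)]                 # spans the line y = x
print(in_span(S, (3, 3)))            # a point on the line
print(in_span(S, (1, 0)))            # a point off the line
print(in_span([(1, 0), (0, 1)], (5, -7)))  # unit vectors span R^2
```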
We can also define the "sum" of two subspaces. The result is again a subspace, and a spanning set for it is the union of the two original spanning sets:

If $\mathcal X$ and $\mathcal Y$ are subspaces of a vector space $\mathcal V$, then the sum of $\mathcal X$ and $\mathcal Y$ is defined to be the set of all possible sums of vectors from $\mathcal X$ with vectors from $\mathcal Y$. That is,

$$\mathcal X+\mathcal Y=\{\mathbf x+\mathbf y\mid\mathbf x\in \mathcal X\text{ and } \mathbf y\in \mathcal Y\}$$

- The sum $\mathcal X+\mathcal Y$ is again a subspace of $\mathcal V$.
- If $\mathcal S_X,\mathcal S_Y$ span $\mathcal X,\mathcal Y$, then $\mathcal S_X \cup \mathcal S_Y$ spans $\mathcal X+\mathcal Y$.

An example: if $\mathcal X\subseteq\Re^2$ and $\mathcal Y\subseteq \Re^2$ are subspaces that correspond geometrically to two distinct lines through the origin, then $\mathcal X+\mathcal Y=\Re^2$ (by the parallelogram law). If the two lines coincide, the sum is just that line.
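The parallelogram-law picture can be made concrete: given direction vectors $\mathbf u$ and $\mathbf v$ for two lines through the origin, any $\mathbf b\in\Re^2$ splits as a vector on the first line plus a vector on the second, provided the lines are distinct. The sketch below solves the $2\times 2$ system by Cramer's rule; `decompose` and the sample vectors are assumptions made for illustration.

```python
from fractions import Fraction

def decompose(u, v, b):
    """Write b = s*u + t*v and return the pair (s*u, t*v).
    Returns None when u and v are parallel (the two lines coincide)."""
    det = u[0] * v[1] - u[1] * v[0]
    if det == 0:
        return None
    # Cramer's rule for the system [u v] (s, t)^T = b.
    s = Fraction(b[0] * v[1] - b[1] * v[0], det)
    t = Fraction(u[0] * b[1] - u[1] * b[0], det)
    return (s * u[0], s * u[1]), (t * v[0], t * v[1])

u, v, b = (1, 0), (1, 1), (3, 5)
x, y = decompose(u, v, b)            # x lies on span{u}, y on span{v}
total = (x[0] + y[0], x[1] + y[1])   # should reproduce b
same_line = decompose((1, 1), (2, 2), b)  # parallel directions
print(x, y, total, same_line)
```

When the two directions are parallel, no such decomposition exists for a general $\mathbf b$, which is why distinct lines are needed to obtain all of $\Re^2$.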
Four Fundamental Subspaces

Column Space and Row Space

The concept of a subspace is in fact closely tied to linear equations.
For a linear function $f$ mapping $\Re^n$ into $\Re^m$, let $\mathcal R(f)$ denote the range of $f$. That is, $\mathcal R(f)=\{f(\mathbf x)\mid\mathbf x\in \Re^n\}\subseteq \Re^m$ is the set of all "images" as $\mathbf x$ varies freely over $\Re^n$.

- The range of every linear function $f:\Re^n\to \Re^m$ is a subspace of $\Re^m$, and every subspace of $\Re^m$ is the range of some linear function.

For this reason, subspaces of $\Re^m$ are sometimes called linear spaces.
This result shows that for any matrix $\mathbf A\in \Re^{m\times n}$, the linear function $f(\mathbf x)=\mathbf A\mathbf x$ produces a subspace of $\Re^m$, namely its range. Similarly, $\mathbf A^T\in \Re^{n\times m}$ defines a subspace of $\Re^n$ through $f(\mathbf y)=\mathbf A^T\mathbf y$. This gives two of the four fundamental subspaces:
The range of a matrix $\mathbf A\in \Re^{m\times n}$ is defined to be the subspace $\mathcal R(\mathbf A)$ of $\Re^m$ that is generated by the range of $f(\mathbf x)=\mathbf A \mathbf x$. That is,

$$\begin{aligned}&\mathcal R(\mathbf A)=\{\mathbf A \mathbf x\mid\mathbf x\in \Re^n\}\subseteq \Re^m\\ &\mathbf b \in \mathcal R(\mathbf A)\iff \mathbf b=\mathbf A \mathbf x\text{ for some }\mathbf x\end{aligned}$$

Similarly, the range of the matrix $\mathbf A^T$ is the subspace of $\Re^n$ defined by

$$\begin{aligned}&\mathcal R(\mathbf A^T)=\{\mathbf A^T \mathbf y\mid\mathbf y\in \Re^m\}\subseteq \Re^n\\ &\mathbf a\in \mathcal R(\mathbf A^T)\iff \mathbf a=\mathbf A^T \mathbf y\text{ for some }\mathbf y\end{aligned}$$

Some people also call $\mathcal R(\mathbf A)$ the image space of $\mathbf A$.
If we write $\mathbf A$ in terms of its columns and $\mathbf x$ in terms of its entries,

$$\mathbf A \mathbf x=(\mathbf A_{*1}|\mathbf A_{*2}|\cdots|\mathbf A_{*n})\left(\begin{matrix}\xi_1\\\xi_2\\\vdots\\\xi_n\end{matrix}\right)=\sum_{j=1}^n\xi_j \mathbf A_{*j}$$

we see that $\mathcal R(\mathbf A)$ is exactly the set of all linear combinations of the columns of $\mathbf A$, that is, the space spanned by the columns of $\mathbf A$. This is why $\mathcal R(\mathbf A)$ is often called the **column space**.

Similarly, $\mathcal R(\mathbf A^T)$ is the space spanned by the rows of $\mathbf A$, and it is also called the **row space**.
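The identity $\mathbf A\mathbf x=\sum_j \xi_j\mathbf A_{*j}$ is easy to confirm on a small case. In the sketch below the matrix, the vector, and the helper names `matvec` and `column_combination` are illustrative choices, not taken from the book.

```python
def matvec(A, x):
    """Usual row-by-row matrix-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def column_combination(A, x):
    """Build A x as a linear combination of the columns of A."""
    m = len(A)
    result = [0] * m
    for j, xj in enumerate(x):
        column = [A[i][j] for i in range(m)]   # the j-th column A_{*j}
        result = [r + xj * c for r, c in zip(result, column)]
    return result

A = [[1, 2, 1],
     [2, 4, 0],
     [3, 6, 1]]
x = [1, -1, 2]
print(matvec(A, x))              # [1, -2, -1]
print(column_combination(A, x))  # [1, -2, -1]
```

Both computations give the same vector, which therefore lies in the column space of $\mathbf A$.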
Sometimes we want to know whether two matrices have the same column space or row space. This can be decided by checking whether they are column equivalent or row equivalent:

For two matrices $\mathbf A$ and $\mathbf B$ of the same shape:

- $\mathcal R(\mathbf A)=\mathcal R(\mathbf B)$ iff $\mathbf A\stackrel{\text{col}}{\sim}\mathbf B$
- $\mathcal R(\mathbf A^T)=\mathcal R(\mathbf B^T)$ iff $\mathbf A \stackrel{\text{row}}{\sim}\mathbf B$
Proof: we prove the row-equivalence statement; the column-equivalence case is similar.

$\Longleftarrow$: Since $\mathbf A \stackrel{\text{row}}{\sim}\mathbf B$, there is a nonsingular matrix $\mathbf P$ such that $\mathbf P \mathbf A=\mathbf B$. For a vector $\mathbf a$,

$$\begin{aligned} \mathbf a\in \mathcal R(\mathbf A^T) & \iff \mathbf a^T=\mathbf y^T \mathbf A=\mathbf y^T \mathbf P^{-1}\mathbf P \mathbf A\text{ for some }\mathbf y^T\\ & \iff \mathbf a^T=\mathbf z^T \mathbf B\text{ for }\mathbf z^T =\mathbf y^T \mathbf P^{-1}\\ & \iff \mathbf a\in \mathcal R(\mathbf B^T) \end{aligned}$$

$\Longrightarrow$: If $\mathcal R(\mathbf A^T)=\mathcal R(\mathbf B^T)$, then

$$span\{\mathbf A_{1*},\mathbf A_{2*},\cdots,\mathbf A_{m*} \}=span\{\mathbf B_{1*},\mathbf B_{2*},\cdots,\mathbf B_{m*}\}$$

so every row of $\mathbf B$ can be written as a linear combination of the rows of $\mathbf A$, and vice versa. Hence $\mathbf A$ can be transformed into $\mathbf B$ by row operations (details omitted), that is, $\mathbf A \stackrel{\text{row}}{\sim}\mathbf B$.
We already know that the rows of $\mathbf A$ span $\mathcal R(\mathbf A^T)$ and the columns of $\mathbf A$ span $\mathcal R(\mathbf A)$. However, spanning these spaces does not always require all of the rows or columns:

Let $\mathbf A$ be an $m\times n$ matrix, and let $\mathbf U$ be any row echelon form derived from $\mathbf A$. Spanning sets for the row and column spaces are as follows:

- The nonzero rows of $\mathbf U$ span $\mathcal R(\mathbf A^T)$
- The basic columns in $\mathbf A$ span $\mathcal R(\mathbf A)$

This property can be obtained by establishing the appropriate row or column equivalence and applying the previous result.
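As a sketch of how these spanning sets can be read off in practice, the code below row-reduces a small matrix with exact fractions. It uses the reduced row echelon form as one particular choice of $\mathbf U$; the function `rref` and the sample matrix are written for this note and are not from the book.

```python
from fractions import Fraction

def rref(rows):
    """Return (E, pivots): the reduced row echelon form of the matrix
    and the list of pivot (basic) column indices, 0-based."""
    M = [[Fraction(v) for v in row] for row in rows]
    m, n = len(M), len(M[0])
    pivots, r = [], 0
    for c in range(n):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, m) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        piv = M[r][c]
        M[r] = [v / piv for v in M[r]]          # scale the pivot to 1
        for i in range(m):
            if i != r and M[i][c] != 0:         # clear the rest of column c
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    return M, pivots

A = [[1, 2, 1],
     [2, 4, 0],
     [3, 6, 1]]
E, pivots = rref(A)

# Nonzero rows of the echelon form span the row space R(A^T).
row_space_spanners = [row for row in E if any(v != 0 for v in row)]
# Basic columns are taken from A itself (not from E) and span R(A).
col_space_spanners = [[A[i][j] for i in range(len(A))] for j in pivots]

print(pivots)  # [0, 2]
print([[int(v) for v in row] for row in row_space_spanners])  # [[1, 2, 0], [0, 0, 1]]
print(col_space_spanners)  # [[1, 2, 3], [1, 0, 1]]
```

Note that the basic columns must be taken from $\mathbf A$, since row operations generally change the column space.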
Nullspace and Left-Hand Nullspace

The other two fundamental subspaces can also be approached from the viewpoint of linear functions. Let $f$ be a linear function mapping $\Re^n$ into $\Re^m$, and define

$$\mathcal N(f)=\{\mathbf x\mid f(\mathbf x)=\mathbf 0\}$$

It is easy to show that $\mathcal N(f)$ is a subspace, because it satisfies **(A1) and (M1)**. If $\mathbf x_1,\mathbf x_2\in \mathcal N(f)$, then by the linearity of $f$,

$$f(\mathbf x_1+\mathbf x_2)=f(\mathbf x_1)+f(\mathbf x_2)=\mathbf 0+\mathbf 0=\mathbf 0\Longrightarrow \mathbf x_1+\mathbf x_2 \in \mathcal N(f)$$

Similarly, if $\alpha \in \Re$ and $\mathbf x\in \mathcal N(f)$, then

$$f(\alpha \mathbf x)=\alpha f(\mathbf x)=\alpha \mathbf 0=\mathbf 0\Longrightarrow \alpha \mathbf x\in \mathcal N(f)$$

We now give the formal definitions of the **nullspace and left-hand nullspace**:
- For an $m\times n$ matrix $\mathbf A$, the set $\mathcal N(\mathbf A)=\{\mathbf x_{n\times 1}\mid\mathbf A \mathbf x=\mathbf 0 \}\subseteq \Re^n$ is called the nullspace of $\mathbf A$. In other words, $\mathcal N(\mathbf A)$ is simply the set of all solutions to the homogeneous system $\mathbf A \mathbf x=\mathbf 0$.
- The set $\mathcal N(\mathbf A^T)=\{\mathbf y_{m\times 1}\mid\mathbf A^T\mathbf y=\mathbf0 \}\subseteq \Re^m$ is called the left-hand nullspace of $\mathbf A$ because $\mathcal N(\mathbf A^T)$ is the set of all solutions to the left-hand homogeneous system $\mathbf y^T \mathbf A=\mathbf 0^T$.

Since the nullspace is a subspace, it can also be spanned by a set of vectors. This follows from the results on solutions of homogeneous systems (see MA&ALA2, on row echelon forms and rank):

To determine a spanning set for $\mathcal N (\mathbf A)$, where $rank (\mathbf A_{m\times n}) = r$, row reduce $\mathbf A$ to a row echelon form $\mathbf U$, and solve $\mathbf U\mathbf x = \mathbf 0$ for the basic variables in terms of the free variables to produce the general solution of $\mathbf A\mathbf x =\mathbf 0$ in the form

$$\mathbf x=x_{f_1}\mathbf h_1+x_{f_2}\mathbf h_2+\cdots+x_{f_{n-r}}\mathbf h_{n-r}$$

- By definition, the set $\mathcal H=\{\mathbf h_1,\mathbf h_2,\cdots,\mathbf h_{n-r} \}$ spans $\mathcal N(\mathbf A)$.
- $\mathcal N(\mathbf A)=\{\mathbf 0\}\iff rank(\mathbf A)=n$
- $\mathcal N(\mathbf A^T)=\{\mathbf 0\}\iff rank(\mathbf A)=m$
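The recipe for the $\mathbf h_i$'s can be sketched as follows, assuming the reduced row echelon form is used as the echelon form: each free variable is set to 1 in turn (the others to 0), and the basic variables are read off from the reduced rows. The helpers `rref` and `nullspace` and the sample matrix are assumptions of this example.

```python
from fractions import Fraction

def rref(rows):
    """Return (E, pivots): reduced row echelon form and pivot columns."""
    M = [[Fraction(v) for v in row] for row in rows]
    m, n = len(M), len(M[0])
    pivots, r = [], 0
    for c in range(n):
        p = next((i for i in range(r, m) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        piv = M[r][c]
        M[r] = [v / piv for v in M[r]]
        for i in range(m):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    return M, pivots

def nullspace(rows):
    """Spanning set {h_1, ..., h_{n-r}} for N(A), one vector per free variable."""
    E, pivots = rref(rows)
    n = len(rows[0])
    free = [c for c in range(n) if c not in pivots]
    H = []
    for f in free:
        h = [Fraction(0)] * n
        h[f] = Fraction(1)               # this free variable is set to 1
        for i, p in enumerate(pivots):
            h[p] = -E[i][f]              # basic variable solved from row i
        H.append(h)
    return H

A = [[1, 2, 1],
     [2, 4, 0],
     [3, 6, 1]]
H = nullspace(A)
# Verify A h = 0 for every spanning vector.
residuals = [[sum(a * x for a, x in zip(row, h)) for row in A] for h in H]
print([[int(v) for v in h] for h in H])  # [[-2, 1, 0]]
```

Here $n=3$ and $r=2$, so a single vector spans $\mathcal N(\mathbf A)$.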
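Next we consider a spanning set for the left-hand nullspace. We could of course treat $\mathbf A^T$ as a matrix $\mathbf A'$ in its own right, compute its reduced row echelon form $\mathbf E_{\mathbf A^T}$, and then obtain the solutions of the homogeneous system and hence a spanning set. But since the other three fundamental subspaces can all be obtained from $\mathbf E_{\mathbf A}$, we would prefer to obtain a spanning set for the left-hand nullspace by starting from $\mathbf E_{\mathbf A}$ as well. This is not an obvious process, because in general $\mathbf E_{\mathbf A^T}\ne \mathbf E_{\mathbf A}^T$: the reduced form of the transpose is not the transpose of the reduced form.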
If $rank\left(\mathbf{A}_{m \times n}\right)=r$, and if $\mathbf{P A}=\mathbf{U}$, where $\mathbf{P}$ is nonsingular and $\mathbf{U}$ is in row echelon form, then the last $m-r$ rows in $\mathbf{P}$ span the left-hand nullspace of $\mathbf A$. In other words, if $\mathbf{P}=\left(\begin{array}{c}\mathbf{P}_{1} \\ \mathbf{P}_{2}\end{array}\right)$, where $\mathbf{P}_{2}$ is $(m-r) \times m$, then

$$\mathcal N\left(\mathbf{A}^{T}\right)=\mathcal R\left(\mathbf{P}_{2}^{T}\right)$$
Proof: write $\mathbf U=\left(\begin{matrix}\mathbf C\\ \mathbf 0\end{matrix} \right)$, where $\mathbf C$ is $r\times n$. From $\mathbf P\mathbf A=\mathbf U$ we get $\mathbf P_2 \mathbf A=\mathbf 0$.

First we show $\mathcal R(\mathbf P_2^T)\subseteq \mathcal N(\mathbf A^T)$. If $\mathbf b\in \mathcal R(\mathbf P_2^T)$, then there is some $\mathbf x$ such that $\mathbf P_2^T\mathbf x=\mathbf b$. Since $\mathbf A^T\mathbf b=\mathbf A^T \mathbf P_2^T\mathbf x=(\mathbf P_2 \mathbf A)^T\mathbf x=\mathbf 0$, we have $\mathbf b\in \mathcal N(\mathbf A^T)$.

Next we show $\mathcal N(\mathbf A^T)\subseteq \mathcal R(\mathbf P_2^T)$. Suppose $\mathbf y\in \mathcal N(\mathbf A^T)$, and write $\mathbf P^{-1}=(\mathbf Q_1~\mathbf Q_2)$, where $\mathbf Q_1$ is $m\times r$. Then

$$\mathbf 0=\mathbf y^T \mathbf A=\mathbf y^T \mathbf P^{-1}\mathbf U=\mathbf y^T \mathbf Q_1\mathbf C\Longrightarrow \mathbf 0=\mathbf y^T \mathbf Q_1$$

The implication holds because $rank(\mathbf C)=r\iff\mathcal N(\mathbf C^T)=\{\mathbf 0\}$ (see the preceding result).

From $\mathbf P \mathbf P^{-1}=\mathbf I=\mathbf P^{-1}\mathbf P$ we have $\mathbf P_1\mathbf Q_1=\mathbf I_r$ and $\mathbf Q_1 \mathbf P_1=\mathbf I_m-\mathbf Q_2\mathbf P_2$, so

$$\begin{aligned} \mathbf 0=\mathbf y^T \mathbf Q_1&\Longrightarrow\mathbf 0=\mathbf y^T\mathbf Q_1\mathbf P_1=\mathbf y^T(\mathbf I-\mathbf Q_2 \mathbf P_2)\\ & \Longrightarrow\mathbf y^T=\mathbf y^T\mathbf Q_2 \mathbf P_2=(\mathbf y^T\mathbf Q_2) \mathbf P_2\\ &\Longrightarrow \mathbf y\in \mathcal R(\mathbf P_2^T) \end{aligned}$$
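One common way to obtain a suitable $\mathbf P$ is to row-reduce the augmented matrix $[\mathbf A\,|\,\mathbf I]$: the right-hand block then records the row operations, so the result is $[\mathbf U\,|\,\mathbf P]$ with $\mathbf P\mathbf A=\mathbf U$. The sketch below does this with exact fractions; `reduce_with_record`, `matmul`, and the sample matrix are illustrative assumptions, and the reduction is carried all the way to reduced form for simplicity.

```python
from fractions import Fraction

def reduce_with_record(rows):
    """Row-reduce [A | I]; return (P, U, r) with P A = U and r = rank(A)."""
    m, n = len(rows), len(rows[0])
    # Append the m x m identity so the right block tracks the row operations.
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(m)]
         for i, row in enumerate(rows)]
    r = 0
    for c in range(n):                   # pivots are sought only in A's columns
        p = next((i for i in range(r, m) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        piv = M[r][c]
        M[r] = [v / piv for v in M[r]]
        for i in range(m):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == m:
            break
    U = [row[:n] for row in M]
    P = [row[n:] for row in M]
    return P, U, r

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, 2, 1],
     [2, 4, 0],
     [3, 6, 1]]
P, U, r = reduce_with_record(A)
P2 = P[r:]                               # the last m - r rows of P
print([[int(v) for v in row] for row in P2])  # [[-1, -1, 1]]
print(matmul(P2, A) == [[0, 0, 0]])           # True: each row y of P2 has y^T A = 0^T
```

For this matrix the single row of $\mathbf P_2$ expresses the fact that the third row of $\mathbf A$ is the sum of the first two.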
Similarly, we can also prove the following:

If $rank\left(\mathbf{A}_{m \times n}\right)=r$, and if $\mathbf{P A}=\mathbf{U}=\left(\begin{matrix}\mathbf C\\ \mathbf 0\end{matrix} \right)$, where $\mathbf{P}=\left(\begin{array}{c}\mathbf{P}_{1} \\ \mathbf{P}_{2}\end{array}\right)$ is nonsingular and $\mathbf{U}$ is in row echelon form, then

$$\mathcal R(\mathbf A)=\mathcal N(\mathbf P_2)$$
Proof:

First we show $\mathcal R(\mathbf A)\subseteq \mathcal N(\mathbf P_2)$. If $\mathbf b\in \mathcal R(\mathbf A)$, then there is some $\mathbf x$ such that $\mathbf A\mathbf x=\mathbf b$. Since $\mathbf P_2\mathbf b=\mathbf P_2 \mathbf A\mathbf x=(\mathbf P_2 \mathbf A)\mathbf x=\mathbf 0$, we have $\mathbf b\in \mathcal N(\mathbf P_2)$.

Next we show $\mathcal N(\mathbf P_2)\subseteq \mathcal R(\mathbf A)$. If $\mathbf b\in \mathcal N(\mathbf P_2)$, then

$$\mathbf P \mathbf b=\left(\begin{matrix}\mathbf P_1\\\mathbf P_2 \end{matrix}\right)\mathbf b=\left(\begin{matrix}\mathbf P_1 \mathbf b\\\mathbf P_2 \mathbf b\end{matrix}\right)=\left(\begin{matrix}\mathbf d_{r\times 1}\\\mathbf 0\end{matrix}\right)$$

Therefore

$$\mathbf P(\mathbf A|\mathbf b)=(\mathbf P\mathbf A|\mathbf P \mathbf b)=\left(\begin{matrix}\mathbf C & \mathbf d\\ \mathbf 0 & \mathbf 0 \end{matrix}\right)$$

which means that

$$rank[\mathbf A|\mathbf b]=r=rank(\mathbf A)$$

In other words, $\mathbf A \mathbf x=\mathbf b$ is consistent (it has at least one solution, not necessarily a unique one), and therefore $\mathbf b\in \mathcal R(\mathbf A)$.
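To see what $\mathcal R(\mathbf A)=\mathcal N(\mathbf P_2)$ says in a concrete case, here is a small worked example. The matrices are an illustrative choice, not taken from the book, and $\mathbf P$ is one nonsingular matrix that reduces this $\mathbf A$ to echelon form. Take

$$\mathbf A=\left(\begin{matrix}1&2&1\\2&4&0\\3&6&1\end{matrix}\right),\quad \mathbf P=\left(\begin{matrix}0&\tfrac12&0\\1&-\tfrac12&0\\-1&-1&1\end{matrix}\right),\quad \mathbf P\mathbf A=\left(\begin{matrix}1&2&0\\0&0&1\\0&0&0\end{matrix}\right)=\mathbf U$$

so $r=2$ and $\mathbf P_2=\left(\begin{matrix}-1&-1&1\end{matrix}\right)$. The statement then reads

$$\mathbf b\in\mathcal R(\mathbf A)\iff \mathbf P_2\mathbf b=0\iff b_3=b_1+b_2$$

Each column of $\mathbf A$ satisfies this condition ($3=1+2$, $6=2+4$, $1=1+0$), whereas $\mathbf b=(1,0,0)^T$ does not, so $\mathbf A\mathbf x=\mathbf b$ is inconsistent for that $\mathbf b$. In effect, $\mathbf P_2$ supplies the consistency conditions for the system.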
Sometimes we need to know whether two matrices have the same nullspace or the same left-hand nullspace. The following property can be used:

For two matrices $\mathbf A$ and $\mathbf B$ of the same shape:

- $\mathcal N(\mathbf A)=\mathcal N(\mathbf B)\iff \mathbf A\stackrel{\text{row}}{\sim}\mathbf B$
- $\mathcal N(\mathbf A^T)=\mathcal N(\mathbf B^T)\iff \mathbf A\stackrel{\text{col}}{\sim}\mathbf B$
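In practice, row equivalence of two matrices of the same shape can be tested by comparing their reduced row echelon forms. This relies on the standard fact that the reduced form is unique, which is assumed here rather than proved in these notes. The helper names and the sample matrices below are made up for illustration.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form of a matrix, computed with exact fractions."""
    M = [[Fraction(v) for v in row] for row in rows]
    m, n = len(M), len(M[0])
    r = 0
    for c in range(n):
        p = next((i for i in range(r, m) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        piv = M[r][c]
        M[r] = [v / piv for v in M[r]]
        for i in range(m):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == m:
            break
    return M

def row_equivalent(A, B):
    """Same-shape matrices are row equivalent iff their reduced forms agree."""
    return rref(A) == rref(B)

A = [[1, 2, 1],
     [2, 4, 0],
     [3, 6, 1]]
# B is built from A by swapping rows 1 and 2 and adding row 1 to row 3.
B = [[2, 4, 0],
     [1, 2, 1],
     [4, 8, 1]]
# C is the transpose of A, which has a different row space.
C = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]

print(row_equivalent(A, B))  # True
print(row_equivalent(A, C))  # False
```

Consistent with the property above, $\mathbf A$ and $\mathbf B$ share the nullspace spanned by $(-2,1,0)^T$, while $\mathbf C$ does not annihilate that vector.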
Summary

The four fundamental subspaces associated with $\mathbf{A}_{m \times n}$ are as follows.

- The range or column space: $\mathcal R(\mathbf{A})=\{\mathbf{A} \mathbf{x}\} \subseteq \Re^{m}$
- The row space or left-hand range: $\mathcal R\left(\mathbf{A}^{T}\right)=\left\{\mathbf{A}^{T} \mathbf{y}\right\} \subseteq \Re^{n}$
- The nullspace: $\mathcal N(\mathbf{A})=\{\mathbf{x} \mid \mathbf{A} \mathbf{x}=\mathbf{0}\} \subseteq \Re^{n}$
- The left-hand nullspace: $\mathcal N\left(\mathbf{A}^{T}\right)=\left\{\mathbf{y} \mid \mathbf{A}^{T} \mathbf{y}=\mathbf{0}\right\} \subseteq \Re^{m}$

Let $\mathbf{P}$ be a nonsingular matrix such that $\mathbf{P A}=\mathbf{U}$, where $\mathbf{U}$ is in row echelon form, and suppose $rank(\mathbf{A})=r$.

- Spanning set for $\mathcal R(\mathbf{A})$ = the basic columns in $\mathbf{A}$
- Spanning set for $\mathcal R\left(\mathbf{A}^{T}\right)$ = the nonzero rows in $\mathbf{U}$
- Spanning set for $\mathcal N(\mathbf{A})$ = the $\mathbf{h}_{i}$'s in the general solution of $\mathbf{A x}=\mathbf{0}$
- Spanning set for $\mathcal N\left(\mathbf{A}^{T}\right)$ = the last $m-r$ rows of $\mathbf{P}$

If $\mathbf{A}$ and $\mathbf{B}$ have the same shape, then

- $\mathbf{A} \stackrel{\text{row}}{\sim} \mathbf{B} \Longleftrightarrow \mathcal N(\mathbf{A})=\mathcal N(\mathbf{B}) \Longleftrightarrow \mathcal R\left(\mathbf{A}^{T}\right)=\mathcal R\left(\mathbf{B}^{T}\right)$
- $\mathbf{A} \stackrel{\text{col}}{\sim} \mathbf{B} \Longleftrightarrow \mathcal R(\mathbf{A})=\mathcal R(\mathbf{B}) \Longleftrightarrow \mathcal N\left(\mathbf{A}^{T}\right)=\mathcal N\left(\mathbf{B}^{T}\right)$