Matrix Theory (3): Matrix Decompositions
1. Common canonical forms and decompositions of matrices
1.1 Triangular decomposition
Equivalence canonical form: for $A \in C^{m \times n}$ with $\operatorname{rank}(A) = r$, there exist invertible matrices $P \in C^{m \times m}$ and $Q \in C^{n \times n}$ such that
$$A = P \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} Q$$
Similarity canonical form: for $A \in C^{n \times n}$ there exists an invertible matrix $P \in C^{n \times n}$ such that
$$A = P \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix} P^{-1}$$
when $A$ is diagonalizable, or, in general,
$$A = P J_A P^{-1}$$
where $J_A$ is the Jordan canonical form of $A$.
If $A \in R^{n \times n}$ and $A^T = A$, then there exists an orthogonal matrix $P$ ($P^T P = I$) such that $P^T A P = P^{-1} A P = \mathrm{diag}(\lambda_1,\ \lambda_2,\ \cdots,\ \lambda_n)$.
Let
$$U = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ & a_{22} & \cdots & a_{2n} \\ & & \ddots & \vdots \\ & & & a_{nn} \end{bmatrix}$$
be an upper triangular matrix and
$$L = \begin{bmatrix} a_{11} & & & \\ a_{21} & a_{22} & & \\ \vdots & \vdots & \ddots & \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$$
be a lower triangular matrix.
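LU decomposition: $A = LU$, with $L$ lower triangular and $U$ upper triangular.
LDV decomposition: $A = LDV$, where $V$ is upper triangular with all diagonal entries equal to 1, $D$ is a diagonal matrix, and (in the usual convention) $L$ is lower triangular with unit diagonal as well.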
If there is an invertible lower triangular matrix $P$ such that $PA = U$, then $A = P^{-1} U$, and $P^{-1}$ is again lower triangular.
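Computing a triangular decomposition of $A$:
1. Apply elementary row operations to $A$ (only adding multiples of a row to rows below it, so that $P$ stays lower triangular): $(A,\ I) \rightarrow (U,\ P)$
2. Set $L = P^{-1}$, which can be computed by $(P,\ I) \rightarrow (I,\ P^{-1})$
3. Write $U = DV$, where $D = \mathrm{diag}(U)$ is the diagonal part of $U$ and $V$ is $U$ with each row divided by its diagonal entry
For example: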
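The procedure above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `ldv_decompose`, the helper `matmul` and the test matrix are assumptions made here, and instead of forming $P$ and inverting it, the sketch stores the elimination multipliers directly in $L$, which gives the same $L = P^{-1}$. Exact fractions are used so that the factors can be compared without rounding.

```python
from fractions import Fraction


def ldv_decompose(a):
    """Factor a square matrix as A = L D V without row exchanges.

    Returns (L, d, V): L is unit lower triangular, d is the list of
    diagonal entries of D, and V is unit upper triangular.
    Assumes every pivot is nonzero, so no row swaps are needed.
    """
    n = len(a)
    u = [[Fraction(x) for x in row] for row in a]  # reduced in place to U
    l = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n):
        if u[k][k] == 0:
            raise ValueError("zero pivot: not handled by this sketch")
        for i in range(k + 1, n):
            m = u[i][k] / u[k][k]  # elimination multiplier, stored in L
            l[i][k] = m
            for j in range(k, n):
                u[i][j] -= m * u[k][j]
    d = [u[k][k] for k in range(n)]  # D = diag(U)
    v = [[u[i][j] / d[i] for j in range(n)] for i in range(n)]  # V = D^{-1} U
    return l, d, v


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Hypothetical test matrix, chosen so that all pivots are nonzero.
A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
L, d, V = ldv_decompose(A)
D = [[d[i] if i == j else Fraction(0) for j in range(3)] for i in range(3)]
assert matmul(matmul(L, D), V) == A  # A = L D V
```

Multiplying $D$ back into $V$ recovers the $U$ of the LU decomposition, so the same routine covers both forms.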
$A$ has a unique LDV decomposition $\iff$ the leading principal minors of $A$ satisfy
$$\Delta_k = \begin{vmatrix} a_{11} & \cdots & a_{1k} \\ a_{21} & \cdots & a_{2k} \\ \vdots & & \vdots \\ a_{k1} & \cdots & a_{kk} \end{vmatrix} \neq 0,\quad k = 1,\ 2,\ \cdots,\ n - 1;\quad \Delta_0 = 1$$
where
$$D = \begin{bmatrix} d_1 & & & \\ & d_2 & & \\ & & \ddots & \\ & & & d_n \end{bmatrix},\quad d_k = \frac{\Delta_k}{\Delta_{k - 1}},\quad k = 1,\ 2,\ \cdots,\ n$$
Let $A = (a_{ij}) \in F^{n \times n}$ with $\operatorname{rank}(A) = k$ ($k \leq n$). If the leading principal minors of $A$ satisfy $\Delta_j \neq 0$ for $j = 1,\ 2,\ \cdots,\ k$, then $A$ has an LU decomposition.
Not every matrix has an LU decomposition, for example
$$A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$
For an invertible matrix $A \in F^{n \times n}$: $A$ has an LU decomposition $\iff$ all leading principal minors satisfy $\Delta_k \neq 0$, $k = 1,\ 2,\ \cdots,\ n - 1$.
For example:
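A short argument, added here as a check of the criterion, shows why the $2 \times 2$ matrix with zero diagonal and unit off-diagonal entries has no LU decomposition. The entries $l_{ij}$ and $u_{ij}$ below are hypothetical factor entries introduced only for this argument. The criterion already fails because $\Delta_1 = a_{11} = 0$; directly, suppose

$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} l_{11} & 0 \\ l_{21} & l_{22} \end{bmatrix} \begin{bmatrix} u_{11} & u_{12} \\ 0 & u_{22} \end{bmatrix}$$

Comparing the $(1,1)$ entries gives $l_{11} u_{11} = 0$, so $l_{11} = 0$ or $u_{11} = 0$, and hence $|L| = 0$ or $|U| = 0$. Then

$$|A| = |L|\,|U| = 0$$

which contradicts $|A| = -1$. So no such factors exist.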
Suppose $A$ has the LU decomposition $A = LU$. Then
$$AX = b \iff LUX = b \iff \begin{cases} LY = b \\ UX = Y \end{cases}$$
so the system is solved by a forward substitution for $Y$ followed by a back substitution for $X$.
For example:
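The two triangular solves can be sketched as follows. The function names, the factors `L` and `U` (which multiply to the hypothetical matrix with rows (2, 1, 1), (4, 3, 3), (8, 7, 9)) and the right-hand side `b` are assumptions chosen for illustration, not data from the original notes.

```python
from fractions import Fraction


def forward_substitution(l, b):
    """Solve L y = b for a lower triangular L with nonzero diagonal."""
    n = len(b)
    y = [Fraction(0)] * n
    for i in range(n):
        known = sum(l[i][j] * y[j] for j in range(i))
        y[i] = (Fraction(b[i]) - known) / l[i][i]
    return y


def back_substitution(u, y):
    """Solve U x = y for an upper triangular U with nonzero diagonal."""
    n = len(y)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(u[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(y[i]) - known) / u[i][i]
    return x


# Assumed LU factors of a hypothetical 3 x 3 matrix.
L = [[1, 0, 0], [2, 1, 0], [4, 3, 1]]
U = [[2, 1, 1], [0, 1, 1], [0, 0, 2]]
b = [4, 10, 24]

y = forward_substitution(L, b)  # step 1: L Y = b
x = back_substitution(U, y)     # step 2: U X = Y
print(y, x)
```

Once $L$ and $U$ are known, each additional right-hand side costs only these two substitutions, which is the usual reason for factoring $A$ first.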
1.2 Full-rank decomposition
Full-rank decomposition: let $A \in F^{m \times n}$ with $\operatorname{rank}(A) = r$. If there exist matrices $B \in F^{m \times r}$ and $C \in F^{r \times n}$, both of rank $r$, such that $A = BC$, then $A = BC$ is called a full-rank decomposition of $A$.
Every nonzero matrix $A \in F^{m \times n}$ has a full-rank decomposition. In general it is not unique.
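Methods for computing a full-rank decomposition:
Method 3 (Hermite normal form): reduce $A$ by elementary row operations to its Hermite normal form (reduced row echelon form); take $B$ to be the columns of $A$ in the pivot positions and $C$ to be the nonzero rows of the Hermite form.
For example: find a full-rank decomposition of
$$A = \begin{bmatrix} 1 & 1 & 2 \\ 0 & 2 & 2 \\ 1 & 0 & 1 \end{bmatrix}$$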
Solution:
$$A = \begin{bmatrix} 1 & 1 & 2 \\ 0 & 2 & 2 \\ 1 & 0 & 1 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix} \Rightarrow \operatorname{rank}(A) = 2$$
The pivots lie in columns 1 and 2, so taking $B$ as the first two columns of $A$ and $C$ as the nonzero rows of the Hermite form gives
$$B = \begin{bmatrix} 1 & 1 \\ 0 & 2 \\ 1 & 0 \end{bmatrix},\quad C = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix},\quad A = BC$$
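The Hermite-form method used in this example can be sketched in Python as below. The function names `rref` and `full_rank_decomposition` are assumptions of this sketch; exact fractions avoid rounding issues when deciding which entries are zero.

```python
from fractions import Fraction


def rref(a):
    """Return (reduced row echelon form, list of pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in a]
    rows, cols = len(m), len(m[0])
    pivots = []
    r = 0  # index of the next pivot row
    for c in range(cols):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]
        for i in range(rows):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    return m, pivots


def full_rank_decomposition(a):
    """Return (B, C) with A = B C, where B has the pivot columns of A."""
    h, pivots = rref(a)
    b = [[Fraction(row[c]) for c in pivots] for row in a]
    c = h[:len(pivots)]  # nonzero rows of the Hermite form
    return b, c


A = [[1, 1, 2], [0, 2, 2], [1, 0, 1]]
B, C = full_rank_decomposition(A)
BC = [[sum(B[i][k] * C[k][j] for k in range(len(C)))
       for j in range(3)] for i in range(3)]
assert BC == A
```

A different choice of row operations or of basis columns gives another valid pair $(B,\ C)$, which is one way to see that the decomposition is not unique.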
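1.3 Spectral decomposition
Spectrum: the set of all distinct eigenvalues of a square matrix $A$.
Spectral decomposition: $A = \sum \limits^s_{i = 1}\lambda_i P_i$, i.e. a diagonalizable matrix can be written as a weighted sum of $s$ square matrices $P_i$.
The decomposition is obtained as follows: if $A = P\,\mathrm{diag}(\lambda_1 I_{n_1},\ \cdots,\ \lambda_s I_{n_s})\,P^{-1}$, take $P_i = P\,\mathrm{diag}(0,\ \cdots,\ I_{n_i},\ \cdots,\ 0)\,P^{-1}$.
Idempotent matrix: $P^2 = P$.
Properties:
1. $P^H$ and $(I - P)$ are also idempotent
2. The eigenvalues of $P$ are 1 or 0, and $P$ is similar to a diagonal matrix
3. $F^n = N(P) \oplus R(P)$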
Let $A \in C^{n \times n}$. Then $A$ is diagonalizable $\iff A = \sum \limits^{s}_{i = 1} \lambda_i P_i$,
where $\lambda_1,\ \cdots,\ \lambda_s$ are the distinct eigenvalues (the spectrum) of $A$ and the matrices $P_i \in C^{n \times n}$ satisfy
1. $P_i^2 = P_i,\ i = 1,\ 2,\ \cdots,\ s$
2. $P_i P_j = 0,\ i \neq j$
3. $\sum \limits^s_{i = 1}P_i = I_n$
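As a small worked instance of the three conditions (the matrix is chosen here for illustration and does not come from the original notes), take a diagonalizable but non-normal matrix with spectrum $\{1,\ 2\}$. For two distinct eigenvalues the projectors can be written as $P_1 = \frac{A - \lambda_2 I}{\lambda_1 - \lambda_2}$ and $P_2 = \frac{A - \lambda_1 I}{\lambda_2 - \lambda_1}$:

$$A = \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix},\quad \lambda_1 = 1,\ \lambda_2 = 2$$

$$P_1 = \frac{A - 2I}{1 - 2} = \begin{bmatrix} 1 & -1 \\ 0 & 0 \end{bmatrix},\quad P_2 = \frac{A - I}{2 - 1} = \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix}$$

$$P_1^2 = P_1,\quad P_2^2 = P_2,\quad P_1 P_2 = P_2 P_1 = 0,\quad P_1 + P_2 = I_2,\quad 1 \cdot P_1 + 2 \cdot P_2 = \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix} = A$$

Here $P_1$ and $P_2$ are idempotent but not Hermitian, which is expected since this $A$ is not normal.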
Positive semidefinite Hermitian matrix: let $A \in F^{n \times n}$ with $A^H = A$. If $x^H A x \geq 0$ for all $x \in F^n$, then $A$ is called positive semidefinite. Equivalently,
$A$ is positive semidefinite $\iff$ all eigenvalues of $A$ are nonnegative real numbers.
If $A \in F^{n \times n}$ is a positive semidefinite Hermitian matrix with $\operatorname{rank}(A) = k$, then
$$A = v_1 v_1^H + v_2 v_2^H + \cdots + v_k v_k^H$$
where $v_i \in F^n$ and $\{v_1,\ v_2,\ \cdots,\ v_k\}$ is a set of nonzero, mutually orthogonal vectors in $F^n$.
2. Schur decomposition and normal matrices
2.1 Schur decomposition
Orthogonal (unitary) similarity: for $A,\ B \in R^{n \times n}$ ($C^{n \times n}$), if there exists an orthogonal (unitary) matrix $U$ such that $U^T A U = B$ ($U^H A U = B$), then $A$ is said to be orthogonally (unitarily) similar to $B$.
UR decomposition: if $A \in C^{n \times n}$ and $|A| \neq 0$, then there exist a unitary matrix $U \in C^{n \times n}$ and an upper triangular matrix with positive diagonal entries
$$R = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ & r_{22} & \cdots & r_{2n} \\ & & \ddots & \vdots \\ & & & r_{nn} \end{bmatrix}, \quad r_{ii} > 0$$
such that $A = UR$.
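Steps for computing the UR decomposition:
1. Denote the columns of $A$ by $A_1, \cdots, A_n$
2. Apply Gram-Schmidt orthonormalization to $A_1, \cdots, A_n$ to obtain an orthonormal set of vectors, which form the columns of $U$
3. Determine $R$ from $A = UR$ by elementary row operations $(U|A) \rightarrow (E|R)$; since $U^H U = E$, this amounts to $R = U^H A$
For example: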
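A minimal sketch of these steps in Python, under the assumption that the matrix is real with linearly independent columns (the complex case would need conjugated inner products). The function name `gram_schmidt_ur` and the test matrix are hypothetical. The entries of $R$ are collected during the orthogonalization as $r_{ij} = \langle q_i,\ A_j \rangle$, which is the same $R = U^H A$ that the row reduction in step 3 produces.

```python
import math


def gram_schmidt_ur(a):
    """Return (U, R) with A = U R via classical Gram-Schmidt.

    `a` is a real matrix (list of rows) with linearly independent columns.
    U has orthonormal columns; R is upper triangular with positive diagonal.
    """
    m, n = len(a), len(a[0])
    cols = [[float(a[i][j]) for i in range(m)] for j in range(n)]
    q = []  # orthonormal vectors found so far
    r = [[0.0] * n for _ in range(n)]
    for j, v in enumerate(cols):
        w = v[:]
        for i, qi in enumerate(q):
            r[i][j] = sum(x * y for x, y in zip(qi, v))  # <q_i, A_j>
            w = [wk - r[i][j] * qk for wk, qk in zip(w, qi)]
        r[j][j] = math.sqrt(sum(x * x for x in w))
        if r[j][j] < 1e-12:
            raise ValueError("columns are linearly dependent")
        q.append([x / r[j][j] for x in w])
    u = [[q[j][i] for j in range(n)] for i in range(m)]  # q vectors as columns
    return u, r


# Hypothetical invertible test matrix.
A = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
U, R = gram_schmidt_ur(A)
UR = [[sum(U[i][k] * R[k][j] for k in range(3)) for j in range(3)]
      for i in range(3)]
max_err = max(abs(UR[i][j] - A[i][j]) for i in range(3) for j in range(3))
print(max_err)
```

The same routine also covers a non-square matrix with full column rank, in which case it returns the $Q$ and $R$ of a QR decomposition. For ill-conditioned matrices, modified Gram-Schmidt or Householder reflections are numerically safer than this classical variant.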
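QR decomposition: if $A \in C^{m \times k}$ has full column rank (its columns are linearly independent), i.e. $\operatorname{rank}(A) = k$, then $A$ can be factored as $A = QR$,
where $Q \in C^{m \times k}$ has orthonormal columns (pairwise orthogonal unit vectors) and $R \in C^{k \times k}$ is an invertible upper triangular matrix.
Steps for computing the QR decomposition:
1. Denote the columns of $A$ by $A_1, \cdots, A_k$
2. Apply Gram-Schmidt orthonormalization to $A_1, \cdots, A_k$ to obtain an orthonormal set of vectors, which form the columns of $Q$
3. Determine $R$ from $A = QR$ by elementary row operations $(Q|A) \rightarrow (E|R)$; since $Q^H Q = E$, this amounts to $R = Q^H A$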
Schur decomposition: for $A \in C^{n \times n}$ there exist a unitary matrix $U$ and an upper triangular matrix $T$ such that
$$U^H A U = T = \begin{bmatrix} \lambda_1 & t_{12} & \cdots & t_{1n} \\ & \lambda_2 & \cdots & t_{2n} \\ & & \ddots & \vdots \\ & & & \lambda_n \end{bmatrix}$$
where $\lambda_i$, $i = 1,\ 2,\ \cdots,\ n$, are the eigenvalues of $A$.
Steps for computing the Schur decomposition:
1. If the Jordan matrix $J_A$ and an invertible matrix $P$ with $J_A = P^{-1} A P$ are not already known, compute them first (see the steps for computing the Jordan canonical form)
2. Denote the columns of $P$ by $(\alpha_1, \cdots, \alpha_n)$
3. Apply Gram-Schmidt orthonormalization to $(\alpha_1, \cdots, \alpha_n)$ to obtain an orthonormal set of vectors, which form the columns of $U$
4. Determine $R$ from $P = UR$ by elementary row operations $(U|P) \rightarrow (E|R)$
5. Let $T = R J_A R^{-1}$, which is upper triangular; then $A = P J_A P^{-1} = U R J_A R^{-1} U^H$, i.e. $U^H A U = T$
2.2 Normal matrices
Normal matrix: $A \in R^{n \times n}$ with $A^T A = A A^T$; $A \in C^{n \times n}$ with $A^H A = A A^H$.
Diagonal matrices, Hermitian matrices, skew-Hermitian matrices ($A^H = -A$), unitary (orthogonal) matrices and real symmetric matrices are all normal.
A complex symmetric matrix is not necessarily normal, for example
$$A = \begin{bmatrix}1 & i \\ i & -1 \end{bmatrix},\quad A^H = \begin{bmatrix}1 & -i \\ -i & -1 \end{bmatrix},\quad \text{but } A^H A \neq A A^H$$
For example:
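The normality test $A^H A = A A^H$ is easy to check numerically. The sketch below uses plain Python complex numbers; the helper names `conj_transpose`, `matmul` and `is_normal` are assumptions of this sketch, and the Hermitian matrix `H` is a hypothetical contrast case that is not taken from the notes.

```python
def conj_transpose(a):
    """Return the conjugate transpose A^H of a matrix given as rows."""
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def is_normal(a, tol=1e-12):
    """Check A^H A = A A^H entry by entry, up to the tolerance `tol`."""
    ah = conj_transpose(a)
    left, right = matmul(ah, a), matmul(a, ah)
    n = len(a)
    return all(abs(left[i][j] - right[i][j]) <= tol
               for i in range(n) for j in range(n))


# The complex symmetric matrix from the text.
A = [[1 + 0j, 1j], [1j, -1 + 0j]]
AhA = matmul(conj_transpose(A), A)
AAh = matmul(A, conj_transpose(A))

# A Hermitian matrix (hypothetical contrast case), which must be normal.
H = [[2 + 0j, 1j], [-1j, 3 + 0j]]

print(is_normal(A), is_normal(H))
```

For the matrix in the text the two products differ only in the sign of their off-diagonal entries, which is already enough to rule out normality.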
$A \in C^{n \times n}$ is normal $\iff$ $A$ is unitarily similar to a diagonal matrix, i.e. there exists a unitary $U \in C^{n \times n}$ with $U^H A U = \mathrm{diag}(\lambda_1,\ \lambda_2,\ \cdots,\ \lambda_n)$.
$A \in C^{n \times n}$ is normal $\iff$ $A$ has $n$ eigenvectors that form an orthonormal basis of $C^n$.
If $A \in F^{n \times n}$ is normal, then $A$ has $n$ linearly independent eigenvectors, and eigenvectors belonging to different eigenvalues are orthogonal (in particular linearly independent).
Steps for the spectral decomposition of a normal matrix:
1. Check $A^H A = A A^H$ to confirm that $A$ is normal
2. Find the eigenvalues of $A$ from $|\lambda I - A| = (\lambda - \lambda_1) \cdots (\lambda - \lambda_n) = 0$
3. Find the corresponding linearly independent eigenvectors $\epsilon_1,\ \cdots,\ \epsilon_n$
4. Orthonormalize $\epsilon_1,\ \cdots,\ \epsilon_n$ to obtain $u_1,\ \cdots,\ u_n$
5. Let $U = (u_1,\ \cdots,\ u_n)$; then $U^H A U = \mathrm{diag}(\lambda_1,\ \cdots,\ \lambda_n)$
Spectral decomposition of a normal matrix: let $A \in C^{n \times n}$ with spectrum $\lambda(A) = \{\lambda_1,\ \cdots,\ \lambda_s\}$ (the distinct eigenvalues). Then
$A$ is normal $\iff$ $A$ has the spectral decomposition
$$A = \sum \limits^s_{i = 1} \lambda_i P_i,\quad P_i \in C^{n \times n}$$
where
$$P^2_i = P_i,\quad P^H_i = P_i,\quad i = 1,\ \cdots,\ s$$
$$P_i P_j = 0,\quad i \neq j$$
$$I_n = \sum \limits^s_{i = 1} P_i$$
For example: for
$$A = \begin{bmatrix} 0 & -1 & i \\ 1 & 0 & 0 \\ i & 0 & 0 \end{bmatrix}$$
find a unitary matrix $U$ such that $U^H A U = \Lambda$ is diagonal, and find the spectral decomposition of $A$.
Solution:
Unitary matrix $U$: since $A^H A = A A^H$, the matrix $A$ is normal, and
$$|\lambda I - A| = \lambda(\lambda^2+2) = 0 \Rightarrow \lambda_1 = \sqrt{2} i,\ \lambda_2 = -\sqrt{2} i,\ \lambda_3 = 0$$
The corresponding eigenvectors are
$$\epsilon_1 = (\sqrt{2},\ -i,\ 1)^T,\quad \epsilon_2 = (-\sqrt{2},\ -i,\ 1)^T,\quad \epsilon_3 = (0,\ -1,\ i)^T$$
These belong to distinct eigenvalues of a normal matrix and are therefore already mutually orthogonal, so it suffices to normalize them, $u_i = \frac{\epsilon_i}{||\epsilon_i||}$:
$$u_1 = \left(\frac{\sqrt{2}}{2},\ -\frac{i}{2},\ \frac{1}{2}\right)^T,\quad u_2 = \left(-\frac{\sqrt{2}}{2},\ -\frac{i}{2},\ \frac{1}{2}\right)^T,\quad u_3 = \left(0,\ -\frac{1}{\sqrt{2}},\ \frac{i}{\sqrt{2}}\right)^T$$
Let $U = (u_1,\ u_2,\ u_3)$; then $U$ is unitary and $U^H A U = \mathrm{diag}(\sqrt{2} i,\ -\sqrt{2} i,\ 0)$.
Spectral decomposition of $A$:
$$\begin{aligned}A & = (u_1,\ u_2,\ u_3) \left[ \sqrt{2} i \begin{pmatrix} 1 & & \\ & 0 & \\ & & 0 \end{pmatrix} -\sqrt{2} i \begin{pmatrix} 0 & & \\ & 1 & \\ & & 0 \end{pmatrix} +0\begin{pmatrix} 0 & & \\ & 0 & \\ & & 1 \end{pmatrix} \right] \begin{pmatrix} u_1^H \\ u_2^H \\ u_3^H \end{pmatrix} \\ & = \sqrt{2}i\, u_1 u^H_1 - \sqrt{2}i\, u_2 u_2^H + 0\, u_3 u_3^H \end{aligned}$$
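The result can be checked numerically. The sketch below rebuilds $A$ from the eigenvalues $\sqrt{2}i,\ -\sqrt{2}i,\ 0$ and the unit eigenvectors obtained by normalizing $\epsilon_1 = (\sqrt{2},\ -i,\ 1)^T$, $\epsilon_2 = (-\sqrt{2},\ -i,\ 1)^T$ and $\epsilon_3 = (0,\ -1,\ i)^T$; the helper `outer` and the variable names are assumptions of this sketch. Because every eigenvalue here is simple, each projector is the rank-one matrix $P_i = u_i u_i^H$.

```python
import math


def outer(u):
    """Rank-one matrix u u^H for a column vector u."""
    return [[ui * uj.conjugate() for uj in u] for ui in u]


s = math.sqrt(2)
A = [[0, -1, 1j], [1, 0, 0], [1j, 0, 0]]
eigenvalues = [s * 1j, -s * 1j, 0j]
# Unit eigenvectors u_1, u_2, u_3, written with complex entries.
vectors = [
    [complex(s / 2), -0.5j, complex(0.5)],
    [complex(-s / 2), -0.5j, complex(0.5)],
    [0j, complex(-1 / s), 1j / s],
]
projectors = [outer(u) for u in vectors]  # P_i = u_i u_i^H

# A rebuilt as the weighted sum of projectors, and the sum of projectors.
rebuilt = [[sum(lam * p[i][j] for lam, p in zip(eigenvalues, projectors))
            for j in range(3)] for i in range(3)]
identity = [[sum(p[i][j] for p in projectors) for j in range(3)]
            for i in range(3)]

max_err = max(abs(rebuilt[i][j] - A[i][j])
              for i in range(3) for j in range(3))
print(max_err)
```

The printed error should be at the level of floating-point rounding, and `identity` should be close to $I_3$, matching the condition $\sum P_i = I_n$.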
3. Singular value decomposition of a matrix
3.1 Singular values of a matrix and their properties
For every $A \in C^{m \times n}$ we have $A^HA \in C^{n \times n}$ and $AA^H \in C^{m \times m}$, and both $A^HA$ and $AA^H$ are Hermitian matrices.
Let $A \in C^{m \times n}$. Then
1. $\operatorname{rank}(A) = \operatorname{rank}(A^HA) = \operatorname{rank}(AA^H)$
2. $A^HA$ and $AA^H$ have the same nonzero eigenvalues, and all their eigenvalues are nonnegative real numbers
3. $A^HA$ and $AA^H$ are both positive semidefinite; when $\operatorname{rank}(A) = n$, $A^HA$ is positive definite
Singular values: let $A \in C^{m \times n}$ with $\operatorname{rank}(A) = r > 0$, and let the eigenvalues of $A^HA$ be
$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r > 0,\quad \lambda_{r+1} = \cdots = \lambda_n = 0$$
(the eigenvalues of $AA^H$ are then $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r > 0,\ \lambda_{r+1} = \cdots = \lambda_m = 0$). The positive numbers $\sigma_i = \sqrt{\lambda_i}$ ($i = 1,\ \cdots,\ r$) are called the singular values of $A$.
Computing the singular values: for $A \in C^{m \times n}$, find the eigenvalues $\lambda_1 \ge \cdots \ge \lambda_r > 0$ of $A^HA$; then $\sigma_i = \sqrt{\lambda_i}$.
If $A \in C^{n \times n}$ is normal, the singular values of $A$ are the moduli of the nonzero eigenvalues of $A$, i.e. $\sigma = |\lambda|$ for $\lambda \in \lambda(A)$, $\lambda \neq 0$.
If $A \in C^{n \times n}$ is a positive definite Hermitian matrix, the singular values of $A$ are the eigenvalues of $A$.
Unitary equivalence: $A,\ B \in C^{m \times n}$ are unitarily equivalent if there exist unitary matrices $U \in C^{m \times m}$ and $V \in C^{n \times n}$ such that $UAV = B$.
Unitarily equivalent matrices $A$ and $B$ have the same singular values.
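3.2 Singular value decomposition (SVD)
The singular value decomposition can be viewed as a generalization of the spectral decomposition: the spectral decomposition above requires a normal matrix, whereas the singular value decomposition applies to any matrix (for a positive semidefinite Hermitian matrix the two coincide).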
For every $A \in C^{m \times n}$ with $\operatorname{rank}(A) = r > 0$ and singular values $\sigma_1 \ge \cdots \ge \sigma_r$, there exist unitary matrices $U \in C^{m \times m}$ and $V \in C^{n \times n}$ such that
$$A_{m \times n} = U_{m \times m} \begin{pmatrix} \Delta_r & \\ & 0 \end{pmatrix}_{m \times n} V^H_{n \times n},\quad \text{where } \Delta_r = \begin{pmatrix}\sigma_1 & & \\ & \ddots & \\ & & \sigma_r\end{pmatrix}$$
The singular value decomposition of $A$ is not unique.
Writing $A \in C^{m \times n}$, $U = (u_1,\ \cdots,\ u_m) \in C^{m \times m}$ and $V=(v_1,\ \cdots,\ v_n) \in C^{n \times n}$, we get
$$A = U \Sigma V^H = (u_1,\ \cdots,\ u_m) \begin{pmatrix}\sigma_1 & & & \\ & \ddots & & \\ & & \sigma_r & \\ & & & 0\end{pmatrix} \begin{pmatrix}v_1^H \\ \vdots \\ v_n^H\end{pmatrix} = \sigma_1 u_1 v_1^H + \cdots + \sigma_r u_r v_r^H$$
Left singular vectors: the columns $u_1,\ u_2,\ \cdots,\ u_r$ of $U$
Right singular vectors: the columns $v_1,\ v_2,\ \cdots,\ v_r$ of $V$
In particular, a matrix $A \in C^{m \times n}$ with singular values $\sigma_1 \ge \cdots \ge \sigma_r > 0$ is a sum of $r$ rank-one matrices $\sigma_i u_i v_i^H$.
For $A \in C^{n \times n}$ with eigenvalues $\lambda_1,\ \cdots,\ \lambda_n$: $\sum \limits^n_{i = 1} |\lambda_i|^2 \leq tr(A^HA) = tr(AA^H) = \sum \limits^r_{i = 1} \sigma_i^2$, with equality $\iff A^HA = AA^H$.
Steps for computing the SVD of $A$:
1. Find the eigenvalues of $A^HA$: $\sigma_1^2 \ge \cdots \ge \sigma_r^2 > 0,\ \sigma_{r + 1}^2 = \cdots = \sigma_n^2 = 0$, and the corresponding linearly independent eigenvectors of $A^HA$: $\epsilon_1, \cdots, \epsilon_r, \cdots, \epsilon_n$
2. Orthonormalize $\epsilon_1, \cdots, \epsilon_r, \cdots, \epsilon_n$ to obtain $v_1, \cdots, v_n$, and take $V = (v_1, \cdots, v_n)$
3. Let $u_i = \frac{1}{\sigma_i}Av_i$, $i = 1, \cdots, r$; extend $u_1,\ \cdots, u_r$ to an orthonormal basis $u_1, \cdots, u_m$ of $C^m$ and let $U = (u_1, \cdots, u_m)$. Then
$$A = U \begin{pmatrix}\sigma_1 & & & \\ & \ddots & & \\ & & \sigma_r & \\ & & & 0\end{pmatrix}V^H$$
Note:
$$A^HA = V \begin{pmatrix}\Delta_r^2 & \\ & 0\end{pmatrix}_{n \times n} V^H$$
$$AA^H = U \begin{pmatrix}\Delta_r^2 & \\ & 0\end{pmatrix}_{m \times m} U^H$$
For example:
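A small hand-worked instance of the SVD steps, checked in Python. Everything concrete here is an assumption chosen for illustration: the real $3 \times 2$ matrix `A`, the eigenpairs of $A^TA$ (entered by hand and verified rather than computed), and the extra vector `u3` that extends $u_1, u_2$ to an orthonormal basis of $R^3$. For a real matrix $A^H = A^T$, so plain transposes are used.

```python
import math


def matvec(a, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def normalize(x):
    norm = math.sqrt(dot(x, x))
    return [xi / norm for xi in x]


A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cols = [list(c) for c in zip(*A)]
AtA = [[dot(ci, cj) for cj in cols] for ci in cols]  # A^T A

# Step 1: eigenpairs of A^T A, entered by hand and checked below.
eigenpairs = [(3.0, [1.0, 1.0]), (1.0, [1.0, -1.0])]
for lam, v in eigenpairs:
    assert matvec(AtA, v) == [lam * vi for vi in v]

# Step 2: orthonormalize the eigenvectors (here: just normalize them).
sigmas = [math.sqrt(lam) for lam, _ in eigenpairs]
V_cols = [normalize(v) for _, v in eigenpairs]

# Step 3: u_i = A v_i / sigma_i, then extend to an orthonormal basis.
U_cols = [[x / s for x in matvec(A, v)] for s, v in zip(sigmas, V_cols)]
u3 = normalize([1.0, 1.0, -1.0])  # hand-picked vector orthogonal to u1, u2
U_cols.append(u3)

# Rebuild A from the rank-one terms sigma_i u_i v_i^T.
rebuilt = [[sum(s * u[i] * v[j] for s, u, v in zip(sigmas, U_cols, V_cols))
            for j in range(2)] for i in range(3)]
max_err = max(abs(rebuilt[i][j] - A[i][j]) for i in range(3) for j in range(2))
print(sigmas, max_err)
```

The vector `u3` does not enter the rank-one sum because it is paired with the zero block of $\Sigma$; it is only needed to make $U$ a full $3 \times 3$ orthogonal matrix.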