Singular Value Decomposition of a Matrix
Let $A \in \mathbb{R}^{m\times n}$ with $\mathrm{rank}(A)=r$. Then there exist an orthogonal matrix $U \in \mathbb{R}^{m\times m}$ and an orthogonal matrix $V \in \mathbb{R}^{n\times n}$ such that

$$A=U\begin{bmatrix} \Sigma & O \\ O &O \end{bmatrix}V^T,\qquad \Sigma=\mathrm{diag}(\sigma_1,\cdots,\sigma_r),$$

where $\sigma_1\geq\cdots\geq\sigma_r>0$ are the nonzero singular values of $A$.
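The statement can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrix `A` and the threshold `1e-10` are illustrative choices and do not come from the text above.

```python
import numpy as np

# Hypothetical 3x2 example of rank 1 (m = 3, n = 2, r = 1).
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])

# full_matrices=True returns U (m x m) and V^T (n x n);
# s holds the min(m, n) singular values in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Numerical rank: count singular values above an (assumed) threshold.
r = int(np.sum(s > 1e-10))

# Build the m x n block matrix [[Sigma, O], [O, O]].
S = np.zeros(A.shape)
S[:r, :r] = np.diag(s[:r])

# A should equal U S V^T up to round-off.
reconstruction_ok = np.allclose(A, U @ S @ Vt)
```

Here `U` and `Vt.T` play the roles of $U$ and $V$, and `S` is the block matrix containing $\Sigma$.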
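Proof (stated over the real field; the complex case is analogous, with the transpose replaced by the conjugate transpose):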
Since $A^TA$ is symmetric positive semidefinite, its eigenvalues are real and nonnegative. Order them as

$$\lambda_1\geq \lambda_2 \geq\cdots \geq\lambda_r >\lambda_{r+1}=\cdots=\lambda_n=0,$$

and let $x_1,x_2,\cdots,x_r,x_{r+1},\cdots,x_n$ be corresponding eigenvectors, chosen to be orthonormal. Setting $V=(x_1,x_2,\cdots,x_n)$, which is then an orthogonal matrix, we have

$$A^TAV=V\,\mathrm{diag}(\lambda_1,\cdots,\lambda_n)=V\begin{bmatrix} \Sigma^2 &O \\ O & O \end{bmatrix},$$
where $\Sigma=\mathrm{diag}(\sigma_1,\cdots,\sigma_r)$ and $\sigma_i=\sqrt{\lambda_i}$ $(i=1,\cdots,r)$ are the nonzero singular values of $A$. Setting $V_1=(x_1,\cdots,x_r)$ and $V_2=(x_{r+1},\cdots,x_{n})$, we get

$$A^TA[V_1 \vdots V_2]=[V_1 \vdots V_2]\begin{bmatrix} \Sigma^2 &O \\ O & O \end{bmatrix}=[V_1 \Sigma^2 \vdots O].$$
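This step of the proof can be sketched in NumPy as follows. The example matrix `A` and the threshold `1e-10` used to decide which eigenvalues count as zero are assumptions made for illustration. Note that `np.linalg.eigh` returns eigenvalues in ascending order, so they are reordered to match the descending convention used here.

```python
import numpy as np

# Hypothetical 3x2 example of rank 1.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])

G = A.T @ A  # the symmetric positive semidefinite matrix A^T A

# eigh is intended for symmetric matrices and returns orthonormal eigenvectors.
lam, V = np.linalg.eigh(G)

# Reorder so that lambda_1 >= lambda_2 >= ... >= lambda_n.
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]

# r = number of eigenvalues treated as positive (assumed threshold).
r = int(np.sum(lam > 1e-10))

# Partition V = [V1 | V2] and form Sigma = diag(sqrt(lambda_1..lambda_r)).
V1, V2 = V[:, :r], V[:, r:]
Sigma = np.diag(np.sqrt(lam[:r]))
```

With these names, `G @ V1` should agree with `V1 @ Sigma @ Sigma`, and `G @ V2` should be zero up to round-off.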
Therefore

$$A^TAV_1 = V_1 \Sigma^2,\qquad A^TAV_2=O.$$

Left-multiplying the first equation by $V_1^T$ and using $V_1^TV_1=I_r$ gives $V_1^TA^TAV_1=\Sigma^2$. Left-multiplying the second by $V_2^T$ gives $(AV_2)^T(AV_2)=O$. Thus

$$(\Sigma^{-1})^T V_1^TA^TAV_1\Sigma^{-1}=I_r,\qquad AV_2=O.$$
Let $U_1=AV_1\Sigma^{-1}$; then $U_1^TU_1=I_r$.
Hence the columns of $U_1\in\mathbb{R}^{m\times r}$ are orthonormal. Write $U_1=(u_1,\cdots,u_r)$ and extend it by $U_2=(u_{r+1},\cdots,u_{m})$ so that the columns of $U=[U_1 \vdots U_2]$ form an orthonormal basis of $\mathbb{R}^{m}$ (of $\mathbb{C}^{m}$ in the complex case). In particular, $U_2^TU_1=O$.
From the above, using $AV_1=U_1\Sigma$ and $AV_2=O$, we obtain

$$U^TAV= \begin{bmatrix} U_1^T\\ U_2^T \end{bmatrix} [AV_1\vdots AV_2]=\begin{bmatrix} U_1^TAV_1 &U_1^TAV_2 \\ U_2^TAV_1 & U_2^TAV_2 \end{bmatrix}=\begin{bmatrix} \Sigma & O \\ O &O \end{bmatrix}.$$
Since $U$ and $V$ are orthogonal, it follows that

$$A=U\begin{bmatrix} \Sigma & O \\ O &O \end{bmatrix}V^T.$$
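The whole construction in the proof can be followed step by step in code. The sketch below is illustrative only: the function name `svd_via_gram`, the threshold `tol`, and the use of a complete QR factorization to supply the columns of $U_2$ are assumptions, not part of the proof. It also assumes $A \neq O$. Forming $A^TA$ explicitly is numerically less accurate than a library SVD routine, so this is meant for understanding rather than for practical use.

```python
import numpy as np

def svd_via_gram(A, tol=1e-10):
    """Build U, S, V with A = U S V^T by following the proof (hypothetical helper)."""
    m, n = A.shape

    # Eigen-decomposition of A^T A, eigenvalues sorted in descending order.
    lam, V = np.linalg.eigh(A.T @ A)
    order = np.argsort(lam)[::-1]
    lam, V = lam[order], V[:, order]

    # r = number of eigenvalues treated as positive; sigma_i = sqrt(lambda_i).
    r = int(np.sum(lam > tol))
    sigma = np.sqrt(lam[:r])
    V1 = V[:, :r]

    # U1 = A V1 Sigma^{-1}: dividing column j by sigma[j] applies Sigma^{-1} on the right.
    U1 = (A @ V1) / sigma

    # Complete U1 to an orthonormal basis of R^m: in a complete QR factorization
    # of U1, the last m - r columns of Q are orthonormal and orthogonal to U1.
    Q, _ = np.linalg.qr(U1, mode="complete")
    U2 = Q[:, r:]
    U = np.hstack([U1, U2])

    # m x n block matrix [[Sigma, O], [O, O]].
    S = np.zeros((m, n))
    S[:r, :r] = np.diag(sigma)
    return U, S, V

# Usage on a hypothetical rank-1 matrix.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
U, S, V = svd_via_gram(A)
```

For this input, `U.T @ A @ V` should match `S` and `U @ S @ V.T` should reproduce `A`, both up to round-off.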