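Unitary matrices
If $U^HU=UU^H=I$, then $U$ is called a unitary matrix.
(Here $H$ denotes the conjugate transpose: take the complex conjugate and then transpose, or transpose first and then take the complex conjugate.)
A unitary matrix is the complex counterpart of a real orthogonal matrix.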
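As a quick numerical illustration of the definition, the following sketch checks the condition $U^HU=UU^H=I$ with NumPy. The helper name `is_unitary` and the example matrix are assumptions made for this illustration only.

```python
import numpy as np

# Hypothetical example matrix, chosen only for illustration.
U = np.array([[1, 1j],
              [1j, 1]]) / np.sqrt(2)

def is_unitary(M, tol=1e-12):
    """Return True if M is square and M^H M = M M^H = I within tol."""
    M = np.asarray(M)
    if M.ndim != 2 or M.shape[0] != M.shape[1]:
        return False
    identity = np.eye(M.shape[0])
    MH = M.conj().T  # conjugate transpose
    return bool(np.allclose(MH @ M, identity, atol=tol)
                and np.allclose(M @ MH, identity, atol=tol))

print(is_unitary(U))                           # unitary example
print(is_unitary(np.array([[1, 1], [0, 1]])))  # invertible but not unitary
```

For a real matrix the same check reduces to the orthogonality condition $Q^TQ=I$.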
Proposition 1
Let $A\in\mathbb{C}^{m\times n}$. Then:
(1) The eigenvalues of $A^HA$ and of $AA^H$ are all nonnegative real numbers.
(2) $A^HA$ and $AA^H$ have the same nonzero eigenvalues, with the same algebraic multiplicities.
Proof:
(1) Since $(A^HA)^H=A^HA$, the matrix $A^HA$ is Hermitian, so its eigenvalues are real. Let $0\neq x\in\mathbb{C}^n$ be an eigenvector of $A^HA$ for the eigenvalue $\lambda$. Then
$$x^HA^HAx=(Ax)^H(Ax)\ge 0,$$
so $A^HA$ is positive semidefinite. Because $x^HA^HAx=\lambda x^Hx$ and $x^Hx>0$, it follows that
$$\lambda\ge 0.$$
Applying the same argument to $A^H$ in place of $A$ shows that the eigenvalues of $AA^H$ are nonnegative as well.
(2) Let the eigenvalues of $A^HA$ be
$$\lambda_1\ge \lambda_2 \ge \cdots \ge \lambda_r>\lambda_{r+1}=\cdots =\lambda_n=0,$$
and let the eigenvalues of $AA^H$ be
$$\mu_1\ge \mu_2\ge \cdots \ge \mu_s >\mu_{s+1}=\cdots =\mu_m=0.$$
Let $0\neq x_i\in \mathbb{C}^n$ $(i=1,2,\cdots,r)$ be eigenvectors of $A^HA$ for the nonzero eigenvalues $\lambda_i$ $(i=1,2,\cdots,r)$, that is,
$$A^HAx_i=\lambda_i x_i\quad(i=1,2,\cdots,r).$$
Multiplying on the left by $A$ gives
$$(AA^H)(Ax_i)=\lambda_i (Ax_i)\quad(i=1,2,\cdots,r).$$
Moreover $Ax_i\neq 0$, because otherwise $\lambda_ix_i=A^HAx_i=0$, which contradicts $\lambda_i\neq 0$ and $x_i\neq 0$. Hence each $\lambda_i$ is also an eigenvalue of $AA^H$. By the same argument, every nonzero eigenvalue of $AA^H$ is also a nonzero eigenvalue of $A^HA$.
Next we show that the algebraic multiplicities agree.
Let $y_1,y_2,\cdots,y_p$ be a maximal set of linearly independent eigenvectors of $A^HA$ for an eigenvalue $\lambda\neq 0$. Since $A^HA$ is a simple matrix (it is diagonalizable, i.e. the geometric multiplicity of each eigenvalue equals its algebraic multiplicity), $p$ is the algebraic multiplicity of $\lambda$. As shown above, $Ay_i$ $(i=1,2,\cdots,p)$ are eigenvectors of $AA^H$ for $\lambda$.
Let $k=(k_1,k_2,\cdots,k_p)^T$ satisfy $k_1Ay_1+\cdots+k_pAy_p=0$. Then
$$\begin{aligned} k_1 A y_1+\cdots+k_p A y_p&=0\\ A(y_1,y_2,\cdots,y_p)k&=0\\ A^HA(y_1,y_2,\cdots,y_p)k&=0\\ \lambda(y_1,y_2,\cdots,y_p)k&=0\\ (y_1,y_2,\cdots,y_p)k&=0\\ k&=0 \end{aligned}$$
The fourth line uses $A^HAy_i=\lambda y_i$, the fifth uses $\lambda\neq 0$, and the last uses the linear independence of $y_1,\cdots,y_p$.
Hence $Ay_1,\cdots,Ay_p$ are linearly independent, so $\lambda$ is an eigenvalue of $AA^H$ of multiplicity at least $p$. Exchanging the roles of $A$ and $A^H$ gives the reverse inequality, so $\lambda$ is a nonzero eigenvalue of $AA^H$ of multiplicity exactly $p$.
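Proposition 1 can be checked numerically. The sketch below is only an illustration under assumed data: a random complex $3\times 5$ matrix with a fixed seed, and a tolerance `tol` chosen by hand to separate zero from nonzero eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, illustrative data only
m, n = 3, 5
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

# eigvalsh is intended for Hermitian matrices and returns real
# eigenvalues in ascending order.
eig_AhA = np.linalg.eigvalsh(A.conj().T @ A)   # n eigenvalues
eig_AAh = np.linalg.eigvalsh(A @ A.conj().T)   # m eigenvalues

tol = 1e-10
# (1) all eigenvalues are nonnegative up to rounding error
all_nonnegative = bool(eig_AhA.min() > -tol and eig_AAh.min() > -tol)

# (2) the nonzero eigenvalues coincide, multiplicities included
nonzero_AhA = eig_AhA[eig_AhA > tol]
nonzero_AAh = eig_AAh[eig_AAh > tol]
same_nonzero = bool(len(nonzero_AhA) == len(nonzero_AAh)
                    and np.allclose(nonzero_AhA, nonzero_AAh))

print(all_nonnegative, same_nonzero)
```

The larger matrix $A^HA$ simply carries $n-m$ additional zero eigenvalues here, which the threshold filters out.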
Singular values
Let $A\in \mathbb{C}_r^{m\times n}$ (an $m\times n$ complex matrix of rank $r$), and let the eigenvalues of $A^HA$ be
$$\lambda_1\ge \lambda_2 \ge \cdots \ge \lambda_r >\lambda_{r+1}=\cdots =\lambda_n=0.$$
Then $\sigma_i=\sqrt{\lambda_i}$ $(i=1,2,\cdots,r)$ are called the positive singular values of $A$, or simply the singular values of $A$. (By Proposition 1, $AA^H$ has the same nonzero eigenvalues, so it can be used instead.)
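The definition translates directly into a short computation. The following is a minimal sketch, assuming a small hand-picked matrix that does not come from the text; it computes the singular values from the eigenvalues of $A^HA$ and cross-checks them against `numpy.linalg.svd`.

```python
import numpy as np

# Illustrative 3x2 matrix (not from the text).
A = np.array([[3.0, 0.0],
              [0.0, 4.0],
              [0.0, 0.0]])

# Eigenvalues of the Hermitian matrix A^H A, sorted in descending order.
lam = np.linalg.eigvalsh(A.conj().T @ A)[::-1]

# Clip tiny negative rounding errors before taking square roots,
# then keep only the positive values, as in the definition.
sigma = np.sqrt(np.clip(lam, 0.0, None))
sigma = sigma[sigma > 1e-12]
print(sigma)

# Cross-check against NumPy's own singular values.
matches_numpy = bool(np.allclose(sigma, np.linalg.svd(A, compute_uv=False)))
print(matches_numpy)
```

In practice, forming $A^HA$ squares the condition number, so library routines such as `numpy.linalg.svd` are usually preferred for anything beyond small examples.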
Unitary equivalence
Let $A,B\in \mathbb{C}^{m\times n}$. If there exist a unitary matrix $U$ of order $m$ and a unitary matrix $V$ of order $n$ such that
$$B=UAV,$$
then $A$ and $B$ are said to be unitarily equivalent.
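Theorem 1
If $A$ and $B$ are unitarily equivalent, then $A$ and $B$ have the same singular values.
Proof:
Since $B=UAV$ with $U$ and $V$ unitary,
$$B^HB=V^HA^HU^HUAV=V^HA^HAV.$$
Because $V^H=V^{-1}$, the matrices $A^HA$ and $B^HB$ are unitarily similar (the complex counterpart of orthogonal similarity), so they have the same eigenvalues.
Therefore $A$ and $B$ have the same singular values.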
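Theorem 1 is easy to observe numerically. The sketch below is an illustration under assumptions: the helper `random_unitary` is a hypothetical name, and it produces a unitary matrix as the Q factor of a QR factorization of a random complex matrix.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, illustrative data only

def random_unitary(k, rng):
    """Unitary k x k matrix, taken as the Q factor of a random complex matrix."""
    Z = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
    Q, _ = np.linalg.qr(Z)
    return Q

m, n = 4, 3
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
U = random_unitary(m, rng)
V = random_unitary(n, rng)
B = U @ A @ V                      # B is unitarily equivalent to A

sv_A = np.linalg.svd(A, compute_uv=False)
sv_B = np.linalg.svd(B, compute_uv=False)
print(np.allclose(sv_A, sv_B))
```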
Singular value decomposition
Statement
Let $A\in \mathbb{C}_r^{m\times n}$. Then there exist a unitary matrix $U$ of order $m$ and a unitary matrix $V$ of order $n$ such that
$$U^H A V=\begin{pmatrix} \Delta& 0\\ 0&0 \end{pmatrix}$$
or, equivalently,
$$A =U\begin{pmatrix} \Delta& 0\\ 0&0 \end{pmatrix}V^H,$$
where $\Delta=\mathrm{diag}(\sigma_1,\cdots,\sigma_r)$, the $\lambda_i$ are the nonzero eigenvalues of $AA^H$, $\sigma_i=\sqrt{\lambda_i}$ $(i=1,2,\cdots,r)$, and $\sigma_1,\cdots,\sigma_r$ are all the singular values of $A$.
Existence
Proof:
$AA^H$ is Hermitian positive semidefinite of rank $r$, so there exists a unitary matrix $U$ such that
$$U^HAA^HU= \begin{pmatrix} \Delta \Delta^H & 0\\ 0 & 0 \end{pmatrix}=\mathrm{diag}(\sigma_1^2,\sigma_2^2,\cdots, \sigma_r^2,0,\cdots,0).$$
Write $U=(x_1,\cdots,x_r,x_{r+1},\cdots, x_m)=(U_1,U_2)$, where
$$U_1=(x_1,\cdots,x_r),\quad U_2=(x_{r+1},\cdots, x_m).$$
Then
$$U^HAA^HU=(U_1,U_2)^HAA^H(U_1,U_2)=\mathrm{diag}(\sigma_1^2,\sigma_2^2,\cdots, \sigma_r^2,0,\cdots,0).$$
Comparing the diagonal blocks on both sides gives
$$U_1^HAA^HU_1=\mathrm{diag}(\sigma_1^2,\sigma_2^2,\cdots, \sigma_r^2)=\Delta^2=\Delta\Delta^H,$$
$$U_2^HAA^HU_2=0.$$
From the second identity,
$$U_2^HAA^HU_2=0 \Rightarrow (A^HU_2)^H(A^HU_2)=0\Rightarrow A^HU_2=0.$$
Let
$$V_1=A^H U_1(\Delta^{-1})^H.$$
Then
$$\begin{aligned} V_1^H V_1&=\Delta^{-1}U_1^H AA^H U_1(\Delta^{-1})^H\\ &=\Delta^{-1}\Delta\Delta^H(\Delta^{-1})^H\\ &=I_r, \end{aligned}$$
so $V_1\in U^{n\times r}$ (the set of $n\times r$ matrices with orthonormal columns).
Choose $V_2\in U^{n\times(n-r)}$ so that
$$V=(V_1,V_2)$$
is a unitary matrix of order $n$. Then $V_1^HV_2=0$, and since $V_1^H=\Delta^{-1}U_1^HA$,
$$\begin{aligned} V_1^HV_2&=0\\ \Delta^{-1}U_1^HAV_2&=0\\ U_1^HAV_2&=0 \end{aligned}$$
Hence, using $U_1^HAV_2=0$ and $U_2^HA=(A^HU_2)^H=0$,
$$\begin{aligned} U^HAV&=\begin{pmatrix} U_1^H\\ U_2^H \end{pmatrix}A\begin{pmatrix} V_1,V_2 \end{pmatrix}\\ &=\begin{pmatrix} U_1^HAV_1&U_1^HAV_2\\ U_2^HAV_1&U_2^HAV_2 \end{pmatrix}\\ &=\begin{pmatrix} U_1^HAA^H U_1(\Delta^{-1})^H&0\\ 0&0 \end{pmatrix}\\ &=\begin{pmatrix} \Delta\Delta^H(\Delta^{-1})^H&0\\ 0&0 \end{pmatrix}\\ &=\begin{pmatrix} \Delta&0\\ 0&0 \end{pmatrix} \end{aligned}$$
This factorization is called the singular value decomposition (SVD) of $A$. In effect, it shows that $A$ is unitarily equivalent to a rectangular diagonal matrix.
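The existence proof is constructive, so it can be followed step by step in code. The sketch below is an illustration, not a production algorithm: the function name `svd_from_proof`, the absolute rank tolerance `tol`, and the way $V_2$ is obtained (from the eigenvectors of the projector $I-V_1V_1^H$) are assumptions of this example; the proof itself only requires that $V_2$ complete $V_1$ to a unitary matrix.

```python
import numpy as np

def svd_from_proof(A, tol=1e-10):
    """Build U, Sigma, V with A = U @ Sigma @ V^H by following the proof."""
    A = np.asarray(A, dtype=complex)
    m, n = A.shape

    # Unitary diagonalization of A A^H, eigenvalues sorted in descending order.
    lam, U = np.linalg.eigh(A @ A.conj().T)
    order = np.argsort(lam)[::-1]
    lam, U = lam[order], U[:, order]

    r = int(np.sum(lam > tol))          # numerical rank
    sigma = np.sqrt(lam[:r])            # positive singular values
    U1 = U[:, :r]

    # V1 = A^H U1 (Delta^{-1})^H; Delta is real diagonal, so this is a
    # column-wise division by sigma.
    V1 = (A.conj().T @ U1) / sigma

    # V2: orthonormal basis of the orthogonal complement of range(V1),
    # read off from the projector I - V1 V1^H (its eigenvalues are 0 or 1).
    w, Q = np.linalg.eigh(np.eye(n) - V1 @ V1.conj().T)
    V2 = Q[:, w > 0.5]
    V = np.hstack([V1, V2])

    Sigma = np.zeros((m, n))
    Sigma[:r, :r] = np.diag(sigma)
    return U, Sigma, V

A = np.array([[1.0, 2.0, 0.0],
              [2.0, 4.0, 0.0]])        # illustrative rank-1 matrix
U, Sigma, V = svd_from_proof(A)
print(np.allclose(U @ Sigma @ V.conj().T, A))
```

Because $V_1$ is derived from $U_1$ rather than computed independently, the columns of $U$ and $V$ are paired correctly by construction.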
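Computation
$$A =U\begin{pmatrix} \Delta& 0\\ 0&0 \end{pmatrix}V^H=U\Sigma V^H$$
From this,
$$AA^H=U\Sigma\Sigma^H U^H,\qquad A^HA =V\Sigma^H\Sigma V^H.$$
(For a rectangular $\Sigma$ these products are often abbreviated as $\Sigma^2$; both are diagonal, with entries $\sigma_1^2,\cdots,\sigma_r^2$ followed by zeros.)
(1) Compute the eigenvalues of $AA^H$ or of $A^HA$.
(2) Find orthonormal eigenvectors of $AA^H$ for its eigenvalues and use them as the columns of $U$.
(3) The positive eigenvalues of $A^HA$ are exactly the positive eigenvalues of $AA^H$; pad with zeros according to the dimension, then find the corresponding orthonormal eigenvectors and use them as the columns of $V$.
(Alternatively, obtain the eigenvectors for the positive eigenvalues from $V_1=A^HU_1(\Delta^{-1})^H$ and complete them to an orthonormal basis. This route pairs the columns of $U$ and $V$ consistently; eigenvectors chosen independently for $U$ and $V$ are determined only up to a scalar factor of modulus 1 and may need adjusting before $A=U\Sigma V^H$ holds.)
(4) The singular values are the square roots of the positive eigenvalues of $AA^H$.
(When computing the eigenvalues, use whichever of $AA^H$ and $A^HA$ has the smaller order.)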
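One point in step (3) deserves a small experiment: eigenvectors of $AA^H$ and $A^HA$ that are chosen independently do not automatically give $A=U\Sigma V^H$. The sketch below uses a deliberately simple real diagonal matrix (an assumption of this example, so the plain transpose `.T` plays the role of the conjugate transpose) to show the mismatch and how the formula $V_1=A^HU_1(\Delta^{-1})^H$ repairs it.

```python
import numpy as np

A = np.diag([2.0, 1.0])               # illustrative matrix, singular values 2 and 1
Sigma = np.diag([2.0, 1.0])

U = np.eye(2)                         # orthonormal eigenvectors of A A^H
V_bad = np.diag([1.0, -1.0])          # also orthonormal eigenvectors of A^H A

# Both choices diagonalize their matrices correctly ...
u_ok = bool(np.allclose(U.T @ (A @ A.T) @ U, Sigma**2))
v_ok = bool(np.allclose(V_bad.T @ (A.T @ A) @ V_bad, Sigma**2))
# ... but the independently chosen pair does not reproduce A.
bad_pair_reproduces_A = bool(np.allclose(U @ Sigma @ V_bad.T, A))

# Deriving V from U via V1 = A^H U1 Delta^{-1} fixes the pairing.
V_good = A.T @ U @ np.linalg.inv(Sigma)
good_pair_reproduces_A = bool(np.allclose(U @ Sigma @ V_good.T, A))

print(u_ok, v_ok, bad_pair_reproduces_A, good_pair_reproduces_A)
```

Here the second column of `V_bad` has the wrong sign, so $U\Sigma V^H$ evaluates to $\mathrm{diag}(2,-1)$ instead of $A$.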
Example
Find the singular value decomposition of
$$A=\begin{pmatrix} 1&0&1\\ 0&1&-1 \end{pmatrix}$$
Solution:
(1) $AA^H$ has order 2 and $A^HA$ has order 3, so we compute the eigenvalues of $AA^H$:
$$AA^H= \begin{pmatrix} 2&-1\\ -1&2 \end{pmatrix}$$
Its eigenvalues are $\lambda_1=1,\lambda_2=3$.
(2) The corresponding orthonormal eigenvectors of $AA^H$ are
$$\beta_1=\frac{1}{\sqrt{2}}\begin{pmatrix} 1\\ 1 \end{pmatrix},\quad \beta_2=\frac{1}{\sqrt{2}}\begin{pmatrix} 1\\ -1 \end{pmatrix}$$
(3) The positive eigenvalues of $A^HA$ are $1,3$. Since $A^HA$ has order $3$, its remaining eigenvalue is $0$. The corresponding orthonormal eigenvectors, for the eigenvalues $1,3,0$ respectively, are
$$\alpha_1=\frac{1}{\sqrt{2}}\begin{pmatrix} 1\\ 1\\ 0 \end{pmatrix},\quad \alpha_2=\frac{1}{\sqrt{6}}\begin{pmatrix} 1\\ -1\\ 2 \end{pmatrix},\quad \alpha_3=\frac{1}{\sqrt{3}}\begin{pmatrix} 1\\ -1\\ -1 \end{pmatrix}$$
(4) The singular values are $\sqrt{1},\sqrt{3}$, and
$$A=UDV^H=\begin{pmatrix} \frac{1}{\sqrt{2}}&\frac{1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}}&-\frac{1}{\sqrt{2}} \end{pmatrix} \begin{pmatrix} 1&0&0\\ 0&\sqrt{3}&0 \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{2}}&\frac{1}{\sqrt{2}}&0\\ \frac{1}{\sqrt{6}}&-\frac{1}{\sqrt{6}}&\frac{2}{\sqrt{6}}\\ \frac{1}{\sqrt{3}}&-\frac{1}{\sqrt{3}}&-\frac{1}{\sqrt{3}} \end{pmatrix}$$
Here the eigenvalues are listed in increasing order. To follow the ordering $\sigma_1\ge\sigma_2$ used in the definition of singular values, swap the two columns of $U$, the two diagonal entries of $D$, and the first two rows of $V^H$; the product is unchanged.
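The factorization in the example can be verified by direct multiplication. The following sketch enters the three factors exactly as written above (all real, so `.T` serves as the conjugate transpose) and compares the product with $A$; the variable names are chosen for this illustration.

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, -1.0]])

s2, s3, s6 = np.sqrt(2.0), np.sqrt(3.0), np.sqrt(6.0)
U = np.array([[1.0, 1.0],
              [1.0, -1.0]]) / s2
D = np.array([[1.0, 0.0, 0.0],
              [0.0, s3, 0.0]])
VH = np.array([[1 / s2, 1 / s2, 0.0],
               [1 / s6, -1 / s6, 2 / s6],
               [1 / s3, -1 / s3, -1 / s3]])

print(np.allclose(U @ D @ VH, A))          # factorization reproduces A
print(np.allclose(U.T @ U, np.eye(2)))     # U is orthogonal
print(np.allclose(VH @ VH.T, np.eye(3)))   # V is orthogonal

# NumPy returns singular values in descending order: sqrt(3), 1.
print(np.allclose(np.linalg.svd(A, compute_uv=False), [s3, 1.0]))
```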
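Properties of singular values
In properties 1 and 3 below, all $n$ values $\sqrt{\lambda_i(A^HA)}$ $(i=1,\cdots,n)$ are counted as singular values, including any zeros.
1. If $A$ is square and $\lambda$ is any eigenvalue of $A$, then
$$\sigma_{max}(A)\ge \left|\lambda\right| \ge \sigma_{min}(A)$$
2.
$$tr(A^HA)=\sum_{i=1}^{r} \sigma_i^2$$
3. $A$ has full column rank $\Leftrightarrow$ all of its singular values are nonzero.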
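The three properties can be checked on a small case. This is a minimal sketch with an assumed $2\times 2$ example matrix; `numpy.linalg.svd` returns all $n$ singular values here, zeros included, in descending order.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])            # illustrative square matrix

sigma = np.linalg.svd(A, compute_uv=False)   # all n singular values, descending
eigs = np.linalg.eigvals(A)
tol = 1e-12

# Property 1: every eigenvalue modulus lies between sigma_min and sigma_max.
prop1 = bool(np.all(np.abs(eigs) <= sigma[0] + tol)
             and np.all(np.abs(eigs) >= sigma[-1] - tol))

# Property 2: tr(A^H A) equals the sum of squared singular values
# (here 1 + 4 + 9 + 16 = 30).
prop2 = bool(np.isclose(np.trace(A.conj().T @ A), np.sum(sigma**2)))

# Property 3: full column rank if and only if no singular value is zero.
full_column_rank = bool(np.linalg.matrix_rank(A) == A.shape[1])
all_nonzero = bool(np.all(sigma > tol))
prop3 = full_column_rank == all_nonzero

print(prop1, prop2, prop3)
```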
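Polar decomposition
Let $A\in \mathbb{C}^{n\times n}$. Then there exist a unitary matrix $U$ and a unique positive semidefinite matrix $P$ such that
$$A=PU.$$
This factorization is called the polar decomposition of $A$; the matrices $P$ and $U$ are called the Hermitian factor and the unitary factor of $A$, respectively.
To construct it, take a singular value decomposition
$$A=U_1DV^H$$
and set
$$P=U_1DU_1^H,\quad U=U_1V^H.$$
Then $P$ is Hermitian positive semidefinite, $U$ is unitary as a product of unitary matrices, and $PU=U_1DU_1^HU_1V^H=U_1DV^H=A$.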
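The construction $P=U_1DU_1^H$, $U=U_1V^H$ maps directly onto the output of `numpy.linalg.svd`. The sketch below assumes a random complex $3\times 3$ matrix with a fixed seed and checks the defining properties of the two factors.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed, illustrative data only
n = 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

U1, d, VH = np.linalg.svd(A)          # A = U1 @ diag(d) @ VH
P = U1 @ np.diag(d) @ U1.conj().T     # Hermitian factor
U = U1 @ VH                           # unitary factor

print(np.allclose(P @ U, A))                      # A = P U
print(np.allclose(P, P.conj().T))                 # P is Hermitian
print(np.linalg.eigvalsh(P).min() > -1e-12)       # P is positive semidefinite
print(np.allclose(U.conj().T @ U, np.eye(n)))     # U is unitary
```

Since $P^2=AA^H$, the Hermitian factor is the positive semidefinite square root of $AA^H$, which is why it is unique.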