"Matrix Theory" (《矩阵理论》) Dameng Course Notes - Matrix Decompositions
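Table of Contents
Chapter name and link | |
---|---|
Linear spaces and linear transformations | Linear spaces and subspaces |
Finite-dimensional linear spaces, bases, dimension | |
Linear transformations | |
Inner product spaces | |
Eigenvalues and eigenvectors | |
Special matrices | |
Matrix decompositions | |
Matrix functions | |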
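Disclaimer
This blog series records my notes for the graduate course "Matrix Theory" (《矩阵理论》) at Shanghai Jiao Tong University, taught by Prof. Deng Dameng (邓大萌). Everything here is the author's personal classroom notes, including the in-class examples and proofs. Corrections of any mistakes or inaccuracies are welcome.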
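1 Orthogonal-Triangular (QR) Decomposition
Definition: $A_{m\times n}=U_{m\times n}\cdot R_{n\times n}$, where the columns of $U$ are orthonormal (orthogonalized and normalized) and $R$ is an upper triangular matrix whose diagonal entries are greater than 0.
Condition: $A$ has full column rank.
Method: Gram-Schmidt orthogonalization. Orthonormalize the columns of $A$ in order; the resulting orthonormal vectors form $U$, and the coefficients that express each column of $A$ in terms of them form $R$.
Property: the orthogonal-triangular decomposition is unique.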
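The Gram-Schmidt method described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `gram_schmidt_qr` and the test matrix are made up for this note, and classical Gram-Schmidt is used for clarity rather than numerical robustness.

```python
import numpy as np

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt on the columns of a full-column-rank matrix A.

    Returns U (orthonormal columns) and an upper triangular R with
    positive diagonal such that A = U @ R.
    """
    A = np.asarray(A, dtype=complex)
    m, n = A.shape
    U = np.zeros((m, n), dtype=complex)
    R = np.zeros((n, n), dtype=complex)
    for j in range(n):
        v = A[:, j].copy()
        for i in range(j):
            # Coefficient of column j along the already-built unit vector U[:, i]
            R[i, j] = np.vdot(U[:, i], A[:, j])
            v -= R[i, j] * U[:, i]
        norm = np.linalg.norm(v)
        if norm < 1e-12:
            raise ValueError("A does not have full column rank")
        R[j, j] = norm          # positive diagonal entry
        U[:, j] = v / norm
    return U, R

# Hypothetical 3x2 example with full column rank
A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
U, R = gram_schmidt_qr(A)
```

The positive diagonal of `R` is what makes the factorization unique; `np.linalg.qr` does not enforce that sign convention, so its output may differ from this one by signs.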
2 Spectral Decomposition
Definition: suppose the matrix $A$ is diagonalizable by similarity:
$$P^{-1}AP=\begin{bmatrix}\lambda_1&&&\\&\lambda_2&&\\&&\ddots&\\&&&\lambda_n\end{bmatrix}\Rightarrow A=P\begin{bmatrix}\lambda_1&&&\\&\lambda_2&&\\&&\ddots&\\&&&\lambda_n\end{bmatrix}P^{-1}$$
Let $P=(\alpha_1,\alpha_2,\dots,\alpha_n)$ and let $\beta_1^T,\beta_2^T,\dots,\beta_n^T$ be the rows of $P^{-1}$, i.e. $P^{-1}=(\beta_1,\beta_2,\dots,\beta_n)^T$. Then
$$A=\lambda_1\alpha_1\beta_1^T+\lambda_2\alpha_2\beta_2^T+\dots+\lambda_n\alpha_n\beta_n^T=\sum_{j=1}^n\lambda_j\alpha_j\beta_j^T$$
Grouping equal eigenvalues and relabeling the distinct ones as $\lambda_1,\dots,\lambda_k$ with multiplicities $n_1,\dots,n_k$ ($\sum_{i=1}^kn_i=n$) gives $A=\sum_{i=1}^k\lambda_iG_i$, where $G_i$ is the sum of the $n_i$ terms $\alpha_j\beta_j^T$ belonging to $\lambda_i$ (so $G_i=\alpha_i\cdot\beta_i^T$ when every eigenvalue is simple).
Condition: $A$ is a square matrix and is diagonalizable.
Method: first solve for the eigenvalues, then find the eigenvectors to obtain $P$, and finally write the product column-by-row in the form of the definition.
Properties:
- $\sum_{i=1}^kG_i=E$
Proof: $\sum_{i=1}^kG_i=\alpha_1\beta_1^T+\alpha_2\beta_2^T+\dots+\alpha_n\beta_n^T=(\alpha_1,\alpha_2,\dots,\alpha_n)\cdot(\beta_1,\beta_2,\dots,\beta_n)^T=P\cdot P^{-1}=E$
Q.E.D.
- $G_i\cdot G_i=G_i$, and $G_i\cdot G_j=0$ for $i\neq j$
Proof: $P^{-1}\cdot P=(\beta_1,\beta_2,\dots,\beta_n)^T\cdot(\alpha_1,\alpha_2,\dots,\alpha_n)=E$
$\Rightarrow \beta_i^T\alpha_i=1,\ \beta_i^T\alpha_j=0\ (i\neq j)$
$\Rightarrow (\alpha_i\beta_i^T)^2=\alpha_i(\beta_i^T\alpha_i)\beta_i^T=\alpha_i\beta_i^T,\ (\alpha_i\beta_i^T)(\alpha_j\beta_j^T)=\alpha_i(\beta_i^T\alpha_j)\beta_j^T=0\ (i\neq j)$
$\Rightarrow G_i^2=(\alpha_{s}\beta_{s}^T+\alpha_{s+1}\beta_{s+1}^T+\dots+\alpha_{s+n_i-1}\beta_{s+n_i-1}^T)^2=G_i$ and $G_i\cdot G_j=0$ for $i\neq j$, where $\alpha_s,\dots,\alpha_{s+n_i-1}$ are the $n_i$ eigenvectors belonging to $\lambda_i$. Q.E.D.
- $r(G_i)=n_i$, where $n_i$ is the multiplicity of $\lambda_i$
Proof: since $G_i=\alpha_{s}\beta_{s}^T+\alpha_{s+1}\beta_{s+1}^T+\dots+\alpha_{s+n_i-1}\beta_{s+n_i-1}^T$, every column of $G_i$ is a linear combination of the $n_i$ vectors $\alpha_s,\alpha_{s+1},\dots,\alpha_{s+n_i-1}$.
A linearly independent set is never larger than a spanning set, so $r(G_i)\le n_i$.
Moreover $\sum_{i=1}^kG_i=E\Rightarrow\sum_{i=1}^kr(G_i)\ge r\left(\sum_{i=1}^kG_i\right)=r(E)=n=\sum_{i=1}^kn_i$
$\Rightarrow r(G_i)=n_i$ for every $i$, since no $r(G_i)$ can fall below $n_i$ without violating this inequality.
- The spectral decomposition is unique
Proof: suppose there are two spectral decompositions $A=\lambda_1G_1+\lambda_2G_2+\dots+\lambda_kG_k=\lambda_1P_1+\lambda_2P_2+\dots+\lambda_kP_k$, where the $P_i$ satisfy the same properties as the $G_i$.
Left-multiply both sides by $G_i$, $1\le i\le k$: $\lambda_iG_i=G_i(\lambda_1P_1+\lambda_2P_2+\dots+\lambda_kP_k)$
Right-multiply both sides by $P_j$, $1\le j\le k$, $j\neq i$: $\lambda_iG_iP_j=\lambda_jG_iP_j$
$\because\lambda_i\neq\lambda_j\Rightarrow G_iP_j=0,\ i\neq j$
$\Rightarrow\lambda_iG_i=G_i(\lambda_1P_1+\lambda_2P_2+\dots+\lambda_kP_k)=\lambda_iG_iP_i$
Likewise $\lambda_iP_i=(\lambda_1G_1+\lambda_2G_2+\dots+\lambda_kG_k)P_i=\lambda_iG_iP_i$
$\Rightarrow G_i=P_i$ when $\lambda_i\neq0$. If $\lambda_i=0$, use $\sum_jG_j=\sum_jP_j=E$ instead: $G_i=G_i\sum_jP_j=G_iP_i=\sum_jG_jP_i=P_i$. Hence the spectral decomposition is unique.
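- Every normal matrix has a spectral decomposition
Example: true or false: "a Gram matrix (metric matrix) always has a spectral decomposition"?
A Gram matrix is Hermitian (and positive definite) $\Rightarrow$ it is a normal matrix $\Rightarrow$ it has a spectral decomposition, because every normal matrix does.
So the statement is true: a Gram matrix always has a spectral decomposition.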
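The properties of the matrices $G_i$ can be checked numerically. The sketch below is an illustration only: the helper name `spectral_projectors`, the grouping of equal eigenvalues by rounding, and the $2\times2$ test matrix are assumptions made for this note, not part of the course material.

```python
import numpy as np

def spectral_projectors(A, decimals=8):
    """Return {eigenvalue: G_i} for a diagonalizable matrix A.

    Equal eigenvalues are grouped by rounding to `decimals` places,
    which is a simplification chosen for this sketch.
    """
    eigvals, P = np.linalg.eig(A)
    P_inv = np.linalg.inv(P)
    projectors = {}
    for j, lam in enumerate(eigvals):
        key = complex(np.round(lam, decimals))
        # alpha_j beta_j^T: j-th column of P times j-th row of P^{-1}
        term = np.outer(P[:, j], P_inv[j, :])
        projectors[key] = projectors.get(key, 0) + term
    return projectors

# Hypothetical example: symmetric matrix with eigenvalues 1 and 3
A = np.array([[2.0, 1.0], [1.0, 2.0]])
G = spectral_projectors(A)
```

Each value in `G` is one $G_i$; the sum of all of them should be the identity, and $\sum_i\lambda_iG_i$ should reproduce $A$.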
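3 Triangular (LU) Decomposition
Definition: factor a square matrix $A$ into a unit lower triangular matrix (diagonal entries equal to 1) and an upper triangular matrix, $A=LR$.
Condition: $A$ is an $n\times n$ square matrix with $r(A)=r$ whose first $r$ leading principal minors are nonzero.
Method: Gaussian elimination (row operations that reduce $A$ to upper triangular form), i.e. left-multiplication by elementary row-operation matrices.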
Example: $A=\begin{bmatrix}1&2&4\\2&3&5\\3&5&9\end{bmatrix}$; find the triangular decomposition of $A$.
$$\begin{bmatrix}1&&\\-2&1&\\-3&&1\end{bmatrix}A=\begin{bmatrix}1&2&4\\0&-1&-3\\0&-1&-3\end{bmatrix}$$
$$\Rightarrow\begin{bmatrix}1&&\\&1&\\&-1&1\end{bmatrix}\begin{bmatrix}1&&\\-2&1&\\-3&&1\end{bmatrix}A=\begin{bmatrix}1&&\\-2&1&\\-1&-1&1\end{bmatrix}A=\begin{bmatrix}1&2&4\\0&-1&-3\\0&0&0\end{bmatrix}$$
$$\Rightarrow A=\begin{bmatrix}1&&\\-2&1&\\-1&-1&1\end{bmatrix}^{-1}\begin{bmatrix}1&2&4\\0&-1&-3\\0&0&0\end{bmatrix}=\begin{bmatrix}1&&\\2&1&\\3&1&1\end{bmatrix}\begin{bmatrix}1&2&4\\0&-1&-3\\0&0&0\end{bmatrix}$$
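Property: the triangular decomposition is unique (for invertible $A$; when $r<n$, $L$ need not be uniquely determined in general).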
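The elimination in the example can be reproduced with a short routine. This is a minimal sketch without pivoting, assuming the leading-minor condition above holds; the function name `lu_no_pivot` is hypothetical.

```python
import numpy as np

def lu_no_pivot(A):
    """Gaussian elimination without row exchanges: A = L @ R,
    with L unit lower triangular and R upper triangular."""
    R = np.array(A, dtype=float)
    n = R.shape[0]
    L = np.eye(n)
    for k in range(n - 1):
        for i in range(k + 1, n):
            if np.isclose(R[k, k], 0.0):
                # A zero pivot is only acceptable if there is nothing to eliminate
                if not np.isclose(R[i, k], 0.0):
                    raise ValueError("zero pivot: leading principal minor condition fails")
                continue
            L[i, k] = R[i, k] / R[k, k]      # multiplier stored in L
            R[i, :] -= L[i, k] * R[k, :]     # row_i <- row_i - multiplier * row_k
    return L, R

# The matrix from the worked example
A = np.array([[1, 2, 4], [2, 3, 5], [3, 5, 9]])
L, R = lu_no_pivot(A)
```

The multipliers stored in `L` are exactly the negated entries of the elementary row-operation matrices used in the hand computation, which is why no explicit inverse is needed.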
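$\star\star\star$ A special triangular-type decomposition: the Cholesky decomposition
Definition: if $A$ is a positive definite matrix, then there exists an upper triangular matrix $R$ such that $A=R^*R$; this factorization is called the Cholesky decomposition (note: positive definiteness of $A$ implicitly requires $A$ to be Hermitian).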
Proof: $A$ is positive definite $\Rightarrow A$ is Hermitian $\Rightarrow A$ is unitarily diagonalizable.
So there exists a unitary matrix $U$ such that $U^*AU=\mathrm{diag}(\lambda_1,\lambda_2,\dots,\lambda_n)\Rightarrow A=U\,\mathrm{diag}(\lambda_1,\lambda_2,\dots,\lambda_n)\,U^*$
Since $A$ is positive definite, $\lambda_i>0$, $1\le i\le n$
$\therefore A=U\,\mathrm{diag}(\sqrt{\lambda_1},\sqrt{\lambda_2},\dots,\sqrt{\lambda_n})\,\mathrm{diag}(\sqrt{\lambda_1},\sqrt{\lambda_2},\dots,\sqrt{\lambda_n})\,U^*$
Let $B=\mathrm{diag}(\sqrt{\lambda_1},\sqrt{\lambda_2},\dots,\sqrt{\lambda_n})\,U^*\Rightarrow B^*=U\,\mathrm{diag}(\sqrt{\lambda_1},\sqrt{\lambda_2},\dots,\sqrt{\lambda_n})$
$B$ is invertible, hence has full column rank, so it has an orthogonal-triangular decomposition: $B=QR$, $B^*=R^*Q^*$, $Q^*Q=E$
$\Rightarrow A=B^*B=R^*Q^*QR=R^*R$. Q.E.D.
Method: apply matching row and column operations to reduce $A$ to a diagonal matrix $\Rightarrow$ invert the operation matrices $\Rightarrow$ split the diagonal matrix into two square-root factors $\Rightarrow$ merge the factors.
Example: $A=\begin{bmatrix}1&2\\2&6\end{bmatrix}$; find the Cholesky decomposition of $A$.
Applying row and column operations to $A$:
$$\begin{bmatrix}1&\\-2&1\end{bmatrix}A\begin{bmatrix}1&-2\\&1\end{bmatrix}=\begin{bmatrix}1&\\&2\end{bmatrix}=\begin{bmatrix}1&\\&\sqrt2\end{bmatrix}\begin{bmatrix}1&\\&\sqrt2\end{bmatrix}$$
$$\Rightarrow A=\begin{bmatrix}1&\\-2&1\end{bmatrix}^{-1}\begin{bmatrix}1&\\&\sqrt2\end{bmatrix}\begin{bmatrix}1&\\&\sqrt2\end{bmatrix}\begin{bmatrix}1&-2\\&1\end{bmatrix}^{-1}=\begin{bmatrix}1&\\2&1\end{bmatrix}\begin{bmatrix}1&\\&\sqrt2\end{bmatrix}\begin{bmatrix}1&\\&\sqrt2\end{bmatrix}\begin{bmatrix}1&2\\&1\end{bmatrix}=\begin{bmatrix}1&\\2&\sqrt2\end{bmatrix}\begin{bmatrix}1&2\\&\sqrt2\end{bmatrix}$$
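The existence proof (unitary diagonalization, then an orthogonal-triangular decomposition of $B$) can be followed step by step in NumPy. This is a sketch for illustration, not an efficient algorithm; the function name `cholesky_via_qr` is an assumption, and the sign correction is needed because `np.linalg.qr` does not guarantee a positive diagonal.

```python
import numpy as np

def cholesky_via_qr(A):
    """Follow the proof: A = U diag(lam) U^*, B = diag(sqrt(lam)) U^*,
    B = QR, so A = B^* B = R^* R with R upper triangular."""
    lam, U = np.linalg.eigh(A)               # Hermitian eigendecomposition
    if np.any(lam <= 0):
        raise ValueError("A is not positive definite")
    B = np.diag(np.sqrt(lam)) @ U.conj().T   # A = B^* B
    Q, R = np.linalg.qr(B)
    # Rescale rows by unit-modulus factors so that diag(R) becomes positive;
    # this keeps R^* R unchanged.
    d = np.diag(R)
    S = np.diag(np.conj(d) / np.abs(d))
    return S @ R

# The matrix from the worked example
A = np.array([[1.0, 2.0], [2.0, 6.0]])
R = cholesky_via_qr(A)
```

For comparison, `np.linalg.cholesky(A)` returns the lower triangular factor $L$ with $A=LL^*$, so the $R$ of these notes corresponds to `L.conj().T`.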
4 Singular Value Decomposition ($\star\star\star$ guaranteed major exam question, 15 points)
Definition: a matrix $A_{m\times n}$ with $r(A)=r$ can be factored as $A_{m\times n}=U_{m\times m}\cdot D_{m\times n}\cdot V^*_{n\times n}$, where $U$ and $V$ are unitary matrices and
$$D_{m\times n}=\begin{bmatrix}\delta_1\\&\delta_2\\&&\ddots\\&&&\delta_r\\&&&&0\\&&&&&\ddots\\&&&&&&0\end{bmatrix},\quad\delta_i>0,\ 1\le i\le r$$
The $\delta_i$ are the singular values.
$$AA^*=U_{m\times m}\cdot\mathrm{diag}(\delta_1^2,\delta_2^2,\dots,\delta_r^2,0,\dots,0)_{m\times m}\cdot U^*_{m\times m}$$
$$A^*A=V_{n\times n}\cdot\mathrm{diag}(\delta_1^2,\delta_2^2,\dots,\delta_r^2,0,\dots,0)_{n\times n}\cdot V^*_{n\times n}$$
The following results guarantee that every $A$ has such a decomposition:
- $r(A)=r(AA^*)=r(A^*A)$
Proof:
If we show $N(A)=N(A^*A)$, then $n-r(A)=n-r(A^*A)\Rightarrow r(A)=r(A^*A)$.
For any $\alpha\in N(A)$: $A\alpha=0\Rightarrow A^*A\alpha=0\Rightarrow\alpha\in N(A^*A)\Rightarrow N(A)\subseteq N(A^*A)$
For any $\beta\in N(A^*A)$: $A^*A\beta=0\Rightarrow\beta^*A^*A\beta=(A\beta,A\beta)=0\Rightarrow A\beta=0\Rightarrow\beta\in N(A)\Rightarrow N(A^*A)\subseteq N(A)$
$\Rightarrow N(A)=N(A^*A)\Rightarrow r(A)=r(A^*A)$
The same argument applied to $A^*$ shows $r(A)=r(AA^*)$
$\Rightarrow r(A)=r(AA^*)=r(A^*A)$
- $AA^*$ and $A^*A$ have the same nonzero eigenvalues, with the same multiplicities
Proof:
$(AA^*)^*=AA^*$, $(A^*A)^*=A^*A\Rightarrow AA^*$ and $A^*A$ are both Hermitian $\Rightarrow A^*A$ and $AA^*$ are unitarily diagonalizable.
Let $\lambda_i\neq0$, $1\le i\le r$, be an eigenvalue of $A^*A$ and $\alpha_i$ a corresponding eigenvector, so that $A^*A\alpha_i=\lambda_i\alpha_i$
$\Rightarrow AA^*(A\alpha_i)=\lambda_iA\alpha_i$, and $A\alpha_i\neq0$ because $A^*(A\alpha_i)=\lambda_i\alpha_i\neq0$; hence $A\alpha_i$ is an eigenvector of $AA^*$ with eigenvalue $\lambda_i$.
Since $A$ is injective on each such eigenspace, the multiplicity of $\lambda_i$ for $AA^*$ is at least its multiplicity for $A^*A$; exchanging the roles of $A$ and $A^*$ gives the reverse inequality.
$\Rightarrow AA^*$ and $A^*A$ have the same nonzero eigenvalues with the same multiplicities.
- $AA^*$ and $A^*A$ are positive semidefinite
Proof: $x^*A^*Ax=(Ax,Ax)=\|Ax\|^2\ge0\Rightarrow A^*A$ is positive semidefinite; likewise, $AA^*$ is positive semidefinite.
- The square roots of the nonzero eigenvalues of $AA^*$ (or of $A^*A$) are the singular values.
- If $\alpha_1,\alpha_2,\dots,\alpha_r$ are orthonormal eigenvectors of $AA^*$ for the nonzero eigenvalues $\lambda_1,\dots,\lambda_r$, then $\frac{A^*\alpha_1}{\sqrt{\lambda_1}},\frac{A^*\alpha_2}{\sqrt{\lambda_2}},\dots,\frac{A^*\alpha_r}{\sqrt{\lambda_r}}$ are orthonormal eigenvectors of $A^*A$.
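The results above suggest a direct way to build a (reduced) SVD from the eigen-decomposition of $AA^*$. The following is a minimal sketch under stated assumptions: the function name `svd_from_eigh`, the tolerance `tol` used to decide which eigenvalues count as nonzero, and the test matrix are all choices made for this note.

```python
import numpy as np

def svd_from_eigh(A, tol=1e-10):
    """Reduced SVD A = U_r diag(sigma) V_r^* built from eig(A A^*)."""
    A = np.asarray(A, dtype=complex)
    lam, alpha = np.linalg.eigh(A @ A.conj().T)   # real eigenvalues, ascending
    order = np.argsort(lam)[::-1]                 # sort descending
    lam, alpha = lam[order], alpha[:, order]
    keep = lam > tol                              # nonzero eigenvalues only
    sigma = np.sqrt(lam[keep])                    # singular values
    U_r = alpha[:, keep]
    # Columns A^* alpha_i / sqrt(lambda_i): orthonormal eigenvectors of A^* A
    V_r = (A.conj().T @ U_r) / sigma
    return U_r, sigma, V_r

# Hypothetical 2x3 example of rank 2
A = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
U_r, sigma, V_r = svd_from_eigh(A)
```

In practice `np.linalg.svd` is the numerically preferable route; forming $AA^*$ squares the condition number, so this construction is mainly useful for checking hand computations.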
A variant of the SVD: the polar decomposition
When $A$ is a square matrix, $A=UDV^*=(UDU^*)(UV^*)$. Here $UDU^*$ is Hermitian with eigenvalues greater than or equal to $0$, so $UDU^*$ is positive semidefinite; $UV^*$ is a product of two unitary matrices and is therefore unitary. If $A$ is invertible, then $D_{ii}>0$ and $UDU^*$ is positive definite.
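A short numerical check of this construction, assuming NumPy's `np.linalg.svd` (which returns $U$, the singular values, and $V^*$); the function name `polar_from_svd` and the test matrix are hypothetical.

```python
import numpy as np

def polar_from_svd(A):
    """Polar decomposition A = H @ W of a square matrix,
    with H = U D U^* positive semidefinite and W = U V^* unitary."""
    U, s, Vh = np.linalg.svd(A)
    H = U @ np.diag(s) @ U.conj().T
    W = U @ Vh
    return H, W

# Hypothetical invertible example, so H is positive definite
A = np.array([[1.0, 2.0], [3.0, 4.0]])
H, W = polar_from_svd(A)
```

Grouping the factors the other way, $A=(UV^*)(VDV^*)$, gives the polar decomposition with the unitary factor on the left.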
Example 1: prove that any $A_{n\times n}$ with $r(A)=r$ can be factored as the product of an idempotent matrix and an invertible matrix, i.e. $A=PQ$ with $P$ idempotent and $Q$ invertible.
Method 1:
Proof: an idempotent matrix has eigenvalues 0 or 1, so let $B=\mathrm{diag}(1,\dots,1,0,\dots,0)$, where the number of ones is $r$.
$A=UDV^*=U(BD')V^*=(UBU^*)(UD'V^*)$, where $D'=\mathrm{diag}(\delta_1,\delta_2,\dots,\delta_r,1,\dots,1)$, so that $D=BD'$.
Let $P=UBU^*$, $Q=UD'V^*\Rightarrow P^2=UBU^*UBU^*=UB^2U^*=UBU^*=P\Rightarrow P$ is idempotent.
$Q=UD'V^*$ is a product of the invertible matrices $U$, $D'$, $V^*$, so $Q$ is invertible.
$\Rightarrow A=PQ$. Q.E.D.
Method 2:
Proof: let $P'$, $Q'$ be the (invertible) row- and column-operation matrices that reduce $A$ to the form $B=\mathrm{diag}(1,\dots,1,0,\dots,0)$ with $r$ ones.
Then $P'AQ'=B\Rightarrow A=P'^{-1}BQ'^{-1}=(P'^{-1}BP')(P'^{-1}Q'^{-1})$
Let $P=P'^{-1}BP'$ and $Q=P'^{-1}Q'^{-1}$. Then $P^2=P'^{-1}B^2P'=P$, so $P$ is idempotent, $Q$ is invertible, and $A=PQ$. Q.E.D.
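Method 1 can be verified numerically. The sketch below assumes a tolerance `tol` for deciding the numerical rank; the function name `idempotent_times_invertible` and the rank-1 test matrix are made up for illustration.

```python
import numpy as np

def idempotent_times_invertible(A, tol=1e-10):
    """Factor a square matrix as A = P @ Q with P idempotent, Q invertible,
    following Method 1: P = U B U^*, Q = U D' V^*."""
    U, s, Vh = np.linalg.svd(A)
    n = A.shape[0]
    r = int(np.sum(s > tol))                            # numerical rank
    b = np.concatenate([np.ones(r), np.zeros(n - r)])   # diagonal of B
    d_prime = np.where(s > tol, s, 1.0)                 # zeros replaced by ones
    P = U @ np.diag(b) @ U.conj().T
    Q = U @ np.diag(d_prime) @ Vh
    return P, Q

# Hypothetical singular example of rank 1
A = np.array([[1.0, 2.0], [2.0, 4.0]])
P, Q = idempotent_times_invertible(A)
```

Because $U$ is unitary, the $P$ built here is in fact an orthogonal projector (Hermitian as well as idempotent), which is slightly stronger than what the exercise asks for.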
Example 2: rewrite the spectral decomposition of a normal matrix $A$ as a singular value decomposition.
Solution: the spectral decomposition of $A$ is $A=UBU^*$, where $U$ is unitary and $B=\mathrm{diag}(\lambda_1,\lambda_2,\dots,\lambda_r,0,\dots,0)$ with $\lambda_1,\dots,\lambda_r$ the nonzero eigenvalues.
Rewrite $B$ as $B=PQ$, with $P=\mathrm{diag}(|\lambda_1|,|\lambda_2|,\dots,|\lambda_r|,0,\dots,0)$ and $Q=\mathrm{diag}\left(\frac{\lambda_1}{|\lambda_1|},\frac{\lambda_2}{|\lambda_2|},\dots,\frac{\lambda_r}{|\lambda_r|},1,\dots,1\right)$
Here $P$ is positive semidefinite and $Q$ is unitary. Let $V^*=QU^*$
$\Rightarrow A=UPV^*$, which completes the rewriting.
Note: the polar decomposition can also be used for the rewriting: $A=UBU^*$, $B=PQ$, where $P$ is positive semidefinite and $Q$ is unitary…
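The rewriting in Example 2 can be tried on a small normal matrix. This sketch relies on an assumption that should be stated loudly: `np.linalg.eig` returns orthonormal eigenvectors only when the eigenvalues of the normal matrix are distinct (as in the rotation matrix used here); for repeated eigenvalues a unitary diagonalization would have to be obtained another way. The function name `normal_to_svd` is hypothetical.

```python
import numpy as np

def normal_to_svd(A, tol=1e-10):
    """Turn A = U B U^* (A normal, distinct eigenvalues assumed)
    into A = U P V^* with P = diag(|lambda_i|) and V^* = Q U^*."""
    lam, U = np.linalg.eig(A)
    abs_lam = np.abs(lam)
    safe = np.where(abs_lam > tol, abs_lam, 1.0)       # avoid division by zero
    phase = np.where(abs_lam > tol, lam / safe, 1.0)   # diagonal of Q
    P = np.diag(abs_lam)
    Vh = np.diag(phase) @ U.conj().T
    return U, P, Vh

# Hypothetical example: a rotation by 90 degrees, eigenvalues +i and -i
A = np.array([[0.0, -1.0], [1.0, 0.0]])
U, P, Vh = normal_to_svd(A)
```

The diagonal of `P` is not sorted here; to match the usual convention of decreasing singular values, the columns of `U` and rows of `Vh` would need to be permuted accordingly.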
Example 3: given $A=UD_1U^*$ and $B=VD_2V^*$, where $U,V$ are unitary and $D_1,D_2$ are positive semidefinite diagonal matrices, find the singular value decomposition of $C=\begin{bmatrix}0&A\\B&0\end{bmatrix}$.
Solution (with $E$ the $n\times n$ identity):
$$C=\begin{bmatrix}0&A\\B&0\end{bmatrix}=\begin{bmatrix}A&0\\0&B\end{bmatrix}\cdot\begin{bmatrix}0&E\\E&0\end{bmatrix}=\begin{bmatrix}U&0\\0&V\end{bmatrix}\cdot\begin{bmatrix}D_1&0\\0&D_2\end{bmatrix}\cdot\begin{bmatrix}U^*&0\\0&V^*\end{bmatrix}\begin{bmatrix}0&E\\E&0\end{bmatrix}$$
$$=\begin{bmatrix}U&0\\0&V\end{bmatrix}\cdot\begin{bmatrix}D_1&0\\0&D_2\end{bmatrix}\cdot\begin{bmatrix}0&U^*\\V^*&0\end{bmatrix}$$
Let $D_1=\mathrm{diag}(\delta_1,\delta_2,\dots,\delta_{r_1},0,\dots,0)$ and $D_2=\mathrm{diag}(\delta_{r_1+1},\delta_{r_1+2},\dots,\delta_{r_1+r_2},0,\dots,0)$. There exist row and column permutation matrices $P_{2n\times2n}$, $Q_{2n\times2n}$ such that $P\cdot\begin{bmatrix}D_1&0\\0&D_2\end{bmatrix}\cdot Q=D=\mathrm{diag}(\delta_1,\delta_2,\dots,\delta_{r_1+r_2},0,\dots,0)$
Because these operations only exchange rows and columns, $P$ and $Q$ are permutation matrices and hence unitary (proof omitted).
Let $U_C=\begin{bmatrix}U&0\\0&V\end{bmatrix}\cdot P^{-1}$, $V_C^*=Q^{-1}\cdot\begin{bmatrix}0&U^*\\V^*&0\end{bmatrix}$
$\Rightarrow C=U_C\cdot D\cdot V_C^*$
The four fundamental subspaces from the SVD
Given $A=UDV^*$, let $U=(\alpha_1,\alpha_2,\dots,\alpha_r,\alpha_{r+1},\dots,\alpha_m)$ and $V=(\beta_1,\beta_2,\dots,\beta_r,\beta_{r+1},\dots,\beta_n)$.
Then:
$$N(A)=Span(\beta_{r+1},\dots,\beta_n)$$
$$N(A^*)=Span(\alpha_{r+1},\dots,\alpha_m)$$
$$C(A)=Span(\alpha_1,\alpha_2,\dots,\alpha_r)$$
$$R(A)=Span(\beta_1,\beta_2,\dots,\beta_r)$$
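Orthonormal bases of the four subspaces can be read off from the full SVD returned by `np.linalg.svd`. This is a small sketch; the rank tolerance `1e-10`, the variable names, and the rank-1 test matrix are assumptions made for illustration.

```python
import numpy as np

# Hypothetical 3x2 example of rank 1
A = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])

U, s, Vh = np.linalg.svd(A)        # full U (3x3) and V^* (2x2)
V = Vh.conj().T
r = int(np.sum(s > 1e-10))         # numerical rank

col_space = U[:, :r]     # alpha_1..alpha_r     span C(A)
left_null = U[:, r:]     # alpha_{r+1}..alpha_m span N(A^*)
row_space = V[:, :r]     # beta_1..beta_r       span the row space
null_space = V[:, r:]    # beta_{r+1}..beta_n   span N(A)
```

Projecting with `col_space @ col_space.conj().T` leaves $A$ unchanged, which is a quick way to confirm that the first $r$ columns of $U$ really span $C(A)$.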