Chapter8 矩阵特征值与特征向量的计算
7.1 前言
- 一般求法:
- 先由 ∣ A − λ E ∣ = 0 |A-\lambda E|=0 ∣A−λE∣=0求特征值 λ \lambda λ
- 再有 ( A − λ E ) x = 0 (A-\lambda E)x=0 (A−λE)x=0求特征向量 x x x
- 对阶数特别多的矩阵A难以求行列式,这个时候就需要数值解法来求矩阵的特征值
- 方法:
- 乘幂法:求主特征值
- 反幂法:求最小的特征值
7.2 乘幂法
7.2.1 特点
- 求主特征值和特征向量
- 假设,对实数矩阵A:
- 其有n个线性无关的特征向量 x ( i ) , i = 1 , 2 , ⋯ , n x^{(i)},i=1,2,\cdots,n x(i),i=1,2,⋯,n
- A x ( i ) = λ i x ( i ) Ax^{(i)}=\lambda_{i}x^{(i)} Ax(i)=λix(i),其中 λ i \lambda_{i} λi是实特征值且满足 ∣ λ 1 ∣ > ∣ λ 2 ∣ ≥ ⋯ ≥ ∣ λ n ∣ > 0 |\lambda_{1}| > |\lambda_{2}| \geq \cdots \geq |\lambda_{n}| >0 ∣λ1∣>∣λ2∣≥⋯≥∣λn∣>0
- 优点:算法简单,易于实现,对于高阶稀疏矩阵较为合适
- 缺点:收敛速度依赖于 ∣ λ 2 λ 1 ∣ k |\frac{\lambda_2}{\lambda_1}|^k ∣λ1λ2∣k,越小越快
7.2.2 原始乘幂法
- 基本步骤:
- 任取非0列向量 v ( 0 ) v^{(0)} v(0)
- v ( 1 ) = A v ( 0 ) → ⋯ → v ( k + 1 ) = A v ( k ) v^{(1)}=Av^{(0)} \rightarrow \cdots \rightarrow v^{(k+1)}=Av^{(k)} v(1)=Av(0)→⋯→v(k+1)=Av(k)
- v ( k + 1 ) = A v ( k ) ≈ λ 1 v ( k ) v^{(k+1)}=Av^{(k)}\approx\lambda_{1}v^{(k)} v(k+1)=Av(k)≈λ1v(k)
- λ 1 ≈ v ( k + 1 ) v ( k ) \lambda_{1}\approx\frac{v^{(k+1)}}{v^{(k)}} λ1≈v(k)v(k+1)
- 证明:由归纳法,提出最大特征值 λ 1 \lambda_{1} λ1,其余项趋近于0
选定初始列向量 v ( 0 ) v^{(0)} v(0),由于特征向量线性无关
v ( 0 ) = ∑ i = 1 n α i x ( i ) = α 1 x ( 1 ) + α 2 x ( 2 ) + ⋯ + α n x ( n ) v^{(0)}= \sum_{i=1}^{n} \alpha_ix^{(i)}=\alpha_1x^{(1)}+\alpha_2x^{(2)}+\cdots+\alpha_nx^{(n)} v(0)=i=1∑nαix(i)=α1x(1)+α2x(2)+⋯+αnx(n)
通过迭代关系 v ( 1 ) = A v ( 0 ) v^{(1)}=Av^{(0)} v(1)=Av(0)和特征向量定义式 A x ( i ) = λ i x ( i ) Ax^{(i)}=\lambda_{i}x^{(i)} Ax(i)=λix(i)
v ( 1 ) = A v ( 0 ) = ∑ i = 1 n α i A x ( i ) = ∑ i = 1 n α i λ i x ( i ) v^{(1)}=Av^{(0)}=\sum_{i=1}^{n}\alpha_iAx^{(i)}=\sum_{i=1}^{n}\alpha_i \lambda_{i}x^{(i)} v(1)=Av(0)=i=1∑nαiAx(i)=i=1∑nαiλix(i)
由归纳法,提出最大特征值 ∣ λ 1 ∣ |\lambda_{1}| ∣λ1∣
v ( k ) = A v ( k − 1 ) = λ 1 k ( α 1 x ( 1 ) + α 2 λ 2 k λ 1 k x ( 2 ) + ⋯ + α n λ n k λ 1 k x ( n ) ) v^{(k)}=Av^{(k-1)}=\lambda_1^k(\alpha_1x^{(1)}+\alpha_2 \frac{\lambda_2^k}{\lambda_1^k}x^{(2)}+\cdots+\alpha_n \frac{\lambda_n^k}{\lambda_1^k}x^{(n)}) v(k)=Av(k−1)=λ1k(α1x(1)+α2λ1kλ2kx(2)+⋯+αnλ1kλnkx(n))
当k充分大时, ( λ i λ 1 ) k (\frac{\lambda_i}{\lambda_1})^k (λ1λi)k全部趋近于0
v ( k ) = λ 1 k α 1 x ( 1 ) v^{(k)}=\lambda_1^k\alpha_1x^{(1)} v(k)=λ1kα1x(1)
v ( k + 1 ) = λ 1 k + 1 α 1 x ( 1 ) = λ 1 v ( k ) v^{(k+1)}=\lambda_1^{k+1}\alpha_1x^{(1)}=\lambda_1v^{(k)} v(k+1)=λ1k+1α1x(1)=λ1v(k)
- 特点:主特征向量每迭代一次要乘以一个
λ
\lambda
λ
- ∣ λ 1 ∣ > 1 |\lambda_1|>1 ∣λ1∣>1,主特征向量上溢->数值特别大
- ∣ λ 1 ∣ < 1 |\lambda_1|<1 ∣λ1∣<1,主特征向量下溢->数值趋近于0
- 因此,提出如下改进乘幂法
7.2.3 改进乘幂法
- 本质:为了解决乘幂法数值上溢or下溢问题
- 特殊操作:向量规范化
- 将向量所有元素除以模最大分量
- u ( k ) = v ( k ) m a x [ v ( k ) ] u^{(k)}=\frac{v^{(k)}}{max[v^{(k)}]} u(k)=max[v(k)]v(k)
- 例子:v=(3,4,-10);max_v=|-10|=10;u=(3/10,4/10,-1)
- 步骤
- 任取非0列向量 v ( 0 ) v^{(0)} v(0),并计算规范向量 u ( 0 ) = v ( 0 ) m a x [ v ( 0 ) ] u^{(0)}=\frac{v^{(0)}}{max[v^{(0)}]} u(0)=max[v(0)]v(0)
- 计算 v ( k + 1 ) = A u ( k ) v^{(k+1)}=Au^{(k)} v(k+1)=Au(k)
- 直到 ∣ λ k + 1 − λ k ∣ < ε |\lambda_{k+1}-\lambda_k|<\varepsilon ∣λk+1−λk∣<ε
- 输出 λ 1 = m a x [ ∣ v ( k + 1 ) ∣ ] \lambda_1=max[|v^{(k+1)}|] λ1=max[∣v(k+1)∣],特征向量 u ( k + 1 ) = u ( k ) u^{(k+1)}=u^{(k)} u(k+1)=u(k)
- 证明:主要证明u和v两个列向量的关系,最后可以得出 λ 1 = m a x [ ∣ v ( k + 1 ) ∣ ] \lambda_1=max[|v^{(k+1)}|] λ1=max[∣v(k+1)∣]的结论
由定义可得:
u ( 0 ) = α 1 x ( 1 ) + α 2 x ( 2 ) + ⋯ + α n x ( n ) m a x [ α 1 x ( 1 ) + α 2 x ( 2 ) + ⋯ + α n x ( n ) ] u^{(0)}=\frac{\alpha_1x^{(1)}+\alpha_2x^{(2)}+\cdots+\alpha_nx^{(n)}}{max[\alpha_1x^{(1)}+\alpha_2x^{(2)}+\cdots+\alpha_nx^{(n)}]} u(0)=max[α1x(1)+α2x(2)+⋯+αnx(n)]α1x(1)+α2x(2)+⋯+αnx(n)
推导为k次迭代时:
u ( k ) = α 1 λ 1 k x ( 1 ) + α 2 λ 2 k x ( 2 ) + ⋯ + α n λ n k x ( n ) m a x [ α 1 λ 1 k x ( 1 ) + α 2 λ 2 k x ( 2 ) + ⋯ + α n λ n k x ( n ) ] u^{(k)}=\frac{\alpha_1 \lambda_1^k x^{(1)}+\alpha_2 \lambda_2^k x^{(2)}+\cdots+\alpha_n \lambda_n^k x^{(n)}}{max[\alpha_1 \lambda_1^k x^{(1)}+\alpha_2 \lambda_2^k x^{(2)}+\cdots+\alpha_n \lambda_n^k x^{(n)}]} u(k)=max[α1λ1kx(1)+α2λ2kx(2)+⋯+αnλnkx(n)]α1λ1kx(1)+α2λ2kx(2)+⋯+αnλnkx(n)
由归纳法,提出最大特征值 ∣ λ 1 ∣ |\lambda_{1}| ∣λ1∣
u ( k ) = λ 1 k ( α 1 x ( 1 ) + α 2 λ 2 k λ 1 k x ( 2 ) + ⋯ + α n λ n k λ 1 k x ( n ) ) m a x [ λ 1 k ( α 1 x ( 1 ) + α 2 λ 2 k λ 1 k x ( 2 ) + ⋯ + α n λ n k λ 1 k x ( n ) ) ] = α 1 x ( 1 ) + α 2 λ 2 k λ 1 k x ( 2 ) + ⋯ + α n λ n k λ 1 k x ( n ) m a x [ α 1 x ( 1 ) + α 2 λ 2 k λ 1 k x ( 2 ) + ⋯ + α n λ n k λ 1 k x ( n ) ] u^{(k)}=\frac{\lambda_1^k(\alpha_1x^{(1)}+\alpha_2 \frac{\lambda_2^k}{\lambda_1^k}x^{(2)}+\cdots+\alpha_n \frac{\lambda_n^k}{\lambda_1^k}x^{(n)})}{max[\lambda_1^k(\alpha_1x^{(1)}+\alpha_2 \frac{\lambda_2^k}{\lambda_1^k}x^{(2)}+\cdots+\alpha_n \frac{\lambda_n^k}{\lambda_1^k}x^{(n)})]}=\frac{\alpha_1x^{(1)}+\alpha_2 \frac{\lambda_2^k}{\lambda_1^k}x^{(2)}+\cdots+\alpha_n \frac{\lambda_n^k}{\lambda_1^k}x^{(n)}}{max[\alpha_1x^{(1)}+\alpha_2 \frac{\lambda_2^k}{\lambda_1^k}x^{(2)}+\cdots+\alpha_n \frac{\lambda_n^k}{\lambda_1^k}x^{(n)}]} u(k)=max[λ1k(α1x(1)+α2λ1kλ2kx(2)+⋯+αnλ1kλnkx(n))]λ1k(α1x(1)+α2λ1kλ2kx(2)+⋯+αnλ1kλnkx(n))=max[α1x(1)+α2λ1kλ2kx(2)+⋯+αnλ1kλnkx(n)]α1x(1)+α2λ1kλ2kx(2)+⋯+αnλ1kλnkx(n)
当k充分大时, ( λ i λ 1 ) k (\frac{\lambda_i}{\lambda_1})^k (λ1λi)k全部趋近于0,u向量也全部收敛
u ( k ) = x ( 1 ) m a x [ x ( 1 ) ] → u ( k + 1 ) = u ( k ) u^{(k)}=\frac{x^{(1)}}{max[x^{(1)}]}\rightarrow u^{(k+1)}=u^{(k)} u(k)=max[x(1)]x(1)→u(k+1)=u(k)
由迭代关系式 v ( k + 1 ) = A u ( k ) v^{(k+1)}=Au^{(k)} v(k+1)=Au(k)和特征向量定义式 A x ( i ) = λ i x ( i ) Ax^{(i)}=\lambda_{i}x^{(i)} Ax(i)=λix(i)
v ( k + 1 ) = A u ( k ) = A x ( 1 ) m a x [ x ( 1 ) ] = λ 1 x ( 1 ) m a x [ x ( 1 ) ] = λ 1 u ( k ) v^{(k+1)}=Au^{(k)}=\frac{Ax^{(1)}}{max[x^{(1)}]}=\frac{\lambda_1x^{(1)}}{max[x^{(1)}]}=\lambda_1u^{(k)} v(k+1)=Au(k)=max[x(1)]Ax(1)=max[x(1)]λ1x(1)=λ1u(k)
最终:
u ( k + 1 ) = v ( k + 1 ) m a x [ v ( k + 1 ) ] = λ 1 u ( k ) m a x [ v ( k + 1 ) ] u^{(k+1)}=\frac{v^{(k+1)}}{max[v^{(k+1)}]}=\frac{\lambda_1u^{(k)}}{max[v^{(k+1)}]} u(k+1)=max[v(k+1)]v(k+1)=max[v(k+1)]λ1u(k)
由 u ( k + 1 ) = u ( k ) u^{(k+1)}=u^{(k)} u(k+1)=u(k):
m a x [ v ( k + 1 ) ] = λ 1 max[v^{(k+1)}]=\lambda_1 max[v(k+1)]=λ1
- 一道例题:计算矩阵[3 -4 3;-4 6 3;3 3 1]的最大特征值和对应的特征向量
clc,clear all;
A=[3 -4 3;-4 6 3;3 3 1];
v0=[1 1 1]';
l0=max(abs(v0));
u0=v0./l0;
% calculate
l=l0;lold=inf;v=v0;u=u0;
while abs(l-lold) >1e-4
lold=l;
v=A*u;
l=max(abs(v));
u=v./l;
end
l,u
7.3 反幂法
7.3.1 特点
- 用于计算按模最小特征值及其对应特征向量的迭代方法
- 本质:原来是求A的最大特征值,现在是求A逆的最大特征值
- 特点:
A x = λ x → A − 1 A x = I x = λ A − 1 x → A − 1 x = 1 λ x Ax=\lambda x \rightarrow A^{-1}Ax=Ix=\lambda A^{-1}x\rightarrow A^{-1}x=\frac{1}{\lambda}x Ax=λx→A−1Ax=Ix=λA−1x→A−1x=λ1x
7.3.2 步骤
- 任取非0列向量 v ( 0 ) v^{(0)} v(0),并计算规范向量 u ( 0 ) = v ( 0 ) m a x [ v ( 0 ) ] u^{(0)}=\frac{v^{(0)}}{max[v^{(0)}]} u(0)=max[v(0)]v(0)
- 计算 A v ( k + 1 ) = u ( k ) Av^{(k+1)}=u^{(k)} Av(k+1)=u(k) 等价于 v ( k + 1 ) = A − 1 u ( k ) v^{(k+1)}=A^{-1}u^{(k)} v(k+1)=A−1u(k)
- 直到 ∣ λ k + 1 − 1 − λ k − 1 ∣ < ε |\lambda_{k+1}^{-1}-\lambda_k^{-1}|<\varepsilon ∣λk+1−1−λk−1∣<ε
- 输出 λ k + 1 − 1 = m a x [ ∣ v ( k + 1 ) ∣ ] − 1 \lambda_{k+1}^{-1}=max[|v^{(k+1)}|]^{-1} λk+1−1=max[∣v(k+1)∣]−1,特征向量 u ( k + 1 ) = u ( k ) u^{(k+1)}=u^{(k)} u(k+1)=u(k)
7.3.3 带原点平移的反幂法
最后计算的特征值一定记得加p
- 本质:现在已知矩阵A有一个(粗略的、大概的、不真实的)特征值p,想求距离p最近的这个特征值
- 假设:
- 数p是矩阵A的第i个特征值 λ i \lambda_i λi的近似
- 满足 0 < ∣ λ i − p ∣ < ∣ λ j − p ∣ , j ≠ i 0<|\lambda_i -p|<|\lambda_j -p|, j\neq i 0<∣λi−p∣<∣λj−p∣,j=i
- 通过平移,矩阵B=A-pI满足反幂法所要求的假设,特征值转换为
μ
=
λ
j
−
p
μ=\lambda_j -p
μ=λj−p,此时μ为最小的特征值,采用反幂法即可.
( A − p I ) x = A x − p x = λ x − p x = ( λ − p ) x (A-pI)x=Ax-px=\lambda x-px=(\lambda-p)x (A−pI)x=Ax−px=λx−px=(λ−p)x - 步骤
- 任取非0列向量 v ( 0 ) v^{(0)} v(0),并计算规范向量 u ( 0 ) = v ( 0 ) m a x [ v ( 0 ) ] u^{(0)}=\frac{v^{(0)}}{max[v^{(0)}]} u(0)=max[v(0)]v(0)
- 计算 ( A − p I ) v ( k + 1 ) = u ( k ) (A-pI)v^{(k+1)}=u^{(k)} (A−pI)v(k+1)=u(k)
- 直到 ∣ λ k + 1 − 1 − λ k − 1 ∣ < ε |\lambda_{k+1}^{-1}-\lambda_k^{-1}|<\varepsilon ∣λk+1−1−λk−1∣<ε
- 输出 λ k + 1 − 1 + p \lambda_{k+1}^{-1}+p λk+1−1+p,特征向量 u ( k + 1 ) = u ( k ) u^{(k+1)}=u^{(k)} u(k+1)=u(k)
- 例题一道
clc,clear all;
A=[6 2 1; 2 3 1;1 1 1];
p=6;
A=A-p*eye(3);
v0=[1 1 1]';
l0=max(abs(v0));
u0=v0./l0;
% calculate
l=l0;lold=inf;v=v0;u=u0;
while abs(1/l-1/lold) >1e-4
lold=l;
v=A\u;
l=max(abs(v));
u=v./l;
end
1/l+p,u