Chapter 5 (Eigenvalues and Eigenvectors): Diagonalization

This post contains my reading notes on "Linear algebra and its applications".

Diagonalization

  • The factorization $A = PDP^{-1}$, where $D$ is a diagonal matrix, is used to compute powers of $A$, decouple dynamical systems in Sections 5.6 and 5.7, and study symmetric matrices and quadratic forms in Chapter 7.

  • Powers of a diagonal matrix are easy to compute, so if $A = PDP^{-1}$ for some invertible $P$ and diagonal $D$, then $A^k$ is also easy to compute: in $A^k = (PDP^{-1})(PDP^{-1})\cdots(PDP^{-1})$, every interior $P^{-1}P$ cancels, leaving $A^k = PD^kP^{-1}$. (See the sketch after this list.)
    • For example, if $D=\begin{bmatrix}5&0\\0&3\end{bmatrix}$, then $D^k=\begin{bmatrix}5^k&0\\0&3^k\end{bmatrix}$.
  • A square matrix $A$ is said to be diagonalizable if $A$ is similar to a diagonal matrix, that is, if $A = PDP^{-1}$ for some invertible matrix $P$ and some diagonal matrix $D$.
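A minimal numpy sketch of the two bullets above (my own illustration, using the $D$ from the example):

```python
import numpy as np

D = np.diag([5, 3])     # the diagonal matrix D from the example above
k = 4

# Raising a diagonal matrix to the k-th power just raises each
# diagonal entry to the k-th power
Dk = np.diag(np.diag(D) ** k)

print(np.allclose(Dk, np.linalg.matrix_power(D, k)))   # True
```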

The Diagonalization Theorem

THEOREM 5 (The Diagonalization Theorem)

An $n \times n$ matrix $A$ is diagonalizable if and only if $A$ has $n$ linearly independent eigenvectors. In fact, $A = PDP^{-1}$, with $D$ a diagonal matrix, if and only if the columns of $P$ are $n$ linearly independent eigenvectors of $A$. In this case, the diagonal entries of $D$ are eigenvalues of $A$ that correspond, respectively, to the eigenvectors in $P$.

  • In other words, $A$ is diagonalizable if and only if there are enough eigenvectors to form a basis of $\mathbb{R}^n$. We call such a basis an eigenvector basis of $\mathbb{R}^n$.
  • Note that a diagonalizable matrix is not necessarily invertible, since $0$ may be an eigenvalue: for example, $\begin{bmatrix}1&0\\0&0\end{bmatrix}$ is diagonal (hence diagonalizable) but singular.

Matrices Whose Eigenvalues Are Distinct

THEOREM 6

An $n \times n$ matrix with $n$ distinct eigenvalues is diagonalizable.

PROOF: By Theorem 2 in Section 5.1, eigenvectors corresponding to distinct eigenvalues are linearly independent, so the $n$ eigenvectors form a basis of $\mathbb{R}^n$; $A$ is then diagonalizable by the Diagonalization Theorem.
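A small numerical illustration (my own sketch, not from the text): the matrix from Example 1 below has two distinct eigenvalues, so its eigenvectors form an invertible $P$ with $A = PDP^{-1}$.

```python
import numpy as np

A = np.array([[4.0, -3.0],
              [2.0, -1.0]])      # the matrix from Example 1 below

eigvals, P = np.linalg.eig(A)    # columns of P are eigenvectors
D = np.diag(eigvals)

print(np.linalg.matrix_rank(P))                    # 2: eigenvectors independent
print(np.allclose(A, P @ D @ np.linalg.inv(P)))    # True: A = P D P^{-1}
```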


EXAMPLE 1

Compute $A^8$, where $A=\begin{bmatrix} 4&-3\\2&-1\end{bmatrix}$.

SOLUTION

  • $\det(A-\lambda I)=\lambda^{2}-3\lambda+2=(\lambda-2)(\lambda-1)$. The eigenvalues are $2$ and $1$, and the corresponding eigenvectors are $\mathbf{v}_{1}=\begin{bmatrix}3\\2\end{bmatrix}$ and $\mathbf{v}_{2}=\begin{bmatrix}1\\1\end{bmatrix}$.
  • Next, form
    $$P=\begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix}, \quad D=\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}, \quad \text{and} \quad P^{-1}=\begin{bmatrix} 1 & -1 \\ -2 & 3 \end{bmatrix}$$
  • Since $A=PDP^{-1}$,
    $$\begin{aligned} A^{8}=PD^{8}P^{-1} &=\begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} 2^{8} & 0 \\ 0 & 1^{8} \end{bmatrix}\begin{bmatrix} 1 & -1 \\ -2 & 3 \end{bmatrix} \\ &=\begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} 256 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & -1 \\ -2 & 3 \end{bmatrix} \\ &=\begin{bmatrix} 766 & -765 \\ 510 & -509 \end{bmatrix} \end{aligned}$$
    (A numerical check of this result follows below.)
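As a sanity check on the arithmetic, here is a minimal numpy sketch; the matrices are exactly those formed above.

```python
import numpy as np

P = np.array([[3.0, 1.0], [2.0, 1.0]])
D = np.array([[2.0, 0.0], [0.0, 1.0]])
P_inv = np.linalg.inv(P)

A = P @ D @ P_inv                          # reconstruct A = P D P^{-1}
A8 = P @ np.diag(np.diag(D) ** 8) @ P_inv  # A^8 = P D^8 P^{-1}

print(np.round(A8))                                    # [[ 766. -765.]  [ 510. -509.]]
print(np.allclose(A8, np.linalg.matrix_power(A, 8)))   # True
```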

Matrices Whose Eigenvalues Are Not Distinct

THEOREM 7

Let $A$ be an $n \times n$ matrix whose distinct eigenvalues are $\lambda_1, \ldots, \lambda_p$.

a. For $1 \le k \le p$, the dimension of the eigenspace for $\lambda_k$ is less than or equal to the multiplicity of the eigenvalue $\lambda_k$.
b. The matrix $A$ is diagonalizable if and only if the sum of the dimensions of the eigenspaces equals $n$, and this happens if and only if (i) the characteristic polynomial factors completely into linear factors and (ii) the dimension of the eigenspace for each $\lambda_k$ equals the multiplicity of $\lambda_k$.
c. If $A$ is diagonalizable and $\mathcal B_k$ is a basis for the eigenspace corresponding to $\lambda_k$ for each $k$, then the total collection of vectors in the sets $\mathcal B_1, \ldots, \mathcal B_p$ forms an eigenvector basis for $\mathbb R^n$.
PROOF

  • a. Suppose the multiplicity of the eigenvalue $\lambda_k$ is $m$. Then $\det(A-\lambda I)$ has the form $(\lambda-\lambda_k)^m\cdot(\ldots)$, where the remaining factor is nonzero at $\lambda=\lambda_k$. This means $A-\lambda I$ can be row reduced to a triangular matrix with at most $m$ entries divisible by $(\lambda-\lambda_k)$ on its main diagonal. Thus when $\lambda=\lambda_k$, the matrix $A-\lambda_k I$ has at most $m$ non-pivot columns, which shows that $\dim\operatorname{Nul}(A-\lambda_k I)$ (the dimension of the eigenspace) is at most $m$. (A numerical illustration follows the proof of part c below.)
  • c. Let $\{\boldsymbol v_1,\ldots,\boldsymbol v_s\}$ be the collection of eigenvectors in the sets $\mathcal B_1,\ldots,\mathcal B_k$, ordered so that eigenvectors corresponding to the same eigenvalue are grouped together, and write $\lambda_i$ for the eigenvalue corresponding to $\boldsymbol v_i$. Suppose $\{\boldsymbol v_1,\ldots,\boldsymbol v_s\}$ is linearly dependent. Since $\boldsymbol v_1$ is nonzero ($\{\boldsymbol v_1\}$ is linearly independent), let $r$ be the least index such that $\boldsymbol v_1,\ldots,\boldsymbol v_r$ are linearly dependent. Then there exist scalars $c_1,\ldots,c_r$, not all zero, such that
    $$c_1\boldsymbol v_1+\cdots+c_r\boldsymbol v_r=\boldsymbol 0 \qquad (1)$$
    Let $\{\boldsymbol v_{p+1},\ldots,\boldsymbol v_r\}$ be the final group, that is, the eigenvectors among $\boldsymbol v_1,\ldots,\boldsymbol v_r$ corresponding to the same eigenvalue $\lambda_{p+1}$, so that each of $\boldsymbol v_1,\ldots,\boldsymbol v_p$ corresponds to an eigenvalue different from $\lambda_{p+1}$. Since a linear combination of $\{\boldsymbol v_{p+1},\ldots,\boldsymbol v_r\}$ is still an eigenvector corresponding to $\lambda_{p+1}$ (or the zero vector), equation (1) can be rewritten as
    $$c_1\boldsymbol v_1+\cdots+c_p\boldsymbol v_p=\boldsymbol w_{p+1} \qquad (2)$$
    where $-\boldsymbol w_{p+1}=c_{p+1}\boldsymbol v_{p+1}+\cdots+c_r\boldsymbol v_r$, so that $A\boldsymbol w_{p+1}=\lambda_{p+1}\boldsymbol w_{p+1}$. Multiplying both sides of (2) by $A$, we obtain
    $$c_1A\boldsymbol v_1+\cdots+c_pA\boldsymbol v_p=A\boldsymbol w_{p+1}, \qquad \text{i.e.,} \qquad c_1\lambda_1\boldsymbol v_1+\cdots+c_p\lambda_p\boldsymbol v_p=\lambda_{p+1}\boldsymbol w_{p+1} \qquad (3)$$
    Multiplying both sides of (2) by $\lambda_{p+1}$ and subtracting the result from (3), we have
    $$c_1(\lambda_1-\lambda_{p+1})\boldsymbol v_1+\cdots+c_p(\lambda_p-\lambda_{p+1})\boldsymbol v_p=\boldsymbol 0 \qquad (4)$$
    Since $\boldsymbol v_1,\ldots,\boldsymbol v_p$ are linearly independent (by the minimality of $r$), all the weights in (4) are zero; and since $\lambda_i\ne\lambda_{p+1}$ for $i\le p$, this forces $c_1=\cdots=c_p=0$. Equation (2) then gives $\boldsymbol w_{p+1}=\boldsymbol 0$, that is, $c_{p+1}\boldsymbol v_{p+1}+\cdots+c_r\boldsymbol v_r=\boldsymbol 0$. But $\{\boldsymbol v_{p+1},\ldots,\boldsymbol v_r\}$ is contained in a single basis $\mathcal B_j$, hence linearly independent, so $c_{p+1}=\cdots=c_r=0$ as well. Then all the weights in (1) are zero, a contradiction. Hence $\{\boldsymbol v_1,\ldots,\boldsymbol v_s\}$ cannot be linearly dependent and therefore must be linearly independent.
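A concrete numerical illustration of part a (my own example, not from the text): the matrix below has eigenvalue $2$ with algebraic multiplicity $2$ but a one-dimensional eigenspace, so by part b it is not diagonalizable.

```python
import numpy as np

# Characteristic polynomial (2 - lambda)^2: eigenvalue 2 has multiplicity 2
A = np.array([[2.0, 1.0],
              [0.0, 2.0]])

M = A - 2.0 * np.eye(2)               # A - lambda I at lambda = 2

# dim Nul(A - 2I) = n - rank(A - 2I)
print(2 - np.linalg.matrix_rank(M))   # 1: eigenspace dim < multiplicity
```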
Computing Eigenvalues and Eigenvectors Numerically

Below is a manual implementation of eigenvalue/eigenvector computation that does not rely on a ready-made routine such as numpy.linalg.eig:

```python
import numpy as np

def compute_eigen(A):
    # Compute all eigenvalues, then one eigenvector per eigenvalue
    eigenvalues = find_eigenvalues(A)
    eigenvectors = [find_eigenvector(A, lam) for lam in eigenvalues]
    return eigenvalues, eigenvectors

def find_eigenvalues(A, iterations=100):
    # Unshifted QR algorithm: factor A_k = Q R and set A_{k+1} = R Q.
    # Each step is a similarity transform (R Q = Q^T A_k Q), so eigenvalues
    # are preserved; for real eigenvalues of distinct magnitude the iterates
    # approach an upper triangular matrix whose diagonal holds the eigenvalues.
    Ak = np.array(A, dtype=float)
    for _ in range(iterations):
        Q, R = qr_decomposition(Ak)
        Ak = R @ Q
    return np.diag(Ak)

def qr_decomposition(A):
    # Gram-Schmidt QR factorization: A = Q R with Q having orthonormal
    # columns and R upper triangular
    n = A.shape[0]
    Q = np.zeros((n, n))
    R = np.zeros((n, n))
    for i in range(n):
        v = A[:, i].astype(float).copy()
        for j in range(i):
            R[j, i] = Q[:, j] @ A[:, i]
            v -= R[j, i] * Q[:, j]
        R[i, i] = np.linalg.norm(v)
        Q[:, i] = v / R[i, i]
    return Q, R

def find_eigenvector(A, eigenvalue, iterations=100):
    # Inverse iteration (the power method applied to (A - lambda I)^{-1}):
    # repeatedly solving (A - lambda I) x_new = x amplifies the component
    # of x along the eigenvector whose eigenvalue is closest to `eigenvalue`
    n = A.shape[0]
    M = A - (eigenvalue + 1e-9) * np.eye(n)   # tiny shift keeps M invertible
    x = np.random.rand(n)
    x /= np.linalg.norm(x)
    for _ in range(iterations):
        x_new = np.linalg.solve(M, x)
        x_new /= np.linalg.norm(x_new)
        if x_new @ x < 0:     # fix the sign so the convergence test is stable
            x_new = -x_new
        if np.allclose(x, x_new):
            break
        x = x_new
    return x
```

The implementation has four main pieces:

1. `compute_eigen` is the driver: it computes the eigenvalues first, then an eigenvector for each of them.
2. `find_eigenvalues` uses the QR algorithm, an iterative method that approximates the eigenvalues by repeatedly performing a QR factorization and multiplying the factors in reverse order.
3. `qr_decomposition` implements the QR factorization, decomposing a matrix into an orthogonal matrix Q and an upper triangular matrix R.
4. `find_eigenvector` uses inverse iteration, a variant of the power method run on $(A-\lambda I)^{-1}$, to compute an eigenvector for each eigenvalue.

This is a basic version and may need adjustment and optimization for particular cases. In practice, for large matrices or when high-precision results are required, a professional numerical library is recommended.
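For a quick check, the sketch below (assuming the functions defined above are in scope) runs `compute_eigen` on the matrix from Example 1 and verifies each returned pair against the definition $A\mathbf{v} = \lambda\mathbf{v}$:

```python
import numpy as np

A = np.array([[4.0, -3.0],
              [2.0, -1.0]])     # the matrix from Example 1

eigenvalues, eigenvectors = compute_eigen(A)
print(np.sort(eigenvalues))     # approximately [1. 2.]

for lam, v in zip(eigenvalues, eigenvectors):
    print(np.allclose(A @ v, lam * v, atol=1e-6))   # True for each pair
```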