This post contains my reading notes on *Linear Algebra and Its Applications*.
Diagonalization
- The factorization $A = PDP^{-1}$, where $D$ is a diagonal matrix, is used to compute powers of $A$, decouple dynamical systems in Sections 5.6 and 5.7, and study symmetric matrices and quadratic forms in Chapter 7.
- Powers of a diagonal matrix are easy to compute. So if $A = PDP^{-1}$ for some invertible $P$ and diagonal $D$, then $A^k$ is also easy to compute.
- For example, if $D=\begin{bmatrix}5&0\\0&3\end{bmatrix}$, then $D^k=\begin{bmatrix}5^k&0\\0&3^k\end{bmatrix}$.
- A square matrix $A$ is said to be diagonalizable if $A$ is similar to a diagonal matrix, that is, if $A = PDP^{-1}$ for some invertible matrix $P$ and some diagonal matrix $D$.
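The diagonal-power shortcut above can be checked directly with a minimal NumPy sketch (the entries 5 and 3 are the ones from the example; the exponent $k=4$ is an arbitrary choice):

```python
import numpy as np

# Powers of a diagonal matrix are computed entrywise on the diagonal:
# D^k simply raises each diagonal entry to the k-th power.
D = np.diag([5.0, 3.0])
k = 4

Dk_entrywise = np.diag(np.diag(D) ** k)        # diag(5^k, 3^k)
Dk_repeated = np.linalg.matrix_power(D, k)     # D multiplied by itself k times

print(Dk_entrywise)                            # diag(625, 81)
print(np.allclose(Dk_entrywise, Dk_repeated))  # True
```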
The Diagonalization Theorem
- An $n\times n$ matrix $A$ is diagonalizable if and only if $A$ has $n$ linearly independent eigenvectors. In fact, $A = PDP^{-1}$, with $D$ a diagonal matrix, if and only if the columns of $P$ are $n$ linearly independent eigenvectors of $A$; in that case, the diagonal entries of $D$ are the eigenvalues of $A$ corresponding to the respective columns of $P$.
- In other words, $A$ is diagonalizable if and only if there are enough eigenvectors to form a basis of $\mathbb R^n$. We call such a basis an eigenvector basis of $\mathbb R^n$.
- Note that a diagonalizable matrix is not necessarily invertible (it may have 0 as an eigenvalue).
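The note above can be made concrete with a small sketch (the matrix is a made-up example): a diagonal matrix is trivially diagonalizable, yet an eigenvalue 0 makes it singular.

```python
import numpy as np

# D is already diagonal, hence diagonalizable (take P = I in A = PDP^{-1}),
# but one of its eigenvalues is 0, so D is not invertible.
D = np.array([[5.0, 0.0],
              [0.0, 0.0]])

eigenvalues = np.linalg.eigvals(D)
print(sorted(eigenvalues))   # [0.0, 5.0] -> 0 is an eigenvalue
print(np.linalg.det(D))      # 0.0 -> D is singular
```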
Matrices Whose Eigenvalues Are Distinct
- An $n\times n$ matrix with $n$ distinct eigenvalues is diagonalizable.
- Proof: Check Theorem 2 in Section 5.1 (eigenvectors corresponding to distinct eigenvalues are linearly independent).
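A numerical illustration (a sketch with a made-up triangular matrix, whose eigenvalues are its diagonal entries and here are all distinct):

```python
import numpy as np

# A triangular matrix's eigenvalues are its diagonal entries: 2, 3, 5.
# They are distinct, so the theorem guarantees A is diagonalizable.
A = np.array([[2.0, 1.0, 4.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])

eigvals, P = np.linalg.eig(A)  # columns of P are eigenvectors
D = np.diag(eigvals)

# The eigenvectors form a basis of R^3 (P is invertible), and A = PDP^{-1}.
print(np.linalg.matrix_rank(P))                  # 3
print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True
```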
EXAMPLE 1
Compute $A^8$, where $A=\begin{bmatrix}4&-3\\2&-1\end{bmatrix}$.
SOLUTION
- $\det(A-\lambda I)=\lambda^{2}-3\lambda+2=(\lambda-2)(\lambda-1)$. The eigenvalues are 2 and 1, and the corresponding eigenvectors are $\mathbf{v}_1=\begin{bmatrix}3\\2\end{bmatrix}$ and $\mathbf{v}_2=\begin{bmatrix}1\\1\end{bmatrix}$.
- Next, form
$$P=\begin{bmatrix}3&1\\2&1\end{bmatrix},\quad D=\begin{bmatrix}2&0\\0&1\end{bmatrix},\quad\text{and}\quad P^{-1}=\begin{bmatrix}1&-1\\-2&3\end{bmatrix}$$
- Since $A=PDP^{-1}$,
$$A^{8}=PD^{8}P^{-1}=\begin{bmatrix}3&1\\2&1\end{bmatrix}\begin{bmatrix}2^{8}&0\\0&1^{8}\end{bmatrix}\begin{bmatrix}1&-1\\-2&3\end{bmatrix}=\begin{bmatrix}3&1\\2&1\end{bmatrix}\begin{bmatrix}256&0\\0&1\end{bmatrix}\begin{bmatrix}1&-1\\-2&3\end{bmatrix}=\begin{bmatrix}766&-765\\510&-509\end{bmatrix}$$
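The computation in Example 1 can be reproduced with a short NumPy sketch:

```python
import numpy as np

# Example 1: A = PDP^{-1} with eigenvalues 2 and 1, eigenvectors (3,2) and (1,1).
A = np.array([[4.0, -3.0],
              [2.0, -1.0]])
P = np.array([[3.0, 1.0],
              [2.0, 1.0]])
D = np.diag([2.0, 1.0])

# A^8 = P D^8 P^{-1}; D^8 is formed by raising the diagonal entries to the 8th power.
A8 = P @ np.diag(np.diag(D) ** 8) @ np.linalg.inv(P)

print(np.rint(A8).astype(int))                        # [[766 -765], [510 -509]]
print(np.allclose(A8, np.linalg.matrix_power(A, 8)))  # True
```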
Matrices Whose Eigenvalues Are Not Distinct
PROOF
- a. Suppose the multiplicity of the eigenvalue $\lambda_k$ is $m$. Then $\det(A-\lambda I)$ has the form $(\lambda-\lambda_k)^m\cdot\ldots$, which means that $A-\lambda I$ can be row reduced to a triangular matrix with $m$ entries equal to $\lambda-\lambda_k$ on its main diagonal. Thus when $\lambda=\lambda_k$, the matrix $A-\lambda_k I$ has at most $m$ non-pivot columns, which indicates that $\dim\operatorname{Nul}(A-\lambda_k I)$ (the dimension of the eigenspace) is at most $m$.
- c. Let $\{\boldsymbol v_1,\ldots,\boldsymbol v_s\}$ be the eigenvectors in the sets $\{\mathcal B_1,\ldots,\mathcal B_k\}$, listed eigenspace by eigenspace. Suppose $\{\boldsymbol v_1,\ldots,\boldsymbol v_s\}$ is linearly dependent. Since $\boldsymbol v_1$ is nonzero ($\{\boldsymbol v_1\}$ is linearly independent), let $r$ be the least index such that $\boldsymbol v_1,\ldots,\boldsymbol v_r$ are linearly dependent. Then there exist scalars $c_1,\ldots,c_r$, not all zero, such that
$$c_1\boldsymbol v_1+\ldots+c_r\boldsymbol v_r=\boldsymbol 0\qquad(1)$$
Suppose $\{\boldsymbol v_{p+1},\ldots,\boldsymbol v_r\}$ are the eigenvectors among these corresponding to the same eigenvalue $\lambda_{p+1}$. Since a linear combination of $\{\boldsymbol v_{p+1},\ldots,\boldsymbol v_r\}$ is still an eigenvector corresponding to $\lambda_{p+1}$, equation (1) can be transformed into
$$c_1\boldsymbol v_1+\ldots+c_p\boldsymbol v_p=\boldsymbol w_{p+1}\qquad(2)$$
where $-\boldsymbol w_{p+1}=c_{p+1}\boldsymbol v_{p+1}+\ldots+c_r\boldsymbol v_r$ and $\boldsymbol w_{p+1}$ is an eigenvector corresponding to $\lambda_{p+1}$. Multiplying both sides of (2) by $A$, we obtain
$$c_1A\boldsymbol v_1+\ldots+c_pA\boldsymbol v_p=A\boldsymbol w_{p+1}$$
$$c_1\lambda_1\boldsymbol v_1+\ldots+c_p\lambda_p\boldsymbol v_p=\lambda_{p+1}\boldsymbol w_{p+1}\qquad(3)$$
Multiplying both sides of (2) by $\lambda_{p+1}$ and subtracting the result from (3), we have
$$c_1(\lambda_1-\lambda_{p+1})\boldsymbol v_1+\ldots+c_p(\lambda_p-\lambda_{p+1})\boldsymbol v_p=\boldsymbol 0\qquad(4)$$
Since $\{\boldsymbol v_1,\ldots,\boldsymbol v_p\}$ is linearly independent and $\lambda_i\neq\lambda_{p+1}$ for $i\le p$, the weights $c_1,\ldots,c_p$ in (4) are all zero; but then (2) gives $\boldsymbol w_{p+1}=\boldsymbol 0$, which is impossible for an eigenvector. Hence $\{\boldsymbol v_1,\ldots,\boldsymbol v_r\}$ cannot be linearly dependent and therefore must be linearly independent.
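Part (c) can be checked on a small example: take bases of the two eigenspaces of a $3\times 3$ matrix and verify that their union is linearly independent. The matrix and vectors below are made up for illustration, not taken from the book.

```python
import numpy as np

# Hypothetical example: lambda = 5 has multiplicity 2, lambda = 1 has multiplicity 1.
A = np.array([[5.0, 0.0, 0.0],
              [0.0, 5.0, 0.0],
              [2.0, 0.0, 1.0]])

# Bases of the eigenspaces, found by solving (A - lambda I)x = 0 by hand:
B1 = [np.array([0.0, 1.0, 0.0]), np.array([2.0, 0.0, 1.0])]  # basis of Nul(A - 5I)
B2 = [np.array([0.0, 0.0, 1.0])]                             # basis of Nul(A - I)

for v, lam in [(B1[0], 5.0), (B1[1], 5.0), (B2[0], 1.0)]:
    assert np.allclose(A @ v, lam * v)  # each really is an eigenvector

# Stack the union of the bases as columns: independent iff the rank equals 3.
P = np.column_stack(B1 + B2)
print(np.linalg.matrix_rank(P))  # 3 -> an eigenvector basis of R^3; A is diagonalizable
```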