Matrix Computations: Optional Exercises

Gaussian Elimination on a Positive Definite Matrix

Let $A$ be a symmetric positive definite matrix. After one step of Gaussian elimination, $A$ is reduced to $\begin{pmatrix} a_{11} & a_1^T \\ 0 & A_2 \end{pmatrix}$, where
$A = (a_{ij})_{n}$ and $A_2 = (a^{(2)}_{ij})_{n-1}$. Show that:
(1) the diagonal entries of $A$ satisfy $a_{ii} > 0$ $(i = 1, 2, \dots, n)$;
(2) $A_2$ is symmetric positive definite;
(3) $a_{ii}^{(2)} \leqslant a_{ii}$ $(i = 2, 3, \cdots, n)$;
(4) the entry of $A$ with the largest absolute value lies on the diagonal;
(5) $\max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}^{(2)}| \leqslant \max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}|$;
(6) from (2), (3), (5), deduce that if $|a_{ij}| < 1$, then $|a_{ij}^{(k)}| < 1$ for all $k$.

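Notation: a superscript $(k)$ marks the matrix/vector/entry after the $(k-1)$-th Gaussian elimination step and before the $k$-th step.
For example, the superscript $(2)$ marks the matrix/vector/entry after the first elimination step and before the second.
Apart from the symbols defined in the problem statement, a quantity without a superscript is understood to carry the superscript $(1)$, i.e. it is the matrix/vector/entry before the first elimination step.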

$A = A^T$, $A \succ 0$.
$A^{(1)} = A$
$A^{(2)} = G^{(1)} A^{(1)}$

$$A^{(1)} = \begin{pmatrix} a^{(1)}_{11} & {a^{(1)}_1}^T \\ a^{(1)}_1 & A^{(1)}_2 \end{pmatrix}$$
$$G^{(1)} = \begin{pmatrix} 1 & 0 & 0 & & 0 \\ -\frac{a_{21}}{a_{11}} & 1 & 0 & & 0 \\ -\frac{a_{31}}{a_{11}} & 0 & 1 & & 0 \\ \vdots & & & \ddots & \\ -\frac{a_{n1}}{a_{11}} & 0 & 0 & & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0^T \\ g^{(1)} & I \end{pmatrix}, \qquad g^{(1)} = -\frac{1}{a^{(1)}_{11}} a^{(1)}_1$$

(1) The diagonal entries of $A$ satisfy $a_{ii} > 0$ $(i = 1, 2, \dots, n)$;

Take the standard basis vector $e_i$, with $(e_i)_i = 1$ and $(e_i)_j = 0$ $(j \neq i)$.
By positive definiteness, $A_{ii} = e_i^T A e_i > 0$.

The identity $A_{ii} = e_i^T A e_i$ follows by expanding the matrix product.
The inequality $e_i^T A e_i > 0$ follows from positive definiteness, since $e_i \neq 0$.

(2) $A_2$ is symmetric positive definite;

$$\begin{pmatrix} a_{11} & a_1^T \\ 0 & A_2 \end{pmatrix} = A^{(2)} = G^{(1)} A^{(1)} = \begin{pmatrix} 1 & 0^T \\ g^{(1)} & I \end{pmatrix} \begin{pmatrix} a^{(1)}_{11} & {a^{(1)}_1}^T \\ a^{(1)}_1 & A^{(1)}_2 \end{pmatrix}$$
$$A_2 = A^{(1)}_2 + g^{(1)} {a^{(1)}_1}^T = A^{(1)}_2 - \frac{1}{a^{(1)}_{11}} a^{(1)}_1 {a^{(1)}_1}^T$$

Symmetry
Since $A^{(1)}$ is symmetric, its block $A^{(1)}_2$ is symmetric.
The rank-one matrix $a^{(1)}_1 {a^{(1)}_1}^T$ is clearly symmetric.
A linear combination of symmetric matrices is symmetric, so $A_2 = A^{(1)}_2 - \frac{1}{a^{(1)}_{11}} a^{(1)}_1 {a^{(1)}_1}^T$ is symmetric.

Positive definiteness
To prove $A_2 \succ 0$, it suffices to show that $v^T A_2 v > 0$ for every vector $v \neq 0$.
$$v^T A_2 v = v^T A^{(1)}_2 v - \frac{1}{a^{(1)}_{11}} \left(v^T a^{(1)}_1\right)^2$$
Construct $\tilde{v} = \begin{pmatrix} -\frac{1}{a^{(1)}_{11}} \left(v^T a^{(1)}_1\right) \\ v \end{pmatrix}$.
$$\tilde{v}^T A^{(1)} \tilde{v} = a^{(1)}_{11} \left(-\frac{1}{a^{(1)}_{11}} \left(v^T a^{(1)}_1\right)\right)^2 - 2\,\frac{1}{a^{(1)}_{11}} \left(v^T a^{(1)}_1\right) v^T a^{(1)}_1 + v^T A^{(1)}_2 v = v^T A^{(1)}_2 v - \frac{1}{a^{(1)}_{11}} \left(v^T a^{(1)}_1\right)^2 = v^T A_2 v$$
Since $A^{(1)} \succ 0$ and $\tilde{v} \neq 0$ (its lower block is $v \neq 0$), we have $\tilde{v}^T A^{(1)} \tilde{v} > 0$, that is, $v^T A_2 v > 0$.
Because $v \neq 0$ was arbitrary and the construction of $\tilde{v}$ always works, $A_2 \succ 0$.

Note: $v \in \mathbb{R}^{n-1}$, $\tilde{v} \in \mathbb{R}^n$.
The argument reduces a quadratic form in $n$ variables to one in $n-1$ variables:
$$\begin{pmatrix} x \\ \bm{y} \end{pmatrix}^T \begin{pmatrix} \mu & \bm{m}^T \\ \bm{m} & \mathbf{M} \end{pmatrix} \begin{pmatrix} x \\ \bm{y} \end{pmatrix} = \mu x^2 + 2 x\, \bm{m}^T \bm{y} + \bm{y}^T \mathbf{M} \bm{y}$$
with the substitutions
$x = -\frac{1}{a^{(1)}_{11}} \left(v^T a^{(1)}_1\right)$
$\bm{y} = v$
$\mu = a^{(1)}_{11}$
$\bm{m} = a^{(1)}_1$
$\mathbf{M} = A^{(1)}_2$
$\begin{pmatrix} x \\ \bm{y} \end{pmatrix} = \tilde{v}$
$\begin{pmatrix} \mu & \bm{m}^T \\ \bm{m} & \mathbf{M} \end{pmatrix} = A^{(1)}$
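The construction in part (2) can be checked numerically. The following is a minimal sketch, assuming a small example matrix that is not part of the exercise; it performs one elimination step and compares the trailing block with the formula $A_2 = A^{(1)}_2 - \frac{1}{a^{(1)}_{11}} a^{(1)}_1 {a^{(1)}_1}^T$.

```matlab
% One Gaussian elimination step on a small SPD matrix (assumed example data).
A = [4 2 1; 2 5 3; 1 3 6];
n = size(A, 1);

a11 = A(1, 1);          % pivot a_11
a1  = A(2:n, 1);        % column below the pivot
g   = -a1 / a11;        % multipliers g^(1)
G   = eye(n);
G(2:n, 1) = g;          % elimination matrix G^(1)

A_next = G * A;                 % A^(2) = G^(1) * A^(1)
A2 = A_next(2:n, 2:n);          % trailing block A_2

% A_2 agrees with A_2^(1) - a1*a1'/a11.
S = A(2:n, 2:n) - (a1 * a1') / a11;
assert(norm(A2 - S, 'fro') < 1e-12);

% A_2 is symmetric and all of its eigenvalues are positive.
assert(norm(A2 - A2', 'fro') < 1e-12);
assert(all(eig((A2 + A2') / 2) > 0));
```

A single example does not prove the statement, of course; it only confirms that the block formula and the elimination step agree on this input.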

(3) $a_{ii}^{(2)} \leqslant a_{ii}$ $(i = 2, 3, \cdots, n)$;

Part (2) showed that $A_2 = A^{(1)}_2 - \frac{1}{a^{(1)}_{11}} a^{(1)}_1 {a^{(1)}_1}^T$.
Indexing the rows and columns of $A_2$ by $2, \dots, n$, we get $a^{(2)}_{ii} = \left(A_2\right)_{ii} = \left(A^{(1)}_2\right)_{ii} - \frac{1}{a^{(1)}_{11}} \left(a^{(1)}_1 {a^{(1)}_1}^T\right)_{ii} = a_{ii} - \frac{1}{a^{(1)}_{11}} a_{i1}^2$.
Part (1) showed that $a^{(1)}_{11} > 0$, so $a^{(2)}_{ii} = a_{ii} - \frac{1}{a^{(1)}_{11}} a_{i1}^2 \leqslant a_{ii}$.
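As a small worked illustration of the formula $a^{(2)}_{ii} = a_{ii} - a_{i1}^2 / a^{(1)}_{11}$, consider the following matrix, which is an assumed example and not part of the exercise:

```latex
\begin{aligned}
A &= \begin{pmatrix} 4 & 2 & 1 \\ 2 & 5 & 3 \\ 1 & 3 & 6 \end{pmatrix},
\qquad a^{(1)}_{11} = 4, \quad a_{21} = 2, \quad a_{31} = 1, \\
a^{(2)}_{22} &= 5 - \frac{2^2}{4} = 4 \leqslant 5 = a_{22}, \\
a^{(2)}_{33} &= 6 - \frac{1^2}{4} = 5.75 \leqslant 6 = a_{33}.
\end{aligned}
```

Equality in (3) holds exactly when $a_{i1} = 0$, i.e. when row $i$ needs no elimination.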

(4) The entry of $A$ with the largest absolute value lies on the diagonal;

A real symmetric matrix is Hermitian, so by the properties of Hermitian matrices $A$ (i) is diagonalizable, (ii) has real eigenvalues, and (iii) has orthogonal eigenvectors. Hence $A = Q \Lambda Q^T$, where $\Lambda = \mathrm{diag}(\lambda_1, \cdots, \lambda_n)$ and $Q^T Q = Q Q^T = I$.
By positive definiteness, $\Lambda \succ 0$.

$$\begin{aligned} A &= Q \Lambda Q^T \\ &= \begin{pmatrix} \uparrow & & \uparrow \\ q_1 & \cdots & q_n \\ \downarrow & & \downarrow \end{pmatrix} \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} \begin{pmatrix} \leftarrow & q_1^T & \rightarrow \\ & \vdots & \\ \leftarrow & q_n^T & \rightarrow \end{pmatrix} \\ &= \begin{pmatrix} \uparrow & & \uparrow \\ q_1 & \cdots & q_n \\ \downarrow & & \downarrow \end{pmatrix} \begin{pmatrix} \uparrow & & \uparrow \\ \lambda_1 e_1 & \cdots & \lambda_n e_n \\ \downarrow & & \downarrow \end{pmatrix} \begin{pmatrix} \leftarrow & q_1^T & \rightarrow \\ & \vdots & \\ \leftarrow & q_n^T & \rightarrow \end{pmatrix} \\ &= \begin{pmatrix} \uparrow & & \uparrow \\ q_1 & \cdots & q_n \\ \downarrow & & \downarrow \end{pmatrix} \left( \sum\limits_{i=1}^{n} \lambda_i e_i q_i^T \right) \\ &= \begin{pmatrix} \uparrow & & \uparrow \\ q_1 & \cdots & q_n \\ \downarrow & & \downarrow \end{pmatrix} \begin{pmatrix} \leftarrow & \lambda_1 q_1^T & \rightarrow \\ & \vdots & \\ \leftarrow & \lambda_n q_n^T & \rightarrow \end{pmatrix} \end{aligned}$$

Rewrite the entries of $A$ as $A_{ij} = \sum\limits_{k=1}^{n} \lambda_k q_{ki} q_{kj}$ with $\lambda_k > 0$, where $q_{ki}$ denotes the $i$-th component of $q_k$.

By the Cauchy-Schwarz inequality, $\left(\sum\limits_{k=1}^{n} \lambda_k q_{ki} q_{kj}\right)^2 \leqslant \left(\sum\limits_{k=1}^{n} \left(\sqrt{\lambda_k}\, q_{ki}\right)^2\right) \left(\sum\limits_{k=1}^{n} \left(\sqrt{\lambda_k}\, q_{kj}\right)^2\right) = \left(\sum\limits_{k=1}^{n} \lambda_k q_{ki}^2\right) \left(\sum\limits_{k=1}^{n} \lambda_k q_{kj}^2\right)$, that is, $A_{ij}^2 \leqslant A_{ii} A_{jj}$.
Part (1) showed that $A_{ii}, A_{jj} > 0$, therefore $\max\{|A_{ii}|, |A_{jj}|\} = \max\{A_{ii}, A_{jj}\} = \sqrt{\left(\max\{A_{ii}, A_{jj}\}\right)^2} \geqslant \sqrt{A_{ii} A_{jj}} \geqslant \sqrt{A_{ij}^2} = |A_{ij}|$.
Since $|A_{ij}| \leqslant \max\{|A_{ii}|, |A_{jj}|\}$ for all $i, j$, the largest absolute value among the entries of $A$ is attained on the diagonal.

Every principal submatrix of a positive [semi]definite matrix is itself positive [semi]definite.
The proof above can therefore be shortened: since $A$ is positive definite, for $i \neq j$ the $2 \times 2$ principal submatrix $\begin{pmatrix} A_{ii} & A_{ij} \\ A_{ij} & A_{jj} \end{pmatrix}$ is positive definite, so its determinant satisfies $A_{ii} A_{jj} - A_{ij}^2 > 0$.
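The inequality $A_{ij}^2 < A_{ii} A_{jj}$ for $i \neq j$ is easy to test numerically. Below is a minimal sketch on an assumed example matrix (not taken from the exercise); it checks every off-diagonal pair and then locates the entry of largest absolute value.

```matlab
% Check that off-diagonal entries of an SPD matrix are dominated by the diagonal.
A = [4 2 1; 2 5 3; 1 3 6];    % assumed example, symmetric positive definite
n = size(A, 1);

for i = 1:n
    for j = 1:n
        if i ~= j
            % The 2x2 principal submatrix has positive determinant.
            assert(A(i, j)^2 < A(i, i) * A(j, j));
            assert(abs(A(i, j)) < max(A(i, i), A(j, j)));
        end
    end
end

% Position of the entry with the largest absolute value.
[~, idx] = max(abs(A(:)));
[row, col] = ind2sub(size(A), idx);
assert(row == col);           % it sits on the diagonal
```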

(5) $\max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}^{(2)}| \leqslant \max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}|$;

Since $A$ is positive definite, its principal submatrix $A_2^{(1)}$ is symmetric positive definite.
Part (2) showed that $A_2$ is symmetric positive definite.
Part (4) showed that the entry of largest absolute value of a symmetric positive definite matrix lies on the diagonal.

Hence there are indices $2 \leqslant x, y \leqslant n$ with $\max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}^{(2)}| = \max\limits_{2 \leqslant i \leqslant n} a_{ii}^{(2)} = a^{(2)}_{xx}$ and $\max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}| = \max\limits_{2 \leqslant i \leqslant n} a_{ii} = a_{yy}$.

Part (3) showed that $a_{ii}^{(2)} \leqslant a_{ii}$ $(i = 2, 3, \cdots, n)$.
Therefore $\max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}^{(2)}| = a^{(2)}_{xx} \leqslant a_{xx} \leqslant \max\limits_{2 \leqslant i \leqslant n} a_{ii} = \max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}|$.

(6) From (2), (3), (5), deduce that if $|a_{ij}| < 1$, then $|a_{ij}^{(k)}| < 1$ for all $k$.

By (5), $\max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}^{(2)}| \leqslant \max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}^{(1)}| \leqslant \max\limits_{1 \leqslant i,j \leqslant n} |a_{ij}^{(1)}|$. By (2), the trailing block produced by each elimination step is again symmetric positive definite, so the same argument applies at every step and gives $\max\limits_{k \leqslant i,j \leqslant n} |a_{ij}^{(k)}| \leqslant \max\limits_{k \leqslant i,j \leqslant n} |a_{ij}^{(k-1)}| \leqslant \max\limits_{k-1 \leqslant i,j \leqslant n} |a_{ij}^{(k-1)}|$.

For $k \leqslant i,j \leqslant n$: $|a_{ij}^{(k)}| \leqslant \max\limits_{k \leqslant p,q \leqslant n} |a_{pq}^{(k)}| \leqslant \max\limits_{k-1 \leqslant p,q \leqslant n} |a_{pq}^{(k-1)}| \leqslant \cdots \leqslant \max\limits_{1 \leqslant p,q \leqslant n} |a_{pq}^{(1)}| = \max\limits_{1 \leqslant p,q \leqslant n} |a_{pq}| < 1$.
For $i < k$ or $j < k$ there are two cases. If $j < i$, then $j < k$ and the entry lies below the diagonal in a column that has already been eliminated, so $a_{ij}^{(k)} = 0$. Otherwise $i \leqslant j$ and $i < k$, and row $i$ is no longer modified after step $i - 1$, so with $m = \min\{i, j\} = i$ we have $a_{ij}^{(k)} = a_{ij}^{(k-1)} = \cdots = a_{ij}^{(m)}$.
In that case $|a_{ij}^{(k)}| = |a_{ij}^{(m)}| \leqslant \max\limits_{m \leqslant p,q \leqslant n} |a_{pq}^{(m)}| \leqslant \max\limits_{m-1 \leqslant p,q \leqslant n} |a_{pq}^{(m-1)}| \leqslant \cdots \leqslant \max\limits_{1 \leqslant p,q \leqslant n} |a_{pq}^{(1)}| = \max\limits_{1 \leqslant p,q \leqslant n} |a_{pq}| < 1$.

In summary, $|a_{ij}^{(k)}| < 1$ for all $1 \leqslant i,j \leqslant n$.
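The bound in (6) can be observed by running the full elimination and recording the largest absolute entry after each step. The sketch below assumes a small example matrix with all $|a_{ij}| < 1$ (not from the exercise) and uses no pivoting, which is safe here because every pivot of a symmetric positive definite matrix is positive.

```matlab
% Track max |a_ij^(k)| over all elimination steps of an SPD matrix.
A = [0.4 0.2 0.1; 0.2 0.5 0.3; 0.1 0.3 0.6];   % assumed example, all |a_ij| < 1
n = size(A, 1);

Ak = A;                          % A^(1)
maxAbs = zeros(n, 1);
maxAbs(1) = max(abs(Ak(:)));

for k = 1:n-1
    m = Ak(k+1:n, k) / Ak(k, k);                    % multipliers of step k
    Ak(k+1:n, :) = Ak(k+1:n, :) - m * Ak(k, :);     % eliminate column k
    maxAbs(k+1) = max(abs(Ak(:)));                  % largest entry of A^(k+1)
end

assert(all(maxAbs < 1));                 % the statement of (6)
assert(all(diff(maxAbs) <= 1e-12));      % the maximum never grows
```

This is the reason Gaussian elimination without pivoting is numerically well behaved for symmetric positive definite matrices: the entries cannot grow during elimination.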

The Chasing Method (Thomas Algorithm)

$$\begin{pmatrix} b_1 & c_1 & & & \\ a_2 & b_2 & c_2 & & \\ & \ddots & \ddots & \ddots & \\ & & a_{n-1} & b_{n-1} & c_{n-1} \\ & & & a_n & b_n \end{pmatrix} = \begin{pmatrix} \alpha_1 & & & & \\ \gamma_2 & \alpha_2 & & & \\ & \ddots & \ddots & & \\ & & \gamma_{n-1} & \alpha_{n-1} & \\ & & & \gamma_n & \alpha_n \end{pmatrix} \begin{pmatrix} 1 & \beta_1 & & & \\ & 1 & \beta_2 & & \\ & & \ddots & \ddots & \\ & & & 1 & \beta_{n-1} \\ & & & & 1 \end{pmatrix}$$

$$A = LU \implies \begin{cases} a_i = \gamma_i \\ b_i = \alpha_i + \beta_{i-1} \gamma_i \\ c_i = \alpha_i \beta_i \end{cases}$$

$$Ly = f \implies f_i = \alpha_i y_i + \gamma_i y_{i-1}$$

$$Ux = y \implies y_i = x_i + \beta_i x_{i+1}$$

Out-of-range quantities are taken to be zero: $a_1 = a_{n+1} = b_0 = b_{n+1} = c_0 = c_n = \alpha_0 = \alpha_{n+1} = \beta_0 = \beta_n = \gamma_1 = \gamma_{n+1} = 0$

Forward ("chase") sweep, pass 1: compute $\alpha$ and $\beta$
$\alpha_1 = b_1$
$\alpha_i = b_i - \beta_{i-1} a_i$ ($2 \leqslant i \leqslant n$)
$\beta_i = c_i / \alpha_i$ ($1 \leqslant i \leqslant n-1$)
Forward ("chase") sweep, pass 2: solve $Ly = f$
$y_1 = f_1 / \alpha_1$
$y_i = \left( f_i - a_i y_{i-1} \right) / \alpha_i$ ($2 \leqslant i \leqslant n$)
Backward ("catch-up") sweep: solve $Ux = y$
$x_n = y_n$
$x_i = y_i - \beta_i x_{i+1}$ ($1 \leqslant i \leqslant n-1$)

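Clearly $\gamma$ is only a symbol used in the derivation; since $\gamma_i = a_i$, it carries no degrees of freedom of its own.
To save memory, $\alpha$ need not be stored as an array: once $\beta_i$ and $y_i$ have been computed, $\alpha_i$ is only needed to form $\alpha_{i+1}$, and the backward sweep uses only $\beta$ and $y$.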

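% Thomas algorithm on a 5x5 tridiagonal system in exact arithmetic.
% sym requires the Symbolic Math Toolbox.
A = sym([
  2 -1  0  0  0;
 -1  2 -1  0  0;
  0 -1  2 -1  0;
  0  0 -1  2 -1;
  0  0  0 -1  2;
]);
f = sym([
  1;
  0;
  0;
  0;
  0;
]);
n = 5;

% Accessors for the three diagonals: a (sub), b (main), c (super).
a = @(i) A(i,i-1);
b = @(i) A(i,i);
c = @(i) A(i,i+1);

alpha = sym(zeros(n,1));
beta = sym(zeros(n-1,1));

y = sym(zeros(n,1));
x = sym(zeros(n,1));

% Forward sweep, pass 1: alpha and beta, i = 1 -> n
alpha(1) = b(1);
for i = 2:n
    beta(i-1) = c(i-1) / alpha(i-1);
    alpha(i) = b(i) - beta(i-1) * a(i);
end

% Forward sweep, pass 2: solve L*y = f, i = 1 -> n
y(1) = f(1) / alpha(1);
for i = 2:n
    y(i) = (f(i) - a(i) * y(i-1)) / alpha(i);
end

% Backward sweep: solve U*x = y, i = n -> 1
x(n) = y(n);
for i = n-1:-1:1
    x(i) = y(i) - beta(i) * x(i+1);
end

disp(x);  % expected: [5/6; 2/3; 1/2; 1/3; 1/6]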
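The remark that $\alpha$ need not be stored can be sketched as follows. This variant is an illustration under assumptions, not the author's code: it works in double precision, keeps the three diagonals as vectors `a`, `b`, `c` (with `a(1)` and `c(n)` unused), merges the two forward passes into one loop, and holds only the current $\alpha_i$ in a scalar.

```matlab
% Thomas algorithm with a scalar alpha (double precision sketch).
n = 5;
a = [0; -ones(n-1, 1)];      % sub-diagonal, a(1) unused
b = 2 * ones(n, 1);          % main diagonal
c = [-ones(n-1, 1); 0];      % super-diagonal, c(n) unused
f = [1; 0; 0; 0; 0];

beta = zeros(n-1, 1);
y = zeros(n, 1);

% Forward sweep: alpha is overwritten at every step.
alpha = b(1);
y(1) = f(1) / alpha;
for i = 2:n
    beta(i-1) = c(i-1) / alpha;              % uses alpha_{i-1}
    alpha = b(i) - beta(i-1) * a(i);         % now holds alpha_i
    y(i) = (f(i) - a(i) * y(i-1)) / alpha;
end

% Backward sweep.
x = zeros(n, 1);
x(n) = y(n);
for i = n-1:-1:1
    x(i) = y(i) - beta(i) * x(i+1);
end

% Residual check against the assembled tridiagonal matrix.
A = diag(b) + diag(a(2:n), -1) + diag(c(1:n-1), 1);
assert(norm(A * x - f) < 1e-12);
```

The same idea allows `y` to overwrite `f` and `x` to overwrite `y` when the right-hand side is not needed afterwards.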

Matrix Operator Norms

Proof for the $L^1$ operator norm:
$$\|Ax\|_1 = \sum\limits_{i=1}^{n} \left| \sum\limits_{j=1}^{n} a_{ij} x_{j} \right| \leqslant \sum\limits_{j=1}^{n} |x_{j}| \sum\limits_{i=1}^{n} |a_{ij}| \leqslant \sum\limits_{j=1}^{n} |x_{j}| \left( \max\limits_{j} \sum\limits_{i=1}^{n} |a_{ij}| \right) = \|x\|_1 \left( \max\limits_{j} \sum\limits_{i=1}^{n} |a_{ij}| \right)$$
Equality holds for $x_{j_0} = \pm \|x\|_1$ and $x_{\mathrm{other}} = 0$, where $j_0 = \arg\max\limits_{j} \sum\limits_{i=1}^{n} |a_{ij}|$.

Proof for the $L^2$ operator norm:
See the next section.

Proof for the $L^\infty$ operator norm:
$$\|Ax\|_{+\infty} = \max\limits_{i} \left| \sum\limits_{j=1}^{n} a_{ij} x_{j} \right| \leqslant \max\limits_{i} \sum\limits_{j=1}^{n} |a_{ij}| |x_{j}| \leqslant \max\limits_{j} |x_{j}| \max\limits_{i} \sum\limits_{j=1}^{n} |a_{ij}| = \|x\|_{+\infty} \max\limits_{i} \sum\limits_{j=1}^{n} |a_{ij}|$$
Equality holds for $x_{j} = \mathrm{sign}(a_{i_0 j}) \|x\|_{+\infty}$ for all $j$, or $x_{j} = -\mathrm{sign}(a_{i_0 j}) \|x\|_{+\infty}$ for all $j$, where $i_0 = \arg\max\limits_{i} \sum\limits_{j=1}^{n} |a_{ij}|$.
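The equality conditions above describe explicit vectors that attain the two bounds. A minimal sketch, assuming a small integer example matrix (not from the text) so that all arithmetic is exact:

```matlab
% Vectors attaining the 1-norm and inf-norm bounds.
A = [1 -2 3; -4 5 -6; 7 -8 9];       % assumed example data

% 1-norm: put all the mass on the column j0 with the largest absolute sum.
[colMax, j0] = max(sum(abs(A), 1));
x1 = zeros(size(A, 2), 1);
x1(j0) = 1;                          % ||x1||_1 = 1
assert(abs(norm(A * x1, 1) - colMax) < 1e-12);

% inf-norm: match the signs of the row i0 with the largest absolute sum.
[rowMax, i0] = max(sum(abs(A), 2));
xInf = sign(A(i0, :))';
xInf(xInf == 0) = 1;                 % any sign works where a_{i0 j} = 0
assert(abs(norm(A * xInf, inf) - rowMax) < 1e-12);

% Both values agree with the built-in operator norms.
assert(abs(colMax - norm(A, 1)) < 1e-12);
assert(abs(rowMax - norm(A, inf)) < 1e-12);
```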

$$\|A\|_F = \sqrt{\sum\limits_{i=1}^{n} \sum\limits_{j=1}^{n} a_{ij}^2}$$
$$\|A\|_1 = \sup\limits_{\|x\|_1 = 1} \|Ax\|_1 = \max\limits_{j} \sum\limits_{i=1}^{n} |a_{ij}|$$
$$\|A\|_2 = \sup\limits_{\|x\|_2 = 1} \|Ax\|_2 = \sigma_{\max}(A) \quad \text{(the largest singular value of } A\text{)}$$
$$\|A\|_{+\infty} = \sup\limits_{\|x\|_{+\infty} = 1} \|Ax\|_{+\infty} = \max\limits_{i} \sum\limits_{j=1}^{n} |a_{ij}|$$

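% Numerical check of the four norm formulas; A is the matrix defined earlier.
A = double(A);   % work in floating point
tol = 1e-12;     % exact == on floating-point results may fail

% Frobenius norm: square root of the sum of squared entries
nF = sqrt(sum(abs(A(:)).^2))
assert(abs(nF - norm(A,'fro')) < tol);

% 1-norm: maximum absolute column sum
n1 = max(sum(abs(A), 1))
assert(abs(n1 - norm(A,1)) < tol);

% inf-norm: maximum absolute row sum
nInf = max(sum(abs(A), 2))
assert(abs(nInf - norm(A,inf)) < tol);

% 2-norm: largest singular value
n2 = max(svd(A))
assert(abs(n2 - norm(A,2)) < tol);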

Equivalence of Vector and Matrix Norms

Prove:

  1. $\|x\|_{+\infty} \leqslant \|x\|_{1} \leqslant n \|x\|_{+\infty}$
  2. $\frac{1}{\sqrt{n}} \|A\|_{F} \leqslant \|A\|_{2} \leqslant \|A\|_{F}$

$$\|x\|_{+\infty} = \max\limits_{1 \leqslant i \leqslant n} |x_i| \leqslant \sum\limits_{1 \leqslant i \leqslant n} |x_i| = \|x\|_{1} \leqslant \sum\limits_{1 \leqslant i \leqslant n} \left( \max\limits_{1 \leqslant k \leqslant n} |x_k| \right) = n \max\limits_{1 \leqslant i \leqslant n} |x_i| = n \|x\|_{+\infty}$$
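A quick numerical illustration of the chain $\|x\|_{+\infty} \leqslant \|x\|_{1} \leqslant n \|x\|_{+\infty}$, including vectors for which each side becomes an equality. The vectors are assumed examples with integer entries, so the comparisons are exact.

```matlab
% Equivalence of the 1-norm and the inf-norm on R^n.
x = [3; -4; 0; 1];                 % assumed example vector
n = numel(x);

nInf = norm(x, inf);               % largest absolute component
n1   = norm(x, 1);                 % sum of absolute components
assert(nInf <= n1 && n1 <= n * nInf);

% Left equality: a single nonzero component.
e = [0; 2; 0; 0];
assert(norm(e, inf) == norm(e, 1));

% Right equality: all components have the same magnitude.
u = [1; -1; 1; -1];
assert(norm(u, 1) == n * norm(u, inf));
```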

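$A = U \Sigma V^H$

Why use the singular value decomposition rather than the eigendecomposition?

  1. Not every matrix has an eigendecomposition of the form $A = Q \Lambda Q^{-1}$ (consider the Jordan canonical form: a matrix is diagonalizable only if the algebraic and geometric multiplicities of every eigenvalue agree).
  2. Not every diagonalizable matrix has an eigendecomposition of the form $A = Q \Lambda Q^{H}$ with unitary $Q$ (only normal matrices, $AA^H = A^HA$, are guaranteed to have orthonormal eigenvectors).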

$$\|A\|_{F} = \sqrt{\mathrm{tr}(A^HA)} = \sqrt{\mathrm{tr}(V \Sigma^2 V^H)} = \sqrt{\mathrm{tr}(\Sigma^2 V^H V)} = \sqrt{\mathrm{tr}(\Sigma^2 I)} = \sqrt{\mathrm{tr}(\Sigma^2)} = \sqrt{\sum\limits_{1 \leqslant i \leqslant n} \sigma_i(A^HA)} = \sqrt{\sum\limits_{1 \leqslant i \leqslant n} \sigma_i^2(A)}$$
$$\|A\|_{2} = \sup\limits_{\|x\|_2 = 1} \|Ax\|_2 = \sup\limits_{\|x\|_2 = 1} \|U \Sigma V^H x\|_2 \stackrel{z = V^H x}{=} \sup\limits_{\|z\|_2 = 1} \|U \Sigma z\|_2 = \sup\limits_{\|z\|_2 = 1} \|\Sigma z\|_2 = \sup\limits_{\|z\|_2 = 1} \sqrt{\sum\limits_{i=1}^{n} \sigma_i^2(A)\, |z_i|^2} = \sqrt{\sigma_{\max}^2(A)} = \sigma_{\max}(A)$$

Here $z_i = \langle v_i, x \rangle$ is the $i$-th component of $z = V^H x$, where $v_i$ is the $i$-th column of $V$. Since $V$ is unitary, $\{v_i\}_{i=1}^{n}$ forms an orthonormal basis.
Here $\langle x, y \rangle$ denotes the inner product of two $n$-dimensional vectors.
Since $\{v_i\}_{i=1}^{n}$ is an orthonormal basis and $\|x\|_2 = 1$, Parseval's identity gives $\sum\limits_{i=1}^{n} |z_i|^2 = \|x\|_2^2 = 1$, so the supremum over $\|x\|_2 = 1$ equals the supremum over $\|z\|_2 = 1$.
Since $U$ is unitary, it preserves the 2-norm, so $\|U \Sigma z\|_2 = \|\Sigma z\|_2$. The sum $\sum\limits_{i=1}^{n} \sigma_i^2(A) |z_i|^2$ is a weighted average of the $\sigma_i^2(A)$ with weights $|z_i|^2$ summing to 1; it is at most $\sigma_{\max}^2(A)$, with equality when $z$ is the unit vector belonging to the largest singular value.

$$\frac{1}{\sqrt{n}} \|A\|_{F} = \sqrt{\frac{1}{n} \sum\limits_{1 \leqslant i \leqslant n} \sigma_i^2(A)} \leqslant \sqrt{\frac{1}{n} \sum\limits_{1 \leqslant i \leqslant n} \sigma_{\max}^2(A)} = \sqrt{\sigma_{\max}^2(A)} = \|A\|_{2} \leqslant \sqrt{\sum\limits_{1 \leqslant i \leqslant n} \sigma_i^2(A)} = \|A\|_{F}$$
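Both norms can be read off from the singular values, which makes the inequality easy to check numerically. A minimal sketch on an assumed example matrix, followed by two matrices for which the bounds are attained:

```matlab
% Frobenius norm and 2-norm from singular values.
A = [1 -2 3; -4 5 -6; 7 -8 9];       % assumed example data
n = size(A, 1);
s = svd(A);                          % singular values, in decreasing order

nF = sqrt(sum(s.^2));                % ||A||_F
n2 = max(s);                         % ||A||_2
assert(abs(nF - norm(A, 'fro')) < 1e-10);
assert(abs(n2 - norm(A, 2)) < 1e-10);
assert(nF / sqrt(n) <= n2 + 1e-12 && n2 <= nF + 1e-12);

% Left equality: all singular values equal (identity matrix).
I3 = eye(3);
assert(abs(norm(I3, 'fro') / sqrt(3) - norm(I3, 2)) < 1e-10);

% Right equality: only one nonzero singular value (rank-one matrix).
R = ones(3);
assert(abs(norm(R, 'fro') - norm(R, 2)) < 1e-10);
```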

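Condition Number under an Operator Norm

Let $A, B \in \mathbb{R}^{n \times n}$ and let $\|\cdot\|$ be an operator norm on $\mathbb{R}^{n \times n}$. Prove that $\mathrm{cond}(AB) \leqslant \mathrm{cond}(A)\,\mathrm{cond}(B)$.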

Assume $A$ and $B$ are both nonsingular. Then
$\mathrm{cond}(A)\,\mathrm{cond}(B) = \|A\| \|A^{-1}\| \|B\| \|B^{-1}\|$
$\mathrm{cond}(AB) = \|AB\| \|B^{-1}A^{-1}\|$
so it suffices to prove $\|AB\| \|B^{-1}A^{-1}\| \leqslant (\|A\| \|B\|)(\|B^{-1}\| \|A^{-1}\|)$.

We prove the following more general statement: if $X$ and $Y$ are both nonsingular, then $\|XY\| \leqslant \|X\| \|Y\|$.
$$\|XY\| = \sup\limits_{w \neq 0} \frac{\|XYw\|}{\|w\|} = \sup\limits_{w \neq 0} \frac{\|XYw\|}{\|Yw\|} \frac{\|Yw\|}{\|w\|} \leqslant \sup\limits_{w \neq 0} \frac{\|XYw\|}{\|Yw\|} \sup\limits_{v \neq 0} \frac{\|Yv\|}{\|v\|} \stackrel{u = Yw}{=} \sup\limits_{v \neq 0} \frac{\|Yv\|}{\|v\|} \sup\limits_{u \neq 0} \frac{\|Xu\|}{\|u\|} = \|Y\| \|X\|$$

Note that $w \neq 0$ if and only if $u = Yw \neq 0$, because $Y$ is nonsingular. Applying the statement with $(X, Y) = (A, B)$ and with $(X, Y) = (B^{-1}, A^{-1})$ gives the required inequality.
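The inequality $\mathrm{cond}(AB) \leqslant \mathrm{cond}(A)\,\mathrm{cond}(B)$ can be spot-checked with the built-in `cond`, which accepts the norm type as a second argument. The matrices below are assumed examples; the small relative slack only guards against rounding.

```matlab
% Spot check of cond(A*B) <= cond(A)*cond(B) in three operator norms.
A = [4 1; 2 3];        % assumed example, nonsingular
B = [1 2; 0 5];        % assumed example, nonsingular

for p = {1, 2, inf}
    q = p{1};                              % norm type: 1, 2 or inf
    lhs = cond(A * B, q);
    rhs = cond(A, q) * cond(B, q);
    assert(lhs <= rhs * (1 + 1e-12));

    % The underlying submultiplicativity ||A*B|| <= ||A||*||B||.
    assert(norm(A * B, q) <= norm(A, q) * norm(B, q) * (1 + 1e-12));
end
```

Note that `cond(A, 'fro')` also exists, but the Frobenius norm is not an operator norm, so it falls outside the statement proved above.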
