Contents
- Gaussian elimination on positive definite matrices
- (1) The diagonal entries of $A$ satisfy $a_{ii} > 0$ ($i = 1, 2, \ldots, n$);
- (2) $A_2$ is symmetric positive definite;
- (3) $a_{ii}^{(2)} \leqslant a_{ii}$ ($i = 2, 3, \cdots, n$);
- (4) The entry of $A$ with the largest absolute value lies on the diagonal;
- (5) $\max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}^{(2)}| \leqslant \max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}|$;
- (6) From (2), (3), (5): if $|a_{ij}| < 1$, then $|a_{ij}^{(k)}| < 1$ for all $k$.
- Thomas algorithm (chasing method)
- Matrix operator norms
- Equivalence of vector and matrix norms
Gaussian elimination on positive definite matrices
Let $A$ be a symmetric positive definite matrix. After one step of Gaussian elimination, $A$ is reduced to $\begin{pmatrix} a_{11} & a_1^T \\ 0 & A_2 \\ \end{pmatrix}$, where $A = (a_{ij})_{n}$ and $A_2 = (a^{(2)}_{ij})_{n-1}$. Prove the following:
(1) The diagonal entries of $A$ satisfy $a_{ii} > 0$ ($i = 1, 2, \ldots, n$);
(2) $A_2$ is symmetric positive definite;
(3) $a_{ii}^{(2)} \leqslant a_{ii}$ ($i = 2, 3, \cdots, n$);
(4) The entry of $A$ with the largest absolute value lies on the diagonal;
(5) $\max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}^{(2)}| \leqslant \max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}|$;
(6) From (2), (3), (5): if $|a_{ij}| < 1$ for all $i, j$, then $|a_{ij}^{(k)}| < 1$ for all $k$.
Notation: a superscript $(k)$ marks a matrix, vector, or entry after the $(k-1)$-th elimination step and before the $k$-th one.
For example, the superscript $(2)$ marks quantities after the first elimination step and before the second.
Apart from the symbols defined in the problem statement (such as $A_2$), a quantity written without a superscript is understood to carry the superscript $(1)$, that is, it refers to the state before the first elimination step.
$$A = A^T, \qquad A \succ 0.$$

$$A^{(1)} = A, \qquad A^{(2)} = G^{(1)} A^{(1)}$$

$$A^{(1)} = \begin{pmatrix} a^{(1)}_{11} & {a^{(1)}_1}^T \\ a^{(1)}_1 & A^{(1)}_2 \\ \end{pmatrix}$$

$$G^{(1)} = \begin{pmatrix} 1 & 0 & 0 & & 0 \\ -\frac{a_{21}}{a_{11}} & 1 & 0 & & 0 \\ -\frac{a_{31}}{a_{11}} & 0 & 1 & & 0 \\ \vdots & & & \ddots & \\ -\frac{a_{n1}}{a_{11}} & 0 & 0 & & 1 \\ \end{pmatrix} = \begin{pmatrix} 1 & 0^T \\ g^{(1)} & I \\ \end{pmatrix}, \qquad g^{(1)} = -\frac{1}{a^{(1)}_{11}} a^{(1)}_1$$
(1) The diagonal entries of $A$ satisfy $a_{ii} > 0$ ($i = 1, 2, \ldots, n$).
Take the standard basis vector $e_i$, with $(e_i)_i = 1$ and $(e_i)_j = 0$ for $j \neq i$.
By positive definiteness, $a_{ii} = A_{ii} = e_i^T A e_i > 0$.
The equality $A_{ii} = e_i^T A e_i$ is just the matrix product written out.
The inequality $e_i^T A e_i > 0$ follows from positive definiteness, since $e_i \neq 0$.
(2) $A_2$ is symmetric positive definite.

$$\begin{pmatrix} a_{11} & a_1^T \\ 0 & A_2 \\ \end{pmatrix} = A^{(2)} = G^{(1)} A^{(1)} = \begin{pmatrix} 1 & 0^T \\ g^{(1)} & I \\ \end{pmatrix} \begin{pmatrix} a^{(1)}_{11} & {a^{(1)}_1}^T \\ a^{(1)}_1 & A^{(1)}_2 \\ \end{pmatrix}$$

Comparing the lower-right blocks gives

$$A_2 = A^{(1)}_2 + g^{(1)}{a^{(1)}_1}^T = A^{(1)}_2 - \frac{1}{a^{(1)}_{11}}{a^{(1)}_1}{a^{(1)}_1}^T$$
Symmetry
Since $A^{(1)}$ is symmetric, its trailing block $A^{(1)}_2$ is symmetric.
The rank-one matrix ${a^{(1)}_1}{a^{(1)}_1}^T$ is clearly symmetric.
A linear combination of symmetric matrices is again symmetric, so $A_2 = A^{(1)}_2 - \frac{1}{a^{(1)}_{11}}{a^{(1)}_1}{a^{(1)}_1}^T$ is symmetric.
Positive definiteness
To prove $A_2 \succ 0$, it suffices to show that $v^T A_2 v > 0$ for every vector $v \neq 0$. From the formula for $A_2$,

$$v^TA_2v = v^TA^{(1)}_2v - \frac{1}{a^{(1)}_{11}} \left(v^T a^{(1)}_1\right)^2$$

Construct

$$\tilde{v} = \begin{pmatrix} - \frac{1}{a^{(1)}_{11}} \left(v^T a^{(1)}_1\right) \\ v \\ \end{pmatrix}$$

Then

$$\begin{aligned} \tilde{v}^TA^{(1)}\tilde{v} &= a^{(1)}_{11}\left(- \frac{1}{a^{(1)}_{11}} \left(v^T a^{(1)}_1\right)\right)^2 - 2 \frac{1}{a^{(1)}_{11}} \left(v^T a^{(1)}_1\right) v^T a^{(1)}_1 + v^TA^{(1)}_2v \\ &= v^TA^{(1)}_2v - \frac{1}{a^{(1)}_{11}} \left(v^T a^{(1)}_1\right)^2 \\ &= v^TA_2v \end{aligned}$$

Since $A^{(1)} \succ 0$ and $\tilde{v} \neq 0$ (because $v \neq 0$), we have $\tilde{v}^TA^{(1)}\tilde{v} > 0$, that is, $v^TA_2v > 0$.
Because $v \neq 0$ was arbitrary and the vector $\tilde{v}$ can always be constructed, $v^TA_2v > 0$ holds for every $v \neq 0$, so $A_2 \succ 0$.
Note: $v \in \mathbb{R}^{n-1}$ and $\tilde{v} \in \mathbb{R}^n$.
The computation reduces an $n$-variable quadratic form to an $(n-1)$-variable one through the block identity

$$\begin{pmatrix} x \\ \bm{y} \\ \end{pmatrix}^T \begin{pmatrix} \mu & \bm{m}^T \\ \bm{m} & \mathbf{M} \\ \end{pmatrix} \begin{pmatrix} x \\ \bm{y} \\ \end{pmatrix} = \mu x^2 + 2 x \bm{m}^T \bm{y} + \bm{y}^T \mathbf{M} \bm{y}$$

with the choices
- $x = - \frac{1}{a^{(1)}_{11}} \left(v^T a^{(1)}_1\right)$
- $\bm{y} = v$
- $\mu = a^{(1)}_{11}$
- $\bm{m} = a^{(1)}_1$
- $\mathbf{M} = A^{(1)}_2$
- $\begin{pmatrix} x \\ \bm{y} \\ \end{pmatrix} = \tilde{v}$
- $\begin{pmatrix} \mu & \bm{m}^T \\ \bm{m} & \mathbf{M} \\ \end{pmatrix} = A^{(1)}$
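The argument for (2) can be checked numerically. The following is a minimal sketch, not part of the original proof: the matrix `A` and the vector `v` are hypothetical test data chosen for illustration, and the variable names (`A21`, `vt`, and so on) are assumptions of this sketch.

```matlab
% Hypothetical symmetric positive definite test matrix
% (leading principal minors: 6, 32, 298, all positive).
A = [6 2 2;
     2 6 3;
     2 3 11];

a11 = A(1, 1);          % pivot a_11^(1)
a1  = A(2:end, 1);      % first column below the pivot, a_1^(1)
A21 = A(2:end, 2:end);  % trailing block A_2^(1)

% One elimination step: A_2 = A_2^(1) - a_1 * a_1' / a_11
A2 = A21 - (a1 * a1') / a11;

% Quadratic-form identity for an arbitrary nonzero v
v   = [1; -2];
vt  = [-(v' * a1) / a11; v];   % the lifted vector v~ in R^n
lhs = vt' * A * vt;            % v~' * A^(1) * v~
rhs = v' * A2 * v;             % v' * A_2 * v

isSym = norm(A2 - A2', 'fro') < 1e-12;
isPD  = all(eig((A2 + A2') / 2) > 0);
fprintf('symmetric: %d, positive definite: %d, |lhs - rhs| = %.2e\n', ...
        isSym, isPD, abs(lhs - rhs));
```

The eigenvalue test is applied to the symmetrized matrix only to guard against rounding; for exact data `A2` is already symmetric.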
(3) $a_{ii}^{(2)} \leqslant a_{ii}$ ($i = 2, 3, \cdots, n$).
Part (2) showed that

$$A_2 = A^{(1)}_2 - \frac{1}{a^{(1)}_{11}}{a^{(1)}_1}{a^{(1)}_1}^T$$

Reading off the diagonal entries (with $i = 2, \ldots, n$ indexing the rows of $A$),

$$a^{(2)}_{ii} = \left(A_2\right)_{ii} = \left(A^{(1)}_2\right)_{ii} - \frac{1}{a^{(1)}_{11}} \left({a^{(1)}_1}{a^{(1)}_1}^T\right)_{ii} = a_{ii} - \frac{1}{a^{(1)}_{11}} a_{i1}^2$$

Part (1) showed that $a^{(1)}_{11} > 0$, so

$$a^{(2)}_{ii} = a_{ii} - \frac{1}{a^{(1)}_{11}} a_{i1}^2 \leqslant a_{ii}$$
(4) The entry of $A$ with the largest absolute value lies on the diagonal.
A real symmetric matrix is Hermitian, so by the properties of Hermitian matrices $A$ (i) is diagonalizable, (ii) has real eigenvalues, and (iii) has an orthonormal set of eigenvectors. Hence $A = Q \Lambda Q^T$, where $\Lambda = \mathrm{diag}(\lambda_1, \cdots, \lambda_n)$ and $Q^TQ=QQ^T=I$.
By positive definiteness, $\Lambda \succ 0$.

$$\begin{aligned} A &= Q \Lambda Q^T \\ &= \begin{pmatrix} \uparrow & & \uparrow \\ q_1 & \cdots & q_n \\ \downarrow & & \downarrow \\ \end{pmatrix} \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \\ \end{pmatrix} \begin{pmatrix} \leftarrow & q_1^T & \rightarrow \\ & \vdots & \\ \leftarrow & q_n^T & \rightarrow \\ \end{pmatrix} \\ &= \begin{pmatrix} \uparrow & & \uparrow \\ q_1 & \cdots & q_n \\ \downarrow & & \downarrow \\ \end{pmatrix} \begin{pmatrix} \leftarrow & \lambda_1 q_1^T & \rightarrow \\ & \vdots & \\ \leftarrow & \lambda_n q_n^T & \rightarrow \\ \end{pmatrix} \\ &= \sum\limits_{k=1}^{n} \lambda_k q_k q_k^T \end{aligned}$$

Entrywise, $A_{ij} = \sum\limits_{k=1}^{n} \lambda_k q_{ki} q_{kj}$ with $\lambda_k > 0$, where $q_{ki}$ denotes the $i$-th component of $q_k$.
By the Cauchy-Schwarz inequality,

$$\left(\sum\limits_{k=1}^{n} \lambda_k q_{ki} q_{kj}\right)^2 \leqslant \left(\sum\limits_{k=1}^{n} \left(\sqrt{\lambda_k} q_{ki}\right)^2\right) \left(\sum\limits_{k=1}^{n} \left(\sqrt{\lambda_k} q_{kj}\right)^2\right) = \left(\sum\limits_{k=1}^{n} \lambda_k q_{ki}^2\right) \left(\sum\limits_{k=1}^{n} \lambda_k q_{kj}^2\right)$$

that is, $A_{ij}^2 \leqslant A_{ii} A_{jj}$.
Part (1) showed that $A_{ii}, A_{jj} > 0$, therefore

$$\max\{|A_{ii}|,|A_{jj}|\} = \max\{A_{ii},A_{jj}\} = \sqrt{\left(\max\{A_{ii},A_{jj}\}\right)^2} \geqslant \sqrt{A_{ii}A_{jj}} \geqslant \sqrt{A_{ij}^2} = |A_{ij}|$$

So $|A_{ij}| \leqslant \max\{|A_{ii}|,|A_{jj}|\}$, which proves that the entry of $A$ with the largest absolute value lies on the diagonal.
Remark: every principal submatrix of a positive (semi)definite matrix is itself positive (semi)definite.
With this fact the proof above can be shortened: since $A$ is positive definite, for $i \neq j$ the $2 \times 2$ principal submatrix $\begin{pmatrix} A_{ii} & A_{ij} \\ A_{ij} & A_{jj} \\ \end{pmatrix}$ is positive definite, so its determinant satisfies $A_{ii}A_{jj} - A_{ij}^2 > 0$.
(5) $\max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}^{(2)}| \leqslant \max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}|$.
Since $A$ is positive definite, its principal submatrix $A_2^{(1)}$ is symmetric positive definite.
Part (2) showed that $A_2$ is symmetric positive definite.
Part (4) showed that the entry of largest absolute value of a symmetric positive definite matrix lies on the diagonal.
Hence we may write $\max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}^{(2)}| = \max\limits_{2 \leqslant i \leqslant n} a_{ii}^{(2)} = a^{(2)}_{xx}$ and $\max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}| = \max\limits_{2 \leqslant i \leqslant n} a_{ii} = a_{yy}$ for some $2 \leqslant x,y \leqslant n$.
Part (3) showed that $a_{ii}^{(2)} \leqslant a_{ii}$ ($i=2,3,\cdots,n$).
Therefore

$$\max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}^{(2)}| = a^{(2)}_{xx} \leqslant a_{xx} \leqslant \max\limits_{2 \leqslant i \leqslant n} a_{ii} = \max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}|$$
(6) From (2), (3), (5): if $|a_{ij}| < 1$ for all $i, j$, then $|a_{ij}^{(k)}| < 1$ for all $k$.
By (2), the trailing block is again symmetric positive definite after every step, so (5) can be applied repeatedly. The statement of (5),

$$\max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}^{(2)}| \leqslant \max\limits_{2 \leqslant i,j \leqslant n} |a_{ij}^{(1)}| \leqslant \max\limits_{1 \leqslant i,j \leqslant n} |a_{ij}^{(1)}|$$

generalizes to

$$\max\limits_{k \leqslant i,j \leqslant n} |a_{ij}^{(k)}| \leqslant \max\limits_{k \leqslant i,j \leqslant n} |a_{ij}^{(k-1)}| \leqslant \max\limits_{k-1 \leqslant i,j \leqslant n} |a_{ij}^{(k-1)}|$$

Case $k \leqslant i,j \leqslant n$ (the entry lies in the trailing block of $A^{(k)}$):

$$|a_{ij}^{(k)}| \leqslant \max\limits_{k \leqslant p,q \leqslant n} |a_{pq}^{(k)}| \leqslant \max\limits_{k-1 \leqslant p,q \leqslant n} |a_{pq}^{(k-1)}| \leqslant \cdots \leqslant \max\limits_{1 \leqslant p,q \leqslant n} |a_{pq}^{(1)}| = \max\limits_{1 \leqslant p,q \leqslant n} |a_{pq}| < 1$$

Case $i<k \vee j<k$: if $j < i$ (and hence $j < k$), the entry was eliminated in step $j$, so $a_{ij}^{(k)} = 0$. Otherwise $i \leqslant j$ and $i < k$; row $i$ is no longer modified from $A^{(i)}$ onward, so

$$a_{ij}^{(k)} = a_{ij}^{(k-1)} = \cdots = a_{ij}^{(i)}$$

and, because $a_{ij}^{(i)}$ lies in the trailing block of $A^{(i)}$,

$$|a_{ij}^{(k)}| = |a_{ij}^{(i)}| \leqslant \max\limits_{i \leqslant p,q \leqslant n} |a_{pq}^{(i)}| \leqslant \max\limits_{i-1 \leqslant p,q \leqslant n} |a_{pq}^{(i-1)}| \leqslant \cdots \leqslant \max\limits_{1 \leqslant p,q \leqslant n} |a_{pq}^{(1)}| = \max\limits_{1 \leqslant p,q \leqslant n} |a_{pq}| < 1$$

In summary, $|a_{ij}^{(k)}| < 1$ for all $1 \leqslant i,j \leqslant n$.
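Claim (6) says that Gaussian elimination without pivoting causes no element growth on a symmetric positive definite matrix. The sketch below illustrates this on a hypothetical matrix (strictly diagonally dominant with positive diagonal, hence symmetric positive definite, and with all $|a_{ij}| < 1$); the matrix and the variable names are assumptions made for the example.

```matlab
% Hypothetical symmetric positive definite matrix with all |a_ij| < 1.
A = [0.9 0.3 0.2;
     0.3 0.8 0.1;
     0.2 0.1 0.7];
n = size(A, 1);

Ak = A;                          % A^(1)
maxAbs = zeros(n, 1);
maxAbs(1) = max(abs(Ak(:)));     % largest entry of A^(1)

for k = 1:n-1
    % k-th elimination step: zero out column k below the pivot
    for i = k+1:n
        m = Ak(i, k) / Ak(k, k);
        Ak(i, :) = Ak(i, :) - m * Ak(k, :);
    end
    maxAbs(k+1) = max(abs(Ak(:)));   % largest entry of A^(k+1)
end

disp(maxAbs');                   % every value should stay below 1
```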
Thomas algorithm (chasing method)
Factor the tridiagonal matrix as $A = LU$, with $L$ lower bidiagonal and $U$ unit upper bidiagonal:

$$\begin{pmatrix} b_1 & c_1 & & & \\ a_2 & b_2 & c_2 & & \\ & \ddots & \ddots & \ddots & \\ & & a_{n-1} & b_{n-1} & c_{n-1} \\ & & & a_n & b_n \\ \end{pmatrix} = \begin{pmatrix} \alpha_1 & & & & \\ \gamma_2 & \alpha_2 & & & \\ & \ddots & \ddots & & \\ & & \gamma_{n-1} & \alpha_{n-1} & \\ & & & \gamma_n & \alpha_n \\ \end{pmatrix} \begin{pmatrix} 1 & \beta_1 & & & \\ & 1 & \beta_2 & & \\ & & \ddots & \ddots & \\ & & & 1 & \beta_{n-1} \\ & & & & 1 \\ \end{pmatrix}$$

$$A = LU \implies \begin{cases} a_i = \gamma_i \\ b_i = \alpha_i + \beta_{i-1} \gamma_i \\ c_i = \alpha_i \beta_i \\ \end{cases}$$

$$Ly = f \implies f_i = \alpha_i y_i + \gamma_i y_{i-1}$$

$$Ux = y \implies y_i = x_i + \beta_i x_{i+1}$$

Boundary convention, so that the formulas hold for every $1 \leqslant i \leqslant n$:

$$a_1 = a_{n+1} = b_{0} = b_{n+1} = c_0 = c_n = \alpha_0 = \alpha_{n+1} = \beta_0 = \beta_n = \gamma_1 = \gamma_{n+1} = 0$$

Forward sweep, pass 1 (compute $\alpha$ and $\beta$):
- $\alpha_1 = b_1$
- $\alpha_i = b_i - \beta_{i-1} a_i$ ($2 \leqslant i \leqslant n$)
- $\beta_i = c_i / \alpha_i$ ($1 \leqslant i \leqslant n-1$)

Forward sweep, pass 2 (solve $Ly = f$):
- $y_1 = f_1 / \alpha_1$
- $y_i = \left( f_i - a_i y_{i-1} \right) / \alpha_i$ ($2 \leqslant i \leqslant n$)

Backward sweep (solve $Ux = y$):
- $x_n = y_n$
- $x_i = y_i - \beta_i x_{i+1}$ ($1 \leqslant i \leqslant n-1$)

Clearly $\gamma$ is only a symbol used in the derivation: since $\gamma_i = a_i$, it carries no independent degree of freedom.
To save memory, $\alpha$ need not be stored, for example by fusing the two forward passes, because $\alpha_i$ is needed only to compute $\beta_i$ and $y_i$.
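    % Example system: tridiagonal A and right-hand side f.
    % Symbolic values keep the arithmetic exact (requires the Symbolic Math Toolbox).
    A = sym([
         2 -1  0  0  0;
        -1  2 -1  0  0;
         0 -1  2 -1  0;
         0  0 -1  2 -1;
         0  0  0 -1  2;
    ]);
    f = sym([1; 0; 0; 0; 0]);
    n = 5;

    % Accessors for the three diagonals: a = sub-, b = main, c = super-diagonal.
    % a(1) and c(n) are out of range and are never evaluated below.
    a = @(i) A(i,i-1);
    b = @(i) A(i,i);
    c = @(i) A(i,i+1);

    alpha = sym(zeros(n,1));
    beta = sym(zeros(n-1,1));
    y = sym(zeros(n,1));
    x = sym(zeros(n,1));

    % Forward sweep, pass 1: alpha and beta, i = 1 -> n
    alpha(1) = b(1);
    for i = 2:n
        beta(i-1) = c(i-1) ./ alpha(i-1);
        alpha(i) = b(i) - beta(i-1) .* a(i);
    end

    % Forward sweep, pass 2: solve L*y = f, i = 1 -> n
    y(1) = f(1) ./ alpha(1);
    for i = 2:n
        y(i) = (f(i) - a(i) .* y(i-1)) ./ alpha(i);
    end

    % Backward sweep: solve U*x = y, i = n -> 1
    x(n) = y(n);
    for i = n-1:-1:1
        x(i) = y(i) - beta(i) .* x(i+1);
    end

    disp(x);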
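As a hand check of the three sweeps, the recurrences can be traced for the $5 \times 5$ example used in the code ($a_i = c_i = -1$, $b_i = 2$, $f = e_1$). The values below were obtained by applying the formulas of this section step by step; they are a worked illustration rather than output copied from a run.

$$\begin{aligned} \alpha &= \left(2,\ \tfrac{3}{2},\ \tfrac{4}{3},\ \tfrac{5}{4},\ \tfrac{6}{5}\right), & \beta &= \left(-\tfrac{1}{2},\ -\tfrac{2}{3},\ -\tfrac{3}{4},\ -\tfrac{4}{5}\right), \\ y &= \left(\tfrac{1}{2},\ \tfrac{1}{3},\ \tfrac{1}{4},\ \tfrac{1}{5},\ \tfrac{1}{6}\right), & x &= \left(\tfrac{5}{6},\ \tfrac{2}{3},\ \tfrac{1}{2},\ \tfrac{1}{3},\ \tfrac{1}{6}\right). \end{aligned}$$

For instance, $\alpha_2 = b_2 - \beta_1 a_2 = 2 - \left(-\tfrac{1}{2}\right)(-1) = \tfrac{3}{2}$ and $x_4 = y_4 - \beta_4 x_5 = \tfrac{1}{5} + \tfrac{4}{5} \cdot \tfrac{1}{6} = \tfrac{1}{3}$. Substituting back confirms the result: the first row gives $2 \cdot \tfrac{5}{6} - \tfrac{2}{3} = 1 = f_1$, and every other row gives $0$.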
Matrix operator norms
Proof for the operator 1-norm:

$$\|Ax\|_1 = \sum\limits_{i=1}^{n} \left| \sum\limits_{j=1}^{n} a_{ij} x_{j} \right| \leqslant \sum\limits_{j=1}^{n} |x_{j}| \sum\limits_{i=1}^{n} |a_{ij}| \leqslant \sum\limits_{j=1}^{n} |x_{j}| \left( \max\limits_{j} \sum\limits_{i=1}^{n} |a_{ij}| \right) = \|x\|_1 \left( \max\limits_{j} \sum\limits_{i=1}^{n} |a_{ij}| \right)$$

Equality is attained for $x_{j_0} = \pm \|x\|_1$ and $x_{\mathrm{other}} = 0$, where $j_0 = \arg\max\limits_{j} \sum\limits_{i=1}^{n} |a_{ij}|$.
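The equality condition can be illustrated numerically. In this sketch the matrix `A` is a hypothetical example and the variable names are assumptions; the vector `x` puts all of its mass on the column with the largest absolute column sum.

```matlab
% Hypothetical example matrix (any real matrix would do).
A = [1 -2  3;
     4  0 -1;
     2 -6  1];

colSums = sum(abs(A), 1);        % sum_i |a_ij| for each column j
[bound, j0] = max(colSums);      % j0 = argmax_j sum_i |a_ij|

x = zeros(size(A, 2), 1);
x(j0) = 1;                       % ||x||_1 = 1, all mass on column j0

attained = norm(A * x, 1);       % equals the j0-th absolute column sum
fprintf('bound = %g, attained = %g, norm(A,1) = %g\n', ...
        bound, attained, norm(A, 1));
```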
Proof for the operator 2-norm:
See the next section.
Proof for the operator $\infty$-norm:

$$\|Ax\|_{+\infty} = \max\limits_{i} \left| \sum\limits_{j=1}^{n} a_{ij} x_{j} \right| \leqslant \max\limits_{i} \sum\limits_{j=1}^{n} |a_{ij}| |x_{j}| \leqslant \max\limits_{j} |x_{j}| \max\limits_{i} \sum\limits_{j=1}^{n} |a_{ij}| = \|x\|_{+\infty} \max\limits_{i} \sum\limits_{j=1}^{n} |a_{ij}|$$

Equality is attained for $x_{j} = \mathrm{sign}(a_{i_0 j}) \|x\|_{+\infty}$ for all $j$, or $x_{j} = - \mathrm{sign}(a_{i_0 j}) \|x\|_{+\infty}$ for all $j$, where $i_0 = \arg\max\limits_{i} \sum\limits_{j=1}^{n} |a_{ij}|$.
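The same kind of check works for the $\infty$-norm. This sketch uses a hypothetical matrix `A` and assumed variable names; the vector `x` copies the sign pattern of the row with the largest absolute row sum.

```matlab
% Hypothetical example matrix (any real matrix would do).
A = [1 -2  3;
     4  0 -1;
     2 -6  1];

rowSums = sum(abs(A), 2);        % sum_j |a_ij| for each row i
[bound, i0] = max(rowSums);      % i0 = argmax_i sum_j |a_ij|

x = sign(A(i0, :))';             % x_j = sign(a_{i0 j}), so ||x||_inf = 1

attained = norm(A * x, inf);     % equals the i0-th absolute row sum
fprintf('bound = %g, attained = %g, norm(A,inf) = %g\n', ...
        bound, attained, norm(A, inf));
```

Row `i0` of this example has no zero entries; a zero entry would simply give a zero component of `x`, which does not change the attained value.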
Summary of the norms:

$$\|A\|_F = \sqrt{\sum\limits_{i=1}^{n} \sum\limits_{j=1}^{n} |a_{ij}|^2}$$

$$\|A\|_1 = \sup\limits_{\|x\|_1 = 1} \|Ax\|_1 = \max\limits_{j} \sum\limits_{i=1}^{n} |a_{ij}|$$

$$\|A\|_2 = \sup\limits_{\|x\|_2 = 1} \|Ax\|_2 = \sigma_{\max}(A)$$

$$\|A\|_{+\infty} = \sup\limits_{\|x\|_{+\infty} = 1} \|Ax\|_{+\infty} = \max\limits_{i} \sum\limits_{j=1}^{n} |a_{ij}|$$
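    % Work in floating point (no-op if A is already double) and
    % compare up to rounding error instead of testing exact equality.
    A = double(A);
    tol = 1e-10;
    % Frobenius norm: square root of the sum of squared magnitudes
    nF = sqrt(sum(abs(A(:)).^2))
    assert(abs(nF - norm(A,'fro')) < tol);
    % 1-norm: largest absolute column sum
    n1 = max(sum(abs(A), 1))
    assert(abs(n1 - norm(A,1)) < tol);
    % inf-norm: largest absolute row sum
    nInf = max(sum(abs(A), 2))
    assert(abs(nInf - norm(A,inf)) < tol);
    % 2-norm: largest singular value
    n2 = max(svd(A))
    assert(abs(n2 - norm(A,2)) < tol);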
Equivalence of vector and matrix norms
Prove:
- $\|x\|_{+\infty} \leqslant \|x\|_{1} \leqslant n \|x\|_{+\infty}$
- $\frac{1}{\sqrt{n}} \|A\|_{F} \leqslant \|A\|_{2} \leqslant \|A\|_{F}$

For the vector norms:

$$\|x\|_{+\infty} = \max\limits_{1\leqslant i \leqslant n} |x_i| \leqslant \sum\limits_{1\leqslant i \leqslant n} |x_i| = \|x\|_{1} \leqslant \sum\limits_{1\leqslant i \leqslant n} \left( \max\limits_{1\leqslant k \leqslant n} |x_k| \right) = n \max\limits_{1\leqslant i \leqslant n} |x_i| = n \|x\|_{+\infty}$$

For the matrix norms, use the singular value decomposition $A = U \Sigma V^H$.
Why the singular value decomposition rather than the eigenvalue decomposition?
- Not every matrix has an eigenvalue decomposition of the form $A = Q \Lambda Q^{-1}$ (consider the Jordan normal form: a matrix is diagonalizable only if every eigenvalue has equal algebraic and geometric multiplicity).
- Not every diagonalizable matrix can be written as $A = Q \Lambda Q^{H}$ with unitary $Q$ (only normal matrices, $AA^H=A^HA$, are guaranteed to have orthogonal eigenvectors).
$$\|A\|_{F} = \sqrt{\mathrm{tr}(A^HA)} = \sqrt{\mathrm{tr}(V \Sigma^2 V^H)} = \sqrt{\mathrm{tr}(\Sigma^2 V^H V)} = \sqrt{\mathrm{tr}(\Sigma^2 I)} = \sqrt{\mathrm{tr}(\Sigma^2)} = \sqrt{\sum\limits_{1\leqslant i \leqslant n} \sigma_i(A^HA)} = \sqrt{\sum\limits_{1\leqslant i \leqslant n} \sigma_i^2(A)}$$
$$\|A\|_{2} = \sup\limits_{\|x\|_2 = 1} \|Ax\|_2 = \sup\limits_{\|x\|_2 = 1} \|U \Sigma V^H x\|_2 \stackrel{z = V^H x}{=} \sup\limits_{\|z\|_2 = 1} \|U \Sigma z\|_2 = \sup\limits_{\|z\|_2 = 1} \|\Sigma z\|_2 = \sup\limits_{\|z\|_2 = 1} \sqrt{\sum\limits_{i=1}^{n} \sigma_i^2(A) |z_i|^2} = \sqrt{\sigma_{\max}^2(A)} = \sigma_{\max}(A)$$

Here $U$ and $V$ are unitary, so their columns form orthonormal bases and multiplication by $U$ or $V^H$ preserves the 2-norm (Parseval's identity): $\|z\|_2 = \|V^H x\|_2 = \|x\|_2 = 1$ and $\|U \Sigma z\|_2 = \|\Sigma z\|_2$.
Since $\sum\limits_{i=1}^{n} |z_i|^2 = \|z\|_2^2 = 1$, we have $\sum\limits_{i=1}^{n} \sigma_i^2(A) |z_i|^2 \leqslant \sigma_{\max}^2(A)$, with equality when $z$ is the standard basis vector belonging to the largest singular value.
Combining the two expressions:

$$\frac{1}{\sqrt{n}} \|A\|_{F} = \sqrt{\frac{1}{n} \sum\limits_{1\leqslant i \leqslant n} \sigma_i^2(A)} \leqslant \sqrt{\frac{1}{n} \sum\limits_{1\leqslant i \leqslant n} \sigma_{\max}^2(A)} = \sqrt{\sigma_{\max}^2(A)} = \|A\|_{2} \leqslant \sqrt{\sum\limits_{1\leqslant i \leqslant n} \sigma_i^2(A)} = \|A\|_{F}$$
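The two-sided bound can be checked directly from the singular values. This is a minimal sketch on a hypothetical matrix `A`; the tolerance `tol` and the variable names are assumptions made for the example.

```matlab
% Hypothetical example matrix (any square matrix would do).
A = [1 -2  3;
     4  0 -1;
     2 -6  1];
n = size(A, 1);

s  = svd(A);                 % singular values, all nonnegative
nF = sqrt(sum(s.^2));        % ||A||_F expressed through singular values
n2 = max(s);                 % ||A||_2 = largest singular value

tol = 1e-12;                 % slack for rounding error
okLower = nF / sqrt(n) <= n2 + tol;
okUpper = n2 <= nF + tol;

fprintf('||A||_F/sqrt(n) = %.4f <= ||A||_2 = %.4f <= ||A||_F = %.4f\n', ...
        nF / sqrt(n), n2, nF);
```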
Condition number under a matrix operator norm
Let $A, B \in \mathbb{R}^{n \times n}$ and let $\|\cdot\|$ be an operator norm on $\mathbb{R}^{n \times n}$. Prove that $\mathrm{cond}(AB) \leqslant \mathrm{cond}(A)\,\mathrm{cond}(B)$.
Assume $A$ and $B$ are both nonsingular. Then

$$\mathrm{cond}(A)\mathrm{cond}(B) = \|A\|\|A^{-1}\|\|B\|\|B^{-1}\|$$

$$\mathrm{cond}(AB) = \|AB\|\|(AB)^{-1}\| = \|AB\|\|B^{-1}A^{-1}\|$$

so it suffices to prove

$$\|AB\|\|B^{-1}A^{-1}\| \leqslant (\|A\|\|B\|) (\|B^{-1}\|\|A^{-1}\|)$$

We prove the following more general statement: for nonsingular $X$ and $Y$,

$$\|XY\| \leqslant \|X\|\|Y\|$$

Applying it once to the pair $(A, B)$ and once to the pair $(B^{-1}, A^{-1})$ gives the claim.

$$\|XY\| = \sup\limits_{w \neq 0} \frac{\|XYw\|}{\|w\|} = \sup\limits_{w \neq 0} \frac{\|XYw\|}{\|Yw\|}\frac{\|Yw\|}{\|w\|} \leqslant \sup\limits_{w \neq 0} \frac{\|XYw\|}{\|Yw\|} \sup\limits_{v \neq 0} \frac{\|Yv\|}{\|v\|} \stackrel{u=Yw}{=} \sup\limits_{v \neq 0} \frac{\|Yv\|}{\|v\|} \sup\limits_{u \neq 0} \frac{\|Xu\|}{\|u\|} = \|Y\| \|X\|$$

Note that $w \neq 0$ if and only if $u = Yw \neq 0$, because $Y$ is nonsingular; this justifies both dividing by $\|Yw\|$ and the change of variable $u = Yw$.
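The inequality can be spot-checked for the three operator norms discussed earlier. In this sketch `A` and `B` are hypothetical nonsingular matrices, and the small relative slack on the right-hand side is an assumption added to absorb rounding error.

```matlab
% Hypothetical nonsingular matrices (det(A) = 5, det(B) = 2).
A = [2 1;
     1 3];
B = [1 4;
     0 2];

ps = {1, 2, inf};                % operator norms to test
ok = true(1, numel(ps));
for t = 1:numel(ps)
    p = ps{t};
    lhs = cond(A * B, p);                    % cond(AB)
    rhs = cond(A, p) * cond(B, p);           % cond(A) * cond(B)
    ok(t) = lhs <= rhs * (1 + 1e-12);        % allow for rounding error
end

disp(ok);
```

The Frobenius norm is deliberately left out of `ps`: it is not an operator norm, so the statement proved above does not cover it.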