Note: these are study notes on Section 3.9, Elementary Matrices and Equivalence, of the book Matrix Analysis and Applied Linear Algebra.
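Decomposing a complicated object into a combination of a few basic objects is a common way to attack mathematical problems; factoring a polynomial is one example. Similarly, in matrix algebra, a general matrix can sometimes be factored into a product of elementary matrices.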
Matrices of the form $\mathbf I - \mathbf{uv}^T$, where $\mathbf u$ and $\mathbf v$ are $n \times 1$ columns such that $\mathbf v^T \mathbf u \ne 1$, are called elementary matrices, and we know from the Sherman–Morrison formula that all such matrices are nonsingular and
$$(\mathbf I - \mathbf{uv}^T)^{-1} = \mathbf I - \frac{\mathbf{uv}^T}{\mathbf v^T \mathbf u - 1}.$$
Notice that inverses of elementary matrices are elementary matrices.
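The inverse formula above can be checked numerically. The following is a minimal sketch in plain Python using exact rational arithmetic; the helper names (`identity`, `matmul`, `elementary`, `elementary_inverse`) and the vectors `u`, `v` are illustrative assumptions, not taken from the book.

```python
from fractions import Fraction

def identity(n):
    """n x n identity matrix with exact rational entries."""
    return [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]

def matmul(X, Y):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]

def elementary(u, v):
    """Return I - u v^T."""
    n = len(u)
    I = identity(n)
    return [[I[r][c] - u[r] * v[c] for c in range(n)] for r in range(n)]

def elementary_inverse(u, v):
    """Return I - u v^T / (v^T u - 1); only valid when v^T u != 1."""
    n = len(u)
    vtu = sum(a * b for a, b in zip(v, u))
    if vtu == 1:
        raise ValueError("v^T u = 1, so I - u v^T is singular")
    I = identity(n)
    return [[I[r][c] - Fraction(u[r] * v[c]) / (vtu - 1) for c in range(n)]
            for r in range(n)]

# Illustrative vectors (an assumption for this sketch): v^T u = 2 + 2 + 0 = 4 != 1.
u = [1, 2, 0]
v = [2, 1, 4]

E = elementary(u, v)
E_inv = elementary_inverse(u, v)

# Both products should reproduce the identity matrix.
print(matmul(E, E_inv) == identity(3))
print(matmul(E_inv, E) == identity(3))
```

The inverse has the same form $\mathbf I - \mathbf u \mathbf w^T$ with $\mathbf w = \mathbf v / (\mathbf v^T \mathbf u - 1)$, which is why it is again an elementary matrix.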
We are particularly interested in the elementary matrices associated with the elementary row and column operations. Define:
- Type I: interchange rows (or columns) $i$ and $j$.
- Type II: multiply row (or column) $i$ by $\alpha$, where $\alpha \ne 0$.
- Type III: add a multiple of row (or column) $i$ to row (or column) $j$.
The elementary matrices corresponding to these three operations are
$$\begin{aligned}\mathbf E_1&=\mathbf I-\mathbf u \mathbf u^T, \quad \mathbf u=\mathbf e_j-\mathbf e_i \\ \mathbf E_2&=\mathbf I-(1-\alpha) \mathbf e_i \mathbf e_i^T \\ \mathbf E_3&=\mathbf I+\alpha \mathbf e_j \mathbf e_i^T \end{aligned}$$
where $\mathbf e_i$ denotes the $i$-th column of $\mathbf I$. One can verify that they have the following properties:
- When used as a left-hand multiplier, an elementary matrix of Type I, II, or III executes the corresponding row operation.
- When used as a right-hand multiplier, an elementary matrix of Type I, II, or III executes the corresponding column operation.
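The two properties above can be tried out on a small matrix. This is a minimal sketch in plain Python; the function names `type1`, `type2`, `type3`, the 0-based indexing, and the sample matrix `A` are assumptions made for illustration. One detail worth watching: with $\mathbf E_3 = \mathbf I + \alpha \mathbf e_j \mathbf e_i^T$, left multiplication adds $\alpha$ times row $i$ to row $j$, while right multiplication adds $\alpha$ times column $j$ to column $i$.

```python
def matmul(X, Y):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]

def type1(n, i, j):
    """E1 = I - u u^T with u = e_j - e_i (0-based indices, i != j)."""
    u = [0] * n
    u[j] += 1
    u[i] -= 1
    return [[int(r == c) - u[r] * u[c] for c in range(n)] for r in range(n)]

def type2(n, i, alpha):
    """E2 = I - (1 - alpha) e_i e_i^T, with alpha != 0."""
    return [[int(r == c) - (1 - alpha) * int(r == i and c == i)
             for c in range(n)] for r in range(n)]

def type3(n, i, j, alpha):
    """E3 = I + alpha e_j e_i^T (0-based indices, i != j)."""
    return [[int(r == c) + alpha * int(r == j and c == i)
             for c in range(n)] for r in range(n)]

# Sample matrix chosen for illustration only.
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

E1 = type1(3, 0, 2)      # swap index 0 and index 2
E2 = type2(3, 1, 10)     # scale index 1 by 10
E3 = type3(3, 0, 2, 2)   # alpha = 2, i = 0, j = 2

print(matmul(E1, A))  # rows 0 and 2 interchanged
print(matmul(A, E1))  # columns 0 and 2 interchanged
print(matmul(E2, A))  # row 1 multiplied by 10
print(matmul(A, E2))  # column 1 multiplied by 10
print(matmul(E3, A))  # 2 * (row 0) added to row 2
print(matmul(A, E3))  # 2 * (column 2) added to column 0
```

Building each matrix directly from its rank-one formula, rather than by editing an identity matrix, keeps the sketch close to the definitions in the text.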
For example, for Type I (with the block matrices drawn for $i < j$):
$$\begin{aligned}\mathbf E_1 \mathbf A &=(\mathbf I-(\mathbf e_j-\mathbf e_i)(\mathbf e_j-\mathbf e_i)^T)\mathbf A\\
&=\mathbf A-(\mathbf e_j \mathbf e_j^T+\mathbf e_i \mathbf e_i^T-\mathbf e_i \mathbf e_j^T-\mathbf e_j \mathbf e_i^T)\mathbf A\\
&=\mathbf A-(\mathbf e_j \mathbf A_{j*}+\mathbf e_i \mathbf A_{i*}-\mathbf e_i \mathbf A_{j*}-\mathbf e_j \mathbf A_{i*}) \\
&=\mathbf A-\left(\left[\begin{matrix}\mathbf 0 \\ \vdots \\ \mathbf 0 \\ \vdots \\ \mathbf A_{j*} \\ \vdots \\ \mathbf 0 \end{matrix}\right]+\left[\begin{matrix}\mathbf 0 \\ \vdots \\ \mathbf A_{i*} \\ \vdots \\ \mathbf 0\\ \vdots \\ \mathbf 0 \end{matrix}\right]-\left[\begin{matrix}\mathbf 0 \\ \vdots \\ \mathbf A_{j*} \\ \vdots \\ \mathbf 0 \\ \vdots \\ \mathbf 0 \end{matrix}\right]-\left[\begin{matrix}\mathbf 0 \\ \vdots \\ \mathbf 0 \\ \vdots \\ \mathbf A_{i*} \\ \vdots \\ \mathbf 0 \end{matrix}\right] \right),\\
\mathbf A \mathbf E_1 &=\mathbf A(\mathbf I-(\mathbf e_j-\mathbf e_i)(\mathbf e_j-\mathbf e_i)^T)\\
&=\mathbf A-\mathbf A(\mathbf e_j \mathbf e_j^T+\mathbf e_i \mathbf e_i^T-\mathbf e_i \mathbf e_j^T-\mathbf e_j \mathbf e_i^T)\\
&=\mathbf A-(\mathbf A_{*j} \mathbf e_j^T+ \mathbf A_{*i} \mathbf e_i^T- \mathbf A_{*j} \mathbf e_i^T- \mathbf A_{*i} \mathbf e_j^T) \\
&=\mathbf A-\left(\left[\begin{matrix}\mathbf 0 & \cdots & \mathbf 0 & \cdots & \mathbf A_{*j} & \cdots & \mathbf 0 \end{matrix}\right]+\left[\begin{matrix}\mathbf 0 & \cdots & \mathbf A_{*i} & \cdots & \mathbf 0 & \cdots & \mathbf 0 \end{matrix}\right]\right.\\
&\qquad\quad\left.-\left[\begin{matrix}\mathbf 0 & \cdots & \mathbf A_{*j} & \cdots & \mathbf 0 & \cdots & \mathbf 0 \end{matrix}\right]-\left[\begin{matrix}\mathbf 0 & \cdots & \mathbf 0 & \cdots & \mathbf A_{*i} & \cdots & \mathbf 0 \end{matrix}\right] \right)\end{aligned}$$
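Reading off the rows of the final expression for $\mathbf E_1 \mathbf A$: row $i$ becomes $\mathbf A_{i*} - \mathbf A_{i*} + \mathbf A_{j*} = \mathbf A_{j*}$, row $j$ becomes $\mathbf A_{j*} - \mathbf A_{j*} + \mathbf A_{i*} = \mathbf A_{i*}$, and every other row is unchanged, so rows $i$ and $j$ are interchanged; the column case is analogous. As a small concrete instance (the values $n = 3$, $i = 1$, $j = 2$ are chosen here purely for illustration):

$$\begin{aligned}
\mathbf u &= \mathbf e_2 - \mathbf e_1 = \left[\begin{matrix} -1 \\ 1 \\ 0 \end{matrix}\right], \qquad
\mathbf u \mathbf u^T = \left[\begin{matrix} 1 & -1 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 0 \end{matrix}\right], \\
\mathbf E_1 &= \mathbf I - \mathbf u \mathbf u^T = \left[\begin{matrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{matrix}\right], \\
\mathbf E_1 \mathbf A &= \left[\begin{matrix} \mathbf A_{2*} \\ \mathbf A_{1*} \\ \mathbf A_{3*} \end{matrix}\right], \qquad
\mathbf A \mathbf E_1 = \left[\begin{matrix} \mathbf A_{*2} & \mathbf A_{*1} & \mathbf A_{*3} \end{matrix}\right].
\end{aligned}$$

In this instance $\mathbf E_1$ is simply the identity matrix with its first two rows interchanged.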