These are reading notes on the book Linear Algebra and Its Applications.
Partitioned matrices
- A key feature of our work with matrices has been the ability to regard a matrix $A$ as a list of column vectors rather than just a rectangular array of numbers. This point of view has been so useful that we wish to consider other partitions of $A$, indicated by horizontal and vertical dividing rules.
EXAMPLE 1
- The matrix
can also be written as the $2 \times 3$ partitioned (or block) matrix
whose entries are the blocks (or submatrices)
EXAMPLE 2
- When a matrix $A$ appears in a mathematical model of a physical system such as an electrical network, a transportation system, or a large corporation, it may be natural to regard $A$ as a partitioned matrix.
- For instance, if a microcomputer circuit board consists mainly of three VLSI (very large-scale integrated) microchips, then the matrix for the circuit board might have the general form
$$A=\begin{bmatrix} A_{11}&A_{12}&A_{13}\\ A_{21}&A_{22}&A_{23}\\ A_{31}&A_{32}&A_{33}\end{bmatrix}$$
The submatrices on the “diagonal” of $A$, namely $A_{11}$, $A_{22}$, and $A_{33}$, concern the three VLSI chips, while the other submatrices depend on the interconnections among those microchips.
Addition and Scalar Multiplication
- If matrices $A$ and $B$ are the same size and are partitioned in exactly the same way, then it is natural to make the same partition of the ordinary matrix sum $A + B$.
- In this case, each block of $A + B$ is the (matrix) sum of the corresponding blocks of $A$ and $B$. Multiplication of a partitioned matrix by a scalar is also computed block by block.
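The block-by-block behavior of addition and scalar multiplication can be checked on a small case. The sketch below is only an illustration: the matrices, the partition, and the helper names (`block`, `matadd`, `scale`) are assumptions made for this example and do not come from the text.

```python
def block(M, rows, cols):
    """Extract the submatrix of M lying in the given row and column ranges."""
    return [[M[i][j] for j in cols] for i in rows]

def matadd(X, Y):
    """Entrywise sum of two equally sized matrices (lists of rows)."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    """Multiply every entry of X by the scalar c."""
    return [[c * x for x in row] for row in X]

# Hypothetical 3x3 matrices, both partitioned the same way:
# rows split 2 + 1, columns split 1 + 2.
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
row_parts = [range(0, 2), range(2, 3)]
col_parts = [range(0, 1), range(1, 3)]

full_sum = matadd(A, B)
scaled = scale(3, A)

# Each block of A + B equals the sum of the corresponding blocks.
sum_is_blockwise = all(
    block(full_sum, r, c) == matadd(block(A, r, c), block(B, r, c))
    for r in row_parts for c in col_parts
)
# Each block of 3A equals 3 times the corresponding block of A.
scaling_is_blockwise = all(
    block(scaled, r, c) == scale(3, block(A, r, c))
    for r in row_parts for c in col_parts
)
print(sum_is_blockwise, scaling_is_blockwise)
```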
Multiplication of Partitioned Matrices
- Partitioned matrices can be multiplied by the usual row–column rule as if the block entries were scalars, provided that for a product $AB$, the column partition of $A$ matches the row partition of $B$.
EXAMPLE 3
- Let
The 5 columns of $A$ are partitioned into a set of 3 columns and then a set of 2 columns. The 5 rows of $B$ are partitioned in the same way, into a set of 3 rows and then a set of 2 rows. We say that the partitions of $A$ and $B$ are conformable for block multiplication. It can be shown that the ordinary product $AB$ can be written as
It is important for each smaller product in the expression for $AB$ to be written with the submatrix from $A$ on the left. For instance,
- The row–column rule for multiplication of block matrices provides the most general way to regard the product of two matrices.
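Block multiplication with a conformable 3 + 2 partition can be compared against the ordinary product. The following is a minimal sketch under stated assumptions: the numerical entries of `A` and `B`, the 2 + 1 row split of `A`, and the helpers `block`, `matmul`, and `matadd` are all chosen for this illustration and are not taken from the notes.

```python
def block(M, rows, cols):
    """Extract the submatrix of M lying in the given row and column ranges."""
    return [[M[i][j] for j in cols] for i in rows]

def matmul(X, Y):
    """Ordinary row-column product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def matadd(X, Y):
    """Entrywise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

# Illustrative matrices: A is 3x5, B is 5x2.
A = [[2, -3, 1, 0, -4],
     [1, 5, -2, 3, -1],
     [0, -4, -2, 7, -1]]
B = [[6, 4], [-2, 1], [-3, 7], [-1, 3], [5, 2]]

# Columns of A split 3 + 2; rows of B split 3 + 2 (conformable).
# Rows of A split 2 + 1 (this split is free to choose).
A11 = block(A, range(0, 2), range(0, 3))
A12 = block(A, range(0, 2), range(3, 5))
A21 = block(A, range(2, 3), range(0, 3))
A22 = block(A, range(2, 3), range(3, 5))
B1 = block(B, range(0, 3), range(0, 2))
B2 = block(B, range(3, 5), range(0, 2))

# Row-column rule on blocks, with the block of A always on the left.
top = matadd(matmul(A11, B1), matmul(A12, B2))
bottom = matadd(matmul(A21, B1), matmul(A22, B2))
blockwise = top + bottom          # stack the two block rows
direct = matmul(A, B)

print(blockwise == direct)
```

The row partition of `A` only decides how the result is cut into block rows; what must match is the column partition of `A` and the row partition of `B`.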
- Let us see why block multiplication carried out as above is correct.
- For partitioned matrices $A$ and $B$, we want to prove
$$\begin{bmatrix} A_{11}&A_{12}\\ A_{21}&A_{22}\end{bmatrix} \begin{bmatrix} B_{11}&B_{12}\\ B_{21}&B_{22}\end{bmatrix}=\begin{bmatrix} A_{11}B_{11}+A_{12}B_{21}&A_{11}B_{12}+A_{12}B_{22}\\ A_{21}B_{11}+A_{22}B_{21}&A_{21}B_{12}+A_{22}B_{22}\end{bmatrix}\qquad(1)$$
- First leave $A$ unpartitioned and split $B$ by columns into two blocks:
$$AB=A\begin{bmatrix} B_{11}&B_{12}\end{bmatrix}$$
Recall the definition of matrix multiplication, $AB=\begin{bmatrix} A\boldsymbol b_{1}&A\boldsymbol b_{2}&\cdots&A\boldsymbol b_p\end{bmatrix}$. In other words, each column of $B$ supplies the weights for a linear combination of the columns of $A$, and each column works independently of the others. It follows naturally that
$$AB=A\begin{bmatrix} B_{11}&B_{12}\end{bmatrix}=\begin{bmatrix} AB_{11}&AB_{12}\end{bmatrix}$$
- Because each column works independently, we obtain a rule: when the columns of $B$ are partitioned, it suffices to find the form of the first block column of $AB$. Every other block column has the same form, with the column index of each block of $B$ replaced by the corresponding column index.
- For example, in (1) the second block column is the first block column with $B_{11}\rightarrow B_{12}$ and $B_{21}\rightarrow B_{22}$.
- Next leave $B$ unpartitioned and split $A$ by rows into two blocks:
$$AB=\begin{bmatrix} A_{11}\\A_{21}\end{bmatrix}B$$
Row $i$ of $AB$ is row $i$ of $A$ times $B$, that is, $AB=\begin{bmatrix} \operatorname{row}_{1}(A)\,B\\ \operatorname{row}_{2}(A)\,B\\ \vdots\\ \operatorname{row}_{m}(A)\,B\end{bmatrix}$, where $m$ is the number of rows of $A$. Therefore
$$AB=\begin{bmatrix} A_{11}\\A_{21}\end{bmatrix}B=\begin{bmatrix} A_{11}B\\A_{21}B\end{bmatrix}$$
- Similarly, we obtain a second rule: when the rows of $A$ are partitioned, it suffices to find the form of the first block row of $AB$. Every other block row has the same form, with the row index of each block of $A$ replaced by the corresponding row index.
- For example, in (1) the second block row is the first block row with $A_{11}\rightarrow A_{21}$ and $A_{12}\rightarrow A_{22}$.
- Now partition $A$ horizontally (by rows) and $B$ vertically (by columns) at the same time:
$$AB=\begin{bmatrix} A_{11}\\A_{21}\end{bmatrix}\begin{bmatrix} B_{11}&B_{12}\end{bmatrix}=\begin{bmatrix} \begin{bmatrix}A_{11}\\A_{21}\end{bmatrix}B_{11}&\begin{bmatrix}A_{11}\\A_{21}\end{bmatrix}B_{12}\end{bmatrix}=\begin{bmatrix} A_{11}B_{11}&A_{11}B_{12}\\A_{21}B_{11}&A_{21}B_{12}\end{bmatrix}$$
- By the two rules above,
$$\begin{aligned}\begin{bmatrix} A_{11}&A_{12}\\ A_{21}&A_{22}\end{bmatrix} \begin{bmatrix} B_{11}&B_{12}\\ B_{21}&B_{22}\end{bmatrix}&= \begin{bmatrix}\begin{bmatrix} A_{11}&A_{12}\\ A_{21}&A_{22}\end{bmatrix}\begin{bmatrix} B_{11}\\B_{21}\end{bmatrix}&\begin{bmatrix} A_{11}&A_{12}\\ A_{21}&A_{22}\end{bmatrix}\begin{bmatrix} B_{12}\\B_{22}\end{bmatrix}\end{bmatrix} \\&=\begin{bmatrix}\begin{bmatrix} A_{11}&A_{12}\end{bmatrix}\begin{bmatrix} B_{11}\\B_{21}\end{bmatrix}&\begin{bmatrix} A_{11}&A_{12}\end{bmatrix}\begin{bmatrix} B_{12}\\B_{22}\end{bmatrix}\\ \begin{bmatrix} A_{21}&A_{22}\end{bmatrix}\begin{bmatrix} B_{11}\\B_{21}\end{bmatrix}&\begin{bmatrix} A_{21}&A_{22}\end{bmatrix}\begin{bmatrix} B_{12}\\B_{22}\end{bmatrix}\end{bmatrix}\end{aligned}$$
To prove (1), it therefore suffices to prove $\begin{bmatrix} A_{11}&A_{12}\end{bmatrix}\begin{bmatrix} B_{11}\\B_{21}\end{bmatrix}=A_{11}B_{11}+A_{12}B_{21}$. The remaining entries of (1) follow from the index substitutions in the two rules. This identity can be explained from the column point of view: ordinarily, each column of $B$ forms a linear combination of all the columns of $A$. Here that work is split in two: the top rows of $B$ combine the first few columns of $A$, the bottom rows of $B$ combine the remaining columns of $A$, and the two partial results are added. The total is the same as letting $B$ combine all the columns of $A$ at once.
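A small numerical instance may make the last step of the argument concrete. The matrices below are made up for illustration and do not come from the text. The columns of $A$ are split 2 + 1, and the rows of the single-column $B$ are split the same way:

$$A=\left[\begin{array}{cc|c}1&2&3\\4&5&6\end{array}\right],\qquad B=\left[\begin{array}{c}1\\0\\ \hline 2\end{array}\right]$$

$$A_{11}B_{11}+A_{12}B_{21}=\begin{bmatrix}1&2\\4&5\end{bmatrix}\begin{bmatrix}1\\0\end{bmatrix}+\begin{bmatrix}3\\6\end{bmatrix}\begin{bmatrix}2\end{bmatrix}=\begin{bmatrix}1\\4\end{bmatrix}+\begin{bmatrix}6\\12\end{bmatrix}=\begin{bmatrix}7\\16\end{bmatrix}$$

This agrees with combining all three columns of $A$ at once: $AB=1\begin{bmatrix}1\\4\end{bmatrix}+0\begin{bmatrix}2\\5\end{bmatrix}+2\begin{bmatrix}3\\6\end{bmatrix}=\begin{bmatrix}7\\16\end{bmatrix}$.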
EXAMPLE 4
Compute $X^TX$, where $X$ is partitioned as $\begin{bmatrix} X_1&X_2\end{bmatrix}$.
SOLUTION
$$X^TX=\begin{bmatrix} X_1^T\\X_2^T\end{bmatrix}\begin{bmatrix} X_1&X_2\end{bmatrix}=\begin{bmatrix} X_1^TX_1&X_1^TX_2\\X_2^TX_1&X_2^TX_2\end{bmatrix}$$
- The partitions of $X^T$ and $X$ are automatically conformable for block multiplication because the columns of $X^T$ are the rows of $X$.
- This partition of $X^TX$ is used in several computer algorithms for matrix computations.
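The block form of $X^TX$ can be assembled from the four smaller products and compared with the direct product. This is a sketch only: the entries of `X`, the 2 + 1 column split, and the helper names are assumptions chosen for the example.

```python
def matmul(X, Y):
    """Ordinary row-column product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def hstack(X, Y):
    """Place two matrices with the same number of rows side by side."""
    return [rx + ry for rx, ry in zip(X, Y)]

# Hypothetical 4x3 matrix, partitioned by columns as [X1 X2] with widths 2 and 1.
X = [[1, 2, 0],
     [0, 1, 1],
     [2, 0, 1],
     [1, 1, 1]]
X1 = [row[:2] for row in X]
X2 = [row[2:] for row in X]
X1t, X2t = transpose(X1), transpose(X2)

# Assemble [[X1^T X1, X1^T X2], [X2^T X1, X2^T X2]] block row by block row.
assembled = (hstack(matmul(X1t, X1), matmul(X1t, X2))
             + hstack(matmul(X2t, X1), matmul(X2t, X2)))
direct = matmul(transpose(X), X)

print(assembled == direct)
```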
EXAMPLE 5
Use partitioned matrices to prove by induction that the product of two lower triangular matrices is also lower triangular.
SOLUTION
- [Hint: A $(k+1)\times(k+1)$ lower triangular matrix $A_1$ can be written in the form below, where $a$ is a scalar, $\boldsymbol v$ is in $\mathbb R^k$, and $A$ is a $k \times k$ lower triangular matrix.]
$$A_1=\begin{bmatrix} a&\boldsymbol 0^T\\\boldsymbol v&A\end{bmatrix}$$
- Prove: The product of two $n\times n$ lower triangular matrices is lower triangular. (*)
- The statement is true for $n = 1$.
- The “induction step” is next. Suppose that (*) is true when $n$ is some positive integer $k$, and consider any $(k+1)\times(k+1)$ lower triangular matrices $A_1$ and $B_1$. Partition these matrices as
$$A_1=\begin{bmatrix} a&\boldsymbol 0^T\\\boldsymbol v&A\end{bmatrix},\quad B_1=\begin{bmatrix} b&\boldsymbol 0^T\\\boldsymbol w&B\end{bmatrix}$$
where $A$ and $B$ are $k\times k$ matrices, $\boldsymbol v$ and $\boldsymbol w$ are in $\mathbb R^k$, and $a$ and $b$ are scalars. Since $A_1$ and $B_1$ are lower triangular, so are $A$ and $B$. Now
$$A_1B_1=\begin{bmatrix} a&\boldsymbol 0^T\\\boldsymbol v&A\end{bmatrix}\begin{bmatrix} b&\boldsymbol 0^T\\\boldsymbol w&B\end{bmatrix}=\begin{bmatrix} ab+\boldsymbol 0^T\boldsymbol w&a\boldsymbol 0^T+\boldsymbol 0^TB\\ \boldsymbol vb+A\boldsymbol w&\boldsymbol v\boldsymbol 0^T+AB\end{bmatrix}=\begin{bmatrix} ab&\boldsymbol 0^T\\ b\boldsymbol v+A\boldsymbol w&AB\end{bmatrix}$$
Assuming (*) is true for $n = k$, $AB$ must be lower triangular. The form of $A_1B_1$ shows that it, too, is lower triangular. Thus the statement (*) about lower triangular matrices is true for $n = k+1$ if it is true for $n = k$. By the principle of induction, (*) is true for all $n \geq 1$.
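The block form of $A_1B_1$ used in the induction step can be checked on a concrete $3\times3$ case. The sketch below is illustrative only: the entries of `A1` and `B1` and the helper names are assumptions for this example, and a numerical check of one case is of course not a substitute for the proof.

```python
def matmul(X, Y):
    """Ordinary row-column product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def is_lower_triangular(M):
    """True when every entry strictly above the main diagonal is zero."""
    n = len(M)
    return all(M[i][j] == 0 for i in range(n) for j in range(i + 1, n))

# Hypothetical 3x3 lower triangular matrices (so k = 2).
A1 = [[2, 0, 0], [1, 3, 0], [4, -1, 5]]
B1 = [[1, 0, 0], [-2, 2, 0], [3, 1, -1]]

# Partition each as [[a, 0^T], [v, A]] and [[b, 0^T], [w, B]].
a, b = A1[0][0], B1[0][0]
v = [row[0] for row in A1[1:]]
w = [row[0] for row in B1[1:]]
A = [row[1:] for row in A1[1:]]
B = [row[1:] for row in B1[1:]]

# Lower-left block b*v + A*w, and lower-right block AB.
lower_left = [b * vi + sum(x * y for x, y in zip(row, w)) for vi, row in zip(v, A)]
AB = matmul(A, B)

# Assemble [[ab, 0^T], [b v + A w, AB]] and compare with the direct product.
block_product = [[a * b] + [0] * len(A)] + [[s] + row for s, row in zip(lower_left, AB)]
direct_product = matmul(A1, B1)

print(block_product == direct_product, is_lower_triangular(direct_product))
```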
Inverses of Partitioned Matrices
EXAMPLE 6
A matrix of the form
$$A=\begin{bmatrix} A_{11}& A_{12}\\0& A_{22}\end{bmatrix}$$
is said to be block upper triangular. Assume that $A_{11}$ is $p \times p$, $A_{22}$ is $q \times q$, and $A$ is invertible. Find a formula for $A^{-1}$.
SOLUTION
- Denote $A^{-1}$ by $B$ and partition $B$ so that
$$\begin{bmatrix} A_{11}& A_{12}\\0& A_{22}\end{bmatrix}\begin{bmatrix} B_{11}& B_{12}\\B_{21}& B_{22}\end{bmatrix}=\begin{bmatrix} I_p& 0\\0& I_q\end{bmatrix}$$
- This matrix equation provides four equations:
$$\begin{aligned} A_{11}B_{11}+A_{12}B_{21}&=I_p &&\qquad(3)\\ A_{11}B_{12}+A_{12}B_{22}&=0 &&\qquad(4)\\ A_{22}B_{21}&=0 &&\qquad(5)\\ A_{22}B_{22}&=I_q &&\qquad(6)\end{aligned}$$
Since $A_{22}$ is square, equation (6) shows that $A_{22}$ is invertible and $B_{22}=A_{22}^{-1}$. Next, left-multiply both sides of (5) by $A_{22}^{-1}$ and obtain
$$B_{21}=A_{22}^{-1}0=0$$
so that (3) simplifies to
$$A_{11}B_{11} = I_p$$
Since $A_{11}$ is square, this shows that $A_{11}$ is invertible and $B_{11}=A_{11}^{-1}$. Finally, use these results with (4) to find that
$$A_{11}B_{12}=-A_{12}B_{22}=-A_{12}A_{22}^{-1}\quad\text{and}\quad B_{12}=-A_{11}^{-1}A_{12}A_{22}^{-1}$$
Thus
$$A^{-1}=\begin{bmatrix} A_{11}& A_{12}\\0& A_{22}\end{bmatrix}^{-1}=\begin{bmatrix} A_{11}^{-1}& -A_{11}^{-1}A_{12}A_{22}^{-1}\\0& A_{22}^{-1}\end{bmatrix}$$
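The formula for the inverse of a block upper triangular matrix can be verified in exact arithmetic. The following is a minimal sketch, assuming $p = q = 2$ and made-up blocks; the helpers `inv2` and `hstack` are written for this example only, and `inv2` handles just the $2\times2$ case.

```python
from fractions import Fraction

def matmul(X, Y):
    """Ordinary row-column product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inv2(M):
    """Inverse of a 2x2 matrix via the adjugate formula, using exact fractions."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("block is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def hstack(X, Y):
    """Place two matrices with the same number of rows side by side."""
    return [rx + ry for rx, ry in zip(X, Y)]

# Hypothetical blocks with p = q = 2.
A11 = [[1, 2], [3, 5]]
A12 = [[1, 0], [2, 1]]
A22 = [[7, 8], [5, 6]]
Z = [[0, 0], [0, 0]]
A = hstack(A11, A12) + hstack(Z, A22)

A11_inv, A22_inv = inv2(A11), inv2(A22)
# Upper-right block of the inverse: -A11^{-1} A12 A22^{-1}.
B12 = [[-x for x in row] for row in matmul(matmul(A11_inv, A12), A22_inv)]
A_inv = hstack(A11_inv, B12) + hstack(Z, A22_inv)

identity = [[int(i == j) for j in range(4)] for i in range(4)]
print(matmul(A, A_inv) == identity)
```

Only the two diagonal blocks are ever inverted; the off-diagonal block of the inverse is obtained from products alone.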
- A block diagonal matrix is a partitioned matrix with zero blocks off the main diagonal (of blocks).
- Such a matrix is invertible if and only if each block on the diagonal is invertible, and the inverse is the block diagonal matrix formed from the inverses of the diagonal blocks.
EXAMPLE 7
Without using row reduction, find the inverse of
$$A=\left[\begin{array}{lllll} 1 & 2 & 0 & 0 & 0 \\ 3 & 5 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 \\ 0 & 0 & 0 & 7 & 8 \\ 0 & 0 & 0 & 5 & 6 \end{array}\right]$$
SOLUTION
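- View the $5\times5$ matrix in this example as a $3\times3$ block matrix:
$$A=\begin{bmatrix} A_{11}&0&0\\0&A_{22}&0\\0&0&A_{33}\end{bmatrix},\quad A_{11}=\begin{bmatrix}1&2\\3&5\end{bmatrix},\quad A_{22}=\begin{bmatrix}2\end{bmatrix},\quad A_{33}=\begin{bmatrix}7&8\\5&6\end{bmatrix}$$
Finish by inverting each of the diagonal blocks and use the results to assemble $A^{-1}$:
$$A_{11}^{-1}=\begin{bmatrix}-5&2\\3&-1\end{bmatrix},\quad A_{22}^{-1}=\begin{bmatrix}1/2\end{bmatrix},\quad A_{33}^{-1}=\begin{bmatrix}3&-4\\-5/2&7/2\end{bmatrix}$$
$$A^{-1}=\begin{bmatrix} -5 & 2 & 0 & 0 & 0 \\ 3 & -1 & 0 & 0 & 0 \\ 0 & 0 & 1/2 & 0 & 0 \\ 0 & 0 & 0 & 3 & -4 \\ 0 & 0 & 0 & -5/2 & 7/2 \end{bmatrix}$$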
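The block-by-block inversion for the matrix of Example 7 can be checked with exact fractions. This is a sketch rather than a general routine: the helpers `inv2` and `block_diag` are names introduced here, and `inv2` covers only $2\times2$ blocks, which is all this example needs besides the $1\times1$ block.

```python
from fractions import Fraction

def matmul(X, Y):
    """Ordinary row-column product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inv2(M):
    """Inverse of a 2x2 matrix via the adjugate formula, using exact fractions."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("block is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def block_diag(*blocks):
    """Assemble a block diagonal matrix from square blocks."""
    n = sum(len(blk) for blk in blocks)
    M = [[0] * n for _ in range(n)]
    offset = 0
    for blk in blocks:
        for i, row in enumerate(blk):
            for j, x in enumerate(row):
                M[offset + i][offset + j] = x
        offset += len(blk)
    return M

# Diagonal blocks of the 5x5 matrix in Example 7.
A11 = [[1, 2], [3, 5]]
A22 = [[2]]
A33 = [[7, 8], [5, 6]]
A = block_diag(A11, A22, A33)

# Invert each diagonal block separately and reassemble.
A22_inv = [[Fraction(1, A22[0][0])]]
A_inv = block_diag(inv2(A11), A22_inv, inv2(A33))

identity = [[int(i == j) for j in range(5)] for i in range(5)]
print(matmul(A, A_inv) == identity)
```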