This post is a set of reading notes on *Linear Algebra and Its Applications*.
- If $A$ is an $m \times n$ matrix, each column of $A$ identifies a vector in $\mathbb R^m$:
$$A=[\ \boldsymbol a_1\ \ \boldsymbol a_2\ \ \dots\ \ \boldsymbol a_n\ ]$$
- The diagonal entries in an $m \times n$ matrix $A = [a_{ij}]$ are $a_{11}, a_{22}, a_{33},\dots$, and they form the main diagonal of $A$. A diagonal matrix is a square $n \times n$ matrix whose nondiagonal entries are zero. An $m \times n$ matrix whose entries are all zero is a zero matrix and is written as $0$. The size of a zero matrix is usually clear from the context.
- The arithmetic for vectors described earlier has a natural extension to matrices.
Sums and Scalar Multiples
Sum
- The sum $A + B$ is the $m \times n$ matrix whose columns are the sums of the corresponding columns in $A$ and $B$.
- The sum $A + B$ is defined only when $A$ and $B$ are the same size.
Scalar Multiple
- If $r$ is a scalar and $A$ is a matrix, then the scalar multiple $rA$ is the matrix whose columns are $r$ times the corresponding columns in $A$.
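As a quick numerical check of these two rules, here is a minimal numpy sketch (the matrices are made up for illustration):

```python
import numpy as np

# Example matrices, chosen only for illustration.
A = np.array([[4, 0, 5],
              [-1, 3, 2]])
B = np.array([[1, 1, 1],
              [3, 5, 7]])

# The sum A + B adds corresponding columns (equivalently, entries).
S = A + B

# The scalar multiple rA scales every column of A by r.
r = 2
rA = r * A
```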
Matrix Multiplication
Multiplication of matrices corresponds to composition of linear transformations
- When a matrix $B$ multiplies a vector $\boldsymbol x$, it transforms $\boldsymbol x$ into the vector $B\boldsymbol x$. If this vector is then multiplied in turn by a matrix $A$, the resulting vector is $A(B\boldsymbol x)$. See Figure 2.
Thus $A(B\boldsymbol x)$ is produced from $\boldsymbol x$ by a *composition* of mappings. Our goal is to represent this composite mapping as multiplication by a single matrix, denoted by $AB$, so that
$$A(B\boldsymbol x)=(AB)\boldsymbol x$$
- If $A$ is $m \times n$, $B$ is $n \times p$, and $\boldsymbol x$ is in $\mathbb R^p$, then
$$B\boldsymbol x=x_1\boldsymbol b_1+\dots+x_p\boldsymbol b_p$$
By the linearity of multiplication by $A$,
$$A(B\boldsymbol x)=A(x_1\boldsymbol b_1)+\dots+A(x_p\boldsymbol b_p)=x_1A\boldsymbol b_1+\dots+x_pA\boldsymbol b_p$$
- The vector $A(B\boldsymbol x)$ is a linear combination of the vectors $A\boldsymbol b_1,\dots,A\boldsymbol b_p$, using the entries in $\boldsymbol x$ as weights. In matrix notation, this linear combination is written as
$$A(B\boldsymbol x)=[\ A\boldsymbol b_1\ \ A\boldsymbol b_2\ \ \dots\ \ A\boldsymbol b_p\ ]\boldsymbol x$$
Thus multiplication by $[\ A\boldsymbol b_1\ \ A\boldsymbol b_2\ \ \dots\ \ A\boldsymbol b_p\ ]$ transforms $\boldsymbol x$ into $A(B\boldsymbol x)$. We have found the matrix we sought!
- Each column of $AB$ is a linear combination of the columns of $A$ using weights from the corresponding column of $B$.
- Obviously, the number of columns of $A$ must match the number of rows in $B$ in order for a linear combination such as $A\boldsymbol b_1$ to be defined. Also, the definition of $AB$ shows that $AB$ has the same number of rows as $A$ and the same number of columns as $B$.
The definition of $AB$ lends itself well to parallel processing on a computer: the columns of $B$ are assigned individually or in groups to different processors, which independently and hence simultaneously compute the corresponding columns of $AB$.
EXAMPLE 3
Compute $AB$, where $A=\begin{bmatrix}2&3\\1&-5\end{bmatrix}$ and $B=\begin{bmatrix}4&3&6\\1&-2&3\end{bmatrix}$.
SOLUTION
- By the column definition, $A\boldsymbol b_1=\begin{bmatrix}2&3\\1&-5\end{bmatrix}\begin{bmatrix}4\\1\end{bmatrix}=\begin{bmatrix}11\\-1\end{bmatrix}$, $A\boldsymbol b_2=\begin{bmatrix}2&3\\1&-5\end{bmatrix}\begin{bmatrix}3\\-2\end{bmatrix}=\begin{bmatrix}0\\13\end{bmatrix}$, and $A\boldsymbol b_3=\begin{bmatrix}2&3\\1&-5\end{bmatrix}\begin{bmatrix}6\\3\end{bmatrix}=\begin{bmatrix}21\\-9\end{bmatrix}$, so $AB=\begin{bmatrix}11&0&21\\-1&13&-9\end{bmatrix}$.
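The product can also be assembled one column at a time, exactly as the column definition prescribes; a numpy sketch:

```python
import numpy as np

A = np.array([[2, 3],
              [1, -5]])
B = np.array([[4, 3, 6],
              [1, -2, 3]])

# Column j of AB is A times column j of B.
AB_cols = [A @ B[:, j] for j in range(B.shape[1])]
AB = np.column_stack(AB_cols)
```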
- The definition of A B AB AB is important for theoretical work and applications, but the following rule provides a more efficient method for calculating the individual entries in A B AB AB when working small problems by hand.
- Let $row_i(A)$ denote the $i$th row of a matrix $A$. Then
$$row_i(AB)=row_i(A)\cdot B$$
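A numpy sketch of this row rule, reusing the matrices from EXAMPLE 3:

```python
import numpy as np

# A and B from EXAMPLE 3 above.
A = np.array([[2, 3],
              [1, -5]])
B = np.array([[4, 3, 6],
              [1, -2, 3]])

AB = A @ B
# Row i of AB equals (row i of A) times B.
rows = [A[i, :] @ B for i in range(A.shape[0])]
```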
Inner product
- View vectors in $\mathbb R^n$ as $n \times 1$ matrices. For $\boldsymbol u$ and $\boldsymbol v$ in $\mathbb R^n$, the matrix product $\boldsymbol u^T\boldsymbol v$ is a $1 \times 1$ matrix, called the scalar product, or inner product, of $\boldsymbol u$ and $\boldsymbol v$. It is usually written as a single real number without brackets.
- Inner products ($\boldsymbol u^T\boldsymbol v$ and $\boldsymbol v^T\boldsymbol u$) have the transpose symbol in the middle.
- $\boldsymbol u^T\boldsymbol v=\boldsymbol v^T\boldsymbol u$
Outer product
- The matrix product $\boldsymbol u\boldsymbol v^T$ is an $n \times n$ matrix, called the outer product of $\boldsymbol u$ and $\boldsymbol v$.
- Outer products ($\boldsymbol u\boldsymbol v^T$ and $\boldsymbol v\boldsymbol u^T$) have the transpose symbol on the outside.
- The outer products $\boldsymbol u\boldsymbol v^T$ and $\boldsymbol v\boldsymbol u^T$ are transposes of each other.
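A minimal numpy sketch of both products (the vectors are made up for illustration):

```python
import numpy as np

# Vectors in R^3, viewed as 3x1 matrices.
u = np.array([[1], [2], [3]])
v = np.array([[4], [0], [-1]])

inner = u.T @ v   # 1x1 matrix, usually written as the scalar it contains
outer = u @ v.T   # 3x3 matrix
```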
Properties of Matrix Multiplication
- Recall that $I_m$ represents the $m \times m$ identity matrix and $I_m\boldsymbol x=\boldsymbol x$ for all $\boldsymbol x$ in $\mathbb R^m$.
- Additional properties:
  - (1) The product of two upper triangular matrices of the same order is again upper triangular.
    - [Hint: this can be proved with partitioned matrices; see EXAMPLE 5 of "Partitioned matrices".]
  - (2) The product of two lower triangular matrices of the same order is again lower triangular.
    - [Hint: this can be proved with partitioned matrices; see EXAMPLE 5 of "Partitioned matrices".]
  - (3) The product of two diagonal matrices of the same order is again diagonal, and each new diagonal entry is the product of the two corresponding original diagonal entries.
  - (4) $A\boldsymbol e_i=\boldsymbol a_i$
PROOF
Property (a)
- Property (a) is the associative law, $A(BC)=(AB)C$. It follows from the fact that matrix multiplication corresponds to composition of linear transformations (which are functions), and it is known that the composition of functions is associative.
In discrete mathematics, the composition of functions $f\circ g$ is defined as the relational product $g*f$; one can show that the relational product is associative, so composition of functions is associative as well.
- Here is another proof of (a) that rests on the "column definition" of the product of two matrices. Let
$$C=[\ \boldsymbol c_1\ \ \boldsymbol c_2\ \ \dots\ \ \boldsymbol c_p\ ]\\BC=[\ B\boldsymbol c_1\ \ B\boldsymbol c_2\ \ \dots\ \ B\boldsymbol c_p\ ]\\A(BC)=[\ A(B\boldsymbol c_1)\ \ A(B\boldsymbol c_2)\ \ \dots\ \ A(B\boldsymbol c_p)\ ]$$
Recall that the definition of $AB$ makes $A(B\boldsymbol x)=(AB)\boldsymbol x$ for all $\boldsymbol x$, so
$$A(BC)=[\ (AB)\boldsymbol c_1\ \ \dots\ \ (AB)\boldsymbol c_p\ ]=(AB)C$$
WARNINGS:
- In general, $AB \neq BA$. If $AB = BA$, we say that $A$ and $B$ commute with one another.
- The cancellation laws do not hold for matrix multiplication. That is, if $AB = AC$, it is not true in general that $B = C$.
- If a product $AB$ is the zero matrix, you cannot conclude in general that either $A = 0$ or $B = 0$.
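All three warnings can be exhibited with $2 \times 2$ matrices; a numpy sketch (the matrices are chosen only for illustration):

```python
import numpy as np

A = np.array([[0, 1],
              [0, 0]])
B = np.array([[1, 0],
              [0, 0]])
C = np.array([[9, 9],
              [0, 0]])

AB = A @ B   # the zero matrix, even though A != 0 and B != 0
BA = B @ A   # nonzero, so A and B do not commute
AC = A @ C   # equals AB although B != C: no cancellation law
```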
Tip: When $B$ is square and $C$ has fewer columns than $A$ has rows, it is more efficient to compute $A(BC)$ than $(AB)C$.
*Checkpoint*:
Show that if $\boldsymbol y$ is a linear combination of the columns of $AB$, then $\boldsymbol y$ is a linear combination of the columns of $A$.
*Answer to Checkpoint*:
- If $\boldsymbol y$ is a linear combination of the columns of $AB$, then there is a vector $\boldsymbol x$ such that $\boldsymbol y=(AB)\boldsymbol x$. By the definition of matrix multiplication, $\boldsymbol y=A(B\boldsymbol x)$. This expresses $\boldsymbol y$ as a linear combination of the columns of $A$, using the entries of the vector $B\boldsymbol x$ as weights.
EXAMPLE 4
Let $x_1,\dots,x_n$ be fixed numbers. The matrix $V$ below, called a Vandermonde matrix, occurs in applications such as signal processing, error-correcting codes, and polynomial interpolation.
$$V=\begin{bmatrix}1&x_1&x_1^2&\dots&x_1^{n-1}\\1&x_2&x_2^2&\dots&x_2^{n-1}\\\vdots&\vdots&\vdots&&\vdots\\1&x_n&x_n^2&\dots&x_n^{n-1}\end{bmatrix}$$
Given $\boldsymbol y=(y_1,\dots,y_n)$ in $\mathbb R^n$, suppose $\boldsymbol c=(c_0,\dots,c_{n-1})$ in $\mathbb R^n$ satisfies $V\boldsymbol c=\boldsymbol y$, and define the polynomial
$$p(t)=c_0+c_1t+c_2t^2+\dots+c_{n-1}t^{n-1}$$
- a. Show that $p(x_1)=y_1,\dots,p(x_n)=y_n$. We call $p(t)$ an *interpolating polynomial for the points* $(x_1,y_1),\dots,(x_n,y_n)$ because the graph of $p(t)$ passes through the points.
- b. Suppose $x_1,\dots,x_n$ are distinct numbers. Show that the columns of $V$ are linearly independent.
- c. Prove: "If $x_1,\dots,x_n$ are distinct numbers, and $y_1,\dots,y_n$ are arbitrary numbers, then there is an interpolating polynomial of degree $\leq n-1$ for $(x_1,y_1),\dots,(x_n,y_n)$."
SOLUTION
- (b) [Hint: How many zeros can a polynomial of degree $n-1$ have?]
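A numpy sketch of part (c) in action: build $V$, solve $V\boldsymbol c=\boldsymbol y$, and check that $p$ passes through the points (the sample points are made up for illustration):

```python
import numpy as np

# Three sample points (x_i, y_i), interpolated by a degree <= 2 polynomial.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 3.0, 6.0])
n = len(x)

# Vandermonde matrix: row i is (1, x_i, x_i^2, ..., x_i^(n-1)).
V = np.vander(x, n, increasing=True)

# Solve V c = y for the coefficients c = (c_0, ..., c_{n-1}).
c = np.linalg.solve(V, y)

# Evaluating p at each x_i (i.e., computing V c) reproduces y.
p = V @ c
```

For these points the solver returns $p(t)=3-2t+t^2$; distinctness of the $x_i$ is exactly what makes $V$ invertible.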
Powers of a Matrix
- If $A$ is an $n \times n$ matrix and $k$ is a positive integer, then $A^k$ denotes the product of $k$ copies of $A$:
$$A^k=\underbrace{A\cdots A}_{k\ \text{factors}}$$
- If $k = 0$, then $A^0\boldsymbol x$ should be $\boldsymbol x$ itself. Thus $A^0$ is interpreted as the identity matrix.
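A minimal numpy sketch, including the $A^0$ convention (the matrix is made up for illustration):

```python
import numpy as np

A = np.array([[1, 1],
              [0, 1]])

A3 = np.linalg.matrix_power(A, 3)   # A A A
A0 = np.linalg.matrix_power(A, 0)   # interpreted as the identity matrix
```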
The Transpose of a Matrix
- The generalization of Theorem 3(d) to products of more than two factors can be stated in words as follows: the transpose of a product of matrices equals the product of their transposes in the reverse order.