Matrix Theory
(To be continued, 2021-09-28)
1. Intro. and Basics
1.1. Matrix Definition
- An $m \times n$ matrix is a rectangular array of numbers (or other mathematical objects) with $m$ rows and $n$ columns.
- It is usually denoted as:
  $$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$
- The element of $A$ in the $i$th row and $j$th column is denoted $a_{ij}$.
1.2. Basic Operations and Calculations
1.2.1. Addition
- Two matrices can be added only if they have the same dimensions:
  $$\begin{pmatrix} a & b \\ c & d \end{pmatrix} + \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} a+e & b+f \\ c+g & d+h \end{pmatrix}$$
1.2.2. Scaling
$$k\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} ka & kb \\ kc & kd \end{pmatrix}$$
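As a quick illustration, matrix addition and scaling can be sketched with plain Python lists of lists (the helper names `mat_add` and `mat_scale` are just illustrative):

```python
def mat_add(A, B):
    # Entrywise sum; A and B must have the same dimensions.
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(k, A):
    # Multiply every element by the scalar k.
    return [[k * a for a in row] for row in A]

print(mat_add([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[6, 8], [10, 12]]
print(mat_scale(2, [[1, 2], [3, 4]]))               # [[2, 4], [6, 8]]
```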
1.2.3. Matrix Multiplication
- Apart from scaling, two matrices can be multiplied only if the number of columns of the left matrix equals the number of rows of the right matrix.
- For two $2 \times 2$ matrices:
  $$\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} ae+bg & af+bh \\ ce+dg & cf+dh \end{pmatrix}$$
- More generally: if $A$ is an $m \times n$ matrix and $B$ is an $n \times p$ matrix, then $C = AB$ is an $m \times p$ matrix whose $ij$ element is
  $$c_{ij} = \sum_{k=1}^n a_{ik}b_{kj}.$$
📣 Hint: matrix multiplication is associative, $A(BC) = (AB)C$:
$$[A(BC)]_{ij} = \sum_{k=1}^n a_{ik}[BC]_{kj} = \sum_{k=1}^n \sum_{l=1}^p a_{ik}b_{kl}c_{lj} = \sum_{l=1}^p \sum_{k=1}^n a_{ik}b_{kl}c_{lj} = \sum_{l=1}^p [AB]_{il}c_{lj} = [(AB)C]_{ij}$$
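The summation formula translates directly into code; a minimal sketch using plain Python lists of lists (the helper name `mat_mul` is illustrative), with a spot-check of associativity:

```python
def mat_mul(A, B):
    # c_ij = sum over k of a_ik * b_kj; requires cols(A) == rows(B).
    n = len(B)
    assert all(len(row) == n for row in A)
    p = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [0, 2]]
# Associativity: A(BC) == (AB)C
print(mat_mul(A, mat_mul(B, C)) == mat_mul(mat_mul(A, B), C))  # True
```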
1.2.4. Transpose Matrix
- Denoted by $A^T$.
- It switches the rows and columns of $A$. That is,
  $$\text{if } A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}, \text{ then } A^T = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{pmatrix}$$
- Equivalently: $a_{ij}^T = a_{ji}$.
- Useful properties:
  - $(A^T)^T = A$
  - $(A+B)^T = A^T + B^T$
  - $(AB)^T = B^T A^T$
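The property $(AB)^T = B^TA^T$ can be spot-checked numerically; a small sketch using plain lists of lists (helper names illustrative):

```python
def transpose(A):
    # (A^T)_ij = A_ji: swap rows and columns.
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    # Standard matrix product via row-by-column dot products.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A)))  # True
```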
1.2.5. Inner and Outer Products
- Inner Product:
  - Also known as the dot product or scalar product.
  - It is the matrix product of a row vector times a column vector. For example,
    $$u^T v = \begin{pmatrix} u_1 & u_2 & u_3 \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = u_1v_1 + u_2v_2 + u_3v_3$$
✎ Note
- Orthogonal: if the inner product of two non-zero vectors is zero, the two vectors are orthogonal.
- Normalized: the norm of the vector equals 1, where
  $$||u|| = (u^T u)^{\frac{1}{2}} = (u_1^2 + u_2^2 + u_3^2)^{\frac{1}{2}}$$
- Orthonormal: orthogonal + normalized.
- Outer Product:
  - It is the matrix product of a column vector times a row vector. For example,
    $$uv^T = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}\begin{pmatrix} v_1 & v_2 & v_3 \end{pmatrix} = \begin{pmatrix} u_1v_1 & u_1v_2 & u_1v_3 \\ u_2v_1 & u_2v_2 & u_2v_3 \\ u_3v_1 & u_3v_2 & u_3v_3 \end{pmatrix}$$
📣 Hint!!!
- If $A = \begin{pmatrix} a & d \\ b & e \\ c & f \end{pmatrix}$, then $A^T A$ is symmetric.
- $\mathrm{Tr}(A^T A)$ is the sum of the squares of all the elements of $A$.
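The inner and outer products, and both hints above, can be spot-checked in plain Python (all helper names are illustrative):

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    # c_ij = sum_k a_ik * b_kj
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def dot(u, v):
    # Inner product u^T v of two vectors
    return sum(a * b for a, b in zip(u, v))

def outer(u, v):
    # Outer product u v^T: an len(u) x len(v) matrix
    return [[a * b for b in v] for a in u]

# Orthonormal check: e1 and e2 are orthogonal and each has norm 1.
e1, e2 = [1, 0, 0], [0, 1, 0]
print(dot(e1, e2))             # 0
print(math.sqrt(dot(e1, e1)))  # 1.0

# A^T A is symmetric, and Tr(A^T A) equals the sum of squares of A's elements.
A = [[1, 4], [2, 5], [3, 6]]
S = mat_mul(transpose(A), A)
print(S == transpose(S))                                                   # True
print(sum(S[i][i] for i in range(2)) == sum(a * a for r in A for a in r))  # True
```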
1.2.6. Inverse Matrix
- Only square matrices may have inverses. When a matrix $A$ has an inverse, we say it is invertible.
- Denoted by $A^{-1}$.
- Satisfies $AA^{-1} = A^{-1}A = I$.
- Properties:
  - $(AB)^{-1} = B^{-1}A^{-1}$
  - If $A$ is invertible, then $A^T$ is invertible and $(A^T)^{-1} = (A^{-1})^T$.
📣 Hint: if a matrix is invertible, then its inverse is unique.
$$\textbf{Proof. } \text{Assume } B \text{ and } C \text{ are both inverses of } A \text{, i.e. } BA = I \text{ and } AC = I \text{, with } B \neq C. \text{ However, } B = BI = B(AC) = (BA)C = IC = C \text{, a contradiction.}$$
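For the $2 \times 2$ case, the inverse has the well-known closed form $\frac{1}{ad-bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$; a small sketch verifying $AA^{-1} = I$ (helper names illustrative):

```python
def inv2(A):
    # Inverse of a 2x2 matrix [[a, b], [c, d]] via the standard formula
    # (1/(ad - bc)) * [[d, -b], [-c, a]]; requires ad - bc != 0.
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_mul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2], [3, 4]]
print(mat_mul(A, inv2(A)))  # [[1.0, 0.0], [0.0, 1.0]]
```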
1.3. Special Matrices
1.3.1. Zero Matrix
- Denoted by $0$.
- Can be any size; it is a matrix consisting of all zero elements.
- Multiplication by a zero matrix results in a zero matrix.
- For example, a $2 \times 2$ zero matrix:
  $$0 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$
1.3.2. Identity Matrix
- Denoted by $I$.
- $AI = IA = A$, where $A$ and $I$ are square matrices of the same size.
- For example, a $2 \times 2$ identity matrix:
  $$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
1.3.3. Diagonal Matrix
- Nonzero elements only on the diagonal.
- For example, a $2 \times 2$ diagonal matrix:
  $$D = \begin{pmatrix} d_1 & 0 \\ 0 & d_2 \end{pmatrix}$$
1.3.4. Banded (Band) Matrix
- Nonzero elements only on diagonal bands.
- For example, a $3 \times 3$ tridiagonal (banded) matrix:
  $$B = \begin{pmatrix} d_1 & a_1 & 0 \\ b_1 & d_2 & a_2 \\ 0 & b_2 & d_3 \end{pmatrix}$$
1.3.5. Triangular Matrix
- Upper Triangular Matrix:
  $$U = \begin{pmatrix} a & b & c \\ 0 & d & e \\ 0 & 0 & f \end{pmatrix}$$
- Lower Triangular Matrix:
  $$L = \begin{pmatrix} a & 0 & 0 \\ b & c & 0 \\ d & e & f \end{pmatrix}$$
1.3.6. Symmetric Matrices
- $A^T = A$
- For example,
  $$\begin{pmatrix} a & b & c \\ b & d & e \\ c & e & f \end{pmatrix}$$
1.3.7. Skew Symmetric Matrices
- $A^T = -A$
- For example,
  $$\begin{pmatrix} 0 & b & c \\ -b & 0 & e \\ -c & -e & 0 \end{pmatrix}$$
- Notice that the diagonal elements of a skew-symmetric matrix must be zero.
📣 Hint!!!
- Any square matrix $A$ can be written as the sum of a symmetric and a skew-symmetric matrix.
  $$\textbf{Proof. } A = \frac{1}{2}(A + A^T) + \frac{1}{2}(A - A^T)$$
- $A^T A$ is symmetric.
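The decomposition in the proof is easy to compute directly; a small sketch using plain lists of lists:

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[1, 2], [3, 4]]
At = transpose(A)
n = len(A)
# Symmetric part (A + A^T)/2 and skew-symmetric part (A - A^T)/2.
sym  = [[(A[i][j] + At[i][j]) / 2 for j in range(n)] for i in range(n)]
skew = [[(A[i][j] - At[i][j]) / 2 for j in range(n)] for i in range(n)]
print(sym)   # [[1.0, 2.5], [2.5, 4.0]]
print(skew)  # [[0.0, -0.5], [0.5, 0.0]]
```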
1.3.8. Orthogonal Matrices
- Definition:
  - $Q^{-1} = Q^T$. Equivalently, $QQ^T = Q^TQ = I$.
  - The columns of $Q$ form an orthonormal set of vectors; the same holds for the rows of $Q$.
  - An orthogonal matrix preserves lengths:
    $$||Qx||^2 = (Qx)^T(Qx) = x^TQ^TQx = x^TIx = x^Tx = ||x||^2$$
- Examples of Orthogonal Matrices
  - Rotation Matrices
    A matrix that rotates a vector in space doesn't change the vector's length, so it should be an orthogonal matrix. The transformation
    $$\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix}$$
    rotates a three-dimensional vector by an angle $\theta$ counterclockwise around the z-axis.
  - Permutation Matrices
    Multiplying on the left permutes the rows of a matrix; multiplying on the right permutes the columns. For example,
    $$\begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} = \begin{pmatrix} g & h & i \\ a & b & c \\ d & e & f \end{pmatrix}$$
    is the row permutation {3, 1, 2}.
    - A good way to understand and remember this: $PA = (PI)A$.
📣 Hint!!!
- The product of two orthogonal matrices is orthogonal.
- The identity matrix is orthogonal.
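The claim that rotation matrices are orthogonal can be spot-checked numerically; a small sketch (helper names illustrative):

```python
import math

def rotation_z(theta):
    # Counterclockwise rotation by theta around the z-axis.
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def transpose(Q):
    return [list(col) for col in zip(*Q)]

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

Q = rotation_z(0.7)
QtQ = mat_mul(transpose(Q), Q)
# Q^T Q should be the identity, up to floating-point round-off.
print(all(abs(QtQ[i][j] - (1 if i == j else 0)) < 1e-12
          for i in range(3) for j in range(3)))  # True
```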
References:
[1] Coursera: Matrix Algebra for Engineers