A Tutorial on Matrix Analysis

These pages are a collection of my personal review of matrix analysis, mainly about matrices and topics related to them, such as spaces, norms, etc. These are the things that really matter in data science and in almost all machine learning algorithms. Hence, I collected them in this form for the convenience of anyone who wants a quick desktop or mobile reference.


1 Algebraic and Analytic Structures

1.1 Group

A set $G$ with a binary operation is a group if, for all $a, b, c \in G$:

1) $(ab)c = a(bc)$ (associativity)
2) there is an identity $e$ with $ea = ae = a$
3) each $a$ has an inverse $x$ with $ax = xa = e$

1.2 Abelian Group

A group is Abelian if, in addition to 1)–3) above, it satisfies

4) $ab = ba$ (commutativity)

1.3 Ring $(R, +, \cdot)$

1) $(R, +)$ is an Abelian group.
2) $(ab)c = a(bc)$
3) $a(b + c) = ab + ac$ and $(b + c)a = ba + ca$

1.4 Equivalence Relation

1) $a \sim a$ (reflexivity)
2) $a \sim b$ implies $b \sim a$ (symmetry)
3) $a \sim b$ and $b \sim c$ implies $a \sim c$ (transitivity)

1.5 Partial Order

1) $a \le a$ (reflexivity)
2) $a \le b$ and $b \le c$ implies $a \le c$ (transitivity)
3) $a \le b$ and $b \le a$ implies $a = b$ (antisymmetry)

1.6 Majorization and Weak Majorization

1) Majorization

Let

$$x = [x_1, x_2, \dots, x_n] \in \mathbb{R}^n, \qquad y = [y_1, y_2, \dots, y_n] \in \mathbb{R}^n,$$

with entries arranged in decreasing order. We say that $x$ is majorized by $y$, denoted by $x \prec y$, if

$$\sum_{j=1}^{k} x_j \le \sum_{j=1}^{k} y_j \quad \text{for } k \in [1:n-1], \qquad \text{and} \qquad \sum_{j=1}^{n} x_j = \sum_{j=1}^{n} y_j.$$

2) Weak Majorization

For $x, y \in \mathbb{R}^n$ (again with entries in decreasing order), we say that $x$ is weakly majorized by $y$, denoted by $x \prec_w y$, if

$$\sum_{j=1}^{k} x_j \le \sum_{j=1}^{k} y_j \quad \text{for } k \in [1:n-1], \qquad \text{and} \qquad \sum_{j=1}^{n} x_j \le \sum_{j=1}^{n} y_j.$$
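These conditions are easy to test numerically. Below is a minimal NumPy sketch (the function name `majorizes` and the tolerance `tol` are my own choices, not from the text) that sorts both vectors in decreasing order and compares partial sums:

```python
import numpy as np

def majorizes(y, x, weak=False, tol=1e-12):
    """Return True if x is (weakly) majorized by y."""
    xs = np.sort(np.asarray(x, dtype=float))[::-1]   # decreasing order
    ys = np.sort(np.asarray(y, dtype=float))[::-1]
    px, py = np.cumsum(xs), np.cumsum(ys)
    if not np.all(px[:-1] <= py[:-1] + tol):         # partial sums, k = 1..n-1
        return False
    if weak:
        return px[-1] <= py[-1] + tol                # total sums: inequality
    return abs(px[-1] - py[-1]) <= tol               # total sums: equality

print(majorizes([3, 0, 0], [1, 1, 1]))               # True: both sum to 3
print(majorizes([2, 0, 0], [1, 1, 1], weak=True))    # False: 1+1+1 > 2
```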

1.7 Supremum and Infimum

Let $T$ be a subset of a poset $(S, \le)$. An element $a$ is said to be a supremum of $T$, denoted by $\sup T$, if

1) $a \in S$
2) $b \le a$ for all $b \in T$
3) $a \le c$ for any other upper bound $c$

An element $a$ is said to be an infimum of $T$, denoted by $\inf T$, if

1) $a \in S$
2) $a \le b$ for all $b \in T$
3) $c \le a$ for any other lower bound $c$

1.8 Lattice

Let $a, b \in S$. Then $\inf\{a, b\}$ is also denoted by $a \wedge b$, called the meet of $a, b$; and $\sup\{a, b\}$ is denoted by $a \vee b$, called the join of $a, b$. A poset $(S, \le)$ is called a lattice if $a \wedge b$ and $a \vee b$ exist for all $a, b \in S$. (For example, the positive integers ordered by divisibility form a lattice, with meet $\gcd$ and join $\mathrm{lcm}$.)

2 Linear Spaces

2.1 Linear Space

A set $\mathcal{X}$ is said to be a linear space (or vector space) over a field $F$ if:

1) $\alpha x \in \mathcal{X}$ when $\alpha \in F$, $x \in \mathcal{X}$ (closure)
2) $(\alpha\beta)x = \alpha(\beta x)$
3) $\alpha(x + y) = \alpha x + \alpha y$
4) $(\alpha + \beta)x = \alpha x + \beta x$
5) $1x = x$

2.2 Dimension and Basis

Several vectors $x_1, x_2, \dots, x_m \in \mathcal{X}$ are said to be linearly independent if

$$\alpha_1 x_1 + \alpha_2 x_2 + \dots + \alpha_m x_m = 0$$

implies $\alpha_1 = \alpha_2 = \dots = \alpha_m = 0$. If $m$ is the maximal number of linearly independent vectors in $\mathcal{X}$, then $m$ is the dimension of $\mathcal{X}$, and $x_1, x_2, \dots, x_m$ form a basis of $\mathcal{X}$. Some common dimensions:

$$\dim \mathbb{R}^n = n, \quad \dim \mathbb{R}^{m \times n} = mn, \quad \dim \mathbb{C}^n = n, \quad \dim \mathbb{C}^{m \times n} = mn, \quad \dim \mathbb{H}_n = n^2$$

where $\mathbb{H}_n$ denotes the $n \times n$ Hermitian matrices, viewed as a linear space over $\mathbb{R}$.

2.3 Null Space and Range Space

$$N(A) = \{x \in \mathcal{X} : Ax = 0\}, \qquad R(A) = \{Ax : x \in \mathcal{X}\}$$
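SciPy ships helpers for both spaces. The sketch below uses the real `scipy.linalg.null_space` and `scipy.linalg.orth` functions on an arbitrary rank-1 example:

```python
import numpy as np
from scipy.linalg import null_space, orth

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])    # rank 1: dim N(A) = 2, dim R(A) = 1

N = null_space(A)                  # orthonormal basis for {x : Ax = 0}
R = orth(A)                        # orthonormal basis for {Ax : x in X}
print(N.shape[1], R.shape[1])      # 2 1
print(np.allclose(A @ N, 0))       # True: every basis vector is in N(A)
```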

2.4 Normed Linear Space

For vectors:

$$\|x\|_p = \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p}, \qquad \|x\|_\infty = \max_i |x_i|$$

For matrices:

For $A \in \mathbb{C}^{m \times n}$:

$$\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{m} |a_{ij}|, \qquad \|A\|_2 = \sigma_1(A), \qquad \|A\|_p = \sup_{\|x\|_p = 1} \|Ax\|_p, \qquad \|A\|_\infty = \max_{1 \le i \le m} \sum_{j=1}^{n} |a_{ij}|$$

where $\sigma_1(A)$ is the maximum singular value of $A$; $\|A\|_1$ is the maximum absolute column sum, and $\|A\|_\infty$ is the maximum absolute row sum.
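NumPy's `np.linalg.norm` implements all of the norms above; a quick check on a small example:

```python
import numpy as np

x = np.array([3.0, -4.0])
print(np.linalg.norm(x, 1),            # 7.0 = |3| + |-4|
      np.linalg.norm(x, 2),            # 5.0
      np.linalg.norm(x, np.inf))       # 4.0 = max |x_i|

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])
print(np.linalg.norm(A, 1))            # 6.0: max absolute column sum
print(np.linalg.norm(A, 2))            # sigma_1(A), the largest singular value
print(np.linalg.norm(A, np.inf))       # 7.0: max absolute row sum
```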

2.5 Inner Product Space

$$\langle x, y \rangle = x^* y$$

A linear space with an inner product is called an inner product space.

2.6 Gram–Schmidt Orthonormalization

$$q_1 = \frac{a_1}{\|a_1\|}, \qquad q_2 = \frac{a_2 - \langle a_2, q_1 \rangle q_1}{\left\| a_2 - \langle a_2, q_1 \rangle q_1 \right\|}, \qquad q_i = \frac{a_i - \sum_{j=1}^{i-1} \langle a_i, q_j \rangle q_j}{\left\| a_i - \sum_{j=1}^{i-1} \langle a_i, q_j \rangle q_j \right\|}$$
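A minimal sketch of the procedure for real vectors (a complex version would need conjugation in the inner products); this is the classical variant, not the numerically preferred modified one:

```python
import numpy as np

def gram_schmidt(A):
    """Orthonormalize the (assumed independent) columns a_1..a_m of A."""
    m = A.shape[1]
    Q = np.zeros_like(A, dtype=float)
    for i in range(m):
        v = A[:, i].copy()
        for j in range(i):                      # subtract <a_i, q_j> q_j
            v -= (A[:, i] @ Q[:, j]) * Q[:, j]
        Q[:, i] = v / np.linalg.norm(v)         # normalize
    return Q

A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])
Q = gram_schmidt(A)
print(np.allclose(Q.T @ Q, np.eye(2)))          # True: orthonormal columns
```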

3 Matrix Factorization and Decompositions

3.1 Eigenvalues and Eigenvectors

The characteristic polynomial of A is defined to be

$$C_A(z) = \det(zI - A)$$

A complex number $\lambda$ satisfying $C_A(\lambda) = 0$ is called an eigenvalue of $A$, and a nonzero vector $x \in \mathbb{C}^n$ such that $Ax = \lambda x$ is called a right eigenvector of $A$ corresponding to the eigenvalue $\lambda$.

3.2 Spectrum

The spectrum $\sigma(A)$ is the set of eigenvalues of $A$.
The spectral radius $\rho(A)$ is the maximum modulus of the eigenvalues of $A$, i.e., $\rho(A) = \max_i |\lambda_i|$.
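Both quantities are one line each in NumPy (the example matrix is arbitrary):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, X = np.linalg.eig(A)            # spectrum {1, 3} and right eigenvectors
rho = np.max(np.abs(lam))            # spectral radius rho(A) = 3.0
print(sorted(lam.real), rho)
print(np.allclose(A @ X[:, 0], lam[0] * X[:, 0]))   # Ax = lambda x
```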

3.3 Diagonalization

$$C_A(z) = (z - \lambda_1)^{n_1} (z - \lambda_2)^{n_2} \cdots (z - \lambda_l)^{n_l}$$

where $n_i \ge 1$ and $\sum_{i=1}^{l} n_i = n$; $n_i$ is the algebraic multiplicity of $\lambda_i$.

Eigenspace: $\varepsilon_i = N(A - \lambda_i I)$
Generalized eigenspace: $\tilde{\varepsilon}_i = N[(A - \lambda_i I)^{n_i}]$

3.4 Jordan Canonical Form

Choose a suitable basis from each $\tilde{\varepsilon}_i$ to form $P$, and transform $A$ by $P^{-1}AP$ to get a Jordan canonical form. We can also get the columns of $P$ from a Jordan chain:

$$A\mu_1 = \lambda\mu_1, \qquad A\mu_2 = \lambda\mu_2 + \mu_1, \qquad A\mu_3 = \lambda\mu_3 + \mu_2, \qquad \dots$$

3.5 QR Factorization

$$A_{n \times m} = QR$$

$$Q = [q_1 \; q_2 \; \dots \; q_m], \qquad R = Q^* A$$

where the $q_i$ come from Gram–Schmidt orthonormalization of the columns of $A$, so that $R$ is upper triangular.
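`np.linalg.qr` computes the reduced factorization directly; this sketch confirms $A = QR$ and $R = Q^* A$ on a random matrix:

```python
import numpy as np

A = np.random.rand(4, 3)
Q, R = np.linalg.qr(A)                           # reduced QR: Q 4x3, R 3x3
print(np.allclose(Q.conj().T @ Q, np.eye(3)))    # orthonormal columns
print(np.allclose(A, Q @ R))                     # A = QR
print(np.allclose(R, Q.conj().T @ A))            # R = Q* A
```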

3.6 Schur Factorization

$$T_{n \times n} = U^* A_{n \times n} U$$

$U$: a unitary matrix
$A$: a matrix with eigenvalues $\lambda_1, \dots, \lambda_n$
$T$: an upper triangular matrix, with $\lambda_1, \dots, \lambda_n$ on its diagonal
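SciPy's `scipy.linalg.schur` returns exactly this pair; `output='complex'` forces a truly triangular (rather than quasi-triangular real) $T$:

```python
import numpy as np
from scipy.linalg import schur

A = np.random.rand(4, 4)
T, U = schur(A, output='complex')                # A = U T U*
print(np.allclose(A, U @ T @ U.conj().T))        # True
print(np.allclose(T, np.triu(T)))                # T is upper triangular
```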

3.7 SVD Decomposition

$$A_{m \times n} = U S_{m \times n} V^*$$

The left-singular vectors of $A$ (columns of $U$) are a set of orthonormal eigenvectors of $AA^*$.
The right-singular vectors of $A$ (columns of $V$) are a set of orthonormal eigenvectors of $A^*A$.
The diagonal entries of $S$ are the square roots of the non-negative eigenvalues of both $A^*A$ and $AA^*$, known as the singular values.
e.g. For a square matrix $T = QAQ^*$ with $Q$ unitary and $A = USV^*$,

$$T = QAQ^* = Q(USV^*)Q^* = (QU)S(QV)^*,$$

which is an SVD of $T$, since $QU$ and $QV$ are again unitary.
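The stated relations are easy to verify with `np.linalg.svd`:

```python
import numpy as np

A = np.random.rand(3, 2)
U, s, Vh = np.linalg.svd(A, full_matrices=False)
print(np.allclose(A, U @ np.diag(s) @ Vh))       # A = U S V*
# singular values are the square roots of the eigenvalues of A* A
print(np.allclose(np.sort(s**2),
                  np.sort(np.linalg.eigvalsh(A.conj().T @ A))))
```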

3.8 Spectral Decomposition

$$A = \sum_{i=1}^{k} \lambda_i G_i$$

$$P^{-1}AP = \mathrm{diag}\{\lambda_1, \lambda_2, \dots, \lambda_k\}$$

$$A = P\,\mathrm{diag}\{\lambda_1, \lambda_2, \dots, \lambda_k\}\,P^{-1}
= [\alpha_1, \alpha_2, \dots, \alpha_k]
\begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_k \end{bmatrix}
\begin{bmatrix} \beta_1^T \\ \vdots \\ \beta_k^T \end{bmatrix}
= \lambda_1 \alpha_1 \beta_1^T + \lambda_2 \alpha_2 \beta_2^T + \dots + \lambda_k \alpha_k \beta_k^T
= \lambda_1 G_1 + \lambda_2 G_2 + \dots + \lambda_k G_k$$

where $G_i = \alpha_i \beta_i^T$, with $\alpha_i$ the columns of $P$ and $\beta_i^T$ the rows of $P^{-1}$. The $G_i$ have the following properties:

$$\sum_{i=1}^{k} G_i = I, \qquad G_i^2 = G_i, \qquad G_i G_j = 0 \ (i \ne j)$$
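A sketch that builds the $G_i$ from the eigendecomposition and checks all three properties (the example matrix, with distinct eigenvalues 1 and 3, is my own):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, P = np.linalg.eig(A)            # columns of P are the alpha_i
Pinv = np.linalg.inv(P)              # rows of P^{-1} are the beta_i^T
G = [np.outer(P[:, i], Pinv[i, :]) for i in range(2)]

print(np.allclose(G[0] + G[1], np.eye(2)))           # sum G_i = I
print(np.allclose(G[0] @ G[0], G[0]))                # G_i^2 = G_i
print(np.allclose(G[0] @ G[1], np.zeros((2, 2))))    # G_i G_j = 0
print(np.allclose(lam[0]*G[0] + lam[1]*G[1], A))     # A = sum lambda_i G_i
```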

3.9 Matrix Functions

$$\sin(A + B) = \sin A \cos B + \cos A \sin B, \qquad \sin 2A = 2 \sin A \cos A,$$
$$\cos(A + B) = \cos A \cos B - \sin A \sin B, \qquad \cos 2A = \cos^2 A - \sin^2 A, \qquad \cos^2 A + \sin^2 A = I$$

hold when $AB = BA$, $A, B \in \mathbb{C}^{n \times n}$ (the identity $\cos^2 A + \sin^2 A = I$ holds for any square $A$).
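SciPy provides the matrix sine and cosine as `scipy.linalg.sinm` and `scipy.linalg.cosm`; a quick numerical check with a commuting pair ($B$ is a scalar multiple of $A$, so $AB = BA$):

```python
import numpy as np
from scipy.linalg import sinm, cosm

A = np.array([[1.0, 2.0],
              [0.0, 1.0]])
B = 2.0 * A                                        # commutes with A
print(np.allclose(sinm(A + B),
                  sinm(A) @ cosm(B) + cosm(A) @ sinm(B)))   # True
print(np.allclose(cosm(A) @ cosm(A) + sinm(A) @ sinm(A),
                  np.eye(2)))                      # cos^2 A + sin^2 A = I
```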

4 Matrix Analysis

4.1 Positive Definite

For a Hermitian matrix $A$:

a) positive definite: $x^* A x > 0$ for all $x \ne 0$, denoted $A > 0$
b) positive semidefinite: $x^* A x \ge 0$ for all $x$, denoted $A \ge 0$
c) negative definite: $x^* A x < 0$ for all $x \ne 0$, denoted $A < 0$
d) negative semidefinite: $x^* A x \le 0$ for all $x$, denoted $A \le 0$

The following three statements are equivalent.

a) $A > 0$
b) $\sigma(A) > 0$, i.e., every eigenvalue of $A$ is positive
c) every leading principal minor of $A$ is positive:

$$\det \begin{bmatrix} a_{11} & \cdots & a_{1i} \\ \vdots & & \vdots \\ a_{i1} & \cdots & a_{ii} \end{bmatrix} > 0 \quad \text{for } i = 1, \dots, n$$
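All three tests are straightforward for a Hermitian example (Cholesky succeeding is yet another equivalent condition):

```python
import numpy as np

A = np.array([[ 2.0, -1.0],
              [-1.0,  2.0]])                       # Hermitian
# b) all eigenvalues positive
print(np.all(np.linalg.eigvalsh(A) > 0))           # True
# c) all leading principal minors positive
print(all(np.linalg.det(A[:i, :i]) > 0 for i in range(1, 3)))   # True
# Cholesky succeeds exactly when A > 0
L = np.linalg.cholesky(A)                          # no LinAlgError raised
print(np.allclose(L @ L.T, A))                     # True
```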

4.2 Rayleigh Quotient

For a Hermitian $A$, let $\lambda_{\min} = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n = \lambda_{\max}$, let $1 \le i_1 \le i_2 \le \dots \le i_k \le n$ be integers, and let $x_{i_1}, x_{i_2}, \dots, x_{i_k}$ be orthonormal vectors such that $Ax_{i_p} = \lambda_{i_p} x_{i_p}$. With $S = \mathrm{span}\{x_{i_1}, x_{i_2}, \dots, x_{i_k}\}$, we have

a) $\lambda_{i_1} \le \dfrac{x^* A x}{x^* x} \le \lambda_{i_k}$ for nonzero $x \in S$
b) $\lambda_{\min} \le \dfrac{x^* A x}{x^* x} \le \lambda_{\max}$ for nonzero $x \in \mathbb{C}^n$
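Statement b) is easy to observe numerically: the Rayleigh quotient of any nonzero vector stays between the extreme eigenvalues:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                 # Hermitian, eigenvalues 1 and 3
lam = np.linalg.eigvalsh(A)                # ascending: [1., 3.]
x = np.random.rand(2) + 0.1                # arbitrary nonzero vector
r = (x @ A @ x) / (x @ x)                  # Rayleigh quotient
print(lam[0] <= r <= lam[-1])              # True
```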

4.3 Hermitian Matrix

Hermitian matrix: $A^* = A$
Skew-Hermitian matrix: $A^* = -A$
Theorem: If $A$ is a Hermitian matrix, then
a) $x^* A x$ is real for all $x \in \mathbb{C}^n$;
b) all eigenvalues $\lambda(A)$ are real;
c) $S^* A S$ is Hermitian for any $S$.

5 Special Topics

5.1 Stochastic Matrix

A nonnegative matrix $S_{n \times n}$ is said to be a stochastic matrix if each of its row sums equals one. Such an $S$ satisfies $Se = e$ with $e = [1, \dots, 1]^T$, which means that $1$ is an eigenvalue of $S$ with eigenvector $e$. Obviously, if $S$ and $T$ are stochastic, so is $ST$.
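Both claims check out directly (the two stochastic matrices are arbitrary examples):

```python
import numpy as np

S = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.0, 0.0, 1.0]])            # rows sum to 1
T = np.array([[0.1, 0.9, 0.0],
              [0.3, 0.3, 0.4],
              [0.5, 0.0, 0.5]])
e = np.ones(3)
print(np.allclose(S @ e, e))                       # Se = e
print(np.allclose((S @ T).sum(axis=1), 1.0))       # ST is again stochastic
```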


END
