Matrix

猛火烹小鲜

Last modified: 2023-12-20 11:21:19

Article tags: matrix, algorithms, linear algebra

First published: 2023-12-16 16:27:16

Article link: https://blog.csdn.net/frenziedbird/article/details/134999905

Matrix Preliminaries

In mathematics, a matrix is a rectangular array of numbers arranged in rows and columns.
Progression of concepts: Sets $\Longrightarrow$ Scalars $\Longrightarrow$ Vectors $\Longrightarrow$ Matrices $\Longrightarrow$ Tensors

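Matrix Types

(1) By structure (how the entries are laid out)
Row matrix (row vector)
Column matrix (column vector)
Square matrix
Diagonal matrix: every entry off the main diagonal is 0;
Scalar matrix: a diagonal matrix whose diagonal entries are all equal;
Identity matrix: a diagonal matrix whose diagonal entries are all 1, written $I$;
Zero (null) matrix: a matrix whose entries are all 0;
Symmetric matrix: a square matrix that is symmetric about its main diagonal, i.e. $A=A^{T}$;
Upper triangular matrix
Lower triangular matrix
Block matrix: a matrix whose elements are themselves matrices (mainly used to simplify notation);
Partitioned matrix: essentially the same as a block matrix;
Sparse matrix
Dense matrix
Toeplitz (diagonal-constant) matrix: every descending diagonal from left to right is constant;
Vandermonde matrix: the entries of each row form a geometric progression;
Permutation matrix: a square matrix with exactly one entry equal to 1 in each row and each column and 0 elsewhere, usually written $P$;
Hessenberg matrix: a special square matrix that is almost triangular; an upper Hessenberg matrix has zeros below the first subdiagonal, and a lower Hessenberg matrix has zeros above the first superdiagonal;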
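
To make the Toeplitz and Vandermonde definitions above concrete, here is a minimal plain-Python sketch that builds small instances as nested lists. The helper names `toeplitz` and `vandermonde` are illustrative choices made here, not part of any library discussed in this article.

```python
def toeplitz(first_col, first_row):
    """Build a Toeplitz matrix: entry (i, j) depends only on i - j."""
    # Assumption: first_col[0] == first_row[0] (the shared top-left entry).
    n_rows, n_cols = len(first_col), len(first_row)
    return [[first_col[i - j] if i >= j else first_row[j - i]
             for j in range(n_cols)]
            for i in range(n_rows)]


def vandermonde(xs, n_cols):
    """Build a Vandermonde matrix: row i is 1, x_i, x_i**2, ..."""
    return [[x ** j for j in range(n_cols)] for x in xs]


T = toeplitz([1, 2, 3], [1, 4, 5])
V = vandermonde([1, 2, 3], 3)
```

Each descending diagonal of `T` holds a single value, and each row of `V` is a geometric progression with ratio `x_i`.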

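(2) By algebraic properties
Singular matrix: a square matrix whose determinant is 0, so it has no inverse;
Non-singular matrix: a square matrix whose determinant is non-zero, i.e. an invertible matrix;
Adjugate matrix: the transpose of the cofactor matrix of $A$, written $\text{adj}(A)$;
Conjugate matrix: over the complex field, the matrix $\overline{A}$ obtained by conjugating every entry of $A$;
Transposed matrix: the matrix obtained from $A$ by exchanging rows and columns, written $A^{T}$;
Skew-symmetric (antisymmetric) matrix: a square matrix $A$ with $A^{T}=-A$;
Orthogonal matrix: a real square matrix $A$ with $A^{T}=A^{-1}$, equivalently $A^{T}A=AA^{T}=I$;
Hermitian matrix: over the complex field, a square matrix $A$ with $A=\overline{A^{T}}=A^{H}$;
Unitary matrix: over the complex field, a square matrix $U$ with $U^{H}=U^{-1}$, i.e. $U^{H}U=UU^{H}=I$;
Definite matrix: a symmetric (or Hermitian) matrix whose quadratic form $x^{H}Ax$ has the same sign for every non-zero $x$; it is positive definite when $x^{H}Ax>0$;
Positive semi-definite matrix: a symmetric (or Hermitian) matrix with $x^{H}Ax\ge 0$ for every $x$;
Laplacian matrix: for a graph with degree matrix $D$ and adjacency matrix $A$, $L=D-A$;
Jacobian matrix (first-derivative matrix): the matrix of all first-order partial derivatives of a vector-valued function;
Hessian matrix (second-derivative matrix): the square matrix of all second-order partial derivatives of a scalar-valued function;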

Matrix Operations

(1) Fundamental operations
Matrix addition
Matrix subtraction
Scalar multiplication: $B = k A$
Matrix multiplication: $C = A B$
Matrix transposition: $B=A^{T}$
Matrix inversion: for a non-singular square matrix the inverse can be written as $A^{-1}=\text{adj}(A)/\text{det}(A)$, and it satisfies $AA^{-1}=A^{-1}A=I$;
Hadamard product (element-wise multiplication): $C=A\odot B$
Matrix trace: for a square matrix $A$ of order $n$, $\text{Tr}(A)=\sum^{n}_{i=1}A_{ii}$
Matrix determinant: $\text{det}(A)$
Matrix rank: the maximum number of linearly independent rows (equivalently, columns) of the matrix;
Eigenvalues and eigenvectors: $Av=\lambda v$ and $uA=\lambda u$, where the column vector $v$ is a right eigenvector and the row vector $u$ is a left eigenvector;
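
The operations listed above can be sketched in a few lines of plain Python on nested lists. This is only an illustration under the assumption of small, well-shaped inputs; the function names are chosen here for readability and do not come from any particular library.

```python
def transpose(a):
    """Swap rows and columns."""
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    """Matrix product C = A B (inner dimensions must agree)."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]


def hadamard(a, b):
    """Element-wise product (A and B must have the same shape)."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def trace(a):
    """Sum of the main-diagonal entries of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

product = matmul(A, B)         # [[2, 1], [4, 3]]
elementwise = hadamard(A, B)   # [[0, 2], [3, 0]]
```

Note that the matrix product and the Hadamard product give different results for the same inputs: multiplying by `B` on the right swaps the columns of `A`, while the element-wise product zeroes the diagonal.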

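(2) Matrix norms: mathematical measures that quantify the size or length of a matrix
1-norm (column-sum norm): $\Vert A \Vert_{1}=\max_{j}\sum^{m}_{i=1}\vert a_{ij} \vert$, the largest absolute column sum (the plain sum $\sum_{i,j}\vert a_{ij} \vert$ over all entries is a different, entrywise L1 norm);
Frobenius (Euclidean) norm, the entrywise L2 norm: $\Vert A \Vert_{F} = \sqrt{\sum_{i,j}\vert A_{ij}\vert^2}=\sqrt{\text{Tr}(AA^{H})}$
Schur norm: another name for the Frobenius norm;
Hilbert-Schmidt norm: another name for the Frobenius norm;
Spectral norm, the induced L2 norm: $\Vert A \Vert_{2}=\sigma_{\max}(A)$, the largest singular value of $A$;
Infinity norm ($\infty$-norm, row-sum norm): $\Vert A \Vert_{\infty}=\max_{i}\sum^{n}_{j=1}\vert a_{ij} \vert$, the largest absolute row sum;
Schatten norm: $\Vert A \Vert_{p}=\left(\sum_{i}\sigma_{i}^{p}\right)^{1/p}$, the $p$-norm of the vector of singular values $\sigma_{i}$;
Nuclear norm (Ky Fan norm): $\Vert A \Vert_{*}=\sum_{i}\sigma_{i}$, the sum of the singular values;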
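
As a small illustration of how the column-sum, row-sum and Frobenius norms differ, the sketch below evaluates all three on one real matrix. It uses only the standard library; the function names are illustrative assumptions, not an existing API.

```python
import math


def norm_1(a):
    """Induced 1-norm: largest absolute column sum."""
    return max(sum(abs(row[j]) for row in a) for j in range(len(a[0])))


def norm_inf(a):
    """Induced infinity-norm: largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)


def norm_fro(a):
    """Frobenius norm: square root of the sum of squared magnitudes."""
    return math.sqrt(sum(abs(x) ** 2 for row in a for x in row))


A = [[1, -2], [3, 4]]

one = norm_1(A)      # columns sum to 4 and 6, so the result is 6
inf = norm_inf(A)    # rows sum to 3 and 7, so the result is 7
fro = norm_fro(A)    # sqrt(1 + 4 + 9 + 16) = sqrt(30)
```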

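(3) Linear transformations in 2D (extendable to 3D)
Euclidean transformation: a distance-preserving transformation built from rotations and translations;
Translation matrix: in homogeneous coordinates, $\begin{pmatrix}1&0&t_x\\0&1&t_y\\0&0&1\end{pmatrix}$;
Scaling matrix: $\begin{pmatrix}s_x&0\\0&s_y\end{pmatrix}$;
Rotation matrix: $\begin{pmatrix}\cos\theta&-\sin\theta\\ \sin\theta&\cos\theta\end{pmatrix}$ for a counter-clockwise rotation by $\theta$;
Shearing matrix: $\begin{pmatrix}1&k\\0&1\end{pmatrix}$ for a shear along the $x$ axis;
Householder transformation: the reflection $H=I-2vv^{T}/(v^{T}v)$ about the hyperplane orthogonal to $v$;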
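
The 2D transformations above compose by matrix multiplication once points are written in homogeneous coordinates $(x, y, 1)$, which is also what makes translation expressible as a matrix. The following is a minimal sketch under that assumption; all function names are hypothetical helpers introduced for this example.

```python
import math


def matmul(a, b):
    """Matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def scaling(sx, sy):
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]


def rotation(theta):
    """Counter-clockwise rotation by theta radians about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply(m, point):
    """Apply a 3x3 homogeneous transform to a 2D point."""
    v = [point[0], point[1], 1.0]
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(2))


# Rotate by 90 degrees first, then translate by (2, 3).
# The rightmost matrix in the product acts on the point first.
M = matmul(translation(2.0, 3.0), rotation(math.pi / 2))
p = apply(M, (1.0, 0.0))       # approximately (2.0, 4.0)
q = apply(scaling(2.0, 3.0), (1.0, 1.0))
```

The order of the factors matters: translating first and rotating afterwards would move the point to a different location.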

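(4) Other transformations
Fourier transformation: introduced by the French mathematician Jean Baptiste Joseph Fourier, originally as a tool for studying the theory of heat conduction;
The Fourier transform is a linear transform and a special case of the Laplace transform;
It represents a function that satisfies certain conditions as a linear combination of trigonometric functions, or of their integrals;
Fourier matrix: the matrix form of the discrete Fourier transform;
Discrete cosine transformation (DCT): a variant of the Fourier transform; used in audio and image data compression; the MDCT (a modified DCT) is used by most modern lossy audio codecs, for example MP3, AAC and Vorbis;
DCT variants: DCT-I, DCT-II, DCT-III, DCT-IV;
Hadamard transformation
Wavelet transformation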
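
To show why the DCT is useful for compression, here is a small sketch of the DCT-II variant in its unnormalized form, $X_k=\sum_{n=0}^{N-1}x_n\cos\left[\frac{\pi}{N}\left(n+\frac{1}{2}\right)k\right]$. The direct $O(N^2)$ evaluation and the function name `dct2` are choices made for this illustration; real codecs use fast algorithms and specific normalizations.

```python
import math


def dct2(x):
    """Unnormalized DCT-II, evaluated directly from its definition."""
    n_samples = len(x)
    return [sum(x[n] * math.cos(math.pi / n_samples * (n + 0.5) * k)
                for n in range(n_samples))
            for k in range(n_samples)]


# A constant signal has all of its energy in coefficient 0.
coeffs = dct2([1.0, 1.0, 1.0, 1.0])
```

For smooth signals most coefficients are close to zero, so only a few of them need to be stored.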

Matrix Calculus

(1) Differentiation rules
Matrix differentiation:
Basic Rules
Product Rules
Derivatives of Determinants
The Chain Rule
Derivatives of Traces
Derivatives of Frobenius Norm

(2) Derivative formulas
Derivatives of a Determinant
Derivatives of an Inverse
Derivatives of Eigenvalues
Derivatives of Matrices, Vectors and Scalar Forms
Derivatives of Traces
Derivatives of Vector Norms
Derivatives of Matrix Norms
Derivatives of Structured Matrices

$\frac{\partial\,\mathbf{a}^T\mathbf{X}\mathbf{b}}{\partial\mathbf{X}}=\mathbf{a}\mathbf{b}^T$
$\frac{\partial\,\mathbf{a}^T\mathbf{X}^T\mathbf{b}}{\partial\mathbf{X}}=\mathbf{b}\mathbf{a}^T$
$\partial\mathbf{A}=0$ (where $\mathbf{A}$ is a constant)
$\partial(\mathbf{X}+\mathbf{Y})=\partial\mathbf{X}+\partial\mathbf{Y}$
$\partial(\alpha\mathbf{X})=\alpha\,\partial\mathbf{X}$
$\partial(\mathbf{X}\circ\mathbf{Y})=(\partial\mathbf{X})\circ\mathbf{Y}+\mathbf{X}\circ(\partial\mathbf{Y})$
$\partial(\mathbf{X}\mathbf{Y})=(\partial\mathbf{X})\mathbf{Y}+\mathbf{X}(\partial\mathbf{Y})$
$\partial(\mathbf{X}\otimes\mathbf{Y})=(\partial\mathbf{X})\otimes\mathbf{Y}+\mathbf{X}\otimes(\partial\mathbf{Y})$
$\partial(\text{det}(\mathbf{X}))=\text{det}(\mathbf{X})\,\text{Tr}(\mathbf{X}^{-1}\partial\mathbf{X})$
$\partial(\text{ln}(\text{det}(\mathbf{X})))=\text{Tr}(\mathbf{X}^{-1}\partial\mathbf{X})$

Assume $F(\mathbf{X})$ is a differentiable function of each element of $\mathbf{X}$, and let $f(\cdot)$ be the scalar derivative of $F(\cdot)$. Then:

$\frac{\partial\,\text{Tr}(F(\mathbf{X}))}{\partial\mathbf{X}}=f(\mathbf{X})^T$
$\frac{\partial}{\partial\mathbf{X}}\text{Tr}(\mathbf{X}\mathbf{A})=\mathbf{A}^T$
$\frac{\partial}{\partial\mathbf{X}}\text{Tr}(\mathbf{A}\mathbf{X}\mathbf{B})=\mathbf{A}^T\mathbf{B}^T$
$\frac{\partial}{\partial\mathbf{X}}\text{Tr}(\mathbf{A}\mathbf{X}^T\mathbf{B})=\mathbf{B}\mathbf{A}$
$\frac{\partial}{\partial\mathbf{X}}\text{Tr}(\mathbf{X}^T\mathbf{A})=\mathbf{A}$
$\frac{\partial}{\partial\mathbf{X}}\text{Tr}(\mathbf{X}^2)=2\mathbf{X}^T$
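
Identities like these are easy to get wrong by a transpose, so a numerical check can be reassuring. The sketch below compares the formula $\frac{\partial}{\partial\mathbf{X}}\text{Tr}(\mathbf{A}\mathbf{X}\mathbf{B})=\mathbf{A}^T\mathbf{B}^T$ with a central finite-difference gradient. The test matrices, the step size `h` and all function names are assumptions made for this example.

```python
def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]


def trace_axb(A, X, B):
    """Scalar function f(X) = Tr(A X B)."""
    M = matmul(matmul(A, X), B)
    return sum(M[i][i] for i in range(len(M)))


def numeric_gradient(f, X, h=1e-6):
    """Central differences: perturb one entry of X at a time."""
    grad = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            plus = [row[:] for row in X]
            minus = [row[:] for row in X]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (f(plus) - f(minus)) / (2 * h)
    return grad


A = [[1.0, 2.0], [0.0, 1.0]]
B = [[2.0, 0.0], [1.0, 3.0]]
X = [[0.5, -1.0], [2.0, 1.0]]

analytic = matmul(transpose(A), transpose(B))   # A^T B^T
numeric = numeric_gradient(lambda M: trace_axb(A, M, B), X)
```

Because $\text{Tr}(\mathbf{A}\mathbf{X}\mathbf{B})$ is linear in $\mathbf{X}$, the finite-difference result agrees with the formula up to rounding error.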

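Matrix Decomposition/Factorization

Purposes and Methods of Decomposition

(1) Solving linear systems
Decomposition simplifies solving a system of linear equations, i.e. a linear system of the form $A x = b$.
This is also called a system of linear equations in $n$ unknowns, or a matrix equation.
Here the matrix $A$ is in general not square: with size $m\times n$, the cases $m = n$, $m < n$ and $m > n$ all occur; $x$ is a column vector of size $n\times 1$, called the decision variable or unknown vector; $b$ is a column vector of size $m\times 1$.
Methods: LU Decomposition, LU Reduction, Block LU Decomposition, QR Decomposition, RRQR Factorization, Cholesky Decomposition, Rank Factorization, Interpolative Decomposition;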

(2) Analysing eigenvalues and eigenvectors
Let $A$ be a matrix of order $n$. If there exist a constant $\lambda$ and a non-zero $n$-dimensional vector $X$ such that $AX=\lambda X$, then $\lambda$ is called an eigenvalue of $A$, and $X$ (of size $n\times 1$) is an eigenvector of $A$ associated with that eigenvalue.
Methods: Eigen Decomposition, Jordan Decomposition, Schur Decomposition, Real Schur Decomposition, QZ Decomposition, Takagi's Factorization, Singular Value Decomposition (SVD), Scale-Invariant Decomposition;
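
As a worked instance of the definition $AX=\lambda X$, the sketch below handles the simplest non-trivial case, a real symmetric $2\times 2$ matrix, by solving the characteristic polynomial $\lambda^2-\text{Tr}(A)\lambda+\text{det}(A)=0$ in closed form. The function name and the example matrix are assumptions for illustration; general matrices need the iterative decompositions listed above.

```python
import math


def eigenvalues_sym_2x2(a, b, d):
    """Eigenvalues of [[a, b], [b, d]] from the characteristic polynomial."""
    tr = a + d
    det = a * d - b * b
    # For a symmetric matrix the discriminant (a - d)^2 + 4 b^2 is >= 0.
    root = math.sqrt(tr * tr - 4.0 * det)
    return (tr + root) / 2.0, (tr - root) / 2.0


A = [[2.0, 1.0], [1.0, 2.0]]
lam_max, lam_min = eigenvalues_sym_2x2(2.0, 1.0, 2.0)   # 3.0 and 1.0

# v = (1, 1) solves (A - 3 I) v = 0, so it is an eigenvector for lambda = 3.
v = [1.0, 1.0]
Av = [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]
```

Multiplying `A` by `v` gives `[3.0, 3.0]`, which is exactly `lam_max` times `v`.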

(3) Other decomposition methods
Polar Decomposition, Algebraic Polar Decomposition, Mostow's Decomposition, Sinkhorn Normal Form, Sectoral Decomposition, Williamson's Normal Form, Biconjugate Decomposition;

(4) Other purposes of decomposition:
Numerical stability (matrix decompositions can enhance the numerical stability of algorithms), data compression and dimensionality reduction (they help capture the most significant features of a dataset), optimization and quadratic forms, signal processing, computational efficiency, and numerical analysis and approximation. They are applied in numerical analysis, data analysis and statistics, graph theory, optimization, and physics and engineering.

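Decomposition Methods in Detail

(1) LU Decomposition
LU decomposition is the matrix form of Gaussian elimination.
Not every matrix has an LU decomposition when no row exchanges are allowed.
Assume $A$ is square; it is factored into the product of two matrices, written $A = LU$,
where $L$ is a lower triangular matrix and $U$ is an upper triangular matrix.
For a linear system this gives $Ax=b\Longrightarrow LUx=b$, which is solved by forward substitution ($Ly=b$) followed by back substitution ($Ux=y$).
Partial pivoting is used in numerical linear algebra to avoid the numerical instability caused by very small entries on the main diagonal: rows are exchanged so that the factorization becomes $PA=LU$.
The linear system is then written as $PAx=Pb\Longrightarrow LUx=Pb$, where $P$ is a permutation matrix.
Full pivoting multiplies $A$ by a permutation matrix on both sides, i.e. $P A Q = LU$, where $P$ and $Q$ are permutation matrices.
There is also the LDU decomposition, which factors $A$ as $A = L D U$, where $D$ is a diagonal matrix and $L$ and $U$ have unit diagonals.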
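
The partial-pivoting procedure can be sketched as follows: factor $PA=LU$ while recording the row exchanges, then solve $Ly=Pb$ by forward substitution and $Ux=y$ by back substitution. This is a minimal illustration in plain Python for small dense systems, assuming a square non-singular $A$; the name `lu_solve` is chosen here and does not refer to any library routine.

```python
def lu_solve(A, b):
    """Solve A x = b via LU decomposition with partial pivoting (PA = LU)."""
    n = len(A)
    U = [[float(v) for v in row] for row in A]   # becomes upper triangular
    L = [[0.0] * n for _ in range(n)]            # becomes unit lower triangular
    perm = list(range(n))                        # row order encoded by P

    for k in range(n):
        # Pivot: the row at or below k with the largest |entry| in column k.
        p = max(range(k, n), key=lambda r: abs(U[r][k]))
        if U[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            U[k], U[p] = U[p], U[k]
            L[k], L[p] = L[p], L[k]
            perm[k], perm[p] = perm[p], perm[k]
        L[k][k] = 1.0
        # Eliminate the entries below the pivot.
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]

    # Forward substitution: L y = P b.
    y = [0.0] * n
    for i in range(n):
        y[i] = b[perm[i]] - sum(L[i][j] * y[j] for j in range(i))

    # Back substitution: U x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x


x = lu_solve([[2, 1, 1], [4, -6, 0], [-2, 7, 2]], [5, -2, 9])
x_pivot = lu_solve([[0, 2], [1, 1]], [4, 3])   # zero pivot without row exchange
```

The second call has a zero in the top-left position, so plain $A=LU$ would fail there, while the pivoted version simply exchanges the two rows.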

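(2) Cholesky Decomposition
A Hermitian positive-definite matrix $A$ is factored into the product of a lower triangular matrix $L$ and the conjugate transpose of $L$, i.e. $A=LL^{H}=LL^{*}$. Every Hermitian positive-definite matrix (or real symmetric positive-definite matrix) has a unique Cholesky decomposition with positive diagonal entries in $L$.
The closely related LDL decomposition factors $A$ as $A=LDL^{H}=LDL^{*}$, where $D$ is a diagonal matrix and $L$ is unit lower triangular; it can also be written as $A=LDL^{*}=LD^{1/2}(D^{1/2})^{*}L^{*}=LD^{1/2}(LD^{1/2})^{*}$.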
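
For the real symmetric case, the Cholesky factor can be computed entry by entry, row after row. The sketch below is a minimal illustration that assumes a real symmetric positive-definite input; the function name `cholesky` is chosen for this example.

```python
import math


def cholesky(A):
    """Return lower triangular L with A = L L^T (A real, symmetric, pos. def.)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                pivot = A[i][i] - partial
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(pivot)
            else:
                L[i][j] = (A[i][j] - partial) / L[j][j]
    return L


A = [[4, 12, -16], [12, 37, -43], [-16, -43, 98]]
L = cholesky(A)   # [[2, 0, 0], [6, 1, 0], [-8, 5, 3]] as floats
```

A non-positive pivot signals that the input is not positive definite, which is why attempting a Cholesky factorization is a common practical test for positive definiteness.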

(3) QR Decomposition
A matrix $A$ is factored as $A = QR$, where $Q$ is an orthogonal matrix (unitary in the complex case) and $R$ is an upper triangular matrix.
QR decomposition is mainly used to solve linear least squares problems.
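
The link between QR and least squares can be sketched as follows: with $A=QR$, minimizing $\Vert Ax-b\Vert_2$ reduces to the triangular system $Rx=Q^{T}b$. The example below uses modified Gram-Schmidt to build a "thin" QR, in which $Q$ is $m\times n$ with orthonormal columns rather than a square orthogonal matrix, and it assumes that $A$ has linearly independent columns. Production code usually prefers Householder reflections for better numerical stability; Gram-Schmidt is used here only because it is short.

```python
import math


def qr_gram_schmidt(A):
    """Thin QR of an m x n matrix via modified Gram-Schmidt."""
    m, n = len(A), len(A[0])
    columns = [[float(A[i][j]) for i in range(m)] for j in range(n)]
    q_cols = []                                   # orthonormal columns of Q
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = columns[j]
        # Remove the components along the already computed q vectors.
        for i, q in enumerate(q_cols):
            R[i][j] = sum(qk * vk for qk, vk in zip(q, v))
            v = [vk - R[i][j] * qk for qk, vk in zip(q, v)]
        R[j][j] = math.sqrt(sum(vk * vk for vk in v))
        q_cols.append([vk / R[j][j] for vk in v])
    Q = [[q_cols[j][i] for j in range(n)] for i in range(m)]
    return Q, R


def least_squares(A, b):
    """Minimize ||A x - b||_2 by solving R x = Q^T b."""
    Q, R = qr_gram_schmidt(A)
    n = len(R)
    c = [sum(Q[i][j] * b[i] for i in range(len(b))) for j in range(n)]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (c[i] - sum(R[i][j] * x[j] for j in range(i + 1, n))) / R[i][i]
    return x


# Fit y = c0 + c1 * t through the points (0, 1), (1, 3), (2, 5).
design = [[1, 0], [1, 1], [1, 2]]
coeffs = least_squares(design, [1.0, 3.0, 5.0])   # approximately [1.0, 2.0]
```

The three points lie exactly on the line $y=1+2t$, so the least squares solution reproduces those coefficients.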

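(4) Singular Value Decomposition (SVD)
SVD generalizes the eigenvalue and eigenvector analysis of square matrices to arbitrary matrices.
An $m\times n$ matrix $M$ is factored as $M=U\Sigma V^{*}$, where $U$ is an $m\times m$ unitary matrix, $\Sigma$ is an $m\times n$ rectangular diagonal matrix with non-negative real entries, and $V$ is an $n\times n$ unitary matrix.
The diagonal entries of $\Sigma$ are the singular values of $M$, and the number of non-zero singular values equals the rank of $M$.
The columns of $U$ and $V$ are called the left singular vectors and right singular vectors of $M$, respectively.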
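
A full SVD routine is beyond a short example, but the largest singular value alone can be approximated with power iteration on $M^{T}M$, whose largest eigenvalue is $\sigma_{\max}^2$. The sketch below handles real matrices and assumes that the all-ones starting vector is not orthogonal to the leading right singular vector; the function name and the iteration count are illustrative choices.

```python
import math


def largest_singular_value(M, iterations=100):
    """Approximate sigma_max of a real matrix by power iteration on M^T M."""
    m, n = len(M), len(M[0])
    v = [1.0] * n                     # assumed starting vector
    for _ in range(iterations):
        u = [sum(M[i][j] * v[j] for j in range(n)) for i in range(m)]   # M v
        w = [sum(M[i][j] * u[i] for i in range(m)) for j in range(n)]   # M^T u
        length = math.sqrt(sum(x * x for x in w))
        v = [x / length for x in w]   # keep the iterate at unit length
    u = [sum(M[i][j] * v[j] for j in range(n)) for i in range(m)]
    return math.sqrt(sum(x * x for x in u))   # ||M v|| for unit v


# A 3 x 2 matrix with singular values 2 and 1 (so its rank is 2).
sigma = largest_singular_value([[2.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
```

The returned value is also the spectral norm of the matrix.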

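Application Example 01 - The Eigen Library

Website: https://eigen.tuxfamily.org/
Tagline: Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.
Only the modules and their header files are summarized below:

Module	Header file	Contents
Core	#include <Eigen/Core>	Matrix and Array classes, basic linear algebra and array operations
Geometry	#include <Eigen/Geometry>	Transform, translation, scaling and rotation in 2D and 3D (requires knowledge of 3D geometry, world coordinates, camera coordinates, etc.)
LU	#include <Eigen/LU>	Inverse, determinant and LU decompositions with solvers
Cholesky	#include <Eigen/Cholesky>	LLT and LDLT Cholesky decompositions with solvers
Householder	#include <Eigen/Householder>	Householder transformations
SVD	#include <Eigen/SVD>	SVD decompositions with least squares solver
QR	#include <Eigen/QR>	QR decompositions with solvers
Eigenvalues	#include <Eigen/Eigenvalues>	Eigenvalue and eigenvector solvers
Sparse	#include <Eigen/Sparse>	Sparse matrix storage and basic linear algebra operations
	#include <Eigen/Dense>	Umbrella header that includes Core, Geometry and the other dense-matrix headers
	#include <Eigen/Eigen>	Umbrella header that includes all headers for dense and sparse matrices (in practice, including this one file is enough)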

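Application Example 02 - How MATLAB Solves the Linear Equation Ax=B

Website: https://www.mathworks.com/help/matlab/ref/mldivide.html
MATLAB's built-in algorithm chooses a different matrix decomposition depending on the properties of the input.
The algorithm flowcharts for dense and sparse matrices, taken from that page, are shown below.
(1) Algorithm flow for dense matrices:
[Figure: algorithm flow for dense matrices]
(2) Algorithm flow for sparse matrices:
[Figure: algorithm flow for sparse matrices]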

References

[1] Wikipedia, https://en.wikipedia.org/wiki/Matrix_calculus
[2] Petersen K B, Pedersen M S. The Matrix Cookbook. Technical University of Denmark, 2008, 7(15): 510.
[3] Thomas P. Minka, Old and New Matrix Algebra Useful for Statistics, December 28, 2000.
