Matrix calculus (matrix differentiation)

Reposted from: https://en.wikipedia.org/wiki/Matrix_calculus

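Premise: if x is a vector, it is taken to be a column vector by default, and x^T is a row vector. Both layout conventions mentioned below assume this.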

Types of Matrix Derivatives
| Types | Scalar | Vector | Matrix |
|---|---|---|---|
| Scalar | \frac{\partial y}{\partial x} | \frac{\partial \mathbf{y}}{\partial x} | \frac{\partial \mathbf{Y}}{\partial x} |
| Vector | \frac{\partial y}{\partial \mathbf{x}} | \frac{\partial \mathbf{y}}{\partial \mathbf{x}} | |
| Matrix | \frac{\partial y}{\partial \mathbf{X}} | | |

 

Notation:

 

Let M(n,m) denote the space of real n×m matrices with n rows and m columns. Such matrices are denoted using bold capital letters: A, X, Y, etc. An element of M(n,1), that is, a column vector, is denoted with a boldface lowercase letter: a, x, y, etc. An element of M(1,1) is a scalar, denoted with lowercase italic typeface: a, t, x, etc. X^T denotes the matrix transpose, tr(X) is the trace, and det(X) is the determinant.

 

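The discussion below of the six derivative forms listed above follows the numerator layout convention throughout. Some papers and books derive their formulas in denominator layout, and some, for convenience, even mix the two layouts within a single paper. A result in numerator layout is exactly the transpose of the corresponding result in denominator layout.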
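As a small worked illustration of the transpose relationship between the two layouts, consider the linear function y = a^T x. This function is chosen here only as an example and does not appear in the source article. Each partial derivative is the same in both layouts; only the arrangement differs.

y = \mathbf{a}^T \mathbf{x} = \sum_{i=1}^{n} a_i x_i, \qquad \frac{\partial y}{\partial x_i} = a_i

\text{numerator layout:} \quad \frac{\partial y}{\partial \mathbf{x}} = \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix} = \mathbf{a}^T

\text{denominator layout:} \quad \frac{\partial y}{\partial \mathbf{x}} = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} = \mathbf{a}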

 

 

 

Vector-by-scalar

 

The derivative of a vector \mathbf{y} =\begin{bmatrix}y_1 \\y_2 \\\vdots \\y_m \\\end{bmatrix}, by a scalar x is written (in numerator layout notation) as

\frac{\partial \mathbf{y}}{\partial x} =\begin{bmatrix}\frac{\partial y_1}{\partial x}\\\frac{\partial y_2}{\partial x}\\\vdots\\\frac{\partial y_m}{\partial x}\\\end{bmatrix}.

In vector calculus the derivative of a vector y with respect to a scalar x is known as the tangent vector of the vector y, \frac{\partial \mathbf{y}}{\partial x}. Notice here that \mathbf{y} : \mathbb{R}^1 \rightarrow \mathbb{R}^m.

Example: simple examples include the velocity vector in Euclidean space, which is the tangent vector of the position vector (considered as a function of time). Likewise, the acceleration is the tangent vector of the velocity.
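The velocity example can be checked numerically. The following is a minimal sketch, assuming a position vector that moves on the unit circle; the names `position` and `vector_by_scalar` are hypothetical helpers introduced for illustration, and the derivative is only approximated by central differences.

```python
import math

def position(t):
    """Illustrative position vector y(t) on the unit circle (an assumption)."""
    return [math.cos(t), math.sin(t)]

def vector_by_scalar(y, x, h=1e-6):
    """Central-difference estimate of dy/dx.

    The returned list stands for an m x 1 column vector (numerator layout).
    """
    y_plus, y_minus = y(x + h), y(x - h)
    return [(p - m) / (2 * h) for p, m in zip(y_plus, y_minus)]

t = 0.5
velocity = vector_by_scalar(position, t)   # numerical tangent vector
exact = [-math.sin(t), math.cos(t)]        # analytic derivative for comparison
```

The numerical `velocity` should agree with `exact` up to the finite-difference error, and it has one entry per component of the position vector.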

 

 

Scalar-by-vector

 

The derivative of a scalar y by a vector \mathbf{x} =\begin{bmatrix}x_1 \\x_2 \\\vdots \\x_n \\\end{bmatrix}, is written (in numerator layout notation) as

\frac{\partial y}{\partial \mathbf{x}} =\left[\frac{\partial y}{\partial x_1} \ \ \frac{\partial y}{\partial x_2} \ \ \cdots \ \ \frac{\partial y}{\partial x_n}\right].

In vector calculus, the gradient of a scalar field y in the space \mathbb{R}^n (whose independent coordinates are the components of x) is the transpose of the derivative of a scalar by a vector. In physics, the electric field is the negative vector gradient of the electric potential.

The directional derivative of a scalar function f(x) of the space vector x in the direction of the unit vector u is defined using the gradient as follows.

\nabla_{\mathbf{u}} f(\mathbf{x}) = \nabla f(\mathbf{x}) \cdot \mathbf{u}   (the dot product, i.e. inner product, of the two vectors)

Using the notation just defined for the derivative of a scalar with respect to a vector, we can rewrite the directional derivative as \nabla_\mathbf{u} f = \frac{\partial f}{\partial \mathbf{x}}\mathbf{u}. This notation is convenient when proving product rules and chain rules, which then look similar to the familiar scalar versions.
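The product of the row vector ∂f/∂x with the unit vector u can be sketched numerically as follows. The scalar field `f`, the point `x`, and the direction `u` are illustrative assumptions, and `scalar_by_vector` is a hypothetical helper that approximates the derivative by central differences.

```python
def f(x):
    """Illustrative scalar field f(x) = x1^2 + 3*x1*x2 (an assumption)."""
    return x[0] ** 2 + 3 * x[0] * x[1]

def scalar_by_vector(f, x, h=1e-6):
    """Central-difference estimate of df/dx as a 1 x n row (numerator layout)."""
    row = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        row.append((f(x_plus) - f(x_minus)) / (2 * h))
    return row

x = [1.0, 2.0]
u = [3 / 5, 4 / 5]                       # unit vector: (3/5)^2 + (4/5)^2 = 1
row = scalar_by_vector(f, x)             # analytically [2*x1 + 3*x2, 3*x1] = [8, 3]
directional = sum(r * ui for r, ui in zip(row, u))  # row vector times column vector
```

For this choice the analytic value is 8·(3/5) + 3·(4/5) = 7.2, which the numerical result should match closely.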

 

 

Vector-by-vector

 

Each of the previous two cases can be considered as an application of the derivative of a vector with respect to a vector, using a vector of size one appropriately. Similarly we will find that the derivatives involving matrices will reduce to derivatives involving vectors in a corresponding way.

The derivative of a vector function (a vector whose components are functions) \mathbf{y} =\begin{bmatrix}y_1 \\y_2 \\\vdots \\y_m \\\end{bmatrix}, with respect to an input vector, \mathbf{x} =\begin{bmatrix}x_1 \\x_2 \\\vdots \\x_n \\\end{bmatrix}, is written (in numerator layout notation) as

\frac{\partial \mathbf{y}}{\partial \mathbf{x}} =\begin{bmatrix}\frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \cdots & \frac{\partial y_1}{\partial x_n}\\\frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_2}{\partial x_n}\\\vdots & \vdots & \ddots & \vdots\\\frac{\partial y_m}{\partial x_1} & \frac{\partial y_m}{\partial x_2} & \cdots & \frac{\partial y_m}{\partial x_n}\\\end{bmatrix}.

In vector calculus, the derivative of a vector function y with respect to a vector x whose components represent a space is known as the pushforward (or differential), or the Jacobian matrix.
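To make the m×n shape of the numerator-layout Jacobian concrete, here is a minimal sketch using central differences. The map `y` from R^2 to R^3 is an illustrative assumption, and `jacobian` is a hypothetical helper, not a library function.

```python
def y(x):
    """Illustrative map from R^2 to R^3 (an assumption)."""
    return [x[0] * x[1], x[0] + x[1], x[0] ** 2]

def jacobian(y, x, h=1e-6):
    """Central-difference Jacobian in numerator layout.

    Row i holds the partial derivatives of y_i; column j corresponds to x_j,
    so the result has m rows and n columns.
    """
    m, n = len(y(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += h
        x_minus[j] -= h
        y_plus, y_minus = y(x_plus), y(x_minus)
        for i in range(m):
            J[i][j] = (y_plus[i] - y_minus[i]) / (2 * h)
    return J

J = jacobian(y, [2.0, 3.0])
# Analytic Jacobian at (2, 3): [[x2, x1], [1, 1], [2*x1, 0]] = [[3, 2], [1, 1], [4, 0]]
```

Transposing `J` would give the n×m matrix of the denominator layout.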

 

 

 

 

Derivatives with matrices

Matrix-by-scalar

 

The derivative of a matrix function Y by a scalar x is known as the tangent matrix and is given (in numerator layout notation) by

\frac{\partial \mathbf{Y}}{\partial x} =\begin{bmatrix}\frac{\partial y_{11}}{\partial x} & \frac{\partial y_{12}}{\partial x} & \cdots & \frac{\partial y_{1n}}{\partial x}\\\frac{\partial y_{21}}{\partial x} & \frac{\partial y_{22}}{\partial x} & \cdots & \frac{\partial y_{2n}}{\partial x}\\\vdots & \vdots & \ddots & \vdots\\\frac{\partial y_{m1}}{\partial x} & \frac{\partial y_{m2}}{\partial x} & \cdots & \frac{\partial y_{mn}}{\partial x}\\\end{bmatrix}.
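As a hedged illustration of the entry-by-entry definition above, the sketch below differentiates a 2×2 rotation matrix with respect to its angle. The rotation matrix is an assumed example, and `matrix_by_scalar` is a hypothetical helper based on central differences.

```python
import math

def rotation(theta):
    """Illustrative matrix function Y(x): a 2 x 2 rotation matrix (an assumption)."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]

def matrix_by_scalar(Y, x, h=1e-6):
    """Central-difference estimate of dY/dx, with the same m x n shape as Y."""
    Y_plus, Y_minus = Y(x + h), Y(x - h)
    return [[(a - b) / (2 * h) for a, b in zip(row_p, row_m)]
            for row_p, row_m in zip(Y_plus, Y_minus)]

theta = 0.3
dR = matrix_by_scalar(rotation, theta)
# Analytic derivative, entry by entry
exact = [[-math.sin(theta), -math.cos(theta)],
         [ math.cos(theta), -math.sin(theta)]]
```

Each entry of `dR` approximates the derivative of the corresponding entry of the matrix, so the result keeps the shape of Y.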

Scalar-by-matrix

The derivative of a scalar function y of a p×q matrix X of independent variables, with respect to the matrix X, is given (in numerator layout notation) by

\frac{\partial y}{\partial \mathbf{X}} =\begin{bmatrix}\frac{\partial y}{\partial x_{11}} & \frac{\partial y}{\partial x_{21}} & \cdots & \frac{\partial y}{\partial x_{p1}}\\\frac{\partial y}{\partial x_{12}} & \frac{\partial y}{\partial x_{22}} & \cdots & \frac{\partial y}{\partial x_{p2}}\\\vdots & \vdots & \ddots & \vdots\\\frac{\partial y}{\partial x_{1q}} & \frac{\partial y}{\partial x_{2q}} & \cdots & \frac{\partial y}{\partial x_{pq}}\\\end{bmatrix}.

Important examples of scalar functions of matrices include the trace of a matrix and the determinant.

In analogy with vector calculus, this derivative is often written as follows.

\nabla_\mathbf{X} y(\mathbf{X}) = \frac{\partial y(\mathbf{X})}{\partial \mathbf{X}}
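The trace is a convenient test case for the q×p shape of the numerator-layout result. For y = tr(AX), with A of size q×p and X of size p×q, the partial derivative with respect to x_ij is a_ji, so the numerator-layout derivative equals A itself. The sketch below checks this numerically; the matrices `A` and `X` are made-up values, and `trace_of_product` and `scalar_by_matrix` are hypothetical helpers.

```python
def trace_of_product(A, X):
    """tr(AX) for A of size q x p and X of size p x q."""
    q, p = len(A), len(X)
    return sum(A[j][i] * X[i][j] for j in range(q) for i in range(p))

def scalar_by_matrix(y, X, h=1e-6):
    """Central-difference estimate of dy/dX in numerator layout.

    X is p x q; the result is q x p, with entry (j, i) holding dy/dx_ij.
    """
    p, q = len(X), len(X[0])
    D = [[0.0] * p for _ in range(q)]
    for i in range(p):
        for j in range(q):
            X_plus = [row[:] for row in X]
            X_minus = [row[:] for row in X]
            X_plus[i][j] += h
            X_minus[i][j] -= h
            D[j][i] = (y(X_plus) - y(X_minus)) / (2 * h)
    return D

A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]       # q x p = 3 x 2
X = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]       # p x q = 2 x 3
D = scalar_by_matrix(lambda M: trace_of_product(A, M), X)
```

Here `D` should be a 3×2 matrix close to `A`; in denominator layout the same derivative would be arranged as the 2×3 transpose.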

 

Summary table

 

Result of differentiating various kinds of aggregates with other kinds of aggregates

| With respect to | Scalar y | Vector y (size m) | Matrix Y (size m×n) |
|---|---|---|---|
| Scalar x | \frac{\partial y}{\partial x}: scalar | \frac{\partial \mathbf{y}}{\partial x}: (numerator layout) size-m column vector; (denominator layout) size-m row vector | \frac{\partial \mathbf{Y}}{\partial x}: (numerator layout) m×n matrix |
| Vector x (size n) | \frac{\partial y}{\partial \mathbf{x}}: (numerator layout) size-n row vector; (denominator layout) size-n column vector | \frac{\partial \mathbf{y}}{\partial \mathbf{x}}: (numerator layout) m×n matrix; (denominator layout) n×m matrix | \frac{\partial \mathbf{Y}}{\partial \mathbf{x}} |
| Matrix X (size p×q) | \frac{\partial y}{\partial \mathbf{X}}: (numerator layout) q×p matrix; (denominator layout) p×q matrix | \frac{\partial \mathbf{y}}{\partial \mathbf{X}} | \frac{\partial \mathbf{Y}}{\partial \mathbf{X}} |

  

 

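The rest of the source article gives the specific identities for differentiating products of these matrices and vectors. Each of the five types above has its own table, for example the Vector-by-vector identities table. Look up the relevant table when you need an identity.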
