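Reposted from: https://en.wikipedia.org/wiki/Matrix_calculus
Assumption: if $\mathbf{x}$ is a vector, it is a column vector by default and $\mathbf{x}^\mathsf{T}$ is a row vector. Both layout conventions discussed below assume this.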
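Types | Scalar | Vector | Matrix |
---|---|---|---|
Scalar | $\frac{\partial y}{\partial x}$ | $\frac{\partial \mathbf{y}}{\partial x}$ | $\frac{\partial \mathbf{Y}}{\partial x}$ |
Vector | $\frac{\partial y}{\partial \mathbf{x}}$ | $\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$ | |
Matrix | $\frac{\partial y}{\partial \mathbf{X}}$ | | |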
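Notation:
Let M(n,m) denote the space of real n×m matrices with n rows and m columns. Such matrices are denoted using bold capital letters: $\mathbf{A}$, $\mathbf{X}$, $\mathbf{Y}$, etc. An element of M(n,1), that is, a column vector, is denoted with a boldface lowercase letter: $\mathbf{a}$, $\mathbf{x}$, $\mathbf{y}$, etc. An element of M(1,1) is a scalar, denoted with lowercase italic typeface: $a$, $t$, $x$, etc. $\mathbf{X}^\mathsf{T}$ denotes the matrix transpose, $\operatorname{tr}(\mathbf{X})$ is the trace, and $\det(\mathbf{X})$ is the determinant.
The discussion below of the six derivative types in the table above follows the numerator layout convention. Some papers and books derive their formulas in denominator layout, and some, for convenience, even mix both layouts within a single paper. A result in numerator layout is exactly the transpose of the corresponding result in denominator layout.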
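As a small worked illustration of the transpose relationship between the two layouts, consider the linear function $y = \mathbf{a}^\mathsf{T}\mathbf{x}$. This function and the constant vector $\mathbf{a}$ are chosen here purely as an example and are not part of the original text.

```latex
% Assumed example: y = a^T x with a, x in R^n (a is a constant vector).
\begin{align*}
y &= \mathbf{a}^\mathsf{T}\mathbf{x} = \sum_{i=1}^{n} a_i x_i,
  \qquad \frac{\partial y}{\partial x_i} = a_i \\[4pt]
\text{numerator layout:}\quad
\frac{\partial y}{\partial \mathbf{x}}
  &= \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}
   = \mathbf{a}^\mathsf{T} \quad (1 \times n \text{ row vector}) \\[4pt]
\text{denominator layout:}\quad
\frac{\partial y}{\partial \mathbf{x}}
  &= \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}
   = \mathbf{a} \quad (n \times 1 \text{ column vector})
\end{align*}
```

The two results contain the same partial derivatives; only their arrangement differs.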
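Vector-by-scalar
The derivative of a vector $\mathbf{y} = \begin{bmatrix} y_1 & y_2 & \cdots & y_m \end{bmatrix}^\mathsf{T}$ by a scalar $x$ is written (in numerator layout notation) as
$$\frac{\partial \mathbf{y}}{\partial x} = \begin{bmatrix} \frac{\partial y_1}{\partial x} \\ \frac{\partial y_2}{\partial x} \\ \vdots \\ \frac{\partial y_m}{\partial x} \end{bmatrix}.$$
In vector calculus the derivative of a vector $\mathbf{y}$ with respect to a scalar $x$ is known as the tangent vector of the vector $\mathbf{y}$, $\frac{\partial \mathbf{y}}{\partial x}$. Notice here that $\mathbf{y}: \mathbb{R}^1 \to \mathbb{R}^m$.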
Example: simple examples of this include the velocity vector in Euclidean space, which is the tangent vector of the position vector (considered as a function of time). Likewise, the acceleration is the tangent vector of the velocity.
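The velocity example can be checked numerically. The sketch below is only an illustration: the helix `position` function and the helper name `vector_by_scalar` are hypothetical choices, and the derivative is approximated with central differences rather than computed exactly.

```python
import math


def position(t):
    """Hypothetical position vector y(t) in R^3 (a helix), used only as an example."""
    return [math.cos(t), math.sin(t), 0.5 * t]


def vector_by_scalar(f, x, h=1e-6):
    """Central-difference estimate of dy/dx for a vector-valued f of a scalar x.

    The returned list of m numbers is read as an m x 1 column vector
    in numerator layout.
    """
    forward, backward = f(x + h), f(x - h)
    return [(a - b) / (2 * h) for a, b in zip(forward, backward)]


# Velocity is the tangent vector of the position, evaluated here at t = 1.
velocity = vector_by_scalar(position, 1.0)

# Analytic derivative of the assumed helix, for comparison.
exact = [-math.sin(1.0), math.cos(1.0), 0.5]
```

Applying `vector_by_scalar` again to a function that returns the velocity would approximate the acceleration in the same way.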
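Scalar-by-vector
The derivative of a scalar $y$ by a vector $\mathbf{x} = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^\mathsf{T}$ is written (in numerator layout notation) as
$$\frac{\partial y}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y}{\partial x_1} & \frac{\partial y}{\partial x_2} & \cdots & \frac{\partial y}{\partial x_n} \end{bmatrix}.$$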
In vector calculus, the gradient of a scalar field $y$ in the space $\mathbb{R}^n$ (whose independent coordinates are the components of $\mathbf{x}$) is the transpose of the derivative of a scalar by a vector:
$$\nabla y = \left(\frac{\partial y}{\partial \mathbf{x}}\right)^\mathsf{T}.$$
In physics, for example, the electric field is the negative vector gradient of the electric potential.
The directional derivative of a scalar function $f(\mathbf{x})$ of the space vector $\mathbf{x}$ in the direction of the unit vector $\mathbf{u}$ is defined using the gradient as follows:
$$\nabla_{\mathbf{u}} f(\mathbf{x}) = \nabla f(\mathbf{x}) \cdot \mathbf{u}$$
(the dot product, or inner product, of the two vectors).
Using the notation just defined for the derivative of a scalar with respect to a vector, we can rewrite the directional derivative as $\nabla_{\mathbf{u}} f = \frac{\partial f}{\partial \mathbf{x}} \mathbf{u}$. This notation is convenient when proving product rules and chain rules, which then look similar to the familiar rules for the scalar derivative.
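A minimal numerical sketch of the row-vector derivative and the directional derivative follows. The scalar field `f`, the point `x0`, the direction `u`, and the helper `scalar_by_vector` are all assumptions made for illustration; central differences stand in for exact differentiation.

```python
def f(x):
    """Hypothetical scalar field f(x) = x1^2 + 3*x1*x2 on R^2."""
    return x[0] ** 2 + 3 * x[0] * x[1]


def scalar_by_vector(func, x, h=1e-6):
    """Central-difference estimate of df/dx, read as a 1 x n row (numerator layout)."""
    row = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        row.append((func(up) - func(down)) / (2 * h))
    return row


x0 = [1.0, 2.0]
u = [3 / 5, 4 / 5]  # a unit vector: (3/5)^2 + (4/5)^2 = 1

# Row vector df/dx at x0; analytically [2*x1 + 3*x2, 3*x1] = [8, 3].
row = scalar_by_vector(f, x0)

# Directional derivative as the product (row vector) times (column vector u).
directional = sum(r * ui for r, ui in zip(row, u))

# Independent check: differentiate t -> f(x0 + t*u) at t = 0.
h = 1e-6
ahead = [xi + h * ui for xi, ui in zip(x0, u)]
behind = [xi - h * ui for xi, ui in zip(x0, u)]
direct = (f(ahead) - f(behind)) / (2 * h)
```

Both `directional` and `direct` approximate the same quantity, which for this assumed field is 8·(3/5) + 3·(4/5) = 7.2.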
Vector-by-vector
Each of the previous two cases can be considered as an application of the derivative of a vector with respect to a vector, using a vector of size one appropriately. Similarly we will find that the derivatives involving matrices will reduce to derivatives involving vectors in a corresponding way.
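The derivative of a vector function (a vector whose components are functions) $\mathbf{y} = \begin{bmatrix} y_1 & y_2 & \cdots & y_m \end{bmatrix}^\mathsf{T}$ with respect to an input vector $\mathbf{x} = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^\mathsf{T}$ is written (in numerator layout notation) as
$$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} & \cdots & \frac{\partial y_2}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \frac{\partial y_m}{\partial x_2} & \cdots & \frac{\partial y_m}{\partial x_n} \end{bmatrix}.$$
In vector calculus, the derivative of a vector function $\mathbf{y}$ with respect to a vector $\mathbf{x}$ whose components represent a space is known as the pushforward (or differential), or the Jacobian matrix.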
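To make the m×n shape concrete, here is a rough sketch that estimates a Jacobian by central differences. The map `polar_map` from R^2 to R^3 and the helper `jacobian` are hypothetical, chosen so that m ≠ n and the two layouts have visibly different shapes.

```python
import math


def polar_map(x):
    """Hypothetical map R^2 -> R^3: (r, theta) -> (r cos theta, r sin theta, r^2)."""
    r, theta = x
    return [r * math.cos(theta), r * math.sin(theta), r * r]


def jacobian(func, x, h=1e-6):
    """Central-difference Jacobian in numerator layout.

    Returns an m x n nested list with entry [i][j] approximating d y_i / d x_j.
    """
    m, n = len(func(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        up, down = list(x), list(x)
        up[j] += h
        down[j] -= h
        f_up, f_down = func(up), func(down)
        for i in range(m):
            J[i][j] = (f_up[i] - f_down[i]) / (2 * h)
    return J


J = jacobian(polar_map, [2.0, math.pi / 6])  # 3 x 2 in numerator layout

# Transposing gives the denominator-layout arrangement (2 x 3).
J_denominator = [list(col) for col in zip(*J)]
```

For this assumed map the analytic Jacobian is [[cos θ, −r sin θ], [sin θ, r cos θ], [2r, 0]], which the estimate should match closely at (r, θ) = (2, π/6).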
Derivatives with matrices
Matrix-by-scalar
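The derivative of a matrix function $\mathbf{Y}$ by a scalar $x$ is known as the tangent matrix and is given (in numerator layout notation) by
$$\frac{\partial \mathbf{Y}}{\partial x} = \begin{bmatrix} \frac{\partial y_{11}}{\partial x} & \frac{\partial y_{12}}{\partial x} & \cdots & \frac{\partial y_{1n}}{\partial x} \\ \frac{\partial y_{21}}{\partial x} & \frac{\partial y_{22}}{\partial x} & \cdots & \frac{\partial y_{2n}}{\partial x} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial y_{m1}}{\partial x} & \frac{\partial y_{m2}}{\partial x} & \cdots & \frac{\partial y_{mn}}{\partial x} \end{bmatrix}.$$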
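As a brief worked example of a tangent matrix, take a 2×2 rotation matrix that depends on an angle $x$. The choice of this particular $\mathbf{Y}(x)$ is an assumption made for illustration; the derivative is simply taken entry by entry.

```latex
% Assumed example: Y(x) is the 2x2 rotation matrix by angle x.
\begin{align*}
\mathbf{Y}(x) &=
  \begin{bmatrix} \cos x & -\sin x \\ \sin x & \cos x \end{bmatrix},
&
\frac{\partial \mathbf{Y}}{\partial x} &=
  \begin{bmatrix} -\sin x & -\cos x \\ \cos x & -\sin x \end{bmatrix}
\end{align*}
```

The result has the same m×n shape as $\mathbf{Y}$ itself.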
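Scalar-by-matrix
The derivative of a scalar function $y$ of a p×q matrix $\mathbf{X}$ of independent variables, with respect to the matrix $\mathbf{X}$, is given (in numerator layout notation) by
$$\frac{\partial y}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial y}{\partial x_{11}} & \frac{\partial y}{\partial x_{21}} & \cdots & \frac{\partial y}{\partial x_{p1}} \\ \frac{\partial y}{\partial x_{12}} & \frac{\partial y}{\partial x_{22}} & \cdots & \frac{\partial y}{\partial x_{p2}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial y}{\partial x_{1q}} & \frac{\partial y}{\partial x_{2q}} & \cdots & \frac{\partial y}{\partial x_{pq}} \end{bmatrix}.$$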
Important examples of scalar functions of matrices include the trace of a matrix and the determinant.
By analogy with vector calculus, this derivative is often written as
$$\nabla_{\mathbf{X}}\, y(\mathbf{X}) = \frac{\partial y(\mathbf{X})}{\partial \mathbf{X}}.$$
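Since the trace is named as an important example, the following sketch numerically checks the standard identity that, in numerator layout, the derivative of $\operatorname{tr}(\mathbf{A}\mathbf{X})$ with respect to $\mathbf{X}$ is $\mathbf{A}$. The matrices `A` and `X` and the helper names are hypothetical, and central differences are used in place of an exact derivative.

```python
def trace_of_product(A, X):
    """tr(AX) for A of size q x p and X of size p x q, given as nested lists."""
    q, p = len(A), len(X)
    return sum(A[i][k] * X[k][i] for i in range(q) for k in range(p))


def scalar_by_matrix(func, X, h=1e-6):
    """Central-difference estimate of dy/dX in numerator layout.

    X is p x q; the result is q x p, with entry [i][j] approximating
    dy / dX[j][i] (note the swapped indices).
    """
    p, q = len(X), len(X[0])
    D = [[0.0] * p for _ in range(q)]
    for j in range(p):
        for i in range(q):
            up = [r[:] for r in X]
            down = [r[:] for r in X]
            up[j][i] += h
            down[j][i] -= h
            D[i][j] = (func(up) - func(down)) / (2 * h)
    return D


A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]   # q x p = 2 x 3
X = [[0.5, -1.0],
     [2.0, 0.0],
     [1.5, 3.0]]        # p x q = 3 x 2

# Numerator-layout derivative of tr(AX) with respect to X: a q x p matrix.
D = scalar_by_matrix(lambda M: trace_of_product(A, M), X)
```

Here `D` should agree with `A` entry by entry; in denominator layout the same derivatives would be arranged as the transpose of `A`, a p×q matrix.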
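Summary table
With respect to | Scalar y: notation | Scalar y: type | Vector y (size m): notation | Vector y (size m): type | Matrix Y (size m×n): notation | Matrix Y (size m×n): type |
---|---|---|---|---|---|---|
Scalar x | $\frac{\partial y}{\partial x}$ | scalar | $\frac{\partial \mathbf{y}}{\partial x}$ | numerator layout: size-m column vector; denominator layout: size-m row vector | $\frac{\partial \mathbf{Y}}{\partial x}$ | numerator layout: m×n matrix |
Vector x (size n) | $\frac{\partial y}{\partial \mathbf{x}}$ | numerator layout: size-n row vector; denominator layout: size-n column vector | $\frac{\partial \mathbf{y}}{\partial \mathbf{x}}$ | numerator layout: m×n matrix; denominator layout: n×m matrix | | |
Matrix X (size p×q) | $\frac{\partial y}{\partial \mathbf{X}}$ | numerator layout: q×p matrix; denominator layout: p×q matrix | | | | |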
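The original article then gives the concrete forms of the derivatives of products of these matrices and vectors. Each of the five types above has its own table of identities, for example the Vector-by-vector identities table.
In practice, simply look up the identity you need in the relevant table.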