Jacobian Matrix: Study Notes

Prerequisites

Let $f:\mathbb{R}^n\to\mathbb{R}^m$ be a function from $n$-dimensional Euclidean space to $m$-dimensional Euclidean space. It need not be linear, but it is assumed to be differentiable. The function consists of $m$ real-valued component functions, written as:
$$
\left\{
\begin{array}{l}
y_1=f_1(x_1,x_2,\dots,x_n)\\
y_2=f_2(x_1,x_2,\dots,x_n)\\
\dots\\
y_m=f_m(x_1,x_2,\dots,x_n)
\end{array}
\right.
$$

Expand $f(Z)$, where $Z=(x_1,x_2,\dots,x_n)$, in a first-order Taylor series about a point $Z_0$:
$$
f(Z)\approx f(Z_0)+J_f(Z_0)(Z-Z_0)
$$

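Here $J_f(Z_0)$ is the derivative of $f(Z)$ at $Z_0$, which in this setting is the Jacobian matrix of $f$ evaluated at $Z_0$.
Note that because the expansion is truncated at first order, the relation is only approximate: the two sides differ by a remainder of higher order in $\|Z-Z_0\|$.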
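The approximation can be checked numerically. The sketch below uses an illustrative map $f(x_1,x_2)=(x_1^2x_2,\ 5x_1+\sin x_2)$ and a hand-derived Jacobian; both are assumptions chosen for this example, not part of the notes. It measures how far $f(Z_0+\Delta)$ is from $f(Z_0)+J_f(Z_0)\Delta$ as the step shrinks. For a first-order expansion the error should drop by roughly a factor of 100 each time the step shrinks by a factor of 10.

```python
import numpy as np

def f(z):
    """Illustrative map R^2 -> R^2 (an assumption made for this example)."""
    x1, x2 = z
    return np.array([x1**2 * x2, 5.0 * x1 + np.sin(x2)])

def jacobian(z):
    """Hand-derived Jacobian of f at z: rows are outputs, columns are inputs."""
    x1, x2 = z
    return np.array([[2.0 * x1 * x2, x1**2],
                     [5.0, np.cos(x2)]])

def linearization_error(z0, dz):
    """Norm of f(z0 + dz) - [f(z0) + J_f(z0) dz]."""
    approx = f(z0) + jacobian(z0) @ dz
    return np.linalg.norm(f(z0 + dz) - approx)

z0 = np.array([1.0, 2.0])
direction = np.array([1.0, -1.0])
steps = (1e-1, 1e-2, 1e-3)
errors = [linearization_error(z0, h * direction) for h in steps]
for h, err in zip(steps, errors):
    print(f"h = {h:g}, error = {err:.3e}")
```

The quadratic decay of the error is what "first-order accurate" means in practice.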

Definition

The $m\times n$ Jacobian matrix is:
$$
J_f=\left[
\begin{matrix}
\frac{\partial f_1}{\partial x_1}&\frac{\partial f_1}{\partial x_2}&\dots&\frac{\partial f_1}{\partial x_n}\\
\frac{\partial f_2}{\partial x_1}&\frac{\partial f_2}{\partial x_2}&\dots&\frac{\partial f_2}{\partial x_n}\\
\dots&\dots&\dots&\dots\\
\frac{\partial f_m}{\partial x_1}&\frac{\partial f_m}{\partial x_2}&\dots&\frac{\partial f_m}{\partial x_n}
\end{matrix}
\right]
$$
It can also be written compactly as a single row of blocks:
$$
\left[
\begin{matrix}
\frac{\partial f}{\partial x_1}&\frac{\partial f}{\partial x_2}&\dots&\frac{\partial f}{\partial x_n}
\end{matrix}
\right]
$$
where each $\frac{\partial f}{\partial x_j}$ stands for the column vector $\left(\frac{\partial f_1}{\partial x_j},\dots,\frac{\partial f_m}{\partial x_j}\right)^T$.
This is exactly the transpose of the gradient matrix, i.e. $J_f(Z)=\nabla f(Z)^T$, where the $i$-th column of $\nabla f$ is the gradient of $f_i$. In particular, for $m=1$ the Jacobian is the gradient written as a row vector.
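As a small worked example (the function is an illustrative choice, not taken from the notes), take $f:\mathbb{R}^2\to\mathbb{R}^3$ with $n=2$ inputs and $m=3$ outputs. Row $i$ holds the partial derivatives of $f_i$, so the Jacobian is $3\times 2$:

$$
f(x_1,x_2)=\left[
\begin{matrix}
x_1x_2\\
x_1+x_2\\
\sin x_1
\end{matrix}
\right],
\qquad
J_f(x_1,x_2)=\left[
\begin{matrix}
x_2&x_1\\
1&1\\
\cos x_1&0
\end{matrix}
\right]
$$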

Jacobian Determinant

When $n=m$, the Jacobian matrix is square:
$$
\left[
\begin{matrix}
\frac{\partial f_1}{\partial x_1}&\frac{\partial f_1}{\partial x_2}&\dots&\frac{\partial f_1}{\partial x_n}\\
\frac{\partial f_2}{\partial x_1}&\frac{\partial f_2}{\partial x_2}&\dots&\frac{\partial f_2}{\partial x_n}\\
\dots&\dots&\dots&\dots\\
\frac{\partial f_n}{\partial x_1}&\frac{\partial f_n}{\partial x_2}&\dots&\frac{\partial f_n}{\partial x_n}
\end{matrix}
\right]
$$

Rearranging the first-order expansion gives:
$$
f(Z)-f(Z_0)\approx J_f(Z_0)(Z-Z_0)
$$

Let $Z-Z_0=\Delta x$ and $f(Z)-f(Z_0)=\Delta y$. Then:
$$
\Delta y\approx J_f(Z_0)\Delta x
$$

Passing to differentials and writing this out in components:
$$
\left[
\begin{matrix}
\mathrm{d}y_1\\
\mathrm{d}y_2\\
\dots\\
\mathrm{d}y_n
\end{matrix}
\right]=
\left[
\begin{matrix}
\frac{\partial f_1}{\partial x_1}&\frac{\partial f_1}{\partial x_2}&\dots&\frac{\partial f_1}{\partial x_n}\\
\frac{\partial f_2}{\partial x_1}&\frac{\partial f_2}{\partial x_2}&\dots&\frac{\partial f_2}{\partial x_n}\\
\dots&\dots&\dots&\dots\\
\frac{\partial f_n}{\partial x_1}&\frac{\partial f_n}{\partial x_2}&\dots&\frac{\partial f_n}{\partial x_n}
\end{matrix}
\right]
\left[
\begin{matrix}
\mathrm{d}x_1\\
\mathrm{d}x_2\\
\dots\\
\mathrm{d}x_n
\end{matrix}
\right]
$$
Carrying out the matrix product:
$$
\left[
\begin{matrix}
\mathrm{d}y_1\\
\mathrm{d}y_2\\
\dots\\
\mathrm{d}y_n
\end{matrix}
\right]=
\left[
\begin{matrix}
\frac{\partial f_1}{\partial x_1}\mathrm{d}x_1+\frac{\partial f_1}{\partial x_2}\mathrm{d}x_2+\dots+\frac{\partial f_1}{\partial x_n}\mathrm{d}x_n\\
\frac{\partial f_2}{\partial x_1}\mathrm{d}x_1+\frac{\partial f_2}{\partial x_2}\mathrm{d}x_2+\dots+\frac{\partial f_2}{\partial x_n}\mathrm{d}x_n\\
\dots\\
\frac{\partial f_n}{\partial x_1}\mathrm{d}x_1+\frac{\partial f_n}{\partial x_2}\mathrm{d}x_2+\dots+\frac{\partial f_n}{\partial x_n}\mathrm{d}x_n
\end{matrix}
\right]
$$
Next, split the sums so that each increment gets its own column. This is a heuristic correspondence, not an entrywise matrix identity: only the determinants (the volumes) of the two sides are compared. The right-hand side is $J_f\,\mathrm{diag}(\mathrm{d}x_1,\dots,\mathrm{d}x_n)$, whose $j$-th column is the image of the edge of length $\mathrm{d}x_j$ along the $j$-th axis, and the left-hand side records the volume element in the $y$ coordinates:
$$
\left[
\begin{matrix}
\mathrm{d}y_1&0&\dots&0\\
0&\mathrm{d}y_2&\dots&0\\
\dots&\dots&\dots&\dots\\
0&0&\dots&\mathrm{d}y_n
\end{matrix}
\right]
\longleftrightarrow
\left[
\begin{matrix}
\frac{\partial f_1}{\partial x_1}\mathrm{d}x_1&\frac{\partial f_1}{\partial x_2}\mathrm{d}x_2&\dots&\frac{\partial f_1}{\partial x_n}\mathrm{d}x_n\\
\frac{\partial f_2}{\partial x_1}\mathrm{d}x_1&\frac{\partial f_2}{\partial x_2}\mathrm{d}x_2&\dots&\frac{\partial f_2}{\partial x_n}\mathrm{d}x_n\\
\dots&\dots&\dots&\dots\\
\frac{\partial f_n}{\partial x_1}\mathrm{d}x_1&\frac{\partial f_n}{\partial x_2}\mathrm{d}x_2&\dots&\frac{\partial f_n}{\partial x_n}\mathrm{d}x_n
\end{matrix}
\right]
$$

Take the determinant of both sides. Note that since the volume elements $\mathrm{d}x_1\cdots\mathrm{d}x_n$ and $\mathrm{d}y_1\cdots\mathrm{d}y_n$ are taken to be positive, the determinant must be taken in absolute value:
$$
\mathrm{d}y_1\cdot\mathrm{d}y_2\cdot\dots\cdot\mathrm{d}y_n=
\left|
\begin{vmatrix}
\frac{\partial f_1}{\partial x_1}&\frac{\partial f_1}{\partial x_2}&\dots&\frac{\partial f_1}{\partial x_n}\\
\frac{\partial f_2}{\partial x_1}&\frac{\partial f_2}{\partial x_2}&\dots&\frac{\partial f_2}{\partial x_n}\\
\dots&\dots&\dots&\dots\\
\frac{\partial f_n}{\partial x_1}&\frac{\partial f_n}{\partial x_2}&\dots&\frac{\partial f_n}{\partial x_n}
\end{vmatrix}
\right|
\cdot\mathrm{d}x_1\cdot\mathrm{d}x_2\cdot\dots\cdot\mathrm{d}x_n
$$

That is:
$$
\mathrm{d}y_1\cdot\mathrm{d}y_2\cdot\dots\cdot\mathrm{d}y_n=\left|\det J_f(Z)\right|\cdot\mathrm{d}x_1\cdot\mathrm{d}x_2\cdot\dots\cdot\mathrm{d}x_n
$$
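A familiar instance, added here as an illustration, is the change to polar coordinates. The inputs $(r,\theta)$ play the role of $(x_1,x_2)$ and the outputs $(x,y)$ play the role of $(y_1,y_2)$, with $r>0$ assumed:

$$
\left\{
\begin{array}{l}
x=r\cos\theta\\
y=r\sin\theta
\end{array}
\right.
\qquad
J=\left[
\begin{matrix}
\cos\theta&-r\sin\theta\\
\sin\theta&r\cos\theta
\end{matrix}
\right],
\qquad
\det J=r\cos^2\theta+r\sin^2\theta=r
$$

so the area element transforms as $\mathrm{d}x\cdot\mathrm{d}y=r\cdot\mathrm{d}r\cdot\mathrm{d}\theta$, which is the usual factor of $r$ in polar integrals.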

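For $n=1$, i.e. a function of one variable, the formula says that the length of a small segment is scaled by $\left|\det J_f(Z)\right|$, the absolute value of the Jacobian determinant, to give the length of its image;
For $n=2$, i.e. a function of two variables, the area of a small plane region is scaled by $\left|\det J_f(Z)\right|$ to give the area of its image;
For $n=3$, i.e. a function of three variables, the volume of a small solid is scaled by $\left|\det J_f(Z)\right|$ to give the volume of its image;
For $n>3$ the geometric meaning is hard to picture, so it is not discussed further here, although the same formula holds for $n$-dimensional volume.
In other words, the absolute value of the Jacobian determinant is the local scaling ratio of length, area, or volume when $f$ is replaced near $Z_0$ by its linearization $Z\mapsto f(Z_0)+J_f(Z_0)(Z-Z_0)$. That linearization is an affine transformation. When $f$ itself is affine, the Jacobian is constant and the scaling ratio is exact everywhere.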
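The area interpretation for $n=2$ can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the map $(u,v)\mapsto(u^2-v^2,\ 2uv)$, its hand-derived determinant $4(u^2+v^2)$, and the helper names `square_boundary` and `polygon_area` are all choices made for this example. It maps the boundary of a small square through $f$, measures the enclosed area with the shoelace formula, and compares the area ratio with the Jacobian determinant at the square's center.

```python
import numpy as np

def f(p):
    """Illustrative map (u, v) -> (u^2 - v^2, 2uv); p has shape (..., 2)."""
    u, v = p[..., 0], p[..., 1]
    return np.stack([u**2 - v**2, 2.0 * u * v], axis=-1)

def jacobian_det(u, v):
    """Determinant of [[2u, -2v], [2v, 2u]], derived by hand."""
    return 4.0 * (u**2 + v**2)

def polygon_area(points):
    """Shoelace formula for a closed polygon given as an (N, 2) array."""
    x, y = points[:, 0], points[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def square_boundary(center, side, samples_per_edge=200):
    """Points along the boundary of an axis-aligned square, counter-clockwise."""
    t = np.linspace(0.0, 1.0, samples_per_edge, endpoint=False)
    cx, cy = center
    h = side / 2.0
    bottom = np.stack([cx - h + side * t, np.full_like(t, cy - h)], axis=1)
    right = np.stack([np.full_like(t, cx + h), cy - h + side * t], axis=1)
    top = np.stack([cx + h - side * t, np.full_like(t, cy + h)], axis=1)
    left = np.stack([np.full_like(t, cx - h), cy + h - side * t], axis=1)
    return np.concatenate([bottom, right, top, left])

center, side = (1.0, 0.5), 1e-2
boundary = square_boundary(center, side)
area_ratio = polygon_area(f(boundary)) / polygon_area(boundary)
print(area_ratio, jacobian_det(*center))
```

The two printed numbers agree only approximately, because the determinant varies slightly across the square. Shrinking `side` should bring them closer together.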

Applications

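Solving certain conic-section problems by means of affine transformations (a common technique in high-school mathematics);
Together with the Hessian matrix, serving as the foundation of the various Newton-type methods and, through the gradient, of gradient descent and similar algorithms (to be written up later);
Robotics and kinematics (said half in jest).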
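To make the Newton-type use concrete, here is a minimal sketch of Newton's method for a system of equations $f(x)=0$. It uses the Jacobian to solve the linearized system $J_f(x)\,\Delta x=-f(x)$ at each step. This is the root-finding variant, shown as an illustration, not the Hessian-based optimization variant. The function name `newton_system`, the tolerances, and the test problem (the unit circle intersected with the line $y=x$) are assumptions made for this example.

```python
import numpy as np

def newton_system(f, jac, x0, tol=1e-12, max_iter=50):
    """Solve f(x) = 0 by repeatedly solving J_f(x) dx = -f(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        fx = f(x)
        if np.linalg.norm(fx) < tol:
            return x
        # Linearize f around x and step to the root of the linearization.
        dx = np.linalg.solve(jac(x), -fx)
        x = x + dx
    raise RuntimeError("Newton iteration did not converge")

# Hypothetical test problem: unit circle intersected with the line y = x.
def f(x):
    return np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])

def jac(x):
    """Hand-derived Jacobian of the test problem."""
    return np.array([[2.0 * x[0], 2.0 * x[1]],
                     [1.0, -1.0]])

root = newton_system(f, jac, x0=[1.0, 0.5])
print(root)
```

Starting from `[1.0, 0.5]`, the iteration should approach the intersection point in the first quadrant. A different starting point may converge to the other intersection or fail if the Jacobian becomes singular.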
