Summary of Multivariable Differentials and Partial Derivatives

Differential

The differential (or derivative) at $x_0$ is written $\partial f(x_0)$ or $df(x_0)$.

  • $f(x_0+v)=f(x_0)+\partial f(x_0)(v)+o(\Vert v\Vert)$

  • $\partial f(x_0)$ is a linear map.

Derivative along a vector

The derivative at $x_0$ along the vector $v$ is written $\frac{\partial f}{\partial v}(x_0)$ or $\partial_v f(x_0)$.

  • $\lim_{t\to 0^+}\frac{f(x_0+tv)-f(x_0)}{t}$
  • When $\Vert v\Vert=1$, it is called the directional derivative along the direction $v$.

When $f$ is differentiable at $x_0$: $\partial f(x_0)(v)=\frac{\partial f}{\partial v}(x_0)=\partial_v f(x_0)$

  • In this case $\frac{\partial f}{\partial v}(x_0)$ is linear in $v$.

Both the differential and the derivative along a vector are also defined for maps whose codomain is multidimensional.
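The identity $\partial f(x_0)(v)=\partial_v f(x_0)$ and its linearity in $v$ can be checked numerically. The sketch below is only an illustration: the test function `f`, the point `x0`, and the vectors `u`, `w` are made up for the example, and a central difference is used in place of the one-sided limit, which is an assumption that is safe only because this `f` is differentiable.

```python
import math

def f(x, y):
    # Hypothetical smooth test function, chosen only for illustration.
    return x * x * y + math.sin(y)

def directional_derivative(func, point, v, t=1e-6):
    """Central-difference estimate of the derivative of func at point along v."""
    x, y = point
    vx, vy = v
    forward = func(x + t * vx, y + t * vy)
    backward = func(x - t * vx, y - t * vy)
    return (forward - backward) / (2 * t)

def differential(point, v):
    """Exact df(point)(v) for the f above: partials 2xy and x^2 + cos(y) applied to v."""
    x, y = point
    return 2 * x * y * v[0] + (x * x + math.cos(y)) * v[1]

x0 = (1.0, 2.0)
u, w = (0.6, 0.8), (-1.0, 0.5)
u_plus_w = (u[0] + w[0], u[1] + w[1])

numeric = directional_derivative(f, x0, u)
exact = differential(x0, u)
# For a differentiable f the derivative along u + w equals the sum of the two derivatives.
linearity_gap = directional_derivative(f, x0, u_plus_w) - (
    directional_derivative(f, x0, u) + directional_derivative(f, x0, w))
print(numeric, exact, linearity_gap)
```

Here `numeric` and `exact` should agree up to finite-difference error, and `linearity_gap` should be close to zero.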

Partial derivatives

Partial derivative: in coordinates $(x_1,x_2,\dots,x_m)$, at the point $X=(a_1,a_2,\dots,a_m)$, the partial derivative with respect to the coordinate $x_k$ is written $\frac{\partial f}{\partial x_k}(X)$ or $f'_{x_k}(X)$.

  • $\lim_{t\to 0}\frac{f(a_1,\dots,a_k+t,\dots,a_m)-f(a_1,\dots,a_m)}{t}$
  • A partial derivative is a function $E\to\mathbb{R}$, whereas the derivative defined earlier can be either a function or a map.
  • It is sometimes also written $f_{x_k}(X)$, $\partial_{x_k}f(X)$, or $\partial_k f(X)$.

Relationship between partial derivatives and derivatives along a vector

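For $\frac{\partial f}{\partial x}(x_0)$: if $x$ is a vector, read it as the derivative along that vector (the map may be vector-valued); if $x$ denotes a coordinate, it is a partial derivative (defined here only for scalar-valued functions).

  • Let $e_k$ be the basis vector corresponding to the coordinate $x_k$, that is, the displacement in space produced by increasing $x_k$ by 1. This links partial derivatives to directional derivatives.

  • $\frac{\partial f}{\partial x_k}(X)=\frac{\partial f}{\partial e_k}(X)=\partial f(X)(e_k)=\lambda\,\partial f(X)\left(\frac{e_k}{\lambda}\right)$ for any scalar $\lambda\neq 0$

  • A partial derivative does not depend on the norm, but it does depend on the coordinate system (that is, on the basis $e_k$).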

  • Writing $v=\sum_i \xi_i e_i$: $df(x_0)(v)=df(x_0)\left(\sum_i \xi_i e_i\right)=\sum_i \xi_i\,df(x_0)(e_i)=\sum_i \xi_i\,\partial_i f(x_0)=\sum_i \partial_i f(x_0)\,dx_i(v)=(\partial_1 f,\partial_2 f,\dots,\partial_m f)\,(dx_1,\dots,dx_m)^T(v)$

  • Jacobi matrix: for the derivative of a map whose codomain is $n$-dimensional, stacking the row above for each component gives the matrix:
    $$F=(f_1,f_2,\dots,f_n),\quad JF(x_0)=\begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x_0) & \cdots & \frac{\partial f_1}{\partial x_m}(x_0)\\ \vdots & \ddots & \vdots \\ \frac{\partial f_n}{\partial x_1}(x_0) & \cdots & \frac{\partial f_n}{\partial x_m}(x_0) \end{pmatrix}$$
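To make the row-by-row layout of the Jacobi matrix concrete, here is a minimal finite-difference sketch. The map `F` from $\mathbb{R}^2$ to $\mathbb{R}^3$, the helper `jacobian`, and the step size `h` are assumptions introduced for illustration; row $i$ holds the partial derivatives of $f_i$, column $k$ the derivatives with respect to $x_k$.

```python
import math

def F(x):
    # Hypothetical map R^2 -> R^3 used only as an example.
    x1, x2 = x
    return [x1 * x2, math.sin(x1), x1 + x2 * x2]

def jacobian(func, x0, h=1e-6):
    """Finite-difference Jacobian: entry [i][k] approximates d f_i / d x_k at x0."""
    n = len(func(x0))
    m = len(x0)
    J = [[0.0] * m for _ in range(n)]
    for k in range(m):
        plus = list(x0)
        minus = list(x0)
        plus[k] += h    # step along the basis vector e_k
        minus[k] -= h
        f_plus, f_minus = func(plus), func(minus)
        for i in range(n):
            J[i][k] = (f_plus[i] - f_minus[i]) / (2 * h)
    return J

x0 = [1.0, 2.0]
J = jacobian(F, x0)
# Analytic Jacobian of the F above, for comparison.
J_exact = [[x0[1], x0[0]], [math.cos(x0[0]), 0.0], [1.0, 2 * x0[1]]]
for row, row_exact in zip(J, J_exact):
    print(row, row_exact)
```

Each column is obtained from one perturbation along $e_k$, which mirrors $\frac{\partial f}{\partial x_k}(X)=\partial f(X)(e_k)$.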

Gradient

The gradient is considered only for scalar-valued functions.

For a linear function $L:\mathbb{R}^n\to\mathbb{R}$, define $\nabla L$ as the vector satisfying $L(v)=\langle\nabla L,v\rangle$.

For the linear function $df(x_0)(v)$, write $\nabla f(x_0)$ for the gradient vector at $x_0$, so that $df(x_0)(v)=\langle\nabla f(x_0),v\rangle$.

  • Equivalently, define the gradient vector of $f$ at $x_0$ as $\nabla f(x_0)=\lambda v^*$ with $\Vert v^*\Vert=1$ and
    $$\lambda=df(x_0)(v^*)=\max_{\Vert v\Vert=1}df(x_0)(v)$$

  • Intuitively, a gradient vector is a vector in the domain that represents a linear function, so the derivative of $f$ at a point $x_0$ has a gradient vector $\nabla f(x_0)$. Its direction is the one in which $df$ is largest (the direction in which $f$ grows fastest), and its length is the derivative in that direction. Hence $\langle\nabla f(x_0),v\rangle=\langle\lambda v^*,v\rangle=df(x_0)(v^*)\cdot(\text{signed length of the projection of } v \text{ onto } v^*)$.
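The characterization $\nabla f(x_0)=\lambda v^*$ can be illustrated by brute force: scan unit vectors, keep the one that maximizes $df(x_0)(v)$, and compare with the gradient. This is a rough sketch under assumptions: the values in `grad` are invented partial derivatives in an orthonormal basis of $\mathbb{R}^2$, and the angular grid of 3600 steps only approximates the true maximizer.

```python
import math

# Assumed example: partial derivatives of some f at x0 in an orthonormal basis.
grad = (3.0, 4.0)

def df(v):
    """The linear function df(x0)(v) = <grad, v>."""
    return grad[0] * v[0] + grad[1] * v[1]

# Scan unit vectors v = (cos a, sin a) and keep the angle maximizing df(v).
steps = 3600
best_angle = max((2 * math.pi * k / steps for k in range(steps)),
                 key=lambda a: df((math.cos(a), math.sin(a))))
v_star = (math.cos(best_angle), math.sin(best_angle))
lam = df(v_star)                                  # lambda = max of df over unit vectors
recovered = (lam * v_star[0], lam * v_star[1])    # lambda * v_star, approximately grad
print(lam, recovered)
```

Up to the grid resolution, `lam` matches the length of `grad` and `recovered` matches `grad` itself.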

From this, the gradient can be written down when $(e_1,e_2,\dots,e_n)$ is an orthonormal basis:
$$\nabla f(x_0):\ \left(\frac{\partial f}{\partial x_1}(x_0),\frac{\partial f}{\partial x_2}(x_0),\dots,\frac{\partial f}{\partial x_n}(x_0)\right)^T$$
or, more precisely,
$$\nabla f(x_0)=\sum_{i=1}^n\frac{\partial f}{\partial x_i}(x_0)\,e_i$$

  • The first expression is a column vector because it lists coordinates.

  • With the partial derivatives taken as the coordinates of $\nabla f(x_0)$: $\langle\nabla f(x_0),v\rangle=\left(\frac{\partial f}{\partial x_1}(x_0),\frac{\partial f}{\partial x_2}(x_0),\dots,\frac{\partial f}{\partial x_n}(x_0)\right)G(e_1,\dots,e_n)\begin{pmatrix}dx_1(v)\\dx_2(v)\\ \vdots \\ dx_n(v)\end{pmatrix}$

  • $G$ is the Gram (metric) matrix of the basis $(e_1,\dots,e_n)$. For an orthonormal basis $G=I$, and then $\langle\nabla f,v\rangle=df(x_0)(v)$.

  • If $G\neq I$, the coordinates of $\nabla f(x_0)$ are $X=G^{-1}\left(\frac{\partial f}{\partial x_1}(x_0),\frac{\partial f}{\partial x_2}(x_0),\dots,\frac{\partial f}{\partial x_n}(x_0)\right)^T$. Writing $X'$ for the column of partial derivatives and $Y$ for the row of coordinates of $v$, this gives $\langle\nabla f(x_0),v\rangle=YGX=YGG^{-1}X'=YX'$.
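A small worked case may help with the $G\neq I$ situation. The sketch below assumes a made-up non-orthonormal basis $e_1=(1,0)$, $e_2=(1,1)$ of $\mathbb{R}^2$ and a made-up function; `solve_2x2` is a helper written for this example, not a library call. It computes the gradient coordinates as $G^{-1}$ times the partial derivatives and checks that $YGX$ equals $df(x_0)(v)$.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def solve_2x2(G, b):
    """Solve G x = b for a 2x2 matrix G by Cramer's rule."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return [(b[0] * G[1][1] - G[0][1] * b[1]) / det,
            (G[0][0] * b[1] - G[1][0] * b[0]) / det]

# Assumed example basis of R^2 (not orthonormal), given in Cartesian components.
e = [(1.0, 0.0), (1.0, 1.0)]
G = [[dot(e[i], e[j]) for j in range(2)] for i in range(2)]   # Gram matrix

# Assumed example: f(x1, x2) = (x1 + x2)^2 + 3*x2 in the coordinates of basis e,
# which is g(p) = p1^2 + 3*p2 in Cartesian components p = x1*e1 + x2*e2.
x = (1.0, 1.0)
partials = [2 * (x[0] + x[1]), 2 * (x[0] + x[1]) + 3]         # (d_1 f, d_2 f) at x

grad_coords = solve_2x2(G, partials)                          # X = G^{-1} X'
grad_cartesian = [sum(grad_coords[i] * e[i][c] for i in range(2)) for c in range(2)]

Y = [0.5, -2.0]                                               # coordinates of a test vector v
GX = [dot(G[i], grad_coords) for i in range(2)]
inner = dot(Y, GX)                                            # <grad f, v> = Y G X
df_v = dot(Y, partials)                                       # df(x0)(v) = Y X'
print(grad_coords, grad_cartesian, inner, df_v)
```

In Cartesian components the result agrees with the ordinary gradient $(2p_1, 3)$ of $g$ at $p=(2,1)$, which is the point of the $G^{-1}$ correction.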

Higher-order partial derivatives

For indices $i_1,\dots,i_k\in[1,n]$, differentiate successively with respect to the coordinates $x_{i_1},x_{i_2},\dots,x_{i_k}$. Equivalent notations:
$$\frac{\partial^k f}{\partial x_{i_k}\partial x_{i_{k-1}}\cdots\partial x_{i_1}}(x_0),\quad \partial^{k}_{x_{i_k},x_{i_{k-1}},\dots,x_{i_1}}f,\quad \partial^{k}_{i_k,i_{k-1},\dots,i_1}f,\quad f^{(k)}_{x_{i_1},x_{i_2},\dots,x_{i_k}},\quad f^{(k)}_{i_1,i_2,\dots,i_k}$$

  • If $f$ has continuous partial derivatives up to order $k$ at $x_0$, then every $k$-th order partial derivative is independent of the order of differentiation.

The $k$-th order differential is a $k$-linear function of the $k$ vectors $v_1,\dots,v_k$, where $v_{j,i}$ denotes the $i$-th coordinate of $v_j$:
$$d^k f(x)(v_1,\dots,v_k)=\sum_{i_1,i_2,\dots,i_k}\frac{\partial^k f}{\partial x_{i_k}\partial x_{i_{k-1}}\cdots\partial x_{i_1}}(x)\,v_{1,i_1}v_{2,i_2}\cdots v_{k,i_k}=\frac{\partial^k f}{\partial v_k\cdots\partial v_1}(x)$$
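For $k=2$ the formula says that the double sum over second partial derivatives equals the derivative taken along $v_1$ and then along $v_2$. The following sketch checks this for a hypothetical polynomial `f`; the point `p`, the vectors, and the step size `h` are illustrative assumptions, and the mixed difference quotient is a standard finite-difference estimate rather than anything taken from the text.

```python
def f(x, y):
    # Hypothetical C^2 test function.
    return x ** 3 * y ** 2

def hessian(x, y):
    """Analytic second partials of the f above."""
    return [[6 * x * y ** 2, 6 * x ** 2 * y],
            [6 * x ** 2 * y, 2 * x ** 3]]

def second_differential(point, v1, v2):
    """d^2 f(point)(v1, v2) as the double sum over second partial derivatives."""
    H = hessian(*point)
    return sum(H[i][j] * v1[i] * v2[j] for i in range(2) for j in range(2))

def second_directional(point, v1, v2, h=1e-4):
    """Finite-difference estimate of the derivative along v1, then along v2."""
    x, y = point

    def shifted(a, b):
        return f(x + h * (a * v1[0] + b * v2[0]), y + h * (a * v1[1] + b * v2[1]))

    return (shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)

p = (1.0, 2.0)
v1, v2 = (1.0, 0.5), (-0.3, 2.0)
exact = second_differential(p, v1, v2)
numeric = second_directional(p, v1, v2)
swapped = second_differential(p, v2, v1)   # same value: order of differentiation is irrelevant
print(exact, numeric, swapped)
```

The equality of `exact` and `swapped` reflects the symmetry of mixed partial derivatives for a function with continuous second derivatives.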

Taylor expansion

One-dimensional Taylor formula (with some $\theta\in(0,1)$ in the Lagrange remainder):
$$f(x_0+t)=\sum_{i=0}^{k-1}\frac{f^{(i)}(x_0)}{i!}t^i+o(t^{k-1})=\sum_{i=0}^{k-1}\frac{f^{(i)}(x_0)}{i!}t^i+\frac{f^{(k)}(x_0+\theta t)}{k!}t^k$$
Taylor formula in any dimension:
$$f(x_0+tv)=\sum_{i=0}^{k-1}\frac{\partial^i f}{\partial v^i}(x_0)\frac{t^i}{i!}+o(t^{k-1})=\sum_{i=0}^{k-1}\frac{\partial^i f}{\partial v^i}(x_0)\frac{t^i}{i!}+\frac{\partial^k f}{\partial v^k}(x_0+\theta t v)\frac{t^k}{k!}$$
In terms of higher-order partial derivatives, with $v=\sum_j \xi_j e_j$:
$$\begin{aligned}f(x_0+tv)&=\sum_{i=0}^{k-1}\frac{\partial^i f}{\partial v^i}(x_0)\frac{t^i}{i!}+o(t^{k-1})\\&=\sum_{i=0}^{k-1}\sum_{\alpha_1+\dots+\alpha_n=i}\frac{\partial^i f}{\partial x_1^{\alpha_1}\cdots\partial x_n^{\alpha_n}}(x_0)\prod_{j=1}^n\frac{(t\xi_j)^{\alpha_j}}{\alpha_j!}+o(t^{k-1})\end{aligned}$$
Let $v^*=tv$ with $\Vert v\Vert=1$ and $t>0$, so that $\Vert v^*\Vert=t$, and let $\xi^*_j=t\xi_j$ be the coordinates of $v^*$. Then
$$f(x_0+v^*)=\sum_{i=0}^{k-1}\sum_{\alpha_1+\dots+\alpha_n=i}\frac{\partial^i f}{\partial x_1^{\alpha_1}\cdots\partial x_n^{\alpha_n}}(x_0)\prod_{j=1}^n\frac{(\xi^*_j)^{\alpha_j}}{\alpha_j!}+o(\Vert v^*\Vert^{k-1})$$
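The multi-index form can be tried out on a small case. The sketch below takes $k-1=2$ in two variables for a hypothetical function `f` expanded at the origin; the table `partials_at_origin` was filled in by hand for this particular `f` and is an assumption of the example. It checks that the remainder divided by $\Vert v^*\Vert^2$ shrinks as $v^*$ shrinks, which is what $o(\Vert v^*\Vert^{2})$ asserts.

```python
import math

def f(x, y):
    # Hypothetical smooth test function, expanded around (0, 0).
    return math.exp(x) * math.cos(y)

# Partial derivatives of the f above at (0, 0), keyed by multi-index (alpha_1, alpha_2).
partials_at_origin = {
    (0, 0): 1.0,
    (1, 0): 1.0, (0, 1): 0.0,
    (2, 0): 1.0, (1, 1): 0.0, (0, 2): -1.0,
}

def taylor_polynomial(v):
    """Sum over multi-indices of d^alpha f(x0) * prod_j v_j^alpha_j / alpha_j!."""
    total = 0.0
    for alpha, value in partials_at_origin.items():
        term = value
        for xi, a in zip(v, alpha):
            term *= xi ** a / math.factorial(a)
        total += term
    return total

def relative_remainder(v):
    """Remainder divided by |v|^2; it should shrink as v -> 0."""
    norm_sq = v[0] ** 2 + v[1] ** 2
    return abs(f(*v) - taylor_polynomial(v)) / norm_sq

coarse = relative_remainder((0.1, 0.2))
fine = relative_remainder((0.01, 0.02))
print(coarse, fine)
```

Shrinking the displacement by a factor of ten shrinks the relative remainder by roughly the same factor here, because this `f` is smooth and the next term is cubic.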

Implicit function theorem

Let $F$ be a function of class $C^k$ of the vectors $X\in\mathbb{R}^m$ and $Y\in\mathbb{R}^n$, and suppose that at $(X_0,Y_0)$ (the number of equations equals the number of unknowns, so $F$ and $Y$ both have dimension $n$)
$$F(X_0,Y_0)=(F_1(X_0,Y_0),\dots,F_n(X_0,Y_0))=0$$
and that the Jacobi matrix of $F$ with respect to $Y$ at $(X_0,Y_0)$ satisfies
$$\det\left(\frac{\partial F}{\partial Y}\right)=\det\left(\frac{\partial(F_1,F_2,\dots,F_n)}{\partial(y_1,y_2,\dots,y_n)}\right)\neq 0$$
Then in a small neighborhood of $(X_0,Y_0)$ there is an implicit function $Y=Y(X)$ of class $C^k$, and at $(X_0,Y_0)$
$$\frac{\partial Y}{\partial X}=-\left(\frac{\partial F}{\partial Y}\right)^{-1}\frac{\partial F}{\partial X}$$

  • For a function of two variables $f(x,y)$ this reduces to: if $f(x_0,y_0)=0$ and $\frac{\partial f}{\partial y}(x_0,y_0)\neq 0$, then there is an implicit function $y=y(x)$ with

$$\frac{dy}{dx}(x_0,y_0)=-\frac{\frac{\partial f}{\partial x}(x_0,y_0)}{\frac{\partial f}{\partial y}(x_0,y_0)}$$

Note that $\det\left(\frac{\partial F}{\partial Y}\right)\neq 0$ is a sufficient condition, not a necessary one. For example, $y^3-x^3=0$ defines $y=x$ near the origin even though $\frac{\partial f}{\partial y}$ vanishes there.

How can the formula for $\frac{\partial Y}{\partial X}$ be understood quickly?

  • Rearranging gives $\frac{\partial F}{\partial X}+\frac{\partial F}{\partial Y}\frac{\partial Y}{\partial X}=0$

  • This is just the identity $F(X,Y(X))=0$ differentiated with respect to $X$ by the chain rule.

Is there a less rigorous but more intuitive reading?

  • Start from $\frac{dy}{dx}(x_0,y_0)=-\frac{\frac{\partial f}{\partial x}(x_0,y_0)}{\frac{\partial f}{\partial y}(x_0,y_0)}$

  • Move $dx$ from the left side to the right, and $\frac{\partial f}{\partial y}$ from the right side to the left:

  • $\frac{\partial f}{\partial y}dy+\frac{\partial f}{\partial x}dx=0$

  • Intuitively, when $dy/dx$ satisfies this equation, moving in the direction $(dx,dy)$ gives $df=0$, that is, $f(x+dx,y+dy)=f(x,y)=0$ to first order.
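The two-variable formula is easy to test on a curve whose implicit function is known explicitly. The sketch below uses the unit circle as an assumed example, with the point $(0.6, 0.8)$ and the step sizes chosen for illustration. It compares $-f_x/f_y$ with the slope of the explicit branch $y=\sqrt{1-x^2}$ and checks that stepping along $(dx,dy)$ keeps $f$ near zero.

```python
import math

def f(x, y):
    # Unit circle as an example level set f(x, y) = 0.
    return x * x + y * y - 1.0

def implicit_slope(x, y):
    """dy/dx = -(df/dx) / (df/dy), valid where df/dy != 0."""
    fx, fy = 2 * x, 2 * y
    return -fx / fy

x0, y0 = 0.6, 0.8
slope = implicit_slope(x0, y0)

# Compare with the explicit branch y(x) = sqrt(1 - x^2) near (x0, y0).
h = 1e-6
explicit_slope = (math.sqrt(1 - (x0 + h) ** 2) - math.sqrt(1 - (x0 - h) ** 2)) / (2 * h)

# Moving along (dx, dy) with dy = slope * dx keeps f at zero up to second order in dx.
dx = 1e-3
residual = f(x0 + dx, y0 + slope * dx)
print(slope, explicit_slope, residual)
```

The residual is of order $dx^2$ rather than exactly zero, which is the sense in which $f(x+dx,y+dy)=f(x,y)$ holds only to first order.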

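As an application of $\frac{\partial Y}{\partial X}=-\left(\frac{\partial F}{\partial Y}\right)^{-1}\frac{\partial F}{\partial X}$, set $F(X,Y)=f(X)-Y=0$ with $m=n$. Then $\frac{\partial F}{\partial Y}=-I$ and $\frac{\partial F}{\partial X}=\frac{\partial f}{\partial X}$, so $\frac{\partial Y}{\partial X}=\frac{\partial f}{\partial X}$. If $\det\left(\frac{\partial f}{\partial X}\right)\neq 0$, the same theorem applied with the roles of $X$ and $Y$ exchanged gives $\frac{\partial X}{\partial Y}=-\left(\frac{\partial F}{\partial X}\right)^{-1}\frac{\partial F}{\partial Y}=\left(\frac{\partial f}{\partial X}\right)^{-1}$, and therefore
$$\frac{\partial Y}{\partial X}=\left(\frac{\partial X}{\partial Y}\right)^{-1}$$
That is, the Jacobi matrices of $Y=Y(X)$ and $X=X(Y)$ are inverses of each other.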
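The inverse relation between the two Jacobi matrices can be checked on a familiar change of variables. The sketch below uses polar and Cartesian coordinates as an assumed example, with both Jacobians written out by hand; multiplying them should give the identity matrix.

```python
import math

def jacobian_polar_to_cartesian(r, theta):
    """Jacobian of (x, y) = (r cos(theta), r sin(theta)) with respect to (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def jacobian_cartesian_to_polar(x, y):
    """Jacobian of (r, theta) = (sqrt(x^2 + y^2), atan2(y, x)) with respect to (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

r, theta = 2.0, 0.7
x, y = r * math.cos(theta), r * math.sin(theta)
product = matmul(jacobian_polar_to_cartesian(r, theta), jacobian_cartesian_to_polar(x, y))
print(product)
```

The check only makes sense away from $r=0$, where the determinant of the polar Jacobian vanishes and the inverse map is not defined.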
