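Preface
The Taylor expansion of a multivariate function comes up constantly in optimization problems, and it tends to feel abstract until you have worked through the derivation yourself. See the link below; the article explains it very clearly.
Derivation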
- The Taylor expansion of a univariate function at the point $x_{k}$ is:
$$
f(x)=f\left(x_{k}\right)+\left(x-x_{k}\right) f^{\prime}\left(x_{k}\right)+\frac{1}{2 !}\left(x-x_{k}\right)^{2} f^{\prime \prime}\left(x_{k}\right)+o\left(\left(x-x_{k}\right)^{2}\right)
$$
- The Taylor expansion of a bivariate function at the point $\left(x_{k}, y_{k}\right)$ is:
$$
\begin{array}{c}
f(x, y)=f\left(x_{k}, y_{k}\right)+\left(x-x_{k}\right) f_{x}^{\prime}\left(x_{k}, y_{k}\right)+\left(y-y_{k}\right) f_{y}^{\prime}\left(x_{k}, y_{k}\right) \\
+\frac{1}{2 !}\left(x-x_{k}\right)^{2} f_{x x}^{\prime \prime}\left(x_{k}, y_{k}\right)+\frac{1}{2 !}\left(x-x_{k}\right)\left(y-y_{k}\right) f_{x y}^{\prime \prime}\left(x_{k}, y_{k}\right) \\
+\frac{1}{2 !}\left(x-x_{k}\right)\left(y-y_{k}\right) f_{y x}^{\prime \prime}\left(x_{k}, y_{k}\right)+\frac{1}{2 !}\left(y-y_{k}\right)^{2} f_{y y}^{\prime \prime}\left(x_{k}, y_{k}\right) \\
+o\left(\left(x-x_{k}\right)^{2}+\left(y-y_{k}\right)^{2}\right)
\end{array}
$$
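As a quick numerical sanity check, the sketch below evaluates the bivariate expansion term by term for a sample function, $f(x, y)=e^{x} \sin y$. The function, the names `f` and `taylor2`, and the expansion point are all assumptions chosen for illustration; none of them come from the referenced article. The partial derivatives are derived by hand. If the expansion is correct, the first neglected terms are cubic, so halving the step should shrink the error by roughly a factor of 8.

```python
import math


def f(x, y):
    # Sample function chosen for illustration only.
    return math.exp(x) * math.sin(y)


def taylor2(x, y, xk, yk):
    """Second-order Taylor polynomial of f around (xk, yk), written term by term."""
    dx, dy = x - xk, y - yk
    # Hand-derived partial derivatives of exp(x) * sin(y), evaluated at (xk, yk).
    fx = math.exp(xk) * math.sin(yk)
    fy = math.exp(xk) * math.cos(yk)
    fxx = math.exp(xk) * math.sin(yk)
    fxy = math.exp(xk) * math.cos(yk)
    fyx = fxy  # mixed partials coincide for this smooth function
    fyy = -math.exp(xk) * math.sin(yk)
    return (
        f(xk, yk)
        + dx * fx
        + dy * fy
        + 0.5 * dx * dx * fxx
        + 0.5 * dx * dy * fxy
        + 0.5 * dx * dy * fyx
        + 0.5 * dy * dy * fyy
    )


xk, yk = 0.5, 1.0  # hypothetical expansion point
errors = []
for h in (0.1, 0.05, 0.025):
    err = abs(f(xk + h, yk + h) - taylor2(xk + h, yk + h, xk, yk))
    errors.append(err)
    print(f"h = {h:<6} error = {err:.3e}")
```

The two mixed terms are kept separate on purpose so that the code mirrors the formula; for a function with continuous second partial derivatives they are equal and can be merged into a single term $\left(x-x_{k}\right)\left(y-y_{k}\right) f_{x y}^{\prime \prime}$.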
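The bivariate expansion can be rewritten in matrix form as follows, which makes it easy to compare with the matrix form of the multivariate Taylor expansion given later.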
$$
\begin{array}{l}
f(x, y)=f\left(x_{k}, y_{k}\right)+\left[f_{x}^{\prime}\left(x_{k}, y_{k}\right) \quad f_{y}^{\prime}\left(x_{k}, y_{k}\right)\right]\left[\begin{array}{c} x-x_{k} \\ y-y_{k} \end{array}\right] \\
\quad+\frac{1}{2 !}\left[x-x_{k} \quad y-y_{k}\right]\left[\begin{array}{cc} f_{x x}^{\prime \prime}\left(x_{k}, y_{k}\right) & f_{x y}^{\prime \prime}\left(x_{k}, y_{k}\right) \\ f_{y x}^{\prime \prime}\left(x_{k}, y_{k}\right) & f_{y y}^{\prime \prime}\left(x_{k}, y_{k}\right) \end{array}\right]\left[\begin{array}{c} x-x_{k} \\ y-y_{k} \end{array}\right]+o\left(\left(x-x_{k}\right)^{2}+\left(y-y_{k}\right)^{2}\right)
\end{array}
$$
- The Taylor expansion of a multivariate function at the point $\left(x_{k}^{1}, x_{k}^{2}, \ldots, x_{k}^{n}\right)$ is:
$$
\begin{array}{c}
f\left(x^{1}, x^{2}, \ldots, x^{n}\right)=f\left(x_{k}^{1}, x_{k}^{2}, \ldots, x_{k}^{n}\right)+\sum_{i=1}^{n}\left(x^{i}-x_{k}^{i}\right) f_{x^{i}}^{\prime}\left(x_{k}^{1}, x_{k}^{2}, \ldots, x_{k}^{n}\right) \\
+\frac{1}{2 !} \sum_{i, j=1}^{n}\left(x^{i}-x_{k}^{i}\right)\left(x^{j}-x_{k}^{j}\right) f_{x^{i} x^{j}}^{\prime \prime}\left(x_{k}^{1}, x_{k}^{2}, \ldots, x_{k}^{n}\right) \\
+o\left(\sum_{i=1}^{n}\left(x^{i}-x_{k}^{i}\right)^{2}\right)
\end{array}
$$
- The multivariate Taylor expansion written in matrix form:
$$
f(\mathbf{x})=f\left(\mathbf{x}_{k}\right)+\left[\nabla f\left(\mathbf{x}_{k}\right)\right]^{T}\left(\mathbf{x}-\mathbf{x}_{k}\right)+\frac{1}{2 !}\left[\mathbf{x}-\mathbf{x}_{k}\right]^{T} H\left(\mathbf{x}_{k}\right)\left[\mathbf{x}-\mathbf{x}_{k}\right]+o\left(\left\|\mathbf{x}-\mathbf{x}_{k}\right\|^{2}\right)
$$
Here $H\left(\mathbf{x}_{k}\right)$ is the Hessian matrix of $f$ at $\mathbf{x}_{k}$:
$$
H\left(\mathbf{x}_{k}\right)=\left[\begin{array}{cccc}
\frac{\partial^{2} f\left(\mathbf{x}_{k}\right)}{\partial\left(x^{1}\right)^{2}} & \frac{\partial^{2} f\left(\mathbf{x}_{k}\right)}{\partial x^{1} \partial x^{2}} & \cdots & \frac{\partial^{2} f\left(\mathbf{x}_{k}\right)}{\partial x^{1} \partial x^{n}} \\
\frac{\partial^{2} f\left(\mathbf{x}_{k}\right)}{\partial x^{2} \partial x^{1}} & \frac{\partial^{2} f\left(\mathbf{x}_{k}\right)}{\partial\left(x^{2}\right)^{2}} & \cdots & \frac{\partial^{2} f\left(\mathbf{x}_{k}\right)}{\partial x^{2} \partial x^{n}} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^{2} f\left(\mathbf{x}_{k}\right)}{\partial x^{n} \partial x^{1}} & \frac{\partial^{2} f\left(\mathbf{x}_{k}\right)}{\partial x^{n} \partial x^{2}} & \cdots & \frac{\partial^{2} f\left(\mathbf{x}_{k}\right)}{\partial\left(x^{n}\right)^{2}}
\end{array}\right]
$$
- In the bivariate case:
$$
\nabla f\left(\mathbf{x}_{k}\right)=\left[\begin{array}{l} f_{x}^{\prime}\left(x_{k}, y_{k}\right) \\ f_{y}^{\prime}\left(x_{k}, y_{k}\right) \end{array}\right]
$$
$$
\left[\mathbf{x}-\mathbf{x}_{k}\right]=\left[\begin{array}{l} x-x_{k} \\ y-y_{k} \end{array}\right]
$$
$$
H\left(\mathbf{x}_{k}\right)=\left[\begin{array}{ll} f_{x x}^{\prime \prime}\left(x_{k}, y_{k}\right) & f_{x y}^{\prime \prime}\left(x_{k}, y_{k}\right) \\ f_{y x}^{\prime \prime}\left(x_{k}, y_{k}\right) & f_{y y}^{\prime \prime}\left(x_{k}, y_{k}\right) \end{array}\right]
$$
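The matrix form translates almost directly into code. The sketch below is a minimal pure-Python illustration for a three-variable sample function, $f(\mathbf{x})=\left(x^{1}\right)^{2} x^{2}+x^{2} \sin x^{3}$. The function, the helper names `grad_f`, `hess_f` and `taylor2_matrix`, and the test points are assumptions made for illustration, not anything taken from the referenced article. The gradient and Hessian are written out by hand, and the indices in the code are zero-based.

```python
import math


def f(x):
    # Sample function: x0^2 * x1 + x1 * sin(x2)
    return x[0] ** 2 * x[1] + x[1] * math.sin(x[2])


def grad_f(x):
    # Hand-derived gradient of f.
    return [
        2 * x[0] * x[1],
        x[0] ** 2 + math.sin(x[2]),
        x[1] * math.cos(x[2]),
    ]


def hess_f(x):
    # Hand-derived Hessian of f (symmetric, since f is smooth).
    return [
        [2 * x[1], 2 * x[0], 0.0],
        [2 * x[0], 0.0, math.cos(x[2])],
        [0.0, math.cos(x[2]), -x[1] * math.sin(x[2])],
    ]


def taylor2_matrix(x, xk):
    """f(xk) + grad^T d + 0.5 * d^T H d, with d = x - xk."""
    d = [xi - xki for xi, xki in zip(x, xk)]
    g = grad_f(xk)
    H = hess_f(xk)
    linear = sum(gi * di for gi, di in zip(g, d))
    # Matrix-vector product H d, then the quadratic form d^T (H d).
    Hd = [sum(hij * dj for hij, dj in zip(row, d)) for row in H]
    quadratic = 0.5 * sum(di * hdi for di, hdi in zip(d, Hd))
    return f(xk) + linear + quadratic


xk = [1.0, 2.0, 0.5]    # hypothetical expansion point
x = [1.01, 1.98, 0.52]  # a nearby point at which to test the approximation
exact = f(x)
approx = taylor2_matrix(x, xk)
print("exact :", exact)
print("approx:", approx)
print("error :", abs(exact - approx))
```

In practice one would likely use a linear algebra library for the products; plain lists are used here only to keep the sketch dependency-free and to make the correspondence with $\left[\nabla f\right]^{T}\left(\mathbf{x}-\mathbf{x}_{k}\right)$ and $\left[\mathbf{x}-\mathbf{x}_{k}\right]^{T} H\left[\mathbf{x}-\mathbf{x}_{k}\right]$ explicit.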