I. Review of Newton's Method
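- The previous post, Newton Method, introduced the basic idea of Newton's method. Newton's method has second-order (quadratic) convergence near a solution, so it converges faster than steepest descent.
- Its main drawback is that computing the Hessian matrix is expensive.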
1. The Newton iteration at step k+1:
- Consider a function $f(X)$, where $X=[x_1,x_2,\dots,x_n]^T$ is a vector. Newton's method first expands $f(X)$ around $X^{k+1}$, where the gradient of $f$ at $X^{k+1}$ is: $\nabla f(X^{k+1})=\left[\frac{\partial f}{\partial x_1},\frac{\partial f}{\partial x_2},\dots,\frac{\partial f}{\partial x_n}\right]^T$
- The Taylor expansion is: $f(X)=f(X^{k+1})+\nabla f(X^{k+1})^T(X-X^{k+1})+\frac{1}{2}(X-X^{k+1})^T G_{k+1}(X-X^{k+1})+o(\|X-X^{k+1}\|^2)$
- Here $G_{k+1}$ is the Hessian matrix at $X=X^{k+1}$. Dropping the higher-order infinitesimal terms gives: $f(X)\approx f(X^{k+1})+\nabla f(X^{k+1})^T(X-X^{k+1})+\frac{1}{2}(X-X^{k+1})^T G_{k+1}(X-X^{k+1})$
- Differentiate with respect to $X$ and set the derivative to $0$: $\nabla f(X)=\nabla f(X^{k+1})+G_{k+1}(X-X^{k+1})=0$
- Solve for $X$ (assuming $G_{k+1}$ is invertible): $X=X^{k+1}-G_{k+1}^{-1}\nabla f(X^{k+1})$
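The update derived above can be sketched in a few lines of Python. This is only a minimal illustration, not code from the original post: the test function $f(x,y)=e^{x+y}+x^2+y^2$ and the names `grad`, `hessian`, `newton_step`, and `newton` are assumptions chosen for the example. Rather than forming the inverse of the Hessian explicitly, the sketch solves the 2x2 linear system $G d=\nabla f$ with Cramer's rule and then applies $X \leftarrow X-d$.

```python
import math


def grad(x, y):
    """Gradient of the illustrative f(x, y) = exp(x + y) + x**2 + y**2."""
    e = math.exp(x + y)
    return (e + 2 * x, e + 2 * y)


def hessian(x, y):
    """Hessian G of the same illustrative function (symmetric, positive definite)."""
    e = math.exp(x + y)
    return ((e + 2, e), (e, e + 2))


def newton_step(x, y):
    """One update X <- X - G^{-1} grad f(X), solving G d = grad f by Cramer's rule."""
    g1, g2 = grad(x, y)
    (a, b), (c, d) = hessian(x, y)
    det = a * d - b * c
    if det == 0:
        raise ZeroDivisionError("Hessian is singular; the Newton step is undefined")
    d1 = (g1 * d - b * g2) / det
    d2 = (a * g2 - c * g1) / det
    return x - d1, y - d2


def newton(x, y, tol=1e-12, max_iter=50):
    """Repeat the Newton step until the gradient norm drops below tol."""
    for _ in range(max_iter):
        g1, g2 = grad(x, y)
        if math.hypot(g1, g2) < tol:
            break
        x, y = newton_step(x, y)
    return x, y


if __name__ == "__main__":
    x_min, y_min = newton(1.0, 1.0)
    print(x_min, y_min, grad(x_min, y_min))
```

Every step of this sketch evaluates the full Hessian and solves a linear system with it. For an $n$-dimensional problem that means $n^2$ second derivatives plus a linear solve per iteration, which is the cost the drawback noted earlier refers to.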