Gauss-Newton Method

Consider the problem
$$\min_{\mathbf{x} \in \mathbb{R}^{n}}\left\{g(\mathbf{x}) \equiv \sum_{i=1}^{m}\left(f_{i}(\mathbf{x})-c_{i}\right)^{2}\right\}$$
where $f_1,\cdots,f_m$ are continuously differentiable over $\mathbb{R}^n$ and $c_1,\cdots,c_m\in\mathbb{R}$.
It is often convenient to collect the residuals into a vector-valued function:
$$F\left(\mathbf{x}\right)=\begin{pmatrix} f_1\left(\mathbf{x}\right)-c_1\\ f_2\left(\mathbf{x}\right)-c_2\\ \vdots\\ f_m\left(\mathbf{x}\right)-c_m \end{pmatrix}$$
so the problem becomes
$$\min \|F\left(\mathbf{x}\right)\|^2$$
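As a concrete illustration, here is a minimal MATLAB sketch of setting up $F$ and $g$; the exponential model, the data t and c, and all variable names are hypothetical, not part of the original:

% Hypothetical instance: fit c_i ~ x(1)*exp(x(2)*t_i),
% i.e. f_i(x) = x(1)*exp(x(2)*t_i)
t = (0:0.5:2)';                 % sample points, m = 5
c = [1.1; 1.6; 2.6; 4.2; 6.9];  % targets c_1,...,c_m
F = @(x) x(1)*exp(x(2)*t) - c;  % residual vector F(x)
g = @(x) norm(F(x))^2;          % objective g(x) = ||F(x)||^2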
At each iteration, each $f_i$ is replaced by its linearization at the current point $\mathbf{x}_k$, and the next iterate solves
$$\mathbf{x}_{k+1}=\arg\min_{\mathbf{x}\in\mathbb{R}^n}\left\{\sum_{i=1}^{m}\left[f_i\left(\mathbf{x}_{k}\right)+\nabla f_i\left(\mathbf{x}_k\right)^T\left(\mathbf{x}-\mathbf{x}_{k}\right)-c_i\right]^2\right\}$$
This subproblem can be rewritten as the linear least-squares problem
$$\min_{\mathbf{x}\in\mathbb{R}^n}\|\mathbf{A}_{k}\mathbf{x}-\mathbf{b}_{k}\|^2$$
where
$$\mathbf{A}_{k}=\begin{pmatrix} \nabla f_{1}\left(\mathbf{x}_{k}\right)^{T} \\ \nabla f_{2}\left(\mathbf{x}_{k}\right)^{T} \\ \vdots \\ \nabla f_{m}\left(\mathbf{x}_{k}\right)^{T} \end{pmatrix}=J\left(\mathbf{x}_{k}\right)$$
which is the Jacobian matrix of $F$ at $\mathbf{x}_k$, and
$$\mathbf{b}_{k}=\begin{pmatrix} \nabla f_{1}\left(\mathbf{x}_{k}\right)^{T} \mathbf{x}_{k}-f_{1}\left(\mathbf{x}_{k}\right)+c_{1} \\ \nabla f_{2}\left(\mathbf{x}_{k}\right)^{T} \mathbf{x}_{k}-f_{2}\left(\mathbf{x}_{k}\right)+c_{2} \\ \vdots \\ \nabla f_{m}\left(\mathbf{x}_{k}\right)^{T} \mathbf{x}_{k}-f_{m}\left(\mathbf{x}_{k}\right)+c_{m} \end{pmatrix}=J\left(\mathbf{x}_{k}\right) \mathbf{x}_{k}-F\left(\mathbf{x}_{k}\right)$$
Solving this least-squares problem via the normal equations (assuming $J(\mathbf{x}_k)$ has full column rank, so that $J(\mathbf{x}_k)^T J(\mathbf{x}_k)$ is invertible) gives
$$\begin{aligned} \mathbf{x}_{k+1}&=\left(J\left(\mathbf{x}_{k}\right)^TJ\left(\mathbf{x}_{k}\right)\right)^{-1}J\left(\mathbf{x}_{k}\right)^T\mathbf{b}_k\\ &=\left(J\left(\mathbf{x}_{k}\right)^TJ\left(\mathbf{x}_{k}\right)\right)^{-1}J\left(\mathbf{x}_{k}\right)^T\left(J\left(\mathbf{x}_{k}\right) \mathbf{x}_{k}-F\left(\mathbf{x}_{k}\right)\right)\\ &=\mathbf{x}_{k}-\left(J\left(\mathbf{x}_{k}\right)^{T} J\left(\mathbf{x}_{k}\right)\right)^{-1} J\left(\mathbf{x}_{k}\right)^{T} F\left(\mathbf{x}_{k}\right) \end{aligned}$$
Hence the search direction is
$$\mathbf{d}_{k}=\left(J\left(\mathbf{x}_{k}\right)^{T} J\left(\mathbf{x}_{k}\right)\right)^{-1} J\left(\mathbf{x}_{k}\right)^{T} F\left(\mathbf{x}_{k}\right)$$
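In practice the inverse is never formed explicitly; one Gauss-Newton step is computed by solving the normal equations with the backslash operator. A sketch continuing the hypothetical instance above:

J = @(x) [exp(x(2)*t), x(1)*t.*exp(x(2)*t)];  % Jacobian of F: row i is grad f_i(x)'
x = [1; 1];                                   % current iterate x_k
d = (J(x)'*J(x)) \ (J(x)'*F(x));              % direction d_k from (J'J)d = J'F
x_next = x - d;                               % pure Gauss-Newton update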
Note that $\nabla g\left(\mathbf{x}\right)=2J\left(\mathbf{x}\right)^T F\left(\mathbf{x}\right)$, so
$$\mathbf{d}_{k}=\frac{1}{2}\left(J\left(\mathbf{x}_{k}\right)^{T} J\left(\mathbf{x}_{k}\right)\right)^{-1} \nabla g\left(\mathbf{x}_{k}\right)$$
i.e., $\mathbf{d}_k$ is a scaled gradient, and stepping along $-\mathbf{d}_k$ decreases $g$ whenever $J(\mathbf{x}_k)^T J(\mathbf{x}_k)$ is positive definite and $\nabla g(\mathbf{x}_k)\ne 0$.
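The gradient identity can be checked numerically with central differences; a quick sketch, again on the hypothetical instance:

grad = @(x) 2*J(x)'*F(x);   % analytic gradient 2 J(x)' F(x)
x = [1; 1]; h = 1e-6;       % test point and finite-difference stepsize
fd = zeros(2,1);
for i = 1:2
    e = zeros(2,1); e(i) = h;
    fd(i) = (g(x+e) - g(x-e))/(2*h);  % central-difference approximation
end
norm(fd - grad(x))          % should be close to zero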
The pure Gauss-Newton iteration above has no stepsize (equivalently, $t_k = 1$); if a stepsize is introduced, the method becomes the damped Gauss-Newton method.

The steps are:
Input: tolerance $\epsilon>0$.
Initialization: pick an arbitrary $\mathbf{x}_0\in\mathbb{R}^n$.
General step: for $k=0,1,2,\dots$
(a) Compute $\mathbf{d}_{k}=\left(J\left(\mathbf{x}_{k}\right)^{T} J\left(\mathbf{x}_{k}\right)\right)^{-1} J\left(\mathbf{x}_{k}\right)^{T} F\left(\mathbf{x}_{k}\right)$.
(b) Choose a stepsize $t_k$ by a line search on $h\left(t\right)=g\left(\mathbf{x}_k-t\mathbf{d}_k\right)$.
(c) Set $\mathbf{x}_{k+1}=\mathbf{x}_k-t_k\mathbf{d}_k$.
(d) If $\|\nabla g\left(\mathbf{x}_{k+1}\right)\|\le \epsilon$, stop and output $\mathbf{x}_{k+1}$.

Code: here the stepsize is chosen by backtracking.

function [x,fun_val]=damped_Gauss_Newton(g,grad,J,F,x0,s,alpha,...
beta,epsilon)
% Damped Gauss-Newton method with backtracking stepsize rule
%
% INPUT
%=======================================
% g ......... objective function
% grad ...... gradient of the objective function
% J ......... Jacobian matrix of F
% F ......... vector-valued residual function
% x0......... initial point
% s ......... initial choice of stepsize
% alpha ..... tolerance parameter for the stepsize selection
% beta ...... the constant in which the stepsize is multiplied
%             at each backtracking step (0<beta<1)
% epsilon ... tolerance parameter for stopping rule
% OUTPUT
%=======================================
% x ......... optimal solution (up to a tolerance)
%             of min g(x)
% fun_val ... optimal function value
x=x0;
J_val=J(x);
F_val=F(x);
% Gauss-Newton direction: solve the normal equations (J'J)d = J'F
d=(J_val'*J_val)\(J_val'*F_val);
fun_val=g(x);
gval=grad(x);
iter=0;
while (norm(gval)>epsilon&&(iter<10000))
    iter=iter+1;
    % backtracking: shrink t until sufficient decrease is achieved
    t=s;
    while (fun_val-g(x-t*d)<alpha*t*norm(d)^2)
        t=beta*t;
    end
    x=x-t*d;
    J_val=J(x);
    F_val=F(x);
    d=(J_val'*J_val)\(J_val'*F_val);
    fun_val=g(x);
    gval=grad(x);
    fprintf('iter_number = %3d norm_grad = %2.6f fun_val = %2.6f \n',...
        iter,norm(gval),fun_val);
end
if (iter==10000)
    fprintf('did not converge\n')
end
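For instance, the routine can be run on the hypothetical exponential-fit instance above (g, grad, J, F as defined in the earlier sketches; the parameters s = 1, alpha = 0.5, beta = 0.5 are typical backtracking choices, not values from the original):

% assumes g, grad, J, F from the hypothetical instance are in scope
[x, fun_val] = damped_Gauss_Newton(g, grad, J, F, [1; 1], 1, 0.5, 0.5, 1e-5);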