梯度-牛顿-拟牛顿优化算法和实现

1.要求解的问题

求解无约束非线性优化问题 

minxR2f(x)=100(x21x2)2+(x11)2

该问题有精确解 x=(1,1)T,f(x)=0

其梯度向量

g(x)=(400x1(x21x2)+2(x11),200(x21x2))

其海森矩阵
G(x)=[1200x21400x2+2400x1400x1200]

下面定义了三个Python语言编写的函数:函数表达式fun,梯度向量gfun,和海森矩阵hess。这三个表达式在后面各个算法的实现中会用到。

<code class="language-python hljs  has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;"><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 函数表达式fun</span>
fun = <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">lambda</span> x:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">100</span>*(x[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]**<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>-x[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>])**<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span> + (x[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>)**<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>

<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 梯度向量 gfun</span>
gfun = <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">lambda</span> x:np.array([<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">400</span>*x[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]*(x[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]**<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>-x[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>])+<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>*(x[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>), -<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">200</span>*(x[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]**<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>-x[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>])])

<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 海森矩阵 hess</span>
hess = <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">lambda</span> x:np.array([[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1200</span>*x[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]**<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">400</span>*x[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>]+<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>, -<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">400</span>*x[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]],[-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">400</span>*x[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>],<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">200</span>]])</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li></ul>

2.线搜索技术和Armijo准则

线搜索技术是求解许多优化问题下降算法的基本组成部分,但精确线性搜索需要计算很多的函数值和梯度值,从而耗费较多资源。特别是当迭代点远离最优点时,精确线搜索技术通常不是有效和合理的。对于许多优化算法,其收敛速度并不依赖于精确搜索过程。因此既能保证目标函数具有可接受的下降量,又能使最终形成的迭代序列收敛的非精确先搜索越来越流行。

符号约定:

  • gk  :  f(xk) ,即第目标函数关于 k 次迭代值 xk 的导数。
  • Gk  :  G(xk)=2f(xk) ,即海森矩阵。
  • dk  : 第 k 次迭代的优化方向。在最速下降算法中,有 dk=gk
  • αk  : 第 k 次迭代的步长因子,有 xk+1=xk+αkdk

在精确线性搜索中,步长因子 αk 由下面的式子确定: 

αk=argminαf(xk+αdk)

而对于非精确线性搜索,选取的 αk 只要使得目标函数 f 得到可接受的下降量,即 

Δfk=f(xk)f(xk+αkdk)>0

Armijo准则用于非精确线性搜索中步长因子 α 的确定,内容如下:

Armijo准则: 
已知当前位置 xk 和优化的方向 dk  ,参数 β(0,1) , δ(0,0.5) .令步长因子 αk=βmk ,其中 mk 为满足下列不等式的最小非负整数 m : 

f(xk+βmdk)f(xk)+δβmgTkdk

由此确定的下一个位置 xk+1=xk+αkdk  
NOTE: 上面的公式仅仅适用于梯度下降(Gradient Descent),对于梯度上升,则略有不用。首先公式变成了
f(xkβmdk)f(xk)δβmgTkdk,
然后确定下一个位置 xk+1=xkαkdk

该准则在接下来的介绍的几个算法中多次使用。

3.最速下降法及其Python实现

算法描述

step 1 :选取初始点 x0Rn ,容许误差 0ϵ1  .令  k1
step 2 :计算 gk=f(xk) . 若 ||gk||ϵ ,停止迭代,输出 xk 作为近似最优解。 
step 3 :取方向 dk=gk
step 4 :由线搜索技术确定步长因子 αk
step 5 :令 xk+1xk+αkdk ,转step 2.

上述step 3中步长因子 αk 的确定既可以使用精确线搜索方法,也可以使用非精确线搜索方法,在理论上都能保证其全局收敛性。若采用精确线搜索方法,则 αk 满足 

ddαf(xk+αdk)|α=αk=f(xk+αkdk)Tdk=0

从而有

f(xk+1)Tf(xk)=0

xk+1 处的梯度与其前驱 xk 处的梯度是正交的,也就是说,迭代点序列所走的路线是锯齿形的,这造成其收敛速度很缓慢。如下图所示,其中绿色折线所显示的路线就是由最速下降法得到的,红色曲线是由共轭梯度下降法确定的,通过对比可以看出梯度下降法所走的路线是锯齿形的,经测试,其收敛速度非常慢。

来自于wiki百科

最速下降法的Python实现

<code class="language-python hljs  has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;"><span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">gradient</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(fun,gfun,x0)</span>:</span>
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#最速下降法求解无约束问题</span>
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># x0是初始点,fun和gfun分别是目标函数和梯度</span>
    maxk = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">5000</span>
    rho = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.5</span>
    sigma = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.4</span>
    k = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>
    epsilon = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1e-5</span>

    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">while</span> k<maxk:
        gk = gfun(x0)
        dk = -gk
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> np.linalg.norm(dk) < epsilon:
            <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">break</span>
        m = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>
        mk = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">while</span> m< <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">20</span>:
            <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> fun(x0+rho**m*dk) < fun(x0) + sigma*rho**m*np.dot(gk,dk):
                mk = m
                <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">break</span>
            m += <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>
        x0 += rho**mk*dk
        k += <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>

    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> x0,fun(x0),k</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li></ul>

性能测试 
这里写图片描述

可以看到大约有一半的样本的迭代次数要超过2000次。

4.牛顿法

牛顿法的基本思想是用迭代点 xk 处的一阶导数(梯度 gk )和二阶倒数(海森矩阵 Gk )对目标函数进行二次函数近似,然后把二次模型的极小点作为新的迭代点。

牛顿法推导

函数 f(x) xk 处的泰勒展开式的前三项得 

T(f,xk,3)=fk+gTk(xxk)+12(xxk)TGk(xxk)

则其稳定点
T=gk+Gk(xxk)=0

Gk 非奇异,那么解上面的线性方程(标记其解为 xk+1 )即得到牛顿法的迭代公式
xk+1=xkG1kgk

即优化方向为 dk=G1kgk

求稳定点是用到了以下公式:

考虑 y=xTAx ,根据矩阵求导法则,有 

dydx=Ax+ATxd2ydx2=A+AT

注意实际中 dk 的是通过求解线性方程组 Gkd=gk 获得的。

阻尼牛顿法及其Python实现

阻尼牛顿法采用了牛顿法的思想,唯一的差别是阻尼牛顿法并不直接采用 xk+1=xk+dk ,而是用线搜索技术获得合适的步长因子 αk ,用公式 xk+1=xk+δmdk 计算 xk+1

阻尼牛顿法的算法描述

step 1: 给定终止误差 0ϵ1,δ(0,1),σ(0,0.5) . 初始点 x0Rn . 令 k0  
step 2: 计算 gk=f(xk) . 若 ||gk||ϵ ,停止迭代,输出 xxk  
step 3: 计算 Gk=2f(xk) ,并求解线性方程组得到解 dk ,

Gkd=gk

step 4: 记 mk 是满足下列不等式的最小非负整数 m .
f(xk+βmdk)f(xk)+δβmgTkdk

step 5: 令 αk=δmk,xk+1=xk+αkdk,kk+1 ,转 step 2

阻尼牛顿法的Python实现

<code class="language-python hljs  has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;"><span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">dampnm</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(fun,gfun,hess,x0)</span>:</span>
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 用阻尼牛顿法求解无约束问题</span>
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#x0是初始点,fun,gfun和hess分别是目标函数值,梯度,海森矩阵的函数</span>
    maxk = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">500</span>
    rho = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.55</span>
    sigma = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.4</span>
    k = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>
    epsilon = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1e-5</span>
    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">while</span> k < maxk:
        gk = gfun(x0)
        Gk = hess(x0)
        dk = -<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>*np.linalg.solve(Gk,gk)
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> np.linalg.norm(dk) < epsilon:
            <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">break</span>
        m = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>
        mk = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">while</span> m < <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">20</span>:
            <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> fun(x0+rho**m*dk) < fun(x0) + sigma*rho**m*np.dot(gk,dk):
                mk = m
                <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">break</span>
            m += <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>
        x0 += rho**mk*dk
        k += <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>

    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> x0,fun(x0),k</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li></ul>

性能展示 
这里写图片描述
作图代码:

<code class="language-python hljs  has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">iters = []
<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> xrange(np.shape(data)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]):
    rt = dampnm(fun,gfun,hess,data[i])
    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> rt[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>] <=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">200</span>:
        iters.append(rt[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>])
    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> i%<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">100</span> == <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>:
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">print</span> i,rt[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>]

plt.hist(iters,bins=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">50</span>)
plt.title(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'阻尼牛顿法迭代次数分布'</span>,{<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontname'</span>:<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'STFangsong'</span>,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontsize'</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">18</span>})
plt.xlabel(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'迭代次数'</span>,{<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontname'</span>:<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'STFangsong'</span>,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontsize'</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">18</span>})
plt.ylabel(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'频率分布'</span>,{<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontname'</span>:<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'STFangsong'</span>,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontsize'</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">18</span>})</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li></ul>

修正牛顿法法及其Python实现

牛顿法要求目标函数的海森矩阵 G(x)=2f(x) 在每个迭代点 xk 处是正定的,否则难以保证牛顿方向 dk=G1kgk f xk 处的下降方向。为克服这一缺陷,需要对牛顿法进行修正。 
下面介绍两种修正方法,分别是“牛顿-最速下降混合算法”和“修正牛顿法”。 
牛顿-最速下降混合算法 
该方法将牛顿法和最速下降法结合起来,基本思想是:当海森矩阵正定时,采用牛顿法确定的优化方向作为搜索方向;否则,即海森矩阵为非正定矩阵时,则采用负梯度方向作为搜索方向。

修正牛顿法 
引入阻尼因子 μk0 ,即在每一迭代步适当选取参数 μk ,使得矩阵 Ak=Gk+μkI 正定,用 Ak 代替 Gk 来求解 dk

算法描述

step 1: 给定终止误差 0ϵ1,δ(0,1),σ(0,0.5) . 初始点 x0Rn . 令 k0  
step 2: 计算 gk=f(xk) μk=||gk||1+τ . 若 ||gk||ϵ ,停止迭代,输出 xxk  
step 3: 计算 Gk=2f(xk) ,并求解线性方程组得到解 dk ,

(Gk+μkI)d=gk

step 4: 记 mk 是满足下列不等式的最小非负整数 m .
f(xk+βmdk)f(xk)+δβmgTkdk

step 5: 令 αk=δmk,xk+1=xk+αkdk,kk+1 ,转 step 2

修正牛顿法的Python实现代码

<code class="language-python hljs  has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;"><span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">revisenm</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(fun,gfun,hess,x0)</span>:</span>
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 用修正牛顿法求解无约束问题</span>
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#x0是初始点,fun,gfun和hess分别是目标函数值,梯度,海森矩阵的函数</span>

    maxk = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1e5</span>
    n = np.shape(x0)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]
    rho = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.55</span>
    sigma = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.4</span>
    tau = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.0</span>
    k = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>
    epsilon = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1e-5</span>

    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">while</span> k < maxk:
        gk = gfun(x0)      
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span>  np.linalg.norm(gk) < epsilon:
            <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">break</span>
        muk = np.power(np.linalg.norm(gk),<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>+tau)
        Gk = hess(x0)
        Ak = Gk + muk*np.eye(n)
        dk = -<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>*np.linalg.solve(Ak,gk)
        m = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>
        mk = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">while</span> m < <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">20</span>:
            <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> fun(x0+rho**m*dk) < fun(x0) + sigma*rho**m*np.dot(gk,dk):
                mk = m
                <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">break</span>
            m += <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>
        x0 += rho**mk*dk
        k += <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>
    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> x0,fun(x0),k</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li><li style="box-sizing: border-box; padding: 0px 5px;">26</li><li style="box-sizing: border-box; padding: 0px 5px;">27</li><li style="box-sizing: border-box; padding: 0px 5px;">28</li><li style="box-sizing: border-box; padding: 0px 5px;">29</li><li style="box-sizing: border-box; padding: 0px 5px;">30</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li><li style="box-sizing: border-box; padding: 0px 5px;">26</li><li style="box-sizing: border-box; padding: 0px 5px;">27</li><li style="box-sizing: border-box; padding: 0px 5px;">28</li><li style="box-sizing: border-box; padding: 0px 5px;">29</li><li style="box-sizing: border-box; padding: 0px 5px;">30</li></ul>

性能展示 
这里写图片描述

5.拟牛顿法

牛顿法的优点是具有二阶收敛速度,缺点是:

  • 但当海森矩阵 G(xk)=2f(x)  不正定时,不能保证所产生的方向是目标函数在 xk 处的下降方向。
  • 特别地,当 G(xk) 奇异时,算法就无法继续进行下去。尽管修正牛顿法可以克服这一缺陷,但修正参数的取值很难把握,过大或过小都会影响到收敛速度。
  • 牛顿法的每一步迭代都需要目标函数的海森矩阵 G(xk) ,对于大规模问题其计算量是惊人的。

拟牛顿法的基本思想是用海森矩阵 Gk 的某个近似矩阵 Bk 取代 Gk Bk 通常具有下面三个特点:

  • 在某种意义下有 BkGk  ,使得相应的算法产生的方向近似于牛顿方向,确保算法具有较快的收敛速度。
  • 对所有的 k Bk 是正定的,从而使得算法所产生的方向是函数 f xk 处下降方向。
  • 矩阵 Bk 更新规则比较简单

设  f:RnR 在开集 DRn 上二次连续可微,那么 f xk+1 处的二次近似模型为: 

f(x)f(xk+1)+gTk+1(xxk+1)+12(xxk+1)TGk+1(xxk+1)

对上式求导得

g(x)gk+1+Gk+1(xxk+1)

x=xk ,位移 sk=xk+1xk ,梯度差 yk=gk+1gk ,则有
Gk+1skyk
.

因此,拟牛顿法中近似矩阵 Bk 要满足关系式

Bk+1sk=yk

Hk+1=B1k+1 ,得到拟牛顿法的另一形式
Hk+1yk=sk

其中 Hk+1 为海森矩阵逆矩阵的近似。上面两个公式称为 拟牛顿方程 。 
搜索方向由 dk=Hkgk Bkdk=gk 确定。根据近似矩阵的第三个特点,有 
Bk+1=Bk+Ek,Hk+1=Hk+Dk

其中 Ek,Dk 为秩1或秩2矩阵。该公式称为 校正规则 。 
通常将上面的三个式子(拟牛顿方程和校正规则)所确立的方法称为拟牛顿法。从下面的拟牛顿法的几个变种可以看出,不同的拟牛顿法的主要区别在于更新公式的不同。

DFP算法及其Python实现

DFP算法的校正公式如下: 

Hk+1=HkHkykyTkHkyTkHkyk+sksTksTkyk

注意公式中的两个分数结构,分母 yTkHkyk sTkyk 是标量,分子 HkykyTkHk sksTk 是与 Hk 同型的矩阵,而且都是对称矩阵。若 Hk 正定,且 sTkyk>0 ,则 Hk+1 也正定。

当采用精确线搜索时,矩阵序列 Hk 的正定新条件 sTkyk>0 可以被满足。但对于 Armijo 搜索准则来说,不能满足这一条件,需要做如下修正: 

Hk+1=HkHkHkykyTkHkyTkHkyk+sksTksTkyksTkyk0sTkyk>0

DFP算法描述

step 1: 给定参数 δ(0,1),σ(0,0.5) ,初始点 x0R ,终止误差 0ϵ1 .初始对称正定矩阵 H0 (通常取为 G(x0)1 或单位矩阵 In ).令 k0  
step 2: 计算 gk=f(xk) . 若 ||gk||ϵ ,停止迭代,输出 xxk  
step 3: 计算搜索方向

dk=Hkgk

step 4: 记 mk 是满足下列不等式的最小非负整数 m .
f(xk+βmdk)f(xk)+δβmgTkdk
αk=δmk,xk+1=xk+αkdk  
step 5: 由校正公式确定 Hk+1  
step 6:  kk+1 ,转  step 2

DFP算法的代码实现

<code class="language-python hljs  has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;"><span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">dfp</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(fun,gfun,hess,x0)</span>:</span>
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#功能:用DFP族算法求解无约束问题:min fun(x)</span>
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#输入:x0是初始点,fun,gfun分别是目标函数和梯度</span>
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#输出:x,val分别是近似最优点和最优解,k是迭代次数</span>
    maxk = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1e5</span>
    rho = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.55</span>
    sigma = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.4</span>
    epsilon = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1e-5</span>
    k = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>
    n = np.shape(x0)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#海森矩阵可以初始化为单位矩阵</span>
    Hk = np.linalg.inv(hess(x0)) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#或者单位矩阵np.eye(n)</span>

    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">while</span> k < maxk :
        gk = gfun(x0)
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> np.linalg.norm(gk) < epsilon:
            <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">break</span>
        dk = -<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>*np.dot(Hk,gk)

        m=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>;
        mk=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">while</span> m < <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">20</span>: <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 用Armijo搜索求步长</span>
            <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> fun(x0+rho**m*dk) < fun(x0)+sigma*rho**m*np.dot(gk,dk):
                mk = m
                <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">break</span>
            m += <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>
        <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#print mk</span>
        <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#DFP校正</span>
        x = x0 + rho**mk*dk
        sk = x - x0
        yk = gfun(x) - gk   

        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> np.dot(sk,yk) > <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>:
            Hy = np.dot(Hk,yk)
            <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">print</span> Hy
            sy = np.dot(sk,yk) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#向量的点积</span>
            yHy = np.dot(np.dot(yk,Hk),yk) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># yHy是标量</span>
            <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#表达式Hy.reshape((n,1))*Hy 中Hy是向量,生成矩阵</span>
            Hk = Hk - <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>*Hy.reshape((n,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>))*Hy/yHy + <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>*sk.reshape((n,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>))*sk/sy

        k += <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>
        x0 = x


    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> x0,fun(x0),k<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#分别是最优点坐标,最优值,迭代次数</span></code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li><li style="box-sizing: border-box; padding: 0px 5px;">26</li><li style="box-sizing: border-box; padding: 0px 5px;">27</li><li style="box-sizing: border-box; padding: 0px 5px;">28</li><li style="box-sizing: border-box; padding: 0px 5px;">29</li><li style="box-sizing: border-box; padding: 0px 5px;">30</li><li style="box-sizing: border-box; padding: 0px 5px;">31</li><li style="box-sizing: border-box; padding: 0px 5px;">32</li><li style="box-sizing: border-box; padding: 0px 5px;">33</li><li style="box-sizing: border-box; padding: 0px 5px;">34</li><li style="box-sizing: border-box; padding: 0px 5px;">35</li><li style="box-sizing: border-box; padding: 0px 5px;">36</li><li style="box-sizing: border-box; padding: 0px 5px;">37</li><li style="box-sizing: border-box; padding: 0px 5px;">38</li><li style="box-sizing: border-box; padding: 0px 5px;">39</li><li style="box-sizing: border-box; padding: 0px 5px;">40</li><li style="box-sizing: border-box; padding: 0px 5px;">41</li><li style="box-sizing: border-box; padding: 0px 5px;">42</li><li style="box-sizing: border-box; padding: 0px 5px;">43</li><li style="box-sizing: border-box; padding: 0px 5px;">44</li><li style="box-sizing: border-box; padding: 0px 5px;">45</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li><li style="box-sizing: border-box; padding: 0px 5px;">26</li><li style="box-sizing: border-box; padding: 0px 5px;">27</li><li style="box-sizing: border-box; padding: 0px 5px;">28</li><li style="box-sizing: border-box; padding: 0px 5px;">29</li><li style="box-sizing: border-box; padding: 0px 5px;">30</li><li style="box-sizing: border-box; padding: 0px 5px;">31</li><li style="box-sizing: border-box; padding: 0px 5px;">32</li><li style="box-sizing: border-box; padding: 0px 5px;">33</li><li style="box-sizing: border-box; padding: 0px 5px;">34</li><li style="box-sizing: border-box; padding: 0px 5px;">35</li><li style="box-sizing: border-box; padding: 0px 5px;">36</li><li style="box-sizing: border-box; padding: 0px 5px;">37</li><li style="box-sizing: border-box; padding: 0px 5px;">38</li><li style="box-sizing: border-box; padding: 0px 5px;">39</li><li style="box-sizing: border-box; padding: 0px 5px;">40</li><li style="box-sizing: border-box; padding: 0px 5px;">41</li><li style="box-sizing: border-box; padding: 0px 5px;">42</li><li style="box-sizing: border-box; padding: 0px 5px;">43</li><li style="box-sizing: border-box; padding: 0px 5px;">44</li><li style="box-sizing: border-box; padding: 0px 5px;">45</li></ul>

由以上代码可知,拟牛顿法只需要在迭代开始前计算一次海森矩阵,更一般的,根本就不计算海森矩阵,而是初始化为单位矩阵,在每次迭代过程中是不需按传统方法(二阶导数)计算海森矩阵的。

实际性能 
这里写图片描述
相关代码:

<code class="language-python hljs  has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">n = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">50</span>
x = y = np.linspace(-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">10</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">10</span>,n)
xx,yy = np.meshgrid(x,y)
data = [[xx[i][j],yy[i][j]] <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(n) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> j <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(n)]
iters = []
<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> xrange(np.shape(data)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]):
    rt = dfp(fun,gfun,hess,np.array(data[i]))
    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> rt[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>] <=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">200</span>: <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 提出迭代次数过大的</span>
        iters.append(rt[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>])
    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> i%<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">100</span> == <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>:
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">print</span> i,rt[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>]

plt.hist(iters,bins=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">50</span>)
plt.title(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'DFP迭代次数分布'</span>,{<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontname'</span>:<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'STFangsong'</span>,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontsize'</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">18</span>})
plt.xlabel(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'迭代次数'</span>,{<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontname'</span>:<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'STFangsong'</span>,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontsize'</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">18</span>})
plt.ylabel(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'频率分布'</span>,{<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontname'</span>:<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'STFangsong'</span>,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontsize'</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">18</span>})</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li></ul>

BFGS算法及其Python实现

BFGS算法与DFP步骤基本相同,区别在于更新公式的差异 

Bk+1=BkBkBkykyTkBkyTkBkyk+sksTksTkyksTkyk0sTkyk>0

BFGS算法的Python实现

<code class="language-python hljs  has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;"><span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">bfgs</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(fun,gfun,hess,x0)</span>:</span>
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#功能:用BFGS族算法求解无约束问题:min fun(x)</span>
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#输入:x0是初始点,fun,gfun分别是目标函数和梯度</span>
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#输出:x,val分别是近似最优点和最优解,k是迭代次数  </span>
    maxk = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1e5</span>
    rho = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.55</span>
    sigma = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.4</span>
    epsilon = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1e-5</span>
    k = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>
    n = np.shape(x0)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#海森矩阵可以初始化为单位矩阵</span>
    Bk = eye(n)<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#np.linalg.inv(hess(x0)) #或者单位矩阵np.eye(n)</span>

    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">while</span> k < maxk:
        gk = gfun(x0)
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> np.linalg.norm(gk) < epsilon:
            <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">break</span>
        dk = -<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>*np.linalg.solve(Bk,gk)
        m = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>
        mk = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">while</span> m < <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">20</span>: <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 用Armijo搜索求步长</span>
            <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> fun(x0+rho**m*dk) < fun(x0)+sigma*rho**m*np.dot(gk,dk):
                mk = m
                <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">break</span>
            m += <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>

        <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#BFGS校正</span>
        x = x0 + rho**mk*dk
        sk = x - x0
        yk = gfun(x) - gk   

        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> np.dot(sk,yk) > <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>:    
            Bs = np.dot(Bk,sk)
            ys = np.dot(yk,sk)
            sBs = np.dot(np.dot(sk,Bk),sk) 

            Bk = Bk - <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>*Bs.reshape((n,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>))*Bs/sBs + <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>*yk.reshape((n,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>))*yk/ys

        k += <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>
        x0 = x

    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> x0,fun(x0),k<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#分别是最优点坐标,最优值,迭代次数 </span></code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li><li style="box-sizing: border-box; padding: 0px 5px;">26</li><li style="box-sizing: border-box; padding: 0px 5px;">27</li><li style="box-sizing: border-box; padding: 0px 5px;">28</li><li style="box-sizing: border-box; padding: 0px 5px;">29</li><li style="box-sizing: border-box; padding: 0px 5px;">30</li><li style="box-sizing: border-box; padding: 0px 5px;">31</li><li style="box-sizing: border-box; padding: 0px 5px;">32</li><li style="box-sizing: border-box; padding: 0px 5px;">33</li><li style="box-sizing: border-box; padding: 0px 5px;">34</li><li style="box-sizing: border-box; padding: 0px 5px;">35</li><li style="box-sizing: border-box; padding: 0px 5px;">36</li><li style="box-sizing: border-box; padding: 0px 5px;">37</li><li style="box-sizing: border-box; padding: 0px 5px;">38</li><li style="box-sizing: border-box; padding: 0px 5px;">39</li><li style="box-sizing: border-box; padding: 0px 5px;">40</li><li style="box-sizing: border-box; padding: 0px 5px;">41</li><li style="box-sizing: border-box; padding: 0px 5px;">42</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li><li style="box-sizing: border-box; padding: 0px 5px;">26</li><li style="box-sizing: border-box; padding: 0px 5px;">27</li><li style="box-sizing: border-box; padding: 0px 5px;">28</li><li style="box-sizing: border-box; padding: 0px 5px;">29</li><li style="box-sizing: border-box; padding: 0px 5px;">30</li><li style="box-sizing: border-box; padding: 0px 5px;">31</li><li style="box-sizing: border-box; padding: 0px 5px;">32</li><li style="box-sizing: border-box; padding: 0px 5px;">33</li><li style="box-sizing: border-box; padding: 0px 5px;">34</li><li style="box-sizing: border-box; padding: 0px 5px;">35</li><li style="box-sizing: border-box; padding: 0px 5px;">36</li><li style="box-sizing: border-box; padding: 0px 5px;">37</li><li style="box-sizing: border-box; padding: 0px 5px;">38</li><li style="box-sizing: border-box; padding: 0px 5px;">39</li><li style="box-sizing: border-box; padding: 0px 5px;">40</li><li style="box-sizing: border-box; padding: 0px 5px;">41</li><li style="box-sizing: border-box; padding: 0px 5px;">42</li></ul>

实际性能 
这里写图片描述
相关代码:

<code class="language-python hljs  has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">iters = []
<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> xrange(np.shape(data)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]):
    rt = bfgs(fun,gfun,hess,np.array(data[i]))
    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> rt[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>] <=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">200</span>:
        iters.append(rt[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>])
    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> i%<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">100</span> == <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>:
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">print</span> i,rt[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>]

plt.hist(iters,bins=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">50</span>)
plt.title(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'BFGS迭代次数分布'</span>,{<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontname'</span>:<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'STFangsong'</span>,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontsize'</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">18</span>})
plt.xlabel(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'迭代次数'</span>,{<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontname'</span>:<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'STFangsong'</span>,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontsize'</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">18</span>})
plt.ylabel(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'频率分布'</span>,{<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontname'</span>:<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'STFangsong'</span>,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontsize'</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">18</span>})</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li></ul>

Broyden族算法及其Python实现

之前的BFGS和DFP校正都是由 yk Bksk (或  sk Hkyk )组成的秩2矩阵。而Broyden族算法采用了BFGS和DFP校正公式的凸组合,如下: 

Hϕk+1=ϕkHBFGSk+1+(1ϕk)HDFPk+1=HkHkykyTkHkyTkHkyk+sksTksTkyk+ϕkvkvTk

其中 ϕk[0,1] vk 由下式定义: 
vk=yTkHkyk(skyTkskHkykyTkHkyk)

Broyden族算法Python实现

<code class="language-python hljs  has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;"><span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">broyden</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(fun,gfun,hess,x0)</span>:</span>
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#功能:用Broyden族算法求解无约束问题:min fun(x)</span>
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#输入:x0是初始点,fun,gfun分别是目标函数和梯度</span>
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#输出:x,val分别是近似最优点和最优解,k是迭代次数</span>
    x0 = np.array(x0)

    maxk = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1e5</span>
    rho = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.55</span>;
    sigma = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.4</span>;
    epsilon = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1e-5</span>
    phi = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.5</span>;
    k=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>;
    n = np.shape(x0)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]

    Hk = np.linalg.inv(hess(x0))

    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">while</span> k<maxk :
        gk = gfun(x0)
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> np.linalg.norm(gk) < epsilon:
            <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">break</span>
        dk = -<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>*np.dot(Hk,gk)

        m=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>;mk=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">while</span> m < <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">20</span>: <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 用Armijo搜索求步长</span>
            <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> fun(x0+rho**m*dk) < fun(x0)+sigma*rho**m*np.dot(gk,dk):
                mk = m
                <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">break</span>
            m += <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>
        <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#Broyden族校正</span>
        x = x0 + rho**mk*dk
        sk = x - x0
        yk = gfun(x) - gk

        Hy = np.dot(Hk,yk)
        sy = np.dot(sk,yk)
        yHy = np.dot(np.dot(yk,Hk),yk)

        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span>(sy < <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.2</span> *yHy):
            theta = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.8</span>*yHy/(yHy-sy)
            sk = theta*sk + (<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>-theta)*Hy
            sy = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.2</span>*yHy

        vk = np.sqrt(yHy)*(sk/sy-Hy/yHy)
        Hk = Hk - Hy.reshape((n,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>))*Hy/yHy +sk.reshape((n,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>))*sk/sy + phi*vk.reshape((n,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>))*vk
        k += <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>
        x0 = x
    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> x0,fun(x0),k <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#分别是最优点坐标,最优值,迭代次数</span></code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li><li style="box-sizing: border-box; padding: 0px 5px;">26</li><li style="box-sizing: border-box; padding: 0px 5px;">27</li><li style="box-sizing: border-box; padding: 0px 5px;">28</li><li style="box-sizing: border-box; padding: 0px 5px;">29</li><li style="box-sizing: border-box; padding: 0px 5px;">30</li><li style="box-sizing: border-box; padding: 0px 5px;">31</li><li style="box-sizing: border-box; padding: 0px 5px;">32</li><li style="box-sizing: border-box; padding: 0px 5px;">33</li><li style="box-sizing: border-box; padding: 0px 5px;">34</li><li style="box-sizing: border-box; padding: 0px 5px;">35</li><li style="box-sizing: border-box; padding: 0px 5px;">36</li><li style="box-sizing: border-box; padding: 0px 5px;">37</li><li style="box-sizing: border-box; padding: 0px 5px;">38</li><li style="box-sizing: border-box; padding: 0px 5px;">39</li><li style="box-sizing: border-box; padding: 0px 5px;">40</li><li style="box-sizing: border-box; padding: 0px 5px;">41</li><li style="box-sizing: border-box; padding: 0px 5px;">42</li><li style="box-sizing: border-box; padding: 0px 5px;">43</li><li style="box-sizing: border-box; padding: 0px 5px;">44</li><li style="box-sizing: border-box; padding: 0px 5px;">45</li><li style="box-sizing: border-box; padding: 0px 5px;">46</li><li style="box-sizing: border-box; padding: 0px 5px;">47</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li><li style="box-sizing: border-box; padding: 0px 5px;">26</li><li style="box-sizing: border-box; padding: 0px 5px;">27</li><li style="box-sizing: border-box; padding: 0px 5px;">28</li><li style="box-sizing: border-box; padding: 0px 5px;">29</li><li style="box-sizing: border-box; padding: 0px 5px;">30</li><li style="box-sizing: border-box; padding: 0px 5px;">31</li><li style="box-sizing: border-box; padding: 0px 5px;">32</li><li style="box-sizing: border-box; padding: 0px 5px;">33</li><li style="box-sizing: border-box; padding: 0px 5px;">34</li><li style="box-sizing: border-box; padding: 0px 5px;">35</li><li style="box-sizing: border-box; padding: 0px 5px;">36</li><li style="box-sizing: border-box; padding: 0px 5px;">37</li><li style="box-sizing: border-box; padding: 0px 5px;">38</li><li style="box-sizing: border-box; padding: 0px 5px;">39</li><li style="box-sizing: border-box; padding: 0px 5px;">40</li><li style="box-sizing: border-box; padding: 0px 5px;">41</li><li style="box-sizing: border-box; padding: 0px 5px;">42</li><li style="box-sizing: border-box; padding: 0px 5px;">43</li><li style="box-sizing: border-box; padding: 0px 5px;">44</li><li style="box-sizing: border-box; padding: 0px 5px;">45</li><li style="box-sizing: border-box; padding: 0px 5px;">46</li><li style="box-sizing: border-box; padding: 0px 5px;">47</li></ul>

实际性能 
这里写图片描述
相关代码:

<code class="language-python hljs  has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">iters = []
<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> xrange(np.shape(data)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]):
    rt = broyden(fun,gfun,hess,np.array(data[i]))
    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> rt[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>] <=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">200</span>:
        iters.append(rt[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>])
    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> i%<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">100</span> == <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>:
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">print</span> i,rt[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>]

plt.hist(iters,bins=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">50</span>)
plt.title(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'Broyden迭代次数分布'</span>,{<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontname'</span>:<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'STFangsong'</span>,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontsize'</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">18</span>})
plt.xlabel(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'迭代次数'</span>,{<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontname'</span>:<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'STFangsong'</span>,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontsize'</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">18</span>})
plt.ylabel(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'频率分布'</span>,{<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontname'</span>:<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'STFangsong'</span>,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontsize'</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">18</span>})</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li></ul>

L-BFGS算法及其Python实现

拟牛顿法(如BFGS算法)需要计算和存储海森矩阵,其空间复杂度是 n2 ,当 n 很大时,其需要的内存量是非常大的。为了解决该内存问题,有限内存BFGS(即传说中的L-BFGS算法)横空出世。

基本BFGS算法的校正公式可以改写成 

Hk+1=(IskyTksTkyk)Hk(IyksTksTkyk)+sksTksTkyk

ρk=1/sTkyk ,以及 Vk=(IρkyksTk) ,则上式可以写成 
Hk+1=VTkHkVk+ρksksTk

给定一个初始矩阵 H0 (其取值后面有提到),则由上式的递推关系可以得到

H1=VT0H0V0+ρ0s0sT0

H2=VT1H1V1+ρ1s1sT1=VT1(VT0H0V0+ρ0s0sT0)V1+ρ1s1sT1=VT1VT0H0V0V1+VT1ρ0s0sT0V1+ρ1s1sT1


更一般的,我们有

Hm+1=(VTKVTm1VT1VT0)H0(V0V1Vm1Vk)+(VTmVTm1VT2VT1)ρ0s0sT0(V1V2Vm1Vk)+(VTmVTm1VT3VT2)ρ1s1sT1(V2V3Vm1Vk)++(VTmVTm1)ρm2sm2sTm2(Vm1Vk)VTkρm1sm1sTm1Vk+ρmsmsTm

在上式的右边中, H0 是由我们设置的,其余变量可由保存的最近 m 次迭代所形成的向量序列,如位移向量序列 {sk} 和一阶导数差所形成的向量序列 {yk} 获得。也就是说,可由最近 m 次迭代的信息计算当前的海森矩阵的近似矩阵。

补充几点

  • H0=IsTmymyTmym ,第一次迭代时,序列{ sk }和{ yk }为空,则 H0=I
  • 最初的若干次迭代, 序列{ sk }和{ yk }的元素个数较小,会有 n<m ,则将 Hm+1 计算公式右边的 m 换成 n 即可。
  • 随着迭代次数增加, 序列{ sk }和{ yk }的元素个数增大,会有 n>m 。由于我们只需要最近 m 次迭代产生的序列元素,所以序列{ sk }和{ yk }只需要保存最新的 m 个元素即可,如果再有新的元素进入,则同时扔掉最旧的元素,以保证序列元素个数恒为 m

这就是L-BFGS算法的思想。聪明的同学会问:你上面的公式不还是要计算海森矩阵的近似矩阵吗?那岂不是还是需要 n2 规模的内存? 
其实在实际中是不计算该矩阵的, 而且不是采用上面的方法,而是直接得到 Hkgk ,而搜索方向就是 dk=Hkgk 。下面给出了计算的函数twoloop(s,y, ρ ,gk)的伪代码,可知该函数内部的计算仅限于标量和向量,未出现矩阵。

函数参数 s,y 即向量序列{ sk }和{ yk }, ρ 为元素序列{ ρk },其元素 ρk=1/sTkyk gk 是向量,为当前的一阶导数,输出为向量 z=Hkgk ,即搜索方向的反方向 

Functiontwoloop(s,y,ρ,gk)q=gkFori=k1,k2,,kmαi=ρisTiqq=qαiyiHk=yTk1sk1/yTk1yk1z=HkqFori=km,km+1,,k1βi=ρiyTizz=z+si(αiβi)Returnz

给出L-BFGS的算法

可以看到其搜索方向 dk 是根据函数 twoloop 计算得到的。 
这里写图片描述

L-BFGS算法Python实现

<code class="language-python hljs  has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;"><span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">twoloop</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(s, y, rho,gk)</span>:</span>

    n = len(s) <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#向量序列的长度</span>

    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> np.shape(s)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>] >= <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>:
        <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#h0是标量,而非矩阵</span>
        h0 = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>*np.dot(s[-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>],y[-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>])/np.dot(y[-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>],y[-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>])
    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">else</span>:
        h0 = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>

    a = empty((n,))

    q = gk.copy() 
    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(n - <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>, -<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>, -<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>): 
        a[i] = rho[i] * dot(s[i], q)
        q -= a[i] * y[i]
    z = h0*q

    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(n):
        b = rho[i] * dot(y[i], z)
        z += s[i] * (a[i] - b)

    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> z   

<span class="hljs-function" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">def</span> <span class="hljs-title" style="box-sizing: border-box;">lbfgs</span><span class="hljs-params" style="color: rgb(102, 0, 102); box-sizing: border-box;">(fun,gfun,x0,m=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">5</span>)</span>:</span>
    <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># fun和gfun分别是目标函数及其一阶导数,x0是初值,m为储存的序列的大小</span>
    maxk = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2000</span>
    rou = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.55</span>
    sigma = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0.4</span>
    epsilon = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1e-5</span>
    k = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>
    n = np.shape(x0)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>] <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#自变量的维度</span>

    s, y, rho = [], [], []

    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">while</span> k < maxk :
        gk = gfun(x0)
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> np.linalg.norm(gk) < epsilon:
            <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">break</span>

        dk = -<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>*twoloop(s, y, rho,gk)

        m0=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>;
        mk=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">while</span> m0 < <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">20</span>: <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># 用Armijo搜索求步长</span>
            <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> fun(x0+rou**m0*dk) < fun(x0)+sigma*rou**m0*np.dot(gk,dk):
                mk = m0
                <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">break</span>
            m0 += <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>


        x = x0 + rou**mk*dk
        sk = x - x0
        yk = gfun(x) - gk   

        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> np.dot(sk,yk) > <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>: <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#增加新的向量</span>
            rho.append(<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.0</span>/np.dot(sk,yk))
            s.append(sk)
            y.append(yk)
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> np.shape(rho)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>] > m: <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#弃掉最旧向量</span>
            rho.pop(<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>)
            s.pop(<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>)
            y.pop(<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>)

        k += <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>
        x0 = x

    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">return</span> x0,fun(x0),k<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">#分别是最优点坐标,最优值,迭代次数</span></code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li><li style="box-sizing: border-box; padding: 0px 5px;">26</li><li style="box-sizing: border-box; padding: 0px 5px;">27</li><li style="box-sizing: border-box; padding: 0px 5px;">28</li><li style="box-sizing: border-box; padding: 0px 5px;">29</li><li style="box-sizing: border-box; padding: 0px 5px;">30</li><li style="box-sizing: border-box; padding: 0px 5px;">31</li><li style="box-sizing: border-box; padding: 0px 5px;">32</li><li style="box-sizing: border-box; padding: 0px 5px;">33</li><li style="box-sizing: border-box; padding: 0px 5px;">34</li><li style="box-sizing: border-box; padding: 0px 5px;">35</li><li style="box-sizing: border-box; padding: 0px 5px;">36</li><li style="box-sizing: border-box; padding: 0px 5px;">37</li><li style="box-sizing: border-box; padding: 0px 5px;">38</li><li style="box-sizing: border-box; padding: 0px 5px;">39</li><li style="box-sizing: border-box; padding: 0px 5px;">40</li><li style="box-sizing: border-box; padding: 0px 5px;">41</li><li style="box-sizing: border-box; padding: 0px 5px;">42</li><li style="box-sizing: border-box; padding: 0px 5px;">43</li><li style="box-sizing: border-box; padding: 0px 5px;">44</li><li style="box-sizing: border-box; padding: 0px 5px;">45</li><li style="box-sizing: border-box; padding: 0px 5px;">46</li><li style="box-sizing: border-box; padding: 0px 5px;">47</li><li style="box-sizing: border-box; padding: 0px 5px;">48</li><li style="box-sizing: border-box; padding: 0px 5px;">49</li><li style="box-sizing: border-box; padding: 0px 5px;">50</li><li style="box-sizing: border-box; padding: 0px 5px;">51</li><li style="box-sizing: border-box; padding: 0px 5px;">52</li><li style="box-sizing: border-box; padding: 0px 5px;">53</li><li style="box-sizing: border-box; padding: 0px 5px;">54</li><li style="box-sizing: border-box; padding: 0px 5px;">55</li><li style="box-sizing: border-box; padding: 0px 5px;">56</li><li style="box-sizing: border-box; padding: 0px 5px;">57</li><li style="box-sizing: border-box; padding: 0px 5px;">58</li><li style="box-sizing: border-box; padding: 0px 5px;">59</li><li style="box-sizing: border-box; padding: 0px 5px;">60</li><li style="box-sizing: border-box; padding: 0px 5px;">61</li><li style="box-sizing: border-box; padding: 0px 5px;">62</li><li style="box-sizing: border-box; padding: 0px 5px;">63</li><li style="box-sizing: border-box; padding: 0px 5px;">64</li><li style="box-sizing: border-box; padding: 0px 5px;">65</li><li style="box-sizing: border-box; padding: 0px 5px;">66</li><li style="box-sizing: border-box; padding: 0px 5px;">67</li><li style="box-sizing: border-box; padding: 0px 5px;">68</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li><li style="box-sizing: border-box; padding: 0px 5px;">18</li><li style="box-sizing: border-box; padding: 0px 5px;">19</li><li style="box-sizing: border-box; padding: 0px 5px;">20</li><li style="box-sizing: border-box; padding: 0px 5px;">21</li><li style="box-sizing: border-box; padding: 0px 5px;">22</li><li style="box-sizing: border-box; padding: 0px 5px;">23</li><li style="box-sizing: border-box; padding: 0px 5px;">24</li><li style="box-sizing: border-box; padding: 0px 5px;">25</li><li style="box-sizing: border-box; padding: 0px 5px;">26</li><li style="box-sizing: border-box; padding: 0px 5px;">27</li><li style="box-sizing: border-box; padding: 0px 5px;">28</li><li style="box-sizing: border-box; padding: 0px 5px;">29</li><li style="box-sizing: border-box; padding: 0px 5px;">30</li><li style="box-sizing: border-box; padding: 0px 5px;">31</li><li style="box-sizing: border-box; padding: 0px 5px;">32</li><li style="box-sizing: border-box; padding: 0px 5px;">33</li><li style="box-sizing: border-box; padding: 0px 5px;">34</li><li style="box-sizing: border-box; padding: 0px 5px;">35</li><li style="box-sizing: border-box; padding: 0px 5px;">36</li><li style="box-sizing: border-box; padding: 0px 5px;">37</li><li style="box-sizing: border-box; padding: 0px 5px;">38</li><li style="box-sizing: border-box; padding: 0px 5px;">39</li><li style="box-sizing: border-box; padding: 0px 5px;">40</li><li style="box-sizing: border-box; padding: 0px 5px;">41</li><li style="box-sizing: border-box; padding: 0px 5px;">42</li><li style="box-sizing: border-box; padding: 0px 5px;">43</li><li style="box-sizing: border-box; padding: 0px 5px;">44</li><li style="box-sizing: border-box; padding: 0px 5px;">45</li><li style="box-sizing: border-box; padding: 0px 5px;">46</li><li style="box-sizing: border-box; padding: 0px 5px;">47</li><li style="box-sizing: border-box; padding: 0px 5px;">48</li><li style="box-sizing: border-box; padding: 0px 5px;">49</li><li style="box-sizing: border-box; padding: 0px 5px;">50</li><li style="box-sizing: border-box; padding: 0px 5px;">51</li><li style="box-sizing: border-box; padding: 0px 5px;">52</li><li style="box-sizing: border-box; padding: 0px 5px;">53</li><li style="box-sizing: border-box; padding: 0px 5px;">54</li><li style="box-sizing: border-box; padding: 0px 5px;">55</li><li style="box-sizing: border-box; padding: 0px 5px;">56</li><li style="box-sizing: border-box; padding: 0px 5px;">57</li><li style="box-sizing: border-box; padding: 0px 5px;">58</li><li style="box-sizing: border-box; padding: 0px 5px;">59</li><li style="box-sizing: border-box; padding: 0px 5px;">60</li><li style="box-sizing: border-box; padding: 0px 5px;">61</li><li style="box-sizing: border-box; padding: 0px 5px;">62</li><li style="box-sizing: border-box; padding: 0px 5px;">63</li><li style="box-sizing: border-box; padding: 0px 5px;">64</li><li style="box-sizing: border-box; padding: 0px 5px;">65</li><li style="box-sizing: border-box; padding: 0px 5px;">66</li><li style="box-sizing: border-box; padding: 0px 5px;">67</li><li style="box-sizing: border-box; padding: 0px 5px;">68</li></ul>

实际性能 
这里写图片描述

相关代码:

<code class="language-python hljs  has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">n = <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">50</span>
x = y = np.linspace(-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">10</span>,<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">10</span>,n)
xx,yy = np.meshgrid(x,y)
data = [[xx[i][j],yy[i][j]] <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(n) <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> j <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> range(n)]

iters = []
<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">for</span> i <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">in</span> xrange(np.shape(data)[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>]):
    rt = lbfgs(fun,gfun,data[i])
    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> rt[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>] <=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1000</span>:
        iters.append(rt[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>])
    <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">if</span> i%<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">100</span> == <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>:
        <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">print</span> i,rt[<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">2</span>]

plt.hist(iters,bins=<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">100</span>)
plt.title(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'L-BFGS法迭代次数分布'</span>,{<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontname'</span>:<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'STFangsong'</span>,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontsize'</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">18</span>})
plt.xlabel(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'迭代次数'</span>,{<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontname'</span>:<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'STFangsong'</span>,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontsize'</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">18</span>})
plt.ylabel(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">u'频率分布'</span>,{<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontname'</span>:<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'STFangsong'</span>,<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">'fontsize'</span>:<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">18</span>})</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li></ul><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li><li style="box-sizing: border-box; padding: 0px 5px;">11</li><li style="box-sizing: border-box; padding: 0px 5px;">12</li><li style="box-sizing: border-box; padding: 0px 5px;">13</li><li style="box-sizing: border-box; padding: 0px 5px;">14</li><li style="box-sizing: border-box; padding: 0px 5px;">15</li><li style="box-sizing: border-box; padding: 0px 5px;">16</li><li style="box-sizing: border-box; padding: 0px 5px;">17</li></ul>

参考文献

  • 1
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值