First, let's sort out two concepts:
Definition: given that $u, v \in \mathbb{R}^n$, then $u$ and $v$ are said to be mutually orthogonal if $(u, v) = u^T v = 0$ (where $(u, v)$ is our notation for the scalar product).
Definition: given that $u, v \in \mathbb{R}^n$, then $u$ and $v$ are said to be mutually conjugate with respect to a symmetric positive definite matrix $A$ if $u$ and $Av$ are mutually orthogonal, i.e. $u^T A v = (u, Av) = 0$.
So, judging from the definitions, conjugate gradient descent is easy to understand: conjugacy simply inserts a positive definite matrix between the two vectors, and the criterion is still that the product equals zero.
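As a quick sanity check of the two definitions, here is a small numpy sketch (the matrix A and the vectors below are made up purely for illustration); it shows a pair of vectors that are conjugate with respect to A without being orthogonal:

import numpy as np

# A symmetric positive definite matrix (made up for illustration).
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

u = np.array([1.0, 0.0])
# Construct v so that u^T A v = 0: here u^T A = [3, 1], so v = [1, -3] works.
v = np.array([1.0, -3.0])

print("u^T v   =", u @ v)      # != 0 -> u and v are NOT orthogonal
print("u^T A v =", u @ A @ v)  # == 0 -> u and v ARE mutually conjugate w.r.t. A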
------------------------------------ The "conjugate" part of conjugate gradient shows up in $\beta_k$ ---------------------------------------------------------------------
The proof concerning $\beta_k$ comes from [5]; the concept of conjugacy shows up in that derivation.
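For reference, the standard Fletcher-Reeves choice (presumably what the derivation in [5] arrives at; the subscript convention here is my own) is

$$\beta_k = \frac{g_k^T g_k}{g_{k-1}^T g_{k-1}}, \qquad d_k = -g_k + \beta_k d_{k-1},$$

where $g_k = \nabla f(x_k)$ is the gradient at the current iterate and $d_k$ is the new search direction.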
--------------- How the conjugate gradient method is expressed in the paper ------------------
-------------------------------------------------------------
The derivation [5] of the 1964 version's $\alpha_k$ above (which is the $\beta_k$ of this article):
For a comparison of the two algorithms' pseudocode, see [4].
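To make the pseudocode difference concrete, here is a minimal sketch (not the code from [2] or [4]; the quadratic test problem and the exact line search are my own simplifying assumptions): steepest descent takes d_k = -g_k, while Fletcher-Reeves CG takes d_k = -g_k + beta_k * d_{k-1}:

import numpy as np

def minimize_quadratic(A, b, x0, method="fr_cg", tol=1e-10, max_iter=500):
    # Minimize f(x) = 0.5 * x^T A x + b^T x (A symmetric positive definite)
    # with an exact line search and two choices of search direction:
    #   "steepest": d_k = -g_k
    #   "fr_cg":    d_k = -g_k + beta_k * d_{k-1}   (Fletcher-Reeves)
    x = np.asarray(x0, dtype=float)
    g = A @ x + b                       # gradient of the quadratic
    d = -g
    for k in range(max_iter):
        if np.linalg.norm(g) < tol:
            return x, k                 # converged after k iterations
        alpha = -(d @ g) / (d @ A @ d)  # exact minimizer along d for a quadratic
        x = x + alpha * d
        g_new = A @ x + b
        if method == "steepest":
            d = -g_new
        else:
            beta = (g_new @ g_new) / (g @ g)  # Fletcher-Reeves beta_k
            d = -g_new + beta * d
        g = g_new
    return x, max_iter

A = np.array([[4.0, 1.0], [1.0, 3.0]])   # symmetric positive definite
b = np.array([-1.0, -2.0])
x0 = np.zeros(2)
print(minimize_quadratic(A, b, x0, "steepest"))  # converges, but needs more iterations
print(minimize_quadratic(A, b, x0, "fr_cg"))     # reaches the minimum in at most n = 2 iterations

On a quadratic, the conjugate gradient variant reaches the minimizer in at most n iterations, while steepest descent only converges linearly.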
--------------------------------------------------------------- Next, the code ------------------------------------------------------------------------------
Reference [2] gives a concrete implementation, but since scipy already ships a conjugate gradient routine, let's not bother hand-rolling our own.
The two relevant routines are:
the scipy.optimize.fmin_cg function
and
the scipy.sparse.linalg.cg function.
However, the latter does not come with a concrete example, so let's use the former. The code is as follows:
import numpy as np

args = (2, 3, 7, 8, 9, 10)  # parameters of the objective function

def f(x, *args):  # the objective function
    u, v = x
    a, b, c, d, e, f = args
    return a*u**2 + b*u*v + c*v**2 + d*u + e*v + f

def gradf(x, *args):  # gradient of the objective function
    u, v = x
    a, b, c, d, e, f = args
    gu = 2*a*u + b*v + d  # u-component of the gradient
    gv = b*u + 2*c*v + e  # v-component of the gradient
    return np.asarray((gu, gv))

x0 = np.asarray((0, 0))  # Initial guess.

# Now call the library routine.
from scipy import optimize
res1 = optimize.fmin_cg(f, x0, fprime=gradf, args=args)
print("Vector at which the minimum is attained:", res1)
-----------------------------------------------------------------------------------------
Quadratic termination does not mean that the iteration stops after two steps; do not get this wrong. It means that, in exact arithmetic, the method finds the exact minimizer of an $n$-dimensional quadratic objective in at most $n$ iterations.
There is also a very important resource, [6]. I did not get around to reading it, but it is genuinely thorough; it is worth a look later if time and need allow.
[1] Conjugate Directions
[2] A Python implementation of the conjugate gradient method
[3] Official example
[4] The difference between the pseudocode of the Fletcher-Reeves Conjugate Descent and Steepest Descent algorithms
[5] Optimization: the conjugate gradient method
[6] "An Introduction to the Conjugate Gradient Method Without the Agonizing Pain", Jonathan Richard Shewchuk