BFGS求解凸二次规划
f ( x ) = 1 2 x T A x + b T x g = A x + b f(x)=\frac{1}{2}x^TAx+b^Tx \\ g=Ax+b f(x)=21xTAx+bTxg=Ax+b
def f(x):
return 0.5*np.dot(np.dot(x,A),x)+np.dot(b,x)+c
def g(x):
return np.dot(A,x)+b
采用精确线搜索求步长:
alpha_k = -np.dot(pk.T, g(xk))/np.dot(pk.T,np.dot(A,pk))
样正HK 采用如下公式
from scipy.optimize import minimize
from numpy import *
import numpy
import numpy as np
def minimize_bfgs(f, x0, jac=None,gtol=1e-5,disp=False):
# 第一步,给出初始点x0, Ho, 计算go
gfk = g(x0)
k = 0
N = len(x0)
I = numpy.eye(N, dtype=int)
Hk = I
# 初始化Hessian 矩阵
xk = x0
gnorm = numpy.amax(numpy.abs(gfk))
while (gnorm > gtol):
# 第二步 : 求下降方向
pk = -numpy.dot(Hk, gfk)
# 第三步: 这里使用了精确线搜索确定步长,并更新xk+1
alpha_k = -np.dot(pk.T, g(xk))/np.dot(pk.T,np.dot(A,pk))
xkp1 = xk + alpha_k * pk
# 第四步: 校正Hk
sk = xkp1 - xk
xk = xkp1
gfkp1 = g(xkp1)
yk = gfkp1 - gfk
gfk = gfkp1
k += 1
gnorm = numpy.amax(numpy.abs(gfk))
if (gnorm <= gtol):
break
rhok = 1.0 / (numpy.dot(yk, sk))
# A1 = I - np.dot(sk[:,None],yk[:,None].T)* rhok
A1= I - sk[:, None] * yk[None, :] * rhok
A2 = I - yk[:, None] * sk[None, :] * rhok
Hk = np.dot(A1, np.dot(Hk, A2)) + (rhok * np.dot(sk[:,None] , sk[:,None].T))
# Hk = numpy.dot(A1, numpy.dot(Hk, A2)) + (rhok * sk[:, None] *
# sk[None, :])
# numpy.newaxis是None
# 都是二维np.dot 相当于矩阵乘
print(" Current function value xk:[%f %f]"% (xk[0],xk[1]))
print(" Current function value: %f" % f(xk))
测试如下
if __name__ == '__main__':
x0 = [1, 1]
A = np.array([[1, -1], [-1, 2]])
b = [-2, 0]
g0 = g(x0)
c = 0
print(minimize_bfgs(f, x0, jac=g,gtol=1e-6, disp= True))
res = minimize(f, x0, method='BFGS', jac=g, options={'gtol': 1e-6, 'disp': True})
print(res)
超线性收敛,次于牛顿方法
# sk=np.array([-0.00835751 ,-0.00901759 ,-0.02819486, -0.1701776 ])
# yk=np.array([ -5.30212743 ,3.92364059 , 46.13550229 ,-11.46023375])
# (fun, x0, args=(), jac = None, gtol = 1e-5, disp = False):
# res = minimize(rosen, x0, method='BFGS', jac=rosen_der, options={'gtol': 1e-6, 'disp': True})
# res = minimize(f, x0, method='BFGS', jac=g, options={'gtol': 1e-6, 'disp': True})
# print(res)
# I = numpy.eye(4, dtype=int)
# A1 = I - sk[:, None] * yk[None, :]
# 不能真接用sk*yk.T,先得将数组变为二维的才行,所以等价于
# a=sk[:,None]
# b=yk[:,None]
# print(a)
# print(I-np.dot(a,b.T)==A1)
# *用于array 表示对应元素相乘,np.dot()点乘