Excerpts from studying Python for quantitative finance and investment — assorted functions recorded here for easier future lookup.
Optimization Problems
import numpy as np
import scipy.optimize as opt
Unconstrained Optimization
- An unconstrained optimization problem is one whose feasible set is the entire domain of the objective function's variables, i.e., there are no external constraints.
- For example,
  $$\text{Minimize } f(x) = x^2 - 4.8x + 1.2$$
  is an unconstrained problem.
- Restricting the variable turns it into a constrained problem:
  $$\begin{aligned}\text{min } \ & f(x) = x^2 - 4.8x + 1.2\\ \text{s.t. } \ & x \geq 0\end{aligned}$$
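A minimal sketch of solving both versions directly (my addition, not part of the original notes; here the bound makes no difference because the unconstrained minimizer $x^* = 2.4$ already satisfies $x \geq 0$):

import scipy.optimize as opt

f = lambda t: t**2 - 4.8*t + 1.2
print(opt.minimize_scalar(f).x)  # unconstrained minimum: 2.4
res = opt.minimize(lambda v: f(v[0]), x0=[3.0],
                   bounds=[(0, None)], method='L-BFGS-B')
print(res.x)                     # with x >= 0: still 2.4 (the bound is inactive)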
Take the Rosenbrock function as an example:
$$f(x) = \sum^{N-1}_{i=1} 100\left(x_i - x_{i-1}^2\right)^2 + \left(1 - x_{i-1}\right)^2$$
# Rosenbrock function
def rosen(x):
    """The Rosenbrock function."""
    return sum(100.0*(x[1:] - x[:-1]**2.0)**2.0 + (1 - x[:-1])**2.0)
Nelder-Mead Simplex Method
- The simplex method of operations research is the general-purpose algorithm for linear programming; the Nelder-Mead simplex method used here is a different algorithm that merely borrows the concept of a simplex.
- Set the starting point $x_0 = [0.5, 1.6, 1.1, 0.8, 1.2]$ and run the minimization. The option `xtol` is the tolerance on the iterates for declaring convergence.
- The global minimum lies at $x_i = 1$, where $f(x) = 0$.
x_0 = np.array([0.5, 1.6, 1.1, 0.8, 1.2])
# note: newer SciPy releases call this option 'xatol' for Nelder-Mead
opt.minimize(rosen, x_0, method='nelder-mead', options={'xtol': 1e-6, 'disp': True})
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 386
Function evaluations: 622
final_simplex: (array([[1. , 1.00000003, 1.00000007, 1.00000017, 1.0000003 ],
[1. , 0.99999999, 1.00000004, 1.0000001 , 1.00000022],
[0.99999999, 0.99999999, 1.00000004, 1.00000007, 1.00000011],
[0.99999998, 0.99999993, 0.99999991, 0.99999982, 0.9999996 ],
[1.00000008, 1.00000016, 1.00000033, 1.00000067, 1.00000127],
[1.00000002, 1.00000004, 1.00000015, 1.00000033, 1.0000006 ]]), array([4.14390143e-13, 4.94153342e-13, 5.19631652e-13, 5.35708759e-13,
9.80969042e-13, 9.81943884e-13]))
fun: 4.143901431865225e-13
message: 'Optimization terminated successfully.'
nfev: 622
nit: 386
status: 0
success: True
x: array([1. , 1.00000003, 1.00000007, 1.00000017, 1.0000003 ])
- The Rosenbrock function is well-behaved, so simple optimization methods handle it without trouble.
- The Powell method can be used as well, via `method='powell'`.
opt.minimize(rosen,x_0,method='powell',options={'xtol':1e-6,'disp':True})
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 24
Function evaluations: 1618
direc: array([[ 4.55588497e-04, 1.36409332e-03, 2.24683410e-03,
4.12376042e-03, 7.99776305e-03],
[-1.91331747e-03, -3.00268845e-03, -6.76968505e-03,
-1.34778007e-02, -2.66903472e-02],
[-3.76306326e-02, -2.30543912e-02, 1.05016733e-02,
3.42182501e-05, 7.33576548e-05],
[ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
1.00000000e+00, 0.00000000e+00],
[ 4.02021587e-06, 1.15777807e-05, 2.01895943e-05,
5.10192097e-05, 1.09407425e-04]])
fun: 2.051360957724131e-21
message: 'Optimization terminated successfully.'
nfev: 1618
nit: 24
status: 0
success: True
x: array([1., 1., 1., 1., 1.])
- Neither of these methods uses the gradient of the objective, so on even slightly more complicated problems they converge slowly. Next we turn to gradient-based optimization.
Broyden-Fletcher-Goldfarb-Shanno
- Broyden-Fletcher-Goldfarb-Shanno (BFGS) is a quasi-Newton method.
- First, derive the gradient of the Rosenbrock function:
$$\begin{aligned}\frac{\partial f}{\partial x_j} &= \sum^{N-1}_{i=1} 200\left(x_i - x_{i-1}^2\right)\left(\delta_{i,j} - 2x_{i-1}\delta_{i-1,j}\right) - 2\left(1 - x_{i-1}\right)\delta_{i-1,j}\\ &= 200\left(x_j - x_{j-1}^2\right) - 400x_j\left(x_{j+1} - x_j^2\right) - 2\left(1 - x_j\right)\end{aligned}$$
- with the Kronecker delta
  $$\delta_{i,j} = \begin{cases}1 & i = j\\ 0 & \text{otherwise}\end{cases}$$
- Boundary cases:
  - $\frac{\partial f}{\partial x_0} = -400x_0(x_1 - x_0^2) - 2(1 - x_0)$
  - $\frac{\partial f}{\partial x_{N-1}} = 200(x_{N-1} - x_{N-2}^2)$
def rosen_der(x):
    """Gradient of the Rosenbrock function."""
    xj = x[1:-1]    # interior components x_j
    xj_m1 = x[:-2]  # x_{j-1}
    xj_p1 = x[2:]   # x_{j+1}
    der = np.zeros_like(x)
    der[1:-1] = 200*(xj - xj_m1**2) - 400*xj*(xj_p1 - xj**2) - 2*(1 - xj)
    der[0] = -400*x[0]*(x[1] - x[0]**2) - 2*(1 - x[0])
    der[-1] = 200*(x[-1] - x[-2]**2)
    return der
opt.minimize(rosen,x_0,method='BFGS',jac=rosen_der,options={'disp':True})
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 39
Function evaluations: 47
Gradient evaluations: 47
fun: 1.569191726013783e-14
hess_inv: array([[0.00742883, 0.01251316, 0.02376685, 0.04697638, 0.09387584],
[0.01251316, 0.02505532, 0.04784533, 0.094432 , 0.18862433],
[0.02376685, 0.04784533, 0.09594869, 0.18938093, 0.37814437],
[0.04697638, 0.094432 , 0.18938093, 0.37864606, 0.7559884 ],
[0.09387584, 0.18862433, 0.37814437, 0.7559884 , 1.51454413]])
jac: array([-3.60424798e-06, 2.74743159e-06, -1.94696995e-07, 2.78416205e-06,
-1.40985001e-06])
message: 'Optimization terminated successfully.'
nfev: 47
nit: 39
njev: 47
status: 0
success: True
x: array([1. , 1.00000001, 1.00000002, 1.00000004, 1.00000007])
- The gradient information is passed to `minimize` through the `jac` argument.
- The iteration count is noticeably lower than before (`Iterations`: number of iterations; `Function evaluations`: number of objective calls).
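Before relying on a hand-coded gradient it is worth checking it against a finite-difference approximation; `scipy.optimize.check_grad` does exactly this (a quick check I've added, not part of the original notes):

# returns the norm of the difference between rosen_der and a
# finite-difference estimate of the gradient at x_0; a value that is
# tiny relative to the gradient's scale means the two agree
print(opt.check_grad(rosen, rosen_der, x_0))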
Newton-Conjugate-Gradient Method
- The Newton-Conjugate-Gradient algorithm (Newton-CG).
- Among the fastest-converging methods; the drawback is having to work with the Hessian matrix (the matrix of second derivatives).
- Rough idea: take the second-order Taylor approximation of the objective, and use conjugate gradients to approximate the action of the inverse Hessian, as spelled out just below.
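Concretely (the standard textbook formulation, stated here for completeness), each step minimizes the local quadratic model
$$f(x) \approx f(x_k) + \nabla f(x_k)^T (x - x_k) + \frac{1}{2}(x - x_k)^T H(x_k)(x - x_k),$$
which amounts to solving $H(x_k)\,p = -\nabla f(x_k)$ for the step $p$; doing so approximately by conjugate gradients requires only Hessian-vector products.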
- The general entry of the Rosenbrock Hessian:
$$\begin{aligned}H_{i,j} = \frac{\partial^2 f}{\partial x_i \partial x_j} &= 200\left(\delta_{i,j} - 2x_{i-1}\delta_{i-1,j}\right) - 400x_i\left(\delta_{i+1,j} - 2x_i\delta_{i,j}\right) - 400\delta_{i,j}\left(x_{i+1} - x_i^2\right) + 2\delta_{i,j}\\ &= \left(202 + 1200x_i^2 - 400x_{i+1}\right)\delta_{i,j} - 400x_i\delta_{i+1,j} - 400x_{i-1}\delta_{i-1,j}\end{aligned}$$
- Boundary cases:
  - $\frac{\partial^2 f}{\partial x_0^2} = 1200x_0^2 - 400x_1 + 2$
  - $\frac{\partial^2 f}{\partial x_0 \partial x_1} = -400x_0$
  - $\frac{\partial^2 f}{\partial x_{N-1} \partial x_{N-2}} = -400x_{N-2}$
  - $\frac{\partial^2 f}{\partial x_{N-1}^2} = 200$
def rosen_hess(x):
    """Dense Hessian of the Rosenbrock function."""
    # off-diagonal entries: -400*x_i just above and below the diagonal
    H = np.diag(-400*x[:-1], 1) + np.diag(-400*x[:-1], -1)
    diagonal = np.zeros_like(x)
    diagonal[0] = 1200*x[0]**2 - 400*x[1] + 2
    diagonal[-1] = 200
    diagonal[1:-1] = 202 + 1200*x[1:-1]**2 - 400*x[2:]
    H = H + np.diag(diagonal)
    return H
opt.minimize(rosen,x_0,method='Newton-CG',jac=rosen_der,hess=rosen_hess,options={'xtol':1e-6,'disp':True})
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 20
Function evaluations: 22
Gradient evaluations: 41
Hessian evaluations: 20
fun: 1.47606641102778e-19
jac: array([-3.62847530e-11, 2.68148992e-09, 1.16637362e-08, 4.81693414e-08,
-2.76999090e-08])
message: 'Optimization terminated successfully.'
nfev: 22
nhev: 20
nit: 20
njev: 41
status: 0
success: True
x: array([1., 1., 1., 1., 1.])
- For some large optimization problems the full Hessian becomes enormous, while Newton-CG only ever uses the product of the Hessian with a vector.
- Supplying that product directly for a given vector p therefore reduces the storage cost.
def rosen_hessp(x, p):
    """Product of the Rosenbrock Hessian with an arbitrary vector p."""
    Hp = np.zeros_like(x)
    Hp[0] = (1200*x[0]**2 - 400*x[1] + 2)*p[0] - 400*x[0]*p[1]
    Hp[1:-1] = -400*x[:-2]*p[:-2] \
               + (202 + 1200*x[1:-1]**2 - 400*x[2:])*p[1:-1] \
               - 400*x[1:-1]*p[2:]
    Hp[-1] = -400*x[-2]*p[-2] + 200*p[-1]
    return Hp
opt.minimize(rosen,x_0,method='Newton-CG',jac=rosen_der,hessp=rosen_hessp,options={'xtol':1e-6,'disp':True})
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 20
Function evaluations: 22
Gradient evaluations: 41
Hessian evaluations: 58
fun: 1.47606641102778e-19
jac: array([-3.62847530e-11, 2.68148992e-09, 1.16637362e-08, 4.81693414e-08,
-2.76999090e-08])
message: 'Optimization terminated successfully.'
nfev: 22
nhev: 58
nit: 20
njev: 41
status: 0
success: True
x: array([1., 1., 1., 1., 1.])
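As a sanity check (my addition), the dense Hessian and the Hessian-vector product implementations should agree for any test vector:

p_test = np.ones_like(x_0)  # any vector will do
print(np.allclose(rosen_hess(x_0) @ p_test, rosen_hessp(x_0, p_test)))  # expect True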
Constrained Optimization
- The standard form is:
$$\begin{aligned}\text{min } \ & f(x)\\ \text{s.t. } \ & g_i(x) \leq 0, \quad i = 1, 2, \cdots, m\\ & Ax = b\end{aligned}$$
- where $g_1, \cdots, g_m: \mathbb{R}^n \to \mathbb{R}$ are twice-differentiable convex functions on $\mathbb{R}^n$, and $A$ is a $p \times n$ matrix with $\text{rank}(A) = p < n$.
- Example:
$$\begin{aligned}\text{Minimize } \ & f(x, y) = 2xy + 2x - x^2 - 2y^2\\ \text{subject to } \ & x^3 - y = 0\\ & y - 1 \geq 0\end{aligned}$$
def func(x,sign=1.):
"""
objective function
"""
return sign*(2*x[0]*x[1]+2*x[0] -x[0]**2-2*x[1]**2)
def func_deriv(x,sign=1.):
"""
derivative of objective function
"""
dfdx0 = sign * (-2*x[0]+2*x[1]+2)
dfdx1 = sign * (-4*x[1]+2*x[0])
return np.array([dfdx0,dfdx1])
- The `sign` argument selects maximization or minimization: passing `sign=-1.` maximizes $f$ by minimizing $-f$. Next, define the constraints:
cons = ({'type': 'eq', 'fun': lambda x: np.array([x[0]**3 - x[1]]),
         'jac': lambda x: np.array([3.*x[0]**2, -1.])},
        {'type': 'ineq', 'fun': lambda x: np.array([x[1] - 1]),
         'jac': lambda x: np.array([0., 1.])})
- Finally, solve with SLSQP (Sequential Least SQuares Programming).
opt.minimize(func, [-1., 1.],
             args=(-1.,),
             jac=func_deriv,
             constraints=cons, method='SLSQP',
             options={'disp': True})
Optimization terminated successfully. (Exit mode 0)
Current function value: -1.0000001831052137
Iterations: 9
Function evaluations: 14
Gradient evaluations: 9
fun: -1.0000001831052137
jac: array([-1.99999982, 1.99999982])
message: 'Optimization terminated successfully.'
nfev: 14
nit: 9
njev: 9
status: 0
success: True
x: array([1.00000009, 1. ])
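It is easy to confirm (a check I've added) that the reported optimum respects both constraints:

res = opt.minimize(func, [-1., 1.], args=(-1.,), jac=func_deriv,
                   constraints=cons, method='SLSQP')
print(res.x[0]**3 - res.x[1])  # equality constraint x^3 - y = 0: ~0 up to tolerance
print(res.x[1] - 1)            # inequality constraint y - 1 >= 0: non-negative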
print('Result of unconstrained optimization for comparison')
# without the constraints, SLSQP converges to the stationary point (2, 1),
# where the gradient of f vanishes
opt.minimize(func, [-1., 1.],
             args=(-1.,),
             jac=func_deriv,
             method='SLSQP',
             options={'disp': True})
Result of unconstrained optimization for comparison
Optimization terminated successfully. (Exit mode 0)
Current function value: -2.0
Iterations: 4
Function evaluations: 5
Gradient evaluations: 4
fun: -2.0
jac: array([-0., -0.])
message: 'Optimization terminated successfully.'
nfev: 5
nit: 4
njev: 4
status: 0
success: True
x: array([2., 1.])
Solving Quadratic Programs with CVXOPT
- Besides `scipy.optimize.minimize`, Python has extension modules dedicated to optimization, such as CVXOPT.
- The standard form of a quadratic programming problem:
$$\begin{aligned}\text{min } \ & \frac{1}{2} x^T P x + q^T x\\ \text{s.t. } \ & Gx \leq h, \ Ax = b\end{aligned}$$
- Example:
$$\begin{aligned}\text{min } \ & 2x^2 + xy + y^2 + x + y\\ \text{s.t. } \ & x \geq 0, \ y \geq 0\\ & x + y = 1\end{aligned}$$
- Matching this to the standard form (the remaining matrices, elided here, follow directly from the constraints and appear in the code below):
$$P = \begin{bmatrix}4 & 1\\ 1 & 2\end{bmatrix}, \quad q = \begin{bmatrix}1\\ 1\end{bmatrix}, \quad G = \begin{bmatrix}-1 & 0\\ 0 & -1\end{bmatrix}, \quad \cdots$$
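Expanding the quadratic term confirms the choice of $P$ (simple algebra, added here for clarity):
$$\frac{1}{2}\begin{bmatrix}x & y\end{bmatrix}\begin{bmatrix}4 & 1\\ 1 & 2\end{bmatrix}\begin{bmatrix}x\\ y\end{bmatrix} = \frac{1}{2}\left(4x^2 + 2xy + 2y^2\right) = 2x^2 + xy + y^2$$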
from cvxopt import solvers, matrix
p = matrix([[4., 1.], [1., 2.]])  # cvxopt.matrix is column-major; P is symmetric, so this reads the same
q = matrix([1., 1.])
G = matrix([[-1., 0.], [0., -1.]])
h = matrix([0., 0.])  # cvxopt.matrix distinguishes int from double, so every number needs a decimal point
A = matrix([1., 1.], (1, 2))  # A must be a 1-row, 2-column matrix
b = matrix(1.)
sol = solvers.qp(p, q, G, h, A, b)
print(sol['x'])
pcost dcost gap pres dres
0: 1.8889e+00 7.7778e-01 1e+00 3e-16 2e+00
1: 1.8769e+00 1.8320e+00 4e-02 2e-16 6e-02
2: 1.8750e+00 1.8739e+00 1e-03 2e-16 5e-04
3: 1.8750e+00 1.8750e+00 1e-05 6e-17 5e-06
4: 1.8750e+00 1.8750e+00 1e-07 2e-16 5e-08
Optimal solution found.
[ 2.50e-01]
[ 7.50e-01]
- The solution is $x = 0.25$, $y = 0.75$; a quick objective check follows.
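A one-line verification (my addition) that the reported primal objective matches a direct evaluation of the original objective:

xs = np.array(sol['x']).flatten()
print(2*xs[0]**2 + xs[0]*xs[1] + xs[1]**2 + xs[0] + xs[1])  # ~1.875
print(sol['primal objective'])                              # ~1.875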
- Other fields available in the dictionary returned by `solvers.qp`:
sol
{'x': <2x1 matrix, tc='d'>,
'y': <1x1 matrix, tc='d'>,
's': <2x1 matrix, tc='d'>,
'z': <2x1 matrix, tc='d'>,
'status': 'optimal',
'gap': 1.0527028380515569e-07,
'relative gap': 5.6144154514915067e-08,
'primal objective': 1.8750000000000182,
'dual objective': 1.8749998947297344,
'primal infeasibility': 2.482534153247273e-16,
'dual infeasibility': 5.3147593337403756e-08,
'primal slack': 0.2500000952702474,
'dual slack': 1.0000000000000035e-08,
'iterations': 4}
Application to Portfolio Optimization
- Given return histories for the following three assets `s1`, `s2`, `b`:
s1 = [0.,.04,.13,.19,-.15,-.27,.37,.24,-.07,.07,.19,.33,-.05,.22,.23,.06,.32,.19,.05,.17]
s2 = [.07,.13,.14,.43,.67,.64,0.,-.22,.18,.31,.59,.99,-.25,.04,-.11,-.15,-.12,.16,.22,-.02]
b = [.06,.07,.05,.04,.07,.08,.06,.04,.05,.07,.1,.11,.15,.11,.09,.1,.08,.06,.05,.07]
x = np.array([s1, s2, b])  # asset return data (note: the name b is reused below for the constraint RHS)
p = matrix(np.cov(x))      # covariance matrix of the returns
q = matrix([0., 0., 0.])
A = matrix([[1.], [1.], [1.]])  # full-investment constraint: the weights sum to 1
b = matrix([1.])
# Gx <= h encodes 0 <= w_i <= 1 for each weight plus a minimum expected return:
# the last row holds minus the historical mean returns (.113, .185, .0755),
# so -mu'w <= -.13 requires the portfolio's expected return to be at least 13%
G = matrix([[-1., 0., 0., 1., 0., 0., -.113],
            [0., -1., 0., 0., 1., 0., -.185],
            [0., 0., -1., 0., 0., 1., -.0755]])
h = matrix([0., 0., 0., 1., 1., 1., -.13])
sol = solvers.qp(p,q,G,h,A,b)
pcost dcost gap pres dres
0: 6.1729e-03 -3.3446e+00 1e+01 2e+00 4e-01
1: 7.4918e-03 -1.5175e+00 2e+00 2e-02 5e-03
2: 9.0497e-03 -7.7600e-02 9e-02 1e-03 3e-04
3: 8.4511e-03 2.0141e-03 6e-03 7e-05 2e-05
4: 7.6044e-03 7.1661e-03 4e-04 3e-07 8e-08
5: 7.5351e-03 7.5293e-03 6e-06 3e-09 8e-10
6: 7.5340e-03 7.5340e-03 6e-08 3e-11 8e-12
Optimal solution found.
print(sol['x'])
[ 5.06e-01]
[ 3.24e-01]
[ 1.69e-01]
- So asset 1 receives about 50.6% of the portfolio, asset 2 about 32.4%, and asset 3 about 16.9%; a check of the resulting return and risk follows.
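To verify that the return constraint binds as intended (my addition), compute the optimal portfolio's expected return and variance:

w = np.array(sol['x']).flatten()
mu = x.mean(axis=1)       # historical mean returns of s1, s2, b
print(mu @ w)             # expected return: ~0.13, exactly the required minimum
print(w @ np.cov(x) @ w)  # portfolio variance, twice the minimized objective value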
sol
{'x': <3x1 matrix, tc='d'>,
'y': <1x1 matrix, tc='d'>,
's': <7x1 matrix, tc='d'>,
'z': <7x1 matrix, tc='d'>,
'status': 'optimal',
'gap': 5.750887504859599e-08,
'relative gap': 7.633272007468671e-06,
'primal objective': 0.007534031792304567,
'dual objective': 0.007533974289443271,
'primal infeasibility': 3.318376136551126e-11,
'dual infeasibility': 8.328707321687642e-12,
'primal slack': 3.811267544311688e-08,
'dual slack': 1.88724298640649e-09,
'iterations': 6}