Excerpts from studying Python for quantitative finance and investment — assorted functions recorded here for easier future lookup.
Optimization Problems
import numpy as np
import scipy.optimize as opt
Unconstrained Optimization
- An unconstrained optimization problem is one whose feasible set is the entire domain of the objective function's variables, i.e., there are no external constraints.
- For example,
  $$\text{Minimize } f(x) = x^2 - 4.8x + 1.2$$
  is an unconstrained problem.
- Restricting the variable turns it into a constrained problem:
  $$\begin{aligned}\text{min } \ & f(x) = x^2 - 4.8x + 1.2\\ \text{s.t. } \ & x \geq 0\end{aligned}$$
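A minimal sketch of solving both versions directly (my addition, not part of the original notes; here the bound makes no difference because the unconstrained minimizer $x^* = 2.4$ already satisfies $x \geq 0$):

import scipy.optimize as opt

f = lambda t: t**2 - 4.8*t + 1.2
print(opt.minimize_scalar(f).x)  # unconstrained minimum: 2.4
res = opt.minimize(lambda v: f(v[0]), x0=[3.0],
                   bounds=[(0, None)], method='L-BFGS-B')
print(res.x)                     # with x >= 0: still 2.4 (the bound is inactive)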
Take the Rosenbrock function as an example:
$$f(x) = \sum^{N-1}_{i=1} 100\left(x_i - x_{i-1}^2\right)^2 + \left(1 - x_{i-1}\right)^2$$
# Rosenbrock function
def rosen(x):
    """The Rosenbrock function."""
    return sum(100.0*(x[1:] - x[:-1]**2.0)**2.0 + (1 - x[:-1])**2.0)
Nelder-Mead Simplex Method
- The simplex method of operations research is the general-purpose algorithm for linear programming; the Nelder-Mead simplex method used here is a different algorithm that merely borrows the concept of a simplex.
- Set the starting point $x_0 = [0.5, 1.6, 1.1, 0.8, 1.2]$ and run the minimization. The option `xtol` is the tolerance on the iterates for declaring convergence.
- The global minimum lies at $x_i = 1$, where $f(x) = 0$.
x_0 = np.array([0.5, 1.6, 1.1, 0.8, 1.2])
# note: newer SciPy releases call this option 'xatol' for Nelder-Mead
opt.minimize(rosen, x_0, method='nelder-mead', options={'xtol': 1e-6, 'disp': True})
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 386
Function evaluations: 622
final_simplex: (array([[1. , 1.00000003, 1.00000007, 1.00000017, 1.0000003 ],
[1. , 0.99999999, 1.00000004, 1.0000001 , 1.00000022],
[0.99999999, 0.99999999, 1.00000004, 1.00000007, 1.00000011],
[0.99999998, 0.99999993, 0.99999991, 0.99999982, 0.9999996 ],
[1.00000008, 1.00000016, 1.00000033, 1.00000067, 1.00000127],
[1.00000002, 1.00000004, 1.00000015, 1.00000033, 1.0000006 ]]), array([4.14390143e-13, 4.94153342e-13, 5.19631652e-13, 5.35708759e-13,
9.80969042e-13, 9.81943884e-13]))
fun: 4.143901431865225e-13
message: 'Optimization terminated successfully.'
nfev: 622
nit: 386
status: 0
success: True
x: array([1. , 1.00000003, 1.00000007, 1.00000017, 1.0000003 ])
- The Rosenbrock function is well-behaved, so simple optimization methods handle it without trouble.
- The Powell method can be used as well, via `method='powell'`.
opt.minimize(rosen,x_0,method='powell',options={'xtol':1e-6,'disp':True})
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 24
Function evaluations: 1618
direc: array([[ 4.55588497e-04, 1.36409332e-03, 2.24683410e-03,
4.12376042e-03, 7.99776305e-03],
[-1.91331747e-03, -3.00268845e-03, -6.76968505e-03,
-1.34778007e-02, -2.66903472e-02],
[-3.76306326e-02, -2.30543912e-02, 1.05016733e-02,
3.42182501e-05, 7.33576548e-05],
[ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
1.00000000e+00, 0.00000000e+00],
[ 4.02021587e-06, 1.15777807e-05, 2.01895943e-05,
5.10192097e-05, 1.09407425e-04]])
fun: 2.051360957724131e-21
message: 'Optimization terminated successfully.'
nfev: 1618
nit: 24
status: 0
success: True
x: array([1., 1., 1., 1., 1.])
- Neither of these methods uses the gradient of the objective, so on even slightly more complicated problems they converge slowly. Next we turn to gradient-based optimization.
Broyden-Fletcher-Goldfarb-Shanno
- Broyden-Fletcher-Goldfarb-Shanno (BFGS) is a quasi-Newton method.
- First, derive the gradient of the Rosenbrock function:
$$\begin{aligned}\frac{\partial f}{\partial x_j} &= \sum^{N-1}_{i=1} 200\left(x_i - x_{i-1}^2\right)\left(\delta_{i,j} - 2x_{i-1}\delta_{i-1,j}\right) - 2\left(1 - x_{i-1}\right)\delta_{i-1,j}\\ &= 200\left(x_j - x_{j-1}^2\right) - 400x_j\left(x_{j+1} - x_j^2\right) - 2\left(1 - x_j\right)\end{aligned}$$
- with the Kronecker delta
  $$\delta_{i,j} = \begin{cases}1 & i = j\\ 0 & \text{otherwise}\end{cases}$$
- Boundary cases:
  - $\frac{\partial f}{\partial x_0} = -400x_0(x_1 - x_0^2) - 2(1 - x_0)$
  - $\frac{\partial f}{\partial x_{N-1}} = 200(x_{N-1} - x_{N-2}^2)$
def rosen_der(x):
    """Gradient of the Rosenbrock function."""
    xj = x[1:-1]    # interior components x_j
    xj_m1 = x[:-2]  # x_{j-1}
    xj_p1 = x[2:]   # x_{j+1}
    der = np.zeros_like(x)
    der[1:-1] = 200*(xj - xj_m1**2) - 400*xj*(xj_p1 - xj**2) - 2*(1 - xj)
    der[0] = -400*x[0]*(x[1] - x[0]**2) - 2*(1 - x[0])
    der[-1] = 200*(x[-1] - x[-2]**2)
    return der
opt.minimize(rosen,x_0,method='BFGS',jac=rosen_der,options={'disp':True})
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 39
Function evaluations: 47
Gradient evaluations: 47
fun: 1.569191726013783e-14
hess_inv: array([[0.00742883, 0.01251316, 0.02376685, 0.04697638, 0.09387584],
[0.01251316, 0.02505532, 0.04784533, 0.094432 , 0.18862433],
[0.02376685, 0.04784533, 0.09594869, 0.18938093, 0.37814437],
[0.04697638, 0.094432 , 0.18938093, 0.37864606, 0.7559884 ],
[0.09387584, 0.18862433, 0.37814437, 0.7559884 , 1.51454413]])
jac: array([-3.60424798e-06, 2.74743159e-06, -1.94696995e-07, 2.78416205e-06,
-1.40985001e-06])
message: 'Optimization terminated successfully.'
nfev: 47
nit: 39
njev: 47
status: 0
success: True
x: array([1. , 1.00000001, 1.00000002, 1.00000004, 1.00000007])
- The gradient information is passed to `minimize` through the `jac` argument.
- The iteration count is noticeably lower than before (`Iterations`: number of iterations; `Function evaluations`: number of objective calls).
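Before relying on a hand-coded gradient it is worth checking it against a finite-difference approximation; `scipy.optimize.check_grad` does exactly this (a quick check I've added, not part of the original notes):

# returns the norm of the difference between rosen_der and a
# finite-difference estimate of the gradient at x_0; a value that is
# tiny relative to the gradient's scale means the two agree
print(opt.check_grad(rosen, rosen_der, x_0))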
Newton-Conjugate-Gradient Method
- The Newton-Conjugate-Gradient algorithm (Newton-CG).
- Among the fastest-converging methods; the drawback is having to work with the Hessian matrix (the matrix of second derivatives).
- Rough idea: take the second-order Taylor approximation of the objective, and use conjugate gradients to approximate the action of the inverse Hessian, as spelled out just below.
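Concretely (the standard textbook formulation, stated here for completeness), each step minimizes the local quadratic model
$$f(x) \approx f(x_k) + \nabla f(x_k)^T (x - x_k) + \frac{1}{2}(x - x_k)^T H(x_k)(x - x_k),$$
which amounts to solving $H(x_k)\,p = -\nabla f(x_k)$ for the step $p$; doing so approximately by conjugate gradients requires only Hessian-vector products.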
- The general entry of the Rosenbrock Hessian:
$$\begin{aligned}H_{i,j} = \frac{\partial^2 f}{\partial x_i \partial x_j} &= 200\left(\delta_{i,j} - 2x_{i-1}\delta_{i-1,j}\right) - 400x_i\left(\delta_{i+1,j} - 2x_i\delta_{i,j}\right) - 400\delta_{i,j}\left(x_{i+1} - x_i^2\right) + 2\delta_{i,j}\\ &= \left(202 + 1200x_i^2 - 400x_{i+1}\right)\delta_{i,j} - 400x_i\delta_{i+1,j} - 400x_{i-1}\delta_{i-1,j}\end{aligned}$$
- Boundary cases:
  - $\frac{\partial^2 f}{\partial x_0^2} = 1200x_0^2 - 400x_1 + 2$
  - $\frac{\partial^2 f}{\partial x_0 \partial x_1} = -400x_0$
  - $\frac{\partial^2 f}{\partial x_{N-1} \partial x_{N-2}} = -400x_{N-2}$
  - $\frac{\partial^2 f}{\partial x_{N-1}^2} = 200$
def rosen_hess(x):
    """Dense Hessian of the Rosenbrock function."""
    # off-diagonal entries: -400*x_i just above and below the diagonal
    H = np.diag(-400*x[:-1], 1) + np.diag(-400*x[:-1], -1)
    diagonal = np.zeros_like(x)
    diagonal[0] = 1200*x[0]**2 - 400*x[1] + 2
    diagonal[-1] = 200
    diagonal[1:-1] = 202 + 1200*x[1:-1]**2 - 400*x[2:]
    H = H + np.diag(diagonal)
    return H
opt.minimize(rosen,x_0,method='Newton-CG',jac=rosen_der,hess=rosen_hess,options={'xtol':1e-6,'disp':True})
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 20
Function evaluations: 22
Gradient evaluations: 41
Hessian evaluations: 20
fun: 1.47606641102778e-19
jac: array([-3.62847530e-11, 2.68148992e-09, 1.16637362e-08, 4.81693414e-08,
-2.76999090e-08])
message: 'Optimization terminated successfully.'
nfev: 22
nhev: 20
nit: 20
njev: 41
status: 0
success: True
x: array([1., 1., 1., 1., 1.])
- For some large optimization problems the full Hessian becomes enormous, while Newton-CG only ever uses the product of the Hessian with a vector.
- Supplying that product directly for a given vector p therefore reduces the storage cost.
def rosen_hessp(x, p):
    """Product of the Rosenbrock Hessian with an arbitrary vector p."""
    Hp = np.zeros_like(x)
    Hp[0] = (1200*x[0]**2 - 400*x[1] + 2)*p[0] - 400*x[0]*p[1]
    Hp[1:-1] = -400*x[:-2]*p[:-2] \
               + (202 + 1200*x[1:-1]**2 - 400*x[2:])*p[1:-1] \
               - 400*x[1:-1]*p[2:]
    Hp[-1] = -400*x[-2]*p[-2] + 200*p[-1]
    return Hp
opt.minimize(rosen,x_0,method='Newton-CG',jac=rosen_der,hessp=rosen_hessp,options={'xtol':1e-6,'disp':True})
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 20
Function evaluations: 22
Gradient evaluations: 41
Hessian evaluations: 58
fun: 1.47606641102778e-19
jac: array([-3.62847530e-11, 2.68148992e-09, 1.16637362e-08, 4.81693414e-08,
-2.76999090e-08])
message: 'Optimization terminated successfully.'
nfev: 22
nhev: 58
nit: 20
njev: 41
status: 0
success: True
x: array([1., 1., 1., 1., 1.])
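As a sanity check (my addition), the dense Hessian and the Hessian-vector product implementations should agree for any test vector:

p_test = np.ones_like(x_0)  # any vector will do
print(np.allclose(rosen_hess(x_0) @ p_test, rosen_hessp(x_0, p_test)))  # expect True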
Constrained Optimization
- The standard form is:
$$\begin{aligned}\text{min } \ & f(x)\\ \text{s.t. } \ & g_i(x) \leq 0, \quad i = 1, 2, \cdots, m\\ & Ax = b\end{aligned}$$
- where $g_1, \cdots, g_m: \mathbb{R}^n \to \mathbb{R}$ are twice-differentiable convex functions on $\mathbb{R}^n$, and $A$ is a $p \times n$ matrix with $\text{rank}(A) = p < n$.
- Example:
$$\begin{aligned}\text{Minimize } \ & f(x, y) = 2xy + 2x - x^2 - 2y^2\\ \text{subject to } \ & x^3 - y = 0\\ & y - 1 \geq 0\end{aligned}$$
def func(x,sign=1.):
"""
objective function
"""
return sign*(2*x[0]*x[1]+2*x[0] -x[0]**2-2*x[1]**2)
def func_deriv(x,sign=1.):
"""
derivative of objective function
"""
dfdx0 = sign * (-2*x[0]+2*x[1]+2)
dfdx1 = sign * (-4*x[1]+2*x[0])
return np.array([dfdx0,dfdx1])
- The `sign` argument selects maximization or minimization: passing `sign=-1.` maximizes $f$ by minimizing $-f$. Next, define the constraints:
cons = ({'type': 'eq', 'fun': lambda x: np.array([x[0]**3 - x[1]]),
         'jac': lambda x: np.array([3.*x[0]**2, -1.])},
        {'type': 'ineq', 'fun': lambda x: np.array([x[1] - 1]),
         'jac': lambda x: np.array([0., 1.])})
- Finally, solve with SLSQP (Sequential Least SQuares Programming).
opt.minimize(func, [-1., 1.],
             args=(-1.,),
             jac=func_deriv,
             constraints=cons, method='SLSQP',
             options={'disp': True})
Optimization terminated successfully. (Exit mode 0)
Current function value: -1.0000001831052137
Iterations: 9
Function evaluations: 14
Gradient evaluations: 9
fun: -1.0000001831052137
jac: array([-1.99999982, 1.99999982])
message: 'Optimization terminated successfully.'
nfev: 14
nit: 9
njev: 9
status: 0
success: True
x: array([1.00000009, 1. ])
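It is easy to confirm (a check I've added) that the reported optimum respects both constraints:

res = opt.minimize(func, [-1., 1.], args=(-1.,), jac=func_deriv,
                   constraints=cons, method='SLSQP')
print(res.x[0]**3 - res.x[1])  # equality constraint x^3 - y = 0: ~0 up to tolerance
print(res.x[1] - 1)            # inequality constraint y - 1 >= 0: non-negative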
print('Result of unconstrained optimization for comparison')
# without the constraints, SLSQP converges to the stationary point (2, 1),
# where the gradient of f vanishes
opt.minimize(func, [-1., 1.],
             args=(-1.,),
             jac=func_deriv,
             method='SLSQP',
             options={'disp': True})
Result of unconstrained optimization for comparison
Optimization terminated successfully. (Exit mode 0)
Current function value: -2.0
Iterations: 4
Function evaluations: 5
Gradient evaluations: 4
fun: -2.0
jac: array([-0., -0.])
message: 'Optimization terminated successfully.'
nfev: 5
nit: 4
njev: 4
status: 0
success: True
x: array([2., 1.])
Solving Quadratic Programs with CVXOPT
- Besides `scipy.optimize.minimize`, Python has extension modules dedicated to optimization, such as CVXOPT.
- The standard form of a quadratic programming problem:
$$\begin{aligned}\text{min } \ & \frac{1}{2} x^T P x + q^T x\\ \text{s.t. } \ & Gx \leq h, \ Ax = b\end{aligned}$$
- Example:
$$\begin{aligned}\text{min } \ & 2x^2 + xy + y^2 + x + y\\ \text{s.t. } \ & x \geq 0, \ y \geq 0\\ & x + y = 1\end{aligned}$$
- Matching this to the standard form (the remaining matrices, elided here, follow directly from the constraints and appear in the code below):
$$P = \begin{bmatrix}4 & 1\\ 1 & 2\end{bmatrix}, \quad q = \begin{bmatrix}1\\ 1\end{bmatrix}, \quad G = \begin{bmatrix}-1 & 0\\ 0 & -1\end{bmatrix}, \quad \cdots$$
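Expanding the quadratic term confirms the choice of $P$ (simple algebra, added here for clarity):
$$\frac{1}{2}\begin{bmatrix}x & y\end{bmatrix}\begin{bmatrix}4 & 1\\ 1 & 2\end{bmatrix}\begin{bmatrix}x\\ y\end{bmatrix} = \frac{1}{2}\left(4x^2 + 2xy + 2y^2\right) = 2x^2 + xy + y^2$$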
from cvxopt import solvers, matrix
p = matrix([[4., 1.], [1., 2.]])  # cvxopt.matrix is column-major; P is symmetric, so this reads the same
q = matrix([1., 1.])
G = matrix([[-1., 0.], [0., -1.]])
h = matrix([0., 0.])  # cvxopt.matrix distinguishes int from double, so every number needs a decimal point
A = matrix([1., 1.], (1, 2))  # A must be a 1-row, 2-column matrix
b = matrix(1.)
sol = solvers.qp(p, q, G, h, A, b)
print(sol['x'])
pcost dcost gap pres dres
0: 1.8889e+00 7.7778e-01 1e+00 3e-16 2e+00
1: 1.8769e+00 1.8320e+00 4e-02 2e-16 6e-02
2: 1.8750e+00 1.8739e+00 1e-03 2e-16 5e-04
3: 1.8750e+00 1.8750e+00 1e-05 6e-17 5e-06
4: 1.8750e+00 1.8750e+00 1e-07 2e-16 5e-08
Optimal solution found.
[ 2.50e-01]
[ 7.50e-01]
- The solution is $x = 0.25$, $y = 0.75$; a quick objective check follows.
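A one-line verification (my addition) that the reported primal objective matches a direct evaluation of the original objective:

xs = np.array(sol['x']).flatten()
print(2*xs[0]**2 + xs[0]*xs[1] + xs[1]**2 + xs[0] + xs[1])  # ~1.875
print(sol['primal objective'])                              # ~1.875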
- Other fields available in the dictionary returned by `solvers.qp`:
sol
{'x': <2x1 matrix, tc='d'>,
'y': <1x1 matrix, tc='d'>,
's': <2x1 matrix, tc='d'>,
'z': <2x1 matrix, tc='d'>,
'status': 'optimal',
'gap': 1.0527028380515569e-07,
'relative gap': 5.6144154514915067e-08,
'primal objective': 1.8750000000000182,
'dual objective': 1.8749998947297344,
'primal infeasibility': 2.482534153247273e-16,
'dual infeasibility': 5.3147593337403756e-08,
'primal slack': 0.2500000952702474,
'dual slack': 1.0000000000000035e-08,
'iterations': 4}
Application to Portfolio Optimization
- Given return histories for the following three assets `s1`, `s2`, `b`:
s1 = [0.,.04,.13,.19,-.15,-.27,.37,.24,-.07,.07,.19,.33,-.05,.22,.23,.06,.32,.19,.05,.17]
s2 = [.07,.13,.14,.43,.67,.64,0.,-.22,.18,.31,.59,.99,-.25,.04,-.11,-.15,-.12,.16,.22,-.02]
b = [.06,.07,.05,.04,.07,.08,.06,.04,.05,.07,.1,.11,.15,.11,.09,.1,.08,.06,.05,.07]
x = np.array([s1, s2, b])  # asset return data (note: the name b is reused below for the constraint RHS)
p = matrix(np.cov(x))      # covariance matrix of the returns
q = matrix([0., 0., 0.])
A = matrix([[1.], [1.], [1.]])  # full-investment constraint: the weights sum to 1
b = matrix([1.])
# Gx <= h encodes 0 <= w_i <= 1 for each weight plus a minimum expected return:
# the last row holds minus the historical mean returns (.113, .185, .0755),
# so -mu'w <= -.13 requires the portfolio's expected return to be at least 13%
G = matrix([[-1., 0., 0., 1., 0., 0., -.113],
            [0., -1., 0., 0., 1., 0., -.185],
            [0., 0., -1., 0., 0., 1., -.0755]])
h = matrix([0., 0., 0., 1., 1., 1., -.13])
sol = solvers.qp(p,q,G,h,A,b)
pcost dcost gap pres dres
0: 6.1729e-03 -3.3446e+00 1e+01 2e+00 4e-01
1: 7.4918e-03 -1.5175e+00 2e+00 2e-02 5e-03
2: 9.0497e-03 -7.7600e-02 9e-02 1e-03 3e-04
3: 8.4511e-03 2.0141e-03 6e-03 7e-05 2e-05
4: 7.6044e-03 7.1661e-03 4e-04 3e-07 8e-08
5: 7.5351e-03 7.5293e-03 6e-06 3e-09 8e-10
6: 7.5340e-03 7.5340e-03 6e-08 3e-11 8e-12
Optimal solution found.
print(sol['x'])
[ 5.06e-01]
[ 3.24e-01]
[ 1.69e-01]
- So asset 1 receives about 50.6% of the portfolio, asset 2 about 32.4%, and asset 3 about 16.9%; a check of the resulting return and risk follows.
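To verify that the return constraint binds as intended (my addition), compute the optimal portfolio's expected return and variance:

w = np.array(sol['x']).flatten()
mu = x.mean(axis=1)       # historical mean returns of s1, s2, b
print(mu @ w)             # expected return: ~0.13, exactly the required minimum
print(w @ np.cov(x) @ w)  # portfolio variance, twice the minimized objective value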
sol
{'x': <3x1 matrix, tc='d'>,
'y': <1x1 matrix, tc='d'>,
's': <7x1 matrix, tc='d'>,
'z': <7x1 matrix, tc='d'>,
'status': 'optimal',
'gap': 5.750887504859599e-08,
'relative gap': 7.633272007468671e-06,
'primal objective': 0.007534031792304567,
'dual objective': 0.007533974289443271,
'primal infeasibility': 3.318376136551126e-11,
'dual infeasibility': 8.328707321687642e-12,
'primal slack': 3.811267544311688e-08,
'dual slack': 1.88724298640649e-09,
'iterations': 6}