Quantitative Investing: Pair Trading Strategy

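This article walks through a pair trading strategy: the idea behind it, how trading signals are generated, and how the annualized return is computed and visualized. It works through an example on a shorter time window and discusses how time-series stationarity affects the strategy. It also points out the risks the strategy faces: the spread failing to revert, short-selling restrictions in the Chinese market, the need to rebalance the regression coefficient, and transaction costs.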

Pair Trading Strategy

0. Imports

import pandas as pd
import numpy as np
import tushare as ts
import seaborn
from matplotlib import pyplot as plt
# On matplotlib >= 3.6 this style is named 'seaborn-v0_8'
plt.style.use('seaborn')
%matplotlib inline

1. Data Preparation

data = pd.read_csv('pair-trade-data.csv')
data.set_index('date',inplace = True)
data.head()
             000568     000858
date
2010/1/4  27.488118  26.117536
2010/1/5  27.335123  26.391583
2010/1/6  26.941707  25.694008
2010/1/7  26.388011  24.913389
2010/1/8  26.825140  24.863562
data.plot(figsize=(8, 6));


2. Strategy Idea

# Assume the raw price difference is mean-reverting (a naive, unscientific assumption)
data['priceDelta'] = data['000568'] - data['000858']
data.head()
             000568     000858  priceDelta
date
2010/1/4  27.488118  26.117536    1.370582
2010/1/5  27.335123  26.391583    0.943540
2010/1/6  26.941707  25.694008    1.247699
2010/1/7  26.388011  24.913389    1.474622
2010/1/8  26.825140  24.863562    1.961578
# Plot the price difference and its mean
data['priceDelta'].plot(figsize=(8, 6));
plt.ylabel('Spread')
plt.axhline(data['priceDelta'].mean());


# Standardize the price difference into a z-score
data['zscore'] = (data['priceDelta'] - np.mean(data['priceDelta']))/np.std(data['priceDelta'])
data.head()
             000568     000858  priceDelta     zscore
date
2010/1/4  27.488118  26.117536    1.370582   0.569895
2010/1/5  27.335123  26.391583    0.943540   0.500520
2010/1/6  26.941707  25.694008    1.247699   0.549932
2010/1/7  26.388011  24.913389    1.474622   0.586796
2010/1/8  26.825140  24.863562    1.961578   0.665903
len(data[data['zscore'] > 1.5])
17
# 'position_1' is the open/close signal for 000568
# z > 1.5: short 000568; z < -1.5: long 000568; |z| < 0.5: close the position
data['position_1'] = np.where(data['zscore'] > 1.5, -1, np.nan)
data['position_1'] = np.where(data['zscore'] < -1.5, 1, data['position_1'])
data['position_1'] = np.where(abs(data['zscore']) < 0.5, 0, data['position_1'])
data.head()
             000568     000858  priceDelta     zscore  position_1
date
2010/1/4  27.488118  26.117536    1.370582   0.569895         NaN
2010/1/5  27.335123  26.391583    0.943540   0.500520         NaN
2010/1/6  26.941707  25.694008    1.247699   0.549932         NaN
2010/1/7  26.388011  24.913389    1.474622   0.586796         NaN
2010/1/8  26.825140  24.863562    1.961578   0.665903         NaN
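Generate trading signals
# Hold the last signal until the next one fires; stay flat before the first signal
data['position_1'] = data['position_1'].ffill().fillna(0)
data['position_1'].plot(ylim=[-1.1, 1.1], figsize=(10, 6));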


# 'position_2' is the open/close signal for 000858 (opposite sign to 000568)
data['position_2'] = -np.sign(data['position_1'])
data['position_2'].plot(ylim=[-1.1, 1.1], figsize=(10, 6));


3. Computing and Visualizing the Strategy's Annualized Return

data['returns_1'] = (np.log(data['000568'] / data['000568'].shift(1))).fillna(0)
data['returns_2'] = (np.log(data['000858'] / data['000858'].shift(1))).fillna(0)
data.head(10)
              000568     000858  priceDelta     zscore  position_1  position_2  returns_1  returns_2
date
2010/1/4   27.488118  26.117536    1.370582   0.569895         0.0        -0.0   0.000000   0.000000
2010/1/5   27.335123  26.391583    0.943540   0.500520         0.0        -0.0  -0.005581   0.010438
2010/1/6   26.941707  25.694008    1.247699   0.549932         0.0        -0.0  -0.014497  -0.026787
2010/1/7   26.388011  24.913389    1.474622   0.586796         0.0        -0.0  -0.020766  -0.030852
2010/1/8   26.825140  24.863562    1.961578   0.665903         0.0        -0.0   0.016430  -0.002002
2010/1/11  25.936311  24.631037    1.305274   0.559285         0.0        -0.0  -0.033696  -0.009396
2010/1/12  26.409867  25.336916    1.072951   0.521543         0.0        -0.0   0.018094   0.028255
2010/1/13  26.577433  25.137609    1.439824   0.581143         0.0        -0.0   0.006325  -0.007897
2010/1/14  28.420660  26.109231    2.311428   0.722738         0.0        -0.0   0.067054   0.037924
2010/1/15  28.253094  26.208885    2.044209   0.679327         0.0        -0.0  -0.005913   0.003810
data['strategy'] = 0.5*(data['position_1'].shift(1) * data['returns_1']) + 0.5*(data['position_2'].shift(1) * data['returns_2'])
# Cumulative return (growth of 1 unit) for each stock and for the strategy
data[['returns_1','returns_2','strategy']].dropna().cumsum().apply(np.exp).tail(1)
            returns_1  returns_2  strategy
date
2019/4/8     2.470158   3.837651  0.986754
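
The `shift(1)` in the strategy line matters: a position decided from day t's closing prices can only earn the return from day t to day t+1, so multiplying by the unshifted position would use information from the future. The sketch below shows the same calculation on a four-day toy example. The prices and positions are made up for illustration. Note that weighting log returns by 0.5 each is an approximation that is only close for small daily moves, since log returns do not average exactly across assets.

```python
import numpy as np
import pandas as pd

# Toy closing prices for two legs (made-up values).
prices = pd.DataFrame({
    "A": [10.0, 10.5, 10.0, 10.2],
    "B": [20.0, 20.0, 21.0, 20.5],
})
# Hypothetical positions in leg A decided at each day's close;
# leg B always takes the opposite side.
position_a = pd.Series([0.0, -1.0, -1.0, 0.0])
position_b = -position_a

# Daily log returns; the first day has no previous close.
log_ret = np.log(prices / prices.shift(1)).fillna(0.0)

# Yesterday's position earns today's return, half the capital per leg.
strategy = (0.5 * position_a.shift(1) * log_ret["A"]
            + 0.5 * position_b.shift(1) * log_ret["B"]).fillna(0.0)

# Summing log returns and exponentiating gives the growth of 1 unit.
growth = np.exp(strategy.cumsum())
print(strategy.round(6).tolist())
print(growth.round(6).tolist())
```

On day 2 the short in A gains because A falls and the long in B gains because B rises; the signal that opened those positions was set on day 1.
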
# Plot cumulative returns
data[['returns_1','returns_2','strategy']].dropna().cumsum().apply(np.exp).plot(figsize=(10, 6));


Pair Trading Strategy - Shorter Time Window (2013.6-2014.12)

data2 = pd.read_csv('pair-trade-data2.csv')
data2.set_index('date',inplace = True)
data2.head()
             000568     000858
date
2013/6/3  20.719056  20.343053
2013/6/4  20.357220  20.060867
2013/6/5  20.514540  20.274644
2013/6/6  20.113374  20.172031
2013/6/7  19.704342  19.667508
data2.plot(figsize=(8, 6));


# Assume the raw price difference is mean-reverting (a naive, unscientific assumption)
# Use data2 here, not the full-period frame `data`
data2['priceDelta'] = data2['000568'] - data2['000858']
data2.head()
             000568     000858  priceDelta
date
2013/6/3  20.719056  20.343053    0.376004
2013/6/4  20.357220  20.060867    0.296353
2013/6/5  20.514540  20.274644    0.239896
2013/6/6  20.113374  20.172031   -0.058657
2013/6/7  19.704342  19.667508    0.036833
# Plot the price difference and its mean
data2['priceDelta'].plot(figsize=(8, 6));
plt.ylabel('Spread')
plt.axhline(data2['priceDelta'].mean());


# Standardize the price difference into a z-score
data2['zscore'] = (data2['priceDelta'] - np.mean(data2['priceDelta']))/np.std(data2['priceDelta'])
data2.head()
             000568     000858  priceDelta     zscore
date
2013/6/3  20.719056  20.343053    0.376004   0.048513
2013/6/4  20.357220  20.060867    0.296353   0.000596
2013/6/5  20.514540  20.274644    0.239896  -0.033369
2013/6/6  20.113374  20.172031   -0.058657  -0.212979
2013/6/7  19.704342  19.667508    0.036833  -0.155532
len(data2[data2['zscore'] > 1.5])
40
len(data2[data2['zscore'] < -1.5])
16
# 'position_1' is the open/close signal for 000568
# z > 1.5: short 000568; z < -1.5: long 000568; |z| < 0.5: close the position
data2['position_1'] = np.where(data2['zscore'] > 1.5, -1, np.nan)
data2['position_1'] = np.where(data2['zscore'] < -1.5, 1, data2['position_1'])
data2['position_1'] = np.where(abs(data2['zscore']) < 0.5, 0, data2['position_1'])
data2.head()
             000568     000858  priceDelta     zscore  position_1
date
2013/6/3  20.719056  20.343053    0.376004   0.048513         0.0
2013/6/4  20.357220  20.060867    0.296353   0.000596         0.0
2013/6/5  20.514540  20.274644    0.239896  -0.033369         0.0
2013/6/6  20.113374  20.172031   -0.058657  -0.212979         0.0
2013/6/7  19.704342  19.667508    0.036833  -0.155532         0.0
data2['position_1'] = data2['position_1'].ffill().fillna(0)
data2['position_1'].plot(ylim=[-1.1, 1.1], figsize=(10, 6));


# 'position_2' is the open/close signal for 000858 (opposite sign to 000568)
data2['position_2'] = -np.sign(data2['position_1'])
data2['position_2'].plot(ylim=[-1.1, 1.1], figsize=(10, 6));


data2['returns_1'] = (np.log(data2['000568'] / data2['000568'].shift(1))).fillna(0)
data2['returns_2'] = (np.log(data2['000858'] / data2['000858'].shift(1))).fillna(0)
data2.head(10)
              000568     000858  priceDelta     zscore  position_1  position_2  returns_1  returns_2
date
2013/6/3   20.719056  20.343053    0.376004   0.048513         0.0        -0.0   0.000000   0.000000
2013/6/4   20.357220  20.060867    0.296353   0.000596         0.0        -0.0  -0.017618  -0.013968
2013/6/5   20.514540  20.274644    0.239896  -0.033369         0.0        -0.0   0.007698   0.010600
2013/6/6   20.113374  20.172031   -0.058657  -0.212979         0.0        -0.0  -0.019749  -0.005074
2013/6/7   19.704342  19.667508    0.036833  -0.155532         0.0        -0.0  -0.020546  -0.025329
2013/6/13  19.562754  19.012515    0.550239   0.153334         0.0        -0.0  -0.007212  -0.033871
2013/6/14  19.617816  19.012515    0.605301   0.186459         0.0        -0.0   0.002811   0.000000
2013/6/17  19.255979  18.720423    0.535556   0.144501         0.0        -0.0  -0.018616  -0.015482
2013/6/18  19.405434  18.853192    0.552241   0.154539         0.0        -0.0   0.007731   0.007067
2013/6/19  19.956054  19.269202    0.686852   0.235521         0.0        -0.0   0.027979   0.021826
data2['strategy'] = 0.5*(data2['position_1'].shift(1) * data2['returns_1']) + 0.5*(data2['position_2'].shift(1) * data2['returns_2'])
# Cumulative return (growth of 1 unit) for each stock and for the strategy
data2[['returns_1','returns_2','strategy']].dropna().cumsum().apply(np.exp).tail(1)
            returns_1  returns_2  strategy
date
2014/12/31   0.892955    0.97347   1.12623
# Plot cumulative returns
data2[['returns_1','returns_2','strategy']].dropna().cumsum().apply(np.exp).plot(figsize=(10, 6));


# Annualized return: mean daily log return x 252 trading days
data2[['returns_1','returns_2','strategy']].dropna().mean() * 252
returns_1   -0.073915
returns_2   -0.017554
strategy     0.077608
dtype: float64
# Annualized risk (volatility): daily standard deviation x sqrt(252)
data2[['returns_1','returns_2','strategy']].dropna().std() * 252 ** 0.5
returns_1    0.300306
returns_2    0.280425
strategy     0.057016
dtype: float64
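
Both figures rest on the same scaling convention: daily log returns add up over time, so the mean scales with the number of trading days, while the standard deviation scales with its square root if daily returns are roughly independent. The sketch below wraps the two formulas in a helper. The function name `annualize`, the constant `TRADING_DAYS`, and the sample returns are illustrative assumptions.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # assumed number of trading days per year

def annualize(daily_log_returns, periods=TRADING_DAYS):
    """Return (annualized log return, annualized volatility)."""
    r = pd.Series(daily_log_returns, dtype=float).dropna()
    ann_return = r.mean() * periods          # log returns are additive over time
    ann_vol = r.std() * np.sqrt(periods)     # assumes roughly independent days
    return ann_return, ann_vol

daily = pd.Series([0.001, -0.002, 0.003, 0.0, 0.001])
ann_return, ann_vol = annualize(daily)

# An annualized log return r corresponds to a simple return of exp(r) - 1.
simple_return = np.expm1(ann_return)
print(round(ann_return, 4), round(ann_vol, 4), round(simple_return, 4))
```

Because the inputs are log returns, the annualized figure is also a log return; the last line converts it to the more familiar simple percentage.
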
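# Cumulative return of the strategy
data2['cumret'] = data2['strategy'].dropna().cumsum().apply(np.exp)
# Running maximum of the cumulative return
data2['cummax'] = data2['cumret'].cummax()
# Drawdown series (absolute gap to the running peak, not divided by the peak)
drawdown = (data2['cummax'] - data2['cumret'])
# Maximum drawdown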
drawdown.max()
0.03645280148896235
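
The figure above is an absolute drawdown, measured in units of the cumulative return curve. Drawdown is more commonly quoted relative to the preceding peak, which divides the same gap by the running maximum. The sketch below computes both on a made-up equity curve; the function name `max_drawdown` and the sample curve are illustrative assumptions.

```python
import pandas as pd

def max_drawdown(cumulative, relative=True):
    """Largest peak-to-trough decline of a cumulative return curve."""
    curve = pd.Series(cumulative, dtype=float)
    peak = curve.cummax()            # highest value seen so far
    drawdown = peak - curve          # absolute gap to that peak
    if relative:
        drawdown = drawdown / peak   # express the gap as a fraction of the peak
    return drawdown.max()

curve = pd.Series([1.00, 1.10, 1.045, 1.20, 1.08])
print(round(max_drawdown(curve, relative=False), 4))  # 0.12
print(round(max_drawdown(curve, relative=True), 4))   # 0.1
```

When the curve stays close to 1, as it does for this strategy, the two measures are nearly the same; they diverge once the curve has grown well above its starting value.
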

Pair Trading Strategy - Accounting for Time-Series Stationarity

import pandas as pd
import numpy as np
import tushare as ts
import seaborn
from matplotlib import pyplot as plt
# On matplotlib >= 3.6 this style is named 'seaborn-v0_8'
plt.style.use('seaborn')
%matplotlib inline

1. Data Preparation

data3 = pd.read_csv('pair-trade-data2.csv')
data3.set_index('date',inplace = True)
data3.head()
             000568     000858
date
2013/6/3  20.719056  20.343053
2013/6/4  20.357220  20.060867
2013/6/5  20.514540  20.274644
2013/6/6  20.113374  20.172031
2013/6/7  19.704342  19.667508
data3.plot(figsize=(8,6));


2. Strategy Idea

data3.corr()  # correlation matrix (not covariance)
          000568    000858
000568  1.000000  0.552409
000858  0.552409  1.000000
# Scatter plot to inspect the correlation (data3, the frame used in this section)
plt.figure(figsize =(10,8))
plt.title('Stock Correlation')
plt.plot(data3['000568'], data3['000858'], '.');
plt.xlabel('000568')
plt.ylabel('000858')
data3.dropna(inplace = True)


# Linear regression of 000858 on 000568 (noise term assumed to be normally distributed)
[slope, intercept] = np.polyfit(data3.iloc[:,0], data3.iloc[:,1], 1).round(2)      
slope,intercept 
(0.51, 7.82)
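
With degree 1, `np.polyfit(x, y, 1)` returns the coefficients highest power first, that is the slope and then the intercept. The slope plays the role of a hedge ratio, and the spread is the regression residual. The sketch below repeats the step on synthetic prices so the recovered coefficients can be compared with the known ones. The data-generating values (0.5, 8) and the random seed are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic leg 1: a random walk starting near 20.
x = 20 + np.cumsum(rng.normal(0, 0.3, 500))
# Synthetic leg 2: tied to leg 1 by a known line plus stationary noise.
y = 0.5 * x + 8 + rng.normal(0, 0.2, 500)

# Degree-1 least-squares fit: coefficients come back as [slope, intercept].
slope, intercept = np.polyfit(x, y, 1)

# The spread is the residual of leg 2 around the fitted line.
spread = y - (slope * x + intercept)
print(round(slope, 3), round(intercept, 3))
print(abs(spread.mean()) < 1e-8)
```

With unrounded coefficients the residual has zero mean by construction. Rounding the slope and intercept to two decimals, as the notebook does, shifts the spread by a small constant, which the later z-score step removes anyway by subtracting the mean.
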
data3['spread'] = data3.iloc[:,1] - (data3.iloc[:,0]*slope + intercept)
data3.head()
             000568     000858    spread
date
2013/6/3  20.719056  20.343053  1.956334
2013/6/4  20.357220  20.060867  1.858684
2013/6/5  20.514540  20.274644  1.992228
2013/6/6  20.113374  20.172031  2.094210
2013/6/7  19.704342  19.667508  1.798294
data3['spread'].plot(figsize = (10,8),title = 'Price Spread');
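
Whether a spread like this one is mean-reverting can be checked informally by regressing its daily change on its previous level: a clearly negative coefficient means the spread tends to move back toward its mean, and the implied half-life says how fast. The sketch below is only a rough diagnostic under an AR(1) assumption and is not a formal unit-root test; for that, an augmented Dickey-Fuller test such as `adfuller` in `statsmodels.tsa.stattools` is the usual tool. The function name `mean_reversion_stats` and the synthetic series are illustrative assumptions.

```python
import numpy as np

def mean_reversion_stats(spread):
    """Fit delta_t = a + b * spread_{t-1}; return (b, half-life in days)."""
    s = np.asarray(spread, dtype=float)
    lagged = s[:-1]
    delta = np.diff(s)
    b, a = np.polyfit(lagged, delta, 1)   # slope first, then intercept
    phi = 1.0 + b                         # implied AR(1) coefficient
    # Half-life of a shock under AR(1); undefined if there is no reversion.
    half_life = -np.log(2) / np.log(phi) if 0 < phi < 1 else np.inf
    return b, half_life

rng = np.random.default_rng(42)
n = 2000
noise = rng.normal(0, 1, n)

# Stationary AR(1) series with coefficient 0.9 (mean-reverting).
ar1 = np.zeros(n)
for t in range(1, n):
    ar1[t] = 0.9 * ar1[t - 1] + noise[t]

# Random walk built from the same shocks (not mean-reverting).
random_walk = np.cumsum(noise)

b_ar1, hl_ar1 = mean_reversion_stats(ar1)
b_rw, hl_rw = mean_reversion_stats(random_walk)
print(round(b_ar1, 3), round(hl_ar1, 1))
print(round(b_rw, 3), hl_rw)
```

For the AR(1) series the coefficient should come out near -0.1, while for the random walk it should be close to zero, which is the situation in which a z-score entry rule has nothing to revert to.
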


data3['zscore'] = (data3['spread'] - data3['spread'].mean())/data3['spread'].std()
data3.head()
             000568     000858    spread    zscore
date
2013/6/3  20.719056  20.343053  1.956334  1.452385
2013/6/4  20.357220  20.060867  1.858684  1.382488
2013/6/5  20.514540  20.274644  1.992228  1.478078
2013/6/6  20.113374  20.172031  2.094210  1.551075
2013/6/7  19.704342  19.667508  1.798294  1.339261
data3['zscore'].plot(figsize = (10,8),title = 'Z-score')
plt.axhline(1.5)
plt.axhline(0)
plt.axhline(-1.5)
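
The z-score plotted here is standardized with the mean and standard deviation of the whole sample, so each day's value depends on data that was not yet available on that day. One possible alternative, sketched here as an assumption rather than as the method used in this article, is to standardize with a trailing window that ends on the previous day. The function name `rolling_zscore` and the 60-day default window are hypothetical choices.

```python
import numpy as np
import pandas as pd

def rolling_zscore(spread, window=60):
    """Z-score of each value against the previous `window` observations."""
    s = pd.Series(spread, dtype=float)
    # shift(1) makes the statistics for day t use days t-window .. t-1 only.
    mean = s.rolling(window).mean().shift(1)
    std = s.rolling(window).std().shift(1)
    return (s - mean) / std

rng = np.random.default_rng(1)
spread = pd.Series(rng.normal(0, 1, 200))
z = rolling_zscore(spread, window=20)

# The first `window` values have no complete history and are NaN.
print(int(z.isna().sum()))
```

The cost is that the first `window` days produce no signal, and the result depends on the window length, which becomes one more parameter to choose.
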


Generate trading signals
# spread = 000858 - fitted value, so a high z-score means 000858 is rich relative to 000568:
# z > 1.5: long 000568; z < -1.5: short 000568; |z| < 0.5: close the position
data3['position_1'] = np.where(data3['zscore'] > 1.5, 1, np.nan)
data3['position_1'] = np.where(data3['zscore'] < -1.5, -1, data3['position_1'])
data3['position_1'] = np.where(abs(data3['zscore']) < 0.5, 0, data3['position_1'])
data3['position_1'] = data3['position_1'].ffill().fillna(0)
data3['position_1'].plot(ylim=[-1.1, 1.1], figsize=(10, 6),title = 'Trading Signal_Uptrade');


data3['position_2'] = -np.sign(data3['position_1'])
data3['position_2'].plot(ylim=[-1.1, 1.1], figsize=(10, 6),title = 'Trading Signal_Downtrade');


3. Computing and Visualizing the Strategy's Annualized Return

data3['returns_1'] = np.log(data3['000568'] / data3['000568'].shift(1))
data3['returns_2'] = np.log(data3['000858'] / data3['000858'].shift(1))
data3['strategy'] = 0.5*(data3['position_1'].shift(1) * data3['returns_1']) + 0.5*(data3['position_2'].shift(1) * data3['returns_2'])
# Cumulative return (growth of 1 unit) for each stock and for the strategy
data3[['returns_1','returns_2','strategy']].dropna().cumsum().apply(np.exp).tail(1)
            returns_1  returns_2  strategy
date
2014/12/31   0.892955    0.97347  1.174494
data3[['returns_1','returns_2','strategy']].dropna().cumsum().apply(np.exp).plot(figsize=(10, 8),title = 'Strategy_Backtesting');


# Annualized return: mean daily log return x 252 trading days
data3[['returns_1','returns_2','strategy']].dropna().mean() * 252
returns_1   -0.073915
returns_2   -0.017554
strategy     0.105002
dtype: float64
# Annualized risk (volatility): daily standard deviation x sqrt(252)
data3[['returns_1','returns_2','strategy']].dropna().std() * 252 ** 0.5
returns_1    0.300306
returns_2    0.280425
strategy     0.068639
dtype: float64
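# Cumulative return of the strategy
data3['cumret'] = data3['strategy'].dropna().cumsum().apply(np.exp)
# Running maximum of the cumulative return
data3['cummax'] = data3['cumret'].cummax()
# Drawdown series (absolute gap to the running peak, not divided by the peak)
drawdown = (data3['cummax'] - data3['cumret'])
# Maximum drawdown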
drawdown.max()
0.038159777097367176

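Thoughts on the Strategy

  1. Pair trading across multiple ETFs is a strategy that many live quantitative funds run.

Risks and problems:

  1. The spread may fail to revert. When market structure changes materially, a spread estimated by regression on past data can stop reverting, which is a major risk.

  2. Short selling is restricted in the Chinese market, so part of the return attributed to the short leg cannot actually be captured.

  3. The regression coefficient needs rebalancing.

  4. The strategy does not account for transaction costs or other costs.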
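
The last point can be made concrete by charging a proportional cost every time a position changes. The sketch below subtracts such a cost from the same half-and-half return formula used in the backtests. The function name `net_strategy_returns`, the 0.1% `cost_rate`, and the toy inputs are hypothetical; real costs in this market (commission, stamp duty, slippage, borrowing fees for the short leg) would need to be looked up rather than assumed.

```python
import pandas as pd

def net_strategy_returns(position_1, position_2, returns_1, returns_2,
                         cost_rate=0.001):
    """Daily strategy log returns after a proportional trading cost."""
    p1 = pd.Series(position_1, dtype=float)
    p2 = pd.Series(position_2, dtype=float)
    r1 = pd.Series(returns_1, dtype=float)
    r2 = pd.Series(returns_2, dtype=float)
    # Gross return: yesterday's positions earn today's returns, half per leg.
    gross = (0.5 * p1.shift(1) * r1 + 0.5 * p2.shift(1) * r2).fillna(0.0)
    # Turnover: size of each day's position change, assuming a flat start.
    turnover = (0.5 * p1.diff().fillna(p1.iloc[0]).abs()
                + 0.5 * p2.diff().fillna(p2.iloc[0]).abs())
    # Subtracting cost_rate from a log return approximates log(1 - cost).
    return gross - cost_rate * turnover

pos_1 = [0, -1, -1, 0]
pos_2 = [0, 1, 1, 0]
ret_1 = [0.0, 0.01, -0.02, 0.01]
ret_2 = [0.0, 0.00, 0.01, 0.00]
net = net_strategy_returns(pos_1, pos_2, ret_1, ret_2)
print(net.round(4).tolist())
```

In this toy run the round trip costs 0.002 against a gross gain of 0.010. For a strategy whose annualized return is in the single digits, the number of round trips per year largely decides whether anything is left after costs.
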

