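This post computes daily returns and log returns on test data, then the cumulative returns built from each, and compares the results. Two ways of accumulating are used. The first, accumulating the daily returns day by day, comes from the book Python金融数据分析. The second is a method introduced in the Georgia Tech open course CS7646: divide the current price by the price paid on the first day, then subtract one.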
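For reference, the quantities can be written out as below. The notation ($P_t$ for the close on day $t$, $P_0$ for the first day's close) is introduced here for illustration and is not taken from either source.

```latex
\begin{aligned}
r_t &= \frac{P_t}{P_{t-1}} - 1
  && \text{(daily simple return)} \\
\ell_t &= \ln\frac{P_t}{P_{t-1}}
  && \text{(daily log return)} \\
\sum_{t=1}^{T} \ell_t &= \ln\frac{P_T}{P_0}
  && \text{(summed log returns)} \\
\frac{P_T}{P_0} - 1 &= \prod_{t=1}^{T} (1 + r_t) - 1
  = \exp\!\Big(\sum_{t=1}^{T} \ell_t\Big) - 1
  && \text{(price-ratio cumulative return)}
\end{aligned}
```

The summed log returns and the price ratio therefore carry the same information. A plain sum of simple returns $\sum_t r_t$ ignores compounding, so it only approximates the price ratio when the daily moves are small.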
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tushare as ts
sz50 = ts.get_k_data('sz50',start='2004-01-01')
sz50.info()
sz50 = sz50.set_index('date')
sz50[:3]
<class 'pandas.core.frame.DataFrame'>
Int64Index: 3269 entries, 0 to 3268
Data columns (total 7 columns):
date 3269 non-null object
open 3269 non-null float64
close 3269 non-null float64
high 3269 non-null float64
low 3269 non-null float64
volume 3269 non-null float64
code 3269 non-null object
dtypes: float64(5), object(2)
memory usage: 204.3+ KB
| date | open | close | high | low | volume | code |
|---|---|---|---|---|---|---|
| 2004-01-02 | 997.00 | 1011.35 | 1021.57 | 993.89 | 8064650.0 | sz50 |
| 2004-01-05 | 1008.28 | 1060.80 | 1060.90 | 1008.28 | 14468200.0 | sz50 |
| 2004-01-06 | 1059.14 | 1075.66 | 1086.69 | 1059.09 | 16991300.0 | sz50 |
%matplotlib inline
sz50['close'].plot(grid = True, figsize=(15,7))

sz50['42d'] = np.round(sz50['close'].rolling(window= 42).mean(), 2)
sz50['252d'] = np.round(sz50['close'].rolling(window= 252).mean(), 2)
sz50.tail()
| date | open | close | high | low | volume | code | 42d | 252d |
|---|---|---|---|---|---|---|---|---|
| 2017-06-14 | 2505.06 | 2477.32 | 2505.20 | 2471.62 | 21841108.0 | sz50 | 2393.60 | 2287.56 |
| 2017-06-15 | 2474.26 | 2461.97 | 2481.80 | 2453.13 | 20702545.0 | sz50 | 2395.84 | 2288.77 |
| 2017-06-16 | 2454.84 | 2452.79 | 2466.69 | 2448.31 | 16518044.0 | sz50 | 2398.15 | 2290.02 |
| 2017-06-19 | 2455.03 | 2484.12 | 2486.31 | 2453.35 | 20594004.0 | sz50 | 2401.17 | 2291.39 |
| 2017-06-20 | 2489.20 | 2474.43 | 2492.22 | 2467.77 | 17771153.0 | sz50 | 2404.43 | 2292.67 |
%matplotlib inline
sz50[['close','42d','252d']].plot(grid=True, figsize=(15,7))

sz50['42-252'] = sz50['42d'] - sz50['252d']
sz50[['close','42d','252d','42-252']].head()
| date | close | 42d | 252d | 42-252 |
|---|---|---|---|---|
| 2004-01-02 | 1011.35 | NaN | NaN | NaN |
| 2004-01-05 | 1060.80 | NaN | NaN | NaN |
| 2004-01-06 | 1075.66 | NaN | NaN | NaN |
| 2004-01-07 | 1086.30 | NaN | NaN | NaN |
| 2004-01-08 | 1102.66 | NaN | NaN | NaN |
sz50[['close','42d','252d','42-252']].tail()
| date | close | 42d | 252d | 42-252 |
|---|---|---|---|---|
| 2017-06-14 | 2477.32 | 2393.60 | 2287.56 | 106.04 |
| 2017-06-15 | 2461.97 | 2395.84 | 2288.77 | 107.07 |
| 2017-06-16 | 2452.79 | 2398.15 | 2290.02 | 108.13 |
| 2017-06-19 | 2484.12 | 2401.17 | 2291.39 | 109.78 |
| 2017-06-20 | 2474.43 | 2404.43 | 2292.67 | 111.76 |
SD = 50
sz50['signal'] = np.where(sz50['42-252'] > SD, 1, 0)
sz50['signal'] = np.where(sz50['42-252'] < -SD, -1, sz50['signal'])
sz50['signal'].value_counts()
-1 1363
1 1201
0 705
Name: signal, dtype: int64
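The two `np.where` calls above can be traced on a tiny example. This is only a sketch: the `spread` values are made up for illustration, with a NaN standing in for the rows where the 252-day average is not yet available.

```python
import numpy as np
import pandas as pd

SD = 50  # same threshold as in the cell above

# Hypothetical 42d-minus-252d spread values (made up for illustration).
# NaN mimics the warm-up rows before the moving averages exist.
spread = pd.Series([np.nan, 60.0, -60.0, 10.0])

# Step 1: long (1) where the spread is above +SD, otherwise flat (0).
signal = np.where(spread > SD, 1, 0)
# Step 2: short (-1) where the spread is below -SD, otherwise keep step 1.
signal = np.where(spread < -SD, -1, signal)

print(signal)
```

Comparisons against NaN evaluate to False, so the warm-up rows fall through both conditions and end up with a signal of 0. That is likely where part of the 705 zero-signal days in the counts above comes from.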
%matplotlib inline
sz50['signal'].plot(grid=True,figsize=(10,5))

sz50['market'] = (sz50['close']/sz50['close'].shift(1))- 1.0
sz50['log_market'] = np.log(sz50['close']/sz50['close'].shift(1))
sz50['income'] = sz50['signal'].shift(1) * sz50['market']
sz50['log_income'] = sz50['signal'].shift(1) * sz50['log_market']
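The `shift(1)` on the signal means each day's market return is multiplied by the position decided on the previous day, which avoids using information from the same day's close. A minimal sketch with made-up numbers (the `signal` and `market` values here are hypothetical, not taken from the sz50 data):

```python
import numpy as np
import pandas as pd

# Hypothetical positions and daily simple returns (made up for illustration).
signal = pd.Series([0, 1, 1, -1])
market = pd.Series([np.nan, 0.02, -0.01, 0.03])

# Yesterday's position times today's return.
income = signal.shift(1) * market

print(pd.DataFrame({'signal': signal,
                    'prev_signal': signal.shift(1),
                    'market': market,
                    'income': income}))
```

On the last day the signal flips to -1, but the return earned that day is still +0.03 because the position held going into the day was 1.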
%matplotlib inline
sz50['income'].plot(grid=True, figsize=(10,6))
sz50['log_income'].plot(grid=True, figsize=(10,6),alpha=0.5,c='red')

%matplotlib inline
sz50[['market','log_market','income','log_income']].cumsum().plot(grid=True, figsize=(15,7))

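# Cumulative return relative to the first day's close: each day's close
# divided by the purchase price on the first day, minus one
sz50['accu_returns'] = sz50['close'] / sz50['close'].iloc[0] - 1.0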
%matplotlib inline
sz50[['accu_returns']].plot(grid=True, figsize=(10,6))

acc_market = sz50['market'].cumsum()
acc_log_market = sz50['log_market'].cumsum()
acc_market.plot(label='market accu_returns',figsize=(15,7))
acc_log_market.plot(label='log market accu_returns')
sz50['accu_returns'].plot(label='accu_returns')
plt.grid()
plt.legend()
plt.show()

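The three curves clearly rise and fall together.
Between 2007 and 2008, accu_returns swings far more than the other two, because this cumulative return is measured against the closing price on 2004-01-02 (each day's close divided by the 2004 purchase price).
The 'market accu_returns' and 'log market accu_returns' curves are obtained by summing the daily returns.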
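How the three curves relate can be checked on a small made-up price series. This is a sketch under stated assumptions: the prices and all variable names below are hypothetical and are not part of the sz50 data.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices (made up for illustration, not tushare data).
close = pd.Series([100.0, 110.0, 99.0, 120.0, 150.0])

simple = close / close.shift(1) - 1.0      # daily simple return
log_ret = np.log(close / close.shift(1))   # daily log return

summed_simple = simple.cumsum()            # like 'market accu_returns'
summed_log = log_ret.cumsum()              # like 'log market accu_returns'
price_ratio = close / close.iloc[0] - 1.0  # first-day-price method

# Compounding the simple returns, or exponentiating the summed log
# returns, reproduces the price-ratio curve.
compounded = (1.0 + simple).cumprod() - 1.0
from_logs = np.exp(summed_log) - 1.0

print(pd.DataFrame({'summed_simple': summed_simple,
                    'summed_log': summed_log,
                    'price_ratio': price_ratio,
                    'compounded': compounded,
                    'from_logs': from_logs}))
```

In this example the price ratio ends at 0.5 (150 against 100), while the plain sum of simple returns ends lower, because summing does not compound gains on top of earlier gains. That is consistent with accu_returns moving much more than the summed curves during a long rally.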