利用Tushare进行分析时间序列分析第二弹

最新推荐文章于 2024-07-21 14:23:07 发布

Hungry and Humble

最新推荐文章于 2024-07-21 14:23:07 发布

阅读量120

点赞数

文章标签：机器学习人工智能

本文链接：https://blog.csdn.net/weixin_44026608/article/details/132696923

版权

本文以沪深300指数时间序列为例，首先提取数据，并导入所需要的statsmodels库。

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tushare as ts

df  = ts.get_hist_data('hs300','2016-01-01','2018-03-09').sort_index(0)
from scipy import stats
import statsmodels.api as sm
df['close_1'] = df['close'].diff(1)
df['close_2'] = df['close_1'].diff(1)
df[['close']].plot(subplot = Flase,color = 'red',figsize = (12,6))

df['close'].diff(1).plot(subplot = True,figsize=(12,6))

df['close'].diff(2).plot(subplot = True,figsize=(12,6))

diff(1)与diff(2)计算了时间序列的一阶差分和二阶差分

接下来检验一下沪深300指数的收益率是否平稳（调用了自相关函数ACF）：

df2=np.log(df/df.shift(1))
df2=df2.fillna(0) 
df3=data2['close']
m=10 #数	
acf,q,p=sm.tsa.acf(df3,nlags=m,qstat=True)
out=np.c_[range(1,11)，acf[1:]，q，p]	
output=pd.DataFrame(out, columns=['lag’, "ACF","Q","P-value"])
output=output.set_index('lag')	
output

检验多个自相关系数是否同时为0需要用到混成统计量Q（m），可以看到10个滞后，每个P对很小，因此拒绝同时为0的原假设，因此沪深300指数的对数收益率序列相关。

接下来，验证对数收益率的平稳性：

temp = np.array(df3)
model = sm.tsa.AR(temp)
reusults_AR= model.fit()
plt.figure(figsize=(10,4))
plt.plot(temp,'b',label = 'HS')
plt.plot(result_AR.fittedvalues,'r',label = 'AR moedl')
plt.legend()

模型有18阶，即自动生成的AR模型是18阶的。如果需要绘ARIMA模型，把plot里的参数改为result_ARIMA.fittedvalues就好。

当模型特征根的模都小于1时，AR时间序列平稳。

pi,sin,cos=np.pi,np.sin,np.cos
r1=1	
t=np.linspace(0,2*pi,360)
x1=r1*cos(t)
y1=r1*sin(t)	
plt.figure(figsize=(6,6))
plt.plot(x1,y1,'k') #画单位圆
roots =1/results_AR.roots#   注意，这里results_ARroots 是计算的特征方程的解，特征根是解的倒数 for i in range(len(roots)):
    plt.plot(roots[i].real,roots[i].imag,'.r',markersize=8) #画出特征根 
plt.show()

特征根均在半径为1的圆内，因此AR（18）时间序列平稳

绘制PACF图，lags一直到40：

fig=plt.figure(figsize=(20,5))
ax1=fig.addsubplot(111)
fig=sm.graphics.tsa.plot_pacf(temp,lags=40,ax=ax1)

从PACF图的感觉是，AR（3）似乎也是可以的，PACF前三个样本（0、1、2）是显著的，后面的阶数又过大了.

接下来，用信息准则验证：

包括了AIC（赤池信息准则）BIC（贝叶斯信息准则）和H—Q准则

以ARMA（7，0）即AR（7）为例：

arma_mod70=sm.tsa.ARMA(temp,(7,0)).fit()
print arma_mod70.aic,arma_mod70.bic,arma_mod70.hqic

aicList =[] 
bicList =[] 
hqicList=[]
for i in range(1,11):	
    order =(i,0)
    tempModel=sm.tsa.ARMA(temporder).fit() 
    aicList.append(tempModel.aic)     
    bicList.append(tempModel.bic) 
    hqicList.append(tempModelhqic)
plt.figure(figsize=(15,6))	
plt.plot(aicList,'r',label=aicvalue)	
plt.plot(bicList,'b,label=bic value) 
plt.plot(hqicList,'k',label='hqicvalue)
plt.legend(loc=0)

选择三个值最小的阶，由图可以看到，可以选择AR（4）或者AR（8）。

残差序列白噪声：

delta = reaults_AR.fittedvalues - temp[18:]#残差
plt.figure(figsize = (10,6))
plt.plot(delta,'r',label = 'residual error')
plt.legend(loc = 0)

acf,q,p=sm.tsa.acf(delta,nlags=10,qstat=True)
out =np.c_[range(1,11)，acf[1:]，q，p] 
output=pd.DataFrame(out, columns=['lag', "AC","Q","P-value"]) 
output =output.set_index('lag')
output

拒绝原假设，为白噪声。

用自动生成的AR模型做预测

train=temp[:-10]
test=temp[-10:]
output =sm.tsa.AR(train).fit() 
output.predict()
predicts = output.predict(355，364,dynamic=True) 
print len(predicts) 
comp=pd.DataFrame
comp['original'] =temp[-10:]
comp['predict'] =predicts10 comp

comp.plot(figsize= (8,5))

从上面结果可知AR模型的预测结果并不合意。

Hungry and Humble

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
1
评论
利用Tushare进行分析时间序列分析第二弹

检验多个自相关系数是否同时为0需要用到混成统计量Q（m），可以看到10个滞后，每个P对很小，因此拒绝同时为0的原假设，因此沪深300指数的对数收益率序列相关。模型有18阶，即自动生成的AR模型是18阶的。从PACF图的感觉是，AR（3）似乎也是可以的，PACF前三个样本（0、1、2）是显著的，后面的阶数又过大了.本文以沪深300指数时间序列为例，首先提取数据，并导入所需要的statsmodels库。选择三个值最小的阶，由图可以看到，可以选择AR（4）或者AR（8）。拒绝原假设，为白噪声。
复制链接

扫一扫