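Getting financial data
pandas_datareader is a module that was split off from pandas. It can fetch data from Google Finance and Yahoo Finance,
but Yahoo Finance changed its interface this year, so it currently does not work properly.
tushare is a package for retrieving financial data. It can be installed with pip; documentation: http://tushare.org/classifying.html
It provides basic stock and fund data, but does not cover more specialized professional needs.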
import tushare as ts
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
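# Read the SSE 50 index via tushare
df_sz = ts.get_k_data('sz50',start='2012-01-01')
# info() prints the summary itself and returns None, so print() also shows a trailing "None"
print(df_sz.info())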
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1346 entries, 0 to 1345
Data columns (total 7 columns):
date 1346 non-null object
open 1346 non-null float64
close 1346 non-null float64
high 1346 non-null float64
low 1346 non-null float64
volume 1346 non-null float64
code 1346 non-null object
dtypes: float64(5), object(2)
memory usage: 84.1+ KB
None
# Last five rows
print(df_sz.tail())
           date     open    close     high      low      volume  code
860  2017-07-13  2565.59  2600.95  2601.47  2563.22  42871574.0  sz50
861  2017-07-14  2600.76  2622.98  2623.32  2593.43  33653285.0  sz50
862  2017-07-17  2627.37  2631.41  2663.28  2594.42  60865949.0  sz50
863  2017-07-18  2619.23  2624.31  2637.53  2596.12  39435040.0  sz50
864  2017-07-19  2621.11  2657.89  2661.66  2618.37  48688897.0  sz50
df_sz = df_sz.set_index('date')
# Plot the closing price
%matplotlib inline
df_sz['close'].plot(figsize=(12,6))
plt.grid(True)
Figure: SSE 50 closing price
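Log returns
# Log return: ln(close_t / close_{t-1})
# Note: %time on a line by itself times an empty statement, so the figures below
# do not measure the calculation; put %%time on the first line of the cell to time it.
%time
df_sz['d_ret'] = np.log(df_sz['close'] / df_sz['close'].shift(1))
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 12.2 µs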
%matplotlib inline
df_sz[['close','d_ret']].plot(subplots=True, style='g', figsize=(10,7))
Figure: index returns
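To make the log-return calculation concrete, here is a minimal sketch on a small made-up price series (the numbers are illustrative, not SSE 50 data). It also shows a property that makes log returns convenient: they add up over time, so their sum equals the log of the last price divided by the first.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices, for illustration only
close = pd.Series([100.0, 102.0, 101.0, 105.0])

# Log return: ln(P_t / P_{t-1}); the first value is NaN because there is no prior price
log_ret = np.log(close / close.shift(1))

# Log returns are additive: their sum equals ln(P_last / P_first)
total = log_ret.sum()  # sum() skips NaN by default
direct = np.log(close.iloc[-1] / close.iloc[0])

print(log_ret)
print(total, direct)
```

The same additivity does not hold for simple percentage returns, which have to be compounded rather than summed.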
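Leverage effect
Volatility and market returns are negatively correlated: volatility rises when the market falls.
Rolling averages
# Rolling average curves
# pandas provides built-in methods for rolling calculations, e.g. Series.rolling(window=22, center=False).mean() / .max() / .min() / .corr()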
df_sz['42d'] = df_sz['close'].rolling(window=42).mean()
df_sz['252d'] = df_sz['close'].rolling(window=252).mean()
%matplotlib inline
df_sz[['close','42d','252d']].plot(figsize=(12,6))
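The 42-day and 252-day lines start later than the price line in the plot because a rolling window yields NaN until it is full. The sketch below shows this on a tiny made-up series, along with `min_periods`, which lets the average start as soon as fewer observations are available. The values and the window of 3 are illustrative only.

```python
import pandas as pd

# Hypothetical values, for illustration only
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# 3-period rolling mean: the first window - 1 entries are NaN
roll = s.rolling(window=3).mean()

# min_periods=1 starts averaging as soon as one observation is available
roll_partial = s.rolling(window=3, min_periods=1).mean()

print(roll.tolist())
print(roll_partial.tolist())
```

Whether to use `min_periods` is a judgment call: it fills the start of the curve, but those early values average over fewer points than the rest.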
Rolling standard deviation of returns
# Rolling (moving) standard deviation of the log returns
df_sz['d_ret_std'] = df_sz['d_ret'].rolling(window=42).std()
%matplotlib inline
df_sz[['close','d_ret_std','d_ret']].plot(subplots=True,figsize=(10,7))
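The chart above is consistent with the leverage hypothesis: volatility rises when the market falls and declines when the market rises.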
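One possible way to put a number on this visual impression is to correlate the rolling mean of the log returns with their rolling standard deviation. The sketch below uses a stylized, made-up return series in which calm rising phases alternate with turbulent falling phases, so the negative sign is built in by construction; on real data such as `df_sz` the relationship may be weaker or look different. The 42-day window mirrors the one used above, and the drift and swing sizes are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Stylized daily log returns (made up): calm rising phases alternate
# with turbulent falling phases, 250 days each
n = 250
sign = np.where(np.arange(n) % 2 == 0, 1.0, -1.0)  # +1, -1, +1, ...
calm = 0.001 + 0.005 * sign        # small swings around a positive drift
turbulent = -0.002 + 0.02 * sign   # large swings around a negative drift
d_ret = pd.Series(np.concatenate([calm, turbulent, calm, turbulent]))

window = 42
roll_mean = d_ret.rolling(window=window).mean()  # recent average return
roll_std = d_ret.rolling(window=window).std()    # recent volatility

# Pearson correlation; Series.corr drops the leading NaN rows
corr = roll_mean.corr(roll_std)
print(round(corr, 3))
```

Applied to the data in this post, the equivalent call would be `df_sz['d_ret'].rolling(window=42).mean().corr(df_sz['d_ret_std'])`; a negative value would support the reading of the chart, though this is only one rough measure of the effect.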