Tools
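import math
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tushare as ts

# Requires a tushare token configured beforehand (for example via ts.set_token)
pro = ts.pro_api()
# Use the SimHei font so the Chinese index names render in matplotlib
plt.rcParams["font.sans-serif"] = ["SimHei"]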
First, import pandas and numpy for the data analysis and matplotlib for plotting. The index data comes from the tushare package; the tushare website has a detailed tutorial on how to use it.
Building the index correlation table
list_code = ["000001.SH","399001.SZ","399005.SZ","399006.SZ","000688.SH","000016.SH","399300.SZ","399905.SZ"]
list_index = ["上证综指","深证成指","中小板指","创业板指","科创50","上证50","沪深300","中证500"]
returns = {}
for code, name in zip(list_code, list_index):
    # Daily bars for one index; tushare documents dates as "YYYYMMDD" strings
    df1 = pro.index_daily(ts_code=code, start_date="20200315", end_date="20210315")
    df1 = df1.sort_values(by="trade_date", ascending=True)
    close = df1.set_index("trade_date")["close"]
    # Daily log return ln(close_t / close_{t-1}); the first day has no return
    returns[name] = np.log(close / close.shift(1)).dropna()
# Building the frame from Series aligns every column on trade_date,
# so indices with a different number of trading days do not raise an error
df = pd.DataFrame(returns)
corr = df.corr()
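tushare supplies the daily bars for each index, here covering roughly the past year (2020-03-15 to 2021-03-15). Daily returns are then computed as log returns, r_t = ln(P_t / P_{t-1}), where P_t is the closing price on day t. Finally, the corr function in pandas returns the correlation coefficient table directly, as shown in the figure below: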
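The return and correlation step can be tried without a tushare account. The sketch below is only an illustration: the closing prices and the column names `A`, `B`, `C` are invented, not market data. It computes log returns with a vectorized `shift` rather than an explicit loop, then calls `corr`.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices for three made-up indices (not real data)
close = pd.DataFrame(
    {
        "A": [100.0, 102.0, 101.0, 105.0, 104.0],
        "B": [50.0, 51.0, 50.5, 52.5, 52.0],      # always half of A
        "C": [200.0, 196.0, 198.0, 190.0, 192.0],  # moves against A
    },
    index=["20200316", "20200317", "20200318", "20200319", "20200320"],
)

# Log return ln(P_t / P_{t-1}); the first row has no previous day, so drop it
log_returns = np.log(close / close.shift(1)).dropna()

# Pairwise Pearson correlation between the return columns
corr_demo = log_returns.corr()
print(corr_demo.round(3))
```

Because `B` is a fixed multiple of `A`, their returns are identical and the coefficient between them is 1, while `C` is negatively correlated with both. Scaling a price series does not change its log returns, which is one reason returns rather than raw prices are correlated.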
Drawing a heatmap from the correlation table
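# Canvas size and colormap
fig = plt.figure(figsize=(10,6))
cmap = plt.cm.OrRd
# Draw the correlation matrix as an image, with a color scale beside it
plt.imshow(corr,cmap=cmap)
plt.colorbar()
# Label both axes with the index names
labels = corr.columns.values
tickmarks = np.arange(len(labels))
plt.xticks(tickmarks,labels,rotation=90)
plt.yticks(tickmarks,labels)
plt.show()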
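First set the canvas size and the heatmap colormap; many articles on CSDN cover the available color schemes. This example uses the OrRd colormap. The rendered result is shown in the figure below: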
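A heatmap is easier to read when each cell also shows its coefficient. The following is a minimal sketch of that variation, not the original post's code: the 3x3 matrix and the names `A`, `B`, `C` are invented for illustration, and the fixed color range of 0 to 1 is an assumption that suits positively correlated indices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical 3x3 correlation matrix (values invented for illustration)
names = ["A", "B", "C"]
corr_demo = pd.DataFrame(
    [[1.0, 0.8, 0.3],
     [0.8, 1.0, 0.5],
     [0.3, 0.5, 1.0]],
    index=names, columns=names,
)

fig, ax = plt.subplots(figsize=(5, 4))
# Fix the color range so the same color always means the same correlation
im = ax.imshow(corr_demo.values, cmap="OrRd", vmin=0.0, vmax=1.0)
fig.colorbar(im, ax=ax)

# Label both axes with the index names
ticks = np.arange(len(names))
ax.set_xticks(ticks)
ax.set_xticklabels(names, rotation=90)
ax.set_yticks(ticks)
ax.set_yticklabels(names)

# Write each coefficient inside its cell (x is the column, y is the row)
for i in range(len(names)):
    for j in range(len(names)):
        ax.text(j, i, f"{corr_demo.iloc[i, j]:.2f}", ha="center", va="center")

fig.tight_layout()
fig.savefig("corr_heatmap_demo.png")
```

Without `vmin` and `vmax`, `imshow` scales the colors to the smallest and largest values in the matrix, so two heatmaps drawn from different periods would not be directly comparable.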