用Python对CAPM和Fama French Three Factor model的初步学习

用Python对CAPM和Fama French Three Factor model的初步学习

概述

本文主要使用CAPM(资本资产定价模型)和Fama French Three Factor model来评估中国股票市场的股票价格。 使用CAPM来选择投资组合,然后建立Fama French Three Factor model来评估投资组合。
首先,使用SML绘制CAPM图表,然后随机选择50只股票将其在图上标记,保存高于该线的股票。 接下来显示这些股票之间的相关性。 之后根据收益率和相关性选择我们的投资组合并分配quantity。
其次,建立Fama French Three Factor model,包括计算三个因素,并使用投资组合的回报率进行回归。 在这一部分中,编写了两个函数来使代码更加清晰和高效。
总的来说,本文有三个部分,介绍,建模和分析,以及代码。

介绍

关于两个模型介绍部分不多赘述。简单看一下公式。
在这里插入图片描述在这里插入图片描述

建模与分析

CAPM

本文用的股票数据来源于Tushare,可以注册获取API。官方学生认证的话可以加2000积分。
首先,引入一些第三方库

import tushare
import requests
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import random
import time
pro = tushare.pro_api('官网注册获取')

无风险利率用的是中国国债利率,找了个均值,如果能有时间序列的无风险利率更好,大家如果知道怎么获取到的话,可以留言。
这里用国证指数来表示市场回报率。

# drew the CAPM picture, we need rf and rm
gzA = pro.index_daily(ts_code='399317.SZ', start_date='20000101', end_date='20190901')
gzA = gzA.sort_values('trade_date', ascending=True)
rf = 0.03064  # 3.064%
rm = gzA['pct_chg'].mean()

这里获取市场上股票列表,有3672支股。

stock_basic = pro.stock_basic()
stock_basic.head()

output:
在这里插入图片描述
随机选取50支股票,计算Beta和收益率,Mark在图上。
并且保存在线上的点。

x = np.linspace(0,2,50)
y = rf + x * (rm - rf)
plt.figure(figsize=(16, 12))
plt.title('CAPM', fontsize = 18)
plt.ylim(0 , 0.4)
plt.xlim(0 , 2)
plt.ylabel('E(ri)')
plt.xlabel('Beta')
plt.plot(x,y, label = 'Security Market Line')
plt.scatter(0 , rf , label = 'Rf' , marker='*', color = 'orange', s = 28 ** 2)
#plt.plot(0 , rf , "yo", label = 'rf')
i = 0
portfolio = pd.DataFrame()
while(i<50):
    rd = random.randint(0 , len(stock_basic)+1)
    ri = pro.daily(ts_code=stock_basic.at[rd,'ts_code'] , start_date='20000101', end_date='20190901')
    beta = np.mean((ri['pct_chg'] - ri['pct_chg'].mean()) * (gzA['pct_chg'] - gzA['pct_chg'].mean())) / gzA['pct_chg'].std() ** 2
    x = beta
    y = ri['pct_chg'].mean()
    if y > rf + x * (rm - rf):
        plt.scatter(x, y, marker='^', color = 'g')
        plt.annotate(stock_basic.at[rd,'ts_code'], xy = (x, y), xytext = (x+0.005, y+0.005))
        #portfolio_list[stock_basic.at[rd,'ts_code']] = stock_basic.at[rd,'name']
        portfolio[stock_basic.at[rd,'ts_code']] = ri['pct_chg'].head(100)
    else:
        plt.scatter(x, y, marker='o', color = 'r')
        plt.annotate(stock_basic.at[rd,'ts_code'], xy = (x, y), xytext = (x+0.005, y+0.005))
    i+=1
plt.legend()
plt.show()

output:
在这里插入图片描述
看一下保存下来股票的相关系数

#Corrilation of the portfolio
plt.figure(figsize=(16, 12))
sns.set_style("whitegrid")
sns.heatmap(portfolio.corr(), cmap='Blues', annot=False, vmin = 0.0, vmax = 1 ,linewidths=1 )

output:
在这里插入图片描述
根据收益率和相关性选取投组,选取规则是:
我们按照收益率对它们进行排序,将它们逐个放入my_portfolio_list,检查最后一只股票和下一只股票之间的相关性,如果它们的相关性大于0.3,那么我们将放弃选取下一个股票,获取下下一支股票,在检查两者相关性,以此类推。

my_portfolio_list = pd.DataFrame()
stock_mean = pd.DataFrame()
corr = portfolio.corr()
for stock in portfolio.columns:
    stock_mean.at[0,stock] = portfolio[stock].mean()
stock_mean = stock_mean.sort_values(0,ascending=False,axis = 1)
number = 0
for stock in stock_mean.columns:
    if number == 0:
        my_portfolio_list.at[0,stock] = stock_mean.at[0,stock]
        number += 1
        last = stock
    else:
        if corr.at[stock,last]>0.3 or stock_mean.at[0,stock]<0:
            pass
        else:
            my_portfolio_list.at[0,stock] = stock_mean.at[0,stock]
            last = stock

my_portfolio_list的输出:
在这里插入图片描述之后,就可以进行简单的投组可视化分析,例如收益率。

plt.figure(figsize=(20, 8))

for p in my_portfolio_list.columns:
    daily_price = pro.daily(ts_code=p , start_date='20190101', end_date='20190901')
    daily_price = daily_price.sort_values('trade_date', ascending=True)
    date = pd.to_datetime(daily_price['trade_date'])
    plt.plot(date , daily_price['pct_chg'],label = p)
plt.legend()
plt.ylabel('Rate of Return')
plt.xlabel('Date')
#pl.xticks(rotation=50)
plt.title('The Rate of Return of My Portfolio', fontsize = 20)

plt.show()

output:
在这里插入图片描述
设置完quantity之后可以看一下market value

quantity = 1000

for p in my_portfolio_list.columns:
    my_portfolio_list.at[1,p] = quantity
    quantity -= 100
my_portfolio_list.index = ['pct_chg','quantity']
my_portfolio_list

output:
在这里插入图片描述

plt.figure(figsize=(10, 6))
for p in my_portfolio_list.columns:
    daily_price = pro.daily(ts_code=p , start_date='20190101', end_date='20190901')
    daily_price = daily_price.sort_values('trade_date', ascending=True)
    plt.plot(pd.to_datetime(daily_price['trade_date']), daily_price['open']*my_portfolio_list.at['quantity',p], label=p)
plt.title('Market Value of My Portfolio', fontsize=20)
plt.ylabel('Market Value')
plt.xlabel('Date')
plt.xticks(rotation=50)
plt.legend()
plt.show()

output:
在这里插入图片描述

Fama French Three-Factor Model

首先还是库的导入。

import tushare
import requests
import pandas as pd
import seaborn as sns
import statsmodels.api as sm
import numpy as np
from statsmodels import regression
import matplotlib.pyplot as plt
import time

pro = tushare.pro_api('')

cal_smb_hml 函数,关于SMB、HML因子的计算。

def cal_smb_hml(stock_data):
    
    stock_data['SB'] = stock_data['circ_mv'].map(lambda x: 'B' if x >= stock_data['circ_mv'].median() else 'S')
    stock_data['BM'] = 1 / stock_data['pb']

    border_down, border_up = stock_data['BM'].quantile([0.3, 0.7])
    stock_data['HML'] = stock_data['BM'].map(lambda x: 'H' if x >= border_up else 'M')
    stock_data['HML'] = stock_data.apply(lambda row: 'L' if row['BM'] <= border_down else row['HML'], axis=1)

    stock_data_SL = stock_data.query('(SB=="S") & (HML=="L")')
    stock_data_SM = stock_data.query('(SB=="S") & (HML=="M")')
    stock_data_SH = stock_data.query('(SB=="S") & (HML=="H")')
    stock_data_BL = stock_data.query('(SB=="B") & (HML=="L")')
    stock_data_BM = stock_data.query('(SB=="B") & (HML=="M")')
    stock_data_BH = stock_data.query('(SB=="B") & (HML=="H")')

    R_SL = (stock_data_SL['pct_chg'] * stock_data_SL['circ_mv'] / 100).sum() / stock_data_SL['circ_mv'].sum()
    R_SM = (stock_data_SM['pct_chg'] * stock_data_SM['circ_mv'] / 100).sum() / stock_data_SM['circ_mv'].sum()
    R_SH = (stock_data_SH['pct_chg'] * stock_data_SH['circ_mv'] / 100).sum() / stock_data_SH['circ_mv'].sum()
    R_BL = (stock_data_BL['pct_chg'] * stock_data_BL['circ_mv'] / 100).sum() / stock_data_BL['circ_mv'].sum()
    R_BM = (stock_data_BM['pct_chg'] * stock_data_BM['circ_mv'] / 100).sum() / stock_data_BM['circ_mv'].sum()
    R_BH = (stock_data_BH['pct_chg'] * stock_data_BH['circ_mv'] / 100).sum() / stock_data_BH['circ_mv'].sum()

    smb = (R_SL + R_SM + R_SH - (R_BL + R_BM + R_BH)) / 3
    hml = (R_SH + R_BH -(R_SL + R_BL) ) / 2
    return smb, hml

get_data函数,获取相关数据。由于tushare权限不高,调用不能过于频繁,只能用try-except续命。

def get_data(st_d, ed_d):
    
    trade_calender = pro.trade_cal(start_date = st_d, end_date = ed_d)
    trade_calender =trade_calender.query('is_open==1')
    
    daily_data = pd.DataFrame()
    basic_data = pd.DataFrame()
    FF_data = pd.DataFrame(index = trade_calender['cal_date'], columns = ['Rm-Rf','SMB','HML','r_mean'])
    rf = 0.03064  # 3.064%
    
    for date in trade_calender['cal_date']:
        try:
            daily_data = pro.daily(trade_date = date)
        except:
            time.sleep(10)
            daily_data = pro.daily(trade_date = date)
        try:
            basic_data = pro.daily_basic(trade_date = date)
        except:
            time.sleep(10)
            basic_data = pro.daily_basic(trade_date = date)
        try:
            gzA = pro.index_daily(ts_code='399317.SZ', trade_date = date)
        except:
            time.sleep(10)
            gzA = pro.index_daily(ts_code='399317.SZ', trade_date = date)
        
        all_data = pd.merge(daily_data, basic_data, on='ts_code', how='inner')
        rm_rf = float((gzA['pct_chg']/100)-rf)
        smb, hml = cal_smb_hml(all_data)
        print(date , rm_rf , smb , hml)
        FF_data.at[date,'SMB'] = smb
        FF_data.at[date,'HML'] = hml
        FF_data.at[date,'Rm-Rf'] = float(rm_rf)
        
        s = requests.session()
        s.keep_alive = False
        
    return FF_data

我们看一下获取近来五个月的数据的output:
在这里插入图片描述
FF_data表:
在这里插入图片描述
然后再导入一下CAPM中选取的投组:
在这里插入图片描述
把FF_data数据补全

for stk in my_portfolio.columns:
    dp = pro.daily(ts_code=stk, start_date='20190401', end_date='20190901')
    dp = dp.sort_values('trade_date', ascending=True)
    dp.index = dp['trade_date']
    FF_data[stk] = dp['pct_chg']/100
FF_data.head()
FF_data['r_mean'] = 0.0

for d in FF_data.index:
    for stk in my_portfolio.columns:
        FF_data.at[d,'r_mean'] += my_portfolio.at[1, stk] /4000 * FF_data.at[d,stk]

output:
在这里插入图片描述
三个因子数据得到之后,同样可以做一些可视化分析。
相关性

sns.heatmap(FF_data.drop('cal_date',axis = 1).corr(), cmap='OrRd', annot=True)

在这里插入图片描述
收益率

import matplotlib.dates as dates
plt.figure(figsize=(24, 7))

for p in FF_data.columns:
    plt.plot(FF_data.index,FF_data[p],label = p)
plt.legend()
plt.ylabel('Rate of Return')
plt.xlabel('Date')
plt.title('The Rate of Return of My Portfolio', fontsize = 20)
plt.xticks(rotation=70)

plt.show()

output:
在这里插入图片描述
由于收益率有些混乱,我们只看一下三因子和投组的收益率。

FF_data = pd.read_csv('FFdata.csv')
FF_data.index = FF_data['cal_date']
FF_data = FF_data.drop('cal_date', axis = 1)
plt.figure(figsize=(24, 7))

for p in FF_data.columns:
    if p == 'r_mean' or p == 'SMB' or p == 'HML' or p == 'Rm-Rf':
        plt.plot(FF_data.index,FF_data[p],label = p)
    else:
        pass
plt.legend()
plt.ylabel('Rate of Return')
plt.xlabel('Date')
plt.title('The Rate of Return of My Portfolio', fontsize = 20)
plt.xticks(rotation=70)
plt.xlim('2019-04-01' , '2019-09-01')

plt.show()

在这里插入图片描述
最后十分重要的回归分析,这里用最小二乘法。

y = np.array((FF_data['r_mean']).astype(float))
y = y[1:len(y)]
Y = y.T
Rm_Rf = np.array(FF_data['Rm-Rf'].astype(float))
Rm_Rf = Rm_Rf[1:len(Rm_Rf)]
SMB = np.array((FF_data['SMB']).astype(float))
SMB = SMB[1:len(FF_data['SMB'])]
HMI = np.array((FF_data['HML']).astype(float))
HMI = HMI[1:len(FF_data['HML'])]
X = np.column_stack((Rm_Rf,SMB,HMI))
X = sm.add_constant(X)
model = sm.OLS(Y, X)
result = model.fit()

result.summary()

output:
在这里插入图片描述
从结果来看,回归效果还不错。

总结

这只是一个非常简单的学习过程,本人不是金融专业,对股票市场还有待进一步了解,如有错误,麻烦大家多多指正。第一次写文章,有很多可以改进的地方,欢迎大家留言讨论。

  • 5
    点赞
  • 50
    收藏
    觉得还不错? 一键收藏
  • 7
    评论
CAPM, or the Capital Asset Pricing Model, is a widely used financial model that quantifies the relationship between an asset's expected return and its risk. It provides a framework for estimating the expected return on an investment by taking into account its beta, which measures the asset's sensitivity to market movements. In MATLAB, you can calculate the CAPM using the following steps: 1. Gather historical data for the asset's returns and a market index (such as S&P 500). 2. Calculate the returns for both the asset and the market index. 3. Estimate the asset's beta by regressing the asset's returns against the market index returns using the "regress" function in MATLAB. 4. Once you have the beta, estimate the asset's expected return using the CAPM formula: Expected Return = Risk-Free Rate + Beta * (Market Return - Risk-Free Rate). Here is some example code in MATLAB to calculate the CAPM: ```matlab % Step 1: Gather historical data assetReturns = [0.05, 0.03, -0.02, 0.04, 0.01]; % Example asset returns marketReturns = [0.06, 0.04, 0.01, 0.02, -0.03]; % Example market returns % Step 2: Calculate returns assetReturns = diff(assetReturns); marketReturns = diff(marketReturns); % Step 3: Estimate beta [beta, ~] = regress(assetReturns', [ones(size(marketReturns')), marketReturns']); % Step 4: Calculate expected return riskFreeRate = 0.02; % Example risk-free rate marketReturn = mean(marketReturns); expectedReturn = riskFreeRate + beta * (marketReturn - riskFreeRate); disp(['Estimated Beta: ', num2str(beta)]); disp(['Expected Return: ', num2str(expectedReturn)]); ``` Note that this is a simplified example, and you may need to adjust it depending on your specific requirements and data availability.

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 7
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值