[Python] Finding the best fund managers by analyzing CSMAR data

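I recently used fund managers' historical track records to look for the best managers. The idea is to fit a four-factor model to the monthly returns of each fund a manager has run, use the fitted results to pick out the high-quality funds, and treat the managers who ran more of them as the better ones. In the code this comes down to ranking managers by the average intercept (alpha) of their funds. I'm sharing the code here; it is quite short: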
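For reference, the regression that the code below fits for each manager and fund can be written out as follows. The symbols are shorthand introduced here for the CSMAR columns the code uses: R is `ReturnNAV`, MKT is `RiskPremium2`, and SMB, HML, UMD are `SMB2`, `HML2`, `UMD2`. Note that the formula in the code also includes the squared market premium (`RiskPremium2_r2`), so it is the four-factor model plus one extra term.

```latex
R_{i,t} = \alpha_i
        + \beta_{1,i}\,\mathrm{MKT}_t
        + \beta_{2,i}\,\mathrm{MKT}_t^{2}
        + s_i\,\mathrm{SMB}_t
        + h_i\,\mathrm{HML}_t
        + u_i\,\mathrm{UMD}_t
        + \varepsilon_{i,t}
```

The intercept \alpha_i is the quantity that gets averaged per manager and ranked at the end of the script.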

# imports
import pandas as pd
import statsmodels.formula.api as smf
from tqdm import tqdm 

data_path = 'found_manager_data/Fund_FundManager.csv'
data_X = pd.read_csv(data_path)
data_X.head()

data_X=data_X.dropna()

def get_status(x):
    # Tenure in calendar years: end year minus start year
    x1 = x['ServiceStartDate'].split('-')[0]
    x2 = x['ServiceEndDate'].split('-')[0]
    return int(x2) - int(x1)

data_X['status'] = data_X.apply(get_status, axis=1)

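# Keep managers whose tenure spans at least 2 calendar years.
# .copy() avoids SettingWithCopyWarning when columns are added below.
fund_person = data_X[data_X['status']>=2].copy()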
fund_person.head()

# Year-month strings (YYYY-MM), obtained by dropping the trailing "-DD"
fund_person['StartDate'] = fund_person['ServiceStartDate'].str[:-3]
fund_person['EndDate'] = fund_person['ServiceEndDate'].str[:-3]
fund_person.head()

fund_person['ServiceStartDate'] = pd.to_datetime(fund_person['ServiceStartDate'])
fund_person['ServiceEndDate'] = pd.to_datetime(fund_person['ServiceEndDate'])

data_path = 'found_manager_data/STK_MKT_CARHARTFOURFACTORS.xlsx'
data_X = pd.read_excel(data_path)
data_X.head()

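# Keep one market type; .copy() so new columns are not written to a view
P9706 = data_X[data_X['MarkettypeID']=='P9706'].copy()
# Squared market premium, used as an extra regressor
P9706["RiskPremium2_r2"] = P9706["RiskPremium2"] ** 2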
print(P9706.shape)
P9706.head()

P9706['TradingMonth']=pd.to_datetime(P9706['TradingMonth'])
Fund_NAV_Month_path = 'found_manager_data/Fund_NAV_Month.csv'

Fund_NAV_Month_data = pd.read_csv(Fund_NAV_Month_path)
Fund_NAV_Month_data.head()


list_results = []
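# iterrows() has no length, so pass total= to give tqdm a proper progress bar
for index, row in tqdm(fund_person.iterrows(), total=len(fund_person)):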
    start_time = row['ServiceStartDate']
    end_time = row['ServiceEndDate']
    fund_code = row['MasterFundCode']
    # Copy the slice so the date conversion below does not write to a view
    fund_data = Fund_NAV_Month_data[Fund_NAV_Month_data['Symbol']==fund_code].copy()
    fund_data['TradingMonth'] = pd.to_datetime(fund_data['TradingMonth'])

    # Keep only the months strictly inside the manager's tenure
    filter_data = fund_data[(fund_data['TradingMonth']>start_time)&(fund_data['TradingMonth']<end_time)]

    # The inner join on TradingMonth restricts the factor data to the same months,
    # so P9706 does not need to be filtered separately
    pdata_merge = pd.merge(P9706, filter_data, on='TradingMonth')
    if pdata_merge.empty:
        continue
    # Fit the regression: the four factors plus the squared market premium
    train_data = pdata_merge[['RiskPremium2','RiskPremium2_r2','SMB2','HML2','UMD2','ReturnNAV']]
    train_data = train_data.astype(float)
    model = smf.ols(formula='ReturnNAV ~ RiskPremium2+RiskPremium2_r2+SMB2+HML2+UMD2', data=train_data).fit()
    # params order: Intercept, RiskPremium2, RiskPremium2_r2, SMB2, HML2, UMD2
    parameters = model.params.tolist()
    list_results.append([row['FullName'], row['MasterFundCode']] + parameters)


results= pd.DataFrame(list_results,columns=['fullName','fund_code','Intercept','RiskPremium2','RiskPremium2_r2','SMB2','HML2','UMD2'])
results.to_csv('parctice_two_results.csv',index=False)

# Average the intercepts (alphas) of all the funds each manager has run
mean_res = results.groupby('fullName')['Intercept'].mean().reset_index()
mean_res.columns = ['full_name', 'mean_intercept']

# Get the top 50 managers, sorted by mean intercept
topk_fund = mean_res.sort_values(by="mean_intercept",ascending=False).head(50)
print(topk_fund.head())
topk_fund.to_csv('top50_fund_manager.csv',index=False)
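To check that the intercept really picks up what the ranking relies on, here is a minimal sketch on synthetic data. Everything in it is made up for illustration: the managers `manager_A` and `manager_B`, the fund codes `F1` to `F3`, the factor values, and the "true" alphas. Only the column names and the regression formula are taken from the script above.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_months = 120

# Synthetic monthly factors; column names mirror the CSMAR columns used above
factors = pd.DataFrame(
    rng.normal(0.0, 0.05, size=(n_months, 4)),
    columns=["RiskPremium2", "SMB2", "HML2", "UMD2"],
)
factors["RiskPremium2_r2"] = factors["RiskPremium2"] ** 2


def simulate_fund(alpha):
    """Fund returns with a known alpha, market beta 1.0 and small noise."""
    noise = rng.normal(0.0, 0.001, size=n_months)
    return alpha + 1.0 * factors["RiskPremium2"] + 0.2 * factors["SMB2"] + noise


# Hypothetical (manager, fund) pairs with the alpha built into each fund
true_alpha = {
    ("manager_A", "F1"): 0.004,
    ("manager_A", "F2"): 0.002,
    ("manager_B", "F3"): 0.0,
}

rows = []
for (manager, fund), alpha in true_alpha.items():
    data = factors.assign(ReturnNAV=simulate_fund(alpha))
    fit = smf.ols(
        "ReturnNAV ~ RiskPremium2 + RiskPremium2_r2 + SMB2 + HML2 + UMD2",
        data=data,
    ).fit()
    rows.append([manager, fund, fit.params["Intercept"]])

est = pd.DataFrame(rows, columns=["fullName", "fund_code", "Intercept"])

# Same aggregation as the script: mean intercept per manager, best first
ranking = est.groupby("fullName")["Intercept"].mean().sort_values(ascending=False)
print(ranking)
```

With this setup the estimated mean intercepts should land close to 0.003 for `manager_A` and 0 for `manager_B`. Real fund data is much noisier and tenures are much shorter than 120 months, so the estimates there will be far less precise.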

Some of you may be wondering what the data looks like. Here are a few sample rows; you will need to get the full data yourself.

Sample rows from Fund_FundManager.csv:

"FundID","MasterFundCode","FullName","ServiceStartDate","ServiceEndDate"
"108426","000001","孙振峰","2012-04-05","2013-06-28"
"108426","000001","倪邈","2014-03-17","2015-11-19"
"108426","000001","李铧汶","2014-03-17","2017-01-13"
"108426","000001","崔同魁","2014-06-20","2015-01-07"
"108426","000001","董阳阳","2015-01-07","2021-02-22"

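Sample rows from STK_MKT_CARHARTFOURFACTORS.xlsx:

(The original post showed this table as a screenshot. The code uses its MarkettypeID, TradingMonth, RiskPremium2, SMB2, HML2 and UMD2 columns.)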

Sample rows from Fund_NAV_Month.csv:

"TradingDate","TradingMonth","Symbol","ReturnNAV"
"2012-01-31","2012-01","000001","-0.0056"
"2012-02-29","2012-02","000001","0.0617"
"2012-03-31","2012-03","000001","-0.0444"
"2012-04-27","2012-04","000001","0.0454"
"2012-05-31","2012-05","000001","0.0254"
"2012-06-30","2012-06","000001","-0.0268"

All of the data was downloaded from CSMAR.
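To make the filtering steps concrete, here is a small sketch that runs them on just the sample rows above. Two things in it are assumptions of the sketch rather than part of the script: it reads the fund codes with `dtype=str` so that leading zeros survive (the script reads both files with pandas' default dtypes), and the helper `months_in_window` is a hypothetical name introduced here.

```python
import io
import pandas as pd

manager_csv = '''"FundID","MasterFundCode","FullName","ServiceStartDate","ServiceEndDate"
"108426","000001","孙振峰","2012-04-05","2013-06-28"
"108426","000001","倪邈","2014-03-17","2015-11-19"
"108426","000001","李铧汶","2014-03-17","2017-01-13"
"108426","000001","崔同魁","2014-06-20","2015-01-07"
"108426","000001","董阳阳","2015-01-07","2021-02-22"
'''

nav_csv = '''"TradingDate","TradingMonth","Symbol","ReturnNAV"
"2012-01-31","2012-01","000001","-0.0056"
"2012-02-29","2012-02","000001","0.0617"
"2012-03-31","2012-03","000001","-0.0444"
"2012-04-27","2012-04","000001","0.0454"
"2012-05-31","2012-05","000001","0.0254"
"2012-06-30","2012-06","000001","-0.0268"
'''

managers = pd.read_csv(
    io.StringIO(manager_csv),
    dtype={"FundID": str, "MasterFundCode": str},
    parse_dates=["ServiceStartDate", "ServiceEndDate"],
)
nav = pd.read_csv(io.StringIO(nav_csv), dtype={"Symbol": str})
# "2012-04" becomes 2012-04-01, the first day of the month
nav["TradingMonth"] = pd.to_datetime(nav["TradingMonth"], format="%Y-%m")

# Same tenure measure as get_status: end year minus start year
managers["status"] = (
    managers["ServiceEndDate"].dt.year - managers["ServiceStartDate"].dt.year
)


def months_in_window(row):
    """Count NAV months strictly inside the manager's tenure for that fund."""
    fund = nav[nav["Symbol"] == row["MasterFundCode"]]
    inside = (fund["TradingMonth"] > row["ServiceStartDate"]) & (
        fund["TradingMonth"] < row["ServiceEndDate"]
    )
    return int(inside.sum())


managers["n_months"] = managers.apply(months_in_window, axis=1)
print(managers[["FullName", "status", "n_months"]])
```

On these rows only 孙振峰 has NAV months inside his tenure (May and June 2012; April is excluded because 2012-04-01 falls before his start date of 2012-04-05), but his `status` is 1, so the `status >= 2` filter drops him. The two managers who pass the filter have no overlapping NAV rows in the sample, which is the empty-merge case that the `continue` in the loop skips.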
