虽然其实指数的一些股指指标个人还是觉得用投资组合的计算方式更好,比如一个index pe 那最好是用 各成分的earning_yield * weighting再倒数,这么着搞一个调和平均数(harmonic mean)去得到。但是针对国内指数的日权重可没那么容易搞到(主要中证的tradable share ratio)其实不算公开数据(虽然说明书里说是通过公开数据得到)。暴力求一个也是可以的。
下面放一个pe_ttm的计算吧。其实主要想放的还是一些手法备用…比如抓到Mysql数据之后,如何搞成dataFrame然后进行进一步运算这些个。对对,还是算备忘,个人还是喜欢做一些常用包或者软件备忘,这么着记得牢些,做新东西速度也快些。
大概思路:
1. 拿到每日成分。
2. 拿每日市值,木有市值的股票整个踢掉。*
3. 从*的单子里拿各股票pe_ttm: pe_ttm: 的计算对应 母公司归属净利润ttm, 取最近季度。如果trading_day当天 - 最近季度日期 > 366则这个元素在分子分母都剔除。
4. 得到最后的股票单子做个指数pe暴力计算。
各种库的到位
import rqdatac
from rqdatac import *
rqdatac.init("research", "research", ('q.r.com', 16003))
import os
from urllib.parse import urlparse
import yaml
from pymysql import connect,cursors
import pandas as pd
import numpy as np
def get_mysql_connection(url):
results = urlparse(url)
host = results.hostname
port = results.port
user = results.username
pwd = results.password
database = results.path[1:]
return connect(host=host, port=port, user=user, password=pwd, db=database,charset='utf8',cursorclass=cursors.DictCursor)
dest_url_day = "mysql+pymysql://rrrrr:rrrr@192.168.0.12:3306/fd_test"
dest_url_quarter = "mysql+pymysql://rrrrr:rrrr@192.168.0.14:3306/financials"
dest_conn_day = get_mysql_connection(dest_url_day)
dest_conn_quarter = get_mysql_connection(dest_url_quarter)
接下来是要用的指数:
testing_date='20160104'
lasting_date='20140104'
testing_year='20151231'
lasting_year='20141231'
index_50 = tuple(index_components('000016.XSHG',date=testing_date))
index_300 = tuple(index_components('000300.XSHG',date=testing_date))
submission_one = index_300
最后开算。其实说白了还是sql写到位。不过看单位的技术,是喜欢抓出来之后用Python筛。
觉得我这个好,嘿嘿,省内存。
#pe_ttm part:
sql_cap = '''select STOCKCODE,TRADEDATE,market_cap from day
where STOCKCODE in {} and TRADEDATE = {} '''.format(submission_one,testing_date) #list index_50 needs to get changed to tuple
text_market_cap = pd.DataFrame()
df = pd.read_sql(sql_cap, dest_conn_day)
market_cap_list = text_market_cap.append(df)
market_cap_list = market_cap_list.dropna(axis=0,how='any') #remove NaN marketCap and get the list_1
list_1 = tuple(market_cap_list.STOCKCODE)
sql_pettm = '''select a.STOCKCODE,a.END_DATE,a.np_parent_company_ownersTTM from research_quarter a
inner join
(
select b.STOCKCODE,max(b.END_DATE) as Ending from research_quarter b
where b.STOCKCODE in {} and b.END_DATE between {} and {} and (b.ANNOUNCE_DATE < {} or b.ANNOUNCE_DATE is null)
group by b.STOCKCODE
) as d
on a.STOCKCODE = d.STOCKCODE and a.END_DATE = d.Ending'''.format(list_1,lasting_date,testing_date,testing_date)
text_pettm = pd.DataFrame()
df = pd.read_sql(sql_pettm, dest_conn_quarter)
pettm_list = text_pettm.append(df)
pettm_list = pettm_list.dropna(axis=0,how='any') #remove NaN marketCap and get the list_2
list_final_for_marketCap = tuple(pettm_list.STOCKCODE)
#market_cap.market_cap.sum()
sql_cap = '''select STOCKCODE,TRADEDATE,market_cap from day
where STOCKCODE in {} and TRADEDATE = {} '''.format(list_final_for_marketCap,testing_date) #list index_50 needs to get changed to tuple
text_market_cap = pd.DataFrame()
df = pd.read_sql(sql_cap, dest_conn_day)
market_cap_final = text_market_cap.append(df)
market_cap_final = market_cap_final.dropna(axis=0,how='any') #remove NaN marketCap and get the list_1
if len(market_cap_final.STOCKCODE) / len(submission_one) >= 0.8:
print(testing_date,"pe_ttm",market_cap_final.market_cap.sum()/pettm_list.np_parent_company_ownersTTM.sum())