python_股票增加控制人与流通股东等筛选条件

本文介绍了如何使用Python编程语言,结合优矿和巨潮数据,以及akshare库,分别获取并合并证券行业的行业数据、控制人数据和十大流通股东数据。通过编写函数和利用pandas库进行数据处理,将不同来源的字段整合到单一表格中,便于后续的数据分析和处理。
摘要由CSDN通过智能技术生成

目录

写字前面:

结果展示

获取数据

行业数据

控制人数据

 十大流通股东数据

 开始合并

1 从行业数据中提取证券股的行业数据

2 合并控制人数据 

3 合并十大流通股东 

4 把三个结果按列合并


写字前面:

在分析数据的时候,常常需要的字段分散在多个文档里,这时候就需要把这些字段合并到一起方便查看和分析。本文以证券板块的股票为例,整合行业、控制人、十大流通股东这三项数据。

结果展示

获取数据

行业数据

数据来源优矿【具体优矿的使用过程看“python_寻找底部股票”博文】

下载的文件存储为industry_list.csv

控制人数据

 巨潮-》数据-》专题统计-》股本股东-》实际控制人持股变动

可以直接下载

注意:好像非交易日没法打开,那我这就不截网站的图了。

下图是下载后的数据:

 十大流通股东数据

通过akshare获取

akshare文档中  Welcome to AKShare’s Online Documentation! — AKShare 1.10.57 文档

有股票基本面数据的接口

路径:数据字典-》股票数据-》基本面数据-》流通股东

 注意:最好使用python3.8或python3.8以上版本安装akshare包执行接口代码

    import akshare as ak
    pre_dir = r'D:/temp001/'
    code_list = ['000166','000686','000712','000728','000750','000776','000783','002500','002670','002673','002736','002797','002926','002939','002945','300059','600030','600061','600095','600109','600155','600369','600621','600837','600864','600906','600909','600918','600858','600999','601059','601066','601099','601108','601136','601162','601198','601211','601236','601375','601377','601456','601555','601688','601696','601788','601878','601881','601901','601990','601995']
    for code in code_list:
        # 流通股东
        try:
            df = ak.stock_circulate_stock_holder(symbol=code)
            df.to_excel(f'{pre_dir}流通股东_{code}.xlsx', engine='openpyxl')
        except Exception as e:
            print(str(e))
            print(f'{code} 失败')
            pass
        time.sleep(5)

 开始合并

已知:证券股代码列表

1 从行业数据中提取证券股的行业数据

def method_001():
    res_dir = r'D:/temp000/'
    def pd_000(x):
        sec_str = x['secID']
        res_str = sec_str.split('.')[0]
        return res_str
    industry_path = r'D:/temp000/industry_list.csv'
    code_list = ['000166','000686','000712','000728','000750','000776','000783','002500','002670','002673','002736','002797','002926','002939','002945','300059','600030','600061','600095','600109','600155','600369','600621','600837','600864','600906','600909','600918','600858','600999','601059','601066','601099','601108','601136','601162','601198','601211','601236','601375','601377','601456','601555','601688','601696','601788','601878','601881','601901','601990','601995']
    df = pd.read_csv(industry_path,encoding='utf-8')
    df = df.loc[df['isNew']==1].copy()
    df['ticker00'] = df.apply(pd_000,axis=1)
    df00 = df.loc[df['ticker00'].isin(code_list)].copy()
    df00 = df00.loc[:,['ticker00','secShortName','industryName1','industryName2','industryName3']]
    df00.to_excel(res_dir+'industry_000.xlsx',engine='openpyxl')
    pass

执行后结果

2 合并控制人数据 

def method_002():
    res_dir = r'D:/temp003/'
    code_list = ['000166', '000686', '000712', '000728', '000750', '000776', '000783', '002500', '002670', '002673',
                 '002736', '002797', '002926', '002939', '002945', '300059', '600030', '600061', '600095', '600109',
                 '600155', '600369', '600621', '600837', '600864', '600906', '600909', '600918', '600858', '600999',
                 '601059', '601066', '601099', '601108', '601136', '601162', '601198', '601211', '601236', '601375',
                 '601377', '601456', '601555', '601688', '601696', '601788', '601878', '601881', '601901', '601990',
                 '601995']

    control_path = r'D:/temp000/专题统计_实际控制人持股变动.xlsx'
    df = pd.read_excel(control_path,engine='openpyxl')
    df = df.loc[:,['证券代码','实际控制人名称','控股数量(万股)','控股比例(%)','直接控制人名称','控制类型']].copy()
    df = df.loc[df['证券代码'].isin(code_list)].copy()
    df.to_excel(res_dir+'control_000.xlsx',engine='openpyxl')
    pass

执行后结果

3 合并十大流通股东 

先看下每个股票十大流通股东的数据结构

它是每季度十行数据,只取最新一季度的十大流通股东,把十行的股东名称转成一个字符串,彼此之间用分号隔开

def method_003():
    code_list = ['000166', '000686', '000712', '000728', '000750', '000776', '000783', '002500', '002670', '002673',
                 '002736', '002797', '002926', '002939', '002945', '300059', '600030', '600061', '600095', '600109',
                 '600155', '600369', '600621', '600837', '600864', '600906', '600909', '600918', '600858', '600999',
                 '601059', '601066', '601099', '601108', '601136', '601162', '601198', '601211', '601236', '601375',
                 '601377', '601456', '601555', '601688', '601696', '601788', '601878', '601881', '601901', '601990',
                 '601995']
    # pre_dir 十大流通股东文件存储的文件夹
    pre_dir = r'D:/temp000/000/'
    results_list = []
    for code in code_list:
        print(code)
        file_name = f"流通股东_{code}.xlsx"
        file_path = pre_dir + file_name

        df = pd.read_excel(file_path,engine='openpyxl')
        # 取最新的一次十大流通股东
        df = df.iloc[0:10].copy()
        holders = df['股东名称'].values.tolist()
        holders_str = ';'.join(holders)
        date = df.iloc[0]['截止日期']
        results_list.append(
            {
                'ticker':code,
                'date':date,
                'circulation_holders':holders_str
            }
        )
        pass
    res_df = pd.DataFrame(results_list)
    res_df.to_excel('D:/temp003/holders_000.xlsx',engine='openpyxl')
    pass

 执行结果

4 把三个结果按列合并

def method_004():
    one_path = r'D:/temp003/industry_000.xlsx'
    two_path = r'D:/temp003/control_000.xlsx'
    three_path = r'D:/temp003/holders_000.xlsx'
    df_one = pd.read_excel(one_path,engine='openpyxl',converters={'ticker00':str})
    df_two = pd.read_excel(two_path,engine='openpyxl',converters={'证券代码':str})
    df_three = pd.read_excel(three_path,engine='openpyxl',converters={'ticker':str})
    df00 = pd.merge(df_one,df_two,how='inner',left_on='ticker00',right_on='证券代码')
    df01 = pd.merge(df00,df_three,how='inner',left_on='ticker00',right_on='ticker')
    df01.to_excel(r'D:/temp003/res_000.xlsx',engine='openpyxl')
    pass

 最终结果

把数据都合并到一个表后,就可以进行下一步的分析了。 

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值