Python quantitative-finance scraper

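This code uses Python's requests library to fetch a given stock's three major financial statements (balance sheet, income statement, cash flow statement) from Sina Finance, along with data for the Shenwan (SW) level-1 industry indices. Given a stock code and a statement type, stock_financial_report_sina() returns the financial data as a DataFrame; sw_index_data() fetches the data for one SW level-1 industry index, and sw_index_code() lists the level-1 index codes.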

Fetching the three financial statements

from io import BytesIO

import pandas as pd
import requests


def stock_financial_report_sina(
    stock: str = "600004", symbol: str = "现金流量表"
) -> pd.DataFrame:
    """
    Sina Finance - financial statements - the three major statements
    https://vip.stock.finance.sina.com.cn/corp/go.php/vFD_BalanceSheet/stockid/600004/ctrl/part/displaytype/4.phtml
    :param stock: stock code
    :type stock: str
    :param symbol: choice of {"资产负债表", "利润表", "现金流量表"}
        (balance sheet, income statement, cash flow statement)
    :type symbol: str
    :return: the requested statement
    :rtype: pandas.DataFrame
    """
    if symbol == "资产负债表":
        url = f"http://money.finance.sina.com.cn/corp/go.php/vDOWN_BalanceSheet/displaytype/4/stockid/{stock}/ctrl/all.phtml"  # balance sheet
    elif symbol == "利润表":
        url = f"http://money.finance.sina.com.cn/corp/go.php/vDOWN_ProfitStatement/displaytype/4/stockid/{stock}/ctrl/all.phtml"  # income statement
    elif symbol == "现金流量表":
        url = f"http://money.finance.sina.com.cn/corp/go.php/vDOWN_CashFlow/displaytype/4/stockid/{stock}/ctrl/all.phtml"  # cash flow statement
    else:
        # Without this branch an unknown symbol would leave `url` undefined
        raise ValueError(f"unsupported symbol: {symbol!r}")
    r = requests.get(url)
    # Parse the response as GB2312-encoded, tab-separated text and drop the last two columns
    temp_df = pd.read_table(BytesIO(r.content), encoding="gb2312", header=None).iloc[:, :-2]
    # Transpose, then promote the first row (the download's first column) to the header
    temp_df = temp_df.T
    temp_df.columns = temp_df.iloc[0, :]
    temp_df = temp_df.iloc[1:, :]
    temp_df.index.name = None
    temp_df.columns.name = None
    return temp_df
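
The reshaping step at the end of the function can be tried without any network access. The sketch below is only an illustration: the payload, its figures, and the helper name `reshape_report` are made up, and it assumes the download is laid out the way the function treats it (one line item per row, item names in the first column, two trailing columns to discard).

```python
from io import BytesIO

import pandas as pd

# Hypothetical payload with invented figures; the real download may differ.
sample_text = (
    "报表日期\t20221231\t20211231\tx\ty\n"
    "货币资金\t100\t90\tx\ty\n"
    "应收账款\t50\t40\tx\ty\n"
)
raw = sample_text.encode("gb2312")


def reshape_report(content: bytes) -> pd.DataFrame:
    """Apply the same parse-and-transpose steps as the function above."""
    df = pd.read_table(BytesIO(content), encoding="gb2312", header=None).iloc[:, :-2]
    df = df.T                      # line items become columns
    df.columns = df.iloc[0, :]     # first row holds the item names
    df = df.iloc[1:, :]            # remaining rows are the data
    df.index.name = None
    df.columns.name = None
    return df


report = reshape_report(raw)
print(report)
```

After the transpose each remaining row corresponds to one original column of the download, which is why the item names end up as the DataFrame's column labels.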

Shenwan (SW) level-1 industry data

import json
from io import BytesIO

import pandas as pd
import requests


def sw_index_data(sw_code="801010"):
    url = "https://www.swsresearch.com/insWechatSw/swIndex/quotationexportExc"
    # Use the requested index code instead of a hard-coded one
    params = {"indexCode": sw_code}
    # Content-Length is left out on purpose: requests computes it from the body
    headers = {
        "Content-Type": "application/x-www-form-urlencoded;charset=utf-8",
        "Cookie": "i18next=zh-CN",
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate, br",
        "Accept-Language": "zh-CN,zh;q=0.9",
        "Clienttype": "4",
        "Connection": "keep-alive",
        "Host": "www.swsresearch.com",
        "Origin": "https://www.swsresearch.com",
        "Sec-Ch-Ua-Platform": "Windows",
        "Sec-Fetch-Mode": "cors",
        "Sec-Fetch-Site": "same-origin",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36"
    }
    # The body is sent as a JSON string
    response = requests.post(url, data=json.dumps(params), headers=headers)
    # Parse the response body as an Excel workbook
    df = pd.read_excel(BytesIO(response.content))
    return df
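
When a body is posted through requests, the Content-Length header is derived from that body, so it does not have to be written by hand. The following sketch checks this offline by preparing the request without sending it; it reuses the URL and payload from the function above and is an illustration only, not part of the original code.

```python
import json

import requests

url = "https://www.swsresearch.com/insWechatSw/swIndex/quotationexportExc"
body = json.dumps({"indexCode": "801010"})

# Build the request without sending it, so nothing touches the network.
prepared = requests.Request(
    "POST",
    url,
    data=body,
    headers={"Content-Type": "application/x-www-form-urlencoded;charset=utf-8"},
).prepare()

print(prepared.body)
print(prepared.headers["Content-Length"], len(body))
```

Inspecting a prepared request this way is also a convenient way to see exactly which bytes and headers would go out before pointing the code at the live endpoint.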


def sw_index_code():
    url = "https://www.swsresearch.com/institute-sw/api/index_publish/current/"
    # "一级行业" means "level-1 industry"
    params = {"page": "1", "page_size": "50", "indextype": "一级行业"}
    response = requests.get(url, params=params)
    results = response.json()
    # Flatten data["results"] into rows and carry the paging fields along as columns
    df = pd.json_normalize(results["data"], record_path=["results"], meta=["count", "next", "previous"])
    return df
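
To see what the `pd.json_normalize` call produces, here is a minimal offline sketch. The payload is fabricated: the nesting (`data` containing `count`, `next`, `previous`, `results`) follows the call in `sw_index_code()`, but the record field names `swindexcode` and `swindexname` and the two rows are assumptions chosen for illustration.

```python
import pandas as pd

# Fabricated response; field names inside "results" are hypothetical.
payload = {
    "data": {
        "count": 2,
        "next": None,
        "previous": None,
        "results": [
            {"swindexcode": "801010", "swindexname": "农林牧渔"},
            {"swindexcode": "801030", "swindexname": "基础化工"},
        ],
    }
}

# One row per entry in "results"; each meta field is repeated on every row.
codes = pd.json_normalize(
    payload["data"],
    record_path=["results"],
    meta=["count", "next", "previous"],
)
print(codes)
```

If the live response has a comparable code column, its values could be passed one at a time to `sw_index_data()` to download each industry's data.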