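1. Goal
From CFI (中财网, https://www.cfi.cn/), fetch the largest shareholder's holding percentage for a given listed stock and a given year, as shown in the figure below: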
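- Analyze the XHR request
Inspect the payload:
The request takes three parameters. Two of them, `contenttype` and `jzrq`, are trivial; the one that needs work is `stockid`.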
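To make the payload concrete, here is a minimal sketch of how the three parameters end up in the request URL. The endpoint and the `gdtj` value are taken from the script later in this post; the helper name `build_quote_url` is made up for illustration, the stockid `"7"` is only a placeholder, and reading `jzrq` as the report date is an assumption based on the value `2020-06-30` used in the script.

```python
from urllib.parse import urlencode

BASE_URL = "https://quote.cfi.cn/quote.aspx"  # endpoint used by the script in this post


def build_quote_url(stockid, jzrq):
    """Build the GET URL that carries the three payload parameters."""
    params = {
        "stockid": stockid,     # the site's internal id, not the six-digit code
        "contenttype": "gdtj",  # fixed value, as sent by the page
        "jzrq": jzrq,           # assumed to be the report date, e.g. "2020-06-30"
    }
    return BASE_URL + "?" + urlencode(params)


# "7" is a placeholder stockid used only to show the URL shape.
print(build_quote_url("7", "2020-06-30"))
```

Passing the same dictionary to `requests.get(..., params=params)` produces the same query string, which is what the script does.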
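Why isn't it the familiar six-digit stock code?
On the site, the page that lists stock codes looks like this:
The stockid for each stock code can be found in that page's source.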
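The lookup can be tried offline first. The sketch below assumes the page source contains attributes of the form `onclick="stock_clickFunc(<stockid>,'<code>')"`, which is the shape the regex in the script later in this post searches for. The HTML fragment, the ids `101` and `102`, and the helper name `build_stockid_map` are all made up for illustration.

```python
import re

# Made-up fragment in the shape the lookup regex expects; the ids are placeholders.
SAMPLE_HTML = """
<a onclick="stock_clickFunc(101,'000001')">000001</a>
<a onclick="stock_clickFunc(102,'000002')">000002</a>
"""

# Group 1 is the site's stockid, group 2 is the six-digit stock code.
PATTERN = re.compile(r"onclick=\"stock_clickFunc\((\d+),'(\d{6})'\)")


def build_stockid_map(html):
    """Map each six-digit stock code to the stockid found next to it."""
    return {code: stockid for stockid, code in PATTERN.findall(html)}


mapping = build_stockid_map(SAMPLE_HTML)
print(mapping["000002"])
```

Building the whole mapping once means the list page only has to be downloaded a single time, however many stock codes are looked up afterwards.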
- Convert the request into Python code
import re

import requests

# Request headers copied from the browser.
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8',
    'Connection': 'keep-alive',
    'Referer': 'https://quote.cfi.cn/quote.aspx?actstockid=7&actcontenttype=gdtj&client=pc&searchcode=',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'same-origin',
    'Sec-Fetch-User': '?1',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Safari/537.36',
    'sec-ch-ua': '"Google Chrome";v="111", "Not(A:Brand";v="8", "Chromium";v="111"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
}

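def getTable(stockid, jzrq):
    """Fetch the page holding the shareholder table for one stockid and date."""
    params = {
        'stockid': stockid,
        'contenttype': 'gdtj',
        'jzrq': jzrq,
    }
    # A timeout keeps the script from hanging if the site does not respond.
    response = requests.get('https://quote.cfi.cn/quote.aspx', params=params, headers=headers, timeout=10)
    return response.text
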
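def reg_find(text):
    """
    Return the first percentage found in the page, matched from a fragment like:
    </td><td>23.67%</td><td>
    """
    # [\d.]+ matches digits and dots; the original class [\d|\.] also allowed a literal "|".
    anss = re.findall(r'</td><td>([\d.]+)%</td><td>', text)
    if len(anss) == 0:
        # Raise instead of exit(0), which would report success to the shell.
        raise ValueError("no percentage found in the page")
    return anss[0]
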
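def id2stkid(uid):
    """Look up the site's internal stockid for a six-digit stock code."""
    params = {
        't': '12',
    }
    response = requests.get('https://quote.cfi.cn/stockList.aspx', params=params, headers=headers)
    # The pasted code had a Greek omicron in "onclick", so the pattern could never match.
    ans = re.findall(rf"onclick=\"stock_clickFunc\((\d+),\'{uid}\'\)", response.text)
    # findall returns a list; hand back the first match, or None if the code is not listed.
    return ans[0] if ans else None
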
if __name__ == "__main__":
    codes = ['000001', '000002', '000008']
    for i in codes:
        ncode = id2stkid(i)
        text = getTable(ncode, '2020-06-30')
        ans = reg_find(text)
        print(ans)
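The percentage extraction in `reg_find` can also be checked without hitting the site. This is only a sketch: the table row below is hypothetical apart from the `</td><td>23.67%</td><td>` fragment quoted in the docstring, and `first_percentage` is an illustrative variant that returns a float (or `None`) rather than the script's string.

```python
import re

# Hypothetical table row; only the "</td><td>23.67%</td><td>" part comes from the post.
SAMPLE_ROW = "<tr><td>1</td><td>Example Holder</td><td>23.67%</td><td>-</td></tr>"

# One or more digits, optionally followed by a decimal part, then a percent sign.
PERCENT = re.compile(r"</td><td>(\d+(?:\.\d+)?)%</td><td>")


def first_percentage(html):
    """Return the first percentage cell as a float, or None if there is none."""
    match = PERCENT.search(html)
    return float(match.group(1)) if match else None


print(first_percentage(SAMPLE_ROW))
print(first_percentage("<td>no data</td>"))
```

Returning `None` for a page with no match lets the calling loop skip that stock and continue, instead of stopping the whole run at the first stock without data.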
- Screenshot of the run