Python: how to download data from a web page whose download button hides the link?

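Use Python to automate downloading a CSV file from a web page whose download link is hidden. Inspect the browser's Network panel to find the URL and form data of the POST request, then use the requests library to send the same request and download the data. Parameters such as the date can be customized to fetch different data.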

When I click the button shown below, I get a .csv file:

I want to do this automatically with Python, so that I can specify the date and other parameters.

I found here that pandas' pd.read_csv can read data from a web page, but first one needs the right URL. In my case, I don't know what the URL is.

I also want to specify the date, the contract, and so on myself.

Before asking, I tried the browser's dev tools, but I still can't see the URL, and I don't know how to do this programmatically.

Solution

The JavaScript call exportData('excel') submits a form. Using the Network panel in Chrome DevTools, you can find the headers and the POST data that were sent, and then write a Python script that submits an identical HTTP request.
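One way to get from the Network panel to a script is to copy the raw form payload of the captured request and turn it into a dictionary. The sketch below assumes the payload is ordinary URL-encoded form data. The raw_payload string is a hypothetical example assembled from the field names used in this answer, not an actual capture from the site, and payload_to_formdata is an illustrative helper name.

```python
from urllib.parse import parse_qsl

# Hypothetical payload, as it might appear under "view source" in the
# Form Data section of the Network panel.
raw_payload = (
    "memberDealPosiQuotes.variety=a"
    "&memberDealPosiQuotes.trade_type=0"
    "&contract.contract_id=all"
    "&contract.variety_id=a"
    "&exportFlag=excel"
)


def payload_to_formdata(payload):
    """Convert a URL-encoded form payload into a dict for requests.post(data=...)."""
    # keep_blank_values=True preserves fields that were submitted empty.
    return dict(parse_qsl(payload, keep_blank_values=True))


formdata = payload_to_formdata(raw_payload)
```

All values come back as strings, which is fine for a form post. If a form repeats the same field name several times, a plain dict keeps only the last value, so in that case the list of pairs returned by parse_qsl is the safer thing to pass along.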

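import requests

# Endpoint the export form posts to (taken from the Network panel)
url = 'http://www.dce.com.cn/publicweb/quotesdata/exportMemberDealPosiQuotesData.html'

# Form fields copied from the POST request captured in devtools
formdata = {
    'memberDealPosiQuotes.variety': 'a',
    'memberDealPosiQuotes.trade_type': 0,
    'contract.contract_id': 'all',
    'contract.variety_id': 'a',
    'exportFlag': 'excel',
}

response = requests.post(url, data=formdata)
response.raise_for_status()  # stop early on an HTTP error

# Take the file name from the Content-Disposition header if the server sent one;
# 'export.csv' is an arbitrary fallback name
disposition = response.headers.get('Content-Disposition', '')
filename = disposition.split('=')[-1].strip('"') or 'export.csv'

with open(filename, 'wb') as fp:
    fp.write(response.content)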
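Splitting the Content-Disposition header on '=' is a quick heuristic. A more defensive variant, sketched here with the standard library only, parses the header parameters and falls back to a default name when the header is absent. The helper name filename_from_disposition and the default 'export.csv' are assumptions made for illustration, and how the server actually formats this header has not been verified.

```python
import os
from email.message import Message


def filename_from_disposition(header_value, default="export.csv"):
    """Extract the file name from a Content-Disposition header value."""
    if not header_value:
        return default
    msg = Message()
    msg["Content-Disposition"] = header_value
    name = msg.get_filename()  # handles quoted and unquoted filename= values
    # basename() drops any directory part the server might have included.
    return os.path.basename(name) if name else default
```

In the script above it would be called as filename_from_disposition(response.headers.get('Content-Disposition')).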

It is probably possible to modify the POST data to fetch different data, whether by reverse engineering, by trial and error, or by finding some documentation.

For example, you can include fields for year and date:

'year': 2017,
'month': 3,
'day': 20
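To choose the date and the variety from code, the form fields can be built by a small function. This is a minimal sketch: build_formdata is a hypothetical helper, it assumes both variety fields take the same value (as they do in the answer), and it assumes the month is sent as a plain calendar number. The source does not say whether the server counts months from 0 or from 1, so compare the result with the request shown in the Network panel before relying on it.

```python
from datetime import date


def build_formdata(variety, trade_date, contract_id="all", trade_type=0):
    """Build the POST fields for one variety and one trading day."""
    return {
        "memberDealPosiQuotes.variety": variety,
        "memberDealPosiQuotes.trade_type": trade_type,
        "contract.contract_id": contract_id,
        "contract.variety_id": variety,
        "exportFlag": "excel",
        # Assumption: the month is sent as-is (1-12); verify this against
        # the captured request.
        "year": trade_date.year,
        "month": trade_date.month,
        "day": trade_date.day,
    }


formdata = build_formdata("a", date(2017, 3, 20))
```

The resulting dictionary can be passed to requests.post(url, data=formdata) exactly like the hand-written one.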
