First, the output:
E:\Python38\python.exe E:/PycharmProjects/test.py
http://www.szse.cn/api/disc/announcement/annList?random=0.16208973259833276
<Response [200]>
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<link href="/maintain/images/favicon.ico" rel="shortcut icon" type="image/x-icon">
<title>深圳证券交易所</title>
<title>50x</title>
<style>
* {
padding: 0;
margin: 0;
}
html,
body {
width: 100%;
height: 100%;
position: relative;
background: #fff;
}
#wrap {
position: absolute;
top: 0;
right: 0;
bottom: 0;
left: 0;
margin: auto;
}
#contImg {
max-width: 100%;
max-height: 100%;
}
</style>
</head>
<body>
<div id="wrap">
<img id="contImg" src="">
</div>
</body>
<script>
(function () {
var img = new Image();
var src = '/maintain/images/50x_b.png';
var wrap = document.getElementById('wrap');
var contImg = document.getElementById('contImg');
var vWidth = window.innerWidth;
var vHeight = window.innerHeight;
img.onload = function () {
window.cartoonIWidth = img.width;
window.cartoonIHeight = img.height;
cartoonHImgOnloaded(vWidth, vHeight, wrap, contImg);
};
img.src = src;
window.cartoonHImgOnloaded = function (vWidth, vHeight, wrap, contImg) {
var wrAspectRatio = cartoonIWidth / cartoonIHeight;
var wrWidth = cartoonIWidth;
var wrHeight = cartoonIHeight;
if (wrWidth < vWidth && wrHeight < vHeight) {
wrap.style.width = wrWidth + 'px';
wrap.style.height = wrHeight + 'px';
contImg.style.height = cartoonIHeight + 'px';
}
if (wrWidth >= vWidth) {
var h = vWidth * .9 / wrAspectRatio;
if (h <= vHeight) {
wrap.style.width = '90%';
contImg.style.height = h + 'px';
wrap.style.height = h + 'px';
}
}
if (wrHeight >= vHeight) {
var h = vHeight * .9;
var w = h * wrAspectRatio;
if (w <= vWidth) {
wrap.style.height = '90%';
contImg.style.height = h + 'px';
wrap.style.width = w + 'px';
}
}
contImg.src = src;
}
})()
</script>
</html>
Process finished with exit code 0
Code:
import json
import time
import requests

# Build the "random" query parameter from the current timestamp
t = time.time()
random = '0.' + str(t).replace(".", '')
url = "http://www.szse.cn/api/disc/announcement/annList?random=" + random
print(url)
# Browser-like headers; note that Content-Type is not set in this version
headers = {
    "Host": "www.szse.cn",
    "Referer": "http://www.szse.cn/disclosure/bond/notice/index.html",
    "Origin": "http://www.szse.cn",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36",
}
# Request payload, serialized to a JSON string before sending
data = {
    "seDate": ["", ""],
    "channelCode": ["bondinfoNotice_disc"],
    "smallCategoryId": ["013901"],
    "pageSize": 30,
    "pageNum": 1
}
response = requests.post(url=url, headers=headers, data=json.dumps(data))
print(response)  # status line
print(response.content.decode())  # response body
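Here we can see that the requests call itself succeeded, with status code 200, but the page that came back is actually an error page (the site's "50x" page) rather than the announcement data.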
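Since a 200 status alone does not prove the API returned data, it can help to check whether the body parses as JSON before using it. The following is only a minimal sketch: the helper name `parse_json_or_none` and the sample `{"data": []}` body are my own illustrative assumptions, not the real response format of the site.

```python
import json


def parse_json_or_none(body: str):
    """Return the decoded JSON value, or None if the body is not valid JSON.

    Hypothetical helper for illustration only.
    """
    try:
        return json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None


# An HTML error page like the one in the output above is not JSON
error_page = "<!DOCTYPE html>\n<html><head><title>50x</title></head></html>"
print(parse_json_or_none(error_page))

# A made-up JSON body, only to show the successful path
print(parse_json_or_none('{"data": []}'))
```

In the script above this would be called as `parse_json_or_none(response.content.decode())`; a `None` result means the server sent something other than JSON, even though the status code was 200.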
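After many attempts, adding more and more parameters to headers, the data finally came back once I added
"Content-Type": "application/json",
to the headers.
I suspect this is also a kind of anti-scraping measure: you have to declare the data type of the request you are sending before you can get the data. Note that requests does not set a Content-Type by itself when data is passed as a string, so the header has to be supplied explicitly.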
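To make the difference visible without hitting the server, the sketch below only prepares the requests and inspects the headers that would be sent. It compares the original call style, the explicit header described above, and the `json=` parameter of requests, which serializes the payload and sets the header itself. The helper name `prepare` is my own, and the `random` query parameter is left out for brevity; whether the server accepts the request still depends on the site.

```python
import json

import requests

URL = "http://www.szse.cn/api/disc/announcement/annList"
PAYLOAD = {
    "seDate": ["", ""],
    "channelCode": ["bondinfoNotice_disc"],
    "smallCategoryId": ["013901"],
    "pageSize": 30,
    "pageNum": 1,
}


def prepare(**kwargs):
    """Build the request without sending it (hypothetical helper)."""
    return requests.Request("POST", URL, **kwargs).prepare()


# 1. Original style: JSON string as data, no Content-Type header is added
as_string = prepare(data=json.dumps(PAYLOAD))

# 2. The fix from this post: same body, header set by hand
explicit = prepare(
    data=json.dumps(PAYLOAD),
    headers={"Content-Type": "application/json"},
)

# 3. Alternative: let requests serialize the dict and set the header
via_json = prepare(json=PAYLOAD)

for name, req in [("data=str", as_string),
                  ("explicit header", explicit),
                  ("json=", via_json)]:
    print(name, "->", req.headers.get("Content-Type"))
```

For the real call, either `requests.post(url, headers=headers, data=json.dumps(data))` with the extra header, or `requests.post(url, headers=headers, json=data)`, should send the same Content-Type.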