Hi teacher, I wanted to browse news from my hometown on my own, so I started by testing with requests and wrote this demo:
import requests
url = "http://www.pujiang.gov.cn/"
headers = {
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
"Accept-Encoding": "gzip, deflate",
"Accept-Language": "zh-CN,zh;q=0.9",
"Cache-Control": "max-age=0",
"Connection": "keep-alive",
"Cookie": "azSsQE5NvspcS=5NlipQPmHQMcPceeSjssjv2D3MsuJCfLBHPwfM2sEJOUgjO24ZcptleEpg0ZqI_UZ6s0zDtg4BaoT9.ddCL1upG; yfx_c_g_u_id_10000066=_ck20090710013115134391159251172; yfx_f_l_v_t_10000066=f_t_1599444091506__r_t_1599444091506__v_t_1599446549310__r_c_0; azSsQE5NvspcT=5UTK8pTeYv99qqqm0RvxmaqlBxPxHD78Qck1und30xVEzcxI68l9G6OyRiXIABechC3Bjoe9VYolcVF3wkNJ6IST5SjXQX5R2NbY7fB_VfJbiCiSuE7fqth92DN3_LjXcKXup8C8mlViCuLBIWL3SCvBOTxfrhOKW.MM3JNbdzU20qOV.5b1IvNCubfK9wBe1np9R1A3yampT_gKwdYm8lgbVP5G3VTPtWqL2yf015X_qFw6ItL_p0k54GE9Wcyxc9",
"Host": "www.pujiang.gov.cn",
"Upgrade-Insecure-Requests": "1",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36"
}
# Pass the headers defined above; the original call omitted them
res = requests.get(url, headers=headers)
print(res.status_code)
It returned 412.
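To look more closely at what the 412 carries, a minimal sketch like the one below might help. The helper name `summarize_response` is hypothetical, and the idea that the 412 body contains a JavaScript challenge that sets cookies is only an assumption, not something confirmed for this site.

```python
def summarize_response(status_code, cookie_names, body):
    """Condense a response into a few fields that make a challenge page easier to spot.

    Hypothetical helper: it takes plain values, so it works with any HTTP client.
    """
    return {
        "status": status_code,
        # Names of cookies the server set on this response
        "cookies_set": sorted(cookie_names),
        # A script tag in an error body hints at a JavaScript challenge (assumption)
        "has_script": "<script" in body.lower(),
        "body_length": len(body),
    }

# Usage with requests (needs network access, so left as a comment):
#   res = requests.get(url, headers=headers, timeout=10)
#   print(summarize_response(res.status_code, res.cookies.keys(), res.text))
```

If `has_script` comes back true and new cookies show up in `cookies_set`, the cookies copied from Chrome were probably generated by that script and may already be stale. That is a guess to verify, not a conclusion.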
So next I tried selenium:
from selenium.webdriver import Chrome
browser = Chrome()
browser.get("http://www.pujiang.gov.cn/")
It showed a blank page.
I also tried this directly:
from selenium.webdriver import Chrome
browser = Chrome()
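Then I typed the URL into that browser window by hand, and the page was still blank.
Explanations online say 412 means the request is missing parameters the server requires, but I copied the entire set of request headers from Chrome DevTools. Is there a solution for sites like this?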