My code is as follows:
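import urllib.parse  # needed explicitly for urlencode()
import urllib.request
import http.cookiejar

# Target page to fetch after the login request. I tried two values:
# url_a = "https://www.zhihu.com/"        # this one fails
url_a = "https://www.zhihu.com/explore"   # this one works
url_b = "https://www.zhihu.com/signup?next=%2F"
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36 Edge/17.17134"}
data = {"username": "***",
        "password": "***"}
data_post = urllib.parse.urlencode(data).encode("utf_8")

# POST the credentials through an opener that stores cookies in a CookieJar.
req = urllib.request.Request(url_b, data=data_post, headers=headers)
cookie = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cookie))
print(cookie)  # empty before any request
opener.open(req)
print(cookie)  # cookies set by the login response

# Reuse the same opener (and its cookies) to fetch the target page.
req = urllib.request.Request(url_a, headers=headers)
resq = opener.open(req)
print(cookie)

# Save the raw response body to disk.
with open("D://Desktop//txt.txt", "wb") as file:
    file.write(resq.read())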
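Problem description: if the target URL is url_a="https://www.zhihu.com/explore", the Explore page is fetched normally. If the target URL is url_a="https://www.zhihu.com/" instead, it does not work. What is going on here?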
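To narrow this down, it may help to look at what the server actually returns for each URL instead of only printing the cookie jar. The sketch below is a diagnostic aid, not a fix: `diagnose`, `fetch_and_diagnose`, and `main` are hypothetical helper names, the shortened User-Agent is a placeholder, and the "login wall" wording is only a guess about what a redirect might mean. It relies on the fact that urllib follows redirects automatically and raises `HTTPError` for error statuses.

```python
import urllib.error
import urllib.request
import http.cookiejar


def diagnose(requested_url, final_url, status):
    """Return a short, human-readable guess about what happened to a request."""
    if status >= 400:
        return f"HTTP error {status}: the server refused or failed the request"
    # urllib follows redirects silently, so compare where we ended up.
    if final_url.rstrip("/") != requested_url.rstrip("/"):
        return f"redirected to {final_url}: possibly a login wall"
    return "fetched the requested URL directly"


def fetch_and_diagnose(opener, url, headers):
    """Open url with the given opener and report the outcome (needs network access)."""
    req = urllib.request.Request(url, headers=headers)
    try:
        with opener.open(req) as resp:
            return diagnose(url, resp.geturl(), resp.status)
    except urllib.error.HTTPError as err:
        # Error statuses surface as exceptions rather than as responses.
        return diagnose(url, err.geturl(), err.code)


def main():
    """Compare the two URLs from the question using one cookie-aware opener."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    headers = {"User-Agent": "Mozilla/5.0"}  # placeholder value
    for url in ("https://www.zhihu.com/explore", "https://www.zhihu.com/"):
        print(url, "->", fetch_and_diagnose(opener, url, headers))
    print("cookies held:", [c.name for c in jar])
```

Calling `main()` prints one line per URL. If the home page reports an error status or a redirect while the Explore page does not, that would suggest the POST to url_b never established a logged-in session, although this is an assumption to check rather than a confirmed cause.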