Scraping Zhihu with Python: a problem scraping the Zhihu home page

Most recent recommended article published 2021-03-24 21:41:13

weixin_40003780

My code is as follows:

import http.cookiejar
import urllib.parse
import urllib.request

# Target page: switch between these two URLs to reproduce the problem.
# url_a = "https://www.zhihu.com/"
url_a = "https://www.zhihu.com/explore"
# URL the credentials are posted to.
url_b = "https://www.zhihu.com/signup?next=%2F"

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36 Edge/17.17134"}

data = {"username": "***",
        "password": "***"}
data_post = urllib.parse.urlencode(data).encode("utf_8")

# One cookie jar shared by every request made through this opener.
cookie = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cookie))

print(cookie)  # jar before any request
req = urllib.request.Request(url_b, data=data_post, headers=headers)
opener.open(req)
print(cookie)  # jar after posting the credentials

req = urllib.request.Request(url_a, headers=headers)
resq = opener.open(req)
print(cookie)  # jar after fetching the target page

# Save the raw response body to disk.
with open("D://Desktop//txt.txt", "wb") as file:
    file.write(resq.read())

Problem description: if the target URL is url_a="https://www.zhihu.com/explore", the Explore page is fetched normally, but if the target URL is url_a="https://www.zhihu.com/", it no longer works. What is going on?
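One way to narrow this down is to check what the server actually returns for each URL. The sketch below assumes, without confirmation from the post, that the home page needs a logged-in session while the Explore page is public, so a failed login would show up as a redirect or an error status on the home page only. The helpers `was_redirected`, `cookie_names`, and `diagnose` are hypothetical names introduced here for illustration; they use only the standard library.

```python
import http.cookiejar
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def was_redirected(requested_url, final_url):
    """Return True when the response came from a different host or path."""
    a, b = urlsplit(requested_url), urlsplit(final_url)
    return (a.netloc, a.path.rstrip("/")) != (b.netloc, b.path.rstrip("/"))


def cookie_names(jar):
    """Return the sorted names of all cookies currently held in the jar."""
    return sorted(c.name for c in jar)


def diagnose(opener, url, headers):
    """Fetch url and report the status code, final URL, and redirect flag."""
    req = urllib.request.Request(url, headers=headers)
    try:
        with opener.open(req, timeout=10) as resp:
            status, final_url = resp.status, resp.geturl()
    except urllib.error.HTTPError as err:
        # urllib raises 4xx/5xx responses as exceptions; keep their details.
        status, final_url = err.code, err.url
    return {
        "status": status,
        "final_url": final_url,
        "redirected": was_redirected(url, final_url),
    }
```

Calling `diagnose(opener, url_a, headers)` once for each of the two target URLs, and printing `cookie_names(cookie)` after the login request, should show whether the POST to url_b produced any session cookie and whether the home page request ended up somewhere other than the page that was asked for.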
