Scraping Zhihu answer book lists with Python: a Python crawler that fetches all answers under a Zhihu question

My skills are limited, so if any experts read this, improvements are welcome.

I won't explain it line by line; here is the code.

import json
import math
import os
import re

import requests

# Folder the output file is written to; change it to suit your machine.
path = r'C:\Users\Administrator\Desktop'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36',
    "Referer": "https://www.mzitu.com",
    "cookie": "__guid=183739808.265989272285087840.1598944863859.3015; monitor_count=2"
}

# Open the question with all answers expanded and replace the number at the end.
# The same number appears once more, in the API URL below.
url = "https://www.zhihu.com/question/311171163"
api = "https://www.zhihu.com/api/v4/questions/311171163/answers?include=data%5B%2A%5D.is_normal%2Cadmin_closed_comment%2Creward_info%2Cis_collapsed%2Cannotation_action%2Cannotation_detail%2Ccollapse_reason%2Cis_sticky%2Ccollapsed_by%2Csuggest_edit%2Ccomment_count%2Ccan_comment%2Ccontent%2Ceditable_content%2Cvoteup_count%2Creshipment_settings%2Ccomment_permission%2Ccreated_time%2Cupdated_time%2Creview_info%2Crelevant_info%2Cquestion%2Cexcerpt%2Crelationship.is_authorized%2Cis_author%2Cvoting%2Cis_thanked%2Cis_nothelp%2Cis_labeled%2Cis_recognized%2Cpaid_info%2Cpaid_info_content%3Bdata%5B%2A%5D.mark_infos%5B%2A%5D.url%3Bdata%5B%2A%5D.author.follower_count%2Cbadge%5B%2A%5D.topics&limit=5&offset={}&platform=desktop&sort_by=default"

response = requests.get(url, headers=headers)

# The question page embeds the total number of answers.
answer_count = int(re.findall(r'"answerCount":(.*?),', response.text)[0])
# Five answers per request; round up so a partial last page is still fetched.
pages = math.ceil(answer_count / 5)

# The tags in this pattern were lost when the page was extracted.
# <p>...</p> is assumed here, since the "content" field holds HTML.
ref = re.compile(r'<p>(.*?)</p>')
a = "                          "

# Open the output file once instead of once per write.
with open(os.path.join(path, "对婚姻死了心.txt"), "a+", encoding="utf-8") as f:
    # Start at page 0 so the first five answers are not skipped.
    for page in range(pages):
        response = requests.get(api.format(page * 5), headers=headers)
        json_data = json.loads(response.text)
        data_list = json_data["data"]
        for answer in data_list:
            author = answer["author"]["name"] + ":"
            reg = re.findall(ref, answer["content"])
            # Write the author line, then the list of paragraphs.
            for item in (author, reg):
                f.write(str(item))
                f.write(a)
                f.write("\n")
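
The comment in the code notes that the question number has to be changed in two places. Below is a minimal sketch of one way to avoid that, and to make the paging arithmetic explicit. The helper names `question_id_from_url` and `page_offsets` are hypothetical and not part of the script above; the default page size of 5 simply mirrors the `limit=5` in the API URL.

```python
import math
import re


def question_id_from_url(url):
    """Return the numeric question id found in a Zhihu question URL."""
    match = re.search(r"/question/(\d+)", url)
    if match is None:
        raise ValueError("no question id found in: " + url)
    return match.group(1)


def page_offsets(answer_count, limit=5):
    """Return the offset of every page, starting at 0.

    The page count is rounded up so a partial last page is included.
    """
    pages = math.ceil(answer_count / limit)
    return [page * limit for page in range(pages)]


if __name__ == "__main__":
    qid = question_id_from_url("https://www.zhihu.com/question/311171163")
    print(qid)               # 311171163
    print(page_offsets(12))  # [0, 5, 10]
```

The id and each offset can then be substituted into the API URL with `str.format`, so only the question URL needs editing when switching to another question.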
