Scraping Qiushibaike Jokes with Regular Expressions and Saving Them as Text

weixin_46837101

Category: Crawler series
Tags: python, regular expressions

Article link: https://blog.csdn.net/weixin_46837101/article/details/106932839

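import re

import requests
from fake_useragent import UserAgent

url = 'https://www.qiushibaike.com/text/page/1/'
# Send a random browser User-Agent so the request looks like a normal visit
headers = {
    'User-Agent': UserAgent().random
}

response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # stop early on an HTTP error status
html = response.text
# Extract the joke text with a regular expression. The non-greedy (.+?) with
# re.S lets a joke span several lines without running past its own </span>.
jokes = re.findall(r'<div class="content">\s*<span>\s*(.+?)\s*</span>', html, re.S)
# Append each joke to the output file, separated by blank lines
with open('duanzi.txt', 'a', encoding='utf-8') as f:
    for joke in jokes:
        f.write(joke + "\n\n\n")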
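
The extraction step can be tried offline before any request is sent. The sketch below is only an illustration, not part of the original script: `SAMPLE_HTML` is made-up markup shaped like what the pattern expects, and `extract_jokes` is a hypothetical helper name. It also assumes that line breaks inside a joke appear as `<br/>` tags, and turns them into real newlines before saving.

```python
import re

# Made-up sample markup (an assumption), shaped like the page structure
# that the pattern in the script above expects.
SAMPLE_HTML = """
<div class="content">
<span>

First joke, line one.<br/>First joke, line two.

</span>
</div>
<div class="content">
<span>
Second joke.
</span>
</div>
"""

# Non-greedy capture plus re.S, so a joke may span several lines but the
# match still stops at the first closing </span>.
JOKE_PATTERN = re.compile(
    r'<div class="content">\s*<span>\s*(.+?)\s*</span>', re.S
)


def extract_jokes(html):
    """Return the joke texts found in html, with <br/> turned into newlines."""
    jokes = []
    for raw in JOKE_PATTERN.findall(html):
        # Replace <br>, <br/> and <br /> with a newline character
        text = re.sub(r'<br\s*/?>', '\n', raw)
        jokes.append(text.strip())
    return jokes


if __name__ == '__main__':
    for joke in extract_jokes(SAMPLE_HTML):
        print(joke)
        print('---')
```

Keeping the pattern in a small function like this makes it easy to check against a saved copy of the page, which helps when the site changes its markup and the regular expression silently stops matching.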
