Scraping Romance of the Three Kingdoms with bs4

itLaity

Category: Web Crawler Notes. Tags: python, request, crawler

Article link: https://blog.csdn.net/duyun0/article/details/117051406

Scraping Romance of the Three Kingdoms:

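import requests
from bs4 import BeautifulSoup  # lesson learned: this only parses the index page

BASE_URL = 'https://www.shicimingju.com'
url = BASE_URL + '/book/sanguoyanyi.html'

response = requests.get(url, timeout=10)  # fetch the table-of-contents page
response.encoding = 'utf-8'
index_html = response.text  # page source as a string

# Parse the index page and collect the chapter links
soup = BeautifulSoup(index_html, 'lxml')
a_list = soup.select('.book-mulu > ul > li > a')

# The with-block closes the file even if a request fails midway
with open('./sanguo.txt', 'w', encoding='utf-8') as f:
    for a in a_list:
        title = a.get_text()
        chapter_url = BASE_URL + a['href']
        # Request the detail page to get the chapter content
        chapter_response = requests.get(chapter_url, timeout=10)
        chapter_response.encoding = 'utf-8'
        chapter_soup = BeautifulSoup(chapter_response.text, 'lxml')
        # BeautifulSoup has no xpath(); this XPath would need lxml.etree instead:
        # //*[@id="main_left"]/div[1]/div
        content_div = chapter_soup.find('div', class_='chapter_content')
        if content_div is None:
            print('No chapter content found:', chapter_url)
            continue
        f.write(title + ':' + content_div.text + '\n')
        print('Saved successfully!')
print('Done!')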
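
The two parsing steps in the script can be tried out without hitting the site. Below is a minimal sketch, assuming sample HTML that only mirrors the class names used above (`book-mulu`, `chapter_content`). The sample markup, the chapter hrefs, and the helper names `extract_chapter_links` and `extract_chapter_text` are made up for illustration and are not taken from the real pages. It uses the built-in `html.parser` so that `lxml` is not required, and `urljoin` as one possible alternative to string concatenation for building chapter URLs.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical sample markup: it only mirrors the class names the script
# selects on. The real pages may be structured differently.
INDEX_HTML = """
<div class="book-mulu">
  <ul>
    <li><a href="/book/sanguoyanyi/1.html">Chapter 1</a></li>
    <li><a href="/book/sanguoyanyi/2.html">Chapter 2</a></li>
  </ul>
</div>
"""

CHAPTER_HTML = (
    '<div class="chapter_content">'
    "<p>First paragraph.</p><p>Second paragraph.</p>"
    "</div>"
)


def extract_chapter_links(index_html, base_url):
    """Return (title, absolute_url) pairs from a table-of-contents page."""
    soup = BeautifulSoup(index_html, "html.parser")
    links = []
    for a in soup.select(".book-mulu > ul > li > a"):
        # urljoin resolves the relative href against the index page URL
        links.append((a.get_text(strip=True), urljoin(base_url, a["href"])))
    return links


def extract_chapter_text(chapter_html):
    """Return the chapter body text, or None if the container is missing."""
    soup = BeautifulSoup(chapter_html, "html.parser")
    content = soup.find("div", class_="chapter_content")
    return content.get_text() if content is not None else None


if __name__ == "__main__":
    index_url = "https://www.shicimingju.com/book/sanguoyanyi.html"
    for title, link in extract_chapter_links(INDEX_HTML, index_url):
        print(title, link)
    print(extract_chapter_text(CHAPTER_HTML))
```

Keeping the parsing in small functions like this makes it easy to check the selectors against a saved copy of a page before running the full crawl.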
