Scraping a Novel with Python (Commented, Very Simple)

Latest recommended article published on 2024-05-01 21:57:08

醉世老翁



Category: python  Tags: python, web scraping

Article link: https://blog.csdn.net/wojiuwangla/article/details/86223463


from pyquery import PyQuery as pq
import requests


# Name of the local file the novel will be saved to
filename = input("Please input the name you want to save: ")
# The novel's ID on the site: for https://www.biqukan.com/0_790/ just enter 0_790
book_url = input("Please input url of this novel: ")
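
The ID typed in above still has to be turned into a full address before any page can be requested. A minimal sketch of that step, assuming the index page sits at the site root followed by the ID and a trailing slash, as in the https://www.biqukan.com/0_790/ example. `BASE_URL` and `build_index_url` are illustrative names and are not part of the original script:

```python
from urllib.parse import urljoin

# Assumption: site root taken from the example URL in the comment above
BASE_URL = "https://www.biqukan.com/"


def build_index_url(book_id):
    """Turn a book ID such as "0_790" into the URL of its index page."""
    # Strip whitespace and stray slashes so "0_790" and "/0_790/" both work
    cleaned = book_id.strip().strip("/")
    return urljoin(BASE_URL, cleaned + "/")


print(build_index_url("0_790"))
```

Normalising the input first means a user who pastes the ID with surrounding slashes still gets a valid address.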

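# Fetch the text of one chapter
def get_txt(url):
    # Request the chapter page
    response = requests.get(url)
    # Parse the returned HTML with pyquery
    tmp_title = pq(response.text)
    # pyquery takes CSS selectors, so "h1" selects the page's h1 tag; its text is the title of the current chapter
    title = tmp_title("h1").text()
    # "#content" selects by id (a leading "." would select by class); the element with id "content" holds the chapter text
    content = tmp_title("#content").text()
    # Return the title, a newline, then the chapter text
    return title + "\n" + content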

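# Define the download function
def download(url):
    # Open the file in append mode with UTF-8 encoding, bound to the name file
    with open(filename,"a+",encoding="utf-8") as file:
        # Call get_txt with the url that download received, then write the returned chapter title and text to the file
        file.write(get_txt(url))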
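
The listing stops here, so how the original loops over chapters is not shown. As a rough sketch of one way the append-to-file step could be driven for many chapters, the function below takes the list of chapter URLs and the fetching function as arguments. `download_all`, `fake_fetch`, and the sample URLs are hypothetical; in real use `fetch` would be something like get_txt:

```python
import os
import tempfile


def download_all(chapter_urls, path, fetch):
    """Append the text of every chapter to one UTF-8 file, in order."""
    with open(path, "a", encoding="utf-8") as file:
        for url in chapter_urls:
            # fetch is any callable that maps a URL to chapter text, e.g. get_txt
            file.write(fetch(url) + "\n\n")


# Offline demonstration with a stand-in for get_txt
def fake_fetch(url):
    return "Title of " + url + "\nBody text"


demo_path = os.path.join(tempfile.mkdtemp(), "novel.txt")
download_all(["ch1.html", "ch2.html"], demo_path, fake_fetch)
with open(demo_path, encoding="utf-8") as file:
    print(file.read())
```

Opening the file once for the whole book, rather than once per chapter, is a design choice of this sketch. Append mode still means a second run adds to the existing file instead of overwriting it.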
