Writing a Crawler with Python's Requests and BeautifulSoup Libraries to Scrape Classical Chinese Poems and Save Them to a Text File

Latest recommended article published on 2023-10-08 13:20:34

Dream_Bri

Tags: python, web crawler, programming language

Article link: https://blog.csdn.net/ximu__l/article/details/131696952

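Below is a Python crawler that scrapes three classic poems from the Gushiwen site (古诗词网) and writes them to a text file on the desktop. The program uses the Requests and BeautifulSoup libraries: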

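# Import the required libraries
import os
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

# Target URL for the crawl
url = 'https://www.gushiwen.org/'

# Send a GET request to the target URL; fail fast on timeouts and HTTP errors
response = requests.get(url, timeout=10)
response.raise_for_status()

# Parse the HTML
soup = BeautifulSoup(response.content, 'html.parser')

# Select the poem links with a CSS selector (this depends on the site's markup)
poem_list = soup.select('.main3 .left .sons .cont a')

# Collect the titles and contents of the first three poems
poem_titles = []
poem_contents = []

for link in poem_list[:3]:
    # Poem title
    poem_titles.append(link.text.strip())

    # Poem URL; urljoin handles both relative and absolute href values
    poem_url = urljoin(url, link.get('href'))

    # Send a GET request to the poem page
    poem_response = requests.get(poem_url, timeout=10)
    poem_response.raise_for_status()

    # Parse the HTML
    poem_soup = BeautifulSoup(poem_response.content, 'html.parser')

    # Extract the poem text
    poem_content = poem_soup.select('.main3 .left .sons .contson')[0].text
    poem_contents.append(poem_content.strip())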

# Write the poems to a text file on the desktop
file_path = os.path.join(os.path.expanduser('~'), 'Desktop', 'poems.txt')

with open(file_path, 'w', encoding='utf-8') as f:
    for title, content in zip(poem_titles, poem_contents):
        f.write(title + '\n\n')
        f.write(content + '\n\n\n')
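
Two steps in the script are easy to get wrong and can be checked without touching the network: turning an `href` into a full poem URL, and laying out the text that goes into `poems.txt`. The sketch below separates them into small helpers that use only the standard library. The helper names (`resolve_poem_url`, `format_poems`, `save_poems`) and the example hrefs are hypothetical, chosen for illustration; they are not taken from the site or from any library.

```python
from pathlib import Path
from urllib.parse import urljoin

BASE_URL = 'https://www.gushiwen.org/'


def resolve_poem_url(href, base_url=BASE_URL):
    """Turn an href taken from an <a> tag into an absolute URL."""
    # urljoin copes with root-relative and already-absolute hrefs alike,
    # whereas plain string concatenation is only right for one form.
    return urljoin(base_url, href)


def format_poems(titles, contents):
    """Lay out each poem as: title, blank line, text, two blank lines."""
    return ''.join(
        f'{title}\n\n{content}\n\n\n'
        for title, content in zip(titles, contents)
    )


def save_poems(titles, contents, file_path):
    """Write the formatted poems to file_path as UTF-8 and return the path."""
    path = Path(file_path)
    path.write_text(format_poems(titles, contents), encoding='utf-8')
    return path


if __name__ == '__main__':
    # Hypothetical hrefs, only to show how each form is resolved.
    for href in ('/shiwenv_example.aspx', 'https://example.com/poem/1'):
        print(resolve_poem_url(href))
```

Because `zip` stops at the shorter list, `format_poems` still works if the page yields fewer than three poems, and `save_poems` can be pointed at any path, for example `Path.home() / 'Desktop' / 'poems.txt'`.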