Scraping and Exporting the Douban Top 250 Movies

Latest recommended article published on 2024-10-12 12:26:23

GK赫然

Tags: python, entertainment

Article link: https://blog.csdn.net/qq_34677066/article/details/132759328

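This article shows how to use Python's requests and BeautifulSoup libraries to scrape the movie titles from the Douban Movie Top 250 list and save them to a text file.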

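import requests
from bs4 import BeautifulSoup

# Send a browser-like User-Agent header with every request.
head = {'User-Agent': "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36 Edg/116.0.1938.69"}

# Mode 'w' overwrites the file, so re-running the script does not append duplicate entries.
with open("C:\\Users\\29130\\Desktop\\豆瓣前250电影.txt", 'w', encoding="utf-8") as b:
    count = 1
    # The list is paginated at 25 movies per page: start=0, 25, ..., 225.
    for num in range(0, 250, 25):
        # TLS verification is left enabled (the original passed verify=False).
        response = requests.get(f"https://movie.douban.com/top250?start={num}&filter=", headers=head, timeout=10)
        response.raise_for_status()  # stop on an HTTP error instead of parsing an error page
        soup = BeautifulSoup(response.text, "html.parser")
        titles = soup.find_all('span', attrs={"class": "title"})
        for title in titles:
            text = title.get_text()
            # Alternate-title spans contain a '/' separator; keep only the primary title.
            if '/' not in text:
                b.write(f'{count}.{text}\n')
                count += 1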
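
The title filter in the script can be tried out without hitting the network. The sketch below is only an illustration: the helper names extract_primary_titles and page_urls are hypothetical, and SAMPLE_HTML is a hand-written fragment that assumes the page puts the primary title in one span with class "title" and alternate titles in further spans whose text contains a '/'. The real Douban markup may differ or change over time.

```python
from bs4 import BeautifulSoup

# Hand-written sample (an assumption, not copied from the live page).
SAMPLE_HTML = """
<ol class="grid_view">
  <li><div class="hd"><a href="#">
    <span class="title">肖申克的救赎</span>
    <span class="title">&nbsp;/&nbsp;The Shawshank Redemption</span>
    <span class="other">&nbsp;/&nbsp;月黑高飞(港)</span>
  </a></div></li>
  <li><div class="hd"><a href="#">
    <span class="title">霸王别姬</span>
    <span class="other">&nbsp;/&nbsp;再见，我的妾</span>
  </a></div></li>
</ol>
"""


def extract_primary_titles(html):
    """Return the text of every span.title that does not contain a '/'."""
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for span in soup.find_all("span", class_="title"):
        text = span.get_text(strip=True)
        # Alternate titles carry a '/' separator; skip them.
        if "/" not in text:
            titles.append(text)
    return titles


def page_urls(total=250, page_size=25):
    """Build the paginated list URLs: start=0, 25, ..., 225 by default."""
    return [
        f"https://movie.douban.com/top250?start={start}&filter="
        for start in range(0, total, page_size)
    ]


if __name__ == "__main__":
    for index, title in enumerate(extract_primary_titles(SAMPLE_HTML), start=1):
        print(f"{index}.{title}")
```

Keeping the parsing in a function of its own makes it possible to check the filter against a saved page before running the full crawl.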