python


芝士急冻树


Tags: python, programming languages

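Copyright notice: This is an original article by the author, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.

Article link: https://blog.csdn.net/weixin_51945579/article/details/131125599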


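This article shows how to use Python's BeautifulSoup and requests libraries to fetch and parse the Douban Top 250 movie list. It first installs the required packages with pip, then sets a User-Agent request header so the site is less likely to block the request, sends an HTTP request to get the HTML content, and finally parses the page with BeautifulSoup.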

Summary generated by CSDN using AI technology.

Parsing Web Page Content with BeautifulSoup

1. Install the required packages:

pip install beautifulsoup4
pip install lxml
pip install requests

2. Import the packages:

from bs4 import BeautifulSoup
import requests

3. Fetch the page:

# Send a browser-like User-Agent so the site is less likely to reject the request
douban_headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36',
}
url = "https://movie.douban.com/top250?start=0&filter="
response = requests.get(url, headers=douban_headers)
content = response.text
soup = BeautifulSoup(content, 'html.parser')  # parse the page's HTML
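
The article stops once the `soup` object is built. As a rough sketch of what a next step could look like, the example below pulls titles and ratings out of parsed HTML with `find_all` and `find`. The markup in `SAMPLE_HTML` and the class names `item`, `title`, and `rating_num` are assumptions made for illustration and are not taken from the article; the real Douban page may be structured differently. The helpers `extract_movies` and `page_url` are hypothetical names, and `page_url` only assumes that the `start` query parameter in the article's URL selects the page offset.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup used instead of a live request;
# the real page structure and class names may differ.
SAMPLE_HTML = """
<ol class="grid_view">
  <li><div class="item">
    <span class="title">Movie A</span><span class="rating_num">9.7</span>
  </div></li>
  <li><div class="item">
    <span class="title">Movie B</span><span class="rating_num">9.6</span>
  </div></li>
</ol>
"""


def extract_movies(html):
    """Return a list of (title, rating) tuples found in the given HTML."""
    soup = BeautifulSoup(html, 'html.parser')
    movies = []
    for item in soup.find_all('div', class_='item'):
        title = item.find('span', class_='title')
        rating = item.find('span', class_='rating_num')
        # Skip entries that lack either field instead of raising an error
        if title is None or rating is None:
            continue
        movies.append((title.get_text(strip=True), rating.get_text(strip=True)))
    return movies


def page_url(start):
    """Build a list URL for a given offset (assumes `start` is the offset)."""
    return f"https://movie.douban.com/top250?start={start}&filter="


print(extract_movies(SAMPLE_HTML))
```

To try this against the live page, `response.text` from step 3 could be passed to `extract_movies` in place of `SAMPLE_HTML`, after checking the page source to confirm or adjust the class names.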

Latest comments

芝士急冻树 (on "python"):

from bs4 import BeautifulSoup
import requests

# {} is filled with the page number on each iteration
url_template = "https://so.gushiwen.cn/mingjus/default.aspx?page={}&tstr=&astr=&cstr=&xstr="
headers = {'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9'}
index = 1
info = ""
for i in range(4):
    url = url_template.format(i + 1)
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.text, 'html.parser')
    poems = soup.find('div', class_='sons').find_all('div', class_='cont')
    for poem in poems:
        links = poem.find_all('a')
        if not links:
            continue  # skip entries without any link
        ju = links[0]  # the quoted line
        if len(links) == 1:
            info += "{0}:{1}\n".format(index, ju.text)
        else:
            ren = links[1]  # the author / source
            info += "{0}:{1}--{2}\n".format(index, ju.text, ren.text)
        index = index + 1
print(info)