2021-08-30: A first scraping attempt, fetching the short reviews of The Little Prince

青灯画琉璃

Published 2021-08-30 18:28:19

Category: Study Notes. Tags: python, web scraping

Article link: https://blog.csdn.net/weixin_42591391/article/details/120002894

The code is as follows:

import requests                  # import the requests module
from bs4 import BeautifulSoup    # import the BeautifulSoup class
url = 'https://book.douban.com/subject/3693974/'    # URL of the page to scrape
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36'}
res = requests.get(url, headers=headers)            # send the GET request and keep the response
soup = BeautifulSoup(res.text, 'lxml')              # parse the response body (res.text) with the lxml parser
pattern = soup.find_all('span', 'short')            # find every <span> whose class is "short"; that is where the short reviews sit. find_all returns a list-like ResultSet
print(res.status_code)                              # HTTP status code of the response; 200 means success

for item in pattern:     # each item is one matching tag from the pattern list
    print(item.string)   # .string gives the text inside the tag
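
The parsing step above can be tried offline, without hitting Douban. The sketch below is only an illustration under assumptions: `SAMPLE_HTML` is invented markup (not copied from the real page), `extract_short_reviews` is a hypothetical helper name, and the built-in `html.parser` is used instead of `lxml` so that no extra parser is needed. It also shows one difference worth knowing: `.string` is `None` when a tag contains nested tags, while `get_text()` still returns the combined text.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, invented for illustration; not copied from Douban.
SAMPLE_HTML = """
<div>
  <span class="short">A story for grown-ups.</span>
  <span class="short">Read it <b>twice</b>.</span>
  <span class="other">not a review</span>
</div>
"""


def extract_short_reviews(html):
    """Return the text of every <span class="short"> element in html."""
    # html.parser ships with Python; the original post uses lxml instead.
    soup = BeautifulSoup(html, "html.parser")
    # class_="short" is the explicit spelling of find_all('span', 'short').
    spans = soup.find_all("span", class_="short")
    # get_text() joins the text of nested tags; .string would be None
    # for the second span because it contains a <b> child.
    return [span.get_text().strip() for span in spans]


reviews = extract_short_reviews(SAMPLE_HTML)
for review in reviews:
    print(review)
```

If some reviews print as `None` with the original loop, swapping `item.string` for `item.get_text()` is one possible fix.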

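Result: running the script prints the HTTP status code, followed by the text of each short review, one per line. (The original post showed the output as a screenshot.)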