A Simple Python Web Scraper (PyCharm), Part 2

qq_43012160

Article link: https://blog.csdn.net/qq_43012160/article/details/94415018

Source page

URL: http://learning.gem5.org/book/part1/building.html
We will scrape its text, that is, everything inside the <p> tags.
Take this paragraph, for example, and note the sentence:
“To build gem5, we will use SCons.”

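Installing the package

We use the BeautifulSoup package here.
First open cmd and change to the Scripts folder inside the Python installation directory.
Then install as usual:

pip install beautifulsoup4

When the installation finishes, pip reports that the package was installed successfully. The code below also imports requests, so install it the same way (pip install requests) if it is not already present.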
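As a quick sanity check after installing, one could confirm from Python that the packages can be imported. This is a minimal sketch using only the standard library. The helper name `check_packages` is made up for illustration, and `bs4` is the import name of the `beautifulsoup4` package.

```python
import importlib.util


def check_packages(names):
    """Return a dict mapping each top-level import name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # beautifulsoup4 is imported as "bs4"; requests is used by the scraper code.
    for name, found in check_packages(["bs4", "requests"]).items():
        print(name, "OK" if found else "missing")
```

If either name shows up as missing, rerunning pip from the same Python installation's Scripts folder is the likely fix.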

Code

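import requests
from bs4 import BeautifulSoup

# The URL is the Request URL shown in the browser developer tools for this page
url = 'http://learning.gem5.org/book/part1/building.html'
res = requests.get(url)    # choose the method (get, post, ...) to match the request method shown there
res.encoding = 'utf-8'     # optional; set it if the scraped text comes out garbled
soup = BeautifulSoup(res.text, 'html.parser')   # parse the fetched HTML; naming the parser avoids a warning
for item in soup.select('p'):   # select every <p> tag
    try:
        print(item)             # print the whole <p> element
    except OSError:             # skip elements the console fails to print
        pass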

Result:
Running it prints every <p> element of the page. Note the sentence:
“To build gem5, we will use SCons.”

With that, a small Python scraper is done.
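The loop above prints each `<p>` element with its tags. If only the text is wanted, `get_text()` can be called on each element. The sketch below runs on a small hard-coded HTML string, which is an assumption made so that it works offline and is not the real page. The helper name `paragraph_texts` is hypothetical.

```python
from bs4 import BeautifulSoup

# Stand-in HTML for illustration only; in the scraper this would be res.text.
SAMPLE_HTML = """
<html><body>
  <h1>Building gem5</h1>
  <p>To build gem5, we will use <em>SCons</em>.</p>
  <p>Second paragraph.</p>
</body></html>
"""


def paragraph_texts(html):
    """Return the text of every <p> element, with the tags stripped."""
    soup = BeautifulSoup(html, "html.parser")
    # get_text() joins the text of nested tags; strip() trims outer whitespace.
    return [p.get_text().strip() for p in soup.select("p")]


if __name__ == "__main__":
    for text in paragraph_texts(SAMPLE_HTML):
        print(text)
```

Calling `.strip()` on the joined text, rather than passing `strip=True` to `get_text()`, keeps the spaces around nested tags such as `<em>` intact.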
