Installing BeautifulSoup and Basic Usage


Abvedu


Article link: https://blog.csdn.net/abvedu/article/details/54845345


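"BeautifulSoup is an HTML/XML parser written in Python. It handles malformed markup well and builds a parse tree from it. It provides simple, commonly used operations for navigating, searching, and modifying the parse tree, and it can greatly reduce the time spent writing code to parse web pages." (quoted from the BeautifulSoup documentation)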

1. Installing BeautifulSoup

Official BeautifulSoup site: http://www.crummy.com/software/BeautifulSoup/. The latest version at the time of writing is beautifulsoup4-4.5.3.

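Beautiful Soup 4 is published on PyPI, so if you cannot install it through your system package manager, you can install it with easy_install or pip instead. The package name is beautifulsoup4, and as of this version it is compatible with both Python 2 and Python 3.
easy_install beautifulsoup4
or: pip install beautifulsoup4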

[If you have neither easy_install nor pip, you can download beautifulsoup4-4.5.3.tar and install it with setup.py: python setup.py install]
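
After installing by any of the routes above, it can be useful to confirm that the package is actually visible to the interpreter you intend to use. The following is a minimal sketch using only the standard library (Python 3.8 or later, because of importlib.metadata); the helper name bs4_status is made up for this illustration and is not part of Beautiful Soup.

```python
# Check whether the bs4 package is importable and report its installed version.
from importlib import metadata, util


def bs4_status():
    """Return a short string describing whether beautifulsoup4 is installed.

    bs4_status is a hypothetical helper written for this article.
    """
    # find_spec returns None when no top-level module named bs4 can be found.
    if util.find_spec("bs4") is None:
        return "bs4 is not installed"
    try:
        # The distribution name on PyPI is beautifulsoup4, not bs4.
        version = metadata.version("beautifulsoup4")
    except metadata.PackageNotFoundError:
        return "bs4 is importable, but no beautifulsoup4 metadata was found"
    return "beautifulsoup4 " + version + " is installed"


print(bs4_status())
```

If this reports that bs4 is not installed even though pip finished without errors, a likely cause is that pip and the interpreter running the script belong to different Python installations.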

2. Basic usage of BeautifulSoup4:

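First import the module with from bs4 import BeautifulSoup (note: Python is case-sensitive, so do not write beautifulsoup).
Use the requests module to fetch the web page's data.
Use BeautifulSoup to create an object that represents the parsed HTML.
Use the functions BeautifulSoup provides to access the data in that object; these functions mostly operate on the tags of the HTML document.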
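
The steps above can be sketched offline as follows. To keep the snippet runnable without network access, a small inline HTML string (invented for this illustration) stands in for the text that requests.get(url).text would return; everything else uses standard Beautiful Soup calls.

```python
from bs4 import BeautifulSoup

# Stand-in for requests.get(url).text; this markup is made up for the example.
html = """
<html><head><title>Demo page</title></head>
<body>
  <p class="intro">Hello, <b>soup</b>!</p>
  <a href="/first.html">First</a>
  <a href="/second.html">Second</a>
</body></html>
"""

# Build the parse tree with Python's built-in HTML parser.
soup = BeautifulSoup(html, "html.parser")

# Tag-oriented access: by tag name, by search, and by attribute.
title_text = soup.title.string                            # text of <title>
first_link = soup.find("a")                               # first <a> tag only
all_hrefs = [a.get("href") for a in soup.find_all("a")]   # every <a> tag
intro_text = soup.find("p", class_="intro").get_text()    # text incl. nested tags

print(title_text)
print(first_link.get_text(), first_link["href"])
print(all_hrefs)
print(intro_text)
```

Note that find returns a single tag (or None when nothing matches), while find_all always returns a list, which may be empty.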

3. Example:

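from bs4 import BeautifulSoup
import requests

# Download the page and keep its HTML text
url = 'http://www.timeanddate.com/weather/'
html = requests.get(url).text

# Parse the HTML with Python's built-in parser
sp = BeautifulSoup(html, "html.parser")
print("Type:", type(sp))

# Collect every <a> tag, then inspect the one at index 10
links = sp.find_all('a')
print("------>", links[10])               # the whole tag
print("******", links[10].contents)       # list of the tag's children
print("======", links[10].get('href'))    # value of the href attribute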

4. Output:

Type: <class 'bs4.BeautifulSoup'>
------> <a href="/custom/site.html">My Units</a>
****** ['My Units']
====== /custom/site.html

Process finished with exit code 0
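
The href printed above is relative to the site, and indexing links[10] only works when the page has at least eleven links. As a hedged sketch of how both points might be handled, the snippet below guards the index and resolves relative links against the page URL with urllib.parse.urljoin. The inline markup and the helper name link_at are assumptions made for this illustration, not part of the original example.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

base_url = "http://www.timeanddate.com/weather/"

# Hypothetical markup standing in for the downloaded page.
html = '<p><a href="/custom/site.html">My Units</a> <a name="top">no href</a></p>'
soup = BeautifulSoup(html, "html.parser")


def link_at(links, index):
    """Return the link at index, or None when the list is too short."""
    return links[index] if 0 <= index < len(links) else None


links = soup.find_all("a")
missing = link_at(links, 10)  # this sample has only two links

# Turn every relative href into an absolute URL, skipping tags without href.
absolute = []
for a in links:
    href = a.get("href")  # None when the tag has no href attribute
    if href:
        absolute.append((a.get_text(), urljoin(base_url, href)))

print(missing)
print(absolute)
```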
