Using the BeautifulSoup Module

Latest recommended article published on 2022-07-11 18:03:05

lxccc9

Categories: Notes, python

Article link: https://blog.csdn.net/lxccc9/article/details/118364493

1. Install the BeautifulSoup module

pip install beautifulsoup4
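
One detail that is easy to trip over: the package is installed under the name beautifulsoup4 but imported under the name bs4. A minimal sketch for checking that the install succeeded, assuming Python 3.8 or newer (for importlib.metadata) and that the package was installed with pip as above:

```python
from importlib.metadata import version  # standard library, Python 3.8+

import bs4

# The distribution is named "beautifulsoup4", while the importable
# package is named "bs4".
installed = version("beautifulsoup4")
print("beautifulsoup4 version:", installed)
print("bs4 imported from:", bs4.__file__)
```

If the import fails here, the package was most likely installed into a different Python environment than the one running the script.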

2. Import it in your file

from bs4 import BeautifulSoup
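
As a quick sanity check of the import, the constructor can be given a literal HTML string instead of a file. This is only a sketch; the markup is made up for illustration.

```python
from bs4 import BeautifulSoup

# 'html.parser' is the parser from Python's standard library,
# so no additional package is required.
soup = BeautifulSoup("<p>Hello, <b>soup</b></p>", "html.parser")

# get_text() joins the text of the tag and all of its descendants.
greeting = soup.p.get_text()
print(greeting)
```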

3. Use BeautifulSoup

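with open('./tests/python.html', encoding='utf-8') as f:
    texts = f.read()

bs = BeautifulSoup(texts, 'html.parser')
# Print the <title> tag itself
print(bs.title)
# Get the text of a node
print(bs.title.text)
# Get the name of a node
print(bs.title.name)
# Get the name of the parent node
print(bs.title.parent.name)
# Get all child nodes (.children is an iterator, so this prints the iterator object)
print(bs.p.children)

# Convert the iterator to a list to see the child nodes themselves
print(list(bs.p.children))
# Get an attribute of a node
print(bs.p['class'])
# Get every tag with the given name
print(bs.find_all('a'))

# Print the href attribute of every link
links = bs.find_all('a')
for link in links:
    print(link['href'])

# Get only the first matching tag
print(bs.find('a'))

# Search by condition
# Find the p tag whose class is "title". Because class is a Python keyword,
# the argument is written with a trailing underscore: class_. find returns a single object.
print(bs.find('p', class_='title'))

# Find the p tag whose id is "title1"
print(bs.find('p', id='title1'))

# Find all tags whose class is "title". find_all returns a list.
print(bs.find_all(class_='title'))

# Get all the text content in the document
print(bs.get_text())

# Pretty-print the whole document
print(bs.prettify())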
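
The script above depends on ./tests/python.html, whose contents are not shown in this post. The sketch below repeats a few of the same calls against an inline HTML string so it can be run as-is. The sample markup is hypothetical. It also shows two defensive habits that the original calls skip: using link.get('href') instead of link['href'], and checking the result of find() for None.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for ./tests/python.html,
# whose real contents are not shown in this post.
html = """
<html>
  <head><title>Python Notes</title></head>
  <body>
    <p class="title" id="title1"><b>Useful links</b></p>
    <p class="story">
      <a href="https://example.com/a">First</a>
      <a name="anchor-only">No href here</a>
      <a href="https://example.com/b">Second</a>
    </p>
  </body>
</html>
"""

bs = BeautifulSoup(html, 'html.parser')

title_text = bs.title.text            # text inside <title>
parent_name = bs.title.parent.name    # name of the enclosing tag
first_p_classes = bs.p['class']       # class is multi-valued, so this is a list

# link['href'] raises KeyError when the attribute is missing;
# link.get('href') returns None instead, so such links can be skipped.
hrefs = [link.get('href') for link in bs.find_all('a') if link.get('href')]

# find() returns None when nothing matches, so check before using the result.
missing = bs.find('p', class_='does-not-exist')
missing_text = missing.get_text() if missing is not None else ''

print(title_text)
print(parent_name)
print(first_p_classes)
print(hrefs)
print(repr(missing_text))
```

Whether to skip links without an href or treat them as an error depends on the page being scraped; skipping is just the choice made in this sketch.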
