Traversing the Tag Tree with BeautifulSoup

可爱到冒泡泡

Category: python. Tags: python, BeautifulSoup, traversal

Article link: https://blog.csdn.net/weixin_44151143/article/details/100043894

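The BeautifulSoup library provides a set of functions for traversing the tag tree.
BeautifulSoup supports three kinds of traversal: downward, upward, and sideways (between siblings).

Note: text strings such as ' and ' and '\n' are nodes too. len() includes them when counting nodes, and they show up as separate nodes during sibling traversal.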
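To make the note about string nodes concrete, here is a minimal sketch. The markup in `html` is a hypothetical sample, not taken from the article, and it assumes the built-in `html.parser` backend.

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical sample markup (an assumption for illustration only).
html = "<body>\n<p>first</p>\n<p>second</p>\n</body>"
soup = BeautifulSoup(html, "html.parser")

children = soup.body.contents
# The '\n' strings between the tags are nodes, so len() counts them.
print(len(children))
print([repr(c) for c in children])

# To count only tags, filter the nodes by type.
tags_only = [c for c in children if isinstance(c, Tag)]
print(len(tags_only))
```

Filtering with `isinstance(c, Tag)` is one simple way to skip whitespace strings when only elements matter.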

Downward traversal

.contents usage
For example:

soup.head.contents

Returns something like: [<title>this is a page</title>]

soup.body.contents

Returns something like: ['\n', <p>XXXXX</p>, <p><a class="py" href="http://"></a></p>]
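The two calls above can be tried end to end with a small self-contained sketch. The `html` string is an assumed stand-in for the page used in the article, which is not shown in full.

```python
from bs4 import BeautifulSoup

# Assumed stand-in for the article's page (illustration only).
html = (
    "<html><head><title>this is a page</title></head>"
    '<body>\n<p>XXXXX</p><p><a class="py" href="http://"></a></p></body></html>'
)
soup = BeautifulSoup(html, "html.parser")

# .contents returns a plain Python list of direct children.
head_nodes = soup.head.contents
body_nodes = soup.body.contents
print(head_nodes)
print(body_nodes)

# Because it is a list, it supports len() and indexing.
print(len(body_nodes))
print(body_nodes[1].name)
```

Since `.contents` is a list, it is convenient when you need random access to a child by position.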

.children usage (iterate over the direct child nodes)

for n in soup.body.children:
    print(n)

.descendants usage (iterate over all descendant nodes)

for n in soup.body.descendants:
    print(n)
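The difference between .children and .descendants is easiest to see side by side. The following is a minimal sketch on a hypothetical fragment, not the article's page.

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical fragment (an assumption for illustration only).
html = "<body><p>Hello <b>world</b></p></body>"
soup = BeautifulSoup(html, "html.parser")

# .children yields only the direct children of <body>: the single <p>.
child_nodes = list(soup.body.children)
print(child_nodes)

# .descendants walks the whole subtree in document order:
# <p>, 'Hello ', <b>, 'world'.
descendant_nodes = list(soup.body.descendants)
print(descendant_nodes)

# Names of the tags among the descendants (strings are skipped).
tag_names = [n.name for n in descendant_nodes if isinstance(n, Tag)]
print(tag_names)
```

Both attributes are iterators rather than lists, so wrap them in `list()` when you need a length or an index.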

Upward traversal

Attribute	Usage
.parent	The node's parent tag
.parents	Iterates over the node's ancestors

.parent usage
For example:

soup.title.parent

Returns something like: <head><title>this is a page</title></head>
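A short sketch of how .parent behaves at different levels of the tree may help here. The `html` string is an assumed stand-in for the article's page.

```python
from bs4 import BeautifulSoup

# Assumed stand-in for the article's page (illustration only).
html = "<html><head><title>this is a page</title></head></html>"
soup = BeautifulSoup(html, "html.parser")

# The parent of <title> is <head>.
title_parent = soup.title.parent
print(title_parent.name)

# The parent of the top-level <html> tag is the BeautifulSoup object itself.
print(soup.html.parent is soup)

# The BeautifulSoup object has no parent.
print(soup.parent)
```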

.parents usage

from bs4 import BeautifulSoup

# demo is assumed to hold the HTML source as a string
soup = BeautifulSoup(demo, "html.parser")
for n in soup.a.parents:
    if n is None:
        print(n)
    else:
        print(n.name)
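As a runnable variant of the loop above, the sketch below defines its own markup instead of `demo`. The `html` string is hypothetical. With the bs4 behavior I am aware of, the iteration ends at the BeautifulSoup object, whose name is '[document]', so the None branch is not expected to fire here.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for demo (illustration only).
html = "<html><body><p><a href='http://'>link</a></p></body></html>"
soup = BeautifulSoup(html, "html.parser")

# Collect the name of every ancestor of the first <a>, nearest first.
ancestor_names = [n.name for n in soup.a.parents]
print(ancestor_names)
```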

Sibling traversal

Note: sibling nodes are nodes that share the same parent.

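.next_siblings usage (iterate over the following sibling nodes)

.next_sibling returns only the single next node, so looping requires the plural .next_siblings:

for n in soup.a.next_siblings:
    print(n)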

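.previous_siblings usage (iterate over the preceding sibling nodes)

.previous_sibling returns only the single previous node, so looping requires the plural .previous_siblings:

for n in soup.a.previous_siblings:
    print(n)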
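The singular and plural sibling attributes are easy to mix up, so here is a minimal sketch contrasting them. The fragment is hypothetical and is chosen so that the ' and ' string node appears between two tags.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment (an assumption for illustration only).
html = "<p><a id='1'>one</a> and <a id='2'>two</a>.</p>"
soup = BeautifulSoup(html, "html.parser")

first = soup.a
second = soup.find_all("a")[1]

# Singular form: exactly one node (or None at the edge of the parent).
print(repr(first.next_sibling))
print(first.previous_sibling)

# Plural forms: iterators over all siblings on that side, nearest first.
following = list(first.next_siblings)
preceding = list(second.previous_siblings)
print(following)
print(preceding)
```

Note that the text nodes ' and ' and '.' are listed alongside the tags, which matches the earlier remark that strings are nodes too.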
