Locating elements and extracting values with Python BeautifulSoup

weixin_39922683

Published 2021-02-06 13:51:45


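Copyright notice: This is an original article by the blogger, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.

Article link: https://blog.csdn.net/weixin_39922683/article/details/113891690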


-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-

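Ways to read a tag's name and attribute values from a parsed web page:

1. Tag name: tag.name (the type of tag itself is bs4.element.Tag)

2. All attributes: tag.attrs, a dict mapping attribute names to values

3. A single attribute: tag.get('attribute-name') or tag['attribute-name']. tag.get() returns None when the attribute is missing, while tag[...] raises KeyError.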
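As a minimal sketch of the three access styles above, the snippet below parses a single link. The HTML string and the variable names are made up for illustration and are not part of the original article.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet, used only for illustration.
html = '<a id="link1" class="sister" href="http://example.com/elsie">Elsie</a>'
soup = BeautifulSoup(html, "html.parser")
tag = soup.a                    # first <a> tag in the document

tag_name = tag.name             # the tag's name as a string
tag_attrs = tag.attrs           # dict of every attribute on the tag
href = tag["href"]              # raises KeyError if the attribute is missing
missing = tag.get("title")      # returns None if the attribute is missing
classes = tag.get("class")      # class is multi-valued, so this is a list

print(tag_name, href, missing, classes)
```

Prefer tag.get() when an attribute may be absent, so that the lookup does not raise an exception.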

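Getting a tag's content:

1. tag.string returns the text of the current tag, but only when the tag has exactly one child (a single string, or a single nested tag that itself has a .string); otherwise it returns None

2. tag.get_text() returns all strings inside the tag, including those of nested tags, joined into one string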
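The difference between the two is easiest to see on a tag with mixed content. This is a small sketch with an invented HTML fragment, assuming the built-in html.parser.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one simple paragraph, one with a nested tag.
html = "<div><p>Hello</p><p>big <b>world</b></p></div>"
soup = BeautifulSoup(html, "html.parser")
first_p, second_p = soup.find_all("p")

single = first_p.string          # one string child, so .string works
mixed = second_p.string          # two children (text and <b>), so None
all_text = second_p.get_text()   # concatenates text from nested tags too

# A separator plus strip=True trims each piece before joining.
div_text = soup.div.get_text(" ", strip=True)

print(repr(single), repr(mixed), repr(all_text), repr(div_text))
```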

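Useful BeautifulSoup helpers

1. stripped_strings

The extracted strings may contain many spaces or blank lines. Iterating over .stripped_strings strips leading and trailing whitespace from each string and skips strings that are whitespace only: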

# soup is an already-parsed BeautifulSoup document
for string in soup.stripped_strings:
    print(repr(string))
# "The Dormouse's story"
# "The Dormouse's story"
# 'Once upon a time there were three little sisters; and their names were'
# 'Elsie'
# ','
# 'Lacie'
# 'and'
# 'Tillie'
# ';\nand they lived at the bottom of a well.'

2. Pretty-printing the page:

soup.prettify()
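A self-contained sketch of both helpers follows. The list markup is invented for illustration; the point is to compare .strings (raw, whitespace included) with .stripped_strings, and to show that prettify() returns a string rather than printing anything.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with deliberately messy whitespace.
html = """
<ul>
  <li> Elsie </li>
  <li>
     Lacie
  </li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

raw = list(soup.strings)             # includes whitespace-only strings
clean = list(soup.stripped_strings)  # trimmed, whitespace-only ones dropped
pretty = soup.prettify()             # indented markup as a str

print(clean)
print(pretty)
```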

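Finding elements with BeautifulSoup:

1. find_all(class_="class") returns every matching tag; the result type is bs4.element.ResultSet (a list subclass)

2. find(class_="class") returns the first matching tag; the result type is bs4.element.Tag, or None when nothing matches

3. select_one() takes a CSS selector and returns the first matching tag; the result type is bs4.element.Tag, or None when nothing matches

4. select() takes a CSS selector and returns every matching tag as a list-like collection

5. soup = BeautifulSoup(backdata, 'html.parser')  # parse the fetched data into a BeautifulSoup object

soup.find_all('tag-name', attrs={'attr-name': 'attr-value'})  # returns a list of matches

limit caps the number of results that find_all returns

recursive=False makes find_all consider only the tag's direct children

soup.find_all(text=re.compile(' content ')) matches on text; because the pattern is a regular expression, partial (fuzzy) matches work. Beautiful Soup 4.4.0 and later call this argument string; text is the older name.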
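The search calls listed above can be compared side by side on one small document. This is only a sketch: the markup, ids, and class names are hypothetical, and it uses string= (the newer spelling of text=), which assumes Beautiful Soup 4.4.0 or later.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup, used only for illustration.
html = """
<div id="box">
  <a class="sister" href="/elsie">Elsie</a>
  <a class="sister" href="/lacie">Lacie</a>
  <p><a class="brother" href="/tom">Tom content here</a></p>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

sisters = soup.find_all(class_="sister")                  # all matches
first = soup.find(class_="sister")                        # first match or None
by_attrs = soup.find_all("a", attrs={"href": "/lacie"})   # filter on attributes
limited = soup.find_all("a", limit=2)                     # stop after 2 results
direct = soup.find("div").find_all("a", recursive=False)  # direct children only
css_all = soup.select("div#box a.sister")                 # CSS selector, all matches
css_one = soup.select_one("p > a")                        # CSS selector, first match
by_text = soup.find_all(string=re.compile("content"))     # regex match on text

print(len(sisters), first["href"], len(direct), css_one.get_text())
```

Note that a string= search returns the matching strings themselves, not the tags that contain them; use .parent on a result to reach the enclosing tag.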

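Working with child nodes:

1. contents

.contents returns the tag's direct children as a list

2. children

.children is an iterator over the tag's direct children, for use in a loop

3. descendants

.contents and .children cover only direct children, whereas .descendants iterates over all descendants recursively (children, grandchildren, and so on)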
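A short sketch of the three attributes on an invented list. The markup has no whitespace between tags, so no whitespace-only text nodes show up among the children.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, written without whitespace between tags.
html = "<ul><li>one</li><li>two <b>bold</b></li></ul>"
soup = BeautifulSoup(html, "html.parser")
ul = soup.ul

contents = ul.contents                                  # list of direct children
children_names = [child.name for child in ul.children]  # iterate direct children

# descendants walks the whole subtree: tags and text nodes alike.
descendants = list(ul.descendants)

print(len(contents), children_names, len(descendants))
```

Text nodes count as descendants too, which is why the descendant count is larger than the number of tags.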

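Working with parent nodes:

1. parent

.parent returns an element's parent node

2. find_parents()

Returns the ancestors (all that match the filter, nearest first)

3. find_parent()

Returns the nearest ancestor matching the filter; with no arguments, that is the direct parent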
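The following sketch walks upward from a deeply nested tag. The document structure and the id value are assumptions made for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical nested markup, used only for illustration.
html = "<html><body><div id='outer'><p><b>text</b></p></div></body></html>"
soup = BeautifulSoup(html, "html.parser")
b = soup.b

parent_name = b.parent.name          # direct parent of <b>
nearest_div = b.find_parent("div")   # closest ancestor that is a <div>

# With no filter, find_parents() lists ancestors from nearest to farthest.
ancestor_names = [tag.name for tag in b.find_parents()]

print(parent_name, nearest_div["id"], ancestor_names)
```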

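Working with sibling nodes:

1. next_siblings iterates over all siblings that follow the element

2. previous_siblings iterates over all siblings that precede the element, nearest first

3. find_next_siblings() returns all following siblings that match the filter

4. find_next_sibling() returns the first following sibling that matches; for preceding siblings, use find_previous_sibling() and find_previous_siblings()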
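A minimal sketch of sibling navigation in both directions. The markup and id values are hypothetical; the tags are written without whitespace between them, because otherwise the whitespace text nodes would also appear among the siblings.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with four sibling tags and no whitespace between them.
html = "<p><a id='1'>A</a><b>B</b><a id='2'>C</a><a id='3'>D</a></p>"
soup = BeautifulSoup(html, "html.parser")

first = soup.find("a", id="1")
last = soup.find("a", id="3")

following = [tag.name for tag in first.next_siblings]     # everything after
preceding = [tag.name for tag in last.previous_siblings]  # nearest first

next_a = first.find_next_sibling("a")           # first following <a>
all_next_a = first.find_next_siblings("a")      # every following <a>
prev_a = last.find_previous_sibling("a")        # first preceding <a>

print(following, preceding, next_a["id"], prev_a["id"])
```

The find_* variants accept the same filters as find_all, so they are usually more convenient than looping over next_siblings by hand.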
