Getting the first child tag in Python: Python web scraping (1), BeautifulSoup and parent, child, and sibling tags...

Latest recommended article published 2023-02-25 14:09:17

weixin_39637370

Article tags: getting the first child tag in Python

1. Scraping the elements with a given class from a given page:

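from urllib.request import urlopen
from bs4 import BeautifulSoup

# Download the page and parse it; naming the parser explicitly avoids a bs4 warning.
html = urlopen("http://www.pythonscraping.com/pages/warandpeace.html")
bsObj = BeautifulSoup(html, "html.parser")

# Collect every <span class="green"> element and print its text.
nameList = bsObj.findAll("span", {"class": "green"})
for name in nameList:
    print(name.get_text())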
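The same search can be tried without a network connection. The following is a minimal sketch, assuming a small hand-written HTML string (`SAMPLE_HTML`, a hypothetical stand-in for the downloaded page) and using `find_all`, the current spelling of `findAll`:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
SAMPLE_HTML = """
<html><body>
<p><span class="red">Some quoted speech.</span>
<span class="green">Anna Pavlovna</span> greeted
<span class="green">Prince Vasili</span>.</p>
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Keep only the text of the <span class="green"> elements.
green_names = [span.get_text() for span in soup.find_all("span", {"class": "green"})]
print(green_names)
```

Parsing a local string like this makes it easy to check a selector before pointing it at a live site.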

2. The find and findAll functions

findAll(tag, attributes, recursive, text, limit, keywords)

find(tag, attributes, recursive, text, keywords)

tag is the tag name to match, or a list of tag names:

bsObj.findAll(["h1", "h2", "h3"])
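As a rough illustration of passing several tag names at once, the sketch below parses a hypothetical inline snippet (not taken from the original page) and lists which tags were matched. Results come back in document order.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: three heading levels we want, plus tags we do not.
html = "<h1>Title</h1><p>intro</p><h2>Section</h2><h3>Subsection</h3><h4>Detail</h4>"
soup = BeautifulSoup(html, "html.parser")

# A list of names matches any tag whose name is in the list.
headings = soup.find_all(["h1", "h2", "h3"])
heading_names = [tag.name for tag in headings]
print(heading_names)
```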

attributes is a dictionary that maps attribute names to the values they must have:

nameList = bsObj.findAll("span", {"class": "green"})
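The attribute dictionary can be compared with the keyword form that Beautiful Soup also accepts. This is a sketch on hypothetical markup; `class_` (with a trailing underscore, because `class` is a Python keyword) is the keyword-argument equivalent, and a list of values matches any one of them.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with three differently classed spans.
html = '<span class="green">A</span><span class="red">B</span><span class="blue">C</span>'
soup = BeautifulSoup(html, "html.parser")

# Dictionary form and keyword form select the same element.
by_dict = [s.get_text() for s in soup.find_all("span", {"class": "green"})]
by_keyword = [s.get_text() for s in soup.find_all("span", class_="green")]

# A list of values matches spans whose class is either green or red.
green_or_red = [s.get_text() for s in soup.find_all("span", {"class": ["green", "red"]})]

print(by_dict, by_keyword, green_or_red)
```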

recursive is a Boolean:

True (the default) searches all descendants of the tag

False searches only its direct children
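The difference between the two settings is easiest to see on a small nested document. The sketch below uses hypothetical markup in which one `<p>` is a direct child of the `<div>` and another is nested one level deeper.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one <p> directly under <div>, one nested inside <section>.
html = "<div><p>direct child</p><section><p>nested</p></section></div>"
soup = BeautifulSoup(html, "html.parser")
div = soup.div

# Default (recursive=True): every descendant <p> is found.
all_p = div.find_all("p")

# recursive=False: only <p> tags that are direct children of <div>.
direct_p = div.find_all("p", recursive=False)

print(len(all_p), len(direct_p))
```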

text matches on the text content of tags rather than on their names or attributes.

For example:

nameList = bsObj.findAll(text="the prince")

print(len(nameList))
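A plain string passed this way must equal the whole text node, so near matches are skipped. The sketch below shows this on hypothetical markup and, as an assumption about what a reader may want next, adds a regular-expression search for partial matches. It uses `string=`, the name that newer Beautiful Soup releases give to the `text` argument.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup: two exact occurrences and one longer variant.
html = (
    "<p><span>the prince</span> met <span>the prince</span> "
    "and <span>the princess</span></p>"
)
soup = BeautifulSoup(html, "html.parser")

# Exact match: only text nodes equal to "the prince".
exact = soup.find_all(string="the prince")

# Regex match: any text node that contains "prince".
partial = soup.find_all(string=re.compile("prince"))

print(len(exact), len(partial))
```

The results are text nodes, not tags; use `.parent` on an item to reach the enclosing tag.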

limit
