Extracting text from HTML in Python: using BeautifulSoup to extract only the text from HTML, script...

Latest recommended article published on 2023-07-16 00:30:00

weixin_39736547

Tags: extracting text from HTML in Python

I have HTML like this:

Ages 15

getCurrentLocationVal("loc_loads1",29.45218856,59.38139268,1);

I am trying to extract "Ages 15" using BeautifulSoup.

So I wrote the following Python code.

Code:

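from bs4 import BeautifulSoup as bs
import urllib3

URL = 'html file'  # placeholder for the address of the page

# Fetch the page and parse the response body
http = urllib3.PoolManager()
page = http.request('GET', URL)
soup = bs(page.data, 'html.parser')

# Find the span that holds the age and print all of its text
age = soup.find("span", {"class": "age"})
print(age.text)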

Output:

Age 15 getCurrentLocationVal("loc_loads1",29.45218856,59.38139268,1);

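I only want "Ages 15", not the function call inside the script tag. Is there a way to get only the text "Ages 15", or to exclude the contents of script tags in some other way?

PS: There are too many script tags and different URLs, so I would prefer not to clean up the output with string replacement.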
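
One possible way to approach this, offered as a sketch rather than a definitive answer, is to either drop every script tag from the parsed tree before reading the text, or to read only the strings that are direct children of the span. The markup in SAMPLE_HTML is an assumption: the original tags were lost, so the structure is guessed from the selector soup.find("span", {"class": "age"}) used in the question. The function names text_without_scripts and direct_text_only are hypothetical helpers, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Assumed markup: the real tags are not shown in the question, so this
# structure is a guess based on the "span" / class "age" selector.
SAMPLE_HTML = (
    '<span class="age">Ages 15 '
    '<script>getCurrentLocationVal("loc_loads1",29.45218856,59.38139268,1);</script>'
    '</span>'
)


def text_without_scripts(html, name="span", css_class="age"):
    """Remove every <script> tag from the tree, then read the tag's text."""
    soup = BeautifulSoup(html, "html.parser")
    for script in soup.find_all("script"):
        script.decompose()  # deletes the tag and its contents from the tree
    tag = soup.find(name, {"class": css_class})
    if tag is None:
        return None
    return tag.get_text(" ", strip=True)


def direct_text_only(html, name="span", css_class="age"):
    """Keep only strings that are direct children of the tag."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find(name, {"class": css_class})
    if tag is None:
        return None
    # recursive=False restricts the search to direct children, so strings
    # nested inside <script> (or any other child tag) are skipped.
    pieces = tag.find_all(string=True, recursive=False)
    return " ".join(p.strip() for p in pieces if p.strip())


if __name__ == "__main__":
    print(text_without_scripts(SAMPLE_HTML))
    print(direct_text_only(SAMPLE_HTML))
```

The first helper suits pages where script tags may sit at any depth. The second leaves the tree untouched but also skips text inside other nested tags, which may or may not be wanted.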
