Reading HTML pages with curl in shell: parsing HTML with curl in a shell script

Latest recommended article published 2023-07-31 19:08:06

weixin_39760068

Tags: reading HTML pages with curl in shell

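Don't. Use an HTML parser. Python's BeautifulSoup, for example, is easy to use and handles this with very little effort.

That said, remember that grep works on lines. The pattern is matched against each line separately, not against the input as one whole string.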

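What you can use is -A, which also prints the lines that follow a match. Assuming the track block opens on a line containing class="tracklistInfo" (the class used in the BeautifulSoup example further down):

grep -A2 -E -m 1 'class="tracklistInfo"'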

This should output the matching line followed by the next two lines, which contain:

Diplo - Justin Bieber - Skrillex

Where Are U Now

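You can then get the last or second-to-last line by piping the result to tail:

$ grep -A2 -E -m 1 'class="tracklistInfo"' | tail -n1

<p>Where Are U Now</p>

$ grep -A2 -E -m 1 'class="tracklistInfo"' | tail -n2 | head -n1

<p>Diplo - Justin Bieber - Skrillex</p>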

Then strip the HTML tags with sed:

$ grep -A2 -E -m 1 'class="tracklistInfo"' | tail -n1 | sed 's/<[^>]*>//g'

Where Are U Now

$ grep -A2 -E -m 1 'class="tracklistInfo"' | tail -n2 | head -n1 | sed 's/<[^>]*>//g'

Diplo - Justin Bieber - Skrillex
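
To make each stage of that pipeline explicit, here is a rough Python re-creation of it. It is only a sketch: the helper names grep_after and strip_tags are hypothetical, and the sample markup (in particular the div wrapper) is an assumption, since only the tracklistInfo class and the p elements are known from the examples here.

```python
import re


def grep_after(text, pattern, after=2):
    """Roughly mimic `grep -A<after> -m 1`: return the first line that
    matches, plus the `after` lines that follow it."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        # Like grep, the pattern is tested against one line at a time.
        if re.search(pattern, line):
            return lines[i:i + after + 1]
    return []


def strip_tags(line):
    """Apply the same substitution as sed 's/<[^>]*>//g'."""
    return re.sub(r"<[^>]*>", "", line)


# Assumed sample page; the real markup may differ.
page = """<p>Blah text</p>
<div class="tracklistInfo">
<p>Diplo - Justin Bieber - Skrillex</p>
<p>Where Are U Now</p>
</div>"""

block = grep_after(page, r'class="tracklistInfo"')
artist = strip_tags(block[-2])  # tail -n2 | head -n1
title = strip_tags(block[-1])   # tail -n1
print(artist)
print(title)
```

The sketch has the same weakness as the shell version: it only works while the artist and title sit on the two lines directly after the matching line.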

But as said, this is fragile, likely to break, and not very pretty. Here is the same thing with BeautifulSoup, by the way:

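from bs4 import BeautifulSoup

# Sample markup. The <div> wrapper is an assumption; only the tracklistInfo
# class and the two <p> elements are implied by the code below.
html = '''
<p>Blah text</p>
<div class="tracklistInfo">
<p>Diplo - Justin Bieber - Skrillex</p>
<p>Where Are U Now</p>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')

# In each tracklistInfo block, the first <p> is the artist and the second the title.
for track in soup.find_all(class_='tracklistInfo'):
    print(track.find_all('p')[0].text)
    print(track.find_all('p')[1].text)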

This also works when the page contains several tracklistInfo blocks. Adding that to the shell command would take more work ;-)
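
If installing BeautifulSoup is not an option, the multi-block case can also be sketched with Python's standard-library html.parser. This is a swapped-in technique, not the approach shown above; the names TrackParser and extract_tracks are hypothetical, and the code assumes each tracklistInfo block contains properly closed p elements with the artist first and the title second.

```python
from html.parser import HTMLParser


class TrackParser(HTMLParser):
    """Collect the text of <p> elements inside class="tracklistInfo" blocks."""

    def __init__(self):
        super().__init__()
        self.tracks = []        # one list of <p> texts per block
        self._block_tag = None  # tag name of the block we are inside, if any
        self._depth = 0         # nesting depth of that tag name
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._block_tag is None:
            if "tracklistInfo" in classes:
                self._block_tag = tag
                self._depth = 1
                self.tracks.append([])
        elif tag == self._block_tag:
            self._depth += 1
        if self._block_tag is not None and tag == "p":
            self._in_p = True
            self.tracks[-1].append("")

    def handle_endtag(self, tag):
        if self._block_tag is None:
            return
        if tag == "p":
            self._in_p = False
        elif tag == self._block_tag:
            self._depth -= 1
            if self._depth == 0:
                # Left the block; stop collecting until the next one.
                self._block_tag = None
                self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self.tracks[-1][-1] += data


def extract_tracks(html):
    """Return a list of (artist, title) tuples, one per tracklistInfo block."""
    parser = TrackParser()
    parser.feed(html)
    parser.close()
    return [
        (block[0].strip(), block[1].strip())
        for block in parser.tracks
        if len(block) >= 2
    ]
```

In a shell script, the output of curl could be piped into a small Python script that calls extract_tracks(sys.stdin.read()) and prints each pair on its own line.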
