Python: what is the difference between "lxml", "html.parser", and "html5lib" with Beautiful Soup?

Latest recommended article published 2023-05-23 22:08:32

Mon1st

When using Beautiful Soup, what is the difference between "lxml", "html.parser", and "html5lib"? When would you use one over the others, and what are the benefits of each? Whenever I have used them they seemed interchangeable, but people here keep telling me I should be using a different one. I would like to strengthen my understanding of these. I have read a couple of posts here about this, but they say little, if anything, about when to use each.

Example -

soup = BeautifulSoup(response.text, 'lxml')

Solution

From the docs' summary table of advantages and disadvantages:

html.parser - BeautifulSoup(markup, "html.parser")

Advantages: Batteries included, Decent speed, Lenient (as of Python 2.7.3 and 3.2.2)

Disadvantages: Not very lenient (before Python 2.7.3 or 3.2.2)

lxml - BeautifulSoup(markup, "lxml")

Advantages: Very fast, Lenient

Disadvantages: External C dependency

html5lib - BeautifulSoup(markup, "html5lib")

Advantages: Extremely lenient, Parses pages the same way a web browser does, Creates valid HTML5

Disadvantages: Very slow, External Python dependency
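
The parsers only look interchangeable on well-formed markup; the difference in leniency shows up when the markup is broken. Here is a minimal sketch that feeds the same invalid fragment to each parser and collects what Beautiful Soup makes of it. The helper name parse_with_each and the sample string are made up for illustration, and the sketch assumes bs4 is installed; lxml and html5lib are optional and are reported as missing if absent.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Invalid fragment used for illustration: it closes a <p> that was never opened.
BROKEN_MARKUP = "<a></p>"


def parse_with_each(markup, parsers=("html.parser", "lxml", "html5lib")):
    """Hypothetical helper: map each parser name to the markup it produces.

    A parser that is not installed maps to None instead of raising.
    """
    results = {}
    for name in parsers:
        try:
            results[name] = str(BeautifulSoup(markup, name))
        except FeatureNotFound:
            # Beautiful Soup raises this when the requested parser is unavailable.
            results[name] = None
    return results


if __name__ == "__main__":
    for name, output in parse_with_each(BROKEN_MARKUP).items():
        shown = output if output is not None else "(not installed)"
        print(f"{name:12} {shown}")
```

According to the "Differences between parsers" section of the Beautiful Soup documentation, html.parser simply drops the stray closing tag, lxml also drops it but wraps the result in html and body tags, and html5lib pairs it with an opening p tag and adds an empty head, the way a browser would. Exact output may vary between library versions, so it is worth running this against your own pages before settling on a parser.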
