Parsing HTML in Python: lxml vs. BeautifulSoup

Latest recommended article published 2023-07-06 21:52:05

weixin_30572613

Tags: python

Original link: http://www.cnblogs.com/chaoboma/archive/2013/05/13/3075236.html


In my experience, lxml is faster than BeautifulSoup, and it is more fault-tolerant and more capable when handling HTML.

An lxml example follows:

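    from lxml import etree

    # Method of the author's scraper class. self.open_url is the author's own
    # helper (not shown); it presumably returns a file-like response, or a
    # falsy value when the request fails.
    def getGooglePlayAppInfo(self):
        pageUrl = 'https://play.google.com/store/apps/details?id=com.taobao.taobao'
        pageUrl_openHandle = self.open_url(pageUrl)
        if pageUrl_openHandle:
            pageUrlHtmlSource = pageUrl_openHandle.read().decode("utf-8")
            # Parse the page with lxml's lenient HTML parser
            doc = etree.HTML(pageUrlHtmlSource)
            # Select every <a> whose class attribute is exactly "doc-header-link"
            hrefs = doc.xpath('//a[@class="doc-header-link"]')
            for href in hrefs:
                print(href.text)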
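For a side-by-side view, here is a minimal sketch that runs the same query through both libraries on a small in-memory snippet. The markup, the `SAMPLE_HTML` name, and the two helper functions are made up for illustration and are not part of the original post; the BeautifulSoup half assumes the `bs4` package with Python's built-in `html.parser` backend. It only shows the two APIs next to each other and does not measure speed.

```python
from bs4 import BeautifulSoup
from lxml import etree

# Hypothetical markup for illustration; the <p> tag is deliberately left unclosed.
SAMPLE_HTML = """
<html><body>
<p>Developer links
<a class="doc-header-link" href="/dev/1">Example Developer</a>
<a class="other" href="/x">Ignored</a>
<a class="doc-header-link" href="/dev/2">Second Link</a>
</body></html>
"""


def link_texts_lxml(html):
    """Return the text of matching links using lxml and XPath."""
    doc = etree.HTML(html)
    return [a.text for a in doc.xpath('//a[@class="doc-header-link"]')]


def link_texts_bs4(html):
    """Return the text of matching links using BeautifulSoup."""
    soup = BeautifulSoup(html, "html.parser")
    return [a.get_text() for a in soup.find_all("a", class_="doc-header-link")]


if __name__ == "__main__":
    print(link_texts_lxml(SAMPLE_HTML))
    print(link_texts_bs4(SAMPLE_HTML))
```

One difference worth keeping in mind: the XPath test `@class="doc-header-link"` compares the whole attribute value, while BeautifulSoup's `class_` argument matches any single class token, so the two can diverge on elements that carry several classes.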

Reposted from: https://www.cnblogs.com/chaoboma/archive/2013/05/13/3075236.html
