Understanding the lxml module in Python

  • Overview

    homepage

    Python Crawler Tools, Part 3: XPath Syntax and Using the lxml Library

    The ElementTree XML API in Python

    The lxml XML toolkit is a Pythonic binding for the C libraries libxml2 and libxslt. It is unique in that it combines the speed and XML feature completeness of these libraries with the simplicity of a native Python API, mostly compatible with, but superior to, the well-known ElementTree API. At the time of writing, the latest release works with all CPython versions from 2.7 to 3.9.
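
To make the "mostly compatible with ElementTree" claim concrete, here is a minimal sketch, assuming only the small API subset used below (`Element`, `SubElement`, `set`, `tostring`). The fallback import is an illustrative idiom rather than something the original post prescribes, and the element and attribute names are made up.

```python
# Prefer lxml when it is installed; otherwise fall back to the
# standard-library ElementTree, which shares this part of the API.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# Build a tiny tree using calls that exist in both implementations.
root = etree.Element("root")
child = etree.SubElement(root, "child")  # hypothetical element name
child.text = "hello"
child.set("lang", "en")                  # hypothetical attribute

# Both implementations return bytes here by default.
serialized = etree.tostring(root)
print(serialized)
```

lxml-only features such as `getparent()` or full XPath support are not available after the fallback, so this pattern only suits code that stays within the shared subset.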

  • Tutorial

    # Install first (shell command, not Python): pip install lxml
    from lxml import etree
    root = etree.Element("root")
    root.append(etree.Element("child1"))
    child2 = etree.SubElement(root, "child2")
    child3 = etree.SubElement(root, "child3")
    # Elements are lists
    child = root[0]
    root[0].getparent()
    etree.tostring(root)
    # Elements carry attributes as a dict
    root1 = etree.Element("root", interesting="totally")
    root1.get("interesting")
    root1.set("hello","Huhu")
    etree.tostring(root1)
    root1.keys()  # list all attribute names
    root1.attrib  # dict-like mapping of the attributes
    # Elements contain text
    root2 = etree.Element("root")
    root2.text = "TEXT"
    etree.tostring(root2,with_tail=False,method="text")
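    # Using XPath to find text
    html = etree.HTML("<html><body><p>TEXT</p></body></html>")  # sample markup, assumed for illustration
    html.xpath("string()")  # all text joined into one string
    html.xpath("//text()")  # list of the individual text nodes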
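
The difference between the two XPath expressions is easier to see on markup with nested tags. The following is a small sketch under stated assumptions: the `SAMPLE` string is hypothetical, and `etree.HTML` is used to parse it into an element tree.

```python
from lxml import etree

# Hypothetical sample markup, used only for illustration.
SAMPLE = "<html><body><p>Hello <b>lxml</b> world</p><p>Bye</p></body></html>"

html = etree.HTML(SAMPLE)

# string() concatenates all descendant text of the context node
# into a single string.
full_text = html.xpath("string()")

# //text() selects every text node in the document and returns
# them as a list, one item per node.
text_nodes = html.xpath("//text()")

print(repr(full_text))   # 'Hello lxml worldBye'
print(list(text_nodes))  # ['Hello ', 'lxml', ' world', 'Bye']
```

Use `string()` when one flat string is enough, and `//text()` when each fragment needs to be cleaned or filtered separately.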
    