Python: Injecting HTML content into a tag with `lxml.html`

I'm using the lxml.html library to parse an HTML document.

I located a specific tag, which I call content_tag, and I want to change its content (i.e. the text between its opening and closing tags). The new content is a string with some HTML in it, say 'Hello <b>world!</b>'.

How do I do that? I tried content_tag.text = 'Hello <b>world!</b>', but then it escapes all the HTML tags, replacing < with &lt; etc.

I want to inject the text without escaping any HTML. How can I do that?
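A minimal reproduction of the problem might look like the sketch below. It assumes a plain `<div>` stands in for content_tag (the actual element is not shown in the question) and uses Python 3.

```python
import lxml.html

# Stand-in for the located element (assumption: the real tag is not shown)
content_tag = lxml.html.fromstring('<div>Goodbye.</div>')

# Assigning to .text stores the string as character data, not as markup
content_tag.text = 'Hello <b>world!</b>'

escaped = lxml.html.tostring(content_tag, encoding='unicode')
print(escaped)
# The markup comes back entity-escaped (for example, "<" becomes "&lt;"),
# and the element still has no child elements.
```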

Solution

This is one way:

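#!/usr/bin/env python3
from lxml.html import fromstring, tostring
from lxml.html import builder as E

fragment = """\
<div>
  <div id="inner">This is div.</div>
</div>"""

div = fromstring(fragment)
print(tostring(div, encoding='unicode'))
# <div>
#   <div id="inner">This is div.</div>
# </div>

# Swap the inner element for a new one built with the E factory
div.replace(div.get_element_by_id('inner'), E.DIV('Hello ', E.B('world!')))
print(tostring(div, encoding='unicode'))
# Output (surrounding whitespace aside):
# <div>
#   <div>Hello <b>world!</b></div>
# </div>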

Edit: I should have confessed earlier that I'm not all that familiar with lxml. I looked at the docs and source briefly but didn't find a clean solution. Perhaps someone more familiar will stop by and set us both straight.

In the meantime, this seems to work, but is not well tested:

import lxml.html

content_tag = lxml.html.fromstring('<div>Goodbye.</div>')
content_tag.text = ''  # assumes the element holds only text to start with

for elem in lxml.html.fragments_fromstring('Hello <b>world!</b>'):
    if isinstance(elem, str):  # only the first item can be a string (leading text)
        content_tag.text += elem
    else:
        content_tag.append(elem)

print(lxml.html.tostring(content_tag, encoding='unicode'))
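On the "only the first?" question: as far as I can tell, fragments_fromstring returns a plain string only for text that precedes the first element. Text after an element is kept as that element's tail. A small sketch to check this, using a made-up input string with trailing text:

```python
import lxml.html

# Made-up input: leading text, one element, trailing text
fragments = lxml.html.fragments_fromstring('Hello <b>world!</b> Bye.')

leading_text = fragments[0]  # plain str: the text before the first element
bold = fragments[1]          # the <b> element

print(repr(leading_text))
print(bold.tag, repr(bold.text), repr(bold.tail))
# The trailing ' Bye.' is not a separate list item; it travels as bold.tail,
# so appending the element to another parent carries that text along.
```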

Edit again: this version first removes the existing text and children:

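somehtml = 'Hello <b>world!</b>'

# purge element contents (text and child elements)
content_tag.text = ''
for child in list(content_tag):  # copy the child list, since we mutate it while looping
    content_tag.remove(child)

fragments = lxml.html.fragments_fromstring(somehtml)
# leading text, if any, arrives as a plain string in the first slot
if fragments and isinstance(fragments[0], str):
    content_tag.text = fragments.pop(0)
content_tag.extend(fragments)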
