python etree tostring: How do I tell lxml.etree.tostring(element) not to write namespaces in Python?

I have a huge XML file (1 GB). I want to move some of the elements (entries) to another file with the same header and specifications.

Let's say the original file contains this entry with tag :

...

some text

...

...

...

I use lxml.etree.iterparse to iterate through the file, which works fine. When I find the element with tag , which I'll assume is stored in the variable element, I do:

new_file.write(etree.tostring(element))

But this results in

...

#

some text

...

...

...

So the question is: how can I tell etree.tostring() not to write the xmlns:="some"? Is this possible? I struggled with the API documentation of lxml.etree, but I couldn't find a satisfying answer.

This is what I found for etree.tostring:

tostring(element_or_tree, encoding=None, method="xml",
         xml_declaration=None, pretty_print=False, with_tail=True,
         standalone=None, doctype=None, exclusive=False, with_comments=True)

Serialize an element to an encoded string representation of its XML tree.

None of the parameters of tostring() seems to help. Any suggestions or corrections?

Solution

I often grab a namespace to make an alias for it like this:

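import lxml.etree

ns = None  # optional prefix-to-URI mapping; built below when not supplied
someXML = lxml.etree.XML(someString)
if ns is None:
    # A namespaced tag looks like "{uri}local"; keep the URI between the braces
    ns = {"m": someXML.tag.split("}")[0][1:]}
someid = someXML.xpath('.//m:ImportantThing//m:ID', namespaces=ns)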

You could do something similar to grab the namespace in order to make a regex that will clean it up after using tostring.
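A minimal sketch of that regex cleanup, using only the standard library. The helper names namespace_uri and strip_namespace_decl, the tag name "entry", and the URI "some" are made up for illustration, and the sample string stands in for what etree.tostring(element, encoding="unicode") might return:

```python
import re


def namespace_uri(tag):
    """Return the URI part of an lxml-style tag such as '{uri}local', or None."""
    if isinstance(tag, str) and tag.startswith("{"):
        return tag.split("}")[0][1:]
    return None


def strip_namespace_decl(xml_text, uri):
    """Remove xmlns="uri" and xmlns:prefix="uri" declarations for one URI."""
    pattern = re.compile(
        r'\s+xmlns(?::[\w.-]+)?=(["\'])' + re.escape(uri) + r'\1'
    )
    return pattern.sub("", xml_text)


# Hypothetical stand-in for the serialized element (not taken from the question).
serialized = '<entry xmlns="some" id="1"><child>some text</child></entry>'
uri = namespace_uri("{some}entry")
cleaned = strip_namespace_decl(serialized, uri)
print(cleaned)
```

Note that the cleaned fragment only stays in its namespace if the file it is written into declares that namespace on an enclosing element, as the shared header in the question presumably does.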

Or you could clean up the input string. Find the first space and check whether it is followed by xmlns. If it is, delete the whole xmlns bit up to the next space; if it is not, delete the space. Repeat until there are no more spaces or xmlns declarations, but don't go past the first >.
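The start-tag cleanup could be sketched roughly as follows. This variant swaps the space-by-space scan for a regex, but it still stops at the first >, so declarations on nested elements are left alone. The function name and sample string are hypothetical. It also assumes the text begins with the element's start tag (no XML declaration in front) and that no attribute value contains a literal >:

```python
import re

# One namespace declaration inside a start tag: xmlns="..." or xmlns:prefix="..."
XMLNS_ATTR = re.compile(r'\s+xmlns(?::[\w.-]+)?=(?:"[^"]*"|\'[^\']*\')')


def strip_xmlns_from_start_tag(xml_text):
    """Drop xmlns declarations from the first start tag only."""
    end = xml_text.find(">")
    if end == -1:
        return xml_text  # no complete tag found; leave the text untouched
    head, rest = xml_text[:end], xml_text[end:]
    return XMLNS_ATTR.sub("", head) + rest


# Hypothetical sample; the nested declaration sits past the first ">" and is kept.
sample = '<entry xmlns="some" id="1"><child xmlns:y="keep">text</child></entry>'
result = strip_xmlns_from_start_tag(sample)
print(result)
```

Limiting the substitution to the text before the first > keeps the regex away from element content, where a string that merely looks like an xmlns attribute could otherwise be removed by mistake.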
