Checking whether an XML iter is empty in Python: parsing blank XML tags with lxml and Python

Latest recommended article published on 2022-06-16 10:00:39

張子佾

When parsing XML documents in the format of:

<Car>
    <Color>Blue</Color>
    <Make>Chevy</Make>
    <Model>Camaro</Model>
</Car>

I use the following code:

carData = element.xpath('//Root/Foo/Bar/Car/node()[text()]')
parsedCarData = [{field.tag: field.text for field in carData} for action in carData]
print(parsedCarData[0]['Color'])  # Blue

This code will not work if a tag is empty, such as:

<Car>
    <Color>Blue</Color>
    <Make>Chevy</Make>
    <Model/>
</Car>

Using the same code as above:

carData = element.xpath('//Root/Foo/Bar/Car/node()[text()]')
parsedCarData = [{field.tag: field.text for field in carData} for action in carData]
print(parsedCarData[0]['Model'])  # KeyError

How would I parse this blank tag?

Solution

You're putting in a [text()] filter which explicitly asks only for elements which have text nodes in them... and then you're unhappy when it doesn't give you elements without text nodes?

Leave that filter out, and you'll get your model element:

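>>> import lxml.etree
>>> s = '''
... <Root>
...   <Car>
...     <Color>Blue</Color>
...     <Make>Chevy</Make>
...     <Model/>
...   </Car>
... </Root>
... '''
>>> root = lxml.etree.fromstring(s)
>>> carData = root.xpath('Car/*')
>>> carData
[<Element Color at 0x...>, <Element Make at 0x...>, <Element Model at 0x...>]
>>> dict((child.tag, child.text) for child in carData)
{'Color': 'Blue', 'Make': 'Chevy', 'Model': None}

The query here is Car/* rather than Car/node(): node() also matches the whitespace text nodes between the child elements, and those come back as plain strings with no .tag or .text. The wrapper element name Root is an assumption; any root element with a Car child behaves the same way.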
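To make the handling of the empty tag explicit, here is a minimal sketch of a helper that turns one Car element into a dict and maps blank tags to a default value. The names car_to_dict, SAMPLE and default are hypothetical, as is the sample document, and the fallback to the standard library's xml.etree.ElementTree is an assumption made only so the sketch runs where lxml is not installed.

```python
# Prefer lxml when it is installed; the standard library's ElementTree
# offers the same fromstring()/iter() calls for this simple case.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree

# Assumed sample document, mirroring the <Car> record from the question.
SAMPLE = """
<Root>
  <Car>
    <Color>Blue</Color>
    <Make>Chevy</Make>
    <Model/>
  </Car>
</Root>
"""


def car_to_dict(car, default=None):
    """Map each child element's tag to its text; blank tags get `default`."""
    result = {}
    for child in car:
        # Under lxml, comments and processing instructions are also
        # children; their .tag is not a string, so skip them.
        if not isinstance(child.tag, str):
            continue
        text = child.text
        # <Model/> has text None; <Model>  </Model> has whitespace-only text.
        if text is None or not text.strip():
            result[child.tag] = default
        else:
            result[child.tag] = text.strip()
    return result


root = etree.fromstring(SAMPLE)
cars = [car_to_dict(car) for car in root.iter("Car")]
print(cars[0]["Model"])  # None
```

Passing default='' instead gives an empty string for blank tags, which may be more convenient if the values are later joined or written out as text.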

That said, if your immediate goal is to iterate over the nodes in the tree, you might consider using lxml.etree.iterparse() instead. It hands you elements as they are parsed, so as long as you clear each element once you are done with it, you avoid holding a full tree in memory, and it is typically more efficient than building a tree and then iterating over it with XPath. (Think SAX, but without the insane and painful API.)

Implementing with iterparse could look like this:

import lxml.etree

def get_cars(infile):
    in_car = False
    current_car = {}
    for (event, element) in lxml.etree.iterparse(infile, events=('start', 'end')):
        if event == 'start':
            # A new <Car> opens: start collecting its fields.
            if element.tag == 'Car':
                in_car = True
                current_car = {}
            continue
        # From here on, only 'end' events are handled.
        if not in_car:
            continue
        if element.tag == 'Car':