python getchildren xml: How to correctly parse parent/child XML with Python

Latest recommended article published on 2024-09-13 12:06:24

请叫我铁牛

Tags: python getchildren xml

I have an XML parsing issue that I have been working on for the last few days, and I just can't figure it out. I've used both the ElementTree module built into Python and the lxml library, but I get the same results with each. I would like to keep using ElementTree if I can, but if that library has limitations, lxml would do. Please see the following XML example. What I am trying to do is find each connection element and see which classes it contains. I expect each connection to contain at least one class; if it doesn't, I want to know. The problem is that my code returns ALL THE CLASSES in the document for each connection, instead of only the classes for that specific connection.

DVD

DVD_TEST

For example, here is my Python code and the output that it returns:

for parentConnection in elemetTree.getiterator('connection'):
    # print(parentConnection.tag)
    for childConnection in parentConnection:
        # print(childConnection.text)
        if childConnection.tag == 'id':
            connID = childConnection.text
            print(connID)
    # This query runs on `tree`, not on parentConnection
    for p in tree.xpath('./connections/connection/classes/class'):
        for attrib in p.attrib:
            print('@' + attrib + '=' + p.attrib[attrib])
        children = p.getchildren()
        for child in children:
            print(child.text)

Here is the output:

DVD

DVD_TEST

DVD

DVD_TEST

As you can see, I am printing the text of the CONNECTION ID and then the text of each CLASSNAME. However, both connections show the same CLASSNAME text. The output should really look like this:

DVD

DVD_TEST

As the hand-modified example above shows, each connection ID (parent) should have only its own classes/classnames (children). I just can't figure out how to make this work. If any of you know how to do this, I would love to hear it.

I've tried building a data structure and following other examples on this forum, but I just can't get it to work right.
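
The symptom described in the question is consistent with the class query being evaluated against the whole tree on every pass of the outer loop, rather than against the current connection element. The following is a minimal sketch of that difference using the standard-library ElementTree. The element names `connections`, `connection`, `classes`, `class` and `id` come from the question's code. The `root` element, the `classname` child, the `name` attribute and the connection ids `conn_a` and `conn_b` are assumptions made up for illustration, since the original XML sample did not survive.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: only the element path
# connections/connection/classes/class and the <id> child come from the
# question; everything else here is invented for illustration.
SAMPLE = """
<root>
  <connections>
    <connection>
      <id>conn_a</id>
      <classes>
        <class name="first"><classname>DVD</classname></class>
      </classes>
    </connection>
    <connection>
      <id>conn_b</id>
      <classes>
        <class name="second"><classname>DVD_TEST</classname></class>
      </classes>
    </connection>
  </connections>
</root>
""".strip()

root = ET.fromstring(SAMPLE)


def classes_document_wide(root):
    """Reproduce the symptom: the inner query starts at the root every time."""
    result = []
    for conn in root.iter('connection'):
        names = [cls.findtext('classname')
                 for cls in root.findall('./connections/connection/classes/class')]
        result.append((conn.findtext('id'), names))
    return result


def classes_per_connection(root):
    """Scope the inner query to the connection currently being visited."""
    result = []
    for conn in root.iter('connection'):
        names = [cls.findtext('classname')
                 for cls in conn.findall('./classes/class')]
        result.append((conn.findtext('id'), names))
    return result


print(classes_document_wide(root))
print(classes_per_connection(root))
```

The only change between the two functions is the element the inner `findall` is called on and the path it is given: a path relative to `conn` only sees that connection's subtree.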

Solution

Here is my solution, which does not use XPath. I recommend digging a little further into the lxml documentation; there may be more elegant and direct ways to achieve this. There's a lot to explore!
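
As one possible shape for a more direct route, the sketch below walks each connection's own subtree and collects its class names into a dictionary, which also makes it easy to report connections that have no classes at all. It uses the standard-library ElementTree rather than lxml so that it runs without extra packages. The function names, the `classname` child element, the `root` element and the sample ids are assumptions for illustration, not part of the original answer.

```python
import xml.etree.ElementTree as ET


def classes_by_connection(xml_text):
    """Map each connection id to the class names found inside that connection."""
    root = ET.fromstring(xml_text)
    mapping = {}
    for conn in root.iter('connection'):
        conn_id = conn.findtext('id')
        # iter() called on the connection only walks that connection's subtree.
        mapping[conn_id] = [cls.findtext('classname')
                            for cls in conn.iter('class')]
    return mapping


def connections_without_classes(mapping):
    """Return the ids of connections whose class list is empty."""
    return [conn_id for conn_id, names in mapping.items() if not names]


# Hypothetical input: ids and layout are invented for illustration.
EXAMPLE = """
<root>
  <connections>
    <connection>
      <id>conn_a</id>
      <classes><class><classname>DVD</classname></class></classes>
    </connection>
    <connection>
      <id>conn_b</id>
      <classes><class><classname>DVD_TEST</classname></class></classes>
    </connection>
    <connection>
      <id>conn_c</id>
      <classes/>
    </connection>
  </connections>
</root>
""".strip()

mapping = classes_by_connection(EXAMPLE)
print(mapping)
print(connections_without_classes(mapping))
```

Keeping the lookup and the "no classes" check in separate functions means the same mapping can be reused for printing, validation, or further processing.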

Solution:

from lxml import etree
from io import BytesIO


class FindClasses(object):

    @staticmethod
    def parse_xml(xml_string):
        parser = etree.XMLParser()
        fs = etree.parse(BytesIO(xml_string), parser)
        fstring = etree.tostring(fs, pretty_print=True)
        element = etree.fromstring(fstring)
        return element

    def find(self, xml_string):