[XML] Parsing XML with Python

Latest recommended article published on 2024-10-28 12:05:02

玄云飘风

Categories: python, 基本功 (fundamentals). Tags: xml, python

Article link: https://blog.csdn.net/tfcy694/article/details/85071723

Parsing an XML file with Python's ElementTree

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
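Before editing the tree, it may help to see how a document like the one above is read. The following is a minimal sketch, not the author's code: `SAMPLE` is a shortened inline copy of the document, and `summarize` is a hypothetical helper name. It uses `findall`, `find`, `get`, and `iter` from `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample document (assumption: inlined here so the
# sketch runs without a file on disk).
SAMPLE = """<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""


def summarize(xml_text):
    """Return (country name, rank, neighbor names) for every <country>."""
    root = ET.fromstring(xml_text)
    rows = []
    for country in root.findall('country'):      # direct children named 'country'
        name = country.get('name')               # attribute lookup
        rank = int(country.find('rank').text)    # text of the first <rank> child
        neighbors = [n.get('name') for n in country.iter('neighbor')]
        rows.append((name, rank, neighbors))
    return rows


for row in summarize(SAMPLE):
    print(row)
```

`find` returns `None` when no child matches, so real code would normally check the result before reading `.text`.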

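import copy
import xml.etree.ElementTree as ET

tree = ET.parse("target.xml")   # the XML document above, saved as target.xml
root = tree.getroot()
print(root)

# Copy the element to append before the loop, so later edits cannot change it.
extra = copy.deepcopy(root[0][1])   # <year> of the first country

for child in root:
    if child.tag == 'country':
        child.remove(child[0])              # remove the first sub-element (<rank>)
        child[0].text = '100'               # child[0] is now <year>; elements accept integer indexes only, not string keys
        child[0].set('flag', '111')         # set() adds or overwrites one attribute per call; the name must be a valid XML name, so it cannot start with a digit
        child.append(copy.deepcopy(extra))  # append a sub-element (a separate copy for each parent)
        child.remove(child[0])              # remove the first sub-element again (the modified <year>)

tree.write('target1.xml')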
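Integer indexes such as `child[0]` shift whenever an element is removed, so looking elements up by tag is often less fragile. The sketch below is one possible variant, not the author's method: `SAMPLE`, the `bump_ranks` helper, and the `<note>` element are all hypothetical and exist only for illustration.

```python
import copy
import xml.etree.ElementTree as ET

# Reduced inline copy of the sample document (assumption, for a runnable sketch).
SAMPLE = """<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
    </country>
</data>"""


def bump_ranks(xml_text, note_text):
    """Set every <rank> to '100' and append a <note> copy to each country."""
    root = ET.fromstring(xml_text)
    note = ET.Element('note')          # hypothetical element, illustration only
    note.text = note_text
    for country in root.iter('country'):
        rank = country.find('rank')    # look up by tag, not by position
        if rank is not None:
            rank.text = '100'
            rank.set('updated', 'no')  # overwrite an existing attribute
        country.append(copy.deepcopy(note))  # each parent gets its own copy
    return root


root = bump_ranks(SAMPLE, 'edited')
print(ET.tostring(root, encoding='unicode'))
```

Appending the same `Element` object to several parents makes them share one node, which is why the sketch appends a `copy.deepcopy` each time.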

Parsing the VOC dataset (https://github.com/amdegroot/ssd.pytorch/blob/master/data/voc0712.py)

    def __call__(self, target, width, height):
        """
        Arguments:
            target (annotation) : the target annotation to be made usable will be an ET.Element
        Returns:
            a list containing lists of bounding boxes  [bbox coords, class name]
        """
        res = []
        for obj in target.iter('object'):
            difficult = int(obj.find('difficult').text) == 1
            if not self.keep_difficult and difficult:
                continue
            name = obj.find('name').text.lower().strip()
            bbox = obj.find('bndbox')

            pts = ['xmin', 'ymin', 'xmax', 'ymax']
            bndbox = []
            for i, pt in enumerate(pts):
                cur_pt = int(bbox.find(pt).text) - 1
                # scale height or width
                cur_pt = cur_pt / width if i % 2 == 0 else cur_pt / height
                bndbox.append(cur_pt)
            label_idx = self.class_to_ind[name]
            bndbox.append(label_idx)
            res += [bndbox]  # [xmin, ymin, xmax, ymax, label_ind]
            # img_id = target.find('filename').text[:-4]

        return res  # [[xmin, ymin, xmax, ymax, label_ind], ... ]
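The method above belongs to a class and cannot run on its own. As a rough standalone sketch of the same logic: `ANNOTATION` is a hand-written, hypothetical annotation in the VOC layout, `CLASS_TO_IND` is an assumed two-class mapping, and `parse_voc` is a hypothetical function name. Unlike the excerpt, which receives `width` and `height` as arguments, this sketch reads them from the `<size>` element.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written annotation in the Pascal VOC layout.
ANNOTATION = """<annotation>
    <filename>000001.jpg</filename>
    <size><width>100</width><height>50</height><depth>3</depth></size>
    <object>
        <name>Dog</name>
        <difficult>0</difficult>
        <bndbox><xmin>11</xmin><ymin>6</ymin><xmax>51</xmax><ymax>26</ymax></bndbox>
    </object>
    <object>
        <name>person</name>
        <difficult>1</difficult>
        <bndbox><xmin>1</xmin><ymin>1</ymin><xmax>21</xmax><ymax>41</ymax></bndbox>
    </object>
</annotation>"""

CLASS_TO_IND = {'dog': 0, 'person': 1}  # assumed mapping, illustration only


def parse_voc(xml_text, class_to_ind, keep_difficult=False):
    """Return [[xmin, ymin, xmax, ymax, label_idx], ...] scaled to [0, 1]."""
    target = ET.fromstring(xml_text)
    size = target.find('size')
    width = int(size.find('width').text)
    height = int(size.find('height').text)
    res = []
    for obj in target.iter('object'):
        difficult = int(obj.find('difficult').text) == 1
        if difficult and not keep_difficult:
            continue                                   # skip hard examples
        name = obj.find('name').text.lower().strip()
        bbox = obj.find('bndbox')
        box = []
        for i, pt in enumerate(['xmin', 'ymin', 'xmax', 'ymax']):
            cur_pt = int(bbox.find(pt).text) - 1       # same -1 offset as the excerpt
            # even indexes are x values (scale by width), odd are y (by height)
            box.append(cur_pt / width if i % 2 == 0 else cur_pt / height)
        box.append(class_to_ind[name])
        res.append(box)
    return res


print(parse_voc(ANNOTATION, CLASS_TO_IND))
print(parse_voc(ANNOTATION, CLASS_TO_IND, keep_difficult=True))
```

With `keep_difficult=False` the second object is dropped because its `<difficult>` flag is 1.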