A Summary of XML Processing in Python

Latest recommended article published on 2024-08-16 10:30:50

cncxz5801


Category: deeplearn

Article link: https://blog.csdn.net/cncxz5801/article/details/81475514


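Preface: this project needs to read information out of XML files in order to visualize how the data is distributed, so here is a brief introduction to Python's xml.etree.ElementTree package. The annotation file below serves as the running example.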

<annotation>
	<folder>data</folder>
	<filename>000001.jpg</filename>
	<path>D:\Program Files\Profiles\labelimg\data\000001.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>1500</width>
		<height>1000</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>e</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>727</xmin>
			<ymin>377</ymin>
			<xmax>791</xmax>
			<ymax>441</ymax>
		</bndbox>
	</object>
	<object>
		<name>o</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>710</xmin>
			<ymin>209</ymin>
			<xmax>763</xmax>
			<ymax>245</ymax>
		</bndbox>
	</object>
	<object>
		<name>l</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>917</xmin>
			<ymin>319</ymin>
			<xmax>956</xmax>
			<ymax>357</ymax>
		</bndbox>
	</object>
	<object>
		<name>r</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>1461</xmin>
			<ymin>217</ymin>
			<xmax>1495</xmax>
			<ymax>253</ymax>
		</bndbox>
	</object>
</annotation>

1. Basic operations

---Parsing XML:

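import xml.etree.ElementTree as ET

tree = ET.parse("path/to/your.xml")  # returns an ElementTree object

root = tree.getroot()  # get the root element

ET.dump(root)  # print the whole XML tree to stdout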

---Every Element object has the following properties:

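tag: a string naming the kind of data the element represents

attrib: a dictionary holding the element's attributes

text: a string holding the content between the element's start tag and its first child (or its end tag)

tail: a string holding the text that follows the element's closing tag, up to the next tag

zero or more child elements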

import xml.etree.ElementTree as ET  # xml.etree.cElementTree was removed in Python 3.9

tree = ET.parse("000001.xml")
root = tree.getroot()
print("root:", root)
print("tag:", root.tag)
print("attrib:", root.attrib)
print("text:", repr(root.text))  # repr() makes the whitespace visible
print("tail:", root.tail)
'''
root: <Element 'annotation' at 0x7f72dfc8ce08>
tag: annotation
attrib: {}
text: '\n\t'
tail: None'''
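
The difference between text and tail is easier to see on mixed content. The sketch below is only an illustration: the XML snippet is made up and is not part of the annotation file, and ET.fromstring is used so that the example parses a string rather than a file.

```python
import xml.etree.ElementTree as ET

# Hypothetical snippet, invented only to show text vs. tail
snippet = '<p id="intro">Hello <b>bold</b> world</p>'
p = ET.fromstring(snippet)
b = p[0]  # the first (and only) child element

print(p.tag)          # p
print(p.attrib)       # {'id': 'intro'}
print(repr(p.text))   # 'Hello '  (text before the first child)
print(repr(b.text))   # 'bold'
print(repr(b.tail))   # ' world'  (text after </b>, still inside <p>)
print(len(p))         # 1 child element
```

In the annotation file, text and tail of the container elements hold only indentation whitespace, which is why they look empty when printed.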

---Simple traversal:

# iterate over all direct children of the root
for child in root:
    print("tag:", child.tag, "attrib:", child.attrib, "text:", child.text)
'''
tag: folder attrib: {} text: data
tag: filename attrib: {} text: 000001.jpg
tag: path attrib: {} text: D:\Program Files\Profiles\labelimg\data\000001.jpg
tag: source attrib: {} text: 
		
tag: size attrib: {} text: 
		
tag: segmented attrib: {} text: 0
tag: object attrib: {} text: 
		
tag: object attrib: {} text: 
		
tag: object attrib: {} text: 
		
tag: object attrib: {} text:
'''
# access children by index, like a nested list
print(root[4][1].tag)  # height

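---Some handy lookup methods:

1.find(match) # returns the first matching child element; match can be a tag name or a path (ElementTree supports a limited subset of XPath)

2.findall(match) # returns a list of all matching child elements

3.findtext(match, default=None) # returns the text of the first matching child element, or default if nothing matches

4.iter(tag=None) # creates a tree iterator with the current element as root; if tag is not None, only elements with that tag are yielded

5.iterfind(match) # like findall, but returns an iterator instead of a list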

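for child in root.iter("name"):  # searches all descendants, not just the next level
    print(child.text)
for child in root.findall("object"):  # matches direct children only
    print(child.findtext("name"))  # child.text itself is only whitespace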
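
Putting these methods together, the bounding boxes can be pulled out of an annotation for later plotting. This is a minimal sketch, not the project's actual code: read_boxes is a hypothetical helper name, and the XML is a shortened copy of the annotation above, inlined so the example runs on its own.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the annotation above (two objects), inlined for the example
xml_text = """
<annotation>
    <filename>000001.jpg</filename>
    <size><width>1500</width><height>1000</height><depth>3</depth></size>
    <object>
        <name>e</name>
        <bndbox><xmin>727</xmin><ymin>377</ymin><xmax>791</xmax><ymax>441</ymax></bndbox>
    </object>
    <object>
        <name>o</name>
        <bndbox><xmin>710</xmin><ymin>209</ymin><xmax>763</xmax><ymax>245</ymax></bndbox>
    </object>
</annotation>
""".strip()

root = ET.fromstring(xml_text)


def read_boxes(root):
    """Hypothetical helper: collect one dict per <object> element."""
    boxes = []
    for obj in root.findall("object"):       # direct children only
        bnd = obj.find("bndbox")             # first matching child
        boxes.append({
            "name": obj.findtext("name"),
            "xmin": int(bnd.findtext("xmin")),
            "ymin": int(bnd.findtext("ymin")),
            "xmax": int(bnd.findtext("xmax")),
            "ymax": int(bnd.findtext("ymax")),
        })
    return boxes


boxes = read_boxes(root)
width = int(root.findtext("size/width"))     # a path reaches nested elements
print(width)                                 # 1500
for box in boxes:
    print(box)
```

All element text is a string, so numeric fields have to be converted with int() before they are used for statistics or plotting.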

----Modifying XML

1. Attribute operations (these act on the attributes inside a single tag)

After making changes: tree.write("path/to/output.xml") # save

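attrib # dictionary containing the element's attributes
keys() # returns the element's attribute names
items() # returns the attributes as (name, value) pairs
get(key, default=None) # reads an attribute
set(key, value) # updates or adds an attribute
del xxx.attrib[key] # deletes the given attribute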
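
A short sketch of these attribute operations follows. The element and its attribute names (id, label, checked) are made up for illustration, since the annotation file above carries no attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical element with attributes, invented for this example
elem = ET.fromstring('<object id="1" label="e"/>')

print(sorted(elem.keys()))         # attribute names
print(sorted(elem.items()))        # (name, value) pairs
print(elem.get("id"))              # 1
print(elem.get("missing", "n/a"))  # n/a, the default for an absent key

elem.set("label", "o")             # update an existing attribute
elem.set("checked", "yes")         # add a new attribute
del elem.attrib["id"]              # delete an attribute

# Serialize to a string to inspect the result
print(ET.tostring(elem, encoding="unicode"))
```

Attribute values are always strings, so elem.get("id") returns "1" rather than the integer 1.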

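2. Node operations

Removing a node: .remove(subelement) # only removes a direct child of the element it is called on

Ways to add child elements: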

append(subelement)
extend(subelements)
insert(index, element)
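
The node operations can be sketched as follows. The tree is a cut-down version of the annotation above and the values are illustrative; ET.Element and ET.SubElement, which the text does not mention, are the standard constructors for new nodes.

```python
import xml.etree.ElementTree as ET

# Cut-down annotation with two objects, inlined for the example
root = ET.fromstring(
    "<annotation>"
    "<object><name>e</name></object>"
    "<object><name>o</name></object>"
    "</annotation>"
)

# remove(): drop every object named "o"; findall returns a list,
# so removing from root while looping over it is safe
for obj in root.findall("object"):
    if obj.findtext("name") == "o":
        root.remove(obj)

# append(): add one new child at the end
new_obj = ET.Element("object")
ET.SubElement(new_obj, "name").text = "l"
root.append(new_obj)

# insert(): add a child at a given position
folder = ET.Element("folder")
folder.text = "data"
root.insert(0, folder)

# extend(): add several children at once
root.extend([ET.Element("segmented"), ET.Element("source")])

print([child.tag for child in root])
# ['folder', 'object', 'object', 'segmented', 'source']
```

To persist a tree that was built or edited this way, wrap the root with ET.ElementTree(root) and call write() on it.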
