5.3 [Data Encoding and Processing] Parsing XML

Latest recommended article published on 2023-06-02 23:08:04

同学他叫Hugh

Category: Python  Tags: python

Article link: https://blog.csdn.net/bugnerH/article/details/130153379


Parsing XML
XML is a widely used markup language that provides a uniform way to describe an application's structured data:
<?xml version="1.0"?>
<data>
	<country name="Liechtenstein">
		<rank updated="yes">2</rank>
		<year>2020</year>
		<gdppc>141100</gdppc>
		<neighbor name="Austria" direction="E"/>
		<neighbor name="Switzerland" direction="W"/>
	</country>
	<country name="Singapore">
		<rank updated="yes">5</rank>
		<year>2021</year>
		<gdppc>59900</gdppc>
		<neighbor name="Malaysia" direction="N"/>
	</country>
	<country name="Panama">
		<rank updated="yes">69</rank>
		<year>2021</year>
		<gdppc>13600</gdppc>
		<neighbor name="Costa Rica" direction="W"/>
		<neighbor name="Colombia" direction="E"/>
	</country>
</data>
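The parse function in the standard library's xml.etree.ElementTree module parses an XML document. The listing below assumes the sample above has been saved as demo.xml: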

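from xml.etree.ElementTree import parse

with open('demo.xml', 'rb') as f:  # binary mode lets the parser detect the encoding
	et = parse(f)
root = et.getroot()
root.tag           # tag name: 'data'
root.attrib        # attribute dict: {}
root.text          # text before the first child: '\n\t'
root.text.strip()  # ''
for child in root:  # iterate over the direct children
	print(child.get('name'))  # read a specific attribute
root.find('country')      # first matching element among the direct children
root.findall('country')   # list of all matching direct children
root.iterfind('country')  # the same matches, as a lazy generator
for e in root.iterfind('country'):
	print(e.get('name'))
root.iter()  # iterator over this element and all of its descendants
list(root.iter())
list(root.iter('rank'))  # only the <rank> elements, at any depth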
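
The calls above can be combined into one self-contained sketch. To avoid depending on demo.xml, it parses an abbreviated copy of the sample from a string with fromstring. The names XML_TEXT, summarize, and countries are made up for this illustration and are not part of the original listing.

```python
from xml.etree.ElementTree import fromstring

# Abbreviated copy of the sample document (hypothetical inline data,
# used here so the example runs without demo.xml).
XML_TEXT = """<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2020</year>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2021</year>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""


def summarize(xml_text):
    """Map each country name to its rank, year, and neighbor names."""
    root = fromstring(xml_text)  # returns the root Element directly
    summary = {}
    for country in root.iter('country'):
        summary[country.get('name')] = {
            # findtext() returns the text of the first matching child
            'rank': int(country.findtext('rank')),
            'year': int(country.findtext('year')),
            'neighbors': [n.get('name') for n in country.iter('neighbor')],
        }
    return summary


countries = summarize(XML_TEXT)
print(countries['Singapore'])
```

Element text always comes back as a string, which is why the sketch converts rank and year with int() explicitly.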

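root.findall('country/*')     # every child of each <country> (grandchildren of root)
root.findall('.//rank')       # // matches <rank> elements at any depth
root.findall('.//rank/..')    # .. selects the parent of each <rank>
root.findall('country[@name]')              # <country> elements that have a name attribute
root.findall('country[@name="Singapore"]')  # name attribute equals "Singapore"
root.findall('country[rank]')               # <country> elements that have a <rank> child
root.findall('country[rank="5"]')           # <rank> child whose text is "5"
root.findall('country[1]')                  # first <country> (positions start at 1)
root.findall('country[last()]')             # last <country>
root.findall('country[last()-1]')           # second-to-last <country>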
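
To see what these path expressions select, here is a small runnable sketch against an abbreviated copy of the sample. The helper names() and the variable names are assumptions introduced for this illustration. ElementTree supports only a limited subset of XPath, so expressions beyond the forms shown here may not work.

```python
from xml.etree.ElementTree import fromstring

# Abbreviated copy of the sample document (hypothetical inline data).
XML_TEXT = """<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2020</year>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2021</year>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2021</year>
    </country>
</data>"""

root = fromstring(XML_TEXT)


def names(elements):
    """Return the name attribute of each element, in match order."""
    return [e.get('name') for e in elements]


by_attribute = names(root.findall('country[@name="Singapore"]'))
by_child_text = names(root.findall('country[rank="5"]'))
first = names(root.findall('country[1]'))
last = names(root.findall('country[last()]'))
second_to_last = names(root.findall('country[last()-1]'))
rank_parents = names(root.findall('.//rank/..'))

print(by_attribute)    # ['Singapore']
print(by_child_text)   # ['Singapore']
print(first)           # ['Liechtenstein']
print(last)            # ['Panama']
print(second_to_last)  # ['Singapore']
print(rank_parents)    # ['Liechtenstein', 'Singapore', 'Panama']
```

Note that the predicate [rank="5"] compares against the element's text as a string, so it matches the literal text "5" rather than a numeric value.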