A First Look at Parsing XML Files with Python's xml Standard Library

敲代码的小风

Article link: https://blog.csdn.net/m0_46653437/article/details/109685844

Code: xml处理初体验.py

# A first look at parsing an XML file...
# The contents of country_data.xml are as follows:

'''
<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
'''

import xml.etree.ElementTree as ET
tree = ET.parse('country_data.xml') # read the file and build the tree structure
# print(tree) # <xml.etree.ElementTree.ElementTree object at 0x000001FB12B0EF08>
root = tree.getroot() # get the root element of the tree
# print(root) # <Element 'data' at 0x000001FB12B922C8>
# print(root.tag)      # data
# print(root.attrib)     # {}

'''
Analysis:
<country name="Singapore"></country>
Here country is the tag, and name="Singapore" is the attrib
'''

for child in root: # iterate over all direct children
    print(child.tag, child.attrib)

"""
Console output:
country {'name': 'Liechtenstein'}
country {'name': 'Singapore'}
country {'name': 'Panama'}
"""

print("# Children are nested, and we can access specific child nodes by index:")
# Children are nested, and we can access specific child nodes by index:
print(root[0][0].text)  # 1
print(root[0][1].text)  # 2008
print(root[0][2].text)  # 141100
print(root[0][3].text)  # None
print(root[0][4].text)  # None
# print(root[0][5].text)  # IndexError: child index out of range

print("Print the tag field:")
print(root[0][0].tag)  # rank
print(root[0][1].tag)  # year
print(root[0][2].tag)  # gdppc
print(root[0][3].tag)  # neighbor
print(root[0][4].tag)  # neighbor

print("Print the attrib field:")
print(root[0][0].attrib)  # {}
print(root[0][1].attrib)  # {}
print(root[0][2].attrib)  # {}
print(root[0][3].attrib)  # {'name': 'Austria', 'direction': 'E'}
print(root[0][4].attrib)  # {'name': 'Switzerland', 'direction': 'W'}
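
Index-based access depends on the exact order of the children, and an out-of-range index raises IndexError. The following is a minimal sketch, assuming a shortened inline copy of the sample data (the `SAMPLE` string and the names `first_country`, `child_count` and `pairs` are illustrative and not part of the original script). It shows that `len()` reports the number of direct children and that self-closing elements such as `<neighbor .../>` have `text` equal to `None`:

```python
import xml.etree.ElementTree as ET

# Shortened inline copy of the sample data (illustrative only).
SAMPLE = """<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
</data>"""

root = ET.fromstring(SAMPLE)  # fromstring() returns the root Element directly
first_country = root[0]

# len() on an Element counts its direct children, so valid indexes are 0..len-1.
child_count = len(first_country)
print(child_count)  # 5

# Collect (tag, text) pairs; self-closing elements carry no text, so text is None.
pairs = [(child.tag, child.text) for child in first_country]
print(pairs)
```

Checking `len()` before indexing is one way to avoid the IndexError shown in the commented-out line above.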



'''
Element has some useful methods
that help iterate recursively over the whole subtree below it
(its children, their children, and so on).
For example, Element.iter().

iter(tag=None)
Creates a tree iterator with the current element as the root. 
The iterator iterates over this element and all elements below it, 
in document (depth first) order. If tag is not None or '*', 
only elements whose tag equals tag are returned from the iterator. 
If the tree structure is modified during iteration, the result is undefined.
'''
print("Demonstrating Element.iter():")
for neighbor in root.iter('neighbor'):
    print(neighbor.attrib)
"""
Console output:
{'name': 'Austria', 'direction': 'E'}
{'name': 'Switzerland', 'direction': 'W'}
{'name': 'Malaysia', 'direction': 'N'}
{'name': 'Costa Rica', 'direction': 'W'}
{'name': 'Colombia', 'direction': 'E'}
"""
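
When called without a tag, `iter()` yields the element itself followed by every descendant in document order, as the quoted documentation describes. A minimal sketch, assuming a shortened inline copy of the data (the `SAMPLE` string and the names `all_tags` and `neighbor_names` are illustrative):

```python
import xml.etree.ElementTree as ET

# Shortened inline copy of the sample data (illustrative only).
SAMPLE = """<data>
    <country name="Singapore">
        <rank>4</rank>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""

root = ET.fromstring(SAMPLE)

# No tag argument: the root comes first, then all descendants depth first.
all_tags = [elem.tag for elem in root.iter()]
print(all_tags)  # ['data', 'country', 'rank', 'neighbor']

# With a tag argument: only matching elements, at any depth.
neighbor_names = [n.get('name') for n in root.iter('neighbor')]
print(neighbor_names)  # ['Malaysia']
```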





"""
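Element.findall() finds only elements with the given tag that are direct children of the current element.
Element.find() finds the first child with a given tag,
and Element.text then gives access to that element's text content.
Element.get() accesses the element's attributes.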
"""
print("Demonstrating Element.findall():")
for country in root.findall('country'):
    rank = country.find('rank').text
    name = country.get('name')
    print(name, rank)
"""
Console output:
Liechtenstein 1
Singapore 4
Panama 68
"""
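
Two details of the methods above are easy to miss: `findall()` does not descend past direct children, and `find()` returns `None` when nothing matches, so `country.find('rank').text` fails on a country without a `<rank>`. The sketch below assumes a shortened inline copy of the data in which Panama's `<rank>` has been deliberately removed for illustration (the `SAMPLE` string and the names `direct`, `nested` and `ranks` are not from the original script):

```python
import xml.etree.ElementTree as ET

# Illustrative data: Panama's <rank> is deliberately omitted here.
SAMPLE = """<data>
    <country name="Singapore">
        <rank>4</rank>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <neighbor name="Costa Rica" direction="W"/>
    </country>
</data>"""

root = ET.fromstring(SAMPLE)

# neighbor elements are grandchildren of root, so findall() on root misses them,
# while iter() searches the whole subtree.
direct = root.findall('neighbor')
nested = list(root.iter('neighbor'))
print(len(direct), len(nested))  # 0 2

# findtext() returns the default instead of failing when the child is missing.
ranks = {}
for country in root.findall('country'):
    ranks[country.get('name')] = country.findtext('rank', default='n/a')
print(ranks)  # {'Singapore': '4', 'Panama': 'n/a'}
```

Whether a placeholder such as `'n/a'` is appropriate depends on the data; checking the result of `find()` against `None` explicitly is an equally valid choice.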