[Python] How to add, modify, and delete content in an XML file


Sample XML file:

<?xml version="1.0"?>
<data>
    <disabled>false</disabled>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
    <builders>
        <hudson.tasks.Shell>
            <command>echo "Hello world!"</command>
        </hudson.tasks.Shell>
    </builders>
</data>

1 Creating an XML file that matches the sample

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# author: Sudley
# ctime: 2020/02/16

import xml.etree.ElementTree as ET

def create_Xml(xml_file):
    # Create an XML file that contains only the root element
    with open(xml_file,'w') as f:
        f.write('<?xml version="1.0"?>\n')
        f.write('<data>\n')
        f.write('</data>\n')

    # Parse the XML file with the ET module
    tree = ET.parse(xml_file)
    root = tree.getroot()
    # Create the disabled element
    SubElement_disabled = ET.SubElement(root,'disabled')
    SubElement_disabled.text = 'false'

    # Create the first country element
    SubElement_country0 = ET.SubElement(root,'country',attrib={'name':'Liechtenstein'})  # attrib takes a dict; do not add extra quotes inside the value, the serializer adds them
    SubElement_country0_rank = ET.SubElement(SubElement_country0,'rank')
    SubElement_country0_rank.text = '1'       # text must be a str; an int cannot be serialized
    SubElement_country0_year = ET.SubElement(SubElement_country0,'year')
    SubElement_country0_year.text = '2008'
    SubElement_country0_gdppc = ET.SubElement(SubElement_country0,'gdppc')
    SubElement_country0_gdppc.text = '141100'
    SubElement_country0_neighbor0 = ET.SubElement(SubElement_country0,'neighbor',attrib={'name':'Austria','direction':'E'})
    SubElement_country0_neighbor1 = ET.SubElement(SubElement_country0,'neighbor',attrib={'name':'Switzerland','direction':'W'})

    # Create the second country element
    SubElement_country1 = ET.SubElement(root,'country',attrib={'name':'Singapore'})
    SubElement_country1_rank = ET.SubElement(SubElement_country1,'rank')
    SubElement_country1_rank.text = '4'
    SubElement_country1_year = ET.SubElement(SubElement_country1,'year')
    SubElement_country1_year.text = '2011'
    SubElement_country1_gdppc = ET.SubElement(SubElement_country1,'gdppc')
    SubElement_country1_gdppc.text = '59900'
    SubElement_country1_neighbor0 = ET.SubElement(SubElement_country1,'neighbor',attrib={'name':'Malaysia','direction':'N'})

    # Create the third country element
    SubElement_country2 = ET.SubElement(root,'country',attrib={'name':'Panama'})
    SubElement_country2_rank = ET.SubElement(SubElement_country2,'rank')
    SubElement_country2_rank.text = '68'
    SubElement_country2_year = ET.SubElement(SubElement_country2,'year')
    SubElement_country2_year.text = '2011'
    SubElement_country2_gdppc = ET.SubElement(SubElement_country2,'gdppc')
    SubElement_country2_gdppc.text = '13600'
    SubElement_country2_neighbor0 = ET.SubElement(SubElement_country2,'neighbor',attrib={'name':'Costa Rica','direction':'W'})
    SubElement_country2_neighbor1 = ET.SubElement(SubElement_country2,'neighbor',attrib={'name':'Colombia','direction':'E'})

    # Create the builders element
    SubElement_builders = ET.SubElement(root,'builders')
    SubElement_builders_Shell = ET.SubElement(SubElement_builders,'hudson.tasks.Shell')
    SubElement_builders_Shell_command = ET.SubElement(SubElement_builders_Shell,'command')
    SubElement_builders_Shell_command.text = 'echo "Hello world!"'

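    # Everything created above is serialized on a single line, which is hard to read; pretty-print it by adding newlines and indentation around the elements
    prettyXml(root, '    ', '\n')            # apply the pretty-printing helper
    ET.dump(root)                 # print the pretty-printed XML

    tree.write(xml_file)                   # write the changes back to the local XML file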

def prettyXml(element, indent, newline, level = 0): # element is the Element to format; indent is the indentation unit, newline the line separator
    if len(element):  # the element has child elements (testing the Element itself for truth is deprecated)
        if element.text is None or element.text.isspace(): # the element has no text of its own
            element.text = newline + indent * (level + 1)
        else:
            element.text = newline + indent * (level + 1) + element.text.strip() + newline + indent * (level + 1)
    #else:  # uncomment these two lines to put the text of leaf elements on its own line as well
        #element.text = newline + indent * (level + 1) + element.text.strip() + newline + indent * level
    temp = list(element) # copy the child elements into a list
    for subelement in temp:
        if temp.index(subelement) < (len(temp) - 1): # not the last child: the next line starts a sibling, so keep the same indentation
            subelement.tail = newline + indent * (level + 1)
        else:  # last child: the next line is the parent's closing tag, so indent one level less
            subelement.tail = newline + indent * level
        prettyXml(subelement, indent, newline, level = level + 1) # recurse into the child element


xml_file = '/tmp/template.xml'
create_Xml(xml_file)
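
As a possible alternative to the hand-written prettyXml helper, Python 3.9 and later ship ET.indent, which indents a tree in place. The sketch below is not the author's method: it builds a trimmed version of the sample directly in memory (the helper name build_data is made up for illustration) instead of writing a stub file and parsing it back.

```python
import xml.etree.ElementTree as ET


def build_data():
    """Build a small <data> tree in memory (a trimmed version of the sample)."""
    root = ET.Element('data')
    ET.SubElement(root, 'disabled').text = 'false'
    country = ET.SubElement(root, 'country', attrib={'name': 'Singapore'})
    ET.SubElement(country, 'rank').text = '4'   # text must be a str, not an int
    ET.SubElement(country, 'neighbor', attrib={'name': 'Malaysia', 'direction': 'N'})
    return root


root = build_data()
ET.indent(root, space='    ')     # in-place pretty-printing, requires Python 3.9+
pretty = ET.tostring(root, encoding='unicode')
print(pretty)
```

On older Python versions ET.indent is not available, so a helper such as prettyXml is still needed there.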

2 Viewing and modifying the XML file

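List every neighbor element and change the 'direction' attribute from 'E' to 'East'. The sessions below assume that tree = ET.parse('/tmp/template.xml') and root = tree.getroot() have already been run: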

>>> for neighbor in root.iter('neighbor'):
...   if neighbor.attrib['direction'] == 'E':
...     neighbor.attrib['direction'] = 'East'
...   print(neighbor.attrib)
...
{'direction': 'East', 'name': 'Austria'}
{'direction': 'W', 'name': 'Switzerland'}
{'direction': 'N', 'name': 'Malaysia'}
{'direction': 'W', 'name': 'Costa Rica'}
{'direction': 'East', 'name': 'Colombia'}
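
The same edit can be written as a small self-contained script. This is only a sketch: the function name rename_direction and the inline XML are illustrative, and it uses get/set instead of indexing attrib so that a neighbor without a direction attribute does not raise KeyError.

```python
import xml.etree.ElementTree as ET

XML = """<data>
    <country name="Liechtenstein">
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
</data>"""

root = ET.fromstring(XML)


def rename_direction(root, old, new):
    """Set direction=new on every neighbor whose direction is old; return how many changed."""
    changed = 0
    for neighbor in root.iter('neighbor'):
        # get() returns None instead of raising KeyError when the attribute is missing
        if neighbor.get('direction') == old:
            neighbor.set('direction', new)
            changed += 1
    return changed


changed = rename_direction(root, 'E', 'East')
directions = [n.get('direction') for n in root.iter('neighbor')]
print(changed, directions)
```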

Get each country's name attribute and the text of its rank element:

>>> for country in root.findall('country'):
...   rank = country.find('rank').text
...   name = country.get('name')
...   print(name, rank)
...
Liechtenstein 1
Singapore 4
Panama 68
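
When only one country is of interest, the limited XPath support in ElementTree can select it directly instead of looping. The following is a rough sketch on a trimmed copy of the sample; the variable names and the non-existent country 'Atlantis' are made up for illustration.

```python
import xml.etree.ElementTree as ET

XML = """<data>
    <country name="Liechtenstein"><rank>1</rank><year>2008</year></country>
    <country name="Singapore"><rank>4</rank><year>2011</year></country>
    <country name="Panama"><rank>68</rank><year>2011</year></country>
</data>"""

root = ET.fromstring(XML)

# [@name='...'] filters on an attribute value
singapore = root.find("country[@name='Singapore']")
singapore_rank = singapore.findtext('rank')

# [year='2011'] filters on the text of a child element
names_2011 = [c.get('name') for c in root.findall("country[year='2011']")]

# findtext returns the default when nothing matches, so no None check is needed
missing = root.findtext("country[@name='Atlantis']/rank", default='n/a')

print(singapore_rank, names_2011, missing)
```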

Increment every rank by 1 and add an updated attribute to the rank element:

>>> for rank in root.iter('rank'):
...   new_rank = int(rank.text) + 1
...   rank.text = str(new_rank)
...   rank.set('updated', 'yes')
...
>>> tree.write('/tmp/output.xml')

The modified XML file:

<data>
    <disabled>false</disabled>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor direction="East" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor direction="N" name="Malaysia" />
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor direction="W" name="Costa Rica" />
        <neighbor direction="East" name="Colombia" />
    </country>
    <builders>
        <hudson.tasks.Shell>
            <command>echo "Hello world!"</command>
        </hudson.tasks.Shell>
    </builders>
</data>
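
Note that the written file no longer starts with an XML declaration: with its default arguments, tree.write omits it. If the declaration is wanted, passing encoding and xml_declaration is one way to get it back. The sketch below writes to an in-memory buffer so it can run anywhere; with a real file, a path such as '/tmp/output.xml' would be passed instead.

```python
import io
import xml.etree.ElementTree as ET

root = ET.fromstring('<data><country name="Panama"><rank>68</rank></country></data>')
tree = ET.ElementTree(root)

for rank in root.iter('rank'):
    rank.text = str(int(rank.text) + 1)   # text is a str, so convert both ways
    rank.set('updated', 'yes')

buffer = io.BytesIO()                     # stands in for a file opened in 'wb' mode
tree.write(buffer, encoding='utf-8', xml_declaration=True)
output = buffer.getvalue().decode('utf-8')
print(output)
```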

3 Deleting XML content

Delete every country whose rank is greater than 3, then delete the hudson.tasks.Shell element under builders and set the text of builders to deleted:

>>> for country in root.findall('country'):
...   rank = int(country.find('rank').text)
...   if rank > 3:
...     root.remove(country)
...
>>> for builders in root.findall('builders'):
...   for shell in builders.findall('hudson.tasks.Shell'):
...     builders.remove(shell)
...   builders.text = 'deleted'
...
>>> tree.write('/tmp/output.xml')

The XML file after deletion:

<data>
    <disabled>false</disabled>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor direction="East" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
    <builders>deleted</builders>
</data>
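
remove only deletes direct children of the element it is called on, and an Element keeps no reference to its parent. To delete nested elements, the parent has to be located first. A minimal sketch, assuming the goal is to drop every neighbor with a given direction (the function name remove_neighbors is illustrative):

```python
import xml.etree.ElementTree as ET

XML = """<data>
    <country name="Liechtenstein">
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Panama">
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>"""

root = ET.fromstring(XML)


def remove_neighbors(root, direction):
    """Remove every neighbor with the given direction; return how many were removed."""
    removed = 0
    # './/neighbor/..' selects each element that has at least one neighbor child
    for parent in root.findall('.//neighbor/..'):
        # findall returns a list, so removing from parent inside the loop is safe
        for neighbor in parent.findall('neighbor'):
            if neighbor.get('direction') == direction:
                parent.remove(neighbor)
                removed += 1
    return removed


removed = remove_neighbors(root, 'W')
remaining = [n.get('name') for n in root.iter('neighbor')]
print(removed, remaining)
```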

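When an element has several child elements, calling remove while iterating directly over that element does not delete all of them in one pass: each removal shifts the remaining children, so the loop skips some of them.
Consider the following file: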

<hudson.model.ListView>
  <name>test_all</name>
  <jobNames>
    <comparator class="hudson.util.CaseInsensitiveComparator" />
    <string>compile</string>
    <string>get_node_list</string>
    <string>job_data</string>
    <string>new_job</string>
    <string>pipeline0</string>
    <string>pipeline1</string>
    <string>template</string>
    <string>test_1</string>
    <string>test_2</string>
    <string>test_3</string>
  </jobNames>
  <jobFilters />
  <recurse>false</recurse>
</hudson.model.ListView>

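To delete every string element under jobNames, one workaround is to count the string elements (num) and repeat the removal pass num times. A simpler fix is to loop over a snapshot of the children, such as the list returned by findall, so that removing an element does not disturb the loop. The deletion code then looks like this:

root = tree.getroot()
for jobNames in root.findall('jobNames'):
    # findall returns a new list, so removing from jobNames inside the loop is safe
    for string in jobNames.findall('string'):
        jobNames.remove(string)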
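
The skipping behaviour can be reproduced on a shortened copy of the file. In this sketch the two function names are made up for illustration: the first removes children while iterating over the live element and leaves some string elements behind, the second iterates over a list() copy and removes them all.

```python
import xml.etree.ElementTree as ET

XML = """<hudson.model.ListView>
  <jobNames>
    <comparator class="hudson.util.CaseInsensitiveComparator" />
    <string>compile</string>
    <string>get_node_list</string>
    <string>job_data</string>
    <string>new_job</string>
  </jobNames>
</hudson.model.ListView>"""


def remove_while_iterating(xml_text):
    """Buggy variant: mutate jobNames while iterating over it."""
    root = ET.fromstring(xml_text)
    job_names = root.find('jobNames')
    for child in job_names:            # iterates the live child sequence
        if child.tag == 'string':
            job_names.remove(child)
    return len(job_names.findall('string'))


def remove_from_snapshot(xml_text):
    """Fixed variant: iterate over a copy of the children."""
    root = ET.fromstring(xml_text)
    job_names = root.find('jobNames')
    for child in list(job_names):      # list() copies the children first
        if child.tag == 'string':
            job_names.remove(child)
    return len(job_names.findall('string'))


left_live = remove_while_iterating(XML)
left_snapshot = remove_from_snapshot(XML)
print(left_live, left_snapshot)
```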

References
https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.SubElement
"Python使用ElementTree处理XML的美化" (Pretty-printing XML with ElementTree in Python)
