Removing unwanted information from XML files

weixin_30794491

Article tags: web.xml

Original link: http://www.cnblogs.com/hope100/p/4368256.html


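An XML file can end up containing information you do not want, introduced unavoidably when the data was collected.

The most common case is the version declaration <?xml version='1.0' encoding='UTF-8' standalone='yes' ?>. How can it be removed? To solve this I tried any number of Google keywords, including "remove question mark xml" and the like. The approach I ended up with is a compromise: add a line break when collecting the data, then process the file as described below.

This page explains how to remove the lines in a file that contain a given keyword (http://segmentfault.com/q/1010000000124564):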

    import shutil

    # Copy every line that does not contain the keyword into a new file,
    # then move the new file over the original.
    with open('/path/to/file', 'r') as f:
        with open('/path/to/file.new', 'w') as g:
            for line in f:
                if '/local/server' not in line:
                    g.write(line)
    shutil.move('/path/to/file.new', '/path/to/file')
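Applied to the XML declaration, the same line filter might look like the sketch below. The function name `strip_xml_declaration` and the sample string are my own, for illustration only. It assumes the declaration sits on a line by itself, which is what the extra line break added during collection is meant to guarantee.

```python
def strip_xml_declaration(text):
    """Return text without the lines that begin with an XML declaration."""
    kept = [
        line for line in text.splitlines(keepends=True)
        # The trailing space keeps other processing instructions,
        # such as <?xml-stylesheet ...?>, from being dropped as well.
        if not line.lstrip().startswith('<?xml ')
    ]
    return ''.join(kept)


sample = (
    "<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>\n"
    "<doc>\n"
    "  <item>1</item>\n"
    "</doc>\n"
)
cleaned = strip_xml_declaration(sample)
print(cleaned)
```

If the declaration shares a line with real content, this filter would drop that content too, so it only fits data where the line break is present.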


I also found a new way to read XML files (http://pycoders-weekly-chinese.readthedocs.org/en/latest/issue6/processing-xml-in-python-with-element-tree.html):

 
    >>> # xml.etree.cElementTree was removed in Python 3.9; ElementTree replaces it
    >>> import xml.etree.ElementTree as ET
    >>> tree = ET.ElementTree(file='doc1.xml')

Then get the root element:

 
    >>> tree.getroot()
    <Element 'doc' at 0x11eb780>

As expected, root is an Element object. Let's take a look:

 
    >>> root = tree.getroot()
    >>> root.tag, root.attrib
    ('doc', {})
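Once the document is parsed, another possible way to get rid of the declaration is to serialize the root element again. This is only a sketch and not the method I actually used above: the sample bytes are made up, and it relies on `ET.tostring` with `encoding='unicode'` returning a string that carries no XML declaration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input that starts with the unwanted declaration.
source = (
    b"<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>\n"
    b"<doc><item>1</item></doc>"
)

root = ET.fromstring(source)

# encoding='unicode' yields a str and leaves out the declaration.
body = ET.tostring(root, encoding='unicode')
print(body)
```

Unlike the line filter, this does not depend on where the line breaks are, but it does require the input to be well-formed XML.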
  

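Note that in tree = ET.ElementTree(file='doc1.xml') the file keyword inside the parentheses is required. The first positional parameter of ElementTree is element, not file, so without the keyword the file is never parsed and root does not hold the root node. This problem cost me a lot of time.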
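The difference can be checked with a small sketch. The temporary file and its contents are invented for the demonstration. What the positional form does appears to depend on the Python version: as far as I know, older releases store the path string as the "root", while newer ones reject it with a TypeError, so the code allows for both.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file, standing in for doc1.xml.
    path = os.path.join(tmp, 'doc1.xml')
    with open(path, 'w', encoding='utf-8') as fh:
        fh.write('<doc><item>1</item></doc>')

    # Keyword form: the file is parsed and the root is a real Element.
    root_kw = ET.ElementTree(file=path).getroot()

    # ET.parse is an equivalent shortcut that takes the path positionally.
    root_parse = ET.parse(path).getroot()

    # Positional form: the path is bound to `element`, not `file`.
    try:
        wrong = ET.ElementTree(path).getroot()
    except TypeError:
        wrong = None
    positional_gave_element = ET.iselement(wrong)

print(root_kw.tag, root_parse.tag, positional_gave_element)
```

Using `ET.parse(path)` avoids the pitfall altogether, since its first parameter is the source file.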

Reposted from: https://www.cnblogs.com/hope100/p/4368256.html
