Python XML Parsing
References:
Runoob tutorial: https://www.runoob.com/python/python-xml.html
Official documentation: https://docs.python.org/3.6/library/markup.html
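Python supports XML through the xml package (Lib/xml).
Python offers two main models for processing XML. The xml.sax and xml.dom packages define the interfaces for them:
- Event-driven model: SAX (Simple API for XML) fires a series of events while the XML is being parsed and calls user-defined callback functions, which do the actual processing.
- Document object model: DOM (Document Object Model) parses the XML data into a tree held in memory; the XML is then manipulated by operating on that tree.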
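To make the difference between the two models concrete, here is a minimal sketch that extracts the same attribute values once with SAX callbacks and once from a DOM tree. The sample document, the OptionCollector class, and the two helper functions are hypothetical names chosen for illustration only.

```python
import xml.dom.minidom
import xml.sax

# Hypothetical sample document, used only for illustration.
SAMPLE = '<options><option id="a"/><option id="b"/></options>'


class OptionCollector(xml.sax.ContentHandler):
    """SAX: the parser calls startElement() for every opening tag it meets."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def startElement(self, name, attrs):
        if name == 'option':
            self.ids.append(attrs.getValue('id'))


def ids_with_sax(text):
    """Collect option ids through callbacks; no tree is built."""
    handler = OptionCollector()
    xml.sax.parseString(text.encode('utf-8'), handler)
    return handler.ids


def ids_with_dom(text):
    """Build the whole tree in memory first, then query it."""
    dom = xml.dom.minidom.parseString(text)
    return [e.getAttribute('id') for e in dom.getElementsByTagName('option')]
```

Both helpers return the ids in document order. The SAX version never holds the whole document, which tends to matter for large inputs, while the DOM version keeps a tree that can be modified and written back out.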
The XML handling submodules are:
- xml.etree.ElementTree: the ElementTree API, a simple and lightweight XML processor
- xml.dom: the DOM API definition
- xml.dom.minidom: a minimal DOM implementation
- xml.dom.pulldom: support for building partial DOM trees
- xml.sax: SAX2 base classes and convenience functions
- xml.parsers.expat: the Expat parser binding
xml.etree.ElementTree
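A minimal sketch of the ElementTree API, assuming a small in-memory document. The sample XML and the helper names option_values and set_option are hypothetical and exist only to illustrate reading and updating attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
SAMPLE = '<options><option id="a" value="1"/><option id="b" value="2"/></options>'


def option_values(text):
    """Map each option's id attribute to its value attribute."""
    root = ET.fromstring(text)
    return {opt.get('id'): opt.get('value') for opt in root.iter('option')}


def set_option(text, option_id, value):
    """Return the document as a string with one option's value replaced."""
    root = ET.fromstring(text)
    for opt in root.iter('option'):
        if opt.get('id') == option_id:
            opt.set('value', value)
    return ET.tostring(root, encoding='unicode')
```

For example, option_values(set_option(SAMPLE, 'b', '9')) maps 'b' to '9' and leaves 'a' unchanged.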
xml.dom
Manual: https://docs.python.org/3.6/library/xml.dom.html
Interface | Section | Purpose |
---|---|---|
DOMImplementation | DOMImplementation Objects | Interface to the underlying implementation. |
Node | Node Objects | Base interface for most objects in a document. |
NodeList | NodeList Objects | Interface for a sequence of nodes. |
DocumentType | DocumentType Objects | Information about the declarations needed to process a document. |
Document | Document Objects | Object which represents an entire document. |
Element | Element Objects | Element nodes in the document hierarchy. |
Attr | Attr Objects | Attribute value nodes on element nodes. |
Comment | Comment Objects | Representation of comments in the source document. |
Text | Text and CDATASection Objects | Nodes containing textual content from the document. |
ProcessingInstruction | ProcessingInstruction Objects | Processing instruction representation. |
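As a rough illustration of how several of these interfaces show up in practice, the sketch below parses a tiny document with xml.dom.minidom and reports the node type of each object. The sample XML and the describe function are hypothetical; only the nodeType constants come from xml.dom.Node.

```python
import xml.dom
import xml.dom.minidom

# Hypothetical sample document, used only for illustration.
SAMPLE = '<root><!-- note --><item name="x">hello</item></root>'


def describe(text):
    """Return the nodeType of one object per DOM interface."""
    doc = xml.dom.minidom.parseString(text)   # Document
    root = doc.documentElement                # Element
    comment, item = root.childNodes           # NodeList with two entries
    attr = item.getAttributeNode('name')      # Attr
    text_node = item.firstChild               # Text
    return {
        'Document': doc.nodeType,
        'Element': item.nodeType,
        'Attr': attr.nodeType,
        'Comment': comment.nodeType,
        'Text': text_node.nodeType,
    }
```

Each value can be compared against the constants on xml.dom.Node, such as xml.dom.Node.ELEMENT_NODE or xml.dom.Node.TEXT_NODE.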
xml.dom.minidom
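
    import subprocess
    import xml.dom.minidom

    cproject = r'.cproject'

    # Assumption: the original does not show where git_commit and git_branch
    # come from; here they are read from the current git checkout.
    git_commit = subprocess.check_output(['git', 'rev-parse', 'HEAD']).decode().strip()
    git_branch = subprocess.check_output(
        ['git', 'rev-parse', '--abbrev-ref', 'HEAD']).decode().strip()

    dom = xml.dom.minidom.parse(cproject)
    options = dom.getElementsByTagName('option')
    for opt in options:
        # Find the option that holds the preprocessor symbol definitions.
        if opt.getAttribute('superClass') == 'gnu.c.compiler.option.preprocessor.def.symbols':
            # The leading whitespace text node and the first entry serve as templates.
            nod_text = opt.childNodes[0]
            nod_elem = opt.childNodes[1]
            last_child = opt.lastChild

            # Clone the templates and insert the GIT_COMMIT definition.
            nod_text = nod_text.cloneNode(False)
            nod_elem = nod_elem.cloneNode(False)
            nod_elem.setAttribute('value', 'GIT_COMMIT="%s"' % git_commit)
            opt.insertBefore(nod_text, last_child)
            opt.insertBefore(nod_elem, last_child)

            # Clone again and insert the GIT_BRANCH definition.
            nod_text = nod_text.cloneNode(False)
            nod_elem = nod_elem.cloneNode(False)
            nod_elem.setAttribute('value', 'GIT_BRANCH="%s"' % git_branch)
            opt.insertBefore(nod_text, last_child)
            opt.insertBefore(nod_elem, last_child)
            break

    # Write the modified document back to the project file.
    with open(cproject, 'w', encoding='utf-8') as f:
        dom.writexml(f, encoding='utf-8')
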
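The code modifies an Eclipse project file (.cproject): it adds the macro definitions GIT_COMMIT and GIT_BRANCH so that the git commit id and branch name of the source being built are compiled into the release.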
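The same idea can be tried without a real project file. The sketch below works on an in-memory string and wraps the cloning step in a helper. The sample fragment, the listOptionValue element name, and the functions add_symbols and symbol_values are assumptions made for illustration; a real .cproject file is much larger and may be laid out differently.

```python
import xml.dom.minidom

SUPER_CLASS = 'gnu.c.compiler.option.preprocessor.def.symbols'

# Hypothetical, heavily trimmed stand-in for a .cproject fragment.
SAMPLE = '''<option superClass="gnu.c.compiler.option.preprocessor.def.symbols">
  <listOptionValue builtIn="false" value="DEBUG"/>
</option>'''


def add_symbols(xml_text, symbols):
    """Append one NAME="value" entry per item in symbols; return the DOM."""
    dom = xml.dom.minidom.parseString(xml_text)
    for opt in dom.getElementsByTagName('option'):
        if opt.getAttribute('superClass') != SUPER_CLASS:
            continue
        # Assumes the option already has at least one element child
        # that can serve as a template for the new entries.
        template = next(n for n in opt.childNodes
                        if n.nodeType == n.ELEMENT_NODE)
        for name, value in symbols.items():
            entry = template.cloneNode(False)  # shallow clone keeps attributes
            entry.setAttribute('value', '%s="%s"' % (name, value))
            opt.appendChild(entry)
        break
    return dom


def symbol_values(dom):
    """List the value attribute of every entry, in document order."""
    return [e.getAttribute('value')
            for e in dom.getElementsByTagName('listOptionValue')]
```

Picking the template by node type rather than by a fixed index such as childNodes[1] avoids depending on the exact whitespace in the file, at the cost of not reproducing the original indentation between entries.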