Reading XML with DOM
1. Import the package
import xml.dom.minidom
2. Define an XML file
xmlTest.xml
<?xml version="1.0" encoding="utf-8"?>
<emps>
<emp id="0" sex="male">
<empno value="0000">7369</empno>
<ename>SMITH</ename>
<job>CLERK</job>
</emp>
<emp id="1" sex="female">
<empno value="1111">7499</empno>
<ename>ALLEN</ename>
<job>SALESMAN</job>
</emp>
<emp id="2">
<empno value="2222">7566</empno>
<ename>JONES</ename>
<job>MANAGER</job>
</emp>
<emp id="3">
<empno>75662</empno>
<ename>JONES2</ename>
<job>MANAGER2</job>
<age>33</age>
</emp>
</emps>
3. Read the XML file with DOM
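The typical drawbacks of DOM are that it is relatively slow and uses more memory: it reads the entire XML tree into memory and builds an object for every node in the tree, so it is not recommended for large XML files. The advantage is that you do not need to track state yourself, because every node knows which node is its parent and which nodes are its children. Even so, DOM is somewhat cumbersome to use.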
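To illustrate the point that every node knows its parent and children, here is a minimal sketch. It assumes Python 3 and parses a shortened inline copy of one record (the `SAMPLE` string is made up for this example) so that it runs without xmlTest.xml.

```python
import xml.dom.minidom

# Shortened inline copy of one record; written without whitespace between
# tags so that childNodes contains only element nodes.
SAMPLE = ('<emps><emp id="0" sex="male">'
          '<empno value="0000">7369</empno><ename>SMITH</ename>'
          '</emp></emps>')

dom = xml.dom.minidom.parseString(SAMPLE)
ename = dom.getElementsByTagName("ename")[0]

# Walk upward: each node keeps a reference to its parent.
parent = ename.parentNode
grandparent = parent.parentNode

# Walk downward: childNodes lists the direct children in document order.
child_names = [child.nodeName for child in parent.childNodes]

print(parent.nodeName, grandparent.nodeName, child_names)
```

No separate bookkeeping is needed to find out where `ename` sits in the tree; the price is that all of these node objects stay in memory at once.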
# xmlTest.py (Python 3)
import xml.dom.minidom

filename = "xmlTest.xml"

### 1. DOM (Document Object Model)
print("1.DOM")
dom = xml.dom.minidom.parse(filename)        # parse the whole file into an in-memory tree
root = dom.documentElement                   # the root <emps> element
emplist = root.getElementsByTagName("emp")   # all <emp> elements under the root
num = 1
for node in emplist:
    print("*** emp node %d ***" % num)
    print(node, node.toxml(), node.nodeName, node.getAttribute("id"))
    print("*** empno node ***")
    empnolist = node.getElementsByTagName("empno")
    # firstChild is the text node inside the element; .data is its string content
    print(empnolist[0].toxml(), empnolist[0].nodeName, empnolist[0].firstChild.data)
    print("*** ename node ***")
    enamelist = node.getElementsByTagName("ename")
    print(enamelist[0].toxml(), enamelist[0].nodeName, enamelist[0].firstChild.data)
    print("*** job node ***")
    joblist = node.getElementsByTagName("job")
    print(joblist[0].toxml(), joblist[0].nodeName, joblist[0].firstChild.data)
    print()
    num = num + 1
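The sample file contains records where the `sex` attribute is missing and one record with an extra `<age>` element, which the script above does not read. A possible way to handle such optional parts is sketched below, assuming Python 3; `child_text` is a hypothetical helper introduced only for this example, and `SAMPLE` is a shortened inline copy of two records.

```python
import xml.dom.minidom

# Shortened inline copy of the first and last records from xmlTest.xml.
SAMPLE = """<emps>
<emp id="0" sex="male"><empno value="0000">7369</empno><ename>SMITH</ename></emp>
<emp id="3"><empno>75662</empno><ename>JONES2</ename><age>33</age></emp>
</emps>"""


def child_text(node, tag):
    """Hypothetical helper: text of the first <tag> under node, or None if absent."""
    matches = node.getElementsByTagName(tag)
    if not matches or matches[0].firstChild is None:
        return None
    return matches[0].firstChild.data


dom = xml.dom.minidom.parseString(SAMPLE)
records = []
for emp in dom.documentElement.getElementsByTagName("emp"):
    # getAttribute returns "" for a missing attribute, so check hasAttribute
    # first when "missing" and "empty" need to be told apart.
    sex = emp.getAttribute("sex") if emp.hasAttribute("sex") else None
    records.append((emp.getAttribute("id"), sex,
                    child_text(emp, "ename"), child_text(emp, "age")))

for record in records:
    print(record)
```

Indexing `[0]` directly, as the script does, raises an `IndexError` when an element is absent, so a guard of this kind is useful whenever the records are not uniform.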
4. Output
D:\pyProjects>python xmlTest.py
1.DOM
*** emp node 1 ***
<DOM Element: emp at 0x2481788> <emp id="0" sex="male">
<empno value="0000">7369</empno>
<ename>SMITH</ename>
<job>CLERK</job>
</emp> emp 0
*** empno node ***
<empno value="0000">7369</empno> empno 7369
*** ename node ***
<ename>SMITH</ename> ename SMITH
*** job node ***
<job>CLERK</job> job CLERK
*** emp node 2 ***
<DOM Element: emp at 0x2481be8> <emp id="1" sex="female">
<empno value="1111">7499</empno>
<ename>ALLEN</ename>
<job>SALESMAN</job>
</emp> emp 1
*** empno node ***
<empno value="1111">7499</empno> empno 7499
*** ename node ***
<ename>ALLEN</ename> ename ALLEN
*** job node ***
<job>SALESMAN</job> job SALESMAN
*** emp node 3 ***
<DOM Element: emp at 0x2489080> <emp id="2">
<empno value="2222">7566</empno>
<ename>JONES</ename>
<job>MANAGER</job>
</emp> emp 2
*** empno node ***
<empno value="2222">7566</empno> empno 7566
*** ename node ***
<ename>JONES</ename> ename JONES
*** job node ***
<job>MANAGER</job> job MANAGER
*** emp node 4 ***
<DOM Element: emp at 0x2489440> <emp id="3">
<empno>75662</empno>
<ename>JONES2</ename>
<job>MANAGER2</job>
<age>33</age>
</emp> emp 3
*** empno node ***
<empno>75662</empno> empno 75662
*** ename node ***
<ename>JONES2</ename> ename JONES2
*** job node ***
<job>MANAGER2</job> job MANAGER2
D:\pyProjects>
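In the output above, `toxml()` prints each child of `<emp>` on its own line, which means the line breaks between the tags are kept in the tree as text nodes. This is one reason DOM can feel cumbersome: iterating `childNodes` directly returns those whitespace nodes too. A minimal sketch of filtering them out, assuming Python 3 and an inline copy of the first record:

```python
import xml.dom.minidom
from xml.dom import Node

# Inline copy of the first record, keeping the line breaks between tags.
SAMPLE = """<emp id="0" sex="male">
<empno value="0000">7369</empno>
<ename>SMITH</ename>
<job>CLERK</job>
</emp>"""

emp = xml.dom.minidom.parseString(SAMPLE).documentElement

# childNodes includes the newline-only text nodes between the elements.
all_children = [child.nodeName for child in emp.childNodes]

# Keep only element nodes to get the fields themselves.
fields = {
    child.nodeName: child.firstChild.data
    for child in emp.childNodes
    if child.nodeType == Node.ELEMENT_NODE
}

print(all_children)
print(fields)
```

`getElementsByTagName`, as used in the script, avoids this issue because it only ever returns elements.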