Given an XML file of the following form (the root element name is only a placeholder):
<某某某>
<Doc ID="1">
<Sentence ID="1">111——111</Sentence>
</Doc>
<Doc ID="2">
<Sentence ID="1">222——111</Sentence>
<Sentence ID="2">222——222</Sentence>
</Doc>
</某某某>
Goal: output every sentence, one per line.
Code:
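import xml.dom.minidom

name = 'XML_file_name'  # placeholder: base name of the XML file, without the extension
DOMTree = xml.dom.minidom.parse(name + ".xml")
Data = DOMTree.documentElement  # the root element
docs = Data.getElementsByTagName("Doc")
# Write each sentence to <name>.txt; the with-block closes the file automatically
with open(name + ".txt", "w", encoding='utf-8') as f:
    for doc in docs:
        # if doc.hasAttribute("ID"):
        #     print("ID: %s" % doc.getAttribute("ID"))
        sens = doc.getElementsByTagName('Sentence')
        for sen in sens:
            s = sen.childNodes[0].data  # text of the first child node; assumes the element is not empty
            print(s)
            f.write(s)
            f.write('\n')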
Done!
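To try the same approach without any files on disk, here is a minimal sketch that parses the sample from a string. The names `SAMPLE`, `extract_sentences`, and the `<Root>` element are hypothetical and not part of the original code, and the empty `<Sentence ID="3">` was added only to show one way of handling an element with no text, a case where `childNodes[0]` would raise an `IndexError`.

```python
import xml.dom.minidom
from xml.dom import Node

# Sample data shaped like the file above. <Root> is a hypothetical root name,
# and the empty <Sentence ID="3"> is an added edge case for illustration.
SAMPLE = """<Root>
<Doc ID="1">
<Sentence ID="1">111——111</Sentence>
</Doc>
<Doc ID="2">
<Sentence ID="1">222——111</Sentence>
<Sentence ID="2">222——222</Sentence>
<Sentence ID="3"></Sentence>
</Doc>
</Root>"""


def extract_sentences(xml_text):
    """Return the text of every non-empty <Sentence>, in document order."""
    root = xml.dom.minidom.parseString(xml_text).documentElement
    sentences = []
    for doc in root.getElementsByTagName("Doc"):
        for sen in doc.getElementsByTagName("Sentence"):
            # Join all text and CDATA children; an empty element has none.
            text = "".join(
                node.data
                for node in sen.childNodes
                if node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
            )
            if text:
                sentences.append(text)
    return sentences


if __name__ == "__main__":
    for s in extract_sentences(SAMPLE):
        print(s)
```

Joining all text children instead of reading only `childNodes[0]` is a design choice of this sketch: it skips empty elements and also copes with sentences that the parser happens to split into several text nodes.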