Python HTML/XML parsing modules: parsing XML with xml.dom.minidom

weixin_39842519

Published 2020-12-03 20:59:58

Tags: Python HTML/XML parsing modules

# -*- coding: cp936 -*-
# Python 2.7
# xiaodeng
# Python modules: parsing XML with xml.dom.minidom
# http://www.cnblogs.com/coser/archive/2012/01/10/2318298.html
# Python offers three ways to parse XML: SAX, DOM, and ElementTree

# import xml.dom
# Here we mainly use xml.dom.minidom to create an XML document and then parse it, to get familiar with the API.
# Commonly used methods:

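'''minidom.parse(filename)  # load and parse an XML file, returning a Document
doc.documentElement  # the root element of the document
node.getAttribute(AttributeName)  # value of an attribute on an element node
node.getElementsByTagName(TagName)  # list of descendant elements with that tag name
node.childNodes  # list of child nodes
node.childNodes[index].nodeValue  # value of a child node (the text, for a text node)
node.firstChild  # first child node
n.childNodes[0].data  # text of the first child, when it is a text node
doc = minidom.parse(filename)
doc.toxml('utf-8')  # the document serialized as XML, encoded as UTF-8'''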
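The calls listed above can be tried on a small in-memory document. The following is a minimal sketch, not the original author's code: the `SAMPLE` string and its `title` value are made up for illustration, and `parseString` is used instead of `parse` so that no file is needed. It is written with `print()` calls so that it runs under Python 3.

```python
from xml.dom import minidom

# Hypothetical sample document, used only for this illustration.
SAMPLE = '<collection><movie title="Demo"><type>Comedy</type></movie></collection>'

doc = minidom.parseString(SAMPLE)              # parse from a string instead of a file
root = doc.documentElement                     # root element: <collection>
movie = root.getElementsByTagName('movie')[0]  # first <movie> element
title = movie.getAttribute('title')            # attribute value
type_node = movie.firstChild                   # <type>, since SAMPLE has no whitespace
type_text = type_node.childNodes[0].data       # text inside <type>
same_text = type_node.childNodes[0].nodeValue  # nodeValue of a text node is the same text

print(root.nodeName, title, type_text)
print(doc.toxml())                             # serialized XML
```

Note that `firstChild` returns `<type>` here only because the sample contains no whitespace between tags; in an indented file the first child of `<movie>` would be a whitespace text node.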

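# test.xml
# The markup of this listing was stripped on this page. Only <collection>,
# <movie title="..."> and <type> are confirmed by the parsing code below.
# The other element names are assumed, and the title values are not recoverable.

'''
<collection>
<movie title="...">
   <type>War, Thriller</type>
   <format>DVD</format>
   <year>2003</year>
   <rating>PG</rating>
   <stars>10</stars>
   <description>Talk about a US-Japan war</description>
</movie>
<movie title="...">
   <type>Anime, Science Fiction</type>
   <format>DVD</format>
   <year>1989</year>
   <rating>R</rating>
   <stars>8</stars>
   <description>A scientific fiction</description>
</movie>
<movie title="...">
   <type>Anime, Action</type>
   <format>DVD</format>
   <episodes>4</episodes>
   <rating>PG</rating>
   <stars>10</stars>
   <description>Vash the Stampede!</description>
</movie>
<movie title="...">
   <type>Comedy</type>
   <format>VHS</format>
   <rating>PG</rating>
   <stars>2</stars>
   <description>Viewable boredom</description>
</movie>
</collection>
'''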
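The introduction mentions creating an XML document with xml.dom.minidom, but the script itself only parses one. A minimal sketch of the creation side might look like the following. It builds only the elements that the parsing code confirms (`collection`, `movie` with a `title` attribute, and `type`); the title value is a placeholder, and the document is kept in memory rather than written to test.xml.

```python
from xml.dom import minidom

# Build <collection><movie title="..."><type>...</type></movie></collection>
doc = minidom.Document()
root = doc.createElement('collection')
doc.appendChild(root)

movie = doc.createElement('movie')
movie.setAttribute('title', 'Example Title')   # placeholder title, not from the source
root.appendChild(movie)

type_node = doc.createElement('type')
type_node.appendChild(doc.createTextNode('War, Thriller'))
movie.appendChild(type_node)

# Serialize with indentation, then parse the result again as a round trip.
xml_text = doc.toprettyxml(indent='  ')
print(xml_text)

reparsed = minidom.parseString(xml_text)
first_type = reparsed.getElementsByTagName('type')[0]
print(first_type.childNodes[0].data.strip())
```

To produce a file, the string returned by `toprettyxml` can be written out with an ordinary `open(...).write(...)`.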

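# Parsing example

from xml.dom import minidom

doc = minidom.parse('test.xml')  # parse a file, e.g. parse("foo.xml")

# parseString("") parses XML from a string instead

# Get the root element

root = doc.documentElement  # note: an attribute, so no parentheses

# root is the document's top-level element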

print('--' * 25)
print(root.nodeName)      # node name: collection
print(root.nodeValue)     # node value: None for an element node
print(root.nodeType)      # node type: 1
print(root.ELEMENT_NODE)  # 1
print('--' * 25)
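The values printed above (`None` for `nodeValue`, `1` for `nodeType`) follow from the root being an element node. As a rough illustration, the sketch below contrasts an element node with the text node inside it; the one-line sample document is hypothetical.

```python
from xml.dom import minidom

# Hypothetical one-line document, used only for this illustration.
doc = minidom.parseString('<collection><type>Comedy</type></collection>')
root = doc.documentElement
text_node = root.getElementsByTagName('type')[0].firstChild

# An element node has a name but no value of its own.
print(root.nodeType == root.ELEMENT_NODE, root.nodeValue)

# The text lives in a separate child node of type TEXT_NODE.
print(text_node.nodeType == text_node.TEXT_NODE, text_node.nodeValue)
```

This is why the script reads `childNodes[0].data` on the `<type>` element rather than `nodeValue` on the element itself.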

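# Get all movies in the collection

nodes = root.getElementsByTagName('movie')  # list of <movie> element nodes

# Print the details of each movie

for n in nodes:
    # print(n)
    # Get the value of the movie's title attribute
    # print(n.getAttribute('title'))
    # Get the <type> element of this movie
    type_node = n.getElementsByTagName('type')[0]
    print("Type:%s" % type_node.childNodes[0].data)  # text content of <type>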
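The loop above indexes `[0]` directly, which raises `IndexError` when a movie lacks the requested element (the last movie in the listing has one value fewer than the first two). One possible way to handle that is a small helper that falls back to a default. In the sketch below, `get_text`, the `SAMPLE` string, its titles, and the `year` element name are all assumptions made for illustration, not part of the original script.

```python
from xml.dom import minidom

# Hypothetical stand-in for test.xml; titles and the <year> tag are assumed.
SAMPLE = """<collection>
  <movie title="First">
    <type>War, Thriller</type>
    <year>2003</year>
  </movie>
  <movie title="Second">
    <type>Comedy</type>
  </movie>
</collection>"""


def get_text(parent, tag, default=''):
    """Return the text of the first <tag> under parent, or default if absent or empty."""
    matches = parent.getElementsByTagName(tag)
    if not matches or matches[0].firstChild is None:
        return default
    return matches[0].firstChild.data


root = minidom.parseString(SAMPLE).documentElement

rows = []
for movie in root.getElementsByTagName('movie'):
    rows.append((movie.getAttribute('title'),
                 get_text(movie, 'type'),
                 get_text(movie, 'year', 'n/a')))

for title, kind, year in rows:
    print('%s | %s | %s' % (title, kind, year))
```

The helper assumes the element contains plain text only; an element with nested children would need a different traversal.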
