XPath for the dd nodes that hold the Maoyan movie information:
//dl[@class="board-wrapper"]/dd
XPath for the movie title:
//dl[@class="board-wrapper"]/dd//p[@class="name"]/a/text()
XPath for the lead actors:
//dl[@class="board-wrapper"]/dd//p[@class="star"]/text()
XPath for the release time:
//dl[@class="board-wrapper"]/dd//p[@class="releasetime"]/text()
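A minimal sketch of how these expressions might be combined with lxml: select each dd node first, then run relative XPaths (starting with `.`) inside it so that title, actors and release time stay grouped per movie. The sample markup, the `parse_board` helper and the dictionary keys are assumptions made up for illustration; the real page structure may differ.

```python
from lxml import etree

# Hypothetical markup that imitates the board structure described above.
SAMPLE_HTML = """
<dl class="board-wrapper">
  <dd>
    <p class="name"><a href="/films/1">Movie A</a></p>
    <p class="star"> Starring: Actor 1, Actor 2 </p>
    <p class="releasetime">Release time: 1993-01-01</p>
  </dd>
  <dd>
    <p class="name"><a href="/films/2">Movie B</a></p>
    <p class="star"> Starring: Actor 3 </p>
    <p class="releasetime">Release time: 1994-09-10</p>
  </dd>
</dl>
"""


def first_text(node, expr):
    """Return the first stripped text match of expr under node, or ''."""
    matches = node.xpath(expr)
    return matches[0].strip() if matches else ""


def parse_board(html):
    """Collect one dict per dd node (illustrative helper, not a library API)."""
    tree = etree.HTML(html)
    movies = []
    for dd in tree.xpath('//dl[@class="board-wrapper"]/dd'):
        movies.append({
            # The leading '.' keeps each query relative to the current dd.
            "name": first_text(dd, './/p[@class="name"]/a/text()'),
            "star": first_text(dd, './/p[@class="star"]/text()'),
            "time": first_text(dd, './/p[@class="releasetime"]/text()'),
        })
    return movies


movies = parse_board(SAMPLE_HTML)
print(movies)
```

Querying per dd node avoids three parallel lists that fall out of step when one movie lacks a field.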
lxml module:
from lxml import etree
# Get the text content of every a node
parse_html = etree.HTML(html)  # html: the page source as a string
r_list = parse_html.xpath('//a/text()')
print(r_list)
# Get the href attribute value of every a node
parse_html = etree.HTML(html)
r_list = parse_html.xpath('//a/@href')
print(r_list)
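A small self-contained sketch of the same two queries, assuming a short made-up HTML string in place of a downloaded page. Instead of two separate result lists, it visits each a node once and reads its text and href together; the variable names are illustrative only.

```python
from lxml import etree

# Hypothetical input; in practice html would be the downloaded page source.
html = '<div><a href="/one">First</a><a href="/two">Second</a></div>'

parse_html = etree.HTML(html)

links = []
for a in parse_html.xpath('//a'):
    text = a.xpath('string(.)').strip()  # all text inside this a node
    href = a.get('href')                 # None if the attribute is missing
    links.append((text, href))

print(links)
```

Pairing text and href per node keeps them aligned even when some a nodes have no text or no href.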