First, fetch the page with requests and decode the HTML response as UTF-8:
import requests
from lxml import etree
data = requests.get(url, cookies={'JSESSIONID': '301D98D176495CA77A115D520E63302B'}).content.decode('utf-8')
dom_tree = etree.HTML(data)
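The explicit decode is needed because `.content` holds raw bytes rather than text. As a minimal sketch of that step, the snippet below decodes a hard-coded byte string standing in for a real response; the helper name `decode_page` and the fallback to lossy decoding are illustrative assumptions, not part of the original code.

```python
def decode_page(raw, encoding="utf-8"):
    """Decode response bytes, replacing undecodable bytes instead of raising."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # Fall back to lossy decoding so parsing can still proceed.
        return raw.decode(encoding, errors="replace")


# Stands in for response.content (hypothetical sample data).
raw = "<h2>标题</h2>".encode("utf-8")
print(decode_page(raw))

# An invalid UTF-8 byte is replaced with U+FFFD rather than raising.
print(decode_page(b"<h2>\xff</h2>"))
```

If the page declares a different charset, the same helper could be called with that encoding name instead.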
Once the text has been passed through etree's HTML function, elements can be selected with xpath:
links = dom_tree.xpath("/html/body/div[2]/div[2]/ul/li[2]/h2")
xpath returns a list of matching elements, so use a for loop to read the text of each one:
for link in links:
    print(link.text)
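
The parse-and-select steps can be tried offline as well. The following is a small sketch, assuming a made-up HTML string (`SAMPLE_HTML`) in place of the downloaded page and a hypothetical helper `extract_titles`; it shows that xpath yields a list, that an absolute path with indexes matches a single element, and that a path with no match yields an empty list.

```python
from lxml import etree

# Hypothetical sample markup standing in for the downloaded page.
SAMPLE_HTML = """
<html><body>
  <div>header</div>
  <div>
    <div>sidebar</div>
    <div>
      <ul>
        <li><h2>First</h2></li>
        <li><h2>Second</h2></li>
      </ul>
    </div>
  </div>
</body></html>
"""


def extract_titles(html, path):
    """Return the stripped text of every element matched by the XPath."""
    tree = etree.HTML(html)
    matches = tree.xpath(path)  # a list when the path selects elements
    return [el.text.strip() for el in matches if el.text]


# Absolute path with indexes, as in the note: matches only the second <li>.
absolute = extract_titles(SAMPLE_HTML, "/html/body/div[2]/div[2]/ul/li[2]/h2")
# Relative path: matches every <h2> under a list item.
relative = extract_titles(SAMPLE_HTML, "//ul/li/h2")
# No match: an empty list, so the for loop simply does nothing.
missing = extract_titles(SAMPLE_HTML, "//table//h2")

print(absolute)
print(relative)
print(missing)
```

An absolute path like the first one depends on the exact page layout, so a shorter relative path may hold up better if the page structure changes.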