xpath

最新推荐文章于 2023-01-26 17:02:54 发布

PLA_hui

最新推荐文章于 2023-01-26 17:02:54 发布

阅读量126

点赞数

分类专栏： python

本文链接：https://blog.csdn.net/qq_17481769/article/details/79793047

版权

python 专栏收录该内容

2 篇文章 0 订阅

订阅专栏

定义：1. xml路径语言，拥有在数据结构树中查找节点的能力

2. 被开发者当做小型查询语言使用

3. xpath通过元素和属性进行导航

支持html

比正则表达式简单

比正则表达式强大

scrapy

xpath使用路径表达式在xml文档中选取节点

路径表达式：/ 从根节点选取

// 从匹配选择的当前节点选择文档中的节点，而不考虑他们的位置

@ 选取属性

使用通配符：* 匹配任何元素节点

@*匹配任何属性节点

选取多个路径 | ：/bookstore/book/title | /bookstore/book/author

xpath的使用：安装 lxml: pip install lxml

 
  from 
   lxml  
  import 
   etree 
 
 
  html  
  = 
   etree 
  . 
  parse 
  ( 
  ' 
  hello.html 
  ' 
  ) 
 
 
  print 
  ( 
  type 
  ( 
  html 
  )) 
 
 
  # result=html.xpath('//div/ul|//div/ul/li') 
 
 
  # result=html.xpath('//li//a[@href="pla.html"]') 
 
 
  #result=html.xpath('//li//@href') #获取li下的所有href 
 
 
  #result=html.xpath('//li//@class') #获取li下的所有class 
 
 
  result 
  = 
  html 
  . 
  xpath 
  ( 
  ' 
  //* 
  ' 
  ) 
 
 
  print 
  ( 
  result 
  ) 
 
 
  print 
  ( 
  len 
  ( 
  result 
  )) 
 
 
  print 
  ( 
  type 
  ( 
  result 
  )) 
 
 
  # print(result[0].text)