XPath Usage Explained

划船的使者

Tags: xpath

Article link: https://blog.csdn.net/weixin_42185136/article/details/112003877

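# Select span nodes that have no class attribute
result = article.xpath("//span[not(@class)]")
# Select span nodes that have neither a class nor an id attribute
result = article.xpath("//span[not(@class) and not(@id)]")
# Select span nodes whose class attribute does not contain the substring 'expire' (spans with no class attribute also match)
result = article.xpath("//span[not(contains(@class,'expire'))]")
# Select span nodes below the current node whose class attribute contains the substring 'expire'
result = article.xpath(".//span[contains(@class,'expire')]")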
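
The attribute filters above can be tried on a small document. The sketch below assumes the examples use lxml (the `article.xpath(...)` calls suggest it) and that `article` comes from `etree.HTML`. The sample markup and the `texts` helper are hypothetical and not from the original article.

```python
from lxml import etree

# Hypothetical sample markup, used only for illustration.
SAMPLE = """
<div>
  <span>plain</span>
  <span id="s2">has id only</span>
  <span class="expire">expired</span>
  <span class="price expired-soon">substring match</span>
  <span class="price">current</span>
</div>
"""

article = etree.HTML(SAMPLE)


def texts(expr):
    """Return the text of every element matched by expr."""
    return [el.text for el in article.xpath(expr)]


no_class = texts("//span[not(@class)]")
no_class_no_id = texts("//span[not(@class) and not(@id)]")
not_expire = texts("//span[not(contains(@class, 'expire'))]")
has_expire = texts("//span[contains(@class, 'expire')]")

print(no_class)        # spans without a class attribute
print(no_class_no_id)  # spans with neither class nor id
print(not_expire)      # includes spans that have no class at all
print(has_expire)      # substring match, so 'expired-soon' also counts
```

Because `contains()` is a substring test, `class="price expired-soon"` matches `'expire'` as well. Matching a whole class token needs a stricter expression.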
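# Find input elements whose name attribute starts with 'name1'
result = article.xpath("//input[starts-with(@name,'name1')]")
# Find input elements whose name attribute contains 'na'
result = article.xpath("//input[contains(@name,'na')]")
# Find a elements whose text contains '百度搜索'; in XPath 1.0, contains(text(), ...) tests only the first text node child
result = article.xpath("//a[contains(text(),'百度搜索')]")
# Test the full string value of a (nested elements included) and return the text nodes of the matches
result = article.xpath("//a[contains(string(), '百度搜索')]/text()")
# Find a elements that have a text node equal to '百度搜索'
result = article.xpath("//a[text()='百度搜索']")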
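
The difference between `text()` and `string()` in the text-matching expressions above shows up once a link contains nested markup. This is a minimal sketch assuming lxml. The sample links and the `hrefs` helper are made up for illustration.

```python
from lxml import etree

# Hypothetical sample markup, used only for illustration.
SAMPLE = """
<div>
  <a href="/1">百度搜索</a>
  <a href="/2">go to <b>百度搜索</b> now</a>
  <a href="/3">点击百度搜索一下</a>
</div>
"""

article = etree.HTML(SAMPLE)


def hrefs(expr):
    """Return the href of every a element matched by expr."""
    return [a.get("href") for a in article.xpath(expr)]


# contains(text(), ...) converts the node set to a string using only the
# first text node child, so the keyword inside <b> is not seen for /2.
by_text = hrefs("//a[contains(text(), '百度搜索')]")

# string() is the concatenation of all descendant text, so /2 matches too.
by_string = hrefs("//a[contains(string(), '百度搜索')]")

# text()='...' is true when any direct text node child equals the value.
exact = hrefs("//a[text()='百度搜索']")

print(by_text, by_string, exact)
```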
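# Select input elements with type='submit' and name='fuck'; chained predicates behave like "and" here
result = article.xpath("//input[@type='submit'][@name='fuck']")
result = article.xpath("//input[@type='submit' and @name='fuck']")
# Select input elements where at least one of the two conditions holds
result = article.xpath("//input[@type='submit' or @name='fuck']")
# Select div elements whose class contains both 'a' and 'b' (substring match)
result = article.xpath('//div[contains(@class,"a") and contains(@class,"b")]')
# Select div elements whose class contains 'a' or 'b', or both
result = article.xpath('//div[contains(@class,"a") or contains(@class,"b")]')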
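
To see how `and`, `or`, and chained predicates behave, the expressions above can be run against a small form. This sketch assumes lxml. The markup, the `id` values, the neutral name `go`, and the `ids` helper are all hypothetical.

```python
from lxml import etree

# Hypothetical sample markup, used only for illustration.
SAMPLE = """
<div>
  <form>
    <input id="i1" type="submit" name="go"/>
    <input id="i2" type="submit" name="cancel"/>
    <input id="i3" type="text" name="go"/>
    <input id="i4" type="text" name="q"/>
  </form>
  <div id="d1" class="a b"></div>
  <div id="d2" class="a"></div>
  <div id="d3" class="table"></div>
  <div id="d4" class="x"></div>
</div>
"""

article = etree.HTML(SAMPLE)


def ids(expr):
    """Return the id attribute of every element matched by expr."""
    return [el.get("id") for el in article.xpath(expr)]


chained = ids("//input[@type='submit'][@name='go']")
both = ids("//input[@type='submit' and @name='go']")
either = ids("//input[@type='submit' or @name='go']")

# contains() is a substring test: class="table" contains both 'a' and 'b'.
class_and = ids('//div[contains(@class,"a") and contains(@class,"b")]')
class_or = ids('//div[contains(@class,"a") or contains(@class,"b")]')

print(chained, both, either, class_and, class_or)
```

The `class="table"` match shows why a substring test on single letters is risky: the filter cannot tell a class token from part of a longer word.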
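# Find all input elements that have a type attribute
result = article.xpath("//input[@type]")
# Match elements whose id starts with 'aa', e.g. id='aaname'
result = article.xpath("//input[starts-with(@id,'aa')]")
# Match elements whose id ends with 'aa', e.g. id='nameaa'; ends-with() is XPath 2.0 and not available in XPath 1.0 engines such as lxml, so substring() is used instead
result = article.xpath("//input[substring(@id, string-length(@id) - string-length('aa') + 1) = 'aa']")
# Match elements whose id contains 'aa', e.g. id='nameaaname'
result = article.xpath("//input[contains(@id,'aa')]")
# Match input elements that have any attribute whose value equals 'name'
result = article.xpath("//input[@*='name']")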
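
The prefix, suffix, and substring tests above can be compared side by side. This sketch assumes lxml, which evaluates XPath 1.0 through libxml2, so `ends-with()` is not available and a `substring()` expression takes its place. The sample inputs and the `ids` helper are hypothetical.

```python
from lxml import etree

# Hypothetical sample markup, used only for illustration.
SAMPLE = """
<form>
  <input id="aaname" type="text"/>
  <input id="nameaa" type="text"/>
  <input id="nameaaname" name="name"/>
  <input id="other"/>
</form>
"""

article = etree.HTML(SAMPLE)


def ids(expr):
    """Return the id attribute of every element matched by expr."""
    return [el.get("id") for el in article.xpath(expr)]


has_type = ids("//input[@type]")
starts = ids("//input[starts-with(@id,'aa')]")
contains = ids("//input[contains(@id,'aa')]")
any_attr = ids("//input[@*='name']")

# XPath 1.0 replacement for ends-with(@id, 'aa'): compare the tail of the
# attribute value, taken with substring(), against the suffix.
ends = ids(
    "//input[substring(@id, string-length(@id) - string-length('aa') + 1) = 'aa']"
)

# ends-with() itself is an XPath 2.0 function, so lxml rejects it.
try:
    article.xpath("//input[ends-with(@id,'aa')]")
    ends_with_supported = True
except etree.XPathError:
    ends_with_supported = False

print(has_type, starts, ends, contains, any_attr, ends_with_supported)
```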
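# Inside any element whose id is 'form', select any descendant whose type is 'text'; here this finds the input
result = article.xpath("//*[@id='form']//*[@type='text']")
# Use /.. to step up to the parent of the span, then select the div under that parent
result = article.xpath("//span[@class='bg']/../div")
# Count from the end with last(): last()-1 selects the second-to-last span under the form (use last() for the last one)
result = article.xpath("//form[@id='form']/span[last()-1]")
# Use position() to select the second span under the form
result = article.xpath("//form[@id='form']/span[position()=2]")
# Use position() to select the spans whose position is greater than 2 (positions are 1-based)
result = article.xpath("//form[@id='form']/span[position()>2]")
# Use | to search several paths at once and return the union of the results
result = article.xpath("//form[@id='form']//span | //form[@id='form']//input")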
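
The positional and union expressions above are easier to follow with concrete nodes. This sketch assumes lxml. The form markup is hypothetical and not from the original article.

```python
from lxml import etree

# Hypothetical sample markup, used only for illustration.
SAMPLE = """
<form id="form">
  <span class="bg">s1</span>
  <span>s2</span>
  <span>s3</span>
  <span>s4</span>
  <div>d1</div>
  <input type="text" name="q"/>
</form>
"""

article = etree.HTML(SAMPLE)

# Any descendant with type='text' below the element whose id is 'form'.
text_inputs = [el.get("name") for el in article.xpath("//*[@id='form']//*[@type='text']")]

# Step up from the span to its parent, then down to the sibling div.
sibling_div = [el.text for el in article.xpath("//span[@class='bg']/../div")]

# last() is the final position, so last()-1 is the second-to-last span.
second_to_last = [el.text for el in article.xpath("//form[@id='form']/span[last()-1]")]
second = [el.text for el in article.xpath("//form[@id='form']/span[position()=2]")]
after_second = [el.text for el in article.xpath("//form[@id='form']/span[position()>2]")]

# The | operator returns the union of both node sets.
union_tags = sorted(
    el.tag for el in article.xpath("//form[@id='form']//span | //form[@id='form']//input")
)

print(text_inputs, sibling_div, second_to_last, second, after_second, union_tags)
```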
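# Get the text of the a inside the first li of each parent
result = html.xpath("//li[1]/a/text()")
# Get the text of the a inside the last li of each parent
result = html.xpath("//li[last()]/a/text()")
# Get the text of the a inside the first two li nodes of each parent
result = html.xpath("//li[position()<3]/a/text()")
# Get the text of the a inside the third-to-last li of each parent
result = html.xpath("//li[last()-2]/a/text()")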
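
Positional predicates such as `//li[1]` are evaluated per parent, which is why each list contributes its own match. This sketch assumes lxml and uses two hypothetical lists of different lengths. The parenthesized form `(//li)[1]` is added here for contrast and is not in the original article.

```python
from lxml import etree

# Hypothetical sample markup, used only for illustration.
SAMPLE = """
<div>
  <ul>
    <li><a href="#">a1</a></li>
    <li><a href="#">a2</a></li>
    <li><a href="#">a3</a></li>
  </ul>
  <ul>
    <li><a href="#">b1</a></li>
    <li><a href="#">b2</a></li>
    <li><a href="#">b3</a></li>
    <li><a href="#">b4</a></li>
  </ul>
</div>
"""

html = etree.HTML(SAMPLE)

first = html.xpath("//li[1]/a/text()")                # first li of each ul
last = html.xpath("//li[last()]/a/text()")            # last li of each ul
first_two = html.xpath("//li[position()<3]/a/text()")
third_from_last = html.xpath("//li[last()-2]/a/text()")

# Parentheses apply the index to the whole node set instead of per parent.
first_overall = html.xpath("(//li)[1]/a/text()")

print(first, last, first_two, third_from_last, first_overall)
```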
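# Get all ancestor elements of the first li of each parent
result = html.xpath("//li[1]/ancestor::*")
# Get only the ul elements among those ancestors
result = html.xpath("//li[1]/ancestor::ul")
# Get all attribute values of the a inside the first li
result = html.xpath("//li[1]/a/attribute::*")
# Get the a children of li whose href attribute equals '搜狐'
result = html.xpath("//li/child::a[@href='搜狐']")
# Get all a descendants of body
result = html.xpath("//body/descendant::a")
# Get the third li of each parent
result = html.xpath("//li[3]")
# Get every li that comes after the third li in document order, not only its siblings
result = html.xpath("//li[3]/following::li")
# Get the li siblings that follow the third li (following-sibling::* would return following siblings of any tag)
result = html.xpath("//li[3]/following-sibling::li")
# Get the text of all span elements under ul li
result = html.xpath("//ul//li//span//text()")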
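
The axis expressions above (`ancestor::`, `attribute::`, `descendant::`, `following::`, `following-sibling::`) can be compared on one document. This sketch assumes lxml. The markup with two lists is hypothetical, chosen so that `following::` and `following-sibling::` give different results.

```python
from lxml import etree

# Hypothetical sample markup, used only for illustration.
SAMPLE = """
<html><body>
  <ul>
    <li><a href="/a" title="A">one</a></li>
    <li><a href="/b">two</a></li>
    <li><span>three</span></li>
    <li><span>four</span></li>
  </ul>
  <ul>
    <li><span>five</span></li>
  </ul>
</body></html>
"""

html = etree.HTML(SAMPLE)

# Distinct tag names among the ancestors of each first li.
ancestor_tags = sorted({el.tag for el in html.xpath("//li[1]/ancestor::*")})
ancestor_uls = len(html.xpath("//li[1]/ancestor::ul"))

# attribute::* returns the attribute values as strings.
attr_values = sorted(html.xpath("//li[1]/a/attribute::*"))

links_in_body = len(html.xpath("//body/descendant::a"))

# following:: crosses into the second list; following-sibling:: does not.
following = html.xpath("//li[3]/following::li/span/text()")
following_siblings = html.xpath("//li[3]/following-sibling::li/span/text()")

span_texts = html.xpath("//ul//li//span//text()")

print(ancestor_tags, ancestor_uls, attr_values, links_in_body)
print(following, following_siblings, span_texts)
```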
