The big pitfall of WebElement.find_element + XPath + double slash //

bigcarp

Last modified 2022-03-11 19:00:40


Category: Programming. Tags: python, selenium, webElement, findElement

First published 2022-01-12 16:46:13

Article link: https://blog.csdn.net/bigcarp/article/details/122456325


Conclusion first: the official Selenium documentation states clearly that WebElement.find_element can narrow the search scope.

However, the second statement in the example below does not seem to give the expected result:

1. driver.find_element(By.XPATH,'//div[@class="city"]/div[@class="gdp"]')
2. ele.find_element(By.XPATH,'//div[@class="city"]/div[@class="gdp"]')

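Both statements return the first match on the entire page. The second one searches the whole page instead of the container because of the leading double slash "//": an XPath that begins with // starts from the document root, no matter which element find_element is called on. Dropping the leading double slash (or writing .// to start at the current element) gives the expected result.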

My test page is as follows:

<html>
<meta content="text/html; charset=utf-8">

<body>
    <div class="country">
        <div class="provice">
            <div class="city">
                <div>广州</div>
                <div class="gdp">NO.4</div>
            </div>
        </div>
        <div class="provice">
            <div class="city">
                <div>深圳</div>
                <div class="gdp">NO.3</div>
            </div>
        </div>
    </div>
</body>
</html>

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
url = 'http://127.0.0.1:8000/a/x.htm'
driver.get(url)

# Locate the container: the parent of the div whose text is "深圳".
eleContaine = driver.find_element(By.XPATH, '//div[text()="深圳"]/..')
print(eleContaine.get_attribute('outerHTML'))

# Pitfall: the leading // makes this search the whole page, not just the container.
ele = eleContaine.find_element(By.XPATH, '//div[@class="city"]/div[@class="gdp"]')
print(ele.text)

The result I got: both 深圳 (Shenzhen) and 广州 (Guangzhou) show GDP rank NO.4.

After some testing, removing the double slash fixed it:

ele=eleContaine.find_element(By.XPATH,'//div[@class="city"]/div[@class="gdp"]')

becomes:

ele=eleContaine.find_element(By.XPATH,'div[@class="gdp"]')
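The scoping difference can be reproduced without a browser. The sketch below is only an analogy, assuming Python's standard xml.etree.ElementTree in place of Selenium: ElementTree supports just a subset of XPath and does not accept a path starting with / on an element, so the page-wide search is written as .// from the root. The page is a trimmed copy of the test page, with the unclosed <meta> tag left out so that it parses as XML.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the test page (the unclosed <meta> tag is omitted).
PAGE = """
<html>
<body>
    <div class="country">
        <div class="provice">
            <div class="city">
                <div>广州</div>
                <div class="gdp">NO.4</div>
            </div>
        </div>
        <div class="provice">
            <div class="city">
                <div>深圳</div>
                <div class="gdp">NO.3</div>
            </div>
        </div>
    </div>
</body>
</html>
""".strip()

root = ET.fromstring(PAGE)

# Pick the "city" container whose first child div reads 深圳.
cities = root.findall('.//div[@class="city"]')
container = next(c for c in cities if c[0].text == "深圳")

# Page-wide search: first match in the whole document.
page_first = root.find('.//div[@class="city"]/div[@class="gdp"]').text

# Container-scoped search: the path is relative to `container`.
scoped_child = container.find('div[@class="gdp"]').text
scoped_descendant = container.find('.//div[@class="gdp"]').text

print(page_first)         # NO.4
print(scoped_child)       # NO.3
print(scoped_descendant)  # NO.3
```

The page-wide search lands on the 广州 entry even though the 深圳 container was located first, which mirrors what the leading // did in the Selenium call.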

Here is what the XPath documentation says about the double slash.

Index	Expression	Description
1)	nodename	Selects all nodes with the name "nodename"
2)	/	Selects from the root node.
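3)	//	Selects nodes in the document from the current node that match the selection no matter where they are.
4)	.	Selects the current node
5)	..	Selects the parent of the current node
6)	@	Selects attributes

Note on row 3: this wording feels very ambiguous to me. Is the search scope the document or the current node? In XPath 1.0, // is short for /descendant-or-self::node()/, so a path that begins with // starts at the document root, while .// starts at the current node.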
See the path expressions and their details in the above example:

Path Expression	Result
bookstore	Selects all nodes with the name "bookstore"
/bookstore	Selects the root element bookstore. Note: if the path starts with a slash ( / ) it always represents an absolute path to an element!
bookstore/book	Selects all book elements that are children of bookstore.
//book	Selects all book elements no matter where they are in the document.
bookstore//book	Selects all book elements that are descendant of the bookstore element, no matter where they are under the bookstore element.
//@lang	Selects all attributes that are named lang.
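One way to guard against the pitfall is to normalize locators before handing them to WebElement.find_element. The helper below is a hypothetical sketch (scoped_xpath is not part of Selenium): it assumes a plain path expression and simply prepends a dot when the path starts with a slash, so that the path is anchored at the context element instead of the document root. It does not handle expressions that start with a parenthesis or a function call.

```python
def scoped_xpath(xpath: str) -> str:
    """Return an XPath anchored at the context element.

    Hypothetical helper, not a Selenium API: a leading "/" or "//"
    anchors the path at the document root, so a "." is prepended to
    anchor it at the current node instead.
    """
    xpath = xpath.strip()
    if xpath.startswith("/"):
        return "." + xpath
    return xpath


print(scoped_xpath('//div[@class="gdp"]'))   # .//div[@class="gdp"]
print(scoped_xpath('div[@class="gdp"]'))     # div[@class="gdp"]
print(scoped_xpath('.//div[@class="gdp"]'))  # .//div[@class="gdp"]
```

With Selenium this would be used as eleContaine.find_element(By.XPATH, scoped_xpath('//div[@class="gdp"]')). The path itself still has to be written relative to the container: './/div[@class="city"]/div[@class="gdp"]' would find nothing inside the city container, because .// only looks at descendants and the container is the city div itself.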

------------------

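Several times in my project, when "matching one item from a list", the result was not the one I expected, which was quite puzzling. I checked the official Selenium documentation, and it also states clearly that WebElement.find_element can narrow the search scope.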

Evaluating a subset of the DOM
Rather than finding a unique locator in the entire DOM, it is often useful to narrow the search to the scope of another located element.

<ol id="vegetables">
 <li class="potatoes">…
 <li class="onions">…
 <li class="tomatoes"><span>Tomato is a Vegetable</span>…
</ol>
<ul id="fruits">
  <li class="bananas">…
  <li class="apples">…
  <li class="tomatoes"><span>Tomato is a Fruit</span>…
</ul>


===============

fruits = driver.find_element(By.ID, "fruits")
fruit = fruits.find_element(By.CLASS_NAME, "tomatoes")
