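Conclusion first: the official Selenium documentation states that WebElement.find_element narrows the search to the element it is called on.
Yet the second statement below does not seem to give the expected result:
1. driver.find_element(By.XPATH,'//div[@class="city"]/div[@class="gdp"]')
2. ele.find_element(By.XPATH,'//div[@class="city"]/div[@class="gdp"]')
Both statements return the first match on the whole page. The second one searches the whole page rather than the container because of the leading double slash "//": an XPath that starts with "//" is evaluated from the document root, whatever element find_element is called on. Dropping the leading "//" (or writing ".//" instead) gives the expected result.
My test page is as follows: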
<html>
<meta content="text/html; charset=utf-8">
<body>
<div class="country">
<div class="provice">
<div class="city">
<div>广州</div>
<div class="gdp">NO.4</div>
</div>
</div>
<div class="provice">
<div class="city">
<div>深圳</div>
<div class="gdp">NO.3</div>
</div>
</div>
</div>
</body>
</html>
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
url = 'http://127.0.0.1:8000/a/x.htm'
driver.get(url)
# Locate the container: the parent (div.city) of the div whose text is 深圳.
eleContaine = driver.find_element(By.XPATH, '//div[text()="深圳"]/..')
print(eleContaine.get_attribute('outerHTML'))
# Meant to search inside the container, but the leading // searches the whole page.
ele = eleContaine.find_element(By.XPATH, '//div[@class="city"]/div[@class="gdp"]')
print(ele.text)
The result I got: for both 深圳 and 广州, the output is [GDP rank NO.4].
Testing showed that removing the leading double slash fixes it:
ele=eleContaine.find_element(By.XPATH,'//div[@class="city"]/div[@class="gdp"]')
becomes:
ele=eleContaine.find_element(By.XPATH,'div[@class="gdp"]')
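As a hedged sketch of the fix: besides dropping the leading slashes, prefixing the expression with "." also anchors it at the container. The helper name scoped_xpath below is hypothetical (it is not part of Selenium), and it assumes a single path expression rather than a union such as //a | //b. find_gdp assumes its argument is a WebElement like eleContaine above.

```python
def scoped_xpath(xpath: str) -> str:
    """Return an XPath that is evaluated relative to the context element.

    Hypothetical helper, not part of Selenium. It only handles a single
    path expression, not unions such as `//a | //b`.
    """
    xpath = xpath.strip()
    # A leading "/" or "//" anchors the path at the document root,
    # so prefix "." to anchor it at the context node instead.
    if xpath.startswith("/"):
        return "." + xpath
    return xpath


def find_gdp(container):
    """Look up the gdp div inside `container`, assumed to be a WebElement."""
    # Imported here so scoped_xpath can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    return container.find_element(By.XPATH, scoped_xpath('//div[@class="gdp"]'))


print(scoped_xpath('//div[@class="gdp"]'))  # .//div[@class="gdp"]
print(scoped_xpath('div[@class="gdp"]'))    # div[@class="gdp"]
```

Note that the original expression would still find nothing after this rewrite: eleContaine is itself the div.city, and .//div[@class="city"] only looks at nodes below the container, so the path has to start from the gdp div.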
Here is what the XPath documentation says about the double slash.
Index | Expression | Description |
---|---|---|
1) | nodename | Selects all nodes with the name "nodename" |
2) | / | Selects from the root node. |
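3) | // | Selects nodes in the document from the current node that match the selection no matter where they are. (Author's note: this wording is ambiguous. Is the search scope the document or the current node? In practice a leading // starts at the document root, while .// starts at the current node.) |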
4) | . | Selects the current node |
5) | .. | Selects the parent of the current node |
6) | @ | Selects attributes |
See the path expressions and their details in the above example:
Path Expression | Result |
---|---|
bookstore | Selects all nodes with the name "bookstore" |
/bookstore | Selects the root element bookstore. Note: if the path starts with a slash ( / ) it always represents an absolute path to an element! |
bookstore/book | Selects all book elements that are children of bookstore. |
//book | Selects all book elements no matter where they are in the document. |
bookstore//book | Selects all book elements that are descendant of the bookstore element, no matter where they are under the bookstore element. |
//@lang | Selects all attributes that are named lang. |
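To see the difference between document scope and element scope without a browser, here is a small sketch using Python's standard xml.etree.ElementTree on a trimmed, well-formed copy of the test page. ElementTree supports only a subset of XPath and is a stand-in here, not what Selenium runs in the browser; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the test page (the unclosed <meta> is dropped).
PAGE = """
<html>
<body>
<div class="country">
  <div class="provice">
    <div class="city"><div>广州</div><div class="gdp">NO.4</div></div>
  </div>
  <div class="provice">
    <div class="city"><div>深圳</div><div class="gdp">NO.3</div></div>
  </div>
</div>
</body>
</html>
"""

root = ET.fromstring(PAGE)

# Searching from the document root returns the first match on the whole page.
first_on_page = root.find(".//div[@class='gdp']").text

# Locate the city container whose first child div reads 深圳.
shenzhen = next(
    city for city in root.iter("div")
    if city.get("class") == "city" and city.findtext("div") == "深圳"
)

# The same relative path, evaluated from the container, stays inside it.
inside_container = shenzhen.find(".//div[@class='gdp']").text

print(first_on_page, inside_container)  # NO.4 NO.3
```

The path text is identical in both calls; only the node it is evaluated from changes, which is the behaviour ".//" gives in Selenium as well.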
------------------
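Several times in my project, "matching one item from a list" returned something other than the expected result, which was puzzling. I checked the official Selenium documentation, and it also states clearly that WebElement.find_element narrows the search scope.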
Evaluating a subset of the DOM
Rather than finding a unique locator in the entire DOM, it is often useful to narrow the search to the scope of another located element.
<ol id="vegetables">
<li class="potatoes">…
<li class="onions">…
<li class="tomatoes"><span>Tomato is a Vegetable</span>…
</ol>
<ul id="fruits">
<li class="bananas">…
<li class="apples">…
<li class="tomatoes"><span>Tomato is a Fruit</span>…
</ul>
===============
fruits = driver.find_element(By.ID, "fruits")
fruit = fruits.find_element(By.CLASS_NAME, "tomatoes")