Python: Using Selenium and Locating Elements

我姓曹，谢谢

Last modified: 2023-12-14 09:48:41

Tags: python, selenium

First published: 2023-03-14 17:20:22

Article link: https://blog.csdn.net/weixin_57999977/article/details/129533368

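The find_element_by_*() methods need only the import from selenium import webdriver. The find_element() method needs that import plus from selenium.webdriver.common.by import By. The find_element_by_*() methods were deprecated in Selenium 4.0 and removed in Selenium 4.3.0, so code written for current releases should use find_element().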

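Selenium 4 provides eight strategies for locating a single node, listed below:

Method: Description

find_element(By.ID, value): locates a node by the value of its id attribute

find_element(By.NAME, value): locates a node by the value of its name attribute

find_element(By.CLASS_NAME, value): locates a node by one class name from its class attribute

find_element(By.TAG_NAME, value): locates a node by its tag name

find_element(By.LINK_TEXT, value): locates an <a> node by its full link text (exact match)

find_element(By.PARTIAL_LINK_TEXT, value): locates an <a> node by part of its link text (partial match)

find_element(By.XPATH, value): locates a node by an XPath expression

find_element(By.CSS_SELECTOR, value): locates a node by a CSS selector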

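find_element returns the first node that matches the condition and raises NoSuchElementException when nothing matches.

To find every matching node, use find_elements instead. It returns a list (empty when nothing matches), so the nodes can be traversed with a for loop.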
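As a minimal sketch of that difference, the two helpers below wrap find_elements so that a missing node yields an empty list or None instead of an exception. The names collect_texts, first_or_none and demo, as well as the locator values used in demo, are illustrative assumptions and not part of the original article.

```python
def collect_texts(driver, by, value):
    """Return the visible text of every node that matches the locator."""
    # find_elements always returns a list; it is empty when nothing matches.
    return [node.text for node in driver.find_elements(by, value)]


def first_or_none(driver, by, value):
    """Return the first matching node, or None when there is no match."""
    # Unlike find_element, this never raises NoSuchElementException.
    nodes = driver.find_elements(by, value)
    return nodes[0] if nodes else None


def demo():
    """Try the helpers in a real browser (needs Selenium 4 and Chrome)."""
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    browser = webdriver.Chrome()
    try:
        browser.get("https://www.baidu.com")
        # Traverse every <input> node with a for loop.
        for node in browser.find_elements(By.TAG_NAME, "input"):
            print(node.get_attribute("name"))
        # A hypothetical id; the helper returns None if the page lacks it.
        print(first_or_none(browser, By.ID, "no-such-id"))
    finally:
        browser.quit()
```

Calling demo() opens Chrome; the two helpers themselves only need an object that offers find_elements, which also makes them easy to exercise with a stub.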

1. Locating Elements


from selenium import webdriver
from selenium.webdriver.common.by import By

# Create the browser object and open the site
browser = webdriver.Chrome()
url = "https://www.baidu.com"
browser.get(url)

# The locator values below are illustrative; find_element raises
# NoSuchElementException when the page has no matching node

# Locate by id
browser.find_element(By.ID, 'su')

# Locate by name
# Returns the first matching element
browser.find_element(By.NAME, 'fenlei')
# Returns all matching elements
browser.find_elements(By.NAME, 'fenlei')

# Locate by class (By.CLASS_NAME takes one class name, not a space-separated list)
browser.find_element(By.CLASS_NAME, 'title-text')
browser.find_elements(By.CLASS_NAME, 'title-text')

# Locate by tag name
browser.find_element(By.TAG_NAME, 'input')
browser.find_elements(By.TAG_NAME, 'input')

# Locate hyperlinks by full or partial link text
browser.find_element(By.LINK_TEXT, 'index')
browser.find_elements(By.LINK_TEXT, 'index')
browser.find_element(By.PARTIAL_LINK_TEXT, 'index')
browser.find_elements(By.PARTIAL_LINK_TEXT, 'index')

# Locate by XPath
browser.find_element(By.XPATH, '//input[@id="su"]')
browser.find_elements(By.XPATH, '//input[@id="su"]')

# Locate by CSS selector
browser.find_element(By.CSS_SELECTOR, '#su')
browser.find_elements(By.CSS_SELECTOR, '#su')

# Quit the browser and end the WebDriver session
browser.quit()

# The snippets below assume that driver is a WebDriver on a page containing
# div#B, div#C and div#D, and that By has been imported as shown earlier

# Parent to child
# 1. Chained lookup
print(driver.find_element(By.ID, 'B').find_element(By.TAG_NAME, 'div').text)

# 2. XPath parent/child path
print(driver.find_element(By.XPATH, "//div[@id='B']/div").text)

# 3. CSS selector child combinator
print(driver.find_element(By.CSS_SELECTOR, 'div#B>div').text)

# 4. CSS selector nth-child
print(driver.find_element(By.CSS_SELECTOR, 'div#B div:nth-child(1)').text)

# 5. CSS selector nth-of-type
print(driver.find_element(By.CSS_SELECTOR, 'div#B div:nth-of-type(1)').text)

# 6. XPath axis child
print(driver.find_element(By.XPATH, "//div[@id='B']/child::div").text)

# Child to parent
# 1. XPath: '.' is the current node; '..' is the parent node
print(driver.find_element(By.XPATH, "//div[@id='C']/../..").text)

# 2. XPath axis parent
print(driver.find_element(By.XPATH, "//div[@id='C']/parent::*/parent::div").text)

# Older (preceding) sibling
# 1. XPath, reaching the older sibling through the parent node
print(driver.find_element(By.XPATH, "//div[@id='D']/../div[1]").text)

# 2. XPath axis preceding-sibling
print(driver.find_element(By.XPATH, "//div[@id='D']/preceding-sibling::div[1]").text)

# Younger (following) sibling
# 1. XPath, reaching the younger sibling through the parent node
print(driver.find_element(By.XPATH, "//div[@id='D']/../div[3]").text)

# 2. XPath axis following-sibling
print(driver.find_element(By.XPATH, "//div[@id='D']/following-sibling::div[1]").text)

# 3. XPath axis following (find_element returns the first node after div#D)
print(driver.find_element(By.XPATH, "//div[@id='D']/following::*").text)

# 4. CSS selector + (the sibling immediately after div#D)
print(driver.find_element(By.CSS_SELECTOR, 'div#D + div').text)

# 5. CSS selector ~ (any later sibling; find_element returns the first)
print(driver.find_element(By.CSS_SELECTOR, 'div#D ~ div').text)

# Quit only after every lookup has run
driver.quit()
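The relative lookups above depend on a page layout that the article does not show. The sketch below supplies one hypothetical layout that is consistent with them and loads it through a data: URL so the expressions can be tried locally. Only the ids B, C and D come from the snippets; the SAMPLE_HTML markup, its text content, the LOCATORS list, and the page_url and demo functions are assumptions made for illustration.

```python
from urllib.parse import quote

# Hypothetical page: only the ids B, C and D come from the snippets above.
SAMPLE_HTML = """<html><body>
<div id="A">
  <div id="B"><div>parent to child</div></div>
  <div>child to parent<div><div id="C"></div></div></div>
  <div>
    <div>brother 1</div>
    <div id="D"></div>
    <div>brother 2</div>
  </div>
</div>
</body></html>"""

# (strategy, expression) pairs; the strategy strings are the values of
# By.XPATH and By.CSS_SELECTOR.
LOCATORS = [
    ("xpath", "//div[@id='B']/div"),                        # parent to child
    ("css selector", "div#B > div"),                        # parent to child
    ("xpath", "//div[@id='C']/../.."),                      # child to grandparent
    ("xpath", "//div[@id='D']/preceding-sibling::div[1]"),  # older sibling
    ("xpath", "//div[@id='D']/following-sibling::div[1]"),  # younger sibling
    ("css selector", "div#D + div"),                        # younger sibling
]


def page_url(html):
    """Encode an HTML string as a data: URL that driver.get() can open."""
    return "data:text/html;charset=utf-8," + quote(html)


def demo():
    """Print the text each locator finds (needs Selenium 4 and Chrome)."""
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(page_url(SAMPLE_HTML))
        for by, expression in LOCATORS:
            print(expression, "->", driver.find_element(by, expression).text)
    finally:
        driver.quit()
```

Keeping the markup in a string avoids any dependency on a live site, so a locator can be adjusted and rerun without the page changing underneath it.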

2. Getting Node Data


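from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()
url = "https://www.baidu.com"
browser.get(url)

# Locate the node
element = browser.find_element(By.ID, 'su')

# Get the value of the class attribute
print(element.get_attribute('class'))
# Get the value of the id attribute
print(element.get_attribute('id'))
# Get the value of the type attribute
print(element.get_attribute('type'))
# Get the value of the value attribute
print(element.get_attribute('value'))

# Locate a link by its text ("地图" is the "Map" link on the Baidu page)
element1 = browser.find_element(By.LINK_TEXT, "地图")
# Get the node's width and height
print(element1.size)

# Quit only after all lookups are done; a closed browser cannot be queried
browser.quit()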
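Besides get_attribute and size, a located node also exposes text, tag_name and location. The helper below gathers them into one dictionary. The names describe_node, attributes and demo are hypothetical, and demo assumes the Baidu page still contains a node with the id su.

```python
def describe_node(node, attributes=("id", "class", "type", "value")):
    """Collect commonly needed data from a located node into a dict."""
    info = {
        "tag": node.tag_name,       # tag name of the node
        "text": node.text,          # visible text inside the node
        "size": node.size,          # dict with "height" and "width"
        "location": node.location,  # dict with "x" and "y" of the top-left corner
    }
    # get_attribute can return None when the node has no such attribute.
    for name in attributes:
        info[name] = node.get_attribute(name)
    return info


def demo():
    """Print the data of one node (needs Selenium 4 and Chrome)."""
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    browser = webdriver.Chrome()
    try:
        browser.get("https://www.baidu.com")
        print(describe_node(browser.find_element(By.ID, "su")))
    finally:
        browser.quit()
```

Returning a plain dict keeps the data usable after the browser has quit, whereas the node object itself can no longer be queried at that point.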

3. Handling Special Nodes (Shadow DOM)

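This kind of node is a dynamically loaded HTML tag whose content lives inside a shadow DOM, so it needs special handling: run a JavaScript statement to obtain the node's shadow root, then search inside that root for the data you need. Note in particular that XPath cannot be used to find elements inside a shadow DOM; the lookup fails with an error saying the element cannot be found.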


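def expand_shadow_element(driver, element):
    # Return the shadow root attached to element, obtained through JavaScript
    return driver.execute_script('return arguments[0].shadowRoot', element)


# Assumes driver is a WebDriver already on a page that contains <gradio-app>,
# By has been imported, and in_word holds the text to type (may be empty)
root1 = driver.find_element(By.XPATH, '/html/body/gradio-app')

shadow_root1 = expand_shadow_element(driver, root1)
# Inside the shadow root, locate by id and tag name rather than XPath
a = shadow_root1.find_element(By.ID, 'txt2img_prompt').find_element(By.TAG_NAME, 'textarea')
a.click()
a.clear()
# Fall back to 'sea' when no input word was given
a.send_keys(in_word if in_word else 'sea')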
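Selenium 4 (from release 4.1 of the Python bindings, as far as I know) also exposes a shadow_root property on a located node for open shadow roots in Chromium-based browsers, which can stand in for the JavaScript call. The sketch below uses it with locator tuples. The helper name find_in_shadow and the url argument of demo are assumptions; the gradio-app host and the txt2img_prompt id are taken from the snippet above.

```python
def find_in_shadow(driver, host_locator, inner_locator):
    """Locate a node inside the shadow root attached to a host node.

    Both locators are (strategy, value) tuples such as (By.CSS_SELECTOR, "#x").
    """
    host = driver.find_element(*host_locator)
    # shadow_root raises NoSuchShadowRootException if the host has none.
    root = host.shadow_root
    # XPath is not supported inside a shadow root; CSS selectors are.
    return root.find_element(*inner_locator)


def demo(url):
    """Type a prompt into the text box; url is a placeholder for the page address."""
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        box = find_in_shadow(
            driver,
            (By.CSS_SELECTOR, "gradio-app"),
            (By.CSS_SELECTOR, "#txt2img_prompt textarea"),
        )
        box.clear()
        box.send_keys("sea")
    finally:
        driver.quit()
```

Passing locators as tuples keeps the strategy and value together, so the same helper works for any host node and any CSS selector inside its shadow root.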