Environment Setup
Installation
pip install selenium
pip install beautifulsoup4
If that doesn't work, try downloading and installing the packages directly into the package folder of the interpreter your editor is running:
pip install --target=E:\AZ\python\ANACONDA\envs\py38\Lib\site-packages beautifulsoup4
pip install --target=E:\AZ\python\ANACONDA\envs\py38\Lib\site-packages selenium
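To confirm both packages are importable from the active environment after installation, a quick stdlib-only check (note that beautifulsoup4 installs under the import name bs4):

```python
import importlib.util

def is_installed(pkg):
    """Return True if the named package can be imported from this environment."""
    return importlib.util.find_spec(pkg) is not None

# beautifulsoup4 installs under the import name 'bs4'
for pkg in ("selenium", "bs4"):
    print(pkg, "OK" if is_installed(pkg) else "MISSING")
```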
Download chromedriver
Message: 'chromedriver' executable needs to be in PATH. Please see https://chromedriver.chromium.org/home
Recommended download address for chromedriver: https://chromedriver.chromium.org
Place the chromedriver.exe file in the same directory as python.exe.
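To find that directory for the interpreter you are actually running (easy to get wrong when several Anaconda environments are installed), print it from Python:

```python
import os
import sys

# sys.executable is the full path of the running python.exe;
# its parent folder is where this tutorial suggests placing chromedriver.exe.
print(os.path.dirname(sys.executable))
```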
The following error means the downloaded ChromeDriver version does not match the installed Chrome; download the 102.0.5005 build instead:
Message: session not created: This version of ChromeDriver only supports Chrome version 103
Current browser version is 102.0.5005.115 with binary path C:\Users\CFFHL\AppData\Local\Google\Chrome\Application\chrome.exe
Stacktrace:
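As the message shows, ChromeDriver's major version must equal Chrome's major version (103 vs. 102 above). A small helper to check a pair of version strings (hypothetical, not part of Selenium):

```python
def majors_match(chrome_version: str, driver_version: str) -> bool:
    """ChromeDriver only supports the Chrome release sharing its major version."""
    return chrome_version.split(".")[0] == driver_version.split(".")[0]

# The pairing from the error message above: driver 103 vs. browser 102
print(majors_match("102.0.5005.115", "103.0.5060.53"))  # → False
```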
Reference code
from bs4 import BeautifulSoup
from selenium import webdriver

target = 'target URL'
option = webdriver.ChromeOptions()
option.add_argument('headless')  # run the browser in the background (headless)
driver = webdriver.Chrome(chrome_options=option)
driver.get(target)
result = driver.find_element_by_class_name('class name to click')
result.click()
result_list = driver.find_elements_by_class_name('class name to click')
for i in range(4, 8):
    result_list[i].click()
selenium_page = driver.page_source
driver.quit()

soup = BeautifulSoup(selenium_page, 'html.parser')
# one = soup.find('div', {'class': 'blah-blah class name'})  # single element
many = soup.find_all('div', {'class': 'blah-blah class name'})  # multiple elements
for i in many:
    content = i.find_all('p')  # find the matching child elements
    nation = content[0].get_text()  # read the text content
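The BeautifulSoup half of the script can be tried without launching a browser. A minimal sketch, assuming beautifulsoup4 is installed as above and using made-up markup in place of driver.page_source:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for driver.page_source
html = """
<div class="city"><p>China</p><p>Beijing</p></div>
<div class="city"><p>France</p><p>Paris</p></div>
"""

soup = BeautifulSoup(html, "html.parser")
many = soup.find_all("div", {"class": "city"})  # multiple elements
for item in many:
    content = item.find_all("p")                # the <p> children of each <div>
    print(content[0].get_text(), "-", content[1].get_text())
```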
Example: analyzing the Baidu homepage
from bs4 import BeautifulSoup
from selenium import webdriver

target = 'https://www.baidu.com/'
option = webdriver.ChromeOptions()
option.add_argument('headless')  # run the browser in the background (headless)
driver = webdriver.Chrome(chrome_options=option)
driver.get(target)
result = driver.find_element_by_class_name('bg s_btn btnhover')
result.click()
result_list = driver.find_elements_by_class_name('bg s_btn btnhover')
for i in range(4, 8):
    result_list[i].click()
selenium_page = driver.page_source
driver.quit()

soup = BeautifulSoup(selenium_page, 'html.parser')
print(soup)
# one = soup.find('div', {'class': 'blah-blah class name'})  # single element
# many = soup.find_all('div', {'class': 'blah-blah class name'})  # multiple elements
# for i in many:
#     content = i.find_all('p')  # find the matching child elements
#     nation = content[0].get_text()  # read the text content
If you see either the correct output or the following error, the environment is set up successfully:
AttributeError: 'WebDriver' object has no attribute 'find_element_by_class_name'
This error appears under Selenium 4, which removed the old find_element_by_* methods. Add:
from selenium.webdriver.common.by import By
and change the element selection to:
driver.find_element(By.XPATH, r'xpath copied from the page').click()
Console debugging
In the browser DevTools console, you can test an XPath with:
$x('xpath copied from the page')
The following error can be ignored:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
requests 2.22.0 requires urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1, but you have urllib3 1.26.9 which is incompatible.
If needed, it can be resolved by reinstalling requests:
pip list
pip uninstall requests
pip install requests