Selenium Web Scraping

Latest recommended article published on 2024-10-03 09:02:12

chenkan0214

Article tags: python, web scraping, testing

Original link: https://my.oschina.net/u/3771014/blog/1634194

Basics:


"""
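Selenium
is a tool for testing web applications. It provides functions for locating specific elements on a page and acting on them; the Python bindings used here expose those locating functions as a Python API. The original Selenium Core was implemented in JavaScript, while the WebDriver API used below drives a real browser through a browser-specific driver, so it closely simulates what a user does.

# Why use Selenium for scraping:
Some sites load their data dynamically, so a plain HTTP request does not return that data. Selenium can load and operate the page in a real browser, wait until the data has finished loading, and then continue parsing it.

It can also simulate a user login and then access the data directly. There is no need to extract cookies by hand: the browser automatically sends whatever data each request needs.
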
"""
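import time

# Import webdriver and the By locator strategies from selenium
from selenium import webdriver
from selenium.webdriver.common.by import By

# Create a Firefox browser object; this opens a browser window automatically
firefox = webdriver.Firefox()
# chrome = webdriver.Chrome()
# Open the target URL
firefox.get("http://www.baidu.com")
# chrome.get("http://www.baidu.com")

# Locator strategies. find_element(By.X, value) works in old and new Selenium;
# the find_element_by_* helpers were removed in Selenium 4.3.
# # Find by class attribute value
# firefox.find_element(By.CLASS_NAME, value)
# # Find by id attribute value
# firefox.find_element(By.ID, value)
# # Find by the text of a hyperlink
# firefox.find_element(By.LINK_TEXT, value)
# # Find by CSS selector
# firefox.find_element(By.CSS_SELECTOR, value)
# # Find by name attribute value
# firefox.find_element(By.NAME, value)
# # Find by tag name
# firefox.find_element(By.TAG_NAME, value)
# # Find by XPath
# firefox.find_element(By.XPATH, value)

# Locate the Baidu search input box
ele = firefox.find_element(By.ID, "kw")
# ele = firefox.find_element(By.CLASS_NAME, "s_ipt")
# ele = chrome.find_element(By.CLASS_NAME, "s_ipt")

# Type text into the input box
# ele.send_keys("selenium")
# # Find the "百度一下" (search) button
# btn = firefox.find_element(By.ID, "su")
# # Click it
# btn.click()

# get_attribute returns the value of an attribute on the element
res = ele.get_attribute("class")
print(res)
# Get the element's text content (empty for an <input>; what was typed is in its "value" attribute)
res = ele.text
print(res)
# Get the element's tag name
res = ele.tag_name
print(res)
# Check whether the element is selected
res = ele.is_selected()
print(res)
# Check whether the element is enabled
res = ele.is_enabled()
print(res)
# Type some text into the input box
ele.send_keys("selenium")
# Click the element
ele.click()
# Submit the form the element belongs to
ele.submit()
# Crude fixed pause while the results page loads
time.sleep(1)
# The submit may reload the page and make the old reference stale, so locate the input again
ele = firefox.find_element(By.ID, "kw")
# Clear the contents of the input box
ele.clear()

# Save a screenshot of the element
ele.screenshot('test.png')
# Quit the browser
firefox.quit()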
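
The script pauses for a fixed second after submitting, while the introduction talks about waiting until the data has finished loading. Below is a minimal sketch of that idea as a generic polling helper. The names `wait_until` and `data_loaded` are hypothetical and not part of Selenium, and a counter stands in for the page so the sketch runs without a browser.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll condition() until it returns a truthy value or the timeout passes.

    Returns the truthy value; raises TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for a page whose data only shows up on the third check.
attempts = []


def data_loaded():
    attempts.append(1)
    return "results" if len(attempts) >= 3 else None


print(wait_until(data_loaded, timeout=2.0, poll_interval=0.01))
```

With a real driver, the condition would be a function that looks for the element you need and returns it once present. Selenium also ships its own version of this pattern, `WebDriverWait` in `selenium.webdriver.support.ui`, which is usually the better choice in real scraping code.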

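If any step before `firefox.quit()` raises an exception, the browser window is left open. One possible way to guard against that is a small context manager that always calls `quit()`. This is only a sketch: `browser_session` and `FakeDriver` are hypothetical names, and the fake driver is there so the example runs without a browser.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(make_driver):
    """Create a driver with make_driver() and always call quit() on exit."""
    driver = make_driver()
    try:
        yield driver
    finally:
        driver.quit()


class FakeDriver:
    """Stand-in for a WebDriver so the sketch runs without a browser."""

    def __init__(self):
        self.closed = False

    def quit(self):
        self.closed = True


fake = FakeDriver()
try:
    with browser_session(lambda: fake) as driver:
        raise RuntimeError("simulated failure while scraping")
except RuntimeError:
    pass

# quit() ran even though the body raised
print(fake.closed)
```

With a real browser the call would look like `with browser_session(webdriver.Firefox) as firefox:`, with the scraping steps indented inside the block.
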
Reposted from: https://my.oschina.net/u/3771014/blog/1634194