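Selenium in Practice
I. Selenium Background and Concepts
Selenium is a testing tool for web applications. Selenium tests run directly in the browser, and test scripts can be written in several languages, including .NET, Java, Perl, and Python.
The major app platforms expose no publishing API, and their SDKs do not cover this functionality either, so this article takes a different approach: it uses Selenium to drive each platform's web pages and publishes the app indirectly. The major app stores include "360开放平台" (360 Open Platform), "腾讯应用宝" (Tencent MyApp), "VIVO应用商店" (VIVO App Store), "OPPO软件商店" (OPPO Software Store), "百度开放平台" (Baidu Open Platform), Xiaomi, Huawei, Meizu, and "阿里应用分发开放平台" (Alibaba App Distribution Open Platform). The Tencent platform is used here as the example for implementing app publishing.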
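II. Selenium Components
- Selenium Hub: also called the Selenium Server. It receives requests and forwards each one to a suitable node for execution.
- Selenium Node: where Selenium tasks actually run. It receives tasks from the hub and drives the corresponding browser to execute them.
- Python test: the script that defines the task and sends it to the Selenium Hub.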
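III. Setting Up Selenium
1. Prepare a Java environment (not covered here). Both the hub and the nodes need it.
2. Set up the Selenium Hub
- Download the selenium-server-standalone jar: /data/sel/selenium-server-standalone-2.42.2.jar
- Prepare the Selenium configuration file /data/sel/config.json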
{
  "host": null,
  "port": 4444,
  "prioritizer": null,
  "capabilityMatcher": "org.openqa.grid.internal.utils.DefaultCapabilityMatcher",
  "throwOnCapabilityNotPresent": true,
  "newSessionWaitTimeout": -1,
  "jettyMaxThreads": -1,
  "nodePolling": 5000,
  "cleanUpCycle": 5000,
  "timeout": 30000,
  "browserTimeout": 0,
  "maxSession": 5,
  "unregisterIfStillDownAfter": 30000
}
- Start the Selenium Hub service
#] java -jar /data/sel/selenium-server-standalone-2.42.2.jar -role hub -hubConfig /data/sel/config.json
Sep 29, 2020 5:07:15 PM org.openqa.grid.selenium.GridLauncher main
INFO: Launching a selenium grid server
2020-09-29 17:07:16.404:INFO:osjs.Server:jetty-7.x.y-SNAPSHOT
2020-09-29 17:07:16.427:INFO:osjsh.ContextHandler:started o.s.j.s.ServletContextHandler{/,null}
2020-09-29 17:07:16.435:INFO:osjs.AbstractConnector:Started SocketConnector@0.0.0.0:4444
Once the log reports the listening port, the Selenium Hub is up and running.
3. Set up the Selenium Node
- Download the selenium-server-standalone jar: D:\apps\selenium-server-standalone-2.42.2.jar
- Install Google Chrome
- Install the chromedriver.exe Chrome driver: D:\apps\chromedriver.exe
- Prepare the Selenium configuration file D:\apps\config.json
{
  "capabilities": [
    {
      "browserName": "*googlechrome",
      "maxInstances": 1,
      "seleniumProtocol": "Selenium"
    },
    {
      "browserName": "chrome",
      "maxInstances": 1,
      "seleniumProtocol": "WebDriver"
    }
  ],
  "configuration": {
    "proxy": "org.openqa.grid.selenium.proxy.DefaultRemoteProxy",
    "maxSession": 1,
    "port": 5555,
    "register": true,
    "registerCycle": 5000
  }
}
- Open cmd and start the Selenium Node service (replace NODE_ADDRESS and HUB_ADDRESS with the node's and the hub's addresses)
java -Dwebdriver.chrome.driver=D:\apps\chromedriver.exe -jar D:\apps\selenium-server-standalone-2.42.2.jar -role node -host NODE_ADDRESS -hub http://HUB_ADDRESS:4444/grid/register -nodeConfig D:\apps\config.json
Sep 29, 2020 5:14:34 PM org.openqa.grid.selenium.GridLauncher main
INFO: Launching a selenium grid node
17:14:35.101 INFO - Java: Oracle Corporation 15+36-1562
17:14:35.101 INFO - OS: Windows 10 10.0 amd64
17:14:35.105 INFO - v2.42.2, with Core v2.42.2. Built from revision 6a6995d
17:14:35.144 INFO - RemoteWebDriver instances should connect to: http://127.0.0.1:5555/wd/hub
17:14:35.145 INFO - Version Jetty/5.1.x
17:14:35.145 INFO - Started HttpContext[/selenium-server,/selenium-server]
17:14:35.175 INFO - Started org.openqa.jetty.jetty.servlet.ServletHandler@71c3b41
17:14:35.175 INFO - Started HttpContext[/wd,/wd]
17:14:35.176 INFO - Started HttpContext[/selenium-server/driver,/selenium-server/driver]
17:14:35.176 INFO - Started HttpContext[/,/]
17:14:35.177 INFO - Started SocketListener on 0.0.0.0:5555
17:14:35.177 INFO - Started org.openqa.jetty.jetty.Server@3ce1e309
17:14:35.178 INFO - using the json request : {"capabilities":[{"seleniumProtocol":"Selenium","browserName":"*googlechrome","maxInstances":1,"platform":"XP"},{"seleniumProtocol":"WebDriver","browserName":"chrome","maxInstances":1,"platform":"XP"}],"configuration":{"role":"node","remoteHost":"http://192.168.89.1:5555","hubHost":"192.168.89.133","hubPort":4444,"nodeConfig":"D:\\apps\\config.json","url":"http://192.168.89.1:5555","proxy":"org.openqa.grid.selenium.proxy.DefaultRemoteProxy","hub":"http://192.168.89.133:4444/grid/register","port":5555,"host":"192.168.89.1","maxSession":1,"registerCycle":5000,"register":true},"class":"org.openqa.grid.common.RegistrationRequest"}
17:14:35.178 INFO - Starting auto register thread. Will try to register every 5000 ms.
17:14:35.178 INFO - Registering the node to hub :http://HUB_ADDRESS:4444/grid/register
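Output like this means the Selenium Node started successfully.
Note: the node and the hub must be able to reach each other directly over the network. If either side cannot reach the other, the node cannot connect to the hub.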
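Before starting the node, it can help to confirm that the hub's port is reachable from the node machine. The following is a minimal sketch using only the Python standard library; the helper name `can_reach` is an illustrative assumption and not part of Selenium.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake and raises OSError on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example (hypothetical hub address, run on the node machine):
#   can_reach("192.168.89.133", 4444)
```

Running the same check from the hub machine against the node's port 5555 covers the other direction. A successful TCP connection only shows that the port is open; it does not prove that the Selenium service behind it is healthy.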
4. Write and run the Python task
- Install selenium
#] pip3 install selenium
- Write the Python script qqplatform.py
import sys
import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.support.ui import WebDriverWait


class QQPlatform():
    def __init__(self):
        self.url = "https://open.qq.com/login"
        self.username = "username"
        self.password = "password"
        self.hub = 'HUB_ADDRESS'  # replace with the hub's address
        self.browser = webdriver.Remote(
            command_executor='http://%s:4444/wd/hub' % (self.hub),
            desired_capabilities=DesiredCapabilities.CHROME)
        self.browser.maximize_window()

    def login(self):
        print("Logging in")
        self.browser.get(self.url)
        time.sleep(0.2)
        self.browser.implicitly_wait(30)
        # The login form lives inside an iframe
        self.browser.switch_to.frame('login_frame')
        self.browser.find_element_by_id("switcher_plogin").click()
        time.sleep(0.2)
        self.browser.find_element_by_id("u").send_keys(self.username)
        self.browser.find_element_by_id("p").send_keys(self.password)
        time.sleep(0.2)
        self.browser.find_element_by_id("login_button").click()
        # Wait until the left sidebar is visible, which indicates a successful login
        WebDriverWait(self.browser, 15).until(
            lambda the_driver: the_driver.find_element_by_id('_leftSidebar').is_displayed())
        print("Login succeeded")

    def upload(self, apk):
        # New feature
        print("Uploading")
        apkName = apk.split('/')[-1]
        handle = self.browser.current_window_handle
        self.browser.find_element_by_class_name('cover').click()
        # '更新安装包' ("update package") is the link text on the page, so it stays in Chinese
        self.browser.find_element_by_partial_link_text('更新安装包').click()
        print(self.browser.current_url)
        # The link opens a new window; switch to it
        handles = self.browser.window_handles
        for newhandle in handles:
            if newhandle != handle:
                self.browser.switch_to.window(newhandle)
        print(self.browser.current_url)
        try:
            self.browser.find_element_by_class_name("webuploader-element-invisible").send_keys(apk)
            self.browser.implicitly_wait(20)
        except Exception:
            print("Package upload failed")
            sys.exit(2)
        # Next step: to be optimized
        time.sleep(30)
        # verifyTextPresent(pattern): check whether the text appears on the current page
        # WebDriverWait(self.browser, 30).until(
        #     lambda the_driver: the_driver.find_element(By.XPATH, '//p[text()="%s"]' % (apkName)).is_displayed())
        print("Package uploaded")

    def submit(self, isTest):
        print("Publishing")
        self.browser.find_element_by_id('j-submit-btn').click()
        time.sleep(3)
        if isTest:
            # Test run: decline the confirmation
            self.browser.find_element_by_class_name('j-confirm-no').click()
        else:
            # Real release: accept the confirmation
            self.browser.find_element_by_id('j-confirm-yes').click()
        time.sleep(1)
        print("Package published")

    def close(self):
        time.sleep(2)
        print("Closing browser")
        time.sleep(2)
        self.browser.quit()
        print("App publishing finished")


if __name__ == '__main__':
    apk = "/data/apks/_V_2.0.7_300_2020-09-27.apk"
    isTest = True
    qq = QQPlatform()
    qq.login()
    qq.upload(apk=apk)
    qq.submit(isTest=isTest)
    qq.close()
- Run the Python script
#] python3 qqplatform.py
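The script marks the fixed `time.sleep(30)` after the upload as something to optimize. One option is to poll for a condition with a timeout instead of always sleeping for the full duration. The sketch below is only an illustration: `wait_until` is a hypothetical helper, and the condition you pass in depends on how the target page signals a finished upload.

```python
import time


def wait_until(condition, timeout=30.0, interval=0.5):
    """Poll condition() until it returns a truthy value or timeout seconds pass.

    Returns the truthy value, or None if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

In `upload()`, a call such as `wait_until(lambda: apkName in self.browser.page_source, timeout=60)` could stand in for the sleep. Whether the file name shows up in the page source after a successful upload is an assumption about the Tencent page and would need to be checked.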
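IV. Hiding the Browser's Automation-Control Notice
When Selenium opens the browser, the browser shows a notice that it is being controlled by automated software. Adding the following option hides this notice.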
from selenium import webdriver
from selenium.webdriver import ChromeOptions
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
option = ChromeOptions()
option.add_experimental_option('excludeSwitches', ['enable-automation'])
browser = webdriver.Remote(
    command_executor='http://hub:4444/wd/hub',
    desired_capabilities=DesiredCapabilities.CHROME,
    options=option
)
This option does not change the value of window.navigator.webdriver. As long as the browser is driven by Selenium, that value remains true.
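To confirm this on your own setup, you can read the property from the page through `execute_script`. This is a minimal sketch; the helper name `automation_flag` is hypothetical, and `driver` stands for any WebDriver instance such as the `browser` object created above.

```python
def automation_flag(driver):
    """Return the value of window.navigator.webdriver as seen by the current page."""
    # execute_script runs JavaScript in the current page and returns its result.
    return driver.execute_script("return window.navigator.webdriver")
```

Calling `automation_flag(browser)` after `browser.get(...)` shows what a site's detection script would see.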
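V. Selenium Operations for Controlling the Browser and Tabs
Method | Description |
---|---|
set_window_size() | Set the size of the browser window |
maximize_window() | Maximize the browser window |
refresh() | Refresh the current page |
clear() | Clear the text of an input element |
send_keys(value) | Simulate keyboard input |
click() | Click the element |
is_displayed() | Return whether the element is visible to the user |
get_attribute(name) | Get the value of an element attribute |
back() | Navigate back in the browser history |
forward() | Navigate forward in the browser history |
size | Return the size of the element |
text | Get the text of the element |
switch_to.frame(id) | Switch to the specified iframe |
current_url | Get the URL of the current page |
current_window_handle | Get the handle of the current window |
window_handles | Get the handles of all open windows |
switch_to.window(newhandle) | Switch to the window with the given handle |
switch_to.default_content() | Switch back to the top-level document, leaving all iframes |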
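As an illustration of how `current_window_handle`, `window_handles`, and `switch_to.window()` work together, here is a small sketch of switching to a newly opened window. The helper name `switch_to_new_window` is hypothetical; it only uses the attributes listed in the table.

```python
def switch_to_new_window(driver, original_handle):
    """Switch to the first window whose handle differs from original_handle.

    Returns the new handle, or None if no other window is open.
    """
    for handle in driver.window_handles:
        if handle != original_handle:
            driver.switch_to.window(handle)
            return handle
    return None
```

A typical use is to record `driver.current_window_handle` before clicking a link that opens a new window, then call `switch_to_new_window(driver, saved_handle)` afterwards.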
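VI. Captcha Recognition with Selenium
1. Introduction
Many websites detect Selenium automation: a manual login goes straight through without a captcha, but a Selenium-driven login triggers various captcha challenges. This section covers only captcha images made of letters and digits with added noise.
2. Baidu image recognition API service
- Documentation: https://ai.baidu.com/ai-doc/REFERENCE/Ck3dwjhhu
- VIN code recognition: https://cloud.baidu.com/doc/OCR/s/zk3h7y51e
3. Using the Baidu image recognition API service
- Obtain the client_id and client_secret
- Log in to Baidu AI Cloud (百度智能云)
- Go to Console > Image Recognition > Create Application (for example, VIN code recognition) > fill in the required fields > Create Now > Application List, then look up the API Key and Secret Key by application name
- client_id is the API Key, and client_secret is the Secret Key
- Use the API Key and Secret Key to obtain an access_token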
import requests

client_id = "xxx"
client_secret = "jjj"
host = 'https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=%s&client_secret=%s' % (
    client_id, client_secret)
response = requests.get(host)
if response:
    res1 = response.json()
    access_token = res1['access_token']
- Use the access_token to recognize the captcha text
import base64

import requests

# Read the image in binary mode and base64-encode it
with open('/data/03.png', 'rb') as f:
    img = base64.b64encode(f.read())
params = {"image": img}
request_url = "https://aip.baidubce.com/rest/2.0/ocr/v1/vin_code"
access_token = "xxxaaafff"
request_url = request_url + "?access_token=" + access_token
headers = {'content-type': 'application/x-www-form-urlencoded'}
response = requests.post(request_url, data=params, headers=headers)
res = response.json()
if "words_result" in res.keys():
    words_result = res['words_result']
    words = words_result[0]['words']
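The response handling above indexes `words_result[0]` directly, which fails when the list is empty. A slightly more defensive way to pull out the recognized text might look like the sketch below. The helper name `first_recognized_text` is hypothetical, and the response shape (a `words_result` list of dicts with a `words` key) is assumed from the code above rather than taken from the Baidu documentation.

```python
def first_recognized_text(res):
    """Return the first recognized string from an OCR response dict, or None."""
    # A missing key and an empty list are both treated as "nothing recognized".
    words_result = res.get("words_result") or []
    if not words_result:
        return None
    return words_result[0].get("words")
```

Returning `None` keeps "nothing recognized" distinct from any real captcha text, so the caller can decide whether to refresh the captcha and retry.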
4. Integrating Baidu image recognition into Selenium for captcha recognition
# coding: utf-8
import base64
import sys
import time

import requests
from PIL import Image, ImageEnhance
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver import ChromeOptions
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities


class TreePlatform():
    def __init__(self):
        self.url = "http://dev.360.cn/"
        self.username = "xxx"
        self.password = "yyy"
        self.hub = '192.168.89.133'
        self.option = ChromeOptions()
        self.option.add_experimental_option('excludeSwitches', ['enable-automation'])
        self.browser = webdriver.Remote(
            command_executor='http://%s:4444/wd/hub' % (self.hub),
            desired_capabilities=DesiredCapabilities.CHROME,
            options=self.option
        )
        self.browser.maximize_window()
        self.num = 0  # number of captcha attempts so far

    def checkYZM(self):
        """Screenshot the captcha, send it to the OCR API, return the text or -1."""
        self.num += 1
        self.browser.save_screenshot("/data/01.png")
        imgElement = self.browser.find_element_by_class_name('quc-captcha-img')
        ran = Image.open("/data/01.png")
        left = imgElement.location['x']  # x of the captcha's top-left corner in the page
        top = imgElement.location['y']  # y of the captcha's top-left corner in the page
        right = left + imgElement.size['width']  # x of the bottom-right corner
        bottom = top + imgElement.size['height']  # y of the bottom-right corner
        box = (left, top, right, bottom)
        ran.crop(box).save("/data/02.png")
        imageCode = Image.open("/data/02.png")
        # Increase the contrast to make the characters easier to recognize
        sharp_img = ImageEnhance.Contrast(imageCode).enhance(2.0)
        sharp_img.save("/data/03.png")
        sharp_img.load()
        time.sleep(2)
        request_url = "https://aip.baidubce.com/rest/2.0/ocr/v1/vin_code"
        # request_url = "https://aip.baidubce.com/rest/2.0/ocr/v1/webimage_loc"
        # Open the image file in binary mode
        with open('/data/03.png', 'rb') as f:
            img = base64.b64encode(f.read())
        params = {"image": img}
        # Obtain the access_token for the Baidu image recognition API
        client_id = "xxxxxx"
        client_secret = "yyyyyy"
        host = 'https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=%s&client_secret=%s' % (
            client_id, client_secret)
        response = requests.get(host)
        access_token = "afaefefadf"  # placeholder, used only if the token request fails
        if response:
            res1 = response.json()
            access_token = res1['access_token']
        # access_token = '24.6446d03681246f106b5778014157565c.2592000.1588924992.282335-19323883'
        request_url = request_url + "?access_token=" + access_token
        headers = {'content-type': 'application/x-www-form-urlencoded'}
        response = requests.post(request_url, data=params, headers=headers)
        res = response.json()
        print(res)
        if "words_result" in res.keys():
            words_result = res['words_result']
            if len(words_result) >= 1:
                words = words_result[0]['words']
                return words
            else:
                return -1
        else:
            return -1

    def whileCheckYZM(self):
        """Recognize and submit the captcha, retrying until login succeeds."""
        if self.num >= 20:
            print("Too many attempts; exiting. Please publish manually.")
            sys.exit(2)
        result = self.checkYZM()
        print("result", result)
        if result == -1:
            # Recognition failed: request a new captcha and retry
            self.browser.find_element_by_class_name("quc-captcha-change-link").click()
            return self.whileCheckYZM()
        else:
            print(result)
            if len(result) <= 3:
                # Result is too short to be a valid captcha: request a new one and retry
                self.browser.find_element_by_class_name("quc-captcha-change-link").click()
                return self.whileCheckYZM()
            self.browser.find_element_by_name('phrase').clear()
            self.browser.find_element_by_name('phrase').send_keys(result)
            time.sleep(3)
            self.browser.find_element_by_class_name("quc-button-sign-in").click()
            time.sleep(3)
            # Look up the error tip outside the retry call, so that exceptions
            # raised during a retry are not mistaken for a successful login
            try:
                failed = self.browser.find_element_by_class_name('quc-tip-error').is_displayed()
            except NoSuchElementException:
                failed = False
            if failed:
                return self.whileCheckYZM()
            print("Login succeeded")

    def login(self):
        print("Logging in")
        self.browser.get(self.url)
        time.sleep(0.2)
        self.browser.implicitly_wait(30)
        self.browser.find_element_by_class_name("js-signIn").click()
        time.sleep(0.2)
        self.browser.find_element_by_name("account").send_keys(self.username)
        self.browser.find_element_by_name("password").send_keys(self.password)
        time.sleep(2)
        # Tricky part: if the "please enter the captcha" tip is visible, a captcha is required
        self.browser.find_element_by_class_name("quc-button-sign-in").click()
        try:
            need_captcha = self.browser.find_element_by_class_name('quc-tip-error').is_displayed()
        except NoSuchElementException:
            need_captcha = False
        if need_captcha:
            self.whileCheckYZM()
        else:
            print("Login succeeded")

    def close(self):
        time.sleep(2)
        print("Closing browser")
        self.browser.quit()
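`whileCheckYZM` retries by calling itself. The recognition part of that retry policy (at most 20 attempts, discarding results of 3 characters or fewer) can also be written as a plain loop. The following is a sketch under assumptions: `solve_captcha` and its two callables are hypothetical names. In the class above, `recognize` would be `self.checkYZM`, and `refresh` would click the `quc-captcha-change-link` element.

```python
def solve_captcha(recognize, refresh, max_attempts=20, min_length=4):
    """Call recognize() up to max_attempts times and return the first usable result.

    recognize() returns the recognized string, or -1 when recognition failed.
    refresh() requests a new captcha image before the next attempt.
    Returns None when every attempt fails.
    """
    for _ in range(max_attempts):
        result = recognize()
        if result != -1 and len(result) >= min_length:
            return result
        # Unusable result: ask for a fresh captcha and try again.
        refresh()
    return None
```

This covers only the recognition step. Submitting the text and checking for the `quc-tip-error` tip would still wrap around it, with the caller looping again if the site rejects the captcha.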
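VII. Skipping Login Verification with Selenium
1. Using cookies to get around Selenium login verification
- Procedure
- Open the page under test and capture the cookies before logging in (by packet capture or in code; the code is given below).
- Log in manually, then capture the cookies again.
- Compare the two sets and find the cookies that appear only after login. Only their name and value are needed (the name is usually a token).
- In the script, add these cookies by name and value, then load the page again.
To trim the set further, delete one cookie at a time and refresh the page: if the login state is affected, that cookie is required. Keep the list as short as possible.
- Getting the current site's cookies from the browser
- In the browser's developer Console, run document.cookie and copy the result into a string
- Script configuration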
from selenium import webdriver
from selenium.webdriver import ChromeOptions
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

option = ChromeOptions()
option.add_experimental_option('excludeSwitches', ['enable-automation'])
bro = webdriver.Remote(
    command_executor='http://192.168.89.144:4444/wd/hub',
    desired_capabilities=DesiredCapabilities.CHROME,
    options=option
)
# Cookies can only be added for the domain of the page that is currently loaded
bro.get('http://www.xxx.com')
bro.delete_all_cookies()
# Single-quoted so that the double quotes inside the cookie string are preserved
cookie_str = 'Hm_lvt_15a5a39cc30c333ba9fae9270351ef30=1603075219,1603075249,1603075436,1603076535; Hm_lpvt_15a5a39cc30c333ba9fae9270351ef30=1603076844; HW_refts_developer_huawei_com/consumer_developer_huawei_com=1602816770937; HW_id_developer_huawei_com/consumer_developer_huawei_com=83c25345ea61434d85d371a96e1f914d; HW_idts_developer_huawei_com/consumer_developer_huawei_com=1602816770938; HW_idn_developer_huawei_com/consumer_developer_huawei_com=0eaa0805460b40e2a03dfccf964e80c7; HW_viewts__developer_huawei_com=1603075212712; _ga=GA1.2.1762325546.1603075273; _gid=GA1.2.69869945.1603075273; urlBeforeLogin=https%3A%2F%2Fdeveloper.huawei.com%2Fconsumer%2Fcn%2F; state=2127401; hwsso_login=""; Hm_lvt_48e5a2ca327922f1ee2bb5ea69bdd0a6=1603075273,1603076561; Hm_lpvt_48e5a2ca327922f1ee2bb5ea69bdd0a6=1603076825; X-HD-SESSION=737a07c5-0b9d-b4c0-22f8-a688197a5093; x-uid=890086000102320858; x-siteId=1; x-hd-grey=0; developer_userinfo=%7B%22siteid%22%3A%221%22%2C%22expiretime%22%3A%2220201019T040721Z%22%2C%22csrftoken%22%3A%225FA3FBE57EE0CA455F2E782A7FD528A84C26D0C31FF2C0F0FE%22%7D; HW_viewts_developer_huawei_com/consumer_developer_huawei_com=1603076842837; x-userType=2; HW_idvc_developer_huawei_com/consumer_developer_huawei_com=43'
cookie_list = cookie_str.split('; ')
for item in cookie_list:
    # Split on the first "=" only, because a cookie value may itself contain "="
    name, value = item.split('=', 1)
    cookie = {'domain': '.developer.huawei.com', 'httpOnly': False, 'name': name,
              'path': '/', 'secure': False, 'value': value}
    bro.add_cookie(cookie)
# Reload the page so that the new cookies take effect
bro.refresh()
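- Drawbacks of this approach
- It works only for some sites; not every site can be logged in to with cookies alone.
- Cookies expire. After they time out, you must log in manually again to generate new cookies and update them in the code.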
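The procedure above compares the cookies captured before and after login. Those two steps (parsing a `document.cookie` string and finding the cookies that only appear after login) can be sketched in plain Python as follows. Both function names are hypothetical, and the dict keys follow the ones used in the script above.

```python
def parse_cookie_string(cookie_str, domain):
    """Turn a document.cookie string into a list of dicts suitable for add_cookie()."""
    cookies = []
    for item in cookie_str.split("; "):
        if not item:
            continue
        # partition splits on the first "=" only, so values containing "=" stay intact.
        name, _, value = item.partition("=")
        cookies.append({"name": name, "value": value, "domain": domain, "path": "/"})
    return cookies


def login_only_cookies(before, after):
    """Return the cookies in `after` whose names do not appear in `before`."""
    known = {c["name"] for c in before}
    return [c for c in after if c["name"] not in known]
```

For example, parse the string copied before login and the one copied after login, then pass both lists to `login_only_cookies` to get the candidates to add in the script. Note that `document.cookie` does not include HttpOnly cookies, so a session cookie marked HttpOnly would have to be captured another way, for example through packet capture.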