The WebDriver Protocol
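Selenium and WebDriver started out as two separate projects. Early Selenium automated the browser mainly by translating operations into JavaScript and injecting it into the page. WebDriver instead wrapped each browser's native API in a more object-oriented API; because browser engines differ, a separate implementation had to be written for each browser. The two projects merged in Selenium 2, and what we usually call Selenium today mainly means the WebDriver API. At the time of writing, the stable major version of Selenium is 3, and a beta of Selenium 4 is on the way.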
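Selenium (or WebDriver) interacts with browsers through the WebDriver protocol, a W3C Recommendation. The WebDriver protocol is a RESTful web service, independent of operating system and programming language, that runs over HTTP and uses JSON as its transport format.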
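To make "JSON over HTTP" concrete, here is a minimal sketch that only builds (and never sends) the HTTP request for the New Session command, using nothing but Python's standard library. The server address `http://127.0.0.1:4444` and the helper name `build_new_session_request` are assumptions made for illustration; the body follows the W3C `capabilities` / `alwaysMatch` shape.

```python
import json
import urllib.request


def build_new_session_request(server_url, browser_name):
    """Build (but do not send) the HTTP request for the New Session command."""
    # W3C WebDriver expects the requested capabilities under "capabilities".
    payload = {"capabilities": {"alwaysMatch": {"browserName": browser_name}}}
    return urllib.request.Request(
        url=server_url.rstrip("/") + "/session",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Hypothetical remote end address; any WebDriver server would do.
request = build_new_session_request("http://127.0.0.1:4444", "firefox")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Nothing in the request depends on Python: any language that can send an HTTP POST with a JSON body can act as a client.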
The WebDriver protocol currently has two versions:
- WebDriver 1: https://www.w3.org/TR/webdriver1/
- WebDriver 2: https://www.w3.org/TR/webdriver/
Before the WebDriver protocol there was the now-obsolete JsonWireProtocol; see https://github.com/SeleniumHQ/selenium/wiki/JsonWireProtocol for details.
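The WebDriver protocol model involves three parties:
- Client (local end): roughly, the program or machine that calls the WebDriver API. The Python WebDriver API we commonly use is a Python language binding for the client side.
- Server (remote end): the machine running `RemoteWebDriver`, or a browser that supports the WebDriver protocol, such as Firefox driven by `geckodriver` or Chrome driven by `chromedriver`. The WebDriver protocol itself only specifies the behavior of the remote end.
- WebDriver: the browser-specific implementations of the WebDriver protocol, such as `geckodriver` and `chromedriver`. The driver acts as an intermediary between client and server and exposes the protocol's RESTful service; the client controls the server (the browser) through it. For example, the legacy Firefox driver controlled the browser through the `webdriver.xpi` extension.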
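When the client runs, it starts one session. Within that session the client sends requests to the endpoints of the WebDriver protocol's RESTful service (an endpoint can be thought of simply as a URL); the server handles each request and returns a response to the client.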
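As a rough illustration of one session, the sketch below lists the requests a client might send from start to finish and shows how a response body is unwrapped. The session id `abc123`, the page URL, and the helper names are made up for the example; in a real exchange the session id comes back from the New Session response.

```python
import json


def session_lifecycle(session_id, page_url):
    """List the (method, endpoint, body) triples a client sends in one session."""
    base = "/session/" + session_id
    return [
        ("POST", "/session", {"capabilities": {}}),   # New Session
        ("POST", base + "/url", {"url": page_url}),   # Navigate To
        ("GET", base + "/title", None),               # Get Title
        ("DELETE", base, None),                       # Delete Session
    ]


def unwrap(response_text):
    """Extract the payload from a response body of the form {"value": ...}."""
    return json.loads(response_text)["value"]


for method, endpoint, body in session_lifecycle("abc123", "https://example.org/"):
    print(method, endpoint, json.dumps(body))

# A made-up Get Title response body, wrapped in the {"value": ...} envelope.
print(unwrap('{"value": "Example Domain"}'))
```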
The WebDriver protocol maps endpoints to commands. The endpoints and their commands are listed below (https://www.w3.org/TR/webdriver/#endpoints):
Method | URI Template | Command |
---|---|---|
POST | /session | New Session |
DELETE | /session/{session id} | Delete Session |
GET | /status | Status |
GET | /session/{session id}/timeouts | Get Timeouts |
POST | /session/{session id}/timeouts | Set Timeouts |
POST | /session/{session id}/url | Navigate To |
GET | /session/{session id}/url | Get Current URL |
POST | /session/{session id}/back | Back |
POST | /session/{session id}/forward | Forward |
POST | /session/{session id}/refresh | Refresh |
GET | /session/{session id}/title | Get Title |
GET | /session/{session id}/window | Get Window Handle |
DELETE | /session/{session id}/window | Close Window |
POST | /session/{session id}/window | Switch To Window |
GET | /session/{session id}/window/handles | Get Window Handles |
POST | /session/{session id}/window/new | New Window |
POST | /session/{session id}/frame | Switch To Frame |
POST | /session/{session id}/frame/parent | Switch To Parent Frame |
GET | /session/{session id}/window/rect | Get Window Rect |
POST | /session/{session id}/window/rect | Set Window Rect |
POST | /session/{session id}/window/maximize | Maximize Window |
POST | /session/{session id}/window/minimize | Minimize Window |
POST | /session/{session id}/window/fullscreen | Fullscreen Window |
GET | /session/{session id}/element/active | Get Active Element |
POST | /session/{session id}/element | Find Element |
POST | /session/{session id}/elements | Find Elements |
POST | /session/{session id}/element/{element id}/element | Find Element From Element |
POST | /session/{session id}/element/{element id}/elements | Find Elements From Element |
GET | /session/{session id}/element/{element id}/selected | Is Element Selected |
GET | /session/{session id}/element/{element id}/attribute/{name} | Get Element Attribute |
GET | /session/{session id}/element/{element id}/property/{name} | Get Element Property |
GET | /session/{session id}/element/{element id}/css/{property name} | Get Element CSS Value |
GET | /session/{session id}/element/{element id}/text | Get Element Text |
GET | /session/{session id}/element/{element id}/name | Get Element Tag Name |
GET | /session/{session id}/element/{element id}/rect | Get Element Rect |
GET | /session/{session id}/element/{element id}/enabled | Is Element Enabled |
GET | /session/{session id}/element/{element id}/computedrole | Get Computed Role |
GET | /session/{session id}/element/{element id}/computedlabel | Get Computed Label |
POST | /session/{session id}/element/{element id}/click | Element Click |
POST | /session/{session id}/element/{element id}/clear | Element Clear |
POST | /session/{session id}/element/{element id}/value | Element Send Keys |
GET | /session/{session id}/source | Get Page Source |
POST | /session/{session id}/execute/sync | Execute Script |
POST | /session/{session id}/execute/async | Execute Async Script |
GET | /session/{session id}/cookie | Get All Cookies |
GET | /session/{session id}/cookie/{name} | Get Named Cookie |
POST | /session/{session id}/cookie | Add Cookie |
DELETE | /session/{session id}/cookie/{name} | Delete Cookie |
DELETE | /session/{session id}/cookie | Delete All Cookies |
POST | /session/{session id}/actions | Perform Actions |
DELETE | /session/{session id}/actions | Release Actions |
POST | /session/{session id}/alert/dismiss | Dismiss Alert |
POST | /session/{session id}/alert/accept | Accept Alert |
GET | /session/{session id}/alert/text | Get Alert Text |
POST | /session/{session id}/alert/text | Send Alert Text |
GET | /session/{session id}/screenshot | Take Screenshot |
GET | /session/{session id}/element/{element id}/screenshot | Take Element Screenshot |
POST | /session/{session id}/print | Print Page |
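
The table can be read as a routing table: a method plus a URI template identifies a command. The following is a minimal sketch of that lookup over a handful of rows. The function names and the regex-based matching are illustrative assumptions, not how any particular driver implements it.

```python
import re

# A small subset of the endpoint table above.
ENDPOINTS = [
    ("POST", "/session", "New Session"),
    ("DELETE", "/session/{session id}", "Delete Session"),
    ("POST", "/session/{session id}/url", "Navigate To"),
    ("GET", "/session/{session id}/url", "Get Current URL"),
    ("POST", "/session/{session id}/element/{element id}/click", "Element Click"),
]


def template_to_regex(template):
    """Turn a URI template such as /session/{session id}/url into a regex."""
    parts = re.split(r"\{([^}]+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)   # literal path text
        else:
            pattern += "([^/]+)"         # one path segment per variable
    return re.compile("^" + pattern + "$")


def match_command(method, path):
    """Return (command name, variables) for a request, or None if nothing matches."""
    for endpoint_method, template, command in ENDPOINTS:
        if endpoint_method != method:
            continue
        found = template_to_regex(template).match(path)
        if found:
            names = re.findall(r"\{([^}]+)\}", template)
            return command, dict(zip(names, found.groups()))
    return None


print(match_command("GET", "/session/abc123/url"))
print(match_command("POST", "/session/abc123/element/e1/click"))
```

Note that the same path (`/session/{session id}/url`) maps to different commands depending on the HTTP method, which is why the method is part of the lookup key.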
WebDriver Python API execution flow (tracing backward)
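The WebDriver client API is essentially a chain of mappings: from a property or method to a command, and from that command to a WebDriver protocol endpoint.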
For example, take the `current_url` property of the `WebDriver` class:
- The `current_url` property is simply the result of executing the command `Command.GET_CURRENT_URL`.

        @property
        def current_url(self):
            return self.execute(Command.GET_CURRENT_URL)['value']

- The `execute` method of the `WebDriver` class runs the command through `command_executor.execute(driver_command, params)`. `command_executor` is either a string (the server URL) or a `remote_connection.RemoteConnection` object.
- The `remote_connection.RemoteConnection` class lives in `selenium\webdriver\remote\remote_connection.py`. Its `_commands` attribute defines the mapping between commands and endpoints:

        Command.GET_CURRENT_URL: ('GET', '/session/$sessionId/url')

  The `execute` method of `remote_connection.RemoteConnection` is what makes the server run the command.
- When the `WebDriver` class is instantiated it receives a `command_executor` argument, which is the server address.
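
The chain traced above (property, then command, then endpoint) can be imitated in a few lines. This is a toy sketch, not Selenium's code: `ToyWebDriver` and `ToyRemoteConnection` are hypothetical stand-ins, the command string is illustrative, no HTTP request is sent, and the response is canned. `string.Template` is used because it fits the `$sessionId` placeholder seen in the mapping.

```python
from string import Template


class Command:
    GET_CURRENT_URL = "getCurrentUrl"   # illustrative command name


class ToyRemoteConnection:
    """Stand-in for RemoteConnection: maps commands to endpoints, sends nothing."""

    _commands = {
        Command.GET_CURRENT_URL: ("GET", "/session/$sessionId/url"),
    }

    def __init__(self, url):
        self._url = url
        self.sent = []   # records (method, full URL) instead of doing HTTP

    def execute(self, command, params):
        method, path_template = self._commands[command]
        path = Template(path_template).substitute(params)
        self.sent.append((method, self._url + path))
        # Canned response; a real remote end would report the page's URL.
        return {"value": "https://example.org/"}


class ToyWebDriver:
    def __init__(self, command_executor, session_id):
        self.command_executor = command_executor
        self.session_id = session_id

    def execute(self, driver_command, params=None):
        # Every command carries the session id so the endpoint can be filled in.
        params = dict(params or {}, sessionId=self.session_id)
        return self.command_executor.execute(driver_command, params)

    @property
    def current_url(self):
        return self.execute(Command.GET_CURRENT_URL)["value"]


connection = ToyRemoteConnection("http://127.0.0.1:4444")
driver = ToyWebDriver(connection, session_id="abc123")
print(driver.current_url)
print(connection.sent)
```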
WebDriver Python API execution flow (tracing forward)
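    from selenium import webdriver

    # Start Firefox through its driver and open a page
    driver = webdriver.Firefox()
    driver.get("http://www.baidu.com")

    # Server (remote end) address; note that _url is a private attribute
    url = driver.command_executor._url
    print(url)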
The code above executes roughly as follows:
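- `driver = webdriver.Firefox()` instantiates the `WebDriver` class in `selenium\webdriver\firefox\webdriver.py`, which inherits from the `WebDriver` class in `selenium\webdriver\remote\webdriver.py`. The `command_executor` argument of the `WebDriver` constructor is the server address.

  At this point the driver starts the browser and listens for requests that follow the WebDriver protocol (the legacy Firefox driver did this by loading the `webdriver.xpi` extension).

  The log file `geckodriver.log` contains a record like `1116855012551 geckodriver INFO Listening on 127.0.0.1:54499`.
- `driver.get("http://www.baidu.com")` amounts to executing the `Command.GET` command. The `WebDriver` object executes commands through `command_executor`, a `remote_connection.RemoteConnection` object, whose class implements the mapping from commands to protocol endpoints: `Command.GET: ('POST', '/session/$sessionId/url')`. Executing a command therefore means sending the corresponding request, which the driver forwards to the browser.

        def get(self, url):
            self.execute(Command.GET, {'url': url})

- The browser receives the request, performs the operation (opening the Baidu page), and returns a response; for Navigate To this is a success response with null data. The driver passes the result back to the calling code.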
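
To see the whole round trip without installing a browser, the sketch below stands up a toy remote end in-process and talks to it over real HTTP. Everything here is an assumption for illustration: `StubRemoteEnd` only understands three commands, the session id `stub-session` is hard-coded, and no page is actually loaded. It is meant to show the shape of the exchange, not to replace geckodriver.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubRemoteEnd(BaseHTTPRequestHandler):
    """Toy remote end: understands New Session, Navigate To, Get Current URL."""

    current_url = "about:blank"   # shared state for the single fake session

    def _reply(self, value):
        body = json.dumps({"value": value}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        params = json.loads(self.rfile.read(length) or b"{}")
        if self.path == "/session":
            self._reply({"sessionId": "stub-session", "capabilities": {}})
        elif self.path == "/session/stub-session/url":
            StubRemoteEnd.current_url = params["url"]
            self._reply(None)   # Navigate To succeeds with null data
        else:
            self.send_error(404)

    def do_GET(self):
        if self.path == "/session/stub-session/url":
            self._reply(StubRemoteEnd.current_url)
        else:
            self.send_error(404)

    def log_message(self, *args):
        pass   # keep the example's output quiet


# Bypass any system proxy so requests to 127.0.0.1 go straight to the stub.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def call(base, method, path, params=None):
    """Send one command to the remote end and return the response's value."""
    data = None if params is None else json.dumps(params).encode("utf-8")
    request = urllib.request.Request(base + path, data=data, method=method,
                                     headers={"Content-Type": "application/json"})
    with opener.open(request) as response:
        return json.loads(response.read())["value"]


server = HTTPServer(("127.0.0.1", 0), StubRemoteEnd)   # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_address[1]

session_id = call(base, "POST", "/session", {"capabilities": {}})["sessionId"]
navigate_result = call(base, "POST", "/session/%s/url" % session_id,
                       {"url": "http://www.baidu.com"})
current_url = call(base, "GET", "/session/%s/url" % session_id)
print(session_id, navigate_result, current_url)

server.shutdown()
server.server_close()
```

The three `call` lines correspond to `webdriver.Firefox()`, `driver.get(...)`, and `driver.current_url` in the script above, with the stub playing the role that the driver and browser play in a real run.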