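Background
Knowing the methods that selenium webdriver provides is enough to do web UI automation. Still, I prefer to understand why things work and not only that they work, and I also want to improve my skills, so it is worth learning how selenium webdriver works under the hood. Once it is clear how webdriver connects to and drives the browser, the individual webdriver methods make more sense too. Besides, the source code is already on the local disk, so reading it takes little effort.
selenium overview
Selenium has evolved steadily, and today's webdriver is one of the best open-source tools for web automation. Its predecessor, selenium RC, reportedly suffered from low efficiency, poor stability, and no support for mouse events. Since it is a thing of the past, I will not go into it. I am mainly familiar with selenium webdriver.
As I understand it, what distinguishes webdriver is that it drives each browser through the driver program supplied by that browser's vendor. This is what makes webdriver so stable: the browser behaves much as it does under manual operation. The side effect is that every browser needs its own matching driver program. That mostly burdens the selenium developers, who have to keep webdriver compatible across browsers, and thanks to their work the side effect is barely noticeable to us users. For example, when automating Chrome and Firefox, only the instantiation of the webdriver object differs; everything else is operated the same way. The difference looks like this: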
# Firefox
driver = webdriver.Firefox()
# Chrome
driver = webdriver.Chrome()
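Web automation with selenium webdriver consists of roughly three parts:
1. The web automation code, written with webdriver's methods.
2. The browser itself.
3. The driver program for that browser, which is the bridge between the test code and the browser window. It can be compared to the server side of a B/S (browser/server) architecture.
So the goal of web automation is to script the browser's operations in code. Ideally, parts 1 and 2 would be enough. But our code cannot operate the browser directly: put another way, the interface the browser exposes for being controlled does not understand the code we write. Part 3 is therefore needed as a bridge. You can think of the browser driver as an interpreter that translates the code we write with webdriver methods into a language the browser understands, so that the browser knows what we want it to do.
Working out roughly how this translation happens is the goal of this article.
selenium webdriver internals: an overview
My knowledge is limited. Even after reading the source code, I can only claim a superficial grasp of how it works. This write-up is a summary meant to deepen my own understanding.
Start with the following code: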
from selenium import webdriver
chrome_driver = r'D:\Python3.7\Lib\site-packages\selenium\webdriver\chrome\chromedriver.exe'
driver = webdriver.Chrome(executable_path = chrome_driver)
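Running this simple code automatically launches a browser window that is controlled by the automation program.
In my view, the principles contained in this one simple step are already about as much as a user of the framework needs to know about how webdriver does web automation.
In short, this code does one thing: it instantiates a webdriver.Chrome() object. During that instantiation, the three parts described above are built and connected.
Next, let's follow the process step by step through the source code.
Start with from selenium import webdriver. Following it to the source path shows that webdriver is a package, not a module, so under Python's import mechanism what gets imported here is the package's __init__.py. Let's see what it imports: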
from .firefox.webdriver import WebDriver as Firefox # noqa
from .firefox.firefox_profile import FirefoxProfile # noqa
from .firefox.options import Options as FirefoxOptions # noqa
# Note this line: the example code uses the Chrome browser
from .chrome.webdriver import WebDriver as Chrome # noqa
from .chrome.options import Options as ChromeOptions # noqa
from .ie.webdriver import WebDriver as Ie # noqa
from .ie.options import Options as IeOptions # noqa
from .edge.webdriver import WebDriver as Edge # noqa
from .opera.webdriver import WebDriver as Opera # noqa
from .safari.webdriver import WebDriver as Safari # noqa
from .blackberry.webdriver import WebDriver as BlackBerry # noqa
from .phantomjs.webdriver import WebDriver as PhantomJS # noqa
from .android.webdriver import WebDriver as Android # noqa
from .webkitgtk.webdriver import WebDriver as WebKitGTK # noqa
from .webkitgtk.options import Options as WebKitGTKOptions # noqa
from .remote.webdriver import WebDriver as Remote # noqa
from .common.desired_capabilities import DesiredCapabilities # noqa
from .common.action_chains import ActionChains # noqa
from .common.touch_actions import TouchActions # noqa
from .common.proxy import Proxy # noqa
__version__ = '3.14.1'
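What we instantiate is webdriver.Chrome(). Some readers may wonder about this, since the webdriver package has no object named Chrome. The answer is the line from .chrome.webdriver import WebDriver as Chrome. It imports the WebDriver class from the webdriver module in the chrome subpackage of the webdriver package and gives it the alias Chrome. Instantiating webdriver.Chrome() therefore instantiates the WebDriver class that matches the Chrome browser.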
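The aliasing can be reproduced without selenium. The sketch below is only an illustration: it builds two stand-in modules with types.ModuleType (the names fake_chrome_webdriver and fake_webdriver are made up here) and shows that the alias and the original name refer to the same class object.

```python
import types

# Stand-in for selenium/webdriver/chrome/webdriver.py (hypothetical module name).
fake_chrome_webdriver = types.ModuleType("fake_chrome_webdriver")


class WebDriver:
    """Placeholder for the Chrome-specific WebDriver class."""


fake_chrome_webdriver.WebDriver = WebDriver

# Stand-in for selenium/webdriver/__init__.py. This assignment has the same
# effect as `from .chrome.webdriver import WebDriver as Chrome`.
fake_webdriver = types.ModuleType("fake_webdriver")
fake_webdriver.Chrome = fake_chrome_webdriver.WebDriver

driver_cls = fake_webdriver.Chrome
print(driver_cls is WebDriver)  # the alias is the very same class object
print(driver_cls.__name__)      # the class still calls itself "WebDriver"
```

An alias only adds a second name; it does not copy or wrap the class.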
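Next, let's find this class and see what its constructor does. If you don't like reading code you can skip it; I will try to make the text clear enough on its own.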
import warnings
from selenium.webdriver.remote.webdriver import WebDriver as RemoteWebDriver
from .remote_connection import ChromeRemoteConnection
from .service import Service
from .options import Options
# site-packages\selenium\webdriver\chrome\webdriver.py
class WebDriver(RemoteWebDriver):
    """
    Controls the ChromeDriver and allows you to drive the browser.
    You will need to download the ChromeDriver executable from
    http://chromedriver.storage.googleapis.com/index.html
    """
    def __init__(self, executable_path="chromedriver", port=0,
                 options=None, service_args=None,
                 desired_capabilities=None, service_log_path=None,
                 chrome_options=None, keep_alive=True):
        """
        Creates a new instance of the chrome driver.
        Starts the service and then creates new instance of chrome driver.
        :Args:
         - executable_path - path to the executable. If the default is used it assumes the executable is in the $PATH
         - port - port you would like the service to run, if left as 0, a free port will be found.
         - options - this takes an instance of ChromeOptions
         - service_args - List of args to pass to the driver service
         - desired_capabilities - Dictionary object with non-browser specific
           capabilities only, such as "proxy" or "loggingPref".
         - service_log_path - Where to log information from the driver.
         - chrome_options - Deprecated argument for options
         - keep_alive - Whether to configure ChromeRemoteConnection to use HTTP keep-alive.
        """
        if chrome_options:
            warnings.warn('use options instead of chrome_options',
                          DeprecationWarning, stacklevel=2)
            options = chrome_options
        if options is None:
            # desired_capabilities stays as passed in
            if desired_capabilities is None:
                desired_capabilities = self.create_options().to_capabilities()
        else:
            if desired_capabilities is None:
                desired_capabilities = options.to_capabilities()
            else:
                desired_capabilities.update(options.to_capabilities())
        self.service = Service(
            executable_path,
            port=port,
            service_args=service_args,
            log_path=service_log_path)
        self.service.start()
        try:
            RemoteWebDriver.__init__(
                self,
                command_executor=ChromeRemoteConnection(
                    remote_server_addr=self.service.service_url,
                    keep_alive=keep_alive),
                desired_capabilities=desired_capabilities)
        except Exception:
            self.quit()
            raise
        self._is_remote = False
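The line class WebDriver(RemoteWebDriver): tells us that this WebDriver class inherits from RemoteWebDriver.
The line from selenium.webdriver.remote.webdriver import WebDriver as RemoteWebDriver shows that RemoteWebDriver is another WebDriver class, the one in the selenium.webdriver.remote.webdriver module. Let's set that class aside for now.
Back to the constructor above. Our test code passed only one argument, executable_path, when instantiating webdriver.Chrome, so every other parameter takes its default value.
Next, look at this part of the constructor:
self.service = Service(
    executable_path,
    port=port,
    service_args=service_args,
    log_path=service_log_path)
self.service.start()
This code instantiates a Service object and calls its start() method. Following this lead, we need to know what instantiating the object and calling start() actually do.
The line from .service import Service leads to the source of the Service class: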
# site-packages\selenium\webdriver\chrome\service.py
from selenium.webdriver.common import service
class Service(service.Service):
    """
    Object that manages the starting and stopping of the ChromeDriver
    """
    def __init__(self, executable_path, port=0, service_args=None,
                 log_path=None, env=None):
        """
        Creates a new instance of the Service
        :Args:
         - executable_path : Path to the ChromeDriver
         - port : Port the service is running on
         - service_args : List of args to pass to the chromedriver service
         - log_path : Path for the chromedriver service to log to"""
        self.service_args = service_args or []
        if log_path:
            self.service_args.append('--log-path=%s' % log_path)
        service.Service.__init__(self, executable_path, port=port, env=env,
                                 start_error_message="Please see https://sites.google.com/a/chromium.org/chromedriver/home")
    def command_line_args(self):
        return ["--port=%d" % self.port] + self.service_args
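Its docstring tells us that this class manages starting and stopping ChromeDriver. Its constructor creates a new instance of Service.
We also notice that this Service class inherits from another Service class, and start() comes from that parent class.
There are more details along the way, but I skip them to keep the main thread in focus. If that hurts understanding, read the source line by line in the order this article follows. Now let's look at the start() method of the Service class: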
def start(self):
    """
    Starts the Service.
    :Exceptions:
     - WebDriverException : Raised either when it can't start the service
       or when it can't connect to the service
    """
    try:
        cmd = [self.path]
        cmd.extend(self.command_line_args())
        self.process = subprocess.Popen(cmd, env=self.env,
                                        close_fds=platform.system() != 'Windows',
                                        stdout=self.log_file,
                                        stderr=self.log_file,
                                        stdin=PIPE)
    except TypeError:
        raise
    except OSError as err:
        if err.errno == errno.ENOENT:
            raise WebDriverException(
                "'%s' executable needs to be in PATH. %s" % (
                    os.path.basename(self.path), self.start_error_message)
            )
        elif err.errno == errno.EACCES:
            raise WebDriverException(
                "'%s' executable may have wrong permissions. %s" % (
                    os.path.basename(self.path), self.start_error_message)
            )
        else:
            raise
    except Exception as e:
        raise WebDriverException(
            "The executable %s needs to be available in the path. %s\n%s" %
            (os.path.basename(self.path), self.start_error_message, str(e)))
    count = 0
    while True:
        self.assert_process_still_running()
        if self.is_connectable():
            break
        count += 1
        time.sleep(1)
        if count == 30:
            raise WebDriverException("Can not connect to the Service %s" % self.path)
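This method starts the service. Which service? It mainly uses subprocess to launch a child process, as in this code:
self.process = subprocess.Popen(cmd, env=self.env,
                                close_fds=platform.system() != 'Windows',
                                stdout=self.log_file,
                                stderr=self.log_file,
                                stdin=PIPE)
Which child process, then? That depends on cmd in this code.
Tracing back, we find where cmd comes from: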
cmd = [self.path]
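Next, trace self.path. An instance attribute like this is set in the constructor, so we go straight to the constructor of the class: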
self.path = executable
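executable is a constructor parameter. Once we know what it holds, we know which process is launched and what kind of Service is started.
Jump back to where Service is instantiated: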
# site-packages\selenium\webdriver\chrome\webdriver.py
self.service = Service(
    executable_path,
    port=port,
    service_args=service_args,
    log_path=service_log_path)
# site-packages\selenium\webdriver\chrome\service.py
from selenium.webdriver.common import service
class Service(service.Service):
    """
    Object that manages the starting and stopping of the ChromeDriver
    """
    def __init__(self, executable_path, port=0, service_args=None,
                 log_path=None, env=None):
        """
        Creates a new instance of the Service
        :Args:
         - executable_path : Path to the ChromeDriver
         - port : Port the service is running on
         - service_args : List of args to pass to the chromedriver service
         - log_path : Path for the chromedriver service to log to"""
        self.service_args = service_args or []
        if log_path:
            self.service_args.append('--log-path=%s' % log_path)
        service.Service.__init__(self, executable_path, port=port, env=env,
                                 start_error_message="Please see https://sites.google.com/a/chromium.org/chromedriver/home")
We have now reached the origin: self.path is executable_path, shown here:
chrome_driver = r'D:\Python3.7\Lib\site-packages\selenium\webdriver\chrome\chromedriver.exe'
driver = webdriver.Chrome(executable_path = chrome_driver)
In other words, it is the browser driver for Chrome, an executable file.
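Combining self.path with command_line_args(), the argument list handed to subprocess.Popen can be sketched as below. The helper build_driver_command is a made-up name for illustration, and port 9515 is an assumed value: when port is left at 0, selenium picks a free port by itself before building the command.

```python
def build_driver_command(executable_path, port, service_args=None, log_path=None):
    """Return the argv list that would be passed to subprocess.Popen."""
    args = list(service_args or [])
    if log_path:
        # Same flag the chrome Service constructor appends.
        args.append("--log-path=%s" % log_path)
    # The executable comes first, then --port, then any extra service args.
    return [executable_path, "--port=%d" % port] + args


cmd = build_driver_command("chromedriver", 9515, log_path="driver.log")
print(cmd)  # ['chromedriver', '--port=9515', '--log-path=driver.log']
```

So the child process is simply the driver executable told which port to listen on.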
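Starting this browser driver therefore starts the third part described earlier, the bridge or interpreter. You can also think of it as starting a web server.
In the communication model described above, automation code written with webdriver methods first sends instructions to this server, and the server then tells the browser what to do.
Sending instructions to the server is exactly what happens when we call webdriver methods.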
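Before any instruction can be sent, the server has to be accepting connections, which is why start() ends with a polling loop. The following is a rough sketch of that idea, not selenium's code: is_connectable and wait_for_service are written here for illustration, and a plain listening socket stands in for the driver's HTTP server.

```python
import socket
import time


def is_connectable(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=1):
            return True
    except OSError:
        return False


def wait_for_service(port, process_alive, attempts=30, delay=1.0):
    """Poll until the port accepts connections or the attempts run out."""
    for _ in range(attempts):
        if not process_alive():
            raise RuntimeError("driver process exited before it became reachable")
        if is_connectable(port):
            return
        time.sleep(delay)
    raise RuntimeError("cannot connect to the service on port %d" % port)


# Demo: a listening socket plays the role of the driver's HTTP server.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]

wait_for_service(demo_port, process_alive=lambda: True, delay=0.1)
print("service reachable on port", demo_port)
listener.close()
```

The process_alive callback corresponds to assert_process_still_running() in the real method: there is no point in waiting for a process that has already died.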
At this point the browser has still not been opened. So let's go back to the webdriver.Chrome() constructor and see what comes after self.service.start(). Only this is left:
try:
    RemoteWebDriver.__init__(
        self,
        command_executor=ChromeRemoteConnection(
            remote_server_addr=self.service.service_url,
            keep_alive=keep_alive),
        desired_capabilities=desired_capabilities)
except Exception:
    self.quit()
    raise
self._is_remote = False
What remains to understand in the code above is the initialization of RemoteWebDriver, so let's see what it does.
The line from selenium.webdriver.remote.webdriver import WebDriver as RemoteWebDriver tells us where to find RemoteWebDriver. Its constructor is as follows:
class WebDriver(object):
    """
    Controls a browser by sending commands to a remote server.
    This server is expected to be running the WebDriver wire protocol
    as defined at
    https://github.com/SeleniumHQ/selenium/wiki/JsonWireProtocol
    :Attributes:
     - session_id - String ID of the browser session started and controlled by this WebDriver.
     - capabilities - Dictionaty of effective capabilities of this browser session as returned
       by the remote server. See https://github.com/SeleniumHQ/selenium/wiki/DesiredCapabilities
     - command_executor - remote_connection.RemoteConnection object used to execute commands.
     - error_handler - errorhandler.ErrorHandler object used to handle errors.
    """
    _web_element_cls = WebElement
    def __init__(self, command_executor='http://127.0.0.1:4444/wd/hub',
                 desired_capabilities=None, browser_profile=None, proxy=None,
                 keep_alive=False, file_detector=None, options=None):
        """
        Create a new driver that will issue commands using the wire protocol.
        :Args:
         - command_executor - Either a string representing URL of the remote server or a custom
           remote_connection.RemoteConnection object. Defaults to 'http://127.0.0.1:4444/wd/hub'.
         - desired_capabilities - A dictionary of capabilities to request when
           starting the browser session. Required parameter.
         - browser_profile - A selenium.webdriver.firefox.firefox_profile.FirefoxProfile object.
           Only used if Firefox is requested. Optional.
         - proxy - A selenium.webdriver.common.proxy.Proxy object. The browser session will
           be started with given proxy settings, if possible. Optional.
         - keep_alive - Whether to configure remote_connection.RemoteConnection to use
           HTTP keep-alive. Defaults to False.
         - file_detector - Pass custom file detector object during instantiation. If None,
           then default LocalFileDetector() will be used.
         - options - instance of a driver options.Options class
        """
        capabilities = {}
        if options is not None:
            capabilities = options.to_capabilities()
        if desired_capabilities is not None:
            if not isinstance(desired_capabilities, dict):
                raise WebDriverException("Desired Capabilities must be a dictionary")
            else:
                capabilities.update(desired_capabilities)
        if proxy is not None:
            warnings.warn("Please use FirefoxOptions to set proxy",
                          DeprecationWarning, stacklevel=2)
            proxy.add_to_capabilities(capabilities)
        self.command_executor = command_executor
        if type(self.command_executor) is bytes or isinstance(self.command_executor, str):
            self.command_executor = RemoteConnection(command_executor, keep_alive=keep_alive)
        self._is_remote = True
        self.session_id = None
        self.capabilities = {}
        self.error_handler = ErrorHandler()
        self.start_client()
        if browser_profile is not None:
            warnings.warn("Please use FirefoxOptions to set browser profile",
                          DeprecationWarning, stacklevel=2)
        self.start_session(capabilities, browser_profile)
        self._switch_to = SwitchTo(self)
        self._mobile = Mobile(self)
        self.file_detector = file_detector or LocalFileDetector()
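From the docstring we learn what this class does: it controls a browser by sending commands to a remote server (the service started by the browser driver program mentioned above, that is, the bridge). This server is expected to speak the WebDriver wire protocol defined by the selenium project.
Its constructor creates a new driver that issues commands using the wire protocol.
In this constructor, the line to focus on is:
self.start_session(capabilities, browser_profile)
This line creates a new session with the desired capabilities. Put plainly, it starts a browser with the specified capabilities, and that browser is assigned a unique session ID. When several browsers are running, the unique session ID keeps the instructions for one browser from conflicting or interfering with those for another.
That covers, in outline, the data flow behind the very first step: instantiating the webdriver object and launching the browser. I read more source code than is recorded here, so what I have written may not exactly match what I understood. If anything is still unclear, follow this data flow through the source code.
Automating a browser mainly means opening URLs and operating on elements, windows, and so on. The principle is the same in every case:
Calling a webdriver method sends an instruction to the server mentioned above, and the server translates it, following the WebDriver wire protocol, into operations the browser can carry out. The client side relies on urllib3 (a third-party Python HTTP client library, not part of the standard library) to send HTTP requests to the server (the browser driver program), and that is how the browser is operated through the bridge.
To look further into this process, we pick up again at self.start_session():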
# site-packages\selenium\webdriver\remote\webdriver.py
def start_session(self, capabilities, browser_profile=None):
    """
    Creates a new session with the desired capabilities.
    :Args:
     - browser_name - The name of the browser to request.
     - version - Which browser version to request.
     - platform - Which platform to request the browser on.
     - javascript_enabled - Whether the new session should support JavaScript.
     - browser_profile - A selenium.webdriver.firefox.firefox_profile.FirefoxProfile object. Only used if Firefox is requested.
    """
    if not isinstance(capabilities, dict):
        raise InvalidArgumentException("Capabilities must be a dictionary")
    if browser_profile:
        if "moz:firefoxOptions" in capabilities:
            capabilities["moz:firefoxOptions"]["profile"] = browser_profile.encoded
        else:
            capabilities.update({'firefox_profile': browser_profile.encoded})
    w3c_caps = _make_w3c_caps(capabilities)
    parameters = {"capabilities": w3c_caps,
                  "desiredCapabilities": capabilities}
    response = self.execute(Command.NEW_SESSION, parameters)
    if 'sessionId' not in response:
        response = response['value']
    self.session_id = response['sessionId']
    self.capabilities = response.get('value')
    # if capabilities is none we are probably speaking to
    # a W3C endpoint
    if self.capabilities is None:
        self.capabilities = response.get('capabilities')
    # Double check to see if we have a W3C Compliant browser
    self.w3c = response.get('status') is None
    self.command_executor.w3c = self.w3c
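The tail of start_session handles two possible reply shapes. The sketch below lifts that logic into a standalone function so it can be tried without a browser. The function name parse_new_session_response and both sample replies are made up for illustration; real replies carry many more capability fields.

```python
def parse_new_session_response(response):
    """Extract (session_id, capabilities, is_w3c) from a newSession reply."""
    if "sessionId" not in response:
        # W3C-style replies nest everything under "value".
        response = response["value"]
    session_id = response["sessionId"]
    capabilities = response.get("value")
    if capabilities is None:
        capabilities = response.get("capabilities")
    # Legacy replies carry a numeric "status"; W3C replies do not.
    is_w3c = response.get("status") is None
    return session_id, capabilities, is_w3c


# Hypothetical replies; the IDs and capability values are invented.
legacy_reply = {"sessionId": "abc123", "status": 0,
                "value": {"browserName": "chrome"}}
w3c_reply = {"value": {"sessionId": "def456",
                       "capabilities": {"browserName": "chrome"}}}

print(parse_new_session_response(legacy_reply))
print(parse_new_session_response(w3c_reply))
```

Either way, the session ID that comes back is what every later command puts into its URL.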
In the code above, jump straight to this line:
response = self.execute(Command.NEW_SESSION, parameters)
First look at Command.NEW_SESSION.
From from .command import Command we can find that Command.NEW_SESSION == "newSession":
# site-packages\selenium\webdriver\remote\command.py
class Command(object):
    """
    Defines constants for the standard WebDriver commands.
    While these constants have no meaning in and of themselves, they are
    used to marshal commands through a service that implements WebDriver's
    remote wire protocol:
    https://github.com/SeleniumHQ/selenium/wiki/JsonWireProtocol
    """
    # Keep in sync with org.openqa.selenium.remote.DriverCommand
    STATUS = "status"
    NEW_SESSION = "newSession"
That is just an argument. Let's move on to the self.execute() method:
def execute(self, driver_command, params=None):
    """
    Sends a command to be executed by a command.CommandExecutor.
    :Args:
     - driver_command: The name of the command to execute as a string.
     - params: A dictionary of named parameters to send with the command.
    :Returns:
      The command's JSON response loaded into a dictionary object.
    """
    if self.session_id is not None:
        if not params:
            params = {'sessionId': self.session_id}
        elif 'sessionId' not in params:
            params['sessionId'] = self.session_id
    params = self._wrap_value(params)
    response = self.command_executor.execute(driver_command, params)
    if response:
        self.error_handler.check_response(response)
        response['value'] = self._unwrap_value(
            response.get('value', None))
        return response
    # If the server doesn't send a response, assume the command was
    # a success
    return {'success': 0, 'value': None, 'sessionId': self.session_id}
This method sends a command to be executed by a command.CommandExecutor.
Now focus on the line response = self.command_executor.execute(driver_command, params).
It calls self.command_executor.execute(), passing the command name and its parameters.
The two constructors involved show what command_executor is:
# site-packages\selenium\webdriver\chrome\webdriver.py
from .remote_connection import ChromeRemoteConnection
from selenium.webdriver.remote.webdriver import WebDriver as RemoteWebDriver
class WebDriver(RemoteWebDriver):
    def __init__(self, executable_path="chromedriver", port=0,
                 options=None, service_args=None,
                 desired_capabilities=None, service_log_path=None,
                 chrome_options=None, keep_alive=True):
        RemoteWebDriver.__init__(
            self,
            command_executor=ChromeRemoteConnection(
                remote_server_addr=self.service.service_url,
                keep_alive=keep_alive),
            desired_capabilities=desired_capabilities)
# site-packages\selenium\webdriver\remote\webdriver.py
class WebDriver(object):
    def __init__(self, command_executor='http://127.0.0.1:4444/wd/hub',
                 desired_capabilities=None, browser_profile=None, proxy=None,
                 keep_alive=False, file_detector=None, options=None):
        self.command_executor = command_executor
From the code above, command_executor is an instance of ChromeRemoteConnection.
Let's look at the ChromeRemoteConnection class and its constructor:
# site-packages\selenium\webdriver\chrome\remote_connection.py
from selenium.webdriver.remote.remote_connection import RemoteConnection
class ChromeRemoteConnection(RemoteConnection):
    def __init__(self, remote_server_addr, keep_alive=True):
        RemoteConnection.__init__(self, remote_server_addr, keep_alive)
        self._commands["launchApp"] = ('POST', '/session/$sessionId/chromium/launch_app')
        self._commands["setNetworkConditions"] = ('POST', '/session/$sessionId/chromium/network_conditions')
        self._commands["getNetworkConditions"] = ('GET', '/session/$sessionId/chromium/network_conditions')
        self._commands['executeCdpCommand'] = ('POST', '/session/$sessionId/goog/cdp/execute')
ChromeRemoteConnection inherits from RemoteConnection and extends the self._commands dictionary, which maps each command name to the command's details: an HTTP method and a URL path template.
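The pattern of a base class owning the command table and a subclass adding vendor commands can be sketched in a few lines. BaseConnection and VendorConnection are stand-in names invented here; only the three table entries are taken from the excerpts in this article.

```python
class BaseConnection:
    """Stand-in for RemoteConnection: owns the shared command table."""

    def __init__(self):
        self._commands = {
            "status": ("GET", "/status"),
            "newSession": ("POST", "/session"),
        }


class VendorConnection(BaseConnection):
    """Stand-in for ChromeRemoteConnection: adds vendor-specific commands."""

    def __init__(self):
        super().__init__()  # build the shared table first
        self._commands["launchApp"] = (
            "POST", "/session/$sessionId/chromium/launch_app")


conn = VendorConnection()
print(sorted(conn._commands))  # ['launchApp', 'newSession', 'status']
```

Because the table is a per-instance dictionary, the extra entries do not leak into connections for other browsers.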
Now look at RemoteConnection. Only part of the code is excerpted below; see the source for the rest.
class RemoteConnection(object):
    """A connection with the Remote WebDriver server."""
    def __init__(self, remote_server_addr, keep_alive=False, resolve_ip=True):
        # Attempt to resolve the hostname and get an IP address.
        self._url = remote_server_addr
        if keep_alive:
            self._conn = urllib3.PoolManager(timeout=self._timeout)
        self._commands = {
            Command.STATUS: ('GET', '/status'),
            Command.NEW_SESSION: ('POST', '/session'),
            Command.GET_ALL_SESSIONS: ('GET', '/sessions'),
            Command.QUIT: ('DELETE', '/session/$sessionId'),
            Command.GET_CURRENT_WINDOW_HANDLE:
                ('GET', '/session/$sessionId/window_handle'),
            Command.W3C_GET_CURRENT_WINDOW_HANDLE:
                ('GET', '/session/$sessionId/window'),
            Command.GET_WINDOW_HANDLES:
                ('GET', '/session/$sessionId/window_handles'),
            ......
            ......
    def execute(self, command, params):
        """
        Send a command to the remote server.
        Any path subtitutions required for the URL mapped to the command should be
        included in the command parameters.
        :Args:
         - command - A string specifying the command to execute.
         - params - A dictionary of named parameters to send with the command as
           its JSON payload.
        """
    def _request(self, method, url, body=None):
        """
        Send an HTTP request to the remote server.
        :Args:
         - method - A string for the HTTP method to send the request with.
         - url - A string for the URL to send the request to.
         - body - A string for request body. Ignored unless method is POST or PUT.
        :Returns:
          A dictionary with the server's parsed JSON response.
        """
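The self.command_executor.execute() method we were looking for is the def execute(self, command, params): shown here. It sends a command to the remote server, and any path substitutions required for the URL mapped to the command should be included in the command parameters.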
# site-packages\selenium\webdriver\remote\remote_connection.py
def execute(self, command, params):
    """
    Send a command to the remote server.
    Any path subtitutions required for the URL mapped to the command should be
    included in the command parameters.
    :Args:
     - command - A string specifying the command to execute.
     - params - A dictionary of named parameters to send with the command as
       its JSON payload.
    """
    command_info = self._commands[command]
    assert command_info is not None, 'Unrecognised command %s' % command
    path = string.Template(command_info[1]).substitute(params)
    if hasattr(self, 'w3c') and self.w3c and isinstance(params, dict) and 'sessionId' in params:
        del params['sessionId']
    data = utils.dump_json(params)
    url = '%s%s' % (self._url, path)
    return self._request(command_info[0], url, body=data)
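The heart of execute() is a table lookup followed by a string.Template substitution. The sketch below reproduces that step on its own. COMMANDS is a three-entry subset of the real table, build_request is a made-up helper name, json.dumps stands in for utils.dump_json, and the base URL and session ID are invented values.

```python
import json
import string

# Subset of the command table: name -> (HTTP method, URL path template).
COMMANDS = {
    "newSession": ("POST", "/session"),
    "get": ("POST", "/session/$sessionId/url"),
    "quit": ("DELETE", "/session/$sessionId"),
}


def build_request(base_url, command, params, w3c=False):
    """Turn a command name and its parameters into (method, url, body)."""
    method, path_template = COMMANDS[command]
    # $sessionId in the template is filled in from the parameters.
    path = string.Template(path_template).substitute(params)
    body_params = dict(params)  # copy so the caller's dict is untouched
    if w3c:
        # In W3C mode the session ID travels only in the URL.
        body_params.pop("sessionId", None)
    return method, base_url + path, json.dumps(body_params)


method, url, body = build_request(
    "http://127.0.0.1:9515", "get",
    {"sessionId": "abc123", "url": "https://example.com"}, w3c=True)
print(method, url, body)
```

Unlike the original, which deletes sessionId from the caller's dictionary in place, this sketch works on a copy; the resulting request is the same.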
Here, focus on this line:
self._request(command_info[0], url, body=data)
Let's see what this method does:
# site-packages\selenium\webdriver\remote\remote_connection.py
def _request(self, method, url, body=None):
    """
    Send an HTTP request to the remote server.
    :Args:
     - method - A string for the HTTP method to send the request with.
     - url - A string for the URL to send the request to.
     - body - A string for request body. Ignored unless method is POST or PUT.
    :Returns:
      A dictionary with the server's parsed JSON response.
    """
    LOGGER.debug('%s %s %s' % (method, url, body))
    parsed_url = parse.urlparse(url)
    headers = self.get_remote_connection_headers(parsed_url, self.keep_alive)
    resp = None
    if body and method != 'POST' and method != 'PUT':
        body = None
    if self.keep_alive:
        resp = self._conn.request(method, url, body=body, headers=headers)
        statuscode = resp.status
    else:
        http = urllib3.PoolManager(timeout=self._timeout)
        resp = http.request(method, url, body=body, headers=headers)
        statuscode = resp.status
        if not hasattr(resp, 'getheader'):
            if hasattr(resp.headers, 'getheader'):
                resp.getheader = lambda x: resp.headers.getheader(x)
            elif hasattr(resp.headers, 'get'):
                resp.getheader = lambda x: resp.headers.get(x)
    data = resp.data.decode('UTF-8')
    try:
        if 300 <= statuscode < 304:
            return self._request('GET', resp.getheader('location'))
        if 399 < statuscode <= 500:
            return {'status': statuscode, 'value': data}
        content_type = []
        if resp.getheader('Content-Type') is not None:
            content_type = resp.getheader('Content-Type').split(';')
        if not any([x.startswith('image/png') for x in content_type]):
            try:
                data = utils.load_json(data.strip())
            except ValueError:
                if 199 < statuscode < 300:
                    status = ErrorCode.SUCCESS
                else:
                    status = ErrorCode.UNKNOWN_ERROR
                return {'status': status, 'value': data.strip()}
            # Some of the drivers incorrectly return a response
            # with no 'value' field when they should return null.
            if 'value' not in data:
                data['value'] = None
            return data
        else:
            data = {'status': 0, 'value': data}
            return data
    finally:
        LOGGER.debug("Finished Request")
        resp.close()
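This method sends an HTTP request to the remote server. The remote server here is the web service that comes up when the browser driver program is started, as described earlier.
The line http = urllib3.PoolManager(timeout=self._timeout)
shows that the HTTP request is sent with the urllib3 library.
Opening URLs, operating on browser windows, operating on page elements, and locating elements are all done through webdriver methods, so they all work on the same principle described above.
Take the get method, which opens a URL, as an example: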
def get(self, url):
    """
    Loads a web page in the current browser session.
    """
    self.execute(Command.GET, {'url': url})
Here is self.execute() again, which looks familiar because it appeared above.
def execute(self, driver_command, params=None):
    """
    Sends a command to be executed by a command.CommandExecutor.
    :Args:
     - driver_command: The name of the command to execute as a string.
     - params: A dictionary of named parameters to send with the command.
    :Returns:
      The command's JSON response loaded into a dictionary object.
    """
    if self.session_id is not None:
        if not params:
            params = {'sessionId': self.session_id}
        elif 'sessionId' not in params:
            params['sessionId'] = self.session_id
    params = self._wrap_value(params)
    response = self.command_executor.execute(driver_command, params)
    if response:
        self.error_handler.check_response(response)
        response['value'] = self._unwrap_value(
            response.get('value', None))
        return response
    # If the server doesn't send a response, assume the command was
    # a success
    return {'success': 0, 'value': None, 'sessionId': self.session_id}
The line response = self.command_executor.execute(driver_command, params)
calls the following method:
def execute(self, command, params):
    """
    Send a command to the remote server.
    Any path subtitutions required for the URL mapped to the command should be
    included in the command parameters.
    :Args:
     - command - A string specifying the command to execute.
     - params - A dictionary of named parameters to send with the command as
       its JSON payload.
    """
    command_info = self._commands[command]
    assert command_info is not None, 'Unrecognised command %s' % command
    path = string.Template(command_info[1]).substitute(params)
    if hasattr(self, 'w3c') and self.w3c and isinstance(params, dict) and 'sessionId' in params:
        del params['sessionId']
    data = utils.dump_json(params)
    url = '%s%s' % (self._url, path)
    return self._request(command_info[0], url, body=data)
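Here, command_info = self._commands[command]
is equivalent to command_info = self._commands[Command.GET].
From the entry Command.GET: ('POST', '/session/$sessionId/url'),
we get:
command_info = ('POST', '/session/$sessionId/url')
This method mainly gathers the three arguments that self._request() needs (method, url, and body) and then calls self._request() with them.
So driver.get(a_url) simply sends a POST request, carrying the specified content (here, the target URL), to the specified request address.
By the same reasoning, the other webdriver methods also control the browser by sending an HTTP request with the appropriate content to the appropriate address. The method is usually POST or GET, and DELETE for a few commands such as quit.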
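The whole client-to-driver exchange can be imitated with the standard library alone. The sketch below is a toy, not a real driver: FakeDriverHandler only records the URLs it is asked to open, the session ID "fake-session" is invented, and urllib.request is used in place of urllib3 so the example has no third-party dependency.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeDriverHandler(BaseHTTPRequestHandler):
    """Pretends to be a browser driver; it only records what it is asked to do."""

    visited = []  # URLs "opened" so far, shared across requests

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        if self.path == "/session":
            reply = {"value": {"sessionId": "fake-session", "capabilities": {}}}
        elif self.path == "/session/fake-session/url":
            self.visited.append(payload["url"])
            reply = {"value": None}
        else:
            reply = {"value": {"error": "unknown command"}}
        data = json.dumps(reply).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):
        pass  # keep the demo output quiet


# An opener with no proxies, so requests to 127.0.0.1 never leave the machine.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def post_json(url, payload):
    """POST a JSON body and return the decoded JSON reply."""
    request = urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"}, method="POST")
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


server = HTTPServer(("127.0.0.1", 0), FakeDriverHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base_url = "http://127.0.0.1:%d" % server.server_port

# Step 1: newSession. Step 2: the equivalent of driver.get(...).
session = post_json(base_url + "/session", {"capabilities": {}})
session_id = session["value"]["sessionId"]
post_json("%s/session/%s/url" % (base_url, session_id), {"url": "https://example.com"})
print(FakeDriverHandler.visited)  # ['https://example.com']

server.shutdown()
server.server_close()
```

A real driver would act on the browser where this handler appends to a list, but the HTTP conversation has the same shape.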
Closing
My knowledge is limited and many details of how selenium works are not covered here. If you find mistakes, please point them out.