Learning selenium (Part 2)

python蛀虫

Tags: selenium, learning, testing tools

Article link: https://blog.csdn.net/wdnmdbga/article/details/130734120

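The previous post covered obtaining the driver and creating options. This one takes a closer look at the selenium source code behind those parts.

I. The selenium library

The package is split into two main directories: webdriver, which handles creating and operating browser drivers, and common, which defines the exceptions selenium raises. The commonly used imports are listed below.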

from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains  # chains mouse and keyboard actions
from selenium.webdriver.common.keys import Keys  # special key constants for simulating keyboard input
from selenium.webdriver.common.by import By  # locator strategies (By.ID, By.XPATH, ...)
from selenium.webdriver.support.ui import WebDriverWait  # explicit waits for elements
from selenium.webdriver.support import expected_conditions  # condition functions used with waits
import selenium.common.exceptions as sce

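II. The common directory

This directory contains only the exceptions file and the init file.

The introduction to exceptions is a single sentence, which makes clear that the module exists to define the exceptions raised by the webdriver code: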

"""
Exceptions that may happen in all the webdriver code.
"""

The base exception is WebDriverException, which inherits from Exception.

class WebDriverException(Exception):
    """
    Base webdriver exception.
    """

    def __init__(self, msg=None, screen=None, stacktrace=None):
        self.msg = msg
        self.screen = screen
        self.stacktrace = stacktrace

    def __str__(self):
        exception_msg = "Message: %s\n" % self.msg
        if self.screen is not None:
            exception_msg += "Screenshot: available via screen\n"
        if self.stacktrace is not None:
            stacktrace = "\n".join(self.stacktrace)
            exception_msg += "Stacktrace:\n%s" % stacktrace
        return exception_msg

All other exception classes are subclasses of WebDriverException, either directly or through another subclass.
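
Because every exception shares one base class, a single except clause on WebDriverException catches all of them. The sketch below illustrates this without requiring selenium to be installed: it redefines the base class as quoted above, reuses the name NoSuchElementException for a direct subclass, and adds DemoLocatorError and describe, which are hypothetical names introduced only for this example.

```python
class WebDriverException(Exception):
    """Local stand-in that mirrors the base class quoted above."""

    def __init__(self, msg=None, screen=None, stacktrace=None):
        self.msg = msg
        self.screen = screen
        self.stacktrace = stacktrace

    def __str__(self):
        exception_msg = "Message: %s\n" % self.msg
        if self.screen is not None:
            exception_msg += "Screenshot: available via screen\n"
        if self.stacktrace is not None:
            exception_msg += "Stacktrace:\n%s" % "\n".join(self.stacktrace)
        return exception_msg


class NoSuchElementException(WebDriverException):
    """Direct subclass of the base exception."""


class DemoLocatorError(NoSuchElementException):
    """Hypothetical subclass of a subclass, for illustration only."""


def describe(exc):
    """Raise exc, catch it through the base class, and return its text."""
    try:
        raise exc
    except WebDriverException as caught:
        return str(caught)
```

For example, describe(DemoLocatorError("bad locator")) is handled by the same except clause as describe(NoSuchElementException("no such element")), and both return text starting with "Message: ".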

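III. The webdriver directory

__init__.py lists the names that can be imported directly from the package. Most of the browser modules are structured in much the same way, so these notes use chrome as the example, together with the operation modules under common.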

from .firefox.webdriver import WebDriver as Firefox  # noqa
from .firefox.firefox_profile import FirefoxProfile  # noqa
from .firefox.options import Options as FirefoxOptions  # noqa
from .chrome.webdriver import WebDriver as Chrome  # noqa
from .chrome.options import Options as ChromeOptions  # noqa
from .ie.webdriver import WebDriver as Ie  # noqa
from .ie.options import Options as IeOptions  # noqa
from .edge.webdriver import WebDriver as Edge  # noqa
from .opera.webdriver import WebDriver as Opera  # noqa
from .safari.webdriver import WebDriver as Safari  # noqa
from .blackberry.webdriver import WebDriver as BlackBerry  # noqa
from .phantomjs.webdriver import WebDriver as PhantomJS  # noqa
from .android.webdriver import WebDriver as Android  # noqa
from .webkitgtk.webdriver import WebDriver as WebKitGTK # noqa
from .webkitgtk.options import Options as WebKitGTKOptions # noqa
from .remote.webdriver import WebDriver as Remote  # noqa
from .common.desired_capabilities import DesiredCapabilities  # noqa
from .common.action_chains import ActionChains  # noqa
from .common.touch_actions import TouchActions  # noqa
from .common.proxy import Proxy  # noqa

The chrome folder contains five files.

1.webdriver

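This module controls the browser and performs operations on it. The class inherits from RemoteWebDriver, which will be covered later. Its constructor takes the following parameters: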

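- executable_path - Path to the executable. If the default is used, the executable is assumed to be on $PATH.
- port - Port the service should run on. If set to 0, a free port is found.
- options - Accepts the option settings supported by chrome.
- service_args - List of arguments to pass to the driver service.
- desired_capabilities - Dictionary object with non-browser-specific capabilities only, such as "proxy" or "loggingPref".
- service_log_path - Where the driver service writes its log.
- chrome_options - Deprecated; options is used instead:
            if chrome_options:
                warnings.warn('use options instead of chrome_options',
                          DeprecationWarning, stacklevel=2)
                options = chrome_options
- keep_alive - Whether to configure ChromeRemoteConnection to use HTTP keep-alive.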

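The options passed in also go through some processing. What is ultimately used is not the options object itself but desired_capabilities, into which the options are converted: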

if options is None:
    # desired_capabilities stays as passed in
    if desired_capabilities is None:
        desired_capabilities = self.create_options().to_capabilities()
else:
    if desired_capabilities is None:
        desired_capabilities = options.to_capabilities()
    else:
        desired_capabilities.update(options.to_capabilities())
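
The four cases of this branch can be sketched with plain dictionaries. In the sketch below, resolve_capabilities is a hypothetical helper, options_caps stands for the result of options.to_capabilities(), and default_caps stands for the result of self.create_options().to_capabilities(); none of these names come from selenium.

```python
def resolve_capabilities(options_caps, desired_capabilities, default_caps):
    """Mirror the branch above, using plain dicts instead of Options objects."""
    if options_caps is None:
        # No options given: keep the caller's capabilities, or fall back to defaults.
        if desired_capabilities is None:
            desired_capabilities = dict(default_caps)
    else:
        if desired_capabilities is None:
            desired_capabilities = options_caps
        else:
            # Both given: the options are merged in and win on conflicting keys.
            desired_capabilities.update(options_caps)
    return desired_capabilities
```

When both arguments are supplied, dict.update means that any key present in both ends up with the value from the options.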

Service mainly controls starting and stopping the driver; it will be covered later.

self.service = Service(
    executable_path,
    port=port,
    service_args=service_args,
    log_path=service_log_path)
self.service.start()

ChromeRemoteConnection is mainly used for remote control, that is, for sending commands to the driver service; it will be covered later.

try:
    RemoteWebDriver.__init__(
        self,
        command_executor=ChromeRemoteConnection(
            remote_server_addr=self.service.service_url,
            keep_alive=keep_alive),
        desired_capabilities=desired_capabilities)
except Exception:
    self.quit()
    raise

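(1) The commonly used function in this module is quit(), which exits the browser. Calling it actually delegates to RemoteWebDriver's quit() and then stops the service.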

    def quit(self):
        """
        Closes the browser and shuts down the ChromeDriver executable
        that is started when starting the ChromeDriver
        """
        try:
            RemoteWebDriver.quit(self)
        except Exception:
            # We don't care about the message because something probably has gone wrong
            pass
        finally:
            self.service.stop()
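
The try/except/finally structure guarantees that the driver service is stopped even when the remote quit fails. A minimal sketch of that behaviour, using hypothetical stand-ins (FakeService, FakeDriver, _remote_quit) rather than real selenium classes:

```python
class FakeService:
    """Hypothetical stand-in for the driver service; records whether stop() ran."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


class FakeDriver:
    """Hypothetical driver whose quit() follows the same pattern as above."""

    def __init__(self, service, remote_quit_fails=False):
        self.service = service
        self.remote_quit_fails = remote_quit_fails

    def _remote_quit(self):
        # Stand-in for RemoteWebDriver.quit(self).
        if self.remote_quit_fails:
            raise RuntimeError("browser already gone")

    def quit(self):
        try:
            self._remote_quit()
        except Exception:
            # The error is swallowed, as in the excerpt above.
            pass
        finally:
            # Runs whether or not the remote quit succeeded.
            self.service.stop()
```

With FakeDriver(FakeService(), remote_quit_fails=True), calling quit() raises nothing and the service still ends up stopped.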

(2) The create_options function returns an Options object; the Options object is covered below.

def create_options(self):
    return Options()

2.options module

This module mainly provides a browser options object. Its default attributes are:

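self._binary_location = ''   # path to the browser binary
self._arguments = []         # browser arguments, as a list
self._extension_files = []   # browser extension files
self._extensions = []        # encoded browser extensions, as a list
self._experimental_options = {}  # other browser options that are not args etc.
self._debugger_address = None    # debugger address of the browser
self._caps = DesiredCapabilities.CHROME.copy()  # capabilities dictionary for the browser

DesiredCapabilities.CHROME defaults to:

CHROME = {
   "browserName": "chrome",
   "version": "",
   "platform": "ANY",
}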

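The functions in the class fall into three kinds: functions decorated with @property, functions that operate on the variables, and the to_capabilities conversion function.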

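(1) Functions decorated with @property mainly return the values of the default attributes. Those attributes are all named self._xxx, which by Python convention marks them as private (although they can still be accessed from outside). The property functions expose each attribute under a name without the leading underscore, so callers write instancename.xxx rather than instancename._xxx.

@property
def binary_location(self):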
    """
    Returns the location of the binary otherwise an empty string
    """
    return self._binary_location

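(2) Functions that operate on the variables

These divide into list operations and dictionary operations. In effect they are equivalent to updating the dictionary or appending to the list from outside.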

def set_capability(self, name, value):
    """Sets a capability."""
    self._caps[name] = value

def add_argument(self, argument):
    """
    Adds an argument to the list

    :Args:
    - Sets the arguments
    """
    if argument:
        self._arguments.append(argument)
    else:
        raise ValueError("argument can not be null")

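One of them is headless. It is not a default attribute of the class: it only reports True once '--headless' is in the argument list, either because add_argument was called with '--headless' or because the headless setter (or the deprecated set_headless) was used.

headless is also both readable and writable, because a @headless.setter is defined; with @property alone it would be read-only.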

@property
def headless(self):
    """
    Returns whether or not the headless argument is set
    """
    return '--headless' in self._arguments

@headless.setter
def headless(self, value):
    """
    Sets the headless argument

    Args:
      value: boolean value indicating to set the headless option
    """
    args = {'--headless'}
    if platform.system().lower() == 'windows':
        args.add('--disable-gpu')
    if value is True:
        self._arguments.extend(args)
    else:
        self._arguments = list(set(self._arguments) - args)

def set_headless(self, headless=True):
    """ Deprecated, options.headless = True """
    warnings.warn('use setter for headless property instead of set_headless',
                  DeprecationWarning, stacklevel=2)
    self.headless = headless
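
The difference between a read-only and a read-write property can be sketched with a cut-down class. MiniOptions below is a hypothetical stand-in, not the selenium class. Its headless setter is deliberately simplified: the Windows-only '--disable-gpu' handling is left out, and a list comprehension replaces the set difference used in the quoted setter, since converting to a set does not preserve the order of the remaining arguments.

```python
class MiniOptions:
    """Hypothetical, cut-down stand-in for the Options class discussed above."""

    def __init__(self):
        self._arguments = []

    @property
    def arguments(self):
        # Getter only: reading works, but assigning to it raises AttributeError.
        return self._arguments

    def add_argument(self, argument):
        if not argument:
            raise ValueError("argument can not be null")
        self._arguments.append(argument)

    @property
    def headless(self):
        # Derived value: there is no separate _headless attribute.
        return '--headless' in self._arguments

    @headless.setter
    def headless(self, value):
        # Simplified, order-preserving variant of the quoted setter.
        if value is True:
            self._arguments.append('--headless')
        else:
            self._arguments = [a for a in self._arguments if a != '--headless']
```

Here opts.headless = True works because a setter exists, while opts.arguments = [] fails because arguments only has a getter.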

to_capabilities is the function WebDriver uses to convert the options into capabilities; it returns a dictionary.

def to_capabilities(self):
    """
        Creates a capabilities with all the options that have been set and

        returns a dictionary with everything
    """
    caps = self._caps
    chrome_options = self.experimental_options.copy()
    chrome_options["extensions"] = self.extensions
    if self.binary_location:
        chrome_options["binary"] = self.binary_location
    chrome_options["args"] = self.arguments
    if self.debugger_address:
        chrome_options["debuggerAddress"] = self.debugger_address

    caps[self.KEY] = chrome_options

    return caps
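
To make the shape of the returned dictionary concrete, the sketch below reproduces the same logic in a hypothetical stand-in class, CapsDemoOptions. The value of KEY is an assumption ("goog:chromeOptions"), since the excerpt does not show it, and the attributes are set directly instead of through the real setter methods.

```python
class CapsDemoOptions:
    """Hypothetical stand-in that reproduces the to_capabilities logic above."""

    KEY = "goog:chromeOptions"  # assumed value; not shown in the excerpt

    def __init__(self):
        self._binary_location = ''
        self._arguments = []
        self._extensions = []
        self._experimental_options = {}
        self._debugger_address = None
        self._caps = {"browserName": "chrome", "version": "", "platform": "ANY"}

    def to_capabilities(self):
        caps = self._caps  # same dict object, not a copy
        chrome_options = self._experimental_options.copy()
        chrome_options["extensions"] = self._extensions
        if self._binary_location:
            chrome_options["binary"] = self._binary_location
        chrome_options["args"] = self._arguments
        if self._debugger_address:
            chrome_options["debuggerAddress"] = self._debugger_address
        caps[self.KEY] = chrome_options
        return caps


demo = CapsDemoOptions()
demo._arguments.append('--headless')
demo._experimental_options['detach'] = True
demo_caps = demo.to_capabilities()
```

Empty values such as the binary location and debugger address are omitted from the result, while "extensions" and "args" are always present. Since caps is assigned from self._caps without copying, the returned dictionary is the instance's own dictionary.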

3.remote_connection module

The ChromeRemoteConnection class inherits from RemoteConnection, which will be covered later.

def __init__(self, remote_server_addr, keep_alive=True):
    RemoteConnection.__init__(self, remote_server_addr, keep_alive)
    self._commands["launchApp"] = ('POST', '/session/$sessionId/chromium/launch_app')
    self._commands["setNetworkConditions"] = ('POST', '/session/$sessionId/chromium/network_conditions')
    self._commands["getNetworkConditions"] = ('GET', '/session/$sessionId/chromium/network_conditions')
    self._commands['executeCdpCommand'] = ('POST', '/session/$sessionId/goog/cdp/execute')
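
Each entry maps a command name to an HTTP method and a path template containing a $sessionId placeholder. The sketch below shows one way such a template could be turned into a request URL, assuming string.Template-style substitution; that substitution step, the resolve helper, the base URL, and the session id are all assumptions made for illustration and are not taken from the excerpt.

```python
from string import Template

# Same entries as registered in __init__ above.
commands = {
    "launchApp": ('POST', '/session/$sessionId/chromium/launch_app'),
    "setNetworkConditions": ('POST', '/session/$sessionId/chromium/network_conditions'),
    "getNetworkConditions": ('GET', '/session/$sessionId/chromium/network_conditions'),
    "executeCdpCommand": ('POST', '/session/$sessionId/goog/cdp/execute'),
}


def resolve(command, params, base_url):
    """Hypothetical helper: look up a command and fill in its path template."""
    method, path_template = commands[command]
    path = Template(path_template).substitute(params)
    return method, base_url + path
```

For example, resolve("getNetworkConditions", {"sessionId": "abc123"}, "http://127.0.0.1:9515") yields a GET request for the network_conditions path of session abc123.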

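4.service module

The Service module was mentioned in the webdriver module: it mainly controls starting and stopping the driver. It inherits from selenium's common Service base class, which will be covered later. It accepts the following parameters:

- executable_path : path to the browser driver executable
- port : service port, with the same default as in the webdriver module
- service_args : arguments passed to the browser driver
- log_path : path of the log written during execution

The init function only initializes the service: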

self.service_args = service_args or []
if log_path:
   self.service_args.append('--log-path=%s' % log_path)

service.Service.__init__(self, executable_path, port=port, env=env,
                                 start_error_message="Please see https://sites.google.com/a/chromium.org/chromedriver/home")
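
The argument handling at the top of this snippet can be isolated into a small function. build_service_args below is a hypothetical helper written for this example, and '--verbose' is used only as a sample argument.

```python
def build_service_args(service_args=None, log_path=None):
    """Mirror the first three lines of the snippet above."""
    args = service_args or []
    if log_path:
        args.append('--log-path=%s' % log_path)
    return args
```

Because of service_args or [], a non-empty list passed by the caller is reused rather than copied, so the '--log-path=...' entry is appended to the caller's own list.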