A Walkthrough of Appium Click Operations

Article link: https://blog.csdn.net/liuqinhou/article/details/126009276

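In the Appium framework, a client script performs a UI operation as follows. When the script needs to perform a UI operation, it sends an HTTP request (whose payload includes information about the target element) to the Appium server. The server translates the received data and forwards it to the instrumentation program installed on the phone. The instrumentation program then calls the UiAutomator UI-operation libraries provided by the Android SDK to carry out the actual UI operation, and the result travels back along the same path to the script, closing the loop.

The rest of this article mainly analyzes how the script sends HTTP requests to the Appium server.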

1. Start the Appium server

First, start the Appium server so that it listens on port 4723; the script can then send HTTP requests to that port.

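2. Create a WebDriver object in the script before running test cases

This object can be thought of as the wrapper library that Appium gives the script for performing UI operations.

The WebDriver constructor sends a request to the Appium server based on the "desired capabilities". After receiving them, the server creates a session id from this information and returns it to the test script.

Once Appium has the "desired capabilities", it also knows which of the phones attached to the PC it should connect to. At this point the Appium server does a lot of work to make sure the connection to the instrumentation service on the phone succeeds. For details, see my other article, "appium 从启动到测试再到结束流程梳理" (a walkthrough of the Appium flow from startup through testing to shutdown).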

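You may ask: why does the script need a session id before running test cases?

When a test runs, the script is essentially sending HTTP requests to the server, but HTTP is stateless: the server treats every request as unrelated to the ones before it. Without sessions, every request would have to carry the "desired capabilities" so that the server knows which phone to talk to. The "desired capabilities" also contain many other settings that the Appium server needs in order to run as specified. In other words, every HTTP request must run in the same environment.

Sending the "desired capabilities" with every request is clearly impractical, so Appium uses a session mechanism. The first time the Appium server receives the "desired capabilities", it stores them locally and returns a session id to the test script. When later requests reach the server, it uses the session id to look up the corresponding "desired capabilities" and therefore knows the run environment (that is, which phone to talk to, plus settings such as command timeouts or retries specified in the "desired capabilities").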
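The session idea can be sketched in a few lines of Python. This is only an illustration of the concept, not Appium's server code: the names _sessions, new_session and capabilities_for are hypothetical, and a real server keeps far more state per session.

```python
import uuid

# Hypothetical in-memory session table: session id -> desired capabilities.
_sessions = {}


def new_session(desired_capabilities):
    """Store the capabilities once and hand back an opaque session id."""
    session_id = uuid.uuid4().hex
    _sessions[session_id] = dict(desired_capabilities)
    return session_id


def capabilities_for(session_id):
    """Recover the run environment for a later request from its session id."""
    return _sessions[session_id]


# The first request carries the full capabilities; later ones only carry the id.
sid = new_session({"platformName": "Android", "newCommandTimeout": 6000})
later_caps = capabilities_for(sid)
```

Every request after the first only needs to carry sid, which is much smaller than the full capability set.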

I. Obtain the WebDriver object and get the session id

self.caps = {}
self.caps["platformName"] = "Android"
self.caps["platformVersion"] = devices.dev[Constant.phone]["platformVersion"]
self.caps["deviceName"] = devices.dev[Constant.phone]["phone"]
self.caps["appPackage"] = Constant.appPackage
self.caps["appActivity"] = Constant.appActivity
self.caps['app'] = Constant.app
self.caps["unicodeKeyboard"] = True
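self.caps["autoAcceptAlerts"] = True  # intended to auto-accept permission dialogs (Appium documents this capability for iOS; on Android, autoGrantPermissions is the usual choice)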
self.caps["resetKeyboard"] = True
self.caps["noReset"] = True
self.caps["newCommandTimeout"] = 6000
self.driver = webdriver.Remote('http://127.0.0.1:4723/wd/hub', self.caps)  # localhost

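The snippet above stores the "desired capabilities" in a dictionary in the Appium script and then creates a WebDriver object.

'http://127.0.0.1:4723/wd/hub' means the request is sent to port 4723 on the local machine, which is the port the Appium server listens on. In the path part of the URL, /wd/hub, wd is short for webdriver and hub denotes the central node. The corresponding routes can be found in the Node.js source of the Appium server.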
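To make the pieces of that URL concrete, here is a small sketch that splits it with the standard library. It only illustrates how the address decomposes and is not something the client script needs to do itself.

```python
from urllib.parse import urlparse

# Split the server address used in the script into its components.
parts = urlparse("http://127.0.0.1:4723/wd/hub")

host = parts.hostname   # the local machine
port = parts.port       # the port the Appium server listens on
base_path = parts.path  # the prefix under which the session routes live
```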

class WebDriver(webdriver.Remote):
    def __init__(self, command_executor='http://127.0.0.1:4444/wd/hub',
                 desired_capabilities=None, browser_profile=None, proxy=None, keep_alive=False):

        super(WebDriver, self).__init__(command_executor, desired_capabilities, browser_profile, proxy, keep_alive)

        if self.command_executor is not None:
            self._addCommands()

class WebDriver(object):
    def __init__(self, command_executor='http://127.0.0.1:4444/wd/hub',
                 desired_capabilities=None, browser_profile=None, proxy=None,
                 keep_alive=False, file_detector=None):
        ...
        if type(self.command_executor) is bytes or isinstance(self.command_executor, str):
            self.command_executor = RemoteConnection(command_executor, keep_alive=keep_alive)
        ...
        self.start_session(desired_capabilities, browser_profile)
        ...

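Of the two classes above, the first is Appium's WebDriver, which subclasses Selenium's webdriver.Remote (the second class). Let's start with the simple part: when self.command_executor is not None, the constructor calls self._addCommands() to add some commands to the self.command_executor._commands mapping. Appium is built on top of Selenium: self.command_executor._commands is Selenium's original command mapping table, and Appium adds its own entries to it.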
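As a rough sketch of what "adding commands to the mapping" amounts to, the snippet below extends a plain dictionary. The function name add_mobile_commands and the single getContexts entry are illustrative assumptions; the real client registers many more commands through _addCommands().

```python
# Stand-in for the command table inherited from Selenium (two entries only).
commands = {
    "status": ("GET", "/status"),
    "newSession": ("POST", "/session"),
}


def add_mobile_commands(table):
    """Register an extra, mobile-specific command on top of the inherited ones."""
    # Illustrative entry only; the command name and path are assumptions.
    table["getContexts"] = ("GET", "/session/$sessionId/contexts")
    return table


add_mobile_commands(commands)
```

The inherited entries stay untouched; the extension only adds new keys to the same table.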

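Beyond that, the constructor mainly creates a RemoteConnection object and calls start_session().

1. Create a RemoteConnection object and assign it to the command_executor attribute

Looking at the RemoteConnection constructor, it mainly checks that the URL we passed in is well formed, parses it, and stores the result in its own attributes. It also uses the _commands dictionary to store the HTTP method and URL path of each request (presumably in preparation for later steps). Each entry has the form: command name --> (HTTP method, HTTP path). This way, knowing only the command name is enough to find the request method and path.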

self._commands = {
	Command.STATUS: ('GET', '/status'),
	Command.NEW_SESSION: ('POST', '/session'),
	Command.GET_ALL_SESSIONS: ('GET', '/sessions'),
	Command.QUIT: ('DELETE', '/session/$sessionId'),
	Command.GET_CURRENT_WINDOW_HANDLE:
	   ('GET', '/session/$sessionId/window_handle'),
	Command.GET_WINDOW_HANDLES:
	   ('GET', '/session/$sessionId/window_handles'),
	Command.GET: ('POST', '/session/$sessionId/url'),
	Command.GO_FORWARD: ('POST', '/session/$sessionId/forward'),
	Command.GO_BACK: ('POST', '/session/$sessionId/back'),
	Command.REFRESH: ('POST', '/session/$sessionId/refresh'),
	Command.EXECUTE_SCRIPT: ('POST', '/session/$sessionId/execute'),
	Command.GET_CURRENT_URL: ('GET', '/session/$sessionId/url'),
	Command.GET_TITLE: ('GET', '/session/$sessionId/title'),
	Command.GET_PAGE_SOURCE: ('GET', '/session/$sessionId/source'),
	Command.SCREENSHOT: ('GET', '/session/$sessionId/screenshot'),
	Command.ELEMENT_SCREENSHOT: ('GET', '/session/$sessionId/element/$id/screenshot'),
	Command.FIND_ELEMENT: ('POST', '/session/$sessionId/element'),
	Command.FIND_ELEMENTS: ('POST', '/session/$sessionId/elements'),
    ...

The Command class holds all the commands a test case can use.

class Command(object):
    """
    Defines constants for the standard WebDriver commands.

    While these constants have no meaning in and of themselves, they are
    used to marshal commands through a service that implements WebDriver's
    remote wire protocol:

        https://github.com/SeleniumHQ/selenium/wiki/JsonWireProtocol

    """

    # Keep in sync with org.openqa.selenium.remote.DriverCommand

    STATUS = "status"
    NEW_SESSION = "newSession"
    GET_ALL_SESSIONS = "getAllSessions"
    DELETE_SESSION = "deleteSession"
    CLOSE = "close"
    QUIT = "quit"
    GET = "get"
    GO_BACK = "goBack"
    GO_FORWARD = "goForward"
    REFRESH = "refresh"
    ADD_COOKIE = "addCookie"
    GET_COOKIE = "getCookie"
    GET_ALL_COOKIES = "getCookies"
    DELETE_COOKIE = "deleteCookie"
    DELETE_ALL_COOKIES = "deleteAllCookies"
    FIND_ELEMENT = "findElement"
    FIND_ELEMENTS = "findElements"
    FIND_CHILD_ELEMENT = "findChildElement"
    FIND_CHILD_ELEMENTS = "findChildElements"
    CLEAR_ELEMENT = "clearElement"
    CLICK_ELEMENT = "clickElement"
    SEND_KEYS_TO_ELEMENT = "sendKeysToElement"
    SEND_KEYS_TO_ACTIVE_ELEMENT = "sendKeysToActiveElement"
    SUBMIT_ELEMENT = "submitElement"
    UPLOAD_FILE = "uploadFile"
    GET_CURRENT_WINDOW_HANDLE = "getCurrentWindowHandle"
    GET_WINDOW_HANDLES = "getWindowHandles"
    ...

This ties each command constant to its request method and path.
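The lookup from command name to a concrete request can be sketched as follows. This is a minimal illustration that uses string.Template to fill in the $sessionId placeholder; the COMMANDS table is a two-entry excerpt, and resolve and the session id abc123 are hypothetical.

```python
from string import Template

# Two entries copied from the mapping shown above.
COMMANDS = {
    "findElement": ("POST", "/session/$sessionId/element"),
    "quit": ("DELETE", "/session/$sessionId"),
}


def resolve(command, params, base_url="http://127.0.0.1:4723/wd/hub"):
    """Turn a command name plus parameters into (HTTP method, full URL)."""
    method, path_template = COMMANDS[command]
    # Fill placeholders such as $sessionId from the parameters.
    path = Template(path_template).substitute(params)
    return method, base_url + path


method, url = resolve("findElement", {"sessionId": "abc123"})
```

With the table in place, the rest of the client only ever deals in command names; the method and path are derived at send time.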

2. Call start_session(desired_capabilities, browser_profile) to generate the session id

def start_session(self, desired_capabilities, browser_profile=None):
	"""
	Creates a new session with the desired capabilities.

	:Args:
	 - browser_name - The name of the browser to request.
	 - version - Which browser version to request.
	 - platform - Which platform to request the browser on.
	 - javascript_enabled - Whether the new session should support JavaScript.
	 - browser_profile - A selenium.webdriver.firefox.firefox_profile.FirefoxProfile object. Only used if Firefox is requested.
	"""
	capabilities = {'desiredCapabilities': {}, 'requiredCapabilities': {}}
	for k, v in desired_capabilities.items():
	    if k not in ('desiredCapabilities', 'requiredCapabilities'):
	        capabilities['desiredCapabilities'][k] = v
	    else:
	        capabilities[k].update(v)
	if browser_profile:
	    capabilities['desiredCapabilities']['firefox_profile'] = browser_profile.encoded
	response = self.execute(Command.NEW_SESSION, capabilities)
	if 'sessionId' not in response:
	    response = response['value']
	self.session_id = response['sessionId']
	self.capabilities = response['value']

	# Quick check to see if we have a W3C Compliant browser
	self.w3c = response.get('status') is None

The key line is response = self.execute(Command.NEW_SESSION, capabilities), which sends the new-session request.
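The tail of start_session() copes with two response layouts: one with sessionId at the top level and one with it nested under value. The sketch below isolates that step. The two sample responses are assumed shapes for illustration, not captured server output, and the fallback to a capabilities key is an assumption added so the nested shape also yields the capabilities.

```python
def extract_session(response):
    """Return (session_id, capabilities) from either response layout."""
    if "sessionId" not in response:
        # Nested layout: the useful fields live under "value".
        response = response["value"]
    caps = response.get("value")
    if caps is None:
        # Assumed fallback for the nested layout.
        caps = response.get("capabilities")
    return response["sessionId"], caps


# Assumed example payloads, one per layout.
legacy = {"status": 0, "sessionId": "abc", "value": {"platformName": "Android"}}
nested = {"value": {"sessionId": "def", "capabilities": {"platformName": "Android"}}}
```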

In Appium, every HTTP request is sent through the execute() method. execute() in turn calls command_executor.execute(), and it is that execute() which finally sends the HTTP request.

def execute(self, driver_command, params=None):
	"""
	Sends a command to be executed by a command.CommandExecutor.
	
	:Args:
	 - driver_command: The name of the command to execute as a string.
	 - params: A dictionary of named parameters to send with the command.
	
	:Returns:
	  The command's JSON response loaded into a dictionary object.
	"""
	if self.session_id is not None:
	    if not params:
	        params = {'sessionId': self.session_id}
	    elif 'sessionId' not in params:
	        params['sessionId'] = self.session_id
	
	params = self._wrap_value(params)
	response = self.command_executor.execute(driver_command, params)
	if response:
	    self.error_handler.check_response(response)
	    response['value'] = self._unwrap_value(
	        response.get('value', None))
	    return response
	# If the server doesn't send a response, assume the command was
	# a success
	return {'success': 0, 'value': None, 'sessionId': self.session_id}

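As you can see, the function first checks whether session_id is None. While the session id is being obtained, session_id is still None, so execution goes straight to the logic below. For every request after the session has been created, session_id is not None, so the function makes sure params carries the sessionId parameter. That is how the server knows which client a request comes from.

So this is where the client obtains its session id, and where the session id is added to every subsequent request.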
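The session-id handling at the top of execute() can be pulled out into a standalone function to see its effect in isolation. with_session_id is a hypothetical name; the branching mirrors the code shown above.

```python
def with_session_id(session_id, params=None):
    """Attach the session id to params the same way execute() does."""
    if session_id is not None:
        if not params:
            params = {"sessionId": session_id}
        elif "sessionId" not in params:
            params["sessionId"] = session_id
    return params


# Before the session exists, params pass through unchanged.
first = with_session_id(None, {"desiredCapabilities": {}})
# Afterwards, every request gets the id added.
later = with_session_id("abc123", {"using": "id", "value": "xxx"})
```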

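The call flow is a bit involved, so here it is in outline: webdriver.Remote(url, caps) -> WebDriver.__init__() -> RemoteConnection(url) -> start_session(caps) -> execute(Command.NEW_SESSION) -> command_executor.execute() -> HTTP POST /session -> session id returned to the script.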

II. Find an element and perform a UI operation

A basic click operation is used to walk through this process.

self.driver.find_element_by_id("xxx").click()

1. driver.find_element_by_id() gets the element

def find_element_by_id(self, id_):
    """Finds an element by id.

    :Args:
     - id\_ - The id of the element to be found.

    :Usage:
        driver.find_element_by_id('foo')
    """
    return self.find_element(by=By.ID, value=id_)

Internally it calls the driver's own find_element() method:

def find_element(self, by=By.ID, value=None):
    """
    'Private' method used by the find_element_by_* methods.

    :Usage:
        Use the corresponding find_element_by_* instead of this.

    :rtype: WebElement
    """
    if self.w3c:
        if by == By.ID:
            by = By.CSS_SELECTOR
            value = '[id="%s"]' % value
        elif by == By.TAG_NAME:
            by = By.CSS_SELECTOR
        elif by == By.CLASS_NAME:
            by = By.CSS_SELECTOR
            value = ".%s" % value
        elif by == By.NAME:
            by = By.CSS_SELECTOR
            value = '[name="%s"]' % value
    return self.execute(Command.FIND_ELEMENT, {
        'using': by,
        'value': value})['value']

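Here the by and value arguments are adjusted according to the lookup strategy, and then the driver's own execute() method is called. Note the rtype: WebElement in the docstring, which says that find_element() returns a WebElement object.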
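The locator rewriting in the w3c branch can be tried on its own. In this sketch the strategy strings are written out as plain constants rather than imported from Selenium's By class, and to_css_locator is a hypothetical name.

```python
# Strategy strings, written out so the sketch has no third-party imports.
ID, NAME, CLASS_NAME, TAG_NAME, CSS = "id", "name", "class name", "tag name", "css selector"


def to_css_locator(by, value):
    """Rewrite id/name/class/tag lookups as CSS selectors; pass others through."""
    if by == ID:
        return CSS, '[id="%s"]' % value
    if by == TAG_NAME:
        return CSS, value
    if by == CLASS_NAME:
        return CSS, ".%s" % value
    if by == NAME:
        return CSS, '[name="%s"]' % value
    return by, value


using, selector = to_css_locator(ID, "foo")
```

The resulting pair is what ends up in the using and value fields of the request body.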

self.execute(Command.FIND_ELEMENT, {'using': by,'value': value})['value']

So we need to look inside execute() to see how a WebElement object gets returned.

def execute(self, driver_command, params=None):
    """
    Sends a command to be executed by a command.CommandExecutor.

    :Args:
     - driver_command: The name of the command to execute as a string.
     - params: A dictionary of named parameters to send with the command.

    :Returns:
      The command's JSON response loaded into a dictionary object.
    """
    if self.session_id is not None:
        if not params:
            params = {'sessionId': self.session_id}
        elif 'sessionId' not in params:
            params['sessionId'] = self.session_id

    params = self._wrap_value(params)
    response = self.command_executor.execute(driver_command, params)
    if response:
        self.error_handler.check_response(response)
        response['value'] = self._unwrap_value(
            response.get('value', None))
        return response
    # If the server doesn't send a response, assume the command was
    # a success
    return {'success': 0, 'value': None, 'sessionId': self.session_id}

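From the docstring, execute() sends a command to be executed to command.CommandExecutor and gets back the result, response, which is the JSON response loaded into a dictionary.

The first part of execute() adds the session id to params, as analyzed earlier.

Next it wraps params and calls command_executor.execute(driver_command, params) to get response. It then checks response for errors; if there are none, it unwraps the value in response. This is where the WebElement object is created.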

def _unwrap_value(self, value):
    if isinstance(value, dict) and ('ELEMENT' in value or 'element-6066-11e4-a52e-4f735466cecf' in value):
        wrapped_id = value.get('ELEMENT', None)
        if wrapped_id:
            return self.create_web_element(value['ELEMENT'])
        else:
            return self.create_web_element(value['element-6066-11e4-a52e-4f735466cecf'])

    elif isinstance(value, list):
        return list(self._unwrap_value(item) for item in value)
    else:
        return value

create_web_element() is implemented as:

def create_web_element(self, element_id):
    """Creates a web element with the specified `element_id`."""
    return self._web_element_cls(self, element_id, w3c=self.w3c)

and _web_element_cls is:

_web_element_cls = WebElement

So now it is clear how a call to find_element_by_id() leads, step by step, to a WebElement.
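The unwrapping step can be reproduced with a stand-in element class. FakeElement and unwrap are hypothetical names, and the string "driver" merely stands in for the WebDriver instance that the real code passes as the parent.

```python
W3C_KEY = "element-6066-11e4-a52e-4f735466cecf"


class FakeElement:
    """Stand-in for WebElement: remembers its parent and its element id."""

    def __init__(self, parent, id_):
        self.parent = parent
        self.id = id_


def unwrap(parent, value):
    """Replace element references in a response value with element objects."""
    if isinstance(value, dict) and ("ELEMENT" in value or W3C_KEY in value):
        # Prefer the legacy key, fall back to the W3C key.
        return FakeElement(parent, value.get("ELEMENT") or value[W3C_KEY])
    if isinstance(value, list):
        return [unwrap(parent, item) for item in value]
    return value


single = unwrap("driver", {W3C_KEY: "42"})
many = unwrap("driver", [{"ELEMENT": "1"}, {"ELEMENT": "2"}])
```

A list of references (as returned by a find-elements call) becomes a list of element objects, while any other value is returned unchanged.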

2. Call click() on the WebElement object

def click(self):
    """Clicks the element."""
    self._execute(Command.CLICK_ELEMENT)

Step into _execute():

# Private Methods
def _execute(self, command, params=None):
    """Executes a command against the underlying HTML element.

    Args:
      command: The name of the command to _execute as a string.
      params: A dictionary of named parameters to send with the command.

    Returns:
      The command's JSON response loaded into a dictionary object.
    """
    if not params:
        params = {}
    params['id'] = self._id
    return self._parent.execute(command, params)

What is self._parent? In the WebElement constructor, self._parent is the first argument passed in:

class WebElement(object):

    def __init__(self, parent, id_, w3c=False):
        self._parent = parent
        self._id = id_
        self._w3c = w3c

Going back to where the WebElement is created, the object passed in is the WebDriver itself:

def create_web_element(self, element_id):
    """Creates a web element with the specified `element_id`."""
    return self._web_element_cls(self, element_id, w3c=self.w3c)

def _unwrap_value(self, value):
    if isinstance(value, dict) and ('ELEMENT' in value or 'element-6066-11e4-a52e-4f735466cecf' in value):
        wrapped_id = value.get('ELEMENT', None)
        if wrapped_id:
            return self.create_web_element(value['ELEMENT'])
        else:
            return self.create_web_element(value['element-6066-11e4-a52e-4f735466cecf'])

    elif isinstance(value, list):
        return list(self._unwrap_value(item) for item in value)
    else:
        return value

So self._parent.execute(command, params) is still a call to the WebDriver object's execute() method.

That clarifies how an element gets clicked: in essence, it too is an HTTP request sent to the Appium server.
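To tie the whole chain together, the following sketch replays a click with stand-in classes and records the request instead of sending it. RecordingExecutor, SketchDriver and SketchElement are hypothetical simplifications of RemoteConnection, WebDriver and WebElement, and the session id abc123 and element id 42 are made up.

```python
from string import Template


class RecordingExecutor:
    """Stand-in for the remote connection: records requests instead of sending them."""

    COMMANDS = {"clickElement": ("POST", "/session/$sessionId/element/$id/click")}

    def __init__(self, base_url):
        self.base_url = base_url
        self.sent = []

    def execute(self, command, params):
        method, template = self.COMMANDS[command]
        url = self.base_url + Template(template).substitute(params)
        self.sent.append((method, url))
        return {"value": None}


class SketchDriver:
    def __init__(self, executor, session_id):
        self.command_executor = executor
        self.session_id = session_id

    def execute(self, command, params=None):
        # Add the session id, then hand the command to the executor.
        params = dict(params or {})
        params.setdefault("sessionId", self.session_id)
        return self.command_executor.execute(command, params)


class SketchElement:
    def __init__(self, parent, id_):
        self._parent = parent
        self._id = id_

    def click(self):
        # The element only adds its own id and delegates to the driver.
        self._parent.execute("clickElement", {"id": self._id})


executor = RecordingExecutor("http://127.0.0.1:4723/wd/hub")
driver = SketchDriver(executor, session_id="abc123")
SketchElement(driver, "42").click()
```

After the call, executor.sent holds a single POST whose URL contains both the session id and the element id, which is the request a real click would put on the wire.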