Can't get past CAPTCHAs in automated testing? Look here: login CAPTCHA recognition with the pytesseract library

暴走的测试工程师

Category: web UI automated testing. Tags: image recognition, OCR, Python

Article link: https://blog.csdn.net/Mr_Deng_/article/details/115477774

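1. Call a third-party image recognition API. I have not found a good free one; if you know of one, please share it.

2. Use cookies to skip the login and open the page directly. webdriver provides methods to get, add, and delete cookies.

# Approach: log in successfully, get the cookies and save them to a file -> add the cookies, then open the page directly.

Note: the % character cannot be written to the .ini config file, so it has to be replaced when the cookies are saved.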
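
The % restriction most likely comes from Python's configparser, whose default interpolation treats % as special syntax. The sketch below assumes the rwParams helper (not shown in this article) wraps configparser; the option name "admin" is made up for illustration. It shows the default parser rejecting a URL-encoded value, while a parser created with interpolation=None accepts it unchanged.

```python
import configparser


def try_set(parser, value):
    """Return True if the parser accepts the value, False if it raises ValueError."""
    parser.add_section("LOGIN_COOKIE")
    try:
        # "admin" is a hypothetical option name used only for this demo
        parser.set("LOGIN_COOKIE", "admin", value)
    except ValueError:
        return False
    return True


# A value containing URL-encoded characters, as cookies often do
cookie_value = "redirect=%2Fadmin%2Fdashboard"

# Default parser: "%" is interpolation syntax, so this value is rejected
default_ok = try_set(configparser.ConfigParser(), cookie_value)

# Interpolation disabled: the value is stored as-is
raw_ok = try_set(configparser.ConfigParser(interpolation=None), cookie_value)

print(default_ok, raw_ok)
```

If the helper can be changed, disabling interpolation may be simpler than swapping % for another character, because no cookie value gets altered on the way in or out.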

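    def get_cookie(self, option):
        """
        Save the cookies of the logged-in session to the config file.
        :param option: option name (key) under the LOGIN_COOKIE section
        :return: None
        """
        # "%" cannot be written to the .ini file (it raises an error), so swap it for "$".
        # Caveat: a cookie that already contains "$" will not round-trip exactly.
        cookieValue = str(self.driver.get_cookies()).replace("%", "$")
        Log.info("Browser cookies read: {}...".format(cookieValue[:50]))
        # Persist the cookie string in the config file
        rwParams.write_ini_file(section="LOGIN_COOKIE", option=option, value=cookieValue)

    def set_cookie(self, option):
        """
        Add the saved cookies to the browser so pages can be opened without logging in.
        :param option: option name (key) of the cookies to load
        :return: None
        """
        import ast  # local import so this fragment is self-contained

        cookieStr = rwParams.read_ini_file(section="LOGIN_COOKIE", option=option)
        # Restore "%" and parse the text back into the original list of dicts.
        # ast.literal_eval only accepts Python literals, unlike eval.
        cookieValue = ast.literal_eval(cookieStr.replace("$", "%"))
        # The saved value is a list, so add the cookies one by one
        for cookie in cookieValue:
            self.driver.add_cookie(cookie)
        Log.info("Cookies added to the browser: {}...".format(cookieStr[:50]))

    def del_cookie(self):
        """Delete all cookies."""
        self.driver.delete_all_cookies()
        Log.info("All cookies in the current session deleted")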
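
As an alternative to turning the cookie list into a string with str() and parsing it back, the list could be serialized as JSON. This is a minimal sketch, not the author's method: FakeDriver is a made-up stand-in that only mimics the get_cookies and add_cookie methods of a Selenium driver, and dump_cookies / restore_cookies are hypothetical helper names.

```python
import json


class FakeDriver:
    """Hypothetical stand-in for a Selenium WebDriver; only cookie methods are mimicked."""

    def __init__(self, cookies=None):
        self._cookies = list(cookies or [])

    def get_cookies(self):
        # Return copies so callers cannot mutate the internal state
        return [dict(cookie) for cookie in self._cookies]

    def add_cookie(self, cookie):
        self._cookies.append(dict(cookie))


def dump_cookies(driver):
    """Serialize the driver's cookies to a single-line JSON string."""
    return json.dumps(driver.get_cookies())


def restore_cookies(driver, text):
    """Parse the JSON string and add every cookie back to the driver."""
    for cookie in json.loads(text):
        driver.add_cookie(cookie)


# Usage: copy cookies from a logged-in session into a fresh one
logged_in = FakeDriver([{"name": "sid", "value": "a%2Fb", "secure": True}])
saved = dump_cookies(logged_in)

fresh = FakeDriver()
restore_cookies(fresh, saved)
```

The JSON text still contains any % characters from the cookie values, so writing it to an .ini file needs the same care as before. With a real browser, the page for the cookie's domain generally has to be open before add_cookie is called.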

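3. The simplest approach: ask the developers to set up a universal CAPTCHA code in the project code, then just enter that code.

4. Clean up the CAPTCHA image and recognize it with OCR. The denoising step can be run several times on the same image, and the image can also be cut into pieces so that each character is recognized on its own. For now this only works for CAPTCHAs that are not distorted and whose interference lines are not thick.

# Approach: convert the CAPTCHA image to black and white -> for every pixel, check the colors of its 8 neighbors -> if more than 4 of them are the background color, treat the pixel as background -> run the OCR tool on the processed image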
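
The first steps of this approach can be tried on a tiny grid before touching real images. The sketch below is a pure-Python illustration that uses nested lists in place of a Pillow image, so the function names and the sample grid are assumptions made for the demo. Unlike an in-place pass, it reads neighbors from the original grid and writes to a copy, and it skips the outermost rows and columns.

```python
from collections import Counter


def binarize(gray, threshold=150):
    """Map pixels brighter than the threshold to 0 and all others to 255."""
    return [[0 if value > threshold else 255 for value in row] for row in gray]


def denoise(grid):
    """Set a pixel to background when more than 4 of its 8 neighbors are background."""
    # The most frequent value is taken to be the background
    background = Counter(v for row in grid for v in row).most_common(1)[0][0]
    result = [row[:] for row in grid]
    # Border pixels are skipped so every neighbor lies inside the grid
    for y in range(1, len(grid) - 1):
        for x in range(1, len(grid[0]) - 1):
            neighbors = [
                grid[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
                if (dx, dy) != (0, 0)
            ]
            if neighbors.count(background) > 4:
                result[y][x] = background
    return result


# 220 = light paper, 30 = dark ink: one stray speck and one 3x3 blob
gray = [
    [220, 220, 220, 220, 220, 220, 220, 220, 220],
    [220, 30, 220, 220, 220, 30, 30, 30, 220],
    [220, 220, 220, 220, 220, 30, 30, 30, 220],
    [220, 220, 220, 220, 220, 30, 30, 30, 220],
    [220, 220, 220, 220, 220, 220, 220, 220, 220],
]
cleaned = denoise(binarize(gray))
```

On this sample the isolated speck disappears, but the four corners of the blob are removed as well, which hints at why thin strokes can suffer when the pass is repeated many times.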

# -*- coding: utf-8 -*-

# @Author  : Mr.Deng
# @Time    : 2021/3/21 14:34

"""
Image recognition: read the CAPTCHA text on the login page.
"""
import os

from util.basePages import BasePages
from config.filePathConfig import FilePathConfig
from config.varConfig import SysConfig as SC

from PIL import Image
from pytesseract import pytesseract


class ImageRecognize:

    def __init__(self, driver):
        self.base = BasePages(driver)
        # Build the path with os.path.join instead of a hard-coded "\\" separator
        self.imageSavePath = os.path.join(FilePathConfig.CODE_IMG_SAVE_PATH, "shotCode.png")

    def save_code_image(self, elementPath, zoomNum=1.25):
        """
        Take a screenshot and save the cropped CAPTCHA image.
        :param elementPath: locator of the CAPTCHA image element
        :param zoomNum: display scaling factor of the screen, e.g. 125% -> 1.25
        :return: the cropped CAPTCHA image
        """
        self.base.driver.origin_driver.get_screenshot_as_file(self.imageSavePath)
        imageData = self.base.driver.get_location(elementPath)
        # The element's position and size must be multiplied by the display scaling factor
        left = imageData["x"] * zoomNum
        top = imageData["y"] * zoomNum
        right = left + imageData["width"] * zoomNum
        bottom = top + imageData["height"] * zoomNum
        self.imageObj = Image.open(self.imageSavePath)
        codeImage = self.imageObj.crop((left, top, right, bottom))
        codeImage.save(self.imageSavePath)
        return codeImage

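    def binarization_image(self, image):
        """
        Binarize the CAPTCHA image into pure black and white.
        :param image: PIL image object
        :return: the binarized grayscale image
        """
        imageCode = image.convert("L")
        pixelData = imageCode.load()
        width, height = image.size  # PIL reports size as (width, height)
        threshold = 150  # grayscale cut-off
        for x in range(width):
            for y in range(height):
                # Pixels brighter than the threshold become black (0), the rest white (255)
                if pixelData[x, y] > threshold:
                    pixelData[x, y] = 0
                else:
                    pixelData[x, y] = 255
        return imageCode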

    def delete_noisy_point(self, image):
        """
        Denoise: remove stray pixels left by interference lines.
        :param image: binarized PIL image object
        :return: the denoised image
        """
        pixelData = image.load()
        width, height = image.size
        # Count each color: the more frequent one is the background, the other is the CAPTCHA text
        background = max(image.getcolors(), key=lambda item: item[0])[1]
        # The 8 neighbors: up, down, left, right, and the four diagonals
        offsets = [(0, -1), (0, 1), (-1, 0), (1, 0), (-1, -1), (-1, 1), (1, -1), (1, 1)]
        # Skip the outermost rows and columns so every neighbor lies inside the image
        for x in range(1, width - 1):
            for y in range(1, height - 1):
                count = sum(1 for dx, dy in offsets if pixelData[x + dx, y + dy] == background)
                # More than 4 background neighbors: treat this pixel as background too
                if count > 4:
                    pixelData[x, y] = background
        image.save(self.imageSavePath.replace("shotCode", "ProcessedImage"))  # save the processed CAPTCHA
        return image

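    def image_str(self, image):
        """Run OCR on the CAPTCHA image after binarization and denoising."""
        img = self.binarization_image(image)
        afterSpotImg = self.delete_noisy_point(img)
        # Point pytesseract at the Tesseract executable configured for the project
        pytesseract.tesseract_cmd = SC.PYTESSERACT_OCR
        # Image to text; strip the whitespace that surrounds the OCR output
        result = pytesseract.image_to_string(afterSpotImg).strip()
        return result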


if __name__ == '__main__':

    from util.pySelenium import PySelenium

    p = PySelenium(openType="pc")
    # p.open_url(url="https://XXXX/admin/login?redirect=%2Fadmin%2Fdashboard")
    # p.sleep(2)
    # IM = ImageRecognize(p).save_code_image('xpath->//div[@class="imgs"]/img')
    code = ImageRecognize(p).image_str(image=Image.open(r"C:\Users\kk\Desktop\下载.jpg"))
    print(code)
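
Item 4 mentions cutting the image so that each character is recognized separately, but the article does not show how. One common way is a column projection: scan the columns of the binarized image and start a new piece wherever a run of columns containing text pixels begins. The sketch below is an assumption-laden illustration on nested lists, not the author's code; split_columns is a hypothetical name, and it only works when the characters do not touch and the noise has already been removed.

```python
def split_columns(grid, background=0):
    """Return (start, end) column ranges, end exclusive, that contain non-background pixels."""
    width = len(grid[0])
    # True for every column that holds at least one text pixel
    has_ink = [any(row[x] != background for row in grid) for x in range(width)]
    spans = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x  # a new character begins here
        elif not ink and start is not None:
            spans.append((start, x))  # the character ended on the previous column
            start = None
    if start is not None:
        spans.append((start, width))  # the last character touches the right edge
    return spans


# Two separate shapes: columns 1-2 and column 5
sample = [
    [0, 255, 255, 0, 0, 255, 0],
    [0, 255, 0, 0, 0, 255, 0],
    [0, 0, 255, 0, 0, 255, 0],
]
pieces = split_columns(sample)
```

With Pillow, each (start, end) pair could then be turned into a crop box of the form (start, 0, end, height) and passed to Image.crop before running OCR on the piece.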
