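I. Introduction
Shuffled-jigsaw verification is a relatively uncommon form of CAPTCHA defense. Drag-the-slider CAPTCHAs are far more common on the market, and quite a few of them have been thoroughly defeated, mostly by putting a lot of effort into simulating the behavioral trajectory. This article does not cover trajectory simulation; it studies only how to restore the puzzle.
For the experiment we pick a fairly widespread product, the Dingxiang (顶像) shuffled-jigsaw CAPTCHA. It is advertised with a defense rating of 4 stars and a user-experience rating of 3 stars. Our study finds that the puzzle can be restored to a very high degree and that the approach is quite simple. The restoration process is explained step by step below.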
II. Environment Setup
1. Dependencies
- Page automation and sample collection: selenium
- Feature matching: python + opencv
2. Installing the environment
!pip install setuptools
!pip install selenium
!pip install numpy matplotlib
!pip install opencv-python
!pip install pillow
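As a quick sanity check after installation, the sketch below reports which of the modules used later in this article cannot be found. Note that the import names differ from some package names: opencv-python is imported as cv2 and Pillow as PIL. The helper name missing_modules and the REQUIRED list are illustrative, not part of any library.

```python
import importlib.util

# Import names of the packages used in this article
# (opencv-python is imported as cv2, Pillow as PIL).
REQUIRED = ["selenium", "numpy", "matplotlib", "cv2", "PIL"]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("missing:", missing if missing else "none")
```

If anything is listed as missing, rerun the corresponding pip install line before continuing.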
3. Downloading chromedriver
Find the webdriver that matches your browser version and system platform; on macOS, storing it in /usr/local/bin is recommended.
!wget https://npm.taobao.org/mirrors/chromedriver/95.0.4638.69/chromedriver_mac64.zip
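A mismatch between the browser and the driver is a common reason the session fails to start. The sketch below compares the major version numbers of the two. It assumes the driver binary is named chromedriver and is on the PATH, and that you copy the browser version string by hand (for example from chrome://version). The helper names are illustrative.

```python
import re
import shutil
import subprocess


def major_version(version_text):
    """Return the major number of the first dotted version in the text, or None."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_text)
    return int(match.group(1)) if match else None


def same_major(browser_text, driver_text):
    """True when both strings contain a version with the same major number."""
    browser_major = major_version(browser_text)
    driver_major = major_version(driver_text)
    return browser_major is not None and browser_major == driver_major


def installed_driver_version(binary="chromedriver"):
    """Return the output of `chromedriver --version`, or None if it is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__":
    driver = installed_driver_version()
    print("driver:", driver)
    if driver is not None:
        # The browser string here is only an example value; paste your own.
        print("match:", same_major("Google Chrome 95.0.4638.54", driver))
```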
III. Collecting Samples
Import the dependencies and use webdriver to open the product demo page on the official website.
import os
import cv2
import time
import urllib.request
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from selenium import webdriver
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
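Next, write the code that downloads a sample. The main flow is to open the demo page on the official site, locate the CAPTCHA image, and save it.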
# Collection code
from selenium.webdriver.chrome.service import Service

class CrackPuzzleCaptcha():
    # Initialize the webdriver
    def init(self):
        self.url = 'https://www.dingxiang-inc.com/business/captcha'
        chrome_options = webdriver.ChromeOptions()
        # chrome_options.add_argument("--start-maximized")
        # Exclude the automation switches (hides the "controlled by automated software" infobar)
        chrome_options.add_experimental_option("excludeSwitches", ["ignore-certificate-errors", "enable-automation"])
        path = r'/usr/local/bin/chromedriver'  # macOS
        # path = r'D:\Anaconda3\chromedriver.exe'  # Windows
        # Selenium 4 style: the driver path goes through Service, the options through `options`
        self.browser = webdriver.Chrome(service=Service(path), options=chrome_options)
        # Set up an explicit wait with a 20-second timeout
        self.wait = WebDriverWait(self.browser, 20)
        self.browser.get(self.url)

    # Open the CAPTCHA demo page and force the element into the browser's visible area
    def openTest(self):
        time.sleep(1)
        self.browser.execute_script('setTimeout(function(){document.querySelector("body > div.wrapper-main > div.wrapper.wrapper-content > div > div.captcha-intro > div.captcha-intro-header > div > div > ul > li.item-8").click();},0)')
        self.browser.execute_script('setTimeout(function(){document.querySelector("body > div.wrapper-main > div.wrapper.wrapper-content > div > div.captcha-intro > div.captcha-intro-body > div > div.captcha-intro-demo").scrollIntoView();},0)')
        time.sleep(1)

    # Locate the original image (WebP format) and download it directly
    def download(self):
        onebtn = self.browser.find_element(By.CSS_SELECTOR, '#dx_captcha_oneclick_bar-logo_2 > span')
        # Hover over the one-click bar so that the puzzle is displayed
        ActionChains(self.browser).move_to_element(onebtn).perform()
        time.sleep(1)
        # Download the WebP image
        img_url = self.browser.find_element(By.CSS_SELECTOR, '#dx_captcha_jigsaw_fragment-top-left_3 > img').get_attribute("src")
        img_address = "test.png"  # sample file (the bytes are WebP despite the .png name)
        response = urllib.request.urlopen(img_url)
        img = response.read()
        with open(img_address, 'wb') as f:
            f.write(img)
        print('Saved', img_address)
        return self.browser

    def crack(self):
        pass
Start collecting:
crack = CrackPuzzleCaptcha()
crack.init()
crack.openTest()
browser2 = crack.download()
Saved test.png
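Because the downloaded bytes are written to test.png without any conversion, the file extension says nothing about the real format. The sketch below, using only the standard library, inspects the leading bytes of a file to tell WebP, PNG, and JPEG apart. The function names sniff_image_format and sniff_file are illustrative, and only these three signatures are checked.

```python
import os


def sniff_image_format(header: bytes) -> str:
    """Guess the image format from the first bytes of a file."""
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "webp"
    # PNG files begin with a fixed 8-byte signature.
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    # JPEG files begin with the SOI marker followed by another marker byte.
    if header.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 12 bytes of a file and guess its image format."""
    with open(path, "rb") as f:
        return sniff_image_format(f.read(12))


if __name__ == "__main__":
    sample = "test.png"  # the sample saved by the collection step
    if os.path.exists(sample):
        print(sample, "is actually:", sniff_file(sample))
```

OpenCV's cv2.imread decides the decoder from the file content rather than the extension, so the misnamed sample should still load as long as the installed OpenCV build includes WebP support.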
IV. Findings
- Key point 1: the source image behind the displayed puzzle is itself already in the shuffled state