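In recent years, Geetest CAPTCHAs have been widely deployed as an advanced verification technique. For crawler developers, however, getting past them remains a major challenge. This article walks through how to use the D language, together with image processing and browser automation, to solve the third-generation Geetest slider CAPTCHA.
I. Prerequisites
1. Install the dependencies
To implement the solver we need the requests, arsd, dlib, and selenium libraries. Add them as dependencies in DUB's dub.json file: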
json
{
    "dependencies": {
        "requests": "~>0.8.1",
        "arsd-official": "~>10.7.0",
        "dlib": "~>0.19.0",
        "selenium": "~>0.2.1"
    }
}
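II. Implementation steps
1. Fetch the CAPTCHA images
First, we use the requests library to download the CAPTCHA images and save them locally. These images are the input to the image-processing step that follows.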
d
import requests;
import std.exception : enforce;
import std.file : mkdirRecurse, write;
import std.path : buildPath, dirName;

// Download `url` and save the response body to `path`.
void downloadImage(string url, string path) {
    auto rq = Request();
    auto response = rq.get(url);
    enforce(response.code == 200, "Failed to download image");
    mkdirRecurse(dirName(path)); // make sure the target directory exists
    write(path, response.responseBody.data);
}

void main() {
    string bgUrl = "https://static.geetest.com/pictures/gt/3999642ae/3999642ae.webp";
    string fullBgUrl = "https://static.geetest.com/pictures/gt/3999642ae/bg/fbdb18152.webp";
    string bgPath = buildPath("images", "bg_image.webp");
    string fullBgPath = buildPath("images", "full_bg_image.webp");

    downloadImage(bgUrl, bgPath);
    downloadImage(fullBgUrl, fullBgPath);
}
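2. Image processing
Next, we load both images with the dlib library and compare them column by column to locate the gap that the slider piece must fill.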
d
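import dlib.image;
import std.exception : enforce;
import std.math : abs;
import std.path : buildPath;
import std.stdio : writeln;

// Return the x coordinate of the first column in which the two images differ
// by more than `threshold` in any colour channel, or -1 if there is none.
int findGap(string bgImagePath, string fullBgImagePath, float threshold = 0.25f) {
    auto bgImage = loadImage(bgImagePath);
    auto fullBgImage = loadImage(fullBgImagePath);
    enforce(bgImage.width == fullBgImage.width && bgImage.height == fullBgImage.height,
            "The two images must have the same dimensions");

    foreach (x; 0 .. bgImage.width) {
        foreach (y; 0 .. bgImage.height) {
            auto a = bgImage[x, y];     // Color4f, channels in 0..1
            auto b = fullBgImage[x, y];
            // A tolerance is needed because lossy compression adds small differences.
            if (abs(a.r - b.r) > threshold || abs(a.g - b.g) > threshold
                    || abs(a.b - b.b) > threshold) {
                return cast(int) x;
            }
        }
    }
    return -1;
}

void main() {
    // Assumption: the downloaded .webp files were converted to PNG beforehand,
    // since loadImage may not decode WebP.
    string bgPath = buildPath("images", "bg_image.png");
    string fullBgPath = buildPath("images", "full_bg_image.png");
    int gapPosition = findGap(bgPath, fullBgPath);
    writeln("Gap position: ", gapPosition);
}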
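Lossy compression means that two images rarely match value for value, so an exact pixel comparison tends to report a difference far too early. The sketch below illustrates a tolerance-based column scan in Python, the language this article uses for its helper scripts. It operates on plain nested lists of RGB tuples rather than on image files, and the names `find_gap_column` and `threshold`, as well as the default of 60, are illustrative assumptions and not part of the original code.

```python
def find_gap_column(bg, full_bg, threshold=60):
    """Return the first x where any pixel differs by more than threshold, else -1.

    bg and full_bg are row-major grids: grid[y][x] == (r, g, b), values 0-255.
    """
    height = len(bg)
    width = len(bg[0]) if height else 0
    for x in range(width):
        for y in range(height):
            p, q = bg[y][x], full_bg[y][x]
            # Compare channel by channel and ignore small compression noise.
            if any(abs(a - b) > threshold for a, b in zip(p, q)):
                return x
    return -1


# Tiny synthetic example: a 4x3 grey image with a dark "gap" pixel in column 2.
full_bg = [[(200, 200, 200)] * 4 for _ in range(3)]
bg = [row[:] for row in full_bg]
bg[1][2] = (40, 40, 40)      # gap pixel
bg[0][0] = (195, 203, 198)   # noise that stays below the threshold
print(find_gap_column(bg, full_bg))  # prints 2
```

The threshold is a trade-off: too low and compression noise is mistaken for the gap, too high and a faint gap edge is missed, so it is worth tuning against a few real image pairs.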
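3. Simulate dragging the slider
To imitate the way a person drags the slider, we generate the drag track in Python and then call the Python script from D. Selenium drives the browser and performs the drag.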
python
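# generate_tracks.py
import json
import sys


def bezier_curve(t):
    # Cubic easing curve: rises quickly at first, then slows down towards t = 1.
    return 3 * t * (1 - t)**2 + 3 * (1 - t) * t**2 + t**3


def generate_tracks(distance):
    # Return 101 cumulative x positions going from 0 to `distance`.
    tracks = []
    for i in range(101):
        t = i / 100
        x = int(bezier_curve(t) * distance)
        tracks.append(x)
    return tracks


if __name__ == "__main__":
    distance = int(sys.argv[1])
    tracks = generate_tracks(distance)
    print(json.dumps(tracks))  # JSON, so the caller can parse it reliably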
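`generate_tracks` returns cumulative positions, whereas Selenium's `move_by_offset` moves relative to the current pointer position. A minimal sketch of the conversion follows. `to_offsets` and `ease_out` are hypothetical helper names, and `ease_out` is the same polynomial as `bezier_curve`, written in its equivalent closed form 1 - (1 - t)^3.

```python
def ease_out(t):
    # Same curve as bezier_curve(t): 3t(1-t)^2 + 3(1-t)t^2 + t^3 == 1 - (1-t)^3
    return 1 - (1 - t) ** 3


def to_offsets(positions):
    """Turn cumulative x positions into per-step relative moves."""
    offsets = []
    previous = 0
    for x in positions:
        offsets.append(x - previous)
        previous = x
    return offsets


distance = 100
positions = [int(ease_out(i / 100) * distance) for i in range(101)]
offsets = to_offsets(positions)
print(sum(offsets) == distance)  # the relative moves add up to the full distance
```

Feeding the cumulative positions straight into `move_by_offset` would overshoot the gap by a wide margin, because every call would add the whole distance travelled so far.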
python
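# simulate_drag.py
import json
import subprocess
import sys
import time

from selenium import webdriver
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By

# Gap position: first command-line argument if given, otherwise a placeholder.
# In practice this value comes from the D program.
gap_position = int(sys.argv[1]) if len(sys.argv) > 1 else 100

# Generate the drag track (cumulative x positions).
result = subprocess.run([sys.executable, 'generate_tracks.py', str(gap_position)],
                        stdout=subprocess.PIPE, check=True)
tracks = json.loads(result.stdout.decode('utf-8'))

# Use Selenium to simulate dragging the slider.
browser = webdriver.Chrome()
try:
    browser.get('https://account.ch.com/NonRegistrations-Regist')
    knob = browser.find_element(By.CLASS_NAME, 'gt_slider_knob')
    ActionChains(browser).click_and_hold(knob).perform()
    previous = 0
    for position in tracks:
        # move_by_offset is relative, so send only the step since the last position.
        ActionChains(browser).move_by_offset(position - previous, 0).perform()
        previous = position
        time.sleep(0.02)  # brief pause to imitate human movement
    ActionChains(browser).release().perform()
finally:
    browser.quit()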
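4. Call the Python scripts to generate the track and drag the slider
In D, we use std.process to run the Python scripts, which generate the drag track and drive the browser through the slider drag.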
d
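import std.conv : to;
import std.exception : enforce;
import std.process : execute;
import std.stdio : writeln;

void main() {
    int gapPosition = 100; // obtained from the image-processing step

    // Generate the drag track and capture the script's output.
    auto tracks = execute(["python3", "generate_tracks.py", gapPosition.to!string]);
    enforce(tracks.status == 0, "Failed to generate tracks");
    writeln("Tracks generated: ", tracks.output);

    // Run the Selenium script that performs the drag.
    auto drag = execute(["python3", "simulate_drag.py", gapPosition.to!string]);
    enforce(drag.status == 0, "Failed to execute simulate_drag.py");
}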