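I. Overview of Cracking the Geetest CAPTCHA
The main difficulty in cracking the Geetest CAPTCHA lies in its complex request parameters and obfuscated code. We sidestep both by combining image processing with simulated human interaction.
II. Environment Setup
First, we need the following Java libraries: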
Selenium
Apache Commons IO
Apache Commons Lang
JSON (org.json)
These dependencies can be added to the project through Maven:
xml
<dependencies>
    <dependency>
        <groupId>org.seleniumhq.selenium</groupId>
        <artifactId>selenium-java</artifactId>
        <version>3.141.59</version>
    </dependency>
    <dependency>
        <groupId>commons-io</groupId>
        <artifactId>commons-io</artifactId>
        <version>2.8.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-lang3</artifactId>
        <version>3.12.0</version>
    </dependency>
    <dependency>
        <groupId>org.json</groupId>
        <artifactId>json</artifactId>
        <version>20210307</version>
    </dependency>
</dependencies>
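III. The Browser-Driven Approach
Compared with a traditional request-based crawler, driving a real browser is more direct and effective. The approach has two steps:
Use image recognition to locate the slider and the gap;
Simulate a mouse drag that moves the slider to the gap.
Next, we implement this approach in detail.
IV. Processing the CAPTCHA Image
1. Parsing the tile positions
The CAPTCHA image is served as shuffled tiles and has to be restored according to the CSS styles. First we fetch the CAPTCHA image and parse the tile positions.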
java
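import org.apache.commons.io.IOUtils;
import org.json.JSONArray;
import javax.imageio.ImageIO;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.net.URL;

public class SliderCaptcha {
    // Download an image. ImageIO has no built-in WebP reader, so read() returns null
    // for .webp URLs unless a WebP ImageIO plugin is on the classpath.
    public static BufferedImage getImage(String url) throws IOException {
        ByteArrayInputStream bis = new ByteArrayInputStream(IOUtils.toByteArray(new URL(url)));
        BufferedImage image = ImageIO.read(bis);
        if (image == null) {
            throw new IOException("Unsupported image format: " + url);
        }
        return image;
    }

    // Restore the shuffled image: 2 rows of 26 tiles, each 10x58 px.
    // css holds one "x y" position string per tile, in display order.
    public static BufferedImage mergeImage(BufferedImage image, JSONArray css) {
        int width = 260;
        int height = 116;
        BufferedImage mergedImage = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = mergedImage.createGraphics();
        for (int i = 0; i < css.length(); i++) {
            String[] positions = css.getString(i).replace("px", "").trim().split("\\s+");
            // CSS sprite offsets are typically negative, and getSubimage rejects
            // negative coordinates, so take the absolute value
            int x = Math.abs(Integer.parseInt(positions[0]));
            int y = Math.abs(Integer.parseInt(positions[1]));
            BufferedImage piece = image.getSubimage(x, y, 10, 58);
            g.drawImage(piece, (i % 26) * 10, (i / 26) * 58, null);
        }
        g.dispose();
        return mergedImage;
    }

    // Return the first column where the two images differ noticeably, or -1 if none does.
    public static int getOffset(BufferedImage image, BufferedImage bgImage) {
        for (int x = 0; x < image.getWidth(); x++) {
            for (int y = 0; y < image.getHeight(); y++) {
                if (isDifferent(image.getRGB(x, y), bgImage.getRGB(x, y), 50)) {
                    return x;
                }
            }
        }
        return -1;
    }

    // Compare each color channel separately; the difference of two packed RGB ints
    // says nothing about how far apart the colors are.
    private static boolean isDifferent(int rgb1, int rgb2, int threshold) {
        for (int shift = 0; shift <= 16; shift += 8) {
            if (Math.abs(((rgb1 >> shift) & 0xFF) - ((rgb2 >> shift) & 0xFF)) > threshold) {
                return true;
            }
        }
        return false;
    }
}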
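A packed RGB int keeps red in bits 16 to 23, green in bits 8 to 15 and blue in bits 0 to 7, so subtracting two packed values does not measure how different two colors look. The sketch below illustrates a per-channel comparison on synthetic images. It is only an illustration under assumptions: the class `ChannelDiffSketch`, its method names, the image size and the gray levels are all invented here for the demonstration.

```java
import java.awt.image.BufferedImage;

final class ChannelDiffSketch {
    // Largest absolute difference over the red, green and blue channels.
    static int maxChannelDiff(int rgb1, int rgb2) {
        int max = 0;
        for (int shift = 0; shift <= 16; shift += 8) {
            int c1 = (rgb1 >> shift) & 0xFF;
            int c2 = (rgb2 >> shift) & 0xFF;
            max = Math.max(max, Math.abs(c1 - c2));
        }
        return max;
    }

    // First column where any pixel differs by more than threshold, or -1 if none does.
    static int firstDifferingColumn(BufferedImage a, BufferedImage b, int threshold) {
        int width = Math.min(a.getWidth(), b.getWidth());
        int height = Math.min(a.getHeight(), b.getHeight());
        for (int x = 0; x < width; x++) {
            for (int y = 0; y < height; y++) {
                if (maxChannelDiff(a.getRGB(x, y), b.getRGB(x, y)) > threshold) {
                    return x;
                }
            }
        }
        return -1;
    }

    // Uniform light-gray image; if gapX >= 0, a darker 10x10 square starting at
    // column gapX stands in for the gap (purely synthetic test data).
    static BufferedImage syntheticImage(int width, int height, int gapX) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < width; x++) {
            for (int y = 0; y < height; y++) {
                boolean inGap = gapX >= 0 && x >= gapX && x < gapX + 10 && y >= 20 && y < 30;
                img.setRGB(x, y, inGap ? 0x404040 : 0xC0C0C0);
            }
        }
        return img;
    }
}
```

With a synthetic gap drawn at column 25, `firstDifferingColumn(withGap, full, 50)` returns 25. By contrast, the colors 0x010000 and 0x000000 differ by 65536 as packed integers even though their red channels differ by only 1.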
V. Simulating the Slider Drag
Once we have the slider offset, we can simulate dragging the slider with Selenium.
java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.interactions.Actions;

public class SliderCaptchaAutomation {
    public static void dragAndDrop(WebDriver driver, int offset) {
        WebElement slider = driver.findElement(By.className("gt_slider_knob"));
        Actions actions = new Actions(driver);
        // Press and hold the slider knob
        actions.clickAndHold(slider).perform();
        // Move in full 10 px steps without overshooting the target
        for (int move = 0; move + 10 <= offset; move += 10) {
            actions.moveByOffset(10, 0).perform();
        }
        // Cover the remaining distance (less than 10 px)
        actions.moveByOffset(offset % 10, 0).perform();
        // Pause briefly, then release the mouse button
        actions.pause(500).release().perform();
    }
}
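The step arithmetic in a drag loop is easy to get wrong by one step, so it can help to isolate it in a pure function that can be checked without a browser. The following is a minimal sketch; `DragSteps` and `splitIntoSteps` are hypothetical names introduced here and are not part of Selenium.

```java
import java.util.ArrayList;
import java.util.List;

final class DragSteps {
    // Split a non-negative offset into moves of at most stepSize pixels.
    // The returned moves always sum to offset.
    static List<Integer> splitIntoSteps(int offset, int stepSize) {
        if (offset < 0 || stepSize <= 0) {
            throw new IllegalArgumentException("offset must be >= 0 and stepSize > 0");
        }
        List<Integer> steps = new ArrayList<>();
        int remaining = offset;
        while (remaining > 0) {
            int step = Math.min(stepSize, remaining);
            steps.add(step);
            remaining -= step;
        }
        return steps;
    }
}
```

For an offset of 25 and a step size of 10, the moves are 10, 10 and 5. Each value could then be passed to `actions.moveByOffset(step, 0)`.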
VI. Making the Drag Trajectory More Human-Like
To simulate more natural dragging behavior, we introduce easing functions.
java
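import java.util.ArrayList;
import java.util.List;

public class Easing {
    // Quadratic ease-out: fast at the start, decelerating toward the end
    public static double easeOutQuad(double x) {
        return 1 - (1 - x) * (1 - x);
    }

    // Exponential ease-out; x >= 1 is clamped so the curve ends exactly at 1
    public static double easeOutExpo(double x) {
        if (x >= 1) {
            return 1;
        } else {
            return 1 - Math.pow(2, -10 * x);
        }
    }

    // Split distance into per-step moves, one per 0.1 s, following the easing curve.
    // The last step evaluates the curve at 1, so the moves sum to distance.
    public static List<Integer> getTracks(int distance, double seconds, String easingFunc) {
        List<Integer> tracks = new ArrayList<>();
        int steps = Math.max(1, (int) Math.round(seconds / 0.1));
        int previous = 0;
        for (int i = 1; i <= steps; i++) {
            // An integer counter avoids the drift of repeatedly adding 0.1
            double t = (double) i / steps;
            double eased;
            if ("easeOutQuad".equals(easingFunc)) {
                eased = easeOutQuad(t);
            } else if ("easeOutExpo".equals(easingFunc)) {
                eased = easeOutExpo(t);
            } else {
                throw new IllegalArgumentException("Unknown easing function: " + easingFunc);
            }
            int offset = (int) Math.round(eased * distance);
            tracks.add(offset - previous);
            previous = offset;
        }
        return tracks;
    }
}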
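As an alternative design, the easing curve can be passed as a function rather than a string name, so that a misspelled name becomes a compile-time error. The sketch below shows one way to do this under assumed names: `TrackSketch`, `buildTracks` and the `steps` parameter are introduced here for illustration. It evaluates the curve at i/steps for i = 1..steps, which makes the moves add up to the full distance whenever the curve reaches 1 at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleUnaryOperator;

final class TrackSketch {
    // Quadratic ease-out, the same formula as easeOutQuad in the article.
    static double easeOutQuad(double x) {
        return 1 - (1 - x) * (1 - x);
    }

    // Per-step moves whose running total follows easing(i / steps) * distance.
    // Assumes easing.applyAsDouble(1.0) == 1.0, so the moves sum to distance.
    static List<Integer> buildTracks(int distance, int steps, DoubleUnaryOperator easing) {
        if (steps <= 0) {
            throw new IllegalArgumentException("steps must be positive");
        }
        List<Integer> tracks = new ArrayList<>();
        int previous = 0;
        for (int i = 1; i <= steps; i++) {
            double progress = easing.applyAsDouble((double) i / steps);
            int current = (int) Math.round(progress * distance);
            tracks.add(current - previous);
            previous = current;
        }
        return tracks;
    }
}
```

Calling `TrackSketch.buildTracks(100, 10, TrackSketch::easeOutQuad)` yields ten moves that start at 19 px and shrink to 1 px, which is the decelerating profile an ease-out curve is meant to produce.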
VII. Putting It All Together
We can now combine the parts above to complete the crack of the Geetest CAPTCHA.
java
import org.json.JSONArray;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import java.awt.image.BufferedImage;
import java.io.IOException;

public class GeetestCrack {
    public static void main(String[] args) throws IOException {
        System.setProperty("webdriver.chrome.driver", "/path/to/chromedriver");
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("https://account.ch.com/NonRegistrations-Regist");
            // Simulate user behavior: click and enter information
            // (related code omitted...)
            // Fetch the CAPTCHA images
            BufferedImage image = SliderCaptcha.getImage("https://static.geetest.com/pictures/gt/3999642ae/3999642ae.webp");
            BufferedImage bgImage = SliderCaptcha.getImage("https://static.geetest.com/pictures/gt/3999642ae/bg/fbdb18152.webp");
            // Parse the CSS and restore the images
            JSONArray css = new JSONArray("[... CSS positions ...]");
            BufferedImage mergedImage = SliderCaptcha.mergeImage(image, css);
            BufferedImage mergedBgImage = SliderCaptcha.mergeImage(bgImage, css);
            // Compute the offset
            int offset = SliderCaptcha.getOffset(mergedImage, mergedBgImage);
            // Simulate the slider drag
            SliderCaptchaAutomation.dragAndDrop(driver, offset);
        } finally {
            // Close the browser even if a step above fails
            driver.quit();
        }
    }
}