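This is a quick guide to bypassing CAPTCHAs with Puppeteer by making the automated browser look and behave more like a human visitor. If you would rather skip the guide, you can sign up for Bright Data and choose the Web Unlocker API.
Step 1: Project setup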
mkdir bypass_captcha_puppeteer
cd bypass_captcha_puppeteer
npm init -y
npm install puppeteer
Directory structure (create index.js yourself; npm init only generates package.json):
bypass_captcha_puppeteer/
├── index.js
└── package.json
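In package.json, add a "type" field and set it to "module" so that Node.js treats index.js as an ES module and the import statements below work: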
{
  "name": "bypass_captcha_puppeteer",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "type": "module",
  "scripts": {
    "start": "node index.js"
  },
  "dependencies": {
    "puppeteer": "^23.10.4"
  }
}
Step 2: Test Puppeteer (without Stealth)
import puppeteer from 'puppeteer';

const visitBotAnalyzerPage = async () => {
  let browser;
  try {
    // Launch a headless browser and open a new tab
    browser = await puppeteer.launch();
    const page = await browser.newPage();
    const url = 'https://bot.sannysoft.com/';
    console.log(`Navigating to ${url}...`);
    await page.goto(url, { waitUntil: 'networkidle2' });
    console.log('Taking full-page screenshot...');
    await page.screenshot({ path: 'anti-bot-analysis.png', fullPage: true });
    console.log('Screenshot taken');
  } catch (error) {
    console.error('An error occurred:', error);
  } finally {
    // Close the browser even if navigation or the screenshot failed
    if (browser) {
      await browser.close();
      console.log('Browser closed');
    }
  }
};

// Run the script
visitBotAnalyzerPage();
Run it:
node index.js
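The screenshot, anti-bot-analysis.png, shows the result of each detection check. A default Puppeteer session will likely fail some of them, and on real sites failures like these are what can trigger a CAPTCHA challenge.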
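To get a feel for why a default session is flagged, it can help to look at a few of the browser properties that detection pages tend to inspect. The sketch below is illustrative only: collectSignals and flagAutomationSignals are hypothetical helper names, and the three heuristics (the navigator.webdriver flag, a HeadlessChrome user agent, an empty plugin list) are commonly cited examples, not the actual rule set of bot.sannysoft.com or of any particular WAF.

```javascript
// Hypothetical helper: read a few fingerprint properties from a Puppeteer page.
// `page` is assumed to be a Page object returned by browser.newPage().
async function collectSignals(page) {
  return page.evaluate(() => ({
    webdriver: navigator.webdriver,
    userAgent: navigator.userAgent,
    pluginCount: navigator.plugins.length,
  }));
}

// Hypothetical helper: list the signals that look automated.
// These heuristics are assumptions for illustration, not an exhaustive rule set.
function flagAutomationSignals(signals) {
  const flags = [];
  if (signals.webdriver === true) {
    flags.push('navigator.webdriver is true');
  }
  if (/HeadlessChrome/.test(signals.userAgent || '')) {
    flags.push('user agent contains HeadlessChrome');
  }
  if (signals.pluginCount === 0) {
    flags.push('no browser plugins reported');
  }
  return flags;
}

// Example with a hand-written signals object (no browser needed)
const sample = {
  webdriver: true,
  userAgent:
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/131.0.0.0 Safari/537.36',
  pluginCount: 5,
};
console.log(flagAutomationSignals(sample));
```

Inside the script from this step, calling flagAutomationSignals(await collectSignals(page)) after page.goto(...) would report the flags for the live session instead of the hand-written sample.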
Step 3: Install the Stealth plugin
npm install puppeteer-extra puppeteer-extra-plugin-stealth
Replace the original puppeteer import with puppeteer-extra and register the plugin:
import puppeteer from 'puppeteer-extra';
import StealthPlugin from 'puppeteer-extra-plugin-stealth';

// Register the stealth plugin before launching the browser
puppeteer.use(StealthPlugin());

const visitBotAnalyzerPage = async () => {
  let browser;
  try {
    console.log('Launching browser in stealth mode...');
    browser = await puppeteer.launch();
    const page = await browser.newPage();
    const url = 'https://bot.sannysoft.com/';
    console.log(`Navigating to ${url}...`);
    await page.goto(url, { waitUntil: 'networkidle2' });
    console.log('Taking full-page screenshot...');
    await page.screenshot({ path: 'anti-bot-analysis.png', fullPage: true });
    console.log('Screenshot taken');
  } catch (error) {
    console.error('Error occurred:', error);
  } finally {
    // Close the browser even if navigation or the screenshot failed
    if (browser) {
      await browser.close();
      console.log('Browser closed');
    }
  }
};

// Run the script
visitBotAnalyzerPage();
Run it again:
node index.js
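Stealth mode reduces the likelihood of being identified as a bot and shown a CAPTCHA, but it does not guarantee that you will never see one.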
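Beyond hiding automation fingerprints, pacing interactions less mechanically is another way to mimic a human visitor. The following is a minimal sketch under stated assumptions: randomInt, humanPause, and typeLikeHuman are hypothetical helpers, the timing ranges are arbitrary, and nothing here is guaranteed to satisfy any particular detector. It relies only on the standard Puppeteer calls page.mouse.move, page.click, and page.type.

```javascript
// Return a random integer in [min, max], used to vary pauses between actions.
function randomInt(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

// Resolve after a random pause between minMs and maxMs milliseconds.
function humanPause(minMs = 300, maxMs = 1200) {
  return new Promise((resolve) => setTimeout(resolve, randomInt(minMs, maxMs)));
}

// Hypothetical helper: move the mouse in several steps, pause, click, then type slowly.
// `page` is assumed to be a Puppeteer Page; `selector` and `text` come from the caller.
async function typeLikeHuman(page, selector, text) {
  // Move to a random point in multiple intermediate steps instead of jumping
  await page.mouse.move(randomInt(100, 400), randomInt(100, 400), {
    steps: randomInt(10, 25),
  });
  await humanPause();
  await page.click(selector);
  await humanPause(200, 600);
  // `delay` is a per-key delay in milliseconds, picked at random for this call
  await page.type(selector, text, { delay: randomInt(60, 160) });
}
```

For example, after page.goto(...) you might call await typeLikeHuman(page, '#search', 'hello'), where '#search' is a placeholder selector for whatever input the target page actually has.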
If that is still not stealthy enough
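For more advanced WAFs and bot-detection mechanisms, you can try additional plugins (for example, puppeteer-extra-plugin-anonymize-ua) or turn to a dedicated tool. Bright Data's Web Unlocker API, for instance, handles reCAPTCHA, hCaptcha, and other challenge types.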
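As a rough illustration of the kind of problem a user-agent plugin addresses, the sketch below rewrites the user agent by hand. It is not the implementation of puppeteer-extra-plugin-anonymize-ua: stripHeadless and applyCleanUserAgent are hypothetical names, and the code assumes the browser's default user agent contains the token HeadlessChrome/. It uses the standard Puppeteer calls browser.userAgent() and page.setUserAgent().

```javascript
// Hypothetical helper: make a headless user agent read like a regular Chrome one.
function stripHeadless(userAgent) {
  return userAgent.replace('HeadlessChrome/', 'Chrome/');
}

// Hypothetical helper: apply the cleaned user agent to a Puppeteer page.
// `browser` and `page` are assumed to come from puppeteer.launch() and browser.newPage().
async function applyCleanUserAgent(browser, page) {
  const original = await browser.userAgent();
  await page.setUserAgent(stripHeadless(original));
}
```

The call to applyCleanUserAgent(browser, page) would need to happen before page.goto(...), since the user agent is sent with the navigation request.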