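1. Background
When scraping a target site with puppeteer, the site often detects the automation. Stock puppeteer, for example, leaves very obvious CDP traces, and a fingerprint that never changes can also trigger risk controls.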
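To make "detection" more concrete, here is a minimal sketch of the kind of signal a site can read from the page. The helper name collectAutomationHints and the two checks are illustrative assumptions, not what any particular site actually runs; real detection services inspect far more than this.

```javascript
// Hypothetical helper: inspects a navigator-like object for two well-known
// automation hints. Real detection sites check far more than this.
function collectAutomationHints(nav) {
  const hints = [];
  // navigator.webdriver is true when the browser reports it is automated.
  if (nav.webdriver === true) hints.push('webdriver');
  // Headless Chrome has historically advertised itself in the user agent.
  if (/HeadlessChrome/.test(nav.userAgent || '')) hints.push('headless-ua');
  return hints;
}

// Plain-object demo so the sketch runs in Node without a browser.
const automated = { webdriver: true, userAgent: 'Mozilla/5.0 HeadlessChrome/131.0.0.0' };
const regular = { webdriver: false, userAgent: 'Mozilla/5.0 Chrome/131.0.0.0' };
console.log(collectAutomationHints(automated));
console.log(collectAutomationHints(regular));
```

Inside a real page, the same values can be read with page.evaluate(() => navigator.webdriver) to see what the site sees.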
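2. Implementation
The few lines of code below are enough to pass most detection points and to switch fingerprints. My machine runs Windows with an AMD 7800XT graphics card. Results first:
2.1 Detection site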
https://www.browserscan.net/
Routine bot checks such as Webdriver and CDP:
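2.2 Code
The code is very short because it relies on open-source tools. Three dependencies need to be installed first: fingerprint-generator generates the fingerprint, fingerprint-injector injects it, and rebrowser-puppeteer is a patched build of puppeteer mainly used to bypass CDP detection.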
npm install fingerprint-generator fingerprint-injector rebrowser-puppeteer
Example JS:
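// Generates browser fingerprints
const { FingerprintGenerator } = require('fingerprint-generator');
// Injects a generated fingerprint into a page
const { FingerprintInjector } = require('fingerprint-injector');
// Patched puppeteer, mainly used to bypass CDP detection
const puppeteer = require('rebrowser-puppeteer');

(async () => {
  // Uncomment the options to restrict which fingerprints are generated
  const generator = new FingerprintGenerator(
    // {
    //   browsers: [
    //     {
    //       name: 'chrome',
    //       minVersion: 131,
    //       maxVersion: 131
    //     }
    //   ]
    // }
  );
  // Each call produces a new fingerprint
  const fingerprint = generator.getFingerprint();

  const browser = await puppeteer.launch({
    // executablePath: 'C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe',
    headless: false,
    args: [
      // Stops Blink from exposing the automation flag to the page
      '--disable-blink-features=AutomationControlled',
      '--disable-web-security'
    ],
    // Drop the default flag that marks the browser as automated
    ignoreDefaultArgs: ['--enable-automation']
  });

  const page = await browser.newPage();

  // Attach the fingerprint before navigating so it applies from the first load
  const injector = new FingerprintInjector();
  await injector.attachFingerprintToPuppeteer(page, fingerprint);

  // Open the detection site and wait until the network goes quiet
  await page.goto('https://www.browserscan.net/', {
    waitUntil: 'networkidle0',
    timeout: 600000
  });
})();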
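The commented-out options above can be filled in to keep generated fingerprints close to the real machine, here desktop Chrome on Windows as in my setup. This is only a sketch: buildGeneratorOptions is a hypothetical helper, and the devices and operatingSystems option names reflect my understanding of fingerprint-generator, so check them against the version you installed.

```javascript
// Hypothetical helper: builds an options object intended for
// `new FingerprintGenerator(options)`. The option names are assumptions
// to verify against the installed fingerprint-generator version.
function buildGeneratorOptions(chromeMajor) {
  return {
    // Only Chrome, at or above the given major version
    browsers: [{ name: 'chrome', minVersion: chromeMajor }],
    // Match the real machine: a desktop running Windows
    devices: ['desktop'],
    operatingSystems: ['windows']
  };
}

const options = buildGeneratorOptions(131);
console.log(JSON.stringify(options, null, 2));
```

Passing the result as new FingerprintGenerator(buildGeneratorOptions(131)) and calling getFingerprint() again for each new page should yield a different fingerprint each time while staying within those constraints.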