Can a website detect when you are using Selenium with chromedriver?

I've been testing Selenium with chromedriver and noticed that some pages can detect that you're using Selenium even when nothing is automated. Even when I browse manually, using Chrome through Selenium and Xephyr, I often get a page saying that suspicious activity was detected. I've checked my user agent and my browser fingerprint, and both are identical to those of a normal Chrome browser.

When I browse to these sites in normal Chrome everything works fine, but the moment I use Selenium I'm detected.

In theory, chromedriver and Chrome should look exactly the same to any web server, but somehow they can detect it.

If you want some test code, try this:

from pyvirtualdisplay import Display
from selenium import webdriver

# Run Chrome inside a visible virtual display
display = Display(visible=1, size=(1600, 902))
display.start()

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--disable-extensions')
chrome_options.add_argument('--profile-directory=Default')
chrome_options.add_argument('--incognito')
chrome_options.add_argument('--disable-plugins-discovery')
chrome_options.add_argument('--start-maximized')

# Recent Selenium releases take the options via `options=` (formerly `chrome_options=`)
driver = webdriver.Chrome(options=chrome_options)
driver.delete_all_cookies()
driver.set_window_size(800, 800)
driver.set_window_position(0, 0)
print('arguments done')
driver.get('http://stubhub.com')

If you browse around StubHub you'll get redirected and 'blocked' within one or two requests. I've been investigating this and can't figure out how they can tell that a user is using Selenium.

How do they do it?

EDIT UPDATE:

I installed the Selenium IDE plugin in Firefox and got banned when I went to stubhub.com in the normal Firefox browser with only that additional plugin.

EDIT:

When I use Fiddler to view the HTTP requests being sent back and forth, I notice that the 'fake browser's' requests often have 'no-cache' in the response header.
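To compare captures from the two browsers systematically, the 'no-cache' observation can be turned into a small check. This is only a sketch: the question does not show the Fiddler output, so the header dictionaries below are invented examples and `has_no_cache` is a hypothetical helper name.

```python
def has_no_cache(headers):
    """Return True if the response headers ask caches not to reuse the response."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    directives = [d.strip() for d in lowered.get("cache-control", "").split(",")]
    # Directives may carry arguments (e.g. no-cache="set-cookie"); compare names only.
    names = [d.split("=", 1)[0] for d in directives]
    return "no-cache" in names or "no-cache" in lowered.get("pragma", "")


# Invented sample headers, standing in for two captured responses.
normal = {"Cache-Control": "max-age=3600"}
suspect = {"cache-control": "private, no-cache", "Pragma": "no-cache"}
print(has_no_cache(normal), has_no_cache(suspect))
```

A difference here would only show that the server already treats the two sessions differently; it would not explain how the detection happened.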

EDIT:

Results like this one, "Is there a way to detect that I'm in a Selenium Webdriver page from Javascript", suggest that there should be no way to detect when you are using a webdriver. But this evidence suggests otherwise.

EDIT:

The site uploads a fingerprint to their servers, but I checked and the fingerprint of Selenium is identical to the fingerprint when using Chrome.

EDIT:

This is one of the fingerprint payloads that they send to their servers:

{"appName":"Netscape","platform":"Linuxx86_64","cookies":1,"syslang":"en-US","userlang":"en-US","cpu":"","productSub":"20030107","setTimeout":1,"setInterval":1,"plugins":{"0":"ChromePDFViewer","1":"ShockwaveFlash","2":"WidevineContentDecryptionModule","3":"NativeClient","4":"ChromePDFViewer"},"mimeTypes":{"0":"application/pdf","1":"ShockwaveFlashapplication/x-shockwave-flash","2":"FutureSplashPlayerapplication/futuresplash","3":"WidevineContentDecryptionModuleapplication/x-ppapi-widevine-cdm","4":"NativeClientExecutableapplication/x-nacl","5":"PortableNativeClientExecutableapplication/x-pnacl","6":"PortableDocumentFormatapplication/x-google-chrome-pdf"},"screen":{"width":1600,"height":900,"colorDepth":24},"fonts":{"0":"monospace","1":"DejaVuSerif","2":"Georgia","3":"DejaVuSans","4":"TrebuchetMS","5":"Verdana","6":"AndaleMono","7":"DejaVuSansMono","8":"LiberationMono","9":"NimbusMonoL","10":"CourierNew","11":"Courier"}}

It's identical in Selenium and in Chrome.
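Checking by eye that two payloads of this size are identical is error-prone, so a mechanical comparison may help. The sketch below assumes both payloads have been saved as JSON text; `diff_fingerprints` is a hypothetical helper, and the two short payloads are made-up stand-ins for the real captures.

```python
import json


def diff_fingerprints(a, b, path=""):
    """Return the dotted paths at which two parsed JSON payloads differ."""
    if isinstance(a, dict) and isinstance(b, dict):
        diffs = []
        for key in sorted(set(a) | set(b)):
            child = f"{path}.{key}" if path else key
            if key not in a or key not in b:
                diffs.append(child)  # key present on one side only
            else:
                diffs.extend(diff_fingerprints(a[key], b[key], child))
        return diffs
    return [] if a == b else [path]


# Made-up payloads: same platform, different screen width.
chrome = json.loads('{"platform": "Linuxx86_64", "screen": {"width": 1600, "height": 900}}')
selenium = json.loads('{"platform": "Linuxx86_64", "screen": {"width": 800, "height": 900}}')
print(diff_fingerprints(chrome, selenium))
```

An empty list means the payloads match key for key, which would support the claim that the detection relies on something other than this fingerprint.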

EDIT:

VPNs work for a single use but get detected after I load the first page. Clearly some JavaScript is being run to detect Selenium.


#1

Reference: https://stackoom.com/question/2FPaN/网站可以检测到您何时在chromedriver中使用硒吗


#2

It sounds like they are behind a web application firewall (WAF). Take a look at ModSecurity and OWASP to see how those work. In reality, what you are asking is how to evade bot detection. That is not what Selenium WebDriver is for. It is for testing your own web application, not for hitting other web applications. It is possible, but basically you'd have to look at what a WAF looks for in its rule set and specifically avoid it with Selenium if you can. Even then, it might still not work because you don't know what WAF they are using. You took the right first step, which is faking the user agent. If that didn't work, though, then a WAF is in place and you probably need to get trickier.

Edit: Point taken from another answer. Make sure your user agent is actually being set correctly first. Maybe have it hit a local web server or sniff the outgoing traffic.
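One way to do the "hit a local web server" check is a tiny server that records the User-Agent of every request. The sketch below uses only the Python standard library; the handler name, the `seen_user_agents` list and the test user agent string are all assumptions made for illustration, and `urllib` stands in for the browser so the example runs on its own.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_user_agents = []  # filled in by the handler, one entry per request


class EchoUserAgent(BaseHTTPRequestHandler):
    def do_GET(self):
        agent = self.headers.get("User-Agent", "")
        seen_user_agents.append(agent)
        body = agent.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet


def start_server():
    """Start the echo server on a free local port and return it."""
    server = HTTPServer(("127.0.0.1", 0), EchoUserAgent)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


server = start_server()
url = "http://127.0.0.1:%d/" % server.server_address[1]
# Stand-in for the browser: in practice you would call driver.get(url) instead.
request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0 (test)"})
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))  # skip any proxy
with opener.open(request) as response:
    print(response.read().decode("utf-8"))
server.shutdown()
server.server_close()
```

Pointing the Selenium-driven browser at the same URL shows the user agent the server really receives, which is the value that matters to a WAF.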


#3

Even if you are sending all the right data (e.g. Selenium doesn't show up as an extension, you have a reasonable resolution/bit depth, etc.), there are a number of services and tools that profile visitor behaviour to determine whether the actor is a user or an automated system.

For example, visiting a site and then immediately performing some action by moving the mouse directly to the relevant button, in less than a second, is something no user would actually do.
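To make the example concrete, here is a toy version of such a behavioural check. It is not how any particular service works: the event format, the one-second threshold and the straightness ratio are all assumptions chosen to mirror the scenario described above.

```python
import math


def looks_automated(events, min_seconds=1.0, straightness=0.99):
    """Guess whether a list of (seconds_since_load, x, y, kind) events is scripted.

    Hypothetical heuristic: flags a first click that arrives within
    `min_seconds` of page load after an almost perfectly straight mouse path.
    """
    moves = [(x, y) for _, x, y, kind in events if kind == "move"]
    clicks = [t for t, _, _, kind in events if kind == "click"]
    if not clicks or len(moves) < 2:
        return False  # not enough data to judge
    travelled = sum(math.dist(p, q) for p, q in zip(moves, moves[1:]))
    direct = math.dist(moves[0], moves[-1])
    ratio = direct / travelled if travelled else 1.0
    return min(clicks) < min_seconds and ratio >= straightness


# Invented traces: a straight, instant click versus a slower, curved approach.
scripted = [(0.1, 0, 0, "move"), (0.2, 50, 50, "move"),
            (0.3, 100, 100, "move"), (0.35, 100, 100, "click")]
human = [(0.8, 0, 0, "move"), (1.6, 40, 90, "move"),
         (2.4, 100, 100, "move"), (3.1, 100, 100, "click")]
print(looks_automated(scripted), looks_automated(human))
```

Real profiling tools presumably combine many more signals, so passing a check like this says little about passing theirs.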

It might also be useful as a debugging tool to use a site such as https://panopticlick.eff.org/ to check how unique your browser is; it'll also help you verify whether there are any specific parameters that indicate you're running in Selenium.


#4

Firefox is said to set window.navigator.webdriver === true when working with a webdriver. That was according to one of the older specs (e.g. archive.org), but I couldn't find it in the new one except for some very vague wording in the appendices.

A test for it is in the Selenium code in the file fingerprint_test.js, where the comment at the end says "Currently only implemented in firefox", but I wasn't able to identify any code in that direction with some simple grepping, neither in the current (41.0.2) Firefox release tree nor in the Chromium tree.

I also found a comment for an older commit regarding fingerprinting in the Firefox driver, b82512999938 from January 2015. That code is still in the Selenium Git master downloaded yesterday, at javascript/firefox-driver/extension/content/server.js, with a comment linking to the slightly differently worded appendix in the current W3C WebDriver spec.
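Whether a given browser and driver combination actually sets this flag can be checked from the Selenium side. The sketch below relies only on `execute_script`, which Selenium drivers provide; `reports_webdriver` is a hypothetical helper, and `FakeDriver` is a stand-in invented here so the example runs without a browser.

```python
def reports_webdriver(driver):
    """Return True if the page's navigator.webdriver flag is set."""
    # execute_script runs the JavaScript in the current page and returns its result.
    return driver.execute_script("return navigator.webdriver === true;") is True


class FakeDriver:
    """Hypothetical stand-in for a real driver; it ignores the script text."""

    def __init__(self, flag):
        self.flag = flag

    def execute_script(self, script):
        return self.flag


print(reports_webdriver(FakeDriver(True)), reports_webdriver(FakeDriver(None)))
```

With a real session you would pass the object returned by `webdriver.Chrome()` or `webdriver.Firefox()` instead of `FakeDriver`.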


#5

Write an HTML page with the following code. You will see that Selenium applies a webdriver attribute in the DOM, visible in the outerHTML.

<html>
<head>
<script type="text/javascript">
function showWindow() {
    alert(document.documentElement.outerHTML);
}
</script>
</head>
<body>
<form>
<input type="button" value="Show outerHTML" onclick="showWindow()">
</form>
</body>
</html>
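Instead of reading the alert by eye, the same outerHTML string can be inspected programmatically. This is a sketch under the assumption that the marker, if present, is an attribute named `webdriver` on the `<html>` element; the class and function names are made up, and the two sample strings are invented rather than captured from a real session.

```python
from html.parser import HTMLParser


class HtmlAttributeFinder(HTMLParser):
    """Collect the attributes found on the document's <html> element."""

    def __init__(self):
        super().__init__()
        self.html_attrs = {}

    def handle_starttag(self, tag, attrs):
        if tag == "html":
            self.html_attrs.update(attrs)


def has_webdriver_attribute(outer_html):
    """Return True if the <html> element carries a `webdriver` attribute."""
    finder = HtmlAttributeFinder()
    finder.feed(outer_html)
    return "webdriver" in finder.html_attrs


# Invented samples of what document.documentElement.outerHTML might return.
print(has_webdriver_attribute('<html webdriver="true"><head></head><body></body></html>'))
print(has_webdriver_attribute("<html><head></head><body></body></html>"))
```

In a live session the string could come from `driver.execute_script("return document.documentElement.outerHTML")`.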


#6

Try using Selenium with a specific Chrome user profile. That way you can run it as a specific user and define anything you want. When you do so, it runs as a 'real' user; look at the Chrome process with a process explorer and you'll see the difference in the tags.

For example:

import os

from selenium import webdriver
from selenium.webdriver.chrome.service import Service

username = os.getenv("USERNAME")
userProfile = "C:\\Users\\" + username + "\\AppData\\Local\\Google\\Chrome\\User Data\\Default"
options = webdriver.ChromeOptions()
options.add_argument("user-data-dir={}".format(userProfile))
# Add any tag you want here.
options.add_experimental_option("excludeSwitches", ["ignore-certificate-errors", "safebrowsing-disable-download-protection", "safebrowsing-disable-auto-update", "disable-client-side-phishing-detection"])
# Raw string so the backslashes in the Windows path are taken literally
chromedriver = r"C:\Python27\chromedriver\chromedriver.exe"
os.environ["webdriver.chrome.driver"] = chromedriver
# Selenium 4 style: driver path via Service, options via `options=`
browser = webdriver.Chrome(service=Service(chromedriver), options=options)

Chrome tag list here.
