Python + Selenium + Firefox headless configuration

Latest recommended article published on 2023-07-29 16:55:34

漫画家


Tags: Firefox python about:config selenium

Article link: https://blog.csdn.net/kunorz/article/details/80739138


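I recently needed Python to crawl a fairly complex website. PhantomJS appears to be no longer maintained, so I went with Selenium + Firefox headless instead. I looked up some of the related Firefox configuration and am writing it down here.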

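Viewing the supported preferences:

Firefox version: 60.0.2 (64-bit)

Type about:config in the address bar to open the configuration page.

If your English is good, you can read the original notes on the English About:config page (it loads slowly and sometimes fails on the first try; a refresh fixes it).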
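about:config stores each preference as a boolean, an integer, or a string, so the value passed from Python should have the matching type. A common slip is passing the string 'false' where a boolean is expected. The check below is only a rough sketch: `suspicious_prefs` is a hypothetical helper, not part of Selenium, and the preference names are just the ones used in this post.

```python
def suspicious_prefs(prefs):
    """Return the names whose value is a string that looks like a boolean."""
    return [
        name
        for name, value in prefs.items()
        if isinstance(value, str) and value.strip().lower() in ("true", "false")
    ]


# Example preference table; the second entry uses a string by mistake.
prefs = {
    "permissions.default.image": 2,   # integer preference
    "javascript.enabled": "false",    # should be the bool False
}

print(suspicious_prefs(prefs))  # ['javascript.enabled']
```

Running a check like this before handing the values to the browser makes the type mistake visible early instead of silently ending up with a preference that has no effect.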

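Downloading the Firefox driver:

The Firefox driver, geckodriver, can be downloaded directly from: https://github.com/mozilla/geckodriver/releases

After downloading, drop it into the Python root directory (or any other directory on your PATH) so that Selenium can find it.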
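To confirm the driver can actually be found, a quick check from Python helps. This is a minimal sketch that assumes the executable is named geckodriver; `find_driver` is a hypothetical helper, while `shutil.which` is the standard library lookup that searches PATH.

```python
import shutil


def find_driver(name="geckodriver"):
    """Return the full path of the driver executable, or None if it is not on PATH."""
    return shutil.which(name)


path = find_driver()
if path is None:
    print("geckodriver was not found on PATH")
else:
    print("geckodriver found at", path)
```

If this prints that the driver was not found, move the executable into a directory listed in PATH or add its directory to PATH.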

Installing Selenium:

In CMD, run: pip3 install selenium
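The Selenium API differs somewhat between major versions, so it can be worth checking which one pip installed. A small sketch, assuming Python 3.8 or newer (for `importlib.metadata`); `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("selenium")
if version is None:
    print("selenium is not installed")
else:
    print("selenium version:", version)
```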

Configuration code in Python:


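from selenium import webdriver

# Headless mode
options = webdriver.FirefoxOptions()
options.add_argument('-headless')

profile = webdriver.FirefoxProfile()
# Disable images (2 = block all images)
profile.set_preference('permissions.default.image', 2)
# Disable Flash; boolean preferences need a Python bool, not the string 'false'
profile.set_preference('dom.ipc.plugins.enabled.npswf32.dll', False)  # Windows
profile.set_preference('dom.ipc.plugins.enabled.libflashplayer.so', False)  # Linux
# Disable JavaScript
profile.set_preference('javascript.enabled', False)

# Attach the profile through options; the firefox_profile keyword is deprecated in Selenium 4
options.profile = profile
browser = webdriver.Firefox(options=options)
try:
    # List the available methods and attributes
    print(dir(browser))

    browser.get("https://blog.csdn.net/kunorz")
    # Take a screenshot
    browser.get_screenshot_as_file('myblog.png')
    # Get the page source
    page = browser.page_source
    print(page)
finally:
    # quit() ends the whole session and stops geckodriver; close() only closes the current window
    browser.quit()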

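Remember to shut the browser down when you are done, otherwise a lot of instances will be left running.

For more methods, see the Selenium package API.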
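One way to make sure the browser is always shut down, even when the scraping code raises an exception, is to wrap it in a small context manager. This is only a sketch: `managed_browser` and its `factory` argument are hypothetical names, not Selenium API, and the only driver method it relies on is `quit()`.

```python
from contextlib import contextmanager


@contextmanager
def managed_browser(factory):
    """Create a browser with factory() and always call quit() on exit."""
    browser = factory()
    try:
        yield browser
    finally:
        # Runs on normal exit and on exceptions alike.
        browser.quit()


# Usage with Selenium (assumes `webdriver` and `options` are already set up):
#
#   with managed_browser(lambda: webdriver.Firefox(options=options)) as browser:
#       browser.get("https://blog.csdn.net/kunorz")
#       print(browser.title)
```

Passing a factory instead of a ready-made driver keeps the creation inside the managed scope, so every browser that gets started is also the one that gets quit.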