As usual, let's start with the code.
# -*- coding:utf-8 -*-
from lxml import etree
from fake_useragent import UserAgent
from selenium import webdriver

ua = UserAgent()
# Request headers with a random User-Agent (not used by the Selenium session below)
ua_header = {
    'User-Agent': ua.random,
    'Cookie': ''
}


def conn_weibo():
    index_url = "https://www.weibo.com/"
    proxy = {
        'host': '172.17.18.80',
        'port': 8080  # must be an int, not a string
    }
    profile = webdriver.FirefoxProfile()
    # 1 = manual proxy configuration
    profile.set_preference('network.proxy.type', 1)
    # Proxy for plain HTTP requests
    profile.set_preference('network.proxy.http', proxy['host'])
    profile.set_preference('network.proxy.http_port', proxy['port'])
    # Proxy for HTTPS requests
    profile.set_preference('network.proxy.ssl', proxy['host'])
    profile.set_preference('network.proxy.ssl_port', proxy['port'])
    profile.update_preferences()
    # Passing the profile positionally matches the older Selenium API used in this post
    driver = webdriver.Firefox(profile)
    driver.get(index_url)


if __name__ == '__main__':
    conn_weibo()
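It looks as if only a few lines here do any real work, but in practice they hide quite a few pitfalls for beginners. Below I record the problems I ran into and how I solved them.
Installing geckodriver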
selenium.common.exceptions.WebDriverException: Message: 'geckodriver' executable needs to be in PATH
1. On Ubuntu 16.04:
* Download geckodriver from https://github.com/mozilla/geckodriver/releases
* Extract the archive and place the geckodriver binary in /usr/local/bin/
2. On Windows:
* Download geckodriver from https://github.com/mozilla/geckodriver/releases
* Put geckodriver.exe in the Firefox installation directory (e.g. D:\Program Files\Mozilla Firefox)
* Add the Firefox installation directory (e.g. D:\Program Files\Mozilla Firefox) to the Path environment variable
* Restart the IDE
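Before launching Firefox, it can help to confirm that the driver is actually visible on PATH. The following is a minimal sketch using only the standard library; the helper names `find_geckodriver` and `check_geckodriver` are made up for illustration and are not part of Selenium.

```python
import shutil


def find_geckodriver(name="geckodriver"):
    """Return the full path of the driver if it is on PATH, else None."""
    # shutil.which searches the directories listed in PATH
    # (on Windows it also tries the extensions in PATHEXT, such as .exe).
    return shutil.which(name)


def check_geckodriver(name="geckodriver"):
    """Return a human-readable status message about the driver."""
    path = find_geckodriver(name)
    if path is None:
        return f"{name} not found on PATH"
    return f"{name} found at {path}"


if __name__ == "__main__":
    print(check_geckodriver())
```

If this reports that the driver is missing right after you edited Path on Windows, the IDE or terminal is probably still using the old environment, which is why the restart step matters.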
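Configuring a Firefox proxy in Selenium
- Note that the port number must be an integer, not a string;
- network.proxy.ssl and network.proxy.ssl_port apply to HTTPS requests. There is no need to check the request scheme here: set the HTTP and SSL preferences together, so that a later request that switches to plain HTTP still goes through the proxy.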
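The two points above can be sketched as a small helper that builds all the proxy preferences in one place. This is only an illustration under assumptions: `build_proxy_prefs` and `apply_prefs` are hypothetical names, not Selenium functions; the only Selenium call relied on is `set_preference`, which the code at the top of this post already uses.

```python
def build_proxy_prefs(host, port):
    """Build Firefox proxy preferences for both HTTP and HTTPS traffic."""
    # Coerce the port to int so that a value read from a config file
    # or the command line (usually a string) is still stored as an integer.
    port = int(port)
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.http": host,
        "network.proxy.http_port": port,
        "network.proxy.ssl": host,
        "network.proxy.ssl_port": port,
    }


def apply_prefs(profile, prefs):
    """Copy every preference onto an object exposing set_preference(key, value)."""
    for key, value in prefs.items():
        profile.set_preference(key, value)
```

With the profile from the code above, usage would look like `apply_prefs(profile, build_proxy_prefs('172.17.18.80', '8080'))`, followed by `profile.update_preferences()`.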