Python Crawlers: Configuring Proxies for the requests and Selenium Libraries

1. Setting proxies in the requests library

1.1 Proxies that do not require authentication

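import requests

url = 'http://httpbin.org/get'  # example target; replace with the page you want to fetch
proxy = '127.0.0.1:1943'
proxies = {
    'http': 'http://' + proxy,    # proxy used for http:// URLs
    'https': 'https://' + proxy,  # proxy used for https:// URLs
}
html = requests.get(url, proxies=proxies)
print(html.text)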

1.2 For a proxy that requires authentication, write it as follows

proxy = 'username:password@127.0.0.1:1943'
# username: the proxy account name
# password: the proxy account password
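
If the username or password contains characters such as @ or :, the proxy string above becomes ambiguous. A minimal sketch of one way to handle this, assuming a hypothetical helper named build_proxies and made-up credentials (neither comes from requests itself), is to percent-encode the credentials with urllib.parse.quote before building the proxies dict:

```python
from urllib.parse import quote


def build_proxies(host, port, username=None, password=None, scheme="http"):
    """Return a requests-style proxies dict (hypothetical helper, not part of requests)."""
    auth = ""
    if username is not None:
        # safe="" makes quote() encode every reserved character, including '@', ':' and '/'
        auth = quote(username, safe="")
        if password is not None:
            auth += ":" + quote(password, safe="")
        auth += "@"
    proxy_url = f"{scheme}://{auth}{host}:{port}"
    # Use the same proxy for both http:// and https:// targets
    return {"http": proxy_url, "https": proxy_url}


# Made-up credentials, chosen to contain characters that need encoding
proxies = build_proxies("127.0.0.1", 1943, "username", "p@ss:word")
print(proxies["http"])
```

The resulting dict can be passed as requests.get(url, proxies=proxies), exactly like the hand-written one.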

1.3 To use a SOCKS5 proxy, configure it as follows

proxy = '127.0.0.1:1943'
proxies = {
    'http': 'socks5://' + proxy,
    'https': 'socks5://' + proxy,
}
# This approach requires the requests[socks] extra: pip3 install "requests[socks]"
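
requests also accepts the socks5h:// scheme, which asks the proxy to resolve hostnames instead of resolving them on the local machine. The sketch below shows how the two schemes could be selected; the helper name socks_proxies and its remote_dns parameter are assumptions made for illustration, not part of requests.

```python
def socks_proxies(host, port, remote_dns=True):
    """Build a requests-style proxies dict for a SOCKS5 proxy (hypothetical helper)."""
    # socks5h:// lets the proxy resolve hostnames; socks5:// resolves them locally
    scheme = "socks5h" if remote_dns else "socks5"
    proxy_url = f"{scheme}://{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


proxies = socks_proxies("127.0.0.1", 1943)
print(proxies)
```

Remote resolution is usually what you want when the crawler should not leak DNS lookups outside the proxy.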

1.4 Another way to set a proxy

pip3 install PySocks

import requests
import socks
import socket

# Route every new socket created in this process through the SOCKS5 proxy
socks.set_default_proxy(socks.SOCKS5, '127.0.0.1', 9742)
socket.socket = socks.socksocket
try:
    response = requests.get('http://httpbin.org/get')
    print(response.text)
except requests.exceptions.ConnectionError as e:
    print('Error', e.args)
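
Because this method replaces socket.socket for the whole process, every later connection goes through the proxy as well. If that is not wanted, one option is to limit the patch to a block of code. The following is only a sketch: patched_socket is a hypothetical helper, and StandInSocket is a placeholder class so the example runs without PySocks installed.

```python
import contextlib
import socket


@contextlib.contextmanager
def patched_socket(socket_class):
    """Temporarily replace socket.socket and restore it on exit (hypothetical helper)."""
    original = socket.socket
    socket.socket = socket_class
    try:
        yield
    finally:
        # Runs even if the body raises, so the patch never leaks out of the block
        socket.socket = original


# Placeholder so the sketch runs without PySocks; with PySocks installed you would
# call socks.set_default_proxy(...) first and pass socks.socksocket instead.
class StandInSocket(socket.socket):
    pass


real_socket = socket.socket
with patched_socket(StandInSocket):
    print(socket.socket is StandInSocket)  # patched inside the block
print(socket.socket is real_socket)        # restored afterwards
```

Requests made inside the with block would use the proxy, while code outside it keeps the normal socket class.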

2. Setting proxies in the Selenium library

2.1 Chrome proxy settings

2.1.1 No authentication required

from selenium import webdriver
proxy='127.0.0.1:9743'
chrome_options=webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server=http://'+proxy)
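# Current Selenium releases take options=; the older chrome_options= keyword was deprecated and later removed
browser = webdriver.Chrome(options=chrome_options)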
browser.get('https://www.baidu.com')

2.1.2 If the proxy requires authentication (somewhat more complicated)

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import zipfile
ip = '127.0.0.1'
port = 9743
username = 'foo'  # proxy username
password = 'bar'  # proxy password

manifest_json = """
{
    "version": "1.0.0",
    "manifest_version": 2,
    "name": "Chrome Proxy",
    "permissions": [
        "proxy",
        "tabs",
        "unlimitedStorage",
        "storage",
        "<all_urls>",
        "webRequest",
        "webRequestBlocking"
    ],
    "background": {
        "scripts": ["background.js"]
    }
}
"""

background_js = """
var config = {
        mode: "fixed_servers",
        rules: {
          singleProxy: {
            scheme: "http",
            host: "%(ip)s",
            port: %(port)s
          }
        }
      }

chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});

function callbackFn(details) {
    return {
        authCredentials: {
            username: "%(username)s",
            password: "%(password)s"
        }
    }
}

chrome.webRequest.onAuthRequired.addListener(
            callbackFn,
            {urls: ["<all_urls>"]},
            ['blocking']
)
""" % {'ip': ip, 'port': port, 'username': username, 'password': password}

plugin_file = 'proxy_auth_plugin.zip'
with zipfile.ZipFile(plugin_file, 'w') as zp:
    zp.writestr("manifest.json", manifest_json)
    zp.writestr("background.js", background_js)
chrome_options = Options()
chrome_options.add_argument("--start-maximized")
chrome_options.add_extension(plugin_file)
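# Current Selenium releases take options=; the older chrome_options= keyword was deprecated and later removed
browser = webdriver.Chrome(options=chrome_options)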
browser.get('http://httpbin.org/get')
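
The script above fills the credentials into the JavaScript source with % formatting, which produces broken JavaScript if the password contains a double quote or a backslash. A possible variant, sketched here under assumptions, serializes the values with json.dumps so they are always escaped correctly. The function name build_proxy_extension and the sample password are hypothetical; the manifest and the Chrome API calls are the same ones used above.

```python
import json
import os
import tempfile
import zipfile

# Same Manifest V2 content as the string in the original script
MANIFEST = {
    "version": "1.0.0",
    "manifest_version": 2,
    "name": "Chrome Proxy",
    "permissions": ["proxy", "tabs", "unlimitedStorage", "storage",
                    "<all_urls>", "webRequest", "webRequestBlocking"],
    "background": {"scripts": ["background.js"]},
}


def build_proxy_extension(path, ip, port, username, password):
    """Write the proxy-auth extension zip to path and return path (hypothetical helper)."""
    config = {
        "mode": "fixed_servers",
        "rules": {"singleProxy": {"scheme": "http", "host": ip, "port": int(port)}},
    }
    credentials = {"username": username, "password": password}
    # json.dumps emits valid JavaScript object literals with all quotes escaped
    background_js = (
        "var config = %s;\n"
        "chrome.proxy.settings.set({value: config, scope: 'regular'}, function() {});\n"
        "chrome.webRequest.onAuthRequired.addListener(\n"
        "    function(details) { return {authCredentials: %s}; },\n"
        "    {urls: ['<all_urls>']},\n"
        "    ['blocking']\n"
        ");\n"
    ) % (json.dumps(config), json.dumps(credentials))
    with zipfile.ZipFile(path, "w") as zp:
        zp.writestr("manifest.json", json.dumps(MANIFEST, indent=4))
        zp.writestr("background.js", background_js)
    return path


# Write to a temporary directory; the password deliberately contains a double quote
plugin_file = build_proxy_extension(
    os.path.join(tempfile.mkdtemp(), "proxy_auth_plugin.zip"),
    "127.0.0.1", 9743, "foo", 'b"ar',
)
with zipfile.ZipFile(plugin_file) as zp:
    print(zp.namelist())
```

The returned path can then be handed to chrome_options.add_extension(plugin_file) as in the script above.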

2.2 PhantomJS

2.2.1 No authentication required

from selenium import webdriver

service_args=[
	'--proxy=127.0.0.1:9743',
	'--proxy-type=http'
]
browser=webdriver.PhantomJS(service_args=service_args)
browser.get('http://httpbin.org/get')

2.2.2 With authentication, just add the --proxy-auth option

service_args = [
    '--proxy=127.0.0.1:9743',
    '--proxy-type=http',
    '--proxy-auth=username:password',
]
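
Each PhantomJS option must be its own list element. Python silently concatenates adjacent string literals, so a missing comma merges two options into one invalid argument. As a rough sketch, the list can be built by a small function instead of typed by hand; phantomjs_proxy_args is a hypothetical helper name, and only the --proxy, --proxy-type and --proxy-auth flags shown above are used.

```python
def phantomjs_proxy_args(host, port, proxy_type="http", username=None, password=None):
    """Build the service_args list for webdriver.PhantomJS (hypothetical helper)."""
    args = [f"--proxy={host}:{port}", f"--proxy-type={proxy_type}"]
    # Add the auth flag only when both credentials are supplied
    if username is not None and password is not None:
        args.append(f"--proxy-auth={username}:{password}")
    return args


service_args = phantomjs_proxy_args("127.0.0.1", 9743, username="username", password="password")
print(service_args)
```

The list is passed to the driver in the same way as before: webdriver.PhantomJS(service_args=service_args).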