How to Use Proxies in Python Crawlers: Proxy Usage in Python 3

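This article shows how to configure different types of proxy servers (HTTP proxies and SOCKS5 proxies) with Python's Urllib and Requests libraries, and how to set a browser-level proxy with Selenium.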
    from urllib.error import URLError
    from urllib.request import ProxyHandler, build_opener

    # Address of an HTTP proxy running locally
    proxy = '127.0.0.1:9743'
    proxy_handler = ProxyHandler({
        'http': 'http://' + proxy,
        'https': 'https://' + proxy
    })
    opener = build_opener(proxy_handler)
    try:
        response = opener.open('http://httpbin.org/get')
        print(response.read().decode('utf-8'))
    except URLError as e:
        print(e.reason)

    {
      "args": {},
      "headers": {
        "Accept-Encoding": "identity",
        "Connection": "close",
        "Host": "httpbin.org",
        "User-Agent": "Python-urllib/3.6"
      },
      "origin": "106.185.45.153",
      "url": "http://httpbin.org/get"
    }

    from urllib.error import URLError
    from urllib.request import ProxyHandler, build_opener

    # For a proxy that requires authentication, put username:password before the host
    proxy = 'username:password@127.0.0.1:9743'
    proxy_handler = ProxyHandler({
        'http': 'http://' + proxy,
        'https': 'https://' + proxy
    })
    opener = build_opener(proxy_handler)
    try:
        response = opener.open('http://httpbin.org/get')
        print(response.read().decode('utf-8'))
    except URLError as e:
        print(e.reason)
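Concatenating `username:password` into the proxy URL works only while the credentials contain no characters such as `@` or `:`. The following is a minimal sketch of one way to build the URL safely by percent-encoding the credentials. The helper name `build_proxy_url` and the sample credentials are hypothetical, not part of the original article.

```python
from urllib.parse import quote


def build_proxy_url(host, port, username=None, password=None, scheme="http"):
    """Build a proxy URL, percent-encoding the credentials when given."""
    if username is None:
        return f"{scheme}://{host}:{port}"
    # safe="" forces characters such as '@', ':' and '/' to be encoded
    user = quote(username, safe="")
    pwd = quote(password or "", safe="")
    return f"{scheme}://{user}:{pwd}@{host}:{port}"


# Hypothetical credentials containing characters that would break a plain URL
print(build_proxy_url("127.0.0.1", 9743, "user", "p@ss:word"))
# http://user:p%40ss%3Aword@127.0.0.1:9743
```

The resulting string can be used as a value in the dictionary passed to `ProxyHandler`, in place of the hand-built `'http://' + proxy` string.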

    import socks
    import socket
    from urllib import request
    from urllib.error import URLError

    # Route every newly created socket through the SOCKS5 proxy
    socks.set_default_proxy(socks.SOCKS5, '127.0.0.1', 9742)
    socket.socket = socks.socksocket
    try:
        response = request.urlopen('http://httpbin.org/get')
        print(response.read().decode('utf-8'))
    except URLError as e:
        print(e.reason)
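Assigning `socks.socksocket` to `socket.socket` replaces the socket class for the whole process, so every later connection also goes through the proxy. If only some requests should be proxied, one option is to restore the original class afterwards. A minimal sketch of that idea follows. `patched_socket` is a hypothetical helper name, not part of the socks library, and the approach is not thread-safe.

```python
import socket
from contextlib import contextmanager


@contextmanager
def patched_socket(socket_class):
    """Temporarily replace socket.socket, restoring the original on exit."""
    original = socket.socket
    socket.socket = socket_class
    try:
        yield
    finally:
        # Always put the original class back, even if the request failed
        socket.socket = original


# Intended usage (assumes the socks module is installed and a proxy is running):
#   socks.set_default_proxy(socks.SOCKS5, '127.0.0.1', 9742)
#   with patched_socket(socks.socksocket):
#       response = request.urlopen('http://httpbin.org/get')
```

Connections opened outside the `with` block use the standard socket class again.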

I have a SOCKS5 proxy running locally on port 9742. When the script runs successfully, the output is the same as with the HTTP proxy above:

    {
      "args": {},
      "headers": {
        "Accept-Encoding": "identity",
        "Connection": "close",
        "Host": "httpbin.org",
        "User-Agent": "Python-urllib/3.6"
      },
      "origin": "106.185.45.153",
      "url": "http://httpbin.org/get"
    }

    import requests

    proxy = '127.0.0.1:9743'
    proxies = {
        'http': 'http://' + proxy,
        'https': 'https://' + proxy,
    }
    try:
        response = requests.get('http://httpbin.org/get', proxies=proxies)
        print(response.text)
    except requests.exceptions.ConnectionError as e:
        print('Error', e.args)

    {
      "args": {},
      "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Connection": "close",
        "Host": "httpbin.org",
        "User-Agent": "python-requests/2.18.1"
      },
      "origin": "106.185.45.153",
      "url": "http://httpbin.org/get"
    }

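There is another way to set this up, identical to the Urllib approach: use the socks module (provided by the PySocks package, which must be installed first). The setup looks like this: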

    import requests
    import socks
    import socket

    # Route every newly created socket through the SOCKS5 proxy
    socks.set_default_proxy(socks.SOCKS5, '127.0.0.1', 9742)
    socket.socket = socks.socksocket
    try:
        response = requests.get('http://httpbin.org/get')
        print(response.text)
    except requests.exceptions.ConnectionError as e:
        print('Error', e.args)
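Requests can also take a SOCKS proxy directly in the `proxies` mapping through a `socks5://` URL, provided its SOCKS extra is installed (`pip install requests[socks]`). This avoids patching `socket.socket` globally. The sketch below only builds the mapping. `socks5_proxies` is a hypothetical helper name, and the port is the local one assumed throughout this article.

```python
def socks5_proxies(host, port, remote_dns=False):
    """Build a proxies mapping for Requests that points at a SOCKS5 proxy.

    With remote_dns=True the 'socks5h' scheme is used, so hostnames are
    resolved by the proxy instead of on the local machine.
    """
    scheme = "socks5h" if remote_dns else "socks5"
    url = f"{scheme}://{host}:{port}"
    return {"http": url, "https": url}


proxies = socks5_proxies("127.0.0.1", 9742)
print(proxies)
# {'http': 'socks5://127.0.0.1:9742', 'https': 'socks5://127.0.0.1:9742'}

# Intended usage (assumes Requests with the SOCKS extra and a running proxy):
#   import requests
#   response = requests.get('http://httpbin.org/get', proxies=proxies)
```

Because the proxy is passed per request, other connections in the same process stay unproxied.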

    from selenium import webdriver

    proxy = '127.0.0.1:9743'
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument('--proxy-server=http://' + proxy)
    # Recent Selenium releases take options=; the older chrome_options= keyword is deprecated
    browser = webdriver.Chrome(options=chrome_options)
    browser.get('http://httpbin.org/get')

    {
      "args": {},
      "headers": {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
        "Accept-Encoding": "gzip, deflate",
        "Accept-Language": "zh-CN,en,*",
        "Connection": "close",
        "Host": "httpbin.org",
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X) AppleWebKit/538.1 (KHTML, like Gecko) PhantomJS/2.1.0 Safari/538.1"
      },
      "origin": "106.185.45.153",
      "url": "http://httpbin.org/get"
    }
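Every httpbin response shown in this article carries an `origin` field, which reports the IP address the server saw. Reading that field is a quick way to check whether a request went through the proxy. A minimal sketch, assuming the response body is JSON shaped like the outputs above; the helper name `extract_origin` is hypothetical.

```python
import json


def extract_origin(body: str) -> str:
    """Return the client IP that httpbin reports in the "origin" field."""
    data = json.loads(body)
    return data["origin"]


# Sample body with values copied from the outputs in this article
sample = """{
  "args": {},
  "headers": {"Host": "httpbin.org", "User-Agent": "Python-urllib/3.6"},
  "origin": "106.185.45.153",
  "url": "http://httpbin.org/get"
}"""

print(extract_origin(sample))  # 106.185.45.153
```

If the printed address is the proxy's public IP rather than your own, the proxy is presumably in effect.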

    service_args = [
        '--proxy=127.0.0.1:9743',
        '--proxy-type=http',
        '--proxy-auth=username:password'
    ]
