[Python Self-Study Notes] Configuring and Using Proxy IPs for Web Crawlers

 

http://www.xicidaili.com/wn/

https://www.ipip.net/

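In a crawler project, if you have not worked out a crawling strategy tailored to the target site, your IP can easily end up blocked by that site.

The following uses the Requests library to configure and use a proxy IP, and also shows how to build a list that serves as the proxy IP pool.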

# -*- coding: utf-8 -*-

import requests
# Used to pick a proxy at random
import random

# Build the proxy IP pool
proxy_list = [
    '106.75.226.36:808',
    '222.76.74.214:808',
    '123.185.81.64:8118'
]
# Pick one IP at random from the pool
proxy = random.choice(proxy_list)
# Print the randomly chosen proxy IP
print(proxy)

E:\Anaconda3\python.exe F:/Workspace/PycharmProjects/Txsst.py
106.75.226.36:808

Process finished with exit code 0
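
Proxy lists copied from free listing pages often contain malformed or duplicate entries, so it can help to clean the pool before choosing from it. The following is a minimal sketch, not part of the original script: the helper names `is_valid_proxy` and `clean_pool` are hypothetical, and it assumes every entry should have the form `IPv4:port`. It only checks the format; it does not test whether a proxy is actually reachable.

```python
import ipaddress


def is_valid_proxy(entry):
    """Return True if entry looks like 'IPv4:port' with a port in 1-65535."""
    host, sep, port = entry.strip().rpartition(':')
    if not sep or not port.isdigit():
        return False
    try:
        ipaddress.IPv4Address(host)  # raises ValueError for a malformed address
    except ValueError:
        return False
    return 1 <= int(port) <= 65535


def clean_pool(entries):
    """Keep well-formed entries, strip whitespace, and drop duplicates in order."""
    seen = set()
    pool = []
    for entry in entries:
        entry = entry.strip()
        if is_valid_proxy(entry) and entry not in seen:
            seen.add(entry)
            pool.append(entry)
    return pool


raw = [
    '106.75.226.36:808',
    ' 222.76.74.214:808',   # stray whitespace
    '123.185.81.64',        # missing port
    '999.1.1.1:80',         # not a valid IPv4 address
    '106.75.226.36:808',    # duplicate
]
print(clean_pool(raw))
```

The cleaned list can then be passed to `random.choice` exactly as `proxy_list` is above.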

 

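# -*- coding: utf-8 -*-

import requests
# Used to pick a proxy at random
import random

# Build the proxy IP pool
proxy_list = [
    '106.75.226.36:808',
    '222.76.74.214:808',
    '123.185.81.64:8118'
]
# Pick one IP at random from the pool
proxy = random.choice(proxy_list)
# Print the randomly chosen proxy IP
print(proxy)


# The keys name the scheme of the target URL; the values are the proxy's own URL.
# These are plain HTTP proxies, so both values use the http:// scheme.
proxies = {
    'http': 'http://' + proxy,
    'https': 'http://' + proxy,
}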
# -----------------------------------------------------------------------------------
# Return request headers with a randomly chosen User-Agent
def getheaders():
    # Each entry needs its own trailing comma; without it Python silently
    # concatenates adjacent string literals into one User-Agent.
    user_agent_list = [
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/22.0.1207.1 Safari/537.1",
        "Mozilla/5.0 (X11; CrOS i686 2268.111.0) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.57 Safari/536.11",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.6 (KHTML, like Gecko) Chrome/20.0.1092.0 Safari/536.6",
        "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.6 (KHTML, like Gecko) Chrome/20.0.1090.0 Safari/536.6",
        "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/19.77.34.5 Safari/537.1",
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.9 Safari/536.5",
        "Mozilla/5.0 (Windows NT 6.0) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.36 Safari/536.5",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1063.0 Safari/536.3",
        "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1063.0 Safari/536.3",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1063.0 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1062.0 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1062.0 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1061.1 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1061.1 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1061.1 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1061.0 Safari/536.3",
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.24 (KHTML, like Gecko) Chrome/19.0.1055.1 Safari/535.24",
        "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/535.24 (KHTML, like Gecko) Chrome/19.0.1055.1 Safari/535.24",
    ]
    user_agent = random.choice(user_agent_list)
    headers = {'User-Agent': user_agent}
    return headers
# ---------------------------------------------------------------------------------------
try:
    headers = getheaders()  # custom request headers
    print(headers)
    response = requests.get(
        'https://www.ipip.net',
        proxies=proxies,
        headers=headers,
        timeout=5
    )
    print(response.text)
# RequestException also covers timeouts, which ConnectionError alone can miss
except requests.exceptions.RequestException as e:
    print('Error', e.args)

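If the IP reported in the response (for example, the origin field returned by an IP-echo service) is the proxy's IP, the proxy has been configured successfully. In the run recorded below, the request failed with an SSLError during the handshake, so this particular proxy could not be verified.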
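
The origin check described above can be sketched as a small helper. This is an illustration under assumptions, not part of the original script: it assumes the request is sent to http://httpbin.org/ip, a service not used in the original code, whose JSON reply carries the caller's address in an `origin` field. The function name `proxy_in_use` is hypothetical.

```python
def proxy_in_use(echo_json, proxy):
    """Return True if the echoed client IP matches the host part of 'ip:port'."""
    proxy_host = proxy.rsplit(':', 1)[0]
    # The field may hold several comma-separated addresses when the request
    # passed through more than one hop.
    origins = [ip.strip() for ip in echo_json.get('origin', '').split(',')]
    return proxy_host in origins


# Offline demonstration with hand-written replies (no network access needed)
print(proxy_in_use({'origin': '106.75.226.36'}, '106.75.226.36:808'))
print(proxy_in_use({'origin': '203.0.113.7'}, '106.75.226.36:808'))

# Usage with a live proxy (not executed here):
#   data = requests.get('http://httpbin.org/ip', proxies=proxies, timeout=5).json()
#   print(proxy_in_use(data, proxy))
```

A result of False usually means the request bypassed the proxy or that the proxy forwards through a different outgoing address, so it is worth printing the raw `origin` value when debugging.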

E:\Anaconda3\python.exe F:/Workspace/PycharmProjects/Txsst.py
106.75.226.36:808
{'User-Agent': 'Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1061.0 Safari/536.3'}
Error (MaxRetryError('HTTPSConnectionPool(host=\'www.ipip.net\', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError("bad handshake: SysCallError(-1, \'Unexpected EOF\')")))'),)

Process finished with exit code 0
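
The run above picked a single proxy and gave up when it failed. Free proxies die often, so one possible refinement is to try several entries from the pool before reporting an error. The sketch below is an assumption-laden illustration rather than the original author's method: `requests_fetch`, `fetch_with_rotation`, `fake_fetch`, and the `max_attempts` parameter are hypothetical names. It catches `OSError` because Requests' `RequestException` derives from `IOError`, and the demonstration uses a stand-in fetch function so it runs without network access.

```python
import random


def requests_fetch(url, proxy, timeout=5):
    """Fetch url through one 'ip:port' proxy with Requests and return the body."""
    import requests  # imported lazily so the rotation logic can be tried offline

    proxies = {'http': 'http://' + proxy, 'https': 'http://' + proxy}
    response = requests.get(url, proxies=proxies, timeout=timeout)
    response.raise_for_status()
    return response.text


def fetch_with_rotation(url, proxy_list, fetch=requests_fetch, max_attempts=3):
    """Try up to max_attempts distinct proxies in random order.

    Returns (proxy, body) for the first proxy that works and raises
    RuntimeError if every attempt fails.
    """
    candidates = random.sample(proxy_list, min(max_attempts, len(proxy_list)))
    errors = []
    for proxy in candidates:
        try:
            return proxy, fetch(url, proxy)
        except OSError as exc:  # covers requests.exceptions.RequestException
            errors.append((proxy, exc))
    raise RuntimeError('all proxies failed: %r' % errors)


# Offline demonstration: pretend that only the last pool entry is alive.
def fake_fetch(url, proxy):
    if proxy != '123.185.81.64:8118':
        raise ConnectionError('dead proxy')
    return 'page body'


pool = ['106.75.226.36:808', '222.76.74.214:808', '123.185.81.64:8118']
print(fetch_with_rotation('https://www.ipip.net', pool, fetch=fake_fetch))
```

With a real pool, calling `fetch_with_rotation(url, proxy_list)` without the `fetch` argument would go through Requests instead of the stand-in.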

 

 
