Crawling with Selenium through a Proxy IP

Latest recommended article published on 2024-04-22 20:18:15

苏格尼塞

Tags: selenium, crawler proxy IP setup

Article link: https://blog.csdn.net/qq_17495489/article/details/109453379


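How to set a proxy IP for a Selenium crawler

1: First, generate an API link that returns fresh IP addresses.
2: Add your IP to the whitelist.
Both of these steps are covered in my previous article:
Using a Proxy with a Requests Crawler
3: Use the code below to crawl data through the proxy (this keeps your own IP from being banned, which would block the crawl).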

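from selenium import webdriver
from selenium.webdriver import ChromeOptions
import requests


def get_ip():
    """Fetch proxy addresses from the proxy API, one 'ip:port' per line.

    Always returns a list; the list is empty if the request fails.
    """
    url = 'http://piping.mogumiao.com/proxy/api/get_ip_bs?appKey=6226c130427f487385ad7b5235bc603c&count=5&expiryDate=0&format=2&newLine=3'
    try:
        response = requests.get(url, timeout=10)
    except requests.RequestException as exc:
        print(f'Request failed: {exc}')
        return []
    if response.status_code != 200:
        print('Request failed')
        return []
    # A body that starts with '{' is treated as an error message, not an IP list.
    if response.text.lstrip().startswith('{'):
        print('Failed to get IPs')
        return []
    return [x.strip() for x in response.text.splitlines() if x.strip()]


ips = get_ip()
if ips:
    # Route Chrome's traffic through the first proxy in the list
    options = ChromeOptions()
    options.add_argument(f'--proxy-server=http://{ips[0]}')
    b = webdriver.Chrome(options=options)
    try:
        b.get('https://cd.fang.anjuke.com/')
        print(b.page_source)
    finally:
        # Close the browser even if the page fails to load
        b.quit()
else:
    print('Failed to get IPs')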
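
The API link above requests five addresses (count=5), but the script only ever uses the first one. If that proxy is dead, the crawl fails even though four more candidates are available. Below is a minimal sketch of one way to try each address in turn. The helper names `parse_proxy_lines`, `first_working_proxy`, and `load_with_selenium` are hypothetical and not part of Selenium or the proxy service. The sketch also assumes the response body is plain 'ip:port' lines, and its success check is deliberately crude.

```python
import re

# Matches 'host:port' where host looks like an IPv4 address; other lines are skipped.
_PROXY_RE = re.compile(r'^(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})$')


def parse_proxy_lines(text):
    """Return the valid 'ip:port' entries from a newline-separated response body."""
    proxies = []
    for line in text.splitlines():
        line = line.strip()
        match = _PROXY_RE.match(line)
        if not match:
            continue
        octets_ok = all(int(part) <= 255 for part in match.group(1).split('.'))
        port_ok = 0 < int(match.group(2)) <= 65535
        if octets_ok and port_ok:
            proxies.append(line)
    return proxies


def first_working_proxy(proxies, try_proxy):
    """Return the first proxy for which try_proxy(proxy) is truthy, else None."""
    for proxy in proxies:
        try:
            if try_proxy(proxy):
                return proxy
        except Exception as exc:  # a dead proxy often surfaces as a driver error
            print(f'Proxy {proxy} failed: {exc}')
    return None


def load_with_selenium(proxy, url='https://cd.fang.anjuke.com/'):
    """Open url in Chrome through proxy; True if any page source came back."""
    # Imported here so the two helpers above work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver import ChromeOptions

    options = ChromeOptions()
    options.add_argument(f'--proxy-server=http://{proxy}')
    browser = webdriver.Chrome(options=options)
    try:
        browser.set_page_load_timeout(30)
        browser.get(url)
        # Crude check: a real crawler would look for an expected page element.
        return bool(browser.page_source)
    finally:
        browser.quit()
```

With these in place, `first_working_proxy(parse_proxy_lines(response.text), load_with_selenium)` returns the first address that loaded the page, or None if none of them did. Passing the check in as a function keeps the retry loop independent of Selenium, so it can be exercised with a stub before a browser is launched.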
