Building an Account Pool
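Why it is needed
Common login methods:
- Session + Cookie based login
- JWT based login: logging in produces a JWT string
The account pool stores these cookies or JWT strings so that later crawl requests can reuse them.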
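To make the two credential types concrete, here is a minimal sketch of how a crawler might attach a credential taken from the pool to an outgoing request. The function name `build_auth_headers` is hypothetical, and the `Bearer` scheme is an assumption: it is a common convention, but some sites expect a different prefix in the Authorization header, so check what the target site actually sends.

```python
def build_auth_headers(kind: str, credential: str) -> dict:
    """Return HTTP headers carrying a stored credential.

    kind: 'cookie' for Session + Cookie logins, 'jwt' for JWT logins.
    credential: the cookie string or JWT string taken from the pool.
    """
    if kind == 'cookie':
        # Session-based sites identify the user through the Cookie header.
        return {'Cookie': credential}
    if kind == 'jwt':
        # JWT-based sites usually read the token from Authorization.
        # 'Bearer' is assumed here; the target site may use another scheme.
        return {'Authorization': f'Bearer {credential}'}
    raise ValueError(f'unknown credential kind: {kind}')


if __name__ == '__main__':
    print(build_auth_headers('cookie', 'sessionid=abc123'))
    print(build_auth_headers('jwt', 'header.payload.signature'))
```

The returned dict can be passed as the `headers` argument of whatever HTTP client the crawler uses.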
Local deployment
- Create a virtual environment with conda:
    conda create -n new_env python=3.x  # replace x with the Python version you need
- Activate the new environment:
    conda activate new_env
- Install the dependencies:
    pip install -r requirements.txt
- Edit the setting.py configuration file
- Configure the Redis database:
    # redis host
    REDIS_HOST = env.str('REDIS_HOST', '127.0.0.1')
    # redis port
    REDIS_PORT = env.int('REDIS_PORT', 6379)
    # redis password, if no password, set it to None
    REDIS_PASSWORD = env.str('REDIS_PASSWORD', None)
    # redis db, if no choice, set it to 0
    REDIS_DB = env.int('REDIS_DB', 0)
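Each setting above is read from an environment variable and falls back to the given default when the variable is unset. The following is a stdlib-only sketch of that same resolution, useful for checking which Redis instance a given environment would point to. The function names `load_redis_settings` and `redis_url` are hypothetical and are not part of the project.

```python
import os


def load_redis_settings(environ=None) -> dict:
    """Resolve Redis settings from environment variables with defaults."""
    environ = os.environ if environ is None else environ
    return {
        'host': environ.get('REDIS_HOST', '127.0.0.1'),
        'port': int(environ.get('REDIS_PORT', 6379)),
        # An unset or empty password is treated as "no password".
        'password': environ.get('REDIS_PASSWORD') or None,
        'db': int(environ.get('REDIS_DB', 0)),
    }


def redis_url(settings: dict) -> str:
    """Build a redis:// URL from the resolved settings."""
    auth = f":{settings['password']}@" if settings['password'] else ''
    return f"redis://{auth}{settings['host']}:{settings['port']}/{settings['db']}"


if __name__ == '__main__':
    # Uses the real process environment; prints the URL that would be used.
    print(redis_url(load_redis_settings()))
```

Passing a plain dict instead of the process environment makes it easy to try overrides, for example `load_redis_settings({'REDIS_PORT': '6380'})`.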
- Configure the test URLs:
    GENERATOR_MAP = {
        'antispider6': 'Antispider6Generator',
        'antispider7': 'Antispider7Generator'
    }

    # integrated tester
    TESTER_MAP = {
        'antispider6': 'Antispider6Tester',
        'antispider7': 'Antispider7Tester',
    }

    TEST_URL_MAP = {
        'antispider6': 'https://antispider6.scrape.center/',
        'antispider7': 'https://antispider7.scrape.center/'
    }
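These maps associate a website key with the name of a class, stored as a string. How the project resolves those names is not shown in these notes; one plausible approach is a lookup in a namespace that contains the classes, sketched here. The `resolve` helper and the stand-in `Antispider7Generator` body are assumptions for illustration only.

```python
GENERATOR_MAP = {'antispider7': 'Antispider7Generator'}


class Antispider7Generator:
    """Stand-in class; the real generator logs in and stores a token."""

    def __init__(self, website):
        self.website = website


def resolve(website: str, name_map: dict, namespace: dict):
    """Look up the class name configured for `website` and return the class."""
    class_name = name_map.get(website)
    if class_name is None:
        raise KeyError(f'no entry configured for website {website!r}')
    return namespace[class_name]


if __name__ == '__main__':
    # globals() holds the classes defined in this module.
    generator_cls = resolve('antispider7', GENERATOR_MAP, globals())
    generator = generator_cls(website='antispider7')
    print(type(generator).__name__, generator.website)
```

Keeping the map as plain strings means a new site is supported by adding one entry per map plus the matching classes, without touching the dispatch code.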
- Configure the login URL used to generate credentials:
    # Method of the generator class; needs `import requests` and a `logger` at module level.
    def generate(self, username, password):
        """
        generate main process
        """
        # skip accounts that already have a stored credential
        if self.credential_operator.get(username):
            logger.debug(f'credential of {username} exists, skip')
            return
        login_url = 'https://antispider7.scrape.center/api/login'
        s = requests.Session()
        r = s.post(login_url, json={
            'username': username,
            'password': password
        })
        if r.status_code != 200:
            logger.error(f'error occurred while generating credential of {username}, error code {r.status_code}')
            return
        # the JWT string is returned in the "token" field of the JSON response
        token = r.json().get('token')
        logger.debug(f'get credential {token}')
        self.credential_operator.set(username, token)
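Credentials generated this way can expire, which is presumably why the configuration also lists testers and test URLs. The project's tester code is not shown in these notes, so the following is only a sketch of the idea: send a request with each stored credential and drop the ones that no longer return HTTP 200. The function `prune_invalid` and the injected `fetch_status` callable are hypothetical; in a real tester `fetch_status` would perform the HTTP request against the test URL.

```python
from typing import Callable, Dict


def prune_invalid(credentials: Dict[str, str],
                  fetch_status: Callable[[str], int]) -> Dict[str, str]:
    """Remove credentials whose test request does not return HTTP 200.

    credentials: mapping of username to stored token or cookie.
    fetch_status: callable that takes a credential and returns a status code.
    """
    # Iterate over a copy so entries can be deleted during the loop.
    for username, token in list(credentials.items()):
        if fetch_status(token) != 200:
            del credentials[username]
    return credentials


if __name__ == '__main__':
    # Fake status lookup standing in for a real request to the test URL.
    fake_statuses = {'good-token': 200, 'expired-token': 401}
    pool = {'admin1': 'good-token', 'admin2': 'expired-token'}
    print(prune_invalid(pool, fake_statuses.get))
```

Once an invalid credential is removed, a `generate` method like the one above would create a fresh one on its next run, because it only skips accounts that still have a stored credential.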
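- Configure how accounts and passwords are generated
  Real accounts can be registered by using virtual phone numbers to receive the verification codes; this costs money, but not much. The example here simply seeds the account store with admin1 through admin{MAX_COUNT}, using the same string as username and password:
    def init(self):
        """
        do init
        """
        for i in range(1, self.MAX_COUNT + 1):
            self.account_operator.set(f'admin{i}', f'admin{i}')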
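The pool therefore keeps two stores: accounts (username to password) and credentials (username to token or cookie). The sketch below replays that flow in memory, so it can be run without Redis or network access. `MemoryStore`, `seed_accounts`, `generate_missing` and the fake `login` function are all hypothetical stand-ins, not the project's classes.

```python
class MemoryStore:
    """Tiny in-memory stand-in for a Redis-backed store."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def all(self):
        return dict(self._data)


def seed_accounts(store, max_count):
    """Fill the account store with admin1..admin{max_count}."""
    for i in range(1, max_count + 1):
        store.set(f'admin{i}', f'admin{i}')


def generate_missing(accounts, credentials, login):
    """Log in every account without a credential; return how many were created."""
    created = 0
    for username, password in accounts.all().items():
        if credentials.get(username):
            continue  # credential already exists, skip
        token = login(username, password)
        if token:
            credentials.set(username, token)
            created += 1
    return created


if __name__ == '__main__':
    accounts, credentials = MemoryStore(), MemoryStore()
    seed_accounts(accounts, 3)
    # Fake login standing in for the real HTTP login request.
    fake_login = lambda username, password: f'token-for-{username}'
    print(generate_missing(accounts, credentials, fake_login))
    print(credentials.all())
```

Separating the two stores lets credentials be regenerated at any time from the accounts, which is what makes it safe to discard a credential as soon as it stops working.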
- Start the Redis service
- Run the project:
    python run.py antispider7
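- Once the project is running, the account pool's HTTP interface is reachable at http://127.0.0.1:<port>, where <port> is the API port the project is configured with. This is not 6379, which is the Redis port set above.
  - //random: returns a random JWT string or cookie
  - //count: returns the number of credentials in the pool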
Project source code
GitHub - Python3WebSpider/AccountPool: Account Pool