User-Agent pool:
First, define a list of User-Agent strings:
ua_list = [
"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/22.0.1207.1 Safari/537.1",
"Mozilla/5.0 (X11; CrOS i686 2268.111.0) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.57 Safari/536.11",
"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.6 (KHTML, like Gecko) Chrome/20.0.1092.0 Safari/536.6",
"Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.6 (KHTML, like Gecko) Chrome/20.0.1090.0 Safari/536.6",
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36'
]
Pick one at random from the list:
import random
user_agent = random.choice(ua_list)
Put the randomly chosen User-Agent into the request headers:
head = {
'Accept': 'application/json, text/javascript, */*; q=0.01',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'zh-CN,zh;q=0.9',
'Connection': 'keep-alive',
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
'Host': 'www.lagou.com',
'Origin': 'https://www.lagou.com',
'Referer': 'https://www.lagou.com/jobs/list_python?px=default&city=%E5%B9%BF%E5%B7%9E',
'User-Agent': user_agent,
'X-Anit-Forge-Code': '0',
'X-Anit-Forge-Token': None,
'X-Requested-With': 'XMLHttpRequest',
}
Send the request:
result = session.post(url=url, headers=head, data=data, proxies=ip).json()
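The steps above can be wrapped in a small helper so that every request gets a freshly chosen User-Agent. This is only a minimal sketch: the names `UA_POOL`, `BASE_HEADERS` and `build_headers` are hypothetical, and the pool and headers are shortened copies of the ones shown earlier.

```python
import random

# Hypothetical shortened pool; in practice reuse the full ua_list from above.
UA_POOL = [
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/22.0.1207.1 Safari/537.1",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36",
]

# Headers that stay the same for every request (a subset of `head`).
BASE_HEADERS = {
    "Accept": "application/json, text/javascript, */*; q=0.01",
    "Referer": "https://www.lagou.com/jobs/list_python?px=default&city=%E5%B9%BF%E5%B7%9E",
    "X-Requested-With": "XMLHttpRequest",
}


def build_headers(ua_pool=UA_POOL, base=BASE_HEADERS, rng=random):
    """Return a new header dict with a randomly chosen User-Agent."""
    headers = dict(base)  # copy, so the shared base dict is never mutated
    headers["User-Agent"] = rng.choice(ua_pool)
    return headers
```

Calling `build_headers()` right before each `session.post(...)` changes the User-Agent per request instead of once per script run.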
IP proxy pool:
Define a list of proxy IPs:
proxies = [{'http': '58.218.200.223:30059'},
{'http': '58.218.200.223:30209'},
{'http': '58.218.200.223:30181'},
{'http': '58.218.200.223:30476'},
{'http': '58.218.200.223:30343'},
{'http': '58.218.200.223:30424'},
{'http': '58.218.200.223:30464'},
{'http': '58.218.200.223:30352'},
{'http': '58.218.200.223:30431'},
{'http': '58.218.200.223:30342'},
{'http': '58.218.200.223:30121'},
{'http': '58.218.200.223:30101'},
{'http': '58.218.200.223:30470'},
{'http': '58.218.200.223:30327'},
{'http': '58.218.200.223:30407'},
{'http': '58.218.200.223:30247'},
{'http': '58.218.200.223:30014'},
{'http': '58.218.200.223:30222'},
{'http': '58.218.200.223:30354'},
{'http': '58.218.200.223:30445'},
{'http': '58.218.200.223:30058'},
{'http': '58.218.200.223:30013'},
{'http': '58.218.200.223:30359'},
{'http': '58.218.200.223:30231'},
{'http': '58.218.200.223:30486'},
{'http': '58.218.200.223:30116'},
{'http': '58.218.200.223:30151'},
{'http': '58.218.200.223:30367'},
{'http': '58.218.200.223:30421'},
{'http': '58.218.200.223:30117'},
{'http': '58.218.200.223:30202'},
{'http': '58.218.200.223:30115'},
{'http': '58.218.200.223:30175'},
{'http': '58.218.200.223:30010'},
{'http': '58.218.200.223:30457'},
{'http': '58.218.200.223:30264'},
{'http': '58.218.200.223:30085'},
{'http': '58.218.200.223:30095'},
{'http': '58.218.200.223:30339'},
{'http': '58.218.200.223:30307'},
{'http': '58.218.200.223:30114'},
{'http': '58.218.200.223:30073'},
{'http': '58.218.200.223:30428'},
{'http': '58.218.200.223:30299'},
{'http': '58.218.200.223:30096'},
{'http': '58.218.200.223:30499'},
{'http': '58.218.200.223:30271'},
{'http': '58.218.200.223:30230'},
{'http': '58.218.200.223:30303'},
{'http': '58.218.200.223:30072'}, ]
Pick a random proxy from the list:
ip = random.choice(proxies)
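When sending the request, pass the chosen proxy through the proxies parameter. Note that requests selects the proxy by URL scheme, so an entry keyed only by 'http' is not used for https:// URLs; add an 'https' key if the target URL is HTTPS: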
result = session.post(url=url, headers=head, data=data, proxies=ip).json()
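A single random proxy may be dead, so it can help to try a few of them in turn. The following is a rough sketch under assumptions, not part of the original script: `to_proxy_dict` and `request_with_rotation` are hypothetical helpers, the pool is assumed to be a list of plain 'host:port' strings, and the proxies are assumed to be HTTP proxies that can also tunnel HTTPS traffic.

```python
import random


def to_proxy_dict(address):
    """Map one 'host:port' string to both schemes, so https:// URLs are proxied too."""
    return {"http": f"http://{address}", "https": f"http://{address}"}


def request_with_rotation(send, addresses, max_attempts=3,
                          retry_on=(Exception,), rng=random):
    """Call send(proxies=...) with a different random proxy on each attempt.

    `send` is any callable that accepts a `proxies` keyword argument.
    `retry_on` lists the exception types that count as a failed proxy.
    """
    addresses = list(addresses)
    # Draw distinct proxies so a failed one is not retried immediately.
    candidates = rng.sample(addresses, k=min(max_attempts, len(addresses)))
    last_error = None
    for address in candidates:
        try:
            return send(proxies=to_proxy_dict(address))
        except retry_on as exc:
            last_error = exc  # this proxy failed; move on to the next one
    raise RuntimeError("all proxies failed") from last_error
```

With requests, `send` could be `functools.partial(session.post, url, headers=head, data=data)` and `retry_on` could be `(requests.exceptions.RequestException,)`, so that only network-level failures trigger a switch to another proxy.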