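1. How do I check my current IP?
Press Win+R, type cmd, run ipconfig, and read the IPv4 Address line. This is your machine's address on the local network, which is usually not the public IP that a remote server sees.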
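The same lookup can also be done from Python. The snippet below is a minimal sketch rather than a replacement for ipconfig: the function name local_ipv4 and the target address 8.8.8.8 are assumptions chosen for illustration. Calling connect() on a UDP socket sends no packets; it only makes the operating system pick the outgoing interface, whose address is then read back.

```python
import socket


def local_ipv4():
    """Return the IPv4 address of the interface used for outbound traffic."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # 8.8.8.8 is an arbitrary public address (an assumption); nothing is sent to it.
        sock.connect(('8.8.8.8', 80))
        return sock.getsockname()[0]
    except OSError:
        # No usable route (for example, offline): fall back to the loopback address.
        return '127.0.0.1'
    finally:
        sock.close()


print(local_ipv4())
```

Like ipconfig, this reports the local address, not the public one.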
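2. Using a proxy: replace the HTTPHandler from the previous post with ProxyHandler [for details see https://mp.csdn.net/mp_blog/creation/editor/131574569]
ProxyHandler takes a proxies parameter: a dictionary that maps a URL scheme such as 'http' to your proxy address in ip:port form. [快代理 (Kuaidaili)]
Example [快代理 (Kuaidaili)]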
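To make the shape of that dictionary concrete, here is a small sketch. The helper build_proxies is hypothetical (it is not part of urllib), and registering the same address under both 'http' and 'https' assumes the proxy accepts both kinds of traffic.

```python
import urllib.request


def build_proxies(host, port):
    """Hypothetical helper: map both URL schemes to one 'ip:port' proxy address."""
    address = f'{host}:{port}'
    return {'http': address, 'https': address}


# Sample address only; a real, working proxy has to be supplied.
proxies = build_proxies('61.216.156.222', 60808)
handler = urllib.request.ProxyHandler(proxies=proxies)
print(proxies)
```

Creating the handler does not contact the proxy; a connection is only attempted when an opener built from it opens a URL.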
import urllib.request
# This URL is a Bing search for 快代理 (Kuaidaili)
url = 'https://cn.bing.com/search?pglt=41&q=%E5%BF%AB%E4%BB%A3%E7%90%86'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.67'
}
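proxies = {
    # Find working proxy IPs yourself. 61.216.156.222 is the IP and 60808 is the port, joined by a colon.
    'http': '61.216.156.222:60808',
    # ProxyHandler picks the entry matching the URL scheme. The URL above is https, so without
    # an 'https' entry the proxy is skipped. Reusing the address assumes the proxy accepts HTTPS.
    'https': '61.216.156.222:60808'
}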
request = urllib.request.Request(url=url, headers=headers)
handler = urllib.request.ProxyHandler(proxies=proxies)
opener = urllib.request.build_opener(handler)
response = opener.open(request)
content = response.read().decode('utf-8')
with open('daili.html', 'w', encoding='utf-8') as fp:
    fp.write(content)
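Free proxies often stop responding, so in practice the call to opener.open tends to need a timeout and error handling. The following is a rough sketch under that assumption; the function name fetch_via_proxy and the 5-second default timeout are illustrative choices, not part of the original example.

```python
import http.client
import urllib.parse
import urllib.request


def fetch_via_proxy(url, proxy, timeout=5):
    """Fetch url through proxy ('ip:port'); return the body as bytes, or None on failure."""
    # Register the proxy under the scheme of the URL so that it is actually used.
    scheme = urllib.parse.urlsplit(url).scheme
    handler = urllib.request.ProxyHandler({scheme: proxy})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.read()
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, timeouts and refused connections are all OSError subclasses.
        return None
```

A call such as fetch_via_proxy('http://www.baidu.com', '61.216.156.222:60808') then returns either the page bytes or None, instead of raising when the proxy is dead.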
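3. Proxy pool
Put several proxy IPs in a list and pick one at random for each request, so that successive requests tend to reach the server from different IPs [use random to choose the IP].
Building the proxy IP pool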
import random
proxies_pool = [
    {'http': 'ip1'},
    {'http': 'ip2'},
    {'http': 'ip3'}
]
proxies = random.choice(proxies_pool)
There is no upper limit on the number of IPs in the pool.
Complete code
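Because the pool can grow arbitrarily large, typing each dictionary by hand becomes error-prone. One possible approach, sketched below, is to build the pool from a plain list of 'ip:port' strings. The helper make_pool and the placeholder addresses are assumptions for illustration.

```python
import random


def make_pool(addresses):
    """Hypothetical helper: turn 'ip:port' strings into ProxyHandler-style dictionaries."""
    pool = []
    for address in addresses:
        address = address.strip()
        if address:  # skip blank entries
            pool.append({'http': address})
    return pool


# Placeholders only; real proxy addresses have to be supplied.
addresses = ['ip1:port1', 'ip2:port2', '', 'ip3:port3']
proxies_pool = make_pool(addresses)
proxies = random.choice(proxies_pool)
print(proxies)
```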
import urllib.request
import random
url = 'http://www.baidu.com'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.67'
}
# ip1 and the like are placeholders; replace them with real proxy addresses in ip:port form
proxies_pool = [
    {'http': 'ip1'},
    {'http': 'ip2'},
    {'http': 'ip3'}
]
proxies = random.choice(proxies_pool)
request = urllib.request.Request(url=url, headers=headers)
handler = urllib.request.ProxyHandler(proxies=proxies)
opener = urllib.request.build_opener(handler)
response = opener.open(request)
content = response.read().decode('utf-8')
with open('baidu.html', 'w', encoding='utf-8') as fp:
    fp.write(content)
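The script above picks one proxy and fails if that proxy happens to be dead. A possible extension, shown here only as a sketch, is to try several randomly chosen proxies in turn until one answers. The function name open_with_pool and its max_tries and timeout parameters are assumptions, not something the original code defines.

```python
import http.client
import random
import urllib.request


def open_with_pool(url, proxies_pool, headers=None, timeout=5, max_tries=3):
    """Try up to max_tries distinct random proxies; return the body bytes or None."""
    tries = min(max_tries, len(proxies_pool))
    # random.sample picks distinct entries, so no proxy is tried twice.
    for proxies in random.sample(proxies_pool, k=tries):
        handler = urllib.request.ProxyHandler(proxies=proxies)
        opener = urllib.request.build_opener(handler)
        request = urllib.request.Request(url=url, headers=headers or {})
        try:
            with opener.open(request, timeout=timeout) as response:
                return response.read()
        except (OSError, http.client.HTTPException):
            continue  # this proxy failed; move on to the next one
    return None
```

With real addresses in the pool, content = open_with_pool('http://www.baidu.com', proxies_pool, headers=headers) yields the page bytes, which can then be decoded and written to baidu.html as before.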