1. Prerequisites
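In the Scrapy source file at the relative path Lib/site-packages/scrapy/core/downloader/handlers/http11.py, comment out the following two lines:
    if isinstance(agent, self._TunnelingAgent):
        headers.removeHeader(b'Proxy-Authorization')
This step is required. If these lines stay active, Scrapy removes the Proxy-Authorization header from requests handled by the tunneling agent, and dynamic forwarding stops working.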
2. Example middleware
import hashlib
import time


class EpDownloaderMiddleware(object):
    def __init__(self):
        # Placeholders: replace with your own order number and secret.
        self.orderno = "XXXXXXXXXXXXXXXXXXXXXXX"
        self.secret = "XXXXXXXXXXXXXXXXXXXXXXX"

    def process_request(self, request, spider):
        # Route the request through the dynamic forwarding proxy.
        request.meta['proxy'] = 'http://forward.xdaili.cn:80'
        timestamp = str(int(time.time()))  # current Unix time in seconds
        string = "orderno=" + self.orderno + "," + "secret=" + self.secret + "," + "timestamp=" + timestamp
        md5_string = hashlib.md5(string.encode('utf-8')).hexdigest()
        sign = md5_string.upper()  # signature: uppercase MD5 hex digest
        auth = "sign=" + sign + "&" + "orderno=" + self.orderno + "&" + "timestamp=" + timestamp
        request.headers["Proxy-Authorization"] = auth
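
The signing step in process_request can be checked without running a crawl if it is pulled out into a plain function. The following is a minimal sketch under that assumption. The name build_proxy_auth and its optional timestamp parameter are hypothetical and do not come from the original middleware. The string layout is copied from the code above.

```python
import hashlib
import time


def build_proxy_auth(orderno, secret, timestamp=None):
    """Build a Proxy-Authorization value in the same format as the middleware.

    build_proxy_auth is a hypothetical helper name; the timestamp argument
    exists only so the result can be reproduced in a test.
    """
    if timestamp is None:
        timestamp = str(int(time.time()))  # current Unix time in seconds
    # Same plaintext layout as the middleware: orderno, secret, timestamp.
    raw = "orderno={},secret={},timestamp={}".format(orderno, secret, timestamp)
    # Signature is the uppercase MD5 hex digest of that plaintext.
    sign = hashlib.md5(raw.encode("utf-8")).hexdigest().upper()
    return "sign={}&orderno={}&timestamp={}".format(sign, orderno, timestamp)
```

Inside process_request the header could then be set with request.headers["Proxy-Authorization"] = build_proxy_auth(self.orderno, self.secret). The middleware also has to be enabled through the DOWNLOADER_MIDDLEWARES setting in the project's settings.py before Scrapy will call it.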