Setting the User-Agent dynamically in Scrapy
1. Add the following to middlewares.py:
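import random


class RandomUserAgent(object):
    """Downloader middleware that assigns a random User-Agent to each request."""

    def __init__(self, agents):
        self.agents = agents

    @classmethod
    def from_crawler(cls, crawler):
        # Read the candidate User-Agent strings from the USER_AGENTS setting.
        return cls(crawler.settings.getlist('USER_AGENTS'))

    def process_request(self, request, spider):
        # setdefault only fills in the header when the request does not already carry one.
        request.headers.setdefault('User-Agent', random.choice(self.agents))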
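To see what process_request does without starting a crawl, the sketch below runs the same logic against stand-in objects. The plain dict used for the headers and the three agent strings are assumptions made for illustration. Scrapy's real Headers object is case-insensitive and stores bytes, so this is only a rough model of the behaviour.

```python
import random
from types import SimpleNamespace


class RandomUserAgent:
    """Same logic as the middleware above, repeated so the sketch is self-contained."""

    def __init__(self, agents):
        self.agents = agents

    def process_request(self, request, spider):
        request.headers.setdefault('User-Agent', random.choice(self.agents))


# Hypothetical stand-ins: a plain dict plays the role of request.headers.
AGENTS = ["agent-a", "agent-b", "agent-c"]
middleware = RandomUserAgent(AGENTS)

# A request without a User-Agent gets one picked at random from the list.
fresh = SimpleNamespace(headers={})
middleware.process_request(fresh, spider=None)

# A request that already has a User-Agent is left untouched by setdefault.
preset = SimpleNamespace(headers={"User-Agent": "already-set"})
middleware.process_request(preset, spider=None)

print(fresh.headers["User-Agent"])   # one of the entries in AGENTS
print(preset.headers["User-Agent"])  # already-set
```

Because setdefault never overwrites, a header set earlier in the chain (by the spider or by another middleware) takes precedence over the random choice.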
2. In settings.py, update DOWNLOADER_MIDDLEWARES:
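DOWNLOADER_MIDDLEWARES = {
    # Disable the built-in middleware (order 500), which would otherwise set the default User-Agent first.
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
    # xxxxx is the project's package name.
    'xxxxx.middlewares.RandomUserAgent': 544,
}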
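The order numbers matter here. Scrapy calls process_request on downloader middlewares in increasing order, its built-in UserAgentMiddleware sits at 500 by default, and a value of None disables a middleware. The sketch below is a simplified model of that chain, not Scrapy's actual implementation. The handler names and agent strings are hypothetical, and each handler just mimics a setdefault call on a plain dict.

```python
# Hypothetical handlers: each one mimics a middleware that calls setdefault.
HANDLERS = {
    "builtin.UserAgentMiddleware": lambda h: h.setdefault("User-Agent", "default-agent"),
    "project.RandomUserAgent": lambda h: h.setdefault("User-Agent", "random-agent"),
}


def apply_in_order(orders):
    """Run the enabled handlers in increasing order and return the resulting headers."""
    headers = {}
    enabled = [(order, name) for name, order in orders.items() if order is not None]
    for _, name in sorted(enabled):
        HANDLERS[name](headers)
    return headers


# Built-in middleware still enabled: it runs first (500 < 544) and its value wins.
both = apply_in_order({
    "builtin.UserAgentMiddleware": 500,
    "project.RandomUserAgent": 544,
})

# Built-in middleware disabled with None: the custom value is used.
only_custom = apply_in_order({
    "builtin.UserAgentMiddleware": None,
    "project.RandomUserAgent": 544,
})

print(both["User-Agent"])         # default-agent
print(only_custom["User-Agent"])  # random-agent
```

In this model a setdefault-based middleware placed after the built-in one has no effect unless the built-in one is disabled or the custom one is given an order below 500.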
3. Also add the following to settings.py:
USER_AGENTS = [
    "Mozilla/5.0 (Linux; U; Android 2.3.6; en-us; Nexus S Build/GRK39F) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1",
    "Avant Browser/1.2.789rel1 (http://www.avantbrowser.com)",