In a Scrapy middleware or pipeline, custom values defined in settings.py can be read through spider.settings:
middleware.py
def process_request(self, request, spider):
    print(spider.settings['YOUR_SETTINGS'])
pipelines.py
def process_item(self, item, spider):
    print(spider.settings['YOUR_SETTINGS'])
    return item  # a pipeline must return the item (or raise DropItem)
This works because the crawler's settings are assigned to the spider when the spider is initialized:
scrapy/spiders/__init__.py
def _set_crawler(self, crawler):
    self.crawler = crawler
    self.settings = crawler.settings
    crawler.signals.connect(self.close, signals.spider_closed)
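The hand-off can be reproduced without Scrapy installed. The following is a minimal sketch, not Scrapy's actual code: FakeCrawler, FakeSpider and DemoPipeline are hypothetical stand-ins, and a plain dict plays the role of the settings object.

```python
class FakeCrawler:
    """Hypothetical stand-in for Scrapy's crawler; holds only settings."""

    def __init__(self, settings):
        self.settings = settings


class FakeSpider:
    """Hypothetical stand-in for a Scrapy spider."""

    def _set_crawler(self, crawler):
        # Same assignment as in the excerpt above: the spider keeps a
        # reference to the crawler's settings.
        self.crawler = crawler
        self.settings = crawler.settings


class DemoPipeline:
    def process_item(self, item, spider):
        # Read the setting through the spider, as in pipelines.py above.
        item["setting_seen"] = spider.settings["YOUR_SETTINGS"]
        return item


crawler = FakeCrawler({"YOUR_SETTINGS": "hello"})  # a dict stands in for Settings
spider = FakeSpider()
spider._set_crawler(crawler)
result = DemoPipeline().process_item({}, spider)
print(result)
```

Because the spider stores a reference rather than a copy, spider.settings and crawler.settings are the same object in this sketch.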
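To read a setting when the middleware itself is initialized, define from_crawler and pass crawler.settings to the constructor: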
middleware.py
class testMiddleware(object):
    def __init__(self, settings):
        print(type(settings))
        print(settings.get('YOUR_SETTINGS'))

    @classmethod
    def from_crawler(cls, crawler):
        return cls(crawler.settings)
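A minimal sketch of how from_crawler feeds the constructor, runnable without Scrapy. The crawler here is a hypothetical stand-in built with types.SimpleNamespace, its settings attribute is a plain dict, and DemoMiddleware is an illustrative name. In a real project Scrapy passes its own crawler, whose settings object also offers get().

```python
from types import SimpleNamespace


class DemoMiddleware:
    def __init__(self, settings):
        # Read the value once, at construction time; None if it is unset.
        self.your_setting = settings.get("YOUR_SETTINGS")

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this hook, when it is defined, to build the middleware.
        return cls(crawler.settings)


# Hypothetical stand-in crawler; a dict plays the role of the settings object.
fake_crawler = SimpleNamespace(settings={"YOUR_SETTINGS": "hello"})
middleware = DemoMiddleware.from_crawler(fake_crawler)
print(middleware.your_setting)
```

Storing the value on self in __init__ avoids a lookup on every request, at the cost of not seeing later changes to the setting.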
The values from settings.py get into the settings object through setmodule, defined in scrapy/settings/__init__.py:
def setmodule(self, module, priority='project'):
    self._assert_mutability()
    if isinstance(module, six.string_types):
        module = import_module(module)
    for key in dir(module):
        if key.isupper():
            self.set(key, getattr(module, key), priority)
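Note that only names written entirely in uppercase are imported, so custom settings in settings.py must use all-uppercase names.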
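The effect of the key.isupper() filter can be checked in isolation. This sketch assumes a throwaway module built with types.ModuleType in place of a real settings.py. collect_uppercase is a hypothetical helper that mirrors only the loop above, not the priority handling.

```python
from types import ModuleType


def collect_uppercase(module):
    """Return the module attributes whose names are entirely uppercase."""
    return {key: getattr(module, key) for key in dir(module) if key.isupper()}


# Throwaway module standing in for settings.py.
fake_settings = ModuleType("fake_settings")
fake_settings.YOUR_SETTINGS = "kept"
fake_settings.your_settings = "ignored: lowercase"
fake_settings.Your_Settings = "ignored: mixed case"

loaded = collect_uppercase(fake_settings)
print(loaded)
```

Only YOUR_SETTINGS survives the filter; the lowercase and mixed-case names, along with the module's dunder attributes, are skipped.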