Per-spider settings
# Class attribute on the spider; overrides the project settings for this spider only
custom_settings = {
    'SOME_SETTING': 'some value',
}
Different pipelines for different spiders
custom_settings = {
    'ITEM_PIPELINES': {
        'video.pipelines.VideoPipeline': 301,
    }
}
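The integer attached to each pipeline is its order: Scrapy runs pipelines from the lowest value to the highest, and the values are conventionally kept in the 0 to 1000 range. The following is a plain-Python sketch of that ordering rule, not Scrapy's own code. The helper `ordered_pipelines` and the `CleanupPipeline` and `UnusedPipeline` entries are hypothetical.

```python
def ordered_pipelines(item_pipelines):
    """Return pipeline paths in the order they would run (lowest value first).

    Entries whose value is None are treated as disabled and skipped.
    """
    enabled = {path: order for path, order in item_pipelines.items()
               if order is not None}
    return sorted(enabled, key=enabled.get)


pipelines = {
    'video.pipelines.VideoPipeline': 301,
    'video.pipelines.CleanupPipeline': 100,   # hypothetical second pipeline
    'video.pipelines.UnusedPipeline': None,   # hypothetical, disabled
}
run_order = ordered_pipelines(pipelines)
print(run_order)
```

Keep in mind that an `ITEM_PIPELINES` dictionary given in `custom_settings` replaces the project-level one for that spider rather than being merged with it, so every pipeline the spider needs should be listed.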
Cookie settings
custom_settings = {
    # Even if settings.py sets COOKIES_ENABLED to False, this turns cookies
    # on for this spider only. Other settings can be overridden the same way.
    'COOKIES_ENABLED': True,
}
settings: create custom_settings.py in the same directory as settings.py
# -*- coding: utf-8 -*-
custom_settings_for_spider1 = {
    'LOG_LEVEL': 'INFO',
    'DOWNLOAD_DELAY': 0,
    'COOKIES_ENABLED': False,  # Scrapy enables cookies by default; turn them off here
    'DOWNLOADER_MIDDLEWARES': {
        'video_spider.middlewares.ProxiesMiddleware': 400,
        'video_spider.middlewares.SeleniumMiddleware': 543,
        # Disable Scrapy's built-in User-Agent middleware
        'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
    },
}
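When several spiders share most of their settings, one option is to derive each dictionary from a common base inside custom_settings.py. This is only a sketch under that assumption: `BASE_SETTINGS`, `derive_settings`, and `custom_settings_for_spider2` are hypothetical names that are not part of the original project.

```python
import copy

# Hypothetical shared defaults for all spiders in the project.
BASE_SETTINGS = {
    'LOG_LEVEL': 'INFO',
    'DOWNLOAD_DELAY': 0,
    'COOKIES_ENABLED': False,
    'DOWNLOADER_MIDDLEWARES': {
        'video_spider.middlewares.ProxiesMiddleware': 400,
    },
}


def derive_settings(base, **overrides):
    """Return a deep copy of base with top-level keys replaced by overrides."""
    derived = copy.deepcopy(base)
    derived.update(overrides)
    return derived


# A second spider that needs cookies and a slower crawl rate.
custom_settings_for_spider2 = derive_settings(
    BASE_SETTINGS,
    COOKIES_ENABLED=True,
    DOWNLOAD_DELAY=2,
)
print(custom_settings_for_spider2['COOKIES_ENABLED'])
```

The deep copy keeps nested dictionaries such as `DOWNLOADER_MIDDLEWARES` separate, so changing one spider's middleware list does not leak into the others.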
Import the custom settings in the spider file
import scrapy
from scrapy import Request
from scrapy.utils.project import get_project_settings
from scrapy import signals
from pydispatch import dispatcher
# settings (assumes the project package is video_spider, as in the middleware paths)
from video_spider.custom_settings import custom_settings_for_spider1


class ShanbaySpider(scrapy.Spider):
    name = 'shanbay'
    allowed_domains = ['shanbay.com']
    start_urls = ['http://shanbay.com/']
    custom_settings = custom_settings_for_spider1