Setting up items and ITEM_PIPELINES when a Scrapy project has multiple spiders:
1. Running multiple spiders at the same time
Create a crawl.py file in the project directory with the following code:
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings
runs = CrawlerProcess(get_project_settings())
runs.crawl("zt_ls")    # name of spider 1
runs.crawl("city_ls")  # name of spider 2
runs.start()
Running crawl.py directly will start both spiders.
2. Assigning items to different spiders:
In the items.py file:
# Define here the models for your scraped items
#
# See documentation in:
# https://docs.scrapy.org/en/latest/topics/items.html
import scrapy
class Keshi1huaItem(scrapy.Item):
    # define the fields for your item here like:
    name1 = scrapy.Field()
    name2 = scrapy.Field()
In other words, all fields for every spider go into one shared item class, and the pipeline later decides how to handle each item by checking which spider produced it.
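For illustration, each spider fills only the fields it cares about and leaves the rest empty. A minimal sketch of that idea (plain dicts stand in for scrapy.Item here so it runs outside a Scrapy project; the field names match the shared Keshi1huaItem above, and the example values are made up):

```python
# Plain-dict stand-ins for the shared Keshi1huaItem; in the real spiders
# you would instantiate Keshi1huaItem() and yield it from parse().

def build_zt_ls_item(value):
    # the zt_ls spider only populates name1
    return {"name1": value}

def build_city_ls_item(value):
    # the city_ls spider only populates name2
    return {"name2": value}

item1 = build_zt_ls_item("example-a")
item2 = build_city_ls_item("example-b")
print(item1)  # {'name1': 'example-a'}
print(item2)  # {'name2': 'example-b'}
```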
3. Dispatching in the pipeline:
class Keshi1huaPipeline:
    def process_item(self, item, spider):
        if spider.name == "zt_ls":
            print("demo1")  # handle items from the zt_ls spider here
        elif spider.name == "city_ls":
            print("demo2")  # handle items from the city_ls spider here
        return item  # always return the item so later pipelines receive it
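The same dispatch idea can be tried outside Scrapy: process_item only needs an object with a .name attribute, so a SimpleNamespace can stand in for the real spider instance. In this sketch the "source" tag is a hypothetical stand-in for whatever per-spider handling you actually need:

```python
from types import SimpleNamespace

class DemoPipeline:
    """Route items to per-spider handling based on spider.name."""

    def process_item(self, item, spider):
        if spider.name == "zt_ls":
            item["source"] = "zt_ls"    # e.g. tag or store zt_ls items
        elif spider.name == "city_ls":
            item["source"] = "city_ls"  # e.g. tag or store city_ls items
        return item  # hand the item on to any later pipeline

pipeline = DemoPipeline()
fake_spider = SimpleNamespace(name="zt_ls")  # stand-in for the real spider
out = pipeline.process_item({"name1": "value"}, fake_spider)
print(out)  # {'name1': 'value', 'source': 'zt_ls'}
```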
4. Enable the pipeline in settings.py:
ITEM_PIPELINES = {
'keshihua.pipelines.Keshi1huaPipeline': 300,
}
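If you later split the handling into one pipeline class per spider, both can be enabled at once; the integer is the order value, and pipelines with lower values run first. A hypothetical example (CityPipeline is an assumed second class, not defined above):

```python
ITEM_PIPELINES = {
    'keshihua.pipelines.Keshi1huaPipeline': 300,  # runs first (lower value)
    'keshihua.pipelines.CityPipeline': 400,       # hypothetical second pipeline, runs after
}
```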