1. Inside your Scrapy project, create a directory named commands at the same level as the spiders directory, plus an empty __init__.py so it can be imported as a package:
cd path/to/your_project/yourprojectname  # the package that contains spiders/
mkdir commands
touch commands/__init__.py
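The resulting layout should look like this (yourprojectname stands in for your actual project package):

yourprojectname/
    scrapy.cfg
    yourprojectname/
        __init__.py
        settings.py
        commands/
            __init__.py
            crawlall.py
        spiders/
            __init__.py
            ...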
2. Inside commands, add a file named crawlall.py with the following code:
from scrapy.commands import ScrapyCommand  # Scrapy >= 1.0; older releases used scrapy.command

class Command(ScrapyCommand):
    requires_project = True

    def syntax(self):
        return '[options]'

    def short_desc(self):
        return 'Runs all of the spiders'

    def run(self, args, opts):
        # self.crawler_process is the CrawlerProcess Scrapy attaches to every
        # command; its spider_loader can enumerate all spiders in the project.
        for spider_name in self.crawler_process.spider_loader.list():
            self.crawler_process.crawl(spider_name)
        # A single reactor runs every queued spider concurrently;
        # start() blocks until all of them have finished.
        self.crawler_process.start()
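The custom command is a thin wrapper over Scrapy's CrawlerProcess, so for a quick test you can do the same thing from a standalone script; the file name run_all.py below is my own choice for illustration:

# run_all.py -- put it next to scrapy.cfg and run: python run_all.py
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

settings = get_project_settings()   # resolved through scrapy.cfg in the cwd
process = CrawlerProcess(settings)

# Queue every spider the project defines, then run them all together.
for spider_name in process.spider_loader.list():
    process.crawl(spider_name)

process.start()   # blocks until every spider has finished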
3. Add the following setting to settings.py:
COMMANDS_MODULE = 'yourprojectname.commands'
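To confirm the command is registered, list the project's commands from inside the project; the exact output is a sketch from my environment:

cd path/to/your_project
scrapy -h   # the command list should now include: crawlall  Runs all of the spiders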
4. Add a scrapy crawlall entry to your cron jobs, and that's it.
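For example, a crontab line that kicks off all spiders at 02:00 every day (the paths are placeholders; if cron's PATH does not include scrapy, use the absolute path reported by which scrapy):

0 2 * * * cd /path/to/your_project && scrapy crawlall >> /tmp/crawlall.log 2>&1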
?? But what about Windows, where there is no cron?
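A workable substitute (my suggestion, not from the original post) is the Windows Task Scheduler that ships with the OS. From a command prompt, something like:

schtasks /Create /TN "scrapy-crawlall" /SC DAILY /ST 02:00 /TR "cmd /c cd /d C:\path\to\your_project && scrapy crawlall"

The task name, schedule, and path are all placeholders to adapt; scrapy must be on the PATH of the account the task runs under.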