/Library/Frameworks/Python.framework/Versions/3.7/bin/scrapyd-deploy:23: ScrapyDeprecationWarning: Module scrapy.utils.http is deprecated, Please import from w3lib.http instead.
from scrapy.utils.http import basic_auth_header
Unknown target: default
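First, activate the virtual environment you created and install scrapyd and scrapyd-client:
pip3 install scrapyd
pip3 install scrapyd-client
After installation, edit the project's scrapy.cfg file: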
[settings]
default = ZJSpider.settings
[deploy:fengyu]
url = http://localhost:6800/
project = ZJSpider
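The "Unknown target: default" message at the top most likely comes from running scrapyd-deploy without a target name while scrapy.cfg defines only the named target fengyu. As a rough sketch (not scrapyd-client's own code), the deploy sections can be listed with the standard configparser module. The helper name deploy_targets is hypothetical.

```python
import configparser

# Same content as the scrapy.cfg shown above.
SCRAPY_CFG = """
[settings]
default = ZJSpider.settings

[deploy:fengyu]
url = http://localhost:6800/
project = ZJSpider
"""


def deploy_targets(cfg_text):
    """Return {target_name: url} for every [deploy] or [deploy:NAME] section."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    targets = {}
    for section in parser.sections():
        if section == "deploy":
            # An unnamed [deploy] section is what acts as the "default" target.
            targets["default"] = parser.get(section, "url")
        elif section.startswith("deploy:"):
            targets[section.split(":", 1)[1]] = parser.get(section, "url")
    return targets


print(deploy_targets(SCRAPY_CFG))
```

With this file only fengyu is listed, so the target name has to be given explicitly on the command line.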
Then start the server side with the command: scrapyd
Package and deploy the project with: scrapyd-deploy fengyu -p ZJSpider
Then schedule the spider you want to run:
curl http://localhost:6800/schedule.json -d project=ZJSpider -d spider=cnblog
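The same request can be sent from Python. This is a minimal sketch using only the standard library. The function names build_schedule_request and schedule are made up for illustration, and it assumes scrapyd is listening on localhost:6800 with the project already deployed.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_schedule_request(base_url, project, spider):
    """Build the POST request that the curl command above sends."""
    data = urlencode({"project": project, "spider": spider}).encode("utf-8")
    url = base_url.rstrip("/") + "/schedule.json"
    return Request(url, data=data, method="POST")


def schedule(base_url, project, spider):
    """Send the request and return scrapyd's decoded JSON reply."""
    request = build_schedule_request(base_url, project, spider)
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling schedule("http://localhost:6800", "ZJSpider", "cnblog") against a running scrapyd should return a JSON object that includes the id of the new job.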
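Once the request succeeds, the spider is running. Its start and end times can be checked in the scrapyd web interface (http://localhost:6800/jobs). A job listed as finished has ended, either normally or because it hit an error or exited early, so check its log to find out which.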
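Job state can also be read from scrapyd's listjobs.json endpoint instead of the web page. The sketch below only shows how to look a job up in such a reply. The function name job_state and the sample job ids are hypothetical, and the reply shape (pending, running and finished lists of jobs with an id field) is assumed here, so check it against the output of your own scrapyd.

```python
def job_state(listjobs_reply, job_id):
    """Return 'pending', 'running' or 'finished' for job_id, or None if unknown."""
    for state in ("pending", "running", "finished"):
        for job in listjobs_reply.get(state, []):
            if job.get("id") == job_id:
                return state
    return None


# Made-up reply, shaped like the JSON from /listjobs.json?project=ZJSpider
sample_reply = {
    "status": "ok",
    "pending": [],
    "running": [{"id": "job-a", "spider": "cnblog"}],
    "finished": [{"id": "job-b", "spider": "cnblog"}],
}

print(job_state(sample_reply, "job-b"))
```

A job found under finished still needs its log checked, since that list does not say whether the run succeeded.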
To cancel a spider job, call cancel.json with your own project name and job id:
curl http://localhost:6800/cancel.json -d project=myproject -d job=6487ec79947edab326d6db28a2d86511e8247444
If an error is reported, first delete the deployed project:
curl http://localhost:6800/delproject.json -d project=ZJSpider
Then package and deploy it again:
scrapyd-deploy fengyu -p ZJSpider
Then schedule the project's spider again:
curl http://localhost:6800/schedule.json -d project=ZJSpider -d spider=cnblog
A typical error looks like this:
    result = g.send(result)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/crawler.py", line 85, in crawl
    self.spider = self._create_spider(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/crawler.py", line 108, in _create_spider
    return self.spidercls.from_crawler(self, *args, **kwargs)
  File "/private/var/folders/k7/rxv0nc4x64bd1_mq90bt1wrc0000gn/T/qctt-1585613591-rgjmnot0.egg/qctt/spiders/toutiaohao.py", line 30, in from_crawler
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/scrapy/spiders/__init__.py", line 50, in from_crawler
    spider = cls(*args, **kwargs)
TypeError: __init__() got an unexpected keyword argument '_job'
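Solution:
def __init__(self, **kwargs): adding **kwargs to the spider's __init__ (and passing it on to super().__init__) is enough, because scrapyd passes the job id to the spider as the _job keyword argument.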
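To see why the extra keyword argument matters, here is a small sketch that reproduces the failure and the fix without Scrapy installed. BaseSpider is only a stand-in that mimics how scrapy.Spider forwards arguments from from_crawler to the constructor. BrokenSpider, FixedSpider and the job id are made up. In a real project you would subclass scrapy.Spider instead.

```python
class BaseSpider:
    """Stand-in for scrapy.Spider, only so this sketch runs without Scrapy."""

    def __init__(self, name=None, **kwargs):
        self.name = name
        self.__dict__.update(kwargs)  # extra arguments become attributes

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        # Arguments such as _job are passed straight to the constructor.
        return cls(*args, **kwargs)


class BrokenSpider(BaseSpider):
    def __init__(self):  # accepts no extra arguments, so _job is rejected
        super().__init__(name="broken")


class FixedSpider(BaseSpider):
    def __init__(self, **kwargs):  # lets _job and other arguments through
        super().__init__(name="fixed", **kwargs)


try:
    BrokenSpider.from_crawler(None, _job="abc123")
except TypeError as exc:
    print("broken:", exc)

spider = FixedSpider.from_crawler(None, _job="abc123")
print("fixed:", spider._job)
```

The first call fails with a TypeError like the one in the traceback, while the second spider is created and keeps the job id as an attribute.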