1. Install scrapyd on the server: pip3 install scrapyd
2. Copy default_scrapyd.conf from /usr/local/lib/python3.5/dist-packages/scrapyd to /etc/scrapyd/scrapyd.conf
3. In /etc/scrapyd/scrapyd.conf, change bind_address to the server's own address
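The edit in step 3 can also be scripted. The following is a minimal sketch, assuming the config keeps its options in a [scrapyd] section and that 0.0.0.0 (listen on all interfaces) is the address you want. The helper name set_bind_address and the sample content are hypothetical and only serve the illustration.

```python
import configparser
import io


def set_bind_address(conf_text, address):
    """Return conf_text with bind_address in the [scrapyd] section set to address."""
    # interpolation=None keeps any literal '%' in the file untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("scrapyd"):
        parser.add_section("scrapyd")
    parser.set("scrapyd", "bind_address", address)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Assumed sample content; the real file would be /etc/scrapyd/scrapyd.conf.
sample = "[scrapyd]\nbind_address = 127.0.0.1\nhttp_port = 6800\n"
updated = set_bind_address(sample, "0.0.0.0")
print(updated)
```

Note that configparser drops comments when it writes the file back, so editing by hand may be preferable if the comments matter.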
4. Reinstall twisted:
   pip uninstall twisted
   pip install twisted==18.9.0   # the default version is too new and causes the intxxx error
5. On the development machine, install scrapyd-client: pip install scrapyd-client
6. Rename scrapy/script/scrapyd-deploy to scrapyd-deploy.py
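7. Find scrapy.cfg in the project and configure it as follows, changing the server address to match your setup. Comments go on their own lines, because a trailing "# ..." after a value is read as part of the value:
   [deploy:lanjia01]
   # IP address of each server in a distributed setup
   url = http://localhost:6800/
   project = lanjia01
   #[deploy:lanjia02]
   # server IP address
   #url = http://localhost:6800/
   #project = lanjia01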
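To check which deploy targets a scrapy.cfg defines, the file can be read with Python's configparser. This is a rough sketch of one way to inspect it, not necessarily what scrapyd-deploy does internally. The helper deploy_targets and the inline sample are hypothetical; treating a plain [deploy] section as the target named "default" reflects my understanding of scrapyd-client and should be verified against your installed version.

```python
import configparser

# Inline sample mirroring the configuration in step 7.
SAMPLE_CFG = """\
[deploy:lanjia01]
# server address
url = http://localhost:6800/
project = lanjia01
"""


def deploy_targets(cfg_text):
    """Map each deploy target name to its (url, project) pair."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    targets = {}
    for section in parser.sections():
        if section == "deploy":
            name = "default"  # assumption: unnamed section is the default target
        elif section.startswith("deploy:"):
            name = section.split(":", 1)[1]
        else:
            continue  # skip [settings] and anything else
        url = parser.get(section, "url", fallback=None)
        project = parser.get(section, "project", fallback=None)
        targets[name] = (url, project)
    return targets


print(deploy_targets(SAMPLE_CFG))
```

The target name printed here is what goes in the first argument of scrapyd-deploy.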
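8. From the project directory, package and upload a version: scrapyd-deploy default -p lianjia (the first argument is the deploy target name; lianjia is the project name from scrapy.cfg). For a distributed deployment to all targets, use scrapyd-deploy -a. Deployment only works if scrapyd is already running on the server.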
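Since deployment requires a running scrapyd, it may help to probe the server first. The sketch below assumes the server exposes the daemonstatus.json endpoint (available in recent scrapyd releases) and that it answers with a JSON object whose status field is "ok". The function names are hypothetical.

```python
import json
import urllib.request


def status_url(base_url):
    """Build the daemonstatus.json URL from a scrapyd base URL."""
    return base_url.rstrip("/") + "/daemonstatus.json"


def is_ok(body):
    """Return True if the response body is a JSON object with status "ok"."""
    try:
        payload = json.loads(body.decode("utf-8"))
    except ValueError:  # covers invalid JSON and invalid UTF-8
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def scrapyd_is_running(base_url, timeout=3.0):
    """Return True if scrapyd answers at base_url, False on any network error."""
    try:
        with urllib.request.urlopen(status_url(base_url), timeout=timeout) as resp:
            return is_ok(resp.read())
    except OSError:  # URLError, HTTPError, timeouts, refused connections
        return False
```

A call such as scrapyd_is_running("http://localhost:6800/") before running scrapyd-deploy gives an early, readable failure instead of a deploy error.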
9. Download and install curl, then open bin/curl.exe
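10. Use the following commands in cmd to run and stop the spider (xxxx stands for the spider name or the job id)
Run the spider (Linux): curl http://localhost:6800/schedule.json -d project=lianjia -d spider=xxxx
Stop the spider (Linux): curl http://localhost:6800/cancel.json -d project=lianjia -d job=xxxx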
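The same two calls can be made from Python without curl. This is a minimal sketch using only the standard library. It builds form-encoded POST requests equivalent to curl's -d options. The helper names and the spider name "my_spider" are placeholders, and the actual send is left commented out because it needs a running scrapyd.

```python
import urllib.parse
import urllib.request


def build_request(base_url, endpoint, **params):
    """Build a form-encoded POST request, like `curl <url> -d key=value`."""
    url = base_url.rstrip("/") + "/" + endpoint
    data = urllib.parse.urlencode(params).encode("ascii")
    return urllib.request.Request(url, data=data, method="POST")


def schedule(base_url, project, spider):
    """Request for schedule.json: start a spider run."""
    return build_request(base_url, "schedule.json", project=project, spider=spider)


def cancel(base_url, project, job):
    """Request for cancel.json: stop the job with the given id."""
    return build_request(base_url, "cancel.json", project=project, job=job)


# "my_spider" is a placeholder spider name.
req = schedule("http://localhost:6800/", "lianjia", "my_spider")
print(req.full_url, req.data)

# To actually send it (requires a running scrapyd):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

The job id to pass to cancel comes from the response to the schedule call.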