Install scrapyd
pip install scrapyd
pip install scrapyd-client
After installation, run the scrapyd command to start the service; by default it listens on port 6800.
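If you want to confirm the service is up without opening a browser, you can query scrapyd's daemonstatus.json endpoint. A minimal sketch with requests, assuming scrapyd is listening on the default localhost:6800:

import requests

# Ask scrapyd for its status; when the service is reachable the response
# includes pending/running/finished job counts and the node name.
r = requests.get("http://localhost:6800/daemonstatus.json")
print(r.json())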
To deploy a Scrapy spider, first edit the scrapy.cfg file of the project you want to deploy.
Before:
# Automatically created by: scrapy startproject
#
# For more information about the [deploy] section see:
# https://scrapyd.readthedocs.io/en/latest/deploy.html
[settings]
default = biquge.settings
[deploy]
#url = http://localhost:6800/
project = biquge
After editing:
# Automatically created by: scrapy startproject
#
# For more information about the [deploy] section see:
# https://scrapyd.readthedocs.io/en/latest/deploy.html
[settings]
default = biquge.settings
[deploy:bq]  # append ":<deploy target name>" to the [deploy] section header
url = http://localhost:6800/  # uncomment this line (remove the leading #)
project = biquge
Open another cmd window, change into the directory of the Scrapy project to be deployed (the one containing scrapy.cfg), and run:
scrapyd-deploy bq -p biquge
scrapyd-deploy packages the project and uploads it to the scrapyd server; the output shows the server's response.
You can also check the result in the scrapyd web interface at http://localhost:6800.
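Besides the web page, the deployment can be checked programmatically through listprojects.json and listspiders.json. A small sketch, assuming the project was deployed under the name biquge:

import requests

BASE = "http://localhost:6800"

# "biquge" should appear in the project list after a successful deploy.
print(requests.get(BASE + "/listprojects.json").json())

# The spiders contained in the deployed project (should include biquge_spider).
print(requests.get(BASE + "/listspiders.json", params={"project": "biquge"}).json())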
Run
The general format is:
curl http://localhost:6800/schedule.json -d project=default -d spider=somespider
Run the deployed spider:
curl http://localhost:6800/schedule.json -d project=biquge -d spider=biquge_spider
project=biquge        # project name
spider=biquge_spider  # spider name
After the command runs, scrapyd returns a JSON response containing a jobid; keep it, because stopping the job requires it.
Stop
curl http://localhost:6800/cancel.json -d project=biquge -d job=87384812df5311ec8f257470fd3a1483
# job is the jobid returned when the spider was scheduled
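If you did not note the jobid when the spider was scheduled, listjobs.json returns the pending, running and finished jobs of a project; the "id" field of a running job is the value to pass to cancel.json. A small sketch, assuming the project name biquge:

import requests

# Look up the project's jobs; each entry's "id" is a jobid usable with cancel.json.
r = requests.get("http://localhost:6800/listjobs.json", params={"project": "biquge"})
print(r.json())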
Running, stopping and deleting the deployed spider from a Python script:
import requests


def open():
    # Schedule (run) the deployed spider through schedule.json.
    url = "http://localhost:6800/schedule.json"
    data = {
        'project': 'biquge',
        'spider': 'biquge_spider'
    }
    r = requests.post(url, data=data)
    text = r.json()
    print(text)  # the response contains the jobid of the new run


def close():
    # Cancel a running job through cancel.json (not schedule.json).
    url = "http://localhost:6800/cancel.json"
    data = {
        'project': 'biquge',
        'job': '87384812df5311ec8f257470fd3a1483'  # jobid returned by schedule.json
    }
    r = requests.post(url, data=data)
    text = r.json()
    print(text)


def delete():
    # Remove the project from scrapyd, equivalent to:
    # curl http://localhost:6800/delproject.json -d project=biquge
    url = "http://localhost:6800/delproject.json"
    data = {
        "project": "biquge"
    }
    r = requests.post(url, data=data)
    print(r.json())


if __name__ == '__main__':
    delete()
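As written, the script only calls delete(), which removes the whole biquge project from scrapyd. To schedule the spider, call open() instead and note the jobid printed in its response; that jobid is what close() needs (here it is hard-coded), and delete() should only be run when you really want to unregister the project.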