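Reference article: centos7分布式部署pyspider (Distributed deployment of pyspider on CentOS 7)
That article describes how to install python, pyspider, and supervisor on CentOS 7, and the basic installation can follow its steps. This post focuses on the problems I ran into along the way and how I solved them; the setup described here is not a distributed deployment.
Background: pyspider is a web crawler framework, and supervisor is the process monitor that pyspider recommends. You can think of it as a background service that manages processes in real time: it watches the programs you are running and restarts any that exit on error. Most of the errors during installation come from supervisor's configuration.
The problems and solutions below all assume that the process you want supervisor to monitor can already run normally on its own; in this example, pyspider already runs correctly on the server.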
Problem 1
After starting supervisor, the status query command shows the following:
# supervisorctl status
unix:///var/tmp/supervisor.sock refused connection
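Solution
Check which supervisord.conf configuration file is in use, remove the stale socket file, and restart supervisord with that configuration file: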
unlink /tmp/supervisor.sock
# supervisord -c /etc/supervisord.conf
Error: could not find config file /etc/supervisor/supervisord.conf
For help, use /usr/bin/supervisord -h
# whereis supervisord.conf
supervisord: /usr/bin/supervisord /etc/supervisord.conf /etc/supervisord
# supervisord -c /etc/supervisord.conf
Unlinking stale socket /var/tmp/supervisor.sock
# unlink /tmp/supervisor.sock
unlink: cannot unlink '/tmp/supervisor.sock': No such file or directory
# unlink /var/tmp/supervisor.sock
# supervisorctl status
unix:///var/tmp/supervisor.sock no such file
# supervisord -c /etc/supervisord.conf
# supervisorctl status
m-tomcat RUNNING pid 11125, uptime 0:00:03
platform RUNNING pid 11124, uptime 0:00:03
rap-tomcat RUNNING pid 11123, uptime 0:00:03
redis_6379 BACKOFF Exited too quickly (process log may have details)
tomcat RUNNING pid 11130, uptime 0:00:03
————————————————
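Copyright notice: this is an original article by CSDN blogger 「叫我泽西哥好吗」, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/u013897685/article/details/81866709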
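The transcript above fails first on /tmp/supervisor.sock and only succeeds on /var/tmp/supervisor.sock, so it can help to confirm which socket path the configuration actually names. The following is a minimal sketch, assuming the usual [unix_http_server] and [supervisorctl] sections are present; the sample text and the helper name socket_paths are hypothetical and only for illustration.

```python
import configparser

# Hypothetical sample; on a real server you would read /etc/supervisord.conf instead.
SAMPLE_CONF = """
[unix_http_server]
file=/var/tmp/supervisor.sock   ; path to the socket file

[supervisorctl]
serverurl=unix:///var/tmp/supervisor.sock
"""


def socket_paths(conf_text):
    """Return (server_socket, client_socket) as named in supervisord.conf text."""
    # RawConfigParser does no %(...)s interpolation, which supervisor configs rely on.
    parser = configparser.RawConfigParser(
        inline_comment_prefixes=(";",), strict=False
    )
    parser.read_string(conf_text)
    server = parser.get("unix_http_server", "file", fallback=None)
    url = parser.get("supervisorctl", "serverurl", fallback=None)
    client = None
    if url and url.startswith("unix://"):
        client = url[len("unix://"):]
    return server, client


server, client = socket_paths(SAMPLE_CONF)
if server != client:
    print("supervisord and supervisorctl point at different sockets:", server, client)
else:
    print("both sides use", server)
```

If the two paths differ, supervisorctl will report a refused connection or a missing file even though supervisord itself is running.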
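Problem 2
After starting supervisor, supervisorctl status does not list the process you configured in supervisord.conf:
# supervisorctl start pyspider
pyspider: ERROR (no such process)
Solution
Check supervisord.conf and make sure your [program:pyspider] section is configured correctly.
Below is part of my supervisord.conf; I have kept only the parts I modified.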
[program:pyspider]
command=/root/anaconda3/bin/pyspider -c /etc/pyspider/pyspider.conf.json
autorestart=true
autostart=true
startsecs=0
user=root
group=pyspider
directory=/root/pyspider
stderr_logfile=/etc/pyspider/pyspider_err.log
stdout_logfile=/etc/pyspider/pyspider.log
;[program:theprogramname]
;command=/bin/cat ; the program (relative uses PATH, can take args)
;process_name=%(program_name)s ; process_name expr (default %(program_name)s)
;numprocs=1 ; number of processes copies to start (def 1)
;directory=/tmp ; directory to cwd to before exec (def no cwd)
;umask=022 ; umask for process (default None)
;priority=999 ; the relative start priority (default 999)
;autostart=true ; start at supervisord start (default: true)
;autorestart=true ; retstart at unexpected quit (default: true)
;startsecs=10 ; number of secs prog must stay running (def. 1)
;startretries=3 ; max # of serial start failures (default 3)
;exitcodes=0,2 ; 'expected' exit codes for process (default 0,2)
#[include]
#files = supervisord.d/*.ini
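A "no such process" error usually means supervisor never saw a [program:pyspider] section, for example because it is still commented out or lives in a file that is not loaded. As a rough check, the sketch below lists the program sections that are really active in a config text. The sample text and the helper name program_names are hypothetical; the real file would be read from /etc/supervisord.conf.

```python
import configparser

# Hypothetical excerpt modeled on the configuration shown above.
SAMPLE_CONF = """
[supervisord]
logfile=/tmp/supervisord.log

[program:pyspider]
command=/root/anaconda3/bin/pyspider -c /etc/pyspider/pyspider.conf.json
autostart=true

;[program:theprogramname]
;command=/bin/cat ; the program (relative uses PATH, can take args)
"""


def program_names(conf_text):
    """Return the names of all uncommented [program:x] sections."""
    parser = configparser.RawConfigParser(
        inline_comment_prefixes=(";",), strict=False
    )
    parser.read_string(conf_text)
    prefix = "program:"
    return [s[len(prefix):] for s in parser.sections() if s.startswith(prefix)]


print(program_names(SAMPLE_CONF))  # ['pyspider']
```

If pyspider is in the list but supervisorctl still does not know it, the running supervisord has probably not reloaded the file yet; supervisorctl reread followed by supervisorctl update, or a restart of supervisord, normally picks up the change.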
Problem 3
supervisorctl status shows the following error for your process:
fatal Exited too quickly (process log may have details)
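Solution
Check your supervisord.conf.
Make sure the process you configured has
startsecs=0
set; you can refer to the configuration in the previous problem.
Make sure the path in your command is correct, for example mine: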
command=/root/anaconda3/bin/pyspider -c /etc/pyspider/pyspider.conf.json
When the command cannot be resolved, you may find the following messages in supervisord's error output
(in supervisord.conf I set the error log path with the parameter stderr_logfile=/etc/pyspider/pyspider_err.log):
supervisor: couldn't chdir to /pyspider: ENOENT
supervisor: child process was not spawned
supervisor: couldn't chdir to /pyspider: ENOENT
supervisor: child process was not spawned
supervisor: couldn't chdir to /pyspider: ENOENT
supervisor: child process was not spawned
supervisor: couldn't chdir to /pyspider: ENOENT
supervisor: child process was not spawned
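This happened because my command was previously
command=pyspider -c /etc/pyspider/pyspider.conf.json
Changing it to
command=/root/anaconda3/bin/pyspider -c /etc/pyspider/pyspider.conf.json
fixed the problem. Note that the "couldn't chdir to /pyspider: ENOENT" lines themselves mean that the path given in the directory setting did not exist at the time, so check that setting as well. You can use the whereis command to find the full path of the program you want to run, for example:
# whereis pyspider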
/root/anaconda3/bin/pyspider
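Both checks from this section (an absolute, executable command and an existing working directory) can be scripted before handing the entry to supervisor. This is only a sketch under my own assumptions: check_program is a hypothetical helper, not part of supervisor, and it flags relative commands because supervisord's PATH may not include a directory such as /root/anaconda3/bin.

```python
import os
import shlex


def check_program(command, directory=None):
    """Return a list of problems found in a [program:x] command/directory pair."""
    problems = []
    # The first token of the command line is the executable itself.
    executable = shlex.split(command)[0]
    if not os.path.isabs(executable):
        problems.append(f"command is not an absolute path: {executable}")
    elif not (os.path.isfile(executable) and os.access(executable, os.X_OK)):
        problems.append(f"command not found or not executable: {executable}")
    if directory is not None and not os.path.isdir(directory):
        problems.append(f"directory does not exist: {directory}")
    return problems


# The failing entry from this section: relative command name.
for problem in check_program("pyspider -c /etc/pyspider/pyspider.conf.json"):
    print(problem)
```

An empty list means neither check found anything; it does not prove that the program itself starts cleanly.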
Problem 4
The monitored process keeps restarting, and its uptime stays at 0:
pyspider RUNNING pid 20165, uptime 0:00:00
pyspider RUNNING pid 20169, uptime 0:00:00
pyspider RUNNING pid 20171, uptime 0:00:00
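You can see that the pid keeps changing, which means the process is being restarted over and over. Check the error log; this leads back to Problem 3, the command issue.
Solution
Same as Problem 3.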
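The restart loop is easy to miss because every single status line says RUNNING. A small sketch of how the check could be automated: collect a few status lines over time and see whether the pid changes. The regular expression is my own guess at the line format based on the output shown above, and the helper names are hypothetical.

```python
import re

# Matches lines such as "pyspider RUNNING pid 20165, uptime 0:00:00".
STATUS_RE = re.compile(
    r"^(?P<name>\S+)\s+(?P<state>[A-Z]+)\s+pid (?P<pid>\d+), uptime (?P<uptime>[\d:]+)"
)


def pids_seen(status_lines, program):
    """Return the pids reported for `program` in RUNNING state, in order."""
    pids = []
    for line in status_lines:
        match = STATUS_RE.match(line.strip())
        if match and match.group("name") == program and match.group("state") == "RUNNING":
            pids.append(int(match.group("pid")))
    return pids


def is_restart_loop(status_lines, program):
    """True if the program was seen under more than one pid."""
    return len(set(pids_seen(status_lines, program))) > 1


samples = [
    "pyspider RUNNING pid 20165, uptime 0:00:00",
    "pyspider RUNNING pid 20169, uptime 0:00:00",
    "pyspider RUNNING pid 20171, uptime 0:00:00",
]
print(is_restart_loop(samples, "pyspider"))  # True
```

In practice the lines would come from running supervisorctl status a few seconds apart.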
At this point my pyspider setup was working: supervisor keeps pyspider running at all times, and all that remains is the crawling itself.