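I. Introduction
Supervisor is a client/server process management tool written in Python for Linux/Unix systems; it does not support Windows. It makes it easy to monitor, start, stop, and restart one or more processes. When a process managed by Supervisor is killed unexpectedly, supervisord detects that it has died and starts it again automatically, so automatic recovery no longer requires a hand-written shell script.
The problem I ran into: the Python multi-process program on my host machine sometimes hangs suddenly, and the processes recover once a network cable is plugged into the FZ3A. I cannot figure out why, so I am trying Supervisor to monitor and restart the processes.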
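To make the auto-restart idea concrete, here is a minimal sketch of a restart loop in Python. It is not how supervisord is implemented; the function name supervise and the max_restarts limit are assumptions made for illustration only.

```python
import subprocess
import sys


def supervise(cmd, max_restarts=3):
    """Run cmd and start it again each time it exits with a non-zero code.

    Returns the list of exit codes seen, one per run.
    """
    exit_codes = []
    for _ in range(max_restarts + 1):  # first run plus up to max_restarts restarts
        code = subprocess.call(cmd)
        exit_codes.append(code)
        if code == 0:  # clean exit: nothing to recover from
            break
    return exit_codes


if __name__ == "__main__":
    # Hypothetical child process that always fails with exit code 3.
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(supervise(failing, max_restarts=2))
```

Supervisor adds much more on top of this loop (logging, start timeouts, a control socket), which is why I prefer it to a script of my own.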
II. Installation
Download the latest supervisor package: https://pypi.python.org/pypi/supervisor
wget https://pypi.python.org/packages/7b/17/88adf8cb25f80e2bc0d18e094fcd7ab300632ea00b601cbbbb84c2419eae/supervisor-3.3.2.tar.gz
tar -zxvf supervisor-3.3.2.tar.gz
cd supervisor-3.3.2
python setup.py install
Tips: Error 1: ImportError: No module named setuptools
## Download setuptools
wget https://files.pythonhosted.org/packages/c2/f7/c7b501b783e5a74cf1768bc174ee4fb0a8a6ee5af6afa92274ff964703e0/setuptools-40.8.0.zip
unzip setuptools-40.8.0.zip
cd setuptools-40.8.0
python setup.py install
Tips: Error 2: error: Could not find suitable distribution for Requirement.parse('meld3>=0.6.5'); install the meld3 package
wget https://pypi.python.org/packages/45/a0/317c6422b26c12fe0161e936fc35f36552069ba8e6f7ecbd99bbffe32a5f/meld3-1.0.2.tar.gz#md5=3ccc78cd79cffd63a751ad7684c02c91
tar zxvf meld3-1.0.2.tar.gz
cd meld3-1.0.2
python setup.py install
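Before moving on, it can help to confirm that the three packages from the steps above are importable. The sketch below assumes a Python 3 interpreter (importlib.util.find_spec is not available on Python 2.7), and the helper name missing_modules is my own.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of names that cannot be imported in this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The three packages the installation steps above deal with.
    print(missing_modules(["setuptools", "meld3", "supervisor"]))
```

An empty list means nothing is missing for the interpreter that ran the check; run it with the same python that ran setup.py.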
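III. Configuration
1. Generate the configuration file supervisord.conf
echo_supervisord_conf > /etc/supervisord.conf
2. Configure the supervisord.conf file
(1) Open the configuration file
vim /etc/supervisord.conf
(2) At the bottom of the configuration file, configure include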
[include]
files=/etc/supervisor/*.conf
- Note: if these lines start with ";", remove the ";"
- If the /etc/supervisor directory does not exist, create it with mkdir first
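A quick way to see which files the [include] section would pick up is to parse the configuration text and expand the glob. This is only a sketch using Python's standard configparser, not Supervisor's own parser; the function name included_files and the temporary directory standing in for /etc/supervisor are assumptions.

```python
import configparser
import glob
import os
import tempfile


def included_files(conf_text):
    """Return the files matched by the [include] section of a supervisord.conf text."""
    parser = configparser.RawConfigParser(inline_comment_prefixes=(";",))
    parser.read_string(conf_text)
    if not parser.has_option("include", "files"):
        return []  # section missing or still commented out with ";"
    matched = []
    # "files" may list several patterns separated by whitespace or newlines.
    for pattern in parser.get("include", "files").split():
        matched.extend(sorted(glob.glob(pattern)))
    return matched


if __name__ == "__main__":
    # Hypothetical stand-in for /etc/supervisor so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as conf_dir:
        open(os.path.join(conf_dir, "baidu.conf"), "w").close()
        conf = "[include]\nfiles=%s/*.conf\n" % conf_dir
        print([os.path.basename(p) for p in included_files(conf)])
```

If the result is empty, either the ";" is still in front of the two lines or the directory has no .conf files yet.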
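(3) Create the child process configuration file
cd /etc/supervisor
vim baidu.conf # the file name is up to you, as long as it ends in .conf so the include pattern matches it
(4) Add the following content to the child process configuration file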
[program:baidu]
command=/bin/bash -c ". /home/root/workspace/baidu/code/search.sh run.py"
directory=/home/root/workspace/baidu/code
user=root
autorestart=true
redirect_stderr=true
startsecs=10
stopsignal=KILL
killasgroup=true
stopasgroup=true
stdout_logfile_maxbytes=10MB
stdout_logfile_backups=2
stderr_logfile_maxbytes=10MB
stderr_logfile_backups=2
stderr_logfile=/home/root/workspace/baidu/log/baidu_error.log
stdout_logfile=/home/root/workspace/baidu/log/baidu_out.log
loglevel=info
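The autorestart setting in the config above accepts three values: true, false, and unexpected. The sketch below shows how those values decide whether a restart happens, based on my reading of the Supervisor documentation. It is a simplification (it ignores startsecs and startretries), and the function name should_restart is my own.

```python
def should_restart(autorestart, exit_code, exitcodes=(0, 2)):
    """Decide whether an exited process would be restarted.

    autorestart is one of "true", "false", "unexpected";
    exitcodes are the exit codes regarded as expected.
    """
    if autorestart == "true":
        return True  # restart unconditionally
    if autorestart == "false":
        return False  # never restart automatically
    if autorestart == "unexpected":
        return exit_code not in exitcodes  # restart only on an unexpected code
    raise ValueError("unknown autorestart value: %r" % autorestart)


if __name__ == "__main__":
    for mode in ("true", "false", "unexpected"):
        print(mode, should_restart(mode, exit_code=1), should_restart(mode, exit_code=0))
```

With autorestart=true, as in my config, the process comes back no matter how it exited, which is what I want for a program that hangs or dies unpredictably.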
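Child process configuration options in detail
- command: the command used to start the program; it can be an absolute or a relative path
- process_name: a Python string expression used to build the name of the process that Supervisor starts; the default is %(program_name)s
- numprocs: the number of instances of the program that Supervisor starts; if numprocs > 1, the process_name expression must include %(process_num)s; the default is 1
- numprocs_start: an integer offset that sets the number at which process_num starts
- priority: controls the order in which programs are started and shut down; the lower the value, the earlier the program starts and the later it is shut down. The default is 999
- autostart: if true, the program is started automatically when supervisord starts
- autorestart: one of false, true, or unexpected. false: the process is never restarted automatically; unexpected: the process is restarted when its exit code is not one of those listed in exitcodes; true: the process is restarted unconditionally when it exits
- startsecs: how long the program must stay running after startup before the start is considered successful
- startretries: the number of attempts supervisord makes to start a program; the default is 3
- exitcodes: the expected exit codes; the default is 0,2
- stopsignal: the signal sent to the program when a stop is requested; the default is TERM, and it can also be HUP, INT, QUIT, KILL, USR1, or USR2
- stopwaitsecs: after the stop signal has been sent, the number of seconds supervisord waits for the operating system to return SIGCHLD before it sends SIGKILL
- stopasgroup: if true, supervisor sends the stop signal to the whole process group
- killasgroup: if true, SIGKILL is sent to the whole process group, so the program's child processes are affected too
- user: if supervisord runs as root, the program is started as this user
- redirect_stderr: if true, the process's stderr is sent to supervisord on the process's stdout file descriptor (like 2>&1)
- stdout_logfile: the file the process's stdout is written to; if stdout_logfile is unset or set to AUTO, supervisor chooses a file location automatically
- stdout_logfile_maxbytes: the size at which the stdout log file is rotated, with a KB, MB, or GB suffix; 0 means the log file size is unlimited
- stdout_logfile_backups: the number of rotated stdout log backups to keep; the default is 10, and 0 means no backups
- stdout_capture_maxbytes: the maximum number of bytes written to the capture FIFO while the process is in stdout capture mode; the unit can be KB, MB, or GB
- stdout_events_enabled: if true, a PROCESS_LOG_STDOUT event is emitted when the process writes to its stdout file descriptor
- stderr_logfile: the file the process's stderr is written to, unless redirect_stderr is true
- stderr_logfile_maxbytes: the size at which the stderr log file is rotated, with a KB, MB, or GB suffix; 0 means the log file size is unlimited
- stderr_logfile_backups: the number of rotated stderr log backups to keep; the default is 10, and 0 means no backups
- stderr_capture_maxbytes: the maximum number of bytes written to the capture FIFO while the process is in stderr capture mode; the unit can be KB, MB, or GB
- stderr_events_enabled: if true, a PROCESS_LOG_STDERR event is emitted when the process writes to its stderr file descriptor
- environment: a list of key/value pairs, for example A="1",B="2"
- directory: the directory supervisord changes to before it spawns the child process
- umask: the umask of the process
- serverurl: the URL passed to the child process so that it can communicate with the internal HTTP server; if set to AUTO, supervisor constructs the URL automatically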
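First, a note on the command and directory parameters: command holds the script command that starts the process, and directory holds the path to that process's code.
Next, the numprocs parameter. According to what I found online, it is meant for running one script as several daemon processes.
Background: on a consumer server you want several processes of the same consumer script (several consumers consuming). That requires two additional parameters:
process_name=%(program_name)s_%(process_num)02d ; process names must be unique, so include the process number
numprocs=4 ; start N processes
The system then assigns names to the four processes automatically.
My own code is a single command that starts four different processes. I first planned to use numprocs, but top showed that it runs the whole command four times, so I ended up with 16 processes. That is how numprocs works: it starts that many copies of the same command. I gave up on it and left numprocs at 1, which runs fine.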
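The process_name expression is ordinary Python %-formatting, so the names Supervisor would assign can be reproduced in a few lines. This is a sketch under that assumption; the helper name process_names is my own, and baidu is the program name from my config.

```python
def process_names(program_name, numprocs, numprocs_start=0,
                  template="%(program_name)s_%(process_num)02d"):
    """Expand a process_name template for each of the numprocs instances."""
    names = []
    for process_num in range(numprocs_start, numprocs_start + numprocs):
        values = {"program_name": program_name, "process_num": process_num}
        names.append(template % values)
    return names


if __name__ == "__main__":
    names = process_names("baidu", numprocs=4)
    print(names)
    # Each instance runs the whole command; if the command itself starts
    # 4 worker processes, the total is copies * workers.
    print(len(names) * 4)
```

This also explains my 16 processes: four copies of a command that itself spawns four workers.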
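(5) Update the configuration
supervisorctl update
After update, the processes start automatically. supervisorctl talks to a running supervisord, so on a first setup start the service in step (6) before running this command.
(6) Start the service
supervisord -c /etc/supervisord.conf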
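IV. supervisorctl commands
supervisord: start Supervisor
supervisorctl reload: restart Supervisor after the configuration file has been modified
supervisorctl status: show the status of the processes that Supervisor manages
supervisorctl start all | <process name>: start all processes or one process
supervisorctl stop all | <process name>: stop all processes or one process
Note: start, restart, and stop do not load the latest configuration file.
supervisorctl update: apply the latest configuration file, starting new or changed programs; programs whose configuration has not changed are not affected or restarted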
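When checking process states from a script, the text printed by supervisorctl status can be split into name and state columns. The sketch below parses such text; the sample lines are illustrative rather than captured from my board, and the function name parse_status is my own.

```python
def parse_status(output):
    """Map each process name to its state from `supervisorctl status` text."""
    states = {}
    for line in output.splitlines():
        fields = line.split(None, 2)  # name, state, free-form description
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        states[fields[0]] = fields[1]
    return states


if __name__ == "__main__":
    # Illustrative sample text, not real output from my system.
    sample = (
        "baidu                            RUNNING   pid 1234, uptime 0:05:12\n"
        "worker                           STOPPED   Not started\n"
    )
    print(parse_status(sample))
```

In practice the text would come from running supervisorctl status through subprocess, with the same parsing applied to its stdout.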
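Common problems
- error: class 'socket.error' [Errno 2] No such file or directory: file: /usr/lib64/python2.7/socke
Symptom: after Supervisor is configured, supervisorctl reload and supervisorctl update fail with this error. supervisorctl cannot find the socket file, which usually means supervisord is not running.
Solution: start supervisord with its configuration file (use the path of your own configuration file):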
/usr/bin/python2 /usr/bin/supervisord -c /etc/supervisor/supervisord.conf
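To tell the cases behind this error apart, it helps to check whether the socket path exists and whether it really is a socket. A minimal sketch follows; the function name socket_status is my own, and the demo uses a temporary directory rather than a real Supervisor socket path.

```python
import os
import socket
import stat
import tempfile


def socket_status(path):
    """Classify path as "missing", "not a socket", or "socket"."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"  # supervisord is probably not running
    return "socket" if stat.S_ISSOCK(mode) else "not a socket"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "demo.sock")  # hypothetical socket path
        print(socket_status(path))
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)  # binding creates the socket file
        print(socket_status(path))
        server.close()
```

A result of "missing" matches the Errno 2 message above: the file supervisorctl wants to connect to is simply not there.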
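V. Starting Supervisor at boot
Method 1: the service approach
I will not go into it here. It needs the service package, which the FZ3A turned out not to have when I tried, so I did not pursue it.
Method 2: a startup script in /etc/init.d
(1) Create the /etc/init.d/supervisord file
vim /etc/init.d/supervisord
(2) Write the script contents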
#!/bin/bash
#
# supervisord This scripts turns supervisord on
#
# chkconfig: - 95 04
#
# description: supervisor is a process control utility. It has a web based
# xmlrpc interface as well as a few other nifty features.
# processname: supervisord
# config: /etc/supervisord.conf
# pidfile: /var/lib/supervisor/supervisord.pid
# source function library
. /etc/init.d/functions
RETVAL=0
PIDFILE=/var/lib/supervisor/supervisord.pid
start() {
    echo -n $"Starting supervisord: "
    # supervisord daemonizes itself (nodaemon=false), so it is run directly
    # and $? reflects whether it actually started
    /usr/bin/supervisord -c /etc/supervisord.conf
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && touch /var/lock/subsys/supervisord
}
stop() {
    echo -n $"Stopping supervisord: "
    killproc -p $PIDFILE supervisord
    # record the result of killproc so the lock file is removed only on success
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/supervisord
}
restart() {
    stop
    start
}
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart|force-reload|reload)
        restart
        ;;
    condrestart)
        [ -f /var/lock/subsys/supervisord ] && restart
        ;;
    status)
        status -p $PIDFILE supervisord
        RETVAL=$?
        ;;
    *)
        echo $"Usage: $0 {start|stop|status|restart|reload|force-reload|condrestart}"
        exit 1
esac
exit $RETVAL
(3) Register it as a service, as follows
chmod +x /etc/init.d/supervisord
chkconfig --add supervisord
chkconfig supervisord on
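The status branch of the init script depends on the pidfile. The same check can be sketched in Python: read the pid and ask the kernel whether that process exists. The function name is_running is my own, and the demo writes its own pid to a temporary file instead of reading /var/lib/supervisor/supervisord.pid.

```python
import os
import tempfile


def is_running(pidfile):
    """Return True if pidfile holds the pid of a process that currently exists."""
    try:
        with open(pidfile) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        return False  # no pidfile, unreadable, or not a number
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False  # stale pidfile
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        demo_pidfile = os.path.join(tmp_dir, "demo.pid")  # hypothetical pidfile
        with open(demo_pidfile, "w") as handle:
            handle.write(str(os.getpid()))
        print(is_running(demo_pidfile))
```

A stale pidfile left over from a power loss is one reason a status check can disagree with reality, so it is worth testing after a hard reboot.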
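---------------- Divider ----------------
OK, at this point I thought the configuration was finished and all was well.
It turned out that this is where the trouble began: after a reboot, supervisord had not started automatically. Looking into it revealed the problem.
Error after a power-loss reboot:
unix:///tmp/supervisor.sock no such file
The cause is that Supervisor's default configuration creates the socket file and the daemon's pid file in /tmp. /tmp holds temporary files, and Linux deletes files there automatically under various conditions.
Reference: "unix:///tmp/supervisor.sock no such file" error solution, U.R.M.L's blog on CSDN
The usual advice online is to move the files to /var/run. I tested that, but it also turned out to be temporary storage and the files were still deleted (on many systems /var/run is a tmpfs that is emptied at every boot).
After checking, I found that putting them under /var/lib works. The steps are:
(1) Edit the supervisord.conf file
vi /etc/supervisord.conf
Change the following four paths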
[unix_http_server]
;file=/tmp/supervisor.sock ; (the path to the socket file)
file=/var/lib/supervisor/supervisord.sock ; moved to /var/lib so the system does not delete it
[supervisord]
;logfile=/tmp/supervisord.log ; (main log file;default $CWD/supervisord.log)
logfile=/var/lib/supervisor/supervisord.log ; moved to /var/lib so the system does not delete it
pidfile=/var/lib/supervisor/supervisord.pid ; moved to /var/lib so the system does not delete it
[supervisorctl]
; must match the setting in 'unix_http_server'
;serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket
serverurl=unix:///var/lib/supervisor/supervisord.sock ; moved to /var/lib so the system does not delete it
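Since the serverurl in [supervisorctl] has to match the file in [unix_http_server], a small check can catch a mismatch after editing. The sketch below uses Python's standard configparser rather than Supervisor's own parser, and the function name socket_settings_match is my own.

```python
import configparser


def socket_settings_match(conf_text):
    """Return True if [supervisorctl] serverurl points at [unix_http_server] file."""
    parser = configparser.RawConfigParser(inline_comment_prefixes=(";",))
    parser.read_string(conf_text)
    socket_path = parser.get("unix_http_server", "file")
    server_url = parser.get("supervisorctl", "serverurl")
    return server_url == "unix://" + socket_path


if __name__ == "__main__":
    conf = (
        "[unix_http_server]\n"
        "file=/var/lib/supervisor/supervisord.sock ; the path to the socket file\n"
        "[supervisorctl]\n"
        "serverurl=unix:///var/lib/supervisor/supervisord.sock ; unix socket URL\n"
    )
    print(socket_settings_match(conf))
```

If only one of the two paths is changed, supervisorctl looks for a socket that supervisord never creates, which produces the same "no such file" error as before.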
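(2) Create the directory and the supervisord.sock file
mkdir -p /var/lib/supervisor
cd /var/lib/supervisor
touch supervisord.sock
chmod 777 supervisord.sock
(3) Update the configuration
supervisorctl update
Changes to the [unix_http_server] and [supervisord] sections only take effect once supervisord itself has been restarted.
That is all. The socket file supervisord.sock and the daemon pid file supervisord.pid now live in /var/lib/supervisor/, and the log file supervisord.log is in /var/lib/supervisor/ as well.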
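To find out whether a directory survives a reboot, one option is to look up which filesystem it sits on; a tmpfs is held in memory and lost at power-off. The sketch below works on text in the format of /proc/mounts. The sample mount table is hypothetical, not taken from my FZ3A, and the function name filesystem_type is my own. Pass the resolved path, since /var/run is often a symlink to /run.

```python
def filesystem_type(path, mounts_text):
    """Return the filesystem type of the longest mount point that contains path."""
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # not a "device mountpoint fstype ..." line
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if len(mount_point) > len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


if __name__ == "__main__":
    # Hypothetical mount table in /proc/mounts format.
    sample_mounts = (
        "/dev/root / ext4 rw,relatime 0 0\n"
        "tmpfs /run tmpfs rw,nosuid,nodev 0 0\n"
        "tmpfs /tmp tmpfs rw 0 0\n"
    )
    for candidate in ("/tmp/supervisor.sock",
                      "/run/supervisor.sock",
                      "/var/lib/supervisor/supervisord.sock"):
        print(candidate, filesystem_type(candidate, sample_mounts))
```

On a real system the text would come from reading /proc/mounts, and the path from os.path.realpath.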
Here is my configuration as an example:
; Sample supervisor config file.
;
; For more information on the config file, please see:
; http://supervisord.org/configuration.html
;
; Notes:
; - Shell expansion ("~" or "$HOME") is not supported. Environment
; variables can be expanded using this syntax: "%(ENV_HOME)s".
; - Quotes around values are not supported, except in the case of
; the environment= options as shown below.
; - Comments must have a leading space: "a=b ;comment" not "a=b;comment".
; - Command will be truncated if it looks like a config file comment, e.g.
; "command=bash -c 'foo ; bar'" will truncate to "command=bash -c 'foo ".
[unix_http_server]
file=/var/lib/supervisor/supervisord.sock ; the path to the socket file
;chmod=0700 ; socket file mode (default 0700)
;chown=nobody:nogroup ; socket file uid:gid owner
;username=user ; default is no username (open server)
;password=123 ; default is no password (open server)
;[inet_http_server] ; inet (TCP) server disabled by default
;port=127.0.0.1:9001 ; ip_address:port specifier, *:port for all iface
;username=user ; default is no username (open server)
;password=123 ; default is no password (open server)
[supervisord]
logfile=/var/lib/supervisor/supervisord.log ; main log file; default $CWD/supervisord.log
logfile_maxbytes=50MB ; max main logfile bytes b4 rotation; default 50MB
logfile_backups=10 ; # of main logfile backups; 0 means none, default 10
loglevel=info ; log level; default info; others: debug,warn,trac
pidfile=/var/lib/supervisor/supervisord.pid ; supervisord pidfile; default supervisord.pid
nodaemon=false ; start in foreground if true; default false
minfds=1024 ; min. avail startup file descriptors; default 1024
minprocs=200 ; min. avail process descriptors;default 200
;umask=022 ; process file creation umask; default 022
;user=chrism ; default is current user, required if root
;identifier=supervisor ; supervisord identifier, default is 'supervisor'
;directory=/tmp ; default is not to cd during start
;nocleanup=true ; don't clean up tempfiles at start; default false
;childlogdir=/tmp ; 'AUTO' child log dir, default $TEMP
;environment=KEY="value" ; key value pairs to add to environment
;strip_ansi=false ; strip ansi escape codes in logs; def. false
; The rpcinterface:supervisor section must remain in the config file for
; RPC (supervisorctl/web interface) to work. Additional interfaces may be
; added by defining them in separate [rpcinterface:x] sections.
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
; The supervisorctl section configures how supervisorctl will connect to
; supervisord. configure it match the settings in either the unix_http_server
; or inet_http_server section.
[supervisorctl]
serverurl=unix:///var/lib/supervisor/supervisord.sock ; use a unix:// URL for a unix socket
;serverurl=http://127.0.0.1:9001 ; use an http:// url to specify an inet socket
;username=chris ; should be same as in [*_http_server] if set
;password=123 ; should be same as in [*_http_server] if set
;prompt=mysupervisor ; cmd line prompt (default "supervisor")
;history_file=~/.sc_history ; use readline history if available
; The sample program section below shows all possible program subsection values.
; Create one or more 'real' program: sections to be able to control them under
; supervisor.
;[program:theprogramname]
;command=/bin/cat ; the program (relative uses PATH, can take args)
;process_name=%(program_name)s ; process_name expr (default %(program_name)s)
;numprocs=1 ; number of processes copies to start (def 1)
;directory=/tmp ; directory to cwd to before exec (def no cwd)
;umask=022 ; umask for process (default None)
;priority=999 ; the relative start priority (default 999)
;autostart=true ; start at supervisord start (default: true)
;startsecs=1 ; # of secs prog must stay up to be running (def. 1)
;startretries=3 ; max # of serial start failures when starting (default 3)
;autorestart=unexpected ; when to restart if exited after running (def: unexpected)
;exitcodes=0,2 ; 'expected' exit codes used with autorestart (default 0,2)
;stopsignal=QUIT ; signal used to kill process (default TERM)
;stopwaitsecs=10 ; max num secs to wait b4 SIGKILL (default 10)
;stopasgroup=false ; send stop signal to the UNIX process group (default false)
;killasgroup=false ; SIGKILL the UNIX process group (def false)
;user=chrism ; setuid to this UNIX account to run the program
;redirect_stderr=true ; redirect proc stderr to stdout (default false)
;stdout_logfile=/a/path ; stdout log path, NONE for none; default AUTO
;stdout_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB)
;stdout_logfile_backups=10 ; # of stdout logfile backups (0 means none, default 10)
;stdout_capture_maxbytes=1MB ; number of bytes in 'capturemode' (default 0)
;stdout_events_enabled=false ; emit events on stdout writes (default false)
;stderr_logfile=/a/path ; stderr log path, NONE for none; default AUTO
;stderr_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB)
;stderr_logfile_backups=10 ; # of stderr logfile backups (0 means none, default 10)
;stderr_capture_maxbytes=1MB ; number of bytes in 'capturemode' (default 0)
;stderr_events_enabled=false ; emit events on stderr writes (default false)
;environment=A="1",B="2" ; process environment additions (def no adds)
;serverurl=AUTO ; override serverurl computation (childutils)
; The sample eventlistener section below shows all possible eventlistener
; subsection values. Create one or more 'real' eventlistener: sections to be
; able to handle event notifications sent by supervisord.
;[eventlistener:theeventlistenername]
;command=/bin/eventlistener ; the program (relative uses PATH, can take args)
;process_name=%(program_name)s ; process_name expr (default %(program_name)s)
;numprocs=1 ; number of processes copies to start (def 1)
;events=EVENT ; event notif. types to subscribe to (req'd)
;buffer_size=10 ; event buffer queue size (default 10)
;directory=/tmp ; directory to cwd to before exec (def no cwd)
;umask=022 ; umask for process (default None)
;priority=-1 ; the relative start priority (default -1)
;autostart=true ; start at supervisord start (default: true)
;startsecs=1 ; # of secs prog must stay up to be running (def. 1)
;startretries=3 ; max # of serial start failures when starting (default 3)
;autorestart=unexpected ; autorestart if exited after running (def: unexpected)
;exitcodes=0,2 ; 'expected' exit codes used with autorestart (default 0,2)
;stopsignal=QUIT ; signal used to kill process (default TERM)
;stopwaitsecs=10 ; max num secs to wait b4 SIGKILL (default 10)
;stopasgroup=false ; send stop signal to the UNIX process group (default false)
;killasgroup=false ; SIGKILL the UNIX process group (def false)
;user=chrism ; setuid to this UNIX account to run the program
;redirect_stderr=false ; redirect_stderr=true is not allowed for eventlisteners
;stdout_logfile=/a/path ; stdout log path, NONE for none; default AUTO
;stdout_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB)
;stdout_logfile_backups=10 ; # of stdout logfile backups (0 means none, default 10)
;stdout_events_enabled=false ; emit events on stdout writes (default false)
;stderr_logfile=/a/path ; stderr log path, NONE for none; default AUTO
;stderr_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB)
;stderr_logfile_backups=10 ; # of stderr logfile backups (0 means none, default 10)
;stderr_events_enabled=false ; emit events on stderr writes (default false)
;environment=A="1",B="2" ; process environment additions
;serverurl=AUTO ; override serverurl computation (childutils)
; The sample group section below shows all possible group values. Create one
; or more 'real' group: sections to create "heterogeneous" process groups.
;[group:thegroupname]
;programs=progname1,progname2 ; each refers to 'x' in [program:x] definitions
;priority=999 ; the relative start priority (default 999)
; The [include] section can just contain the "files" setting. This
; setting can list multiple files (separated by whitespace or
; newlines). It can also contain wildcards. The filenames are
; interpreted as relative to this file. Included files *cannot*
; include files themselves.
[include]
files = /etc/supervisor/*.conf