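When a node runs several programs, what do you do if the machine reboots, or something else goes wrong, such as memory running out? Option one is to log in and start everything by hand. But what if there are a dozen or more programs?
I remembered that colleagues at my previous company had used supervisord, so I decided to try it myself. I ran into a pitfall along the way, but I also found a new possibility.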
Installation
# Install
pip install supervisor
# Generate the initial configuration
echo_supervisord_conf > /etc/supervisord.conf
- Init script /etc/init.d/supervisord
#!/bin/sh
#
# /etc/init.d/supervisord
#
# Supervisor is a client/server system that
# allows its users to monitor and control a
# number of processes on UNIX-like operating
# systems.
#
# chkconfig: - 64 36
# description: Supervisor Server
# processname: supervisord
# Source init functions
. /etc/rc.d/init.d/functions
prog="supervisord"
prefix="/usr/local"
exec_prefix="${prefix}"
prog_bin="${exec_prefix}/bin/supervisord"
PIDFILE="/var/run/$prog.pid"
start()
{
    echo -n $"Starting $prog: "
    ### Note: the next line must include -c /etc/supervisord.conf, otherwise changes to the config file have no effect!
    daemon $prog_bin -c /etc/supervisord.conf --pidfile $PIDFILE
    [ -f $PIDFILE ] && success $"$prog startup" || failure $"$prog startup"
    echo
}
stop()
{
    echo -n $"Shutting down $prog: "
    [ -f $PIDFILE ] && killproc $prog || success $"$prog shutdown"
    echo
}
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    status)
        status $prog
        ;;
    restart)
        stop
        start
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|status}"
        ;;
esac
- /etc/supervisord.conf
[unix_http_server]
file=/tmp/supervisor.sock ; (the path to the socket file)
[inet_http_server] ; inet (TCP) server disabled by default
port=0.0.0.0:1009 ; (ip_address:port specifier, *:port for all iface)
username=ops ; (default is no username (open server))
password=123 ; (default is no password (open server))
[supervisord]
logfile=/tmp/supervisord.log ; (main log file;default $CWD/supervisord.log)
logfile_maxbytes=50MB ; (max main logfile bytes b4 rotation;default 50MB)
logfile_backups=10 ; (num of main logfile rotation backups;default 10)
loglevel=info ; (log level;default info; others: debug,warn,trace)
pidfile=/tmp/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
nodaemon=false ; (start in foreground if true;default false)
minfds=1024 ; (min. avail startup file descriptors;default 1024)
minprocs=200 ; (min. avail process descriptors;default 200)
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket
[include]
files = /etc/supervisor.d/*.conf
- /etc/supervisor.d/xcloud-cm.conf
[program:123]
command=/usr/local/123/bin/start.sh start
autostart=true
autorestart=unexpected
startretries=10
exitcodes=0
stopsignal=KILL
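With a dozen or more programs, writing one such file per program by hand gets tedious. The following is a minimal sketch of how the files could be generated, assuming every program uses the same settings as the example above. The helper name write_program_conf is hypothetical and not part of supervisor; the default directory is the one matched by the [include] section.

```shell
#!/bin/sh
# Sketch: write one supervisord program file per service.
# write_program_conf is a hypothetical helper, not a supervisor command.

write_program_conf() {
    name="$1"                            # used in the [program:NAME] header
    cmd="$2"                             # command supervisord should run
    conf_dir="${3:-/etc/supervisor.d}"   # directory matched by [include]

    mkdir -p "$conf_dir" || return 1

    # Same settings as the hand-written example above.
    cat > "$conf_dir/$name.conf" <<EOF
[program:$name]
command=$cmd
autostart=true
autorestart=unexpected
startretries=10
exitcodes=0
stopsignal=KILL
EOF
}
```

For example, write_program_conf 123 "/usr/local/123/bin/start.sh start" would reproduce the file above. After adding or changing files, supervisorctl reread followed by supervisorctl update asks a running supervisord to pick them up.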
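The pitfall
When I ran it, the program kept exiting immediately and was then restarted over and over. It turned out that supervisord expects the command it launches to stay in the foreground and write its output there, rather than fork into the background as a daemon. Once the start script returns, supervisord treats the program as dead and restarts it. So if you add a sleep 1000 to the start script, it may well stay up.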
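A cleaner alternative to the sleep workaround is to have the start script replace itself with the real program, so that the process supervisord launched is the program itself. This is a minimal sketch, assuming the program can run in the foreground; run_foreground is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: foreground wrapper for a supervisord-managed program.
# run_foreground is a hypothetical helper name.

run_foreground() {
    # exec replaces the current shell with the given command, so the PID
    # that supervisord started becomes the PID of the program itself.
    # Nothing after this line runs if exec succeeds.
    exec "$@"
}

# When called with arguments, run them in the foreground.
if [ "$#" -gt 0 ]; then
    run_foreground "$@"
fi
```

The command= line in the program file would then point at this wrapper followed by the real program and its arguments, and the program must not be given any option that makes it daemonize.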
Also note that supervisord runs a single command, but the start script in turn launches another process, as follows:
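The hope
The hope comes from the fact that I have recently been studying a dcos + syslog-ng setup, where how application teams get to see their logs has become a problem. supervisord's tail -f feature could go a long way here, and it even comes with HTTP authentication.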
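As a rough sketch of that idea, supervisorctl can follow a program's log remotely through the [inet_http_server] endpoint. The port, username and password below are the ones from the configuration above. The helper name follow_log, the SUPERVISOR_HOST variable and the DRY_RUN switch are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: follow a program's log through supervisord's HTTP server.
# follow_log, SUPERVISOR_HOST and DRY_RUN are hypothetical names.

follow_log() {
    program="$1"
    host="${SUPERVISOR_HOST:-127.0.0.1}"   # assumed host running supervisord
    url="http://$host:1009"                # port from [inet_http_server]

    # Build the supervisorctl command once: -s is the server URL,
    # -u and -p are the HTTP credentials from the configuration.
    set -- supervisorctl -s "$url" -u ops -p 123 tail -f "$program"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"      # only print the command
        return 0
    fi
    "$@"               # run it; tail -f keeps streaming until interrupted
}
```

Calling follow_log 123 with SUPERVISOR_HOST set to the node's address would stream the log of the example program. In practice the password would come from somewhere safer than a literal in a script.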