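Requirement from the company:

Write a shell script that monitors the keepalived service. If the service has stopped, the script must restart it automatically in the background and record the failure in a log file: /var/log/run.log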

vim run.sh
#!/bin/bash
while : 
do
   killall -0 keepalived  
   test $? -ne 0 && date +"%Y-%m-%d  %H-%M-%S" >> /var/log/run.log &&  service keepalived start >> /var/log/run.log
done
:x

The same script written with if/then instead of &&:

#!/bin/bash
while : 
do
   killall -0 keepalived
   if test $? -ne 0
   then
         date +"%Y-%m-%d  %H-%M-%S" >> /var/log/run.log
         service keepalived start  >> /var/log/run.log
   fi
done
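
Both versions above poll in a tight loop with no pause, so they keep a CPU core busy and may call service keepalived start several times before the daemon is fully up. A minimal sketch of a gentler variant follows, assuming a short pause between checks is acceptable. The names SERVICE, LOG, INTERVAL, is_running, restart_and_log and monitor are illustrative and not part of the original script.

```shell
#!/bin/bash
# Sketch only: variable and function names here are illustrative assumptions.
SERVICE=${SERVICE:-keepalived}
LOG=${LOG:-/var/log/run.log}
INTERVAL=${INTERVAL:-5}      # seconds between checks (assumed value)

# Succeeds if at least one process with the given name exists.
# Signal 0 sends nothing; stderr is discarded so "no process found"
# does not clutter the terminal.
is_running() {
    killall -0 "$1" 2>/dev/null
}

# Writes a timestamp to the log, then starts the service and logs its output.
restart_and_log() {
    date +"%Y-%m-%d  %H-%M-%S" >> "$LOG"
    service "$1" start >> "$LOG" 2>&1
}

# Checks forever, sleeping between checks instead of spinning.
monitor() {
    while :
    do
        is_running "$SERVICE" || restart_and_log "$SERVICE"
        sleep "$INTERVAL"
    done
}
```

Adding a final line that calls monitor would make this behave like run.sh; it is left out here so the functions can be sourced and tried one at a time.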


Run it in the background:

chmod +x run.sh
./run.sh &
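
A job started with a plain & is still tied to the login shell and may be stopped when the terminal session ends. One common option, sketched here as an assumption rather than something the original setup does, is to launch the script through nohup. The helper name start_in_background is hypothetical.

```shell
# Start a command detached from the terminal's hangup signal and print its PID.
# All output of the command itself is discarded.
start_in_background() {
    nohup "$@" >/dev/null 2>&1 &
    echo $!
}
```

For example, start_in_background ./run.sh prints the PID of the monitor, which can later be passed to kill to stop it.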


Demo:

Script contents:

amp1:~ # cat run.sh  
#!/bin/bash
while : 
do 
   killall -0 keepalived 
   test $? -ne 0 && date +"%Y-%m-%d  %H-%M-%S" >> /var/log/run.log &&  service keepalived start >> /var/log/run.log
   
done
amp1:~ #
amp1:~ # cat run.sh 
#!/bin/bash
while : 
do
   killall -0 keepalived
   if test $? -ne 0
   then
         date +"%Y-%m-%d  %H-%M-%S" >> /var/log/run.log  # the hour, minute and second specifiers are all uppercase
         service keepalived start  >> /var/log/run.log
   fi
done


Run it in the background:

amp1:~ # ./run.sh &
[1] 3999
amp1:~ # 
amp1:~ #

Check the status of keepalived:

amp1:~ # service keepalived status 
Checking for Keepalived daemon                        running

Stop keepalived manually:

amp1:~ # service keepalived stop
Shutting down Keepalived daemon                        done

It comes back up automatically:

amp1:~ # keepalived: no process found
/home/rzrk/keepalived/bin/keepalived: /lib64/libpopt.so.0: no version information available (required by /home/rzrk/keepalived/bin/keepalived)
amp1:~ # service keepalived status 
Checking for Keepalived daemon                          running
amp1:~ # service keepalived stop   # stop it again
Shutting down Keepalived daemon                          done
You have new mail in /var/mail/root
amp1:~ # keepalived: no process found
/home/rzrk/keepalived/bin/keepalived: /lib64/libpopt.so.0: no version information available (required by /home/rzrk/keepalived/bin/keepalived)
amp1:~ # service keepalived status  # check the status: it has been restarted
Checking for Keepalived daemon                           running
amp1:~ #


Check the log file:

amp1:~ # 
amp1:~ # cat /var/log/run.log 
2015-05-15  15-05-20
Starting Keepalived daemon ..done
2015-05-15  15-05-49
Starting Keepalived daemon ..done
Starting Keepalived daemon ..done
Starting Keepalived daemon ..done
Starting Keepalived daemon ..done
Starting Keepalived daemon ..done
Starting Keepalived daemon ..done
2015-05-15  16-05-59
Starting Keepalived daemon ..done
2015-05-15  16-05-02
Starting Keepalived daemon ..done
2015-05-15  16-05-04
Starting Keepalived daemon ..done
2015-05-15  16-05-05
Starting Keepalived daemon ..done
2015-05-15  16-05-05
Starting Keepalived daemon ..done
2015-05-15  16-05-06
Starting Keepalived daemon ..done
2015-05-15  16-05-07
Starting Keepalived daemon ..done
2015-05-15  16-05-23
Starting Keepalived daemon ..done
2015-05-15  16-05-52
Starting Keepalived daemon ..done
2015-05-15  16-05-31
Starting Keepalived daemon ..done
You have new mail in /var/mail/root


Notes:

kill [options] [PID]
killall [options] [process name]    terminates all processes with the given name

I have seen this question -- how to check if a process is running -- in a lot of forums, and the usual answers all seem more complex than necessary:
Checking if /proc/PID exists?
ps aux | grep -E "PID|name" ?
etc...
There is a simpler way: send signal 0 (zero) to the process and let the kernel report whether it exists. Signal 0 delivers nothing to the process; only the existence and permission checks are performed. So all you have to do is run kill -0 PID or killall -0 name and check the exit status with $?
kill -0 pid
killall -0 name 
echo $?
For example:
kill -0 65535
killall -0 mysql 
killall -0 nginx 
If the program is running, the command returns exit status 0; otherwise it returns a non-zero status.
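
The exit-status check can be wrapped in a small helper. This is a minimal sketch; the function name pid_alive is made up for illustration. Note that kill -0 also fails when the process exists but the caller is not permitted to signal it, so a non-zero status really means "not running, or not accessible".

```shell
# Succeeds if the given PID exists and we are allowed to signal it.
pid_alive() {
    kill -0 "$1" 2>/dev/null
}

# $$ is the PID of the current shell, which is certainly running.
if pid_alive "$$"; then
    echo "PID $$ is running"
else
    echo "PID $$ is not running"
fi
```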



Infinite loop:

while :   ## same as: while true
do
# do something
done

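Note: in shell, "#" turns the rest of the line into a comment. The one exception is the shebang on the first line of a script, such as #!/bin/sh, which tells the system which interpreter to run.

Also, ":" is the shell's null command: it does nothing and always returns exit status 0, much like the empty statement ";" in C.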
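
The null command and the while : loop can be tried directly in a shell. The sketch below uses a counter (the variable names status and count are arbitrary) so that the loop stops after three passes instead of running forever.

```shell
# ":" does nothing and always succeeds, so its exit status is 0.
:
status=$?

# "while :" therefore loops until something inside calls break.
count=0
while :
do
    count=$((count + 1))
    if [ "$count" -ge 3 ]; then
        break
    fi
done

echo "exit status of ':' is $status, loop ran $count times"
```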