以前写过一篇,nginx+keepalived 双机互备的文章,写那篇文章的时候没有想过如果apache或者nginx 挂了,而 keepalived 或者 机器没有死,那么主辅是不会切换的,今天就研究了一下该如何监控 nginx进程呢,看官方站看到了。vrrp_script 功能,但是用他的方法实在形不通,可能是我的方法不对,或者是个BUG。所以后来我自己写了个小脚本来完成工作。 <?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" />

环境

Server 1  :  ubuntu-server 8.04.4          192.168.6.162

Server 2  :  userver-server 8.04.4          192.168.6.188

软件

Keepalived 1.1.15

nginx-0.8.35

pcre-8.02

1. 分别在两台服务器上安装nginx
tar jxvf pcre-8.02.tar.bz2
cd pcre-8.02
./configure --prefix=/usr --enable-utf8 --enable-pcregrep-libbz2 --enable-pcregrep-libz
make
make install
tar zxvf nginx-0.8.35.tar.gz
cd nginx-0.8.35
--prefix=/usr/local/nginx --with-pcre --user=www --group=www --with-file-aio --with-http_ssl_module --with-http_flv_module --with-http_gzip_static_module --with-http_stub_status_module --with-cc-opt=' -O3'
make
make install

2. 分别在两台服务器编写配置文件

vim /usr/local/nginx/conf/nginx.conf
user    www www;
worker_processes    1;
error_log    logs/error.log    notice;
pid                logs/nginx.pid;
events {
        worker_connections    1024;
}
http {
        include             mime.types;
        default_type    application/octet-stream;
        sendfile                on;
        tcp_nopush         on;
        keepalive_timeout    65;
        gzip    on;
        server {
                listen             80;
                server_name    localhost;
                index     index.html index.htm;
                root        /var/www;
                error_page     500 502 503 504    /50x.html;
                location = /50x.html {
                        root     html;
                }
        }
}
3. 分别在两台机器创建测试文件

echo "192.168.6.162" > /var/www/index.html
echo "192.168.6.188" > /var/www/index.html
4. 安装 keepalived 

InBlock.gifapt-get install keepalived

5. server 1服务器编写配置文件

vrrp_script chk_http_port {
                script "/opt/nginx_pid.sh"         ###监控脚本
                interval 2                             ###监控时间
                weight 2                                ###目前搞不清楚
}
vrrp_instance VI_1 {
        state MASTER                            ### 设置为 主
        interface eth0                             ### 监控网卡    
        virtual_router_id 51                    ### 这个两台服务器必须一样
        priority 101                                 ### 权重值 MASTRE 一定要高于 BAUCKUP
        authentication {
                     auth_type PASS             ### 加密
                     auth_pass eric                ### 加密的密码,两台服务器一定要一样,不然会出错
        }
        track_script {
                chk_http_port                     ### 执行监控的服务
        }
        virtual_ipaddress {
             192.168.6.7                            ###    VIP 地址
        }
}
6. 在 server 2 服务器 keepalived 配置

vrrp_script chk_http_port {
                script "/opt/nginx_pid.sh"
                interval 2
                weight 2
}
vrrp_instance VI_1 {
        state BACKUP                                ### 设置为 辅机
        interface eth0
        virtual_router_id 51                        ### 与 MASTRE 设置 值一样
        priority 100                                     ### 比 MASTRE权重值 低
        authentication {
                     auth_type PASS
                     auth_pass eric                    ### 密码 与 MASTRE 一样
        }
        track_script {
                chk_http_port
        }
        virtual_ipaddress {
                 192.168.6.7
        }
}
7. 编写监控nginx监控脚本

vim /opt/nginx_pid.sh
#!/bin/bash
# varsion 0.0.2
# 根据一网友说这样做不科学,如果nginx服务起来了,但是我把keepalived 杀掉了,我的理由是,如果nginx死掉了,我觉得就很难在起来,再有就是nagios 当然要给你报警了啊。不过这位同学说的有道理,所以就稍加改了一下脚本
A=`ps -C nginx --no-header |wc -l`                ## 查看是否有 nginx进程 把值赋给变量A
if [ $A -eq 0 ];then                                         ## 如果没有进程值得为 零
                /usr/local/nginx/sbin/nginx
                sleep 3
                if [ `ps -C nginx --no-header |wc -l` -eq 0 ];then
                       killall keepalived                        ## 则结束 keepalived 进程
                fi
fi
8、 测试,分别在两个服务器 启动 nginx 和 keepalived

/usr/local/nginx/sbin/nginx
/etc/init.d/keepalived start
监控 server 1 的日志

Apr 20 18:37:39 nginx Keepalived_vrrp: Registering Kernel netlink command channel
Apr 20 18:37:39 nginx Keepalived_vrrp: Registering gratutious ARP shared channel
Apr 20 18:37:39 nginx Keepalived_vrrp: Opening file '/etc/keepalived/keepalived.conf'.
Apr 20 18:37:39 nginx Keepalived_healthcheckers: Opening file '/etc/keepalived/keepalived.conf'.
Apr 20 18:37:39 nginx Keepalived_healthcheckers: Configuration is using : 3401 Bytes
Apr 20 18:37:39 nginx Keepalived_vrrp: Configuration is using : 35476 Bytes
Apr 20 18:37:40 nginx Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE
Apr 20 18:37:41 nginx Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE
Apr 20 18:37:41 nginx Keepalived_vrrp: Netlink: skipping nl_cmd msg...
Apr 20 18:37:41 nginx Keepalived_vrrp: VRRP_Script(chk_http_port) succeeded
监控 server 2的日志 

Apr2018:38:23 varnish Keepalived_healthcheckers: Opening file '/etc/keepalived/keepalived.conf'.
Apr 20 18:38:23 varnish Keepalived_healthcheckers: Configuration is using : 3405 Bytes
Apr 20 18:38:23 varnish Keepalived_vrrp: Using MII-BMSR NIC polling thread...
Apr 20 18:38:23 varnish Keepalived_vrrp: Registering Kernel netlink reflector
Apr 20 18:38:23 varnish Keepalived_vrrp: Registering Kernel netlink command channel
Apr 20 18:38:23 varnish Keepalived_vrrp: Registering gratutious ARP shared channel
Apr 20 18:38:23 varnish Keepalived_vrrp: Opening file '/etc/keepalived/keepalived.conf'.
Apr 20 18:38:23 varnish Keepalived_vrrp: Configuration is using : 35486 Bytes
Apr 20 18:38:23 varnish Keepalived_vrrp: VRRP_Instance(VI_1) Entering BACKUP STATE
Apr 20 18:38:25 varnish Keepalived_vrrp: VRRP_Script(chk_http_port) succeeded
看日志可以看出,两台服务器的 MASTRE 和 BACUKUP 已经都正常了

现在我们在 server 1 把 nginx 服务器停到

Server 1 $> killall nginx

这时候看server 1的日志

Apr 20 18:41:26 nginx Keepalived_healthcheckers: Terminating Healthchecker child process on signal
Apr 20 18:41:26 nginx Keepalived_vrrp: Terminating VRRP child process on signal
可以看出keepalived 的进程已经停到

这时候看server 2的日志,看是否已经接管

Apr 20 18:41:23 varnish Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE
Apr 20 18:41:24 varnish Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE
Apr 20 18:41:24 varnish Keepalived_vrrp: Netlink: skipping nl_cmd msg...
很明显的看出 server 2 已经接管了,已经变为 MASTER