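1. Introduction
The mainstream high-availability software for web sites and database services today includes Keepalived and Heartbeat. Heartbeat is one of the earlier high-availability tools, while Keepalived is a lightweight solution that is easy to manage and use.
Keepalived is software that works much like a layer 3, 4 and 7 switch. It provides two functions: health checking and VRRP-based redundancy. Keepalived has a modular design in which each module handles a different task. The commonly used modules are: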
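Core: the heart of Keepalived; it starts and maintains the main process and loads and parses the global configuration file.
Check: the health checker; it implements the various health-check methods and parses their configuration, including the LVS configuration.
Vrrp: the VRRPD child process, which implements the VRRP protocol.
Libipfwc: the iptables (ipchains) library, used when configuring LVS.
Libipvs: the IPVS (virtual server cluster) library, also used when configuring LVS.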
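In normal operation Keepalived starts three processes: a parent process that monitors its children, a VRRP child process, and a Checkers child process. Both children are supervised by the system watchdog, and each does its own job. The Checkers child checks the health of the servers; if it finds that a service on the Master is unavailable, it notifies the VRRP child on the same machine, which stops sending advertisements, removes the virtual IP, and switches to the BACKUP state.
Keepalived's job is to monitor the state of the servers. If a web server or MySQL server goes down or malfunctions, Keepalived detects it and removes the faulty server from the pool; once the server is working again, Keepalived automatically adds it back. Layers 3, 4 and 7 correspond to the IP layer, the transport layer and the application layer of the TCP/IP stack. The check at each layer works as follows: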
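Layer 3: When working at layer 3, Keepalived periodically sends an ICMP packet to each server in the pool. If a server's IP address cannot be pinged, Keepalived reports the server as failed and removes it from the pool. In other words, layer 3 checking uses the reachability of the server's IP address as the criterion for whether the server is healthy.
Layer 4: Layer 4 checking decides whether a server is healthy from the state of a TCP port. A web server usually listens on port 80, for example; if Keepalived finds that port 80 is not open, it removes that server from the pool.
Layer 7: Layer 7 checking works at the application layer. Keepalived checks whether the server program is running correctly according to user-defined settings (for example an expected HTTP status code); if the result does not match the settings, Keepalived removes the server from the pool.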
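The three check levels can be approximated from a shell as shown below. This is a minimal sketch for illustration only, not how Keepalived implements its checkers internally. The function names check_l3, check_l4 and check_l7 are hypothetical, and the sketch assumes that ping, timeout, bash and curl are installed.

```shell
#!/bin/bash
# Layer 3: is the host reachable via a single ICMP echo request?
check_l3() {
    ping -c 1 -W 1 "$1" >/dev/null 2>&1
}

# Layer 4: does a TCP connection to host:port succeed within 2 seconds?
# Uses bash's /dev/tcp pseudo-device; $0 is the host and $1 the port inside bash -c.
check_l4() {
    timeout 2 bash -c 'exec 3<>/dev/tcp/$0/$1' "$1" "$2" 2>/dev/null
}

# Layer 7: does the URL answer with the expected HTTP status code (default 200)?
check_l7() {
    local url=$1 expected=${2:-200} code
    code=$(curl -s -o /dev/null -m 3 -w '%{http_code}' "$url") || return 1
    [ "$code" = "$expected" ]
}
```

For example, `check_l4 10.0.0.147 80 || echo "port 80 is down"` reports a closed web port, and `check_l7 http://10.0.0.147/index.html 200` fails when the page does not return status 200.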
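2. How Keepalived VRRP Works
VRRP lets two physical hosts (routers) act together as one virtual router. The two hosts form a virtual router cluster, and the host with the higher priority becomes Master and brings up the VIP. The VIP forwards the IP packets sent by users or handles their requests. In an Nginx+Keepalived setup, user requests go to the Keepalived VIP and are then served by the corresponding service and port on the Master.
A VRRP virtual router cluster consists of several physical routers, but they do not all work at the same time. One of them, the MASTER, does the routing, and the rest are BACKUPs. The MASTER role is not fixed: VRRP has every VRRP router take part in an election, and the winner becomes MASTER.
The MASTER has some privileges: it owns the virtual router's IP address (the VIP), and it is responsible for forwarding packets sent to the gateway address and for answering ARP requests.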
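VRRP implements the virtual router through an election protocol. All protocol messages are sent as IP multicast packets to the multicast address 224.0.0.18. A virtual router is defined by a VRID (range 1-255) and a set of IP addresses, and it presents a well-known MAC address to the outside (00-00-5E-00-01-{VRID}). Within a virtual router cluster, therefore, the MAC and VIP seen by clients are the same no matter which router is MASTER, and client hosts do not need to change their routing configuration when the MASTER changes. (Keepalived keeps the interface's real MAC address unless use_vmac is configured, and announces the new location of the VIP with gratuitous ARP.)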
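The well-known virtual MAC address is derived from the VRID alone, so it can be computed for any virtual_router_id. The helper below is a small illustrative sketch; the function name vrrp_virtual_mac is hypothetical.

```shell
#!/bin/bash
# Print the IPv4 VRRP virtual MAC for a VRID (1-255):
# the fixed prefix 00:00:5e:00:01 followed by the VRID as two hex digits.
vrrp_virtual_mac() {
    printf '00:00:5e:00:01:%02x\n' "$1"
}
```

Calling `vrrp_virtual_mac 50` gives the address that a router using VRID 50 would present when virtual MACs are in use.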
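The MASTER VRRP router keeps sending VRRP multicast packets (VRRP Advertisement messages). A BACKUP does not preempt the MASTER unless its priority is higher. When the MASTER becomes unavailable (the BACKUPs stop receiving advertisements), the BACKUP with the highest priority takes over as MASTER. The takeover is fast, on the order of three advertisement intervals, which preserves service continuity. For security, VRRP packets can carry authentication data (Keepalived supports a simple password, PASS, and IPsec AH); they are not encrypted. VRRP makes IP address failover possible; it is a fault-tolerance protocol that is widely used in production environments.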
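How long a BACKUP waits before taking over, and which router wins, can be worked out by hand. The sketch below uses the VRRPv2 timer formula from RFC 3768 (Master_Down_Interval = 3 x Advertisement_Interval + Skew_Time, with Skew_Time = (256 - Priority) / 256 seconds) and a simplified election that only compares priorities. The function names are hypothetical, and ties, which VRRP breaks by comparing IP addresses, are not handled.

```shell
#!/bin/bash
# master_down_ms ADVERT_INT PRIORITY
# Time in milliseconds a BACKUP waits without advertisements before it
# declares the MASTER down (VRRPv2, RFC 3768), using integer arithmetic.
master_down_ms() {
    local advert_int=$1 priority=$2
    echo $(( 3 * advert_int * 1000 + (256 - priority) * 1000 / 256 ))
}

# elect_master NAME:PRIORITY...
# Print the name of the router with the highest priority.
elect_master() {
    printf '%s\n' "$@" | sort -t: -k2,2nr | head -n 1 | cut -d: -f1
}
```

With an advertisement interval of 5 seconds and a priority of 90, `master_down_ms 5 90` gives about 15.6 seconds, so a shorter interval is the main lever for faster failover.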
3. Installing and Configuring Keepalived
3.1 Installing with yum
yum install -y keepalived*
# Configuration file
/etc/keepalived/keepalived.conf
3.2 Installing from source
[root@localhost src]# yum install -y kernel-devel* openssl-* popt-devel openssh-clients libnl libnl-devel popt
[root@localhost src]# wget https://www.keepalived.org/software/keepalived-2.0.20.tar.gz
[root@localhost src]# tar -xzvf keepalived-2.0.20.tar.gz
[root@localhost src]# mv keepalived-2.0.20 /usr/local/
[root@localhost src]# cd /usr/local/keepalived-2.0.20/
[root@localhost keepalived-2.0.20]# ./configure --prefix=/usr/local/keepalived
[root@localhost keepalived-2.0.20]# make
[root@localhost keepalived-2.0.20]# make install
[root@localhost keepalived]# cp /usr/local/keepalived/sbin/keepalived /usr/sbin/
[root@localhost keepalived]# mkdir /etc/keepalived
[root@localhost keepalived]# cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
[root@localhost keepalived]# cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
[root@localhost keepalived]# systemctl start keepalived
[root@localhost keepalived]# ps -ef|grep keepalived
root 12068 1 0 11:31 ? 00:00:00 /usr/local/keepalived/sbin/keepalived -D
root 12070 12068 0 11:31 ? 00:00:00 /usr/local/keepalived/sbin/keepalived -D
root 12071 12068 0 11:31 ? 00:00:00 /usr/local/keepalived/sbin/keepalived -D
root 12085 6357 0 11:31 pts/0 00:00:00 grep --color=auto keepalived
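The listing shows the expected three processes: one parent (PID 12068) and two children whose parent PID is 12068. A quick way to check the count in a script is sketched below; the function name count_keepalived is hypothetical, and it simply counts matching lines of `ps -ef` output read from standard input.

```shell
#!/bin/bash
# Count keepalived processes in `ps -ef` output read from stdin,
# skipping the grep command that may appear in the listing.
count_keepalived() {
    awk '/keepalived/ && !/grep/ { n++ } END { print n + 0 }'
}
```

For example, `[ "$(ps -ef | count_keepalived)" -eq 3 ] || echo "unexpected number of keepalived processes"` flags a missing child process.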
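3.3 Configuration file parameters
# Global definitions block
global_defs {
    notification_email {  # recipients of the e-mail sent when a switchover happens, one per line
        ***@163.com
    }
    notification_email_from ***@163.com  # sender address
    smtp_server mail.***.net  # SMTP server address
    smtp_connect_timeout 3  # SMTP connection timeout
    router_id LVS_DEVEL  # identifier of the machine running keepalived
}
# Monitor the Nginx process
vrrp_script chk_nginx {
    script "/data/script/nginx.sh"  # service monitoring script; it needs execute (x) permission
    interval 2  # interval between script runs, in seconds
    weight 2  # priority adjustment applied according to the script result
}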
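# VRRP instance definition blocks
vrrp_sync_group VG_1 {  # groups instances (for example on different network segments) so that they switch together
    group {
        VI_1  # instance name
        VI_2
    }
    notify_master /data/sh/nginx.sh  # script run on transition to master
    notify_backup /data/sh/nginx.sh  # script run on transition to backup
    notify /data/sh/nginx.sh  # script run on any transition
    smtp_alert  # send e-mail notifications using the addresses and SMTP server from global_defs
}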
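vrrp_instance VI_1 {
    state BACKUP  # initial state, MASTER|BACKUP
    nopreempt  # do not preempt; set on the higher-priority machine, only effective when state is BACKUP
    interface ens33  # network interface that serves traffic
    lvs_sync_daemon_interface eth0  # interface used by the LVS sync daemon between load balancers
    track_interface {  # extra interfaces to monitor; a fault on any of them triggers a switchover
        eth0
        eth1
    }
    mcast_src_ip <IPADDR>  # source address for multicast packets; defaults to the primary IP of the bound interface
    garp_master_delay <INTEGER>  # delay, in seconds, for sending gratuitous ARP after the transition to master
    virtual_router_id 50  # VRID (virtual router ID); can be seen with: tcpdump vrrp
    priority 90  # priority; the highest priority wins the master election
    advert_int 5  # VRRP advertisement interval in seconds (default 1)
    preempt_delay <INTEGER>  # seconds to wait before preempting (default 0)
    debug  # debug log level
    authentication {  # authentication settings
        auth_type PASS  # authentication type, PASS|AH
        auth_pass 1111  # authentication password
    }
    track_script {  # monitor using the chk_nginx script
        chk_nginx
    }
    virtual_ipaddress {  # the VIP(s)
        10.0.0.158
    }
}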
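# Virtual server definition block
virtual_server 10.0.0.158 3306 {
    delay_loop 6  # health-check interval, in seconds
    lb_algo rr  # scheduling algorithm, rr|wrr|lc|wlc|lblc|sh|dh
    lb_kind DR  # forwarding method, NAT|DR|TUN
    persistence_timeout 5  # session persistence time, in seconds
    protocol TCP  # protocol in use
    real_server 10.0.0.147 3306 {
        weight 1  # default 1; 0 disables the server
        notify_up <string> | <quoted-string>  # script run when the server is detected as up
        notify_down <string> | <quoted-string>  # script run when the server is detected as down
        TCP_CHECK {
            connect_timeout 3  # connection timeout, in seconds
            nb_get_retry 1  # number of retries
            delay_before_retry 1  # delay between retries, in seconds
            connect_port 3306  # port used for the health check
        }
        HTTP_GET {
            url {
                path /index.html  # URL path to check; several url blocks may be given
                digest 24326582a86bee478bac72d5af25089e  # expected digest of the response
                # the digest can be obtained with: genhash -s IP -p 80 -u /index.html
                status_code 200  # expected HTTP status code
            }
        }
    }
}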
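The weight setting of vrrp_script is easy to misread. As documented for Keepalived, a positive weight is added to the instance priority while the script succeeds, and a negative weight is added (lowering the priority) while the script fails. The sketch below models that rule so that the resulting priorities can be compared; the function name effective_priority is hypothetical, and clamping to the valid priority range is left out.

```shell
#!/bin/bash
# effective_priority BASE WEIGHT SCRIPT_EXIT_CODE
# Model of how a tracked script with a non-zero weight changes the priority.
effective_priority() {
    local base=$1 weight=$2 rc=$3
    if [ "$rc" -eq 0 ] && [ "$weight" -gt 0 ]; then
        echo $(( base + weight ))   # script OK, positive weight: bonus
    elif [ "$rc" -ne 0 ] && [ "$weight" -lt 0 ]; then
        echo $(( base + weight ))   # script failed, negative weight: penalty
    else
        echo "$base"                # otherwise the base priority applies
    fi
}
```

With priorities of 100 and 90 and `weight 2`, a master whose script fails still has priority 100, while the healthy backup only reaches 92. The weight alone therefore does not cause a failover with these numbers, which is why the check scripts in the following sections stop Keepalived outright.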
4. Nginx+Keepalived dual-master architecture
Prepare two hosts and install Nginx and Keepalived on each.
keepalived.conf on host 1:
! Configuration File for keepalived
global_defs {
    notification_email {
        ***@163.com
    }
    notification_email_from ***@163.com
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id LVS_DEVEL
}
vrrp_script chk_nginx {
    script "/data/sh/check_nginx.sh"
    interval 2
    weight 2
}
# VIP1
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    lvs_sync_daemon_interface ens33
    virtual_router_id 151
    priority 100
    advert_int 5
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.151
    }
    track_script {
        chk_nginx
    }
}
# VIP2
vrrp_instance VI_2 {
    state BACKUP
    interface ens33
    lvs_sync_daemon_interface ens33
    virtual_router_id 152
    priority 90
    advert_int 5
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 2222
    }
    virtual_ipaddress {
        10.0.0.152
    }
    track_script {
        chk_nginx
    }
}
keepalived.conf on host 2:
! Configuration File for keepalived
global_defs {
    notification_email {
        ***@163.com
    }
    notification_email_from ***@163.com
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id LVS_DEVEL
}
vrrp_script chk_nginx {
    script "/data/sh/check_nginx.sh"
    interval 2
    weight 2
}
# VIP1
vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    lvs_sync_daemon_interface ens33
    virtual_router_id 151
    priority 90
    advert_int 5
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.151
    }
    track_script {
        chk_nginx
    }
}
# VIP2
vrrp_instance VI_2 {
    state MASTER
    interface ens33
    lvs_sync_daemon_interface ens33
    virtual_router_id 152
    priority 100
    advert_int 5
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 2222
    }
    virtual_ipaddress {
        10.0.0.152
    }
    track_script {
        chk_nginx
    }
}
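Create the script /data/sh/check_nginx.sh on both machines with the following content. It stops Keepalived when Nginx is no longer active, which releases the virtual IPs held by that host; to simulate a failure in the test below, stop Nginx first (systemctl stop nginx).
#!/bin/bash
# auto check nginx process: stop keepalived when nginx is not running
if ! systemctl is-active --quiet nginx; then
    systemctl stop keepalived
fi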
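A different approach, shown here only as a sketch and not the method used in this article, is to let the check script report health through its exit code and leave the priority change to Keepalived. The function names probe and check_nginx are hypothetical, and the sketch assumes a systemd host.

```shell
#!/bin/bash
# probe CMD [ARGS...]: run a command quietly and return its exit status.
probe() {
    "$@" >/dev/null 2>&1
}

# Succeed (exit status 0) only while the nginx unit is active.
check_nginx() {
    probe systemctl is-active --quiet nginx
}
```

To use it as a vrrp_script, the script would end with a call to `check_nginx` so that its status becomes the script's exit code, and the vrrp_script block would use a negative weight such as `weight -20`, which drops a failing node from priority 100 to 80, below the peer's 90. This only leads to a failover if the peer is allowed to preempt, that is, if nopreempt is not set on it.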
Test:
Start Nginx and Keepalived on both machines and check the virtual IPs.
# Host 1 holds virtual IP 10.0.0.151
[root@localhost sh]# ip addr list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:b7:36:91 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.147/8 brd 10.255.255.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 10.0.0.151/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::20ff:bdcd:8409:96e9/64 scope link noprefixroute
valid_lft forever preferred_lft forever
# Host 2 holds virtual IP 10.0.0.152
[root@localhost sh]# ip addr list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:68:03:e2 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.150/24 brd 10.0.0.255 scope global noprefixroute dynamic ens33
valid_lft 1528sec preferred_lft 1528sec
inet 10.0.0.152/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::996:6962:aea5:f905/64 scope link noprefixroute
valid_lft forever preferred_lft forever
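Trigger a failover by running the /data/sh/check_nginx.sh script.
# On host 1, once the script has stopped Keepalived, the virtual IP disappears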
[root@localhost sh]# sh /data/sh/check_nginx.sh
[root@localhost sh]# ip addr list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:b7:36:91 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.147/8 brd 10.255.255.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet6 fe80::20ff:bdcd:8409:96e9/64 scope link noprefixroute
valid_lft forever preferred_lft forever
# On host 2, the virtual IP 10.0.0.151 from host 1 has moved over
[root@localhost sh]# ip addr list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:68:03:e2 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.150/24 brd 10.0.0.255 scope global noprefixroute dynamic ens33
valid_lft 1358sec preferred_lft 1358sec
inet 10.0.0.152/32 scope global ens33
valid_lft forever preferred_lft forever
inet 10.0.0.151/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::996:6962:aea5:f905/64 scope link noprefixroute
valid_lft forever preferred_lft forever
# Restart Keepalived on host 1 (with Nginx running again); virtual IP 10.0.0.151 moves back
[root@localhost sh]# systemctl start keepalived
[root@localhost sh]# ip addr list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:b7:36:91 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.147/8 brd 10.255.255.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 10.0.0.151/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::20ff:bdcd:8409:96e9/64 scope link noprefixroute
valid_lft forever preferred_lft forever
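Reading the `ip addr` output by eye works for a manual test, but the same check is easy to script. The helper below is an illustrative sketch; the function name holds_vip is hypothetical, and it only looks for the address in `ip addr` output supplied on standard input.

```shell
#!/bin/bash
# holds_vip VIP: succeed if the `ip addr` output on stdin lists VIP
# as an inet address. The fixed-string match on "inet VIP/" avoids
# treating 10.0.0.15 as a match for 10.0.0.151.
holds_vip() {
    grep -qF "inet $1/"
}
```

For example, `ip addr show dev ens33 | holds_vip 10.0.0.151 && echo "this host owns 10.0.0.151"` can be run on each host before and after the failover.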
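5. Redis+Keepalived high-availability cluster
Prepare two machines and install Redis and Keepalived on each. Configure the two Redis instances as master and replica of each other; the master-replica synchronization setup was covered in an earlier article. The same approach applies to MySQL and other high-availability clusters.
Master configuration file: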
! Configuration File for keepalived
global_defs {
    notification_email {
        ***@163.com
    }
    notification_email_from ***@163.com
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id LVS_DEVEL
}
vrrp_script chk_redis {
    script "/data/sh/check_redis.sh"
    interval 2
    weight 2
}
# VIP1
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    lvs_sync_daemon_interface ens33
    virtual_router_id 151
    priority 100
    advert_int 5
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.158
    }
    track_script {
        chk_redis
    }
}
Backup configuration file:
! Configuration File for keepalived
global_defs {
    notification_email {
        ***@163.com
    }
    notification_email_from ***@163.com
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id LVS_DEVEL
}
vrrp_script chk_redis {
    script "/data/sh/check_redis.sh"
    interval 2
    weight 2
}
# VIP1
vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    lvs_sync_daemon_interface ens33
    virtual_router_id 151
    priority 90
    advert_int 5
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.158
    }
    track_script {
        chk_redis
    }
}
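On both Redis servers, create the script /data/sh/check_redis.sh with the following content:
#!/bin/bash
# auto check redis process: stop keepalived when no redis process is found
NUM=$(ps -ef | grep redis | grep -v grep | grep -v check | wc -l)
if [[ $NUM -eq 0 ]]; then
    systemctl stop keepalived
fi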
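When the Master goes down, the virtual IP automatically moves to the Backup. Because the Master has the higher priority, it takes part in the election again when it recovers, and the virtual IP moves back to it. Note that nopreempt only takes effect on an instance whose initial state is BACKUP, so it does not stop the node configured with state MASTER from reclaiming the VIP here.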
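Counting processes with ps and grep also matches unrelated commands that merely contain the word redis, and it cannot tell a hung server from a healthy one. A stricter check, sketched below as an assumption rather than as part of the original setup, asks the server itself with `redis-cli ping` and expects the reply PONG. The function name redis_alive and the REDIS_CLI override variable are hypothetical.

```shell
#!/bin/bash
# redis_alive HOST PORT: succeed only if the server answers PING with PONG.
# REDIS_CLI may be set to another command (for example a stub in tests);
# it defaults to redis-cli.
redis_alive() {
    local reply
    reply=$("${REDIS_CLI:-redis-cli}" -h "$1" -p "$2" ping 2>/dev/null)
    [ "$reply" = "PONG" ]
}
```

In a check script this could replace the process count, for example `redis_alive 127.0.0.1 6379 || systemctl stop keepalived`; the host and port shown are placeholders for the local Redis instance.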