Advanced Linux: High Availability with keepalived
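I. How keepalived works
How keepalived works: two hosts both install keepalived and start the service. In normal operation, the host in the Master role holds all resources and serves users, while the host in the Backup role acts as a hot standby. When the Master fails, the Backup automatically takes over all of its work, including the VIP and the services bound to it.
Once the failed Master is repaired, it automatically takes back the work it originally handled (this is the default behavior, with preemption enabled), and the Backup releases what it took over during the outage. Both hosts then return to the roles and working state they had at initial startup.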
II. The VRRP protocol
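VRRP: Virtual Router Redundancy Protocol
A group of routers works together in different roles: one master and one or more backups.
The master router (more precisely, its interface) forwards the actual data traffic.
A backup router listens to the state of the master and takes over when the master fails, so that traffic switches over smoothly. Until then it stays on standby as a spare.
VIP: virtual IP
The router interfaces in one VRRP group share a single virtual IP address, which hosts on the LAN use as their default gateway.
VRRP packets are sent to the fixed multicast address 224.0.0.18.
Multicast address of the frame: Destination Address 01:00:5E:00:00:12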
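The frame address above follows from the standard IPv4 multicast mapping: the prefix 01:00:5e followed by the low 23 bits of the group address. A minimal sketch of that mapping, where `multicast_mac` is an illustrative helper name and not part of keepalived (it prints lowercase hex):

```shell
#!/bin/sh
# Map an IPv4 multicast group address to its Ethernet multicast MAC:
# prefix 01:00:5e followed by the low 23 bits of the IP address.
# multicast_mac is an illustrative helper name, not part of keepalived.
multicast_mac() {
    ip=$1
    # Split the dotted quad into octets with parameter expansion.
    rest=${ip#*.}                     # drop the first octet
    o2=${rest%%.*}; rest=${rest#*.}
    o3=${rest%%.*}; o4=${rest#*.}
    # Only 23 bits are copied, so the top bit of the second octet is cleared.
    printf '01:00:5e:%02x:%02x:%02x\n' "$(( o2 & 127 ))" "$o3" "$o4"
}

multicast_mac 224.0.0.18
```

For 224.0.0.18 the last octet 18 becomes hexadecimal 12, which matches the destination address quoted above.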
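How VRRP works:
The election process:
1. All routers or servers send VRRP advertisements and take part in the election. Only machines with the same vrid and authentication password elect among themselves; the one with the highest priority becomes master and all others become backups.
2. The master sends a VRRP advertisement at a fixed interval (Advertisement Interval, 1 second by default) to tell the backups that it is alive.
3. The backups receive the master's advertisements and use them to judge whether the master is healthy. If a backup receives no advertisement within the master-down interval (3 × advert_int plus a small priority-dependent skew time, so a little over 3 seconds with the default interval), it considers the master dead and a new election starts; the VIP then moves to the new master.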
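The timeout a backup waits before declaring the master dead can be worked out from the VRRPv2 timer definitions in RFC 3768. The sketch below assumes those formulas apply unchanged; `master_down_interval` is an illustrative helper, not a keepalived command.

```shell
#!/bin/sh
# VRRPv2 timers (RFC 3768):
#   Skew_Time            = (256 - priority) / 256 seconds
#   Master_Down_Interval = 3 * advert_int + Skew_Time
# master_down_interval is an illustrative helper, not a keepalived command.
master_down_interval() {
    # $1 = advert_int in seconds, $2 = priority of the backup doing the waiting
    awk -v adv="$1" -v prio="$2" \
        'BEGIN { printf "%.3f\n", 3 * adv + (256 - prio) / 256 }'
}

master_down_interval 1 100   # a backup with priority 100
master_down_interval 1 120   # a backup with priority 120
```

Because the skew shrinks as the priority grows, the backup with the highest priority times out first and is the first to claim the master role.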
III. Installing and starting keepalived
1. Install keepalived on both load balancers
[root@lb-1 conf]# yum install keepalived -y
[root@lb2 conf]# yum install keepalived -y
2. Edit the configuration file
[root@lb-1 conf]# cd /etc/keepalived/
[root@lb-1 keepalived]# ls
keepalived.conf
[root@lb-1 keepalived]#
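vrrp_instance VI_1 { # define a VRRP instance named VI_1 (the first VRRP instance)
state MASTER # initial role: master
interface ens33 # network interface used for VRRP, i.e. the interface the VIP is bound to
virtual_router_id 151 # virtual router ID, a group number in the range 1~255
priority 120 # priority, higher wins the election (1~254; 255 means address owner)
advert_int 1 # interval between advertisements, in seconds (int = interval)
authentication {
auth_type PASS # password authentication
auth_pass 1111 # the password itself
}
virtual_ipaddress { # VIP, the virtual IP address
192.168.0.188
}
}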
[root@lb-1 keepalived]# cat keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id LVS_DEVEL
vrrp_skip_check_adv_addr
#vrrp_strict
vrrp_garp_interval 0
vrrp_gna_interval 0
}
vrrp_instance VI_1 {
state MASTER
interface ens33
virtual_router_id 58
priority 120
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.17.188
}
}
[root@lb-1 keepalived]#
# Second machine
[root@lb2 keepalived]# cat keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id LVS_DEVEL
vrrp_skip_check_adv_addr
#vrrp_strict
vrrp_garp_interval 0
vrrp_gna_interval 0
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 58
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.17.188
}
}
[root@lb2 keepalived]#
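A missing closing brace is easy to introduce when editing these nested blocks by hand. The following is a rough sanity check only, not a parser: it counts opening and closing braces outside comments. `check_braces` is an illustrative helper name.

```shell
#!/bin/sh
# Count "{" and "}" in a keepalived-style config file and report whether
# they balance. check_braces is an illustrative helper; it ignores text after
# "#" or "!" (comment markers in keepalived.conf) but does not parse the file.
check_braces() {
    awk '
        { sub(/[#!].*/, "") }    # drop comments
        { opens += gsub(/\{/, "{"); closes += gsub(/\}/, "}") }
        END {
            if (opens == closes) { print "balanced"; exit 0 }
            print "unbalanced: " opens " open, " closes " close"
            exit 1
        }
    ' "$1"
}

# Typical use before restarting the service:
#   check_braces /etc/keepalived/keepalived.conf
```

The function also returns a non-zero status when the counts differ, so it can guard a restart command with `&&`.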
3. Restart the service and observe the result
[root@lb-1 keepalived]# service keepalived restart
Redirecting to /bin/systemctl restart keepalived.service
[root@lb-1 keepalived]#
[root@lb2 keepalived]# service keepalived restart
Redirecting to /bin/systemctl restart keepalived.service
[root@lb2 keepalived]#
[root@lb2 keepalived]# ps aux|grep keep
root 1708 0.0 0.0 123020 2032 ? Ss 16:14 0:00 /usr/sbin/keepalived -D
root 1709 0.0 0.1 133992 7892 ? S 16:14 0:00 /usr/sbin/keepalived -D
root 1712 0.0 0.1 133860 6160 ? S 16:14 0:00 /usr/sbin/keepalived -D
root 1719 0.0 0.0 112832 2392 pts/0 S+ 16:14 0:00 grep --color=auto keepa
[root@lb2 keepalived]#
After a normal startup, keepalived runs three processes:
Parent process: memory management, supervising the child processes, and so on
Child process: the VRRP child
Child process: the healthchecker (checkers) child
[root@lb-1 keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:7b:f5:c4 brd ff:ff:ff:ff:ff:ff
inet 192.168.17.152/24 brd 192.168.17.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.17.188/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe7b:f5c4/64 scope link
valid_lft forever preferred_lft forever
[root@lb-1 keepalived]#
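Rather than reading the full `ip a` listing by eye, the presence of the VIP can be tested in a script. A minimal sketch, assuming the one-line output format of `ip -o -4 addr show`; `has_vip` is an illustrative helper name.

```shell
#!/bin/sh
# Report whether a given address appears in `ip -o -4 addr show` output.
# has_vip is an illustrative helper; it reads the ip output on stdin so the
# parsing can be tried without touching a real interface.
has_vip() {
    # In one-line (-o) output, field 4 is "address/prefix".
    awk -v vip="$1" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Typical use on a load balancer (interface and VIP taken from the config):
#   ip -o -4 addr show dev ens33 | has_vip 192.168.17.188 && echo "VIP is here"
```

The comparison is on the whole address, so 192.168.17.18 would not be mistaken for 192.168.17.188.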
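IV. keepalived architectures
Single-VIP architecture: only the master holds the VIP and the backup has none. The master is busy while the backup sits idle, so hardware utilization is low.
Dual-VIP architecture: each machine runs two VRRP instances, one as master and one as backup, and two VIPs are enabled so that each machine holds one. Both VIPs serve traffic, which avoids having one busy machine and one idle machine, raises utilization, and makes the two machines back each other up.
1. Implementing the dual-VIP architecture
# Configuration on the first machine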
vrrp_instance VI_1 {
state MASTER
interface ens33
virtual_router_id 58
priority 120
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.17.188
}
}
vrrp_instance VI_2 {
state BACKUP
interface ens33
virtual_router_id 59
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.17.199
}
}
[root@lb-1 keepalived]#
# Configuration on the second machine
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 58
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.17.188
}
}
vrrp_instance VI_2 {
state MASTER
interface ens33
virtual_router_id 59
priority 120
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.17.199
}
}
[root@lb-2 keepalived]#
2. Restart keepalived and check the result
[root@lb-1 keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:7b:f5:c4 brd ff:ff:ff:ff:ff:ff
inet 192.168.17.152/24 brd 192.168.17.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.17.188/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe7b:f5c4/64 scope link
valid_lft forever preferred_lft forever
[root@lb-1 keepalived]#
[root@lb-2 keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:5b:79:20 brd ff:ff:ff:ff:ff:ff
inet 192.168.17.151/24 brd 192.168.17.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.17.199/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe5b:7920/64 scope link
valid_lft forever preferred_lft forever
[root@lb-2 keepalived]#
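V. Health checks
Example 1:
Monitor whether the local nginx process is running. If it is not, immediately lower the priority by 30 and observe whether the VIP moves.
# keepalived is only useful while nginx works. If nginx fails, this machine is no longer a load balancer, so it should give up the master role by lowering its priority and yielding to another machine. That requires a health check.
1. Write a script that monitors nginx
There are several ways to tell whether nginx is running:
1. pidof nginx
2. killall -0 nginx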
[root@lb-1 check_nginx]# pwd
/check_nginx
[root@lb-1 check_nginx]# cat check_nginx.sh
#!/bin/bash
if /usr/sbin/pidof nginx &>/dev/null ; then
exit 0
else
exit 1
fi
[root@lb-1 check_nginx]# chmod +x check_nginx.sh
[root@lb-1 check_nginx]# ll
total 8
-rwxr-xr-x 1 root root 82 Mar 26 22:05 check_nginx.sh
keepalived uses the script's exit status to decide whether the check passed:
0: the check succeeded
non-zero: the check failed
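The exit-status convention can be tried directly in a shell before wiring the script into keepalived. A small sketch, where `report_check` is an illustrative helper name:

```shell
#!/bin/sh
# Run a check command and report how its exit status would be read:
# 0 means the check passed, any non-zero value means it failed.
# report_check is an illustrative helper name.
report_check() {
    "$@" >/dev/null 2>&1
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "passed (exit 0)"
    else
        echo "failed (exit $status)"
    fi
}

report_check true
report_check sh -c 'exit 3'
```

On a load balancer the same helper could wrap the real check, for example `report_check /check_nginx/check_nginx.sh`.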
Write the script and make it executable on both load balancers
[root@lb-2 keepalived]# mkdir /check_nginx/
[root@lb-2 keepalived]# cd /check_nginx/
[root@lb-2 check_nginx]# vim check_nginx.sh
[root@lb-2 check_nginx]# chmod +x check_nginx.sh
[root@lb-2 check_nginx]# ll
total 4
-rwxr-xr-x 1 root root 82 Mar 26 22:08 check_nginx.sh
[root@lb-2 check_nginx]#
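2. Define the check script in keepalived
# Define the check script chk_nginx
vrrp_script chk_nginx {
# While /check_nginx/check_nginx.sh exits with 0, the weight below is not applied; only when the script fails (non-zero exit) is the priority reduced by 30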
script "/check_nginx/check_nginx.sh"
interval 1
weight -30
}
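The arithmetic behind `weight -30` is worth spelling out, since the reduction only causes a failover if it pushes the master below the backup. A sketch using the priorities from this article (120 and 100); `effective_priority` is an illustrative helper, not a keepalived command.

```shell
#!/bin/sh
# Effective priority under a tracked script with a negative weight:
# while the script fails, the weight is added to the base priority;
# while it succeeds, the base priority is used unchanged.
# effective_priority is an illustrative helper, not a keepalived command.
effective_priority() {
    # $1 = configured priority, $2 = weight (negative), $3 = script exit status
    if [ "$3" -eq 0 ]; then
        echo "$1"
    else
        echo $(( $1 + $2 ))
    fi
}

master=$(effective_priority 120 -30 1)   # nginx is down on the master
backup=100
if [ "$master" -lt "$backup" ]; then
    echo "master drops to $master, backup ($backup) takes the VIP"
fi
```

With a gap of 20 between the two priorities, the absolute weight has to be larger than 20; a weight of -10 would leave the master at 110 and the VIP would stay where it is.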
3. Reference the check script from keepalived
[root@lb-1 keepalived]# cat keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id LVS_DEVEL
vrrp_skip_check_adv_addr
#vrrp_strict
vrrp_garp_interval 0
vrrp_gna_interval 0
}
vrrp_script chk_nginx {
# While /check_nginx/check_nginx.sh exits with 0, the weight below is not applied; only when the script fails (non-zero exit) is the priority reduced by 30
script "/check_nginx/check_nginx.sh"
interval 1
weight -30
}
vrrp_instance VI_1 {
state MASTER
interface ens33
virtual_router_id 58
priority 120
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.17.188
}
track_script {
chk_nginx
}
}
vrrp_instance VI_2 {
state BACKUP
interface ens33
virtual_router_id 59
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.17.199
}
}
[root@lb-1 keepalived]#
# Now stop the nginx service
[root@lb-1 check_nginx]# nginx -s quit
[root@lb-1 check_nginx]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:7b:f5:c4 brd ff:ff:ff:ff:ff:ff
inet 192.168.17.152/24 brd 192.168.17.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe7b:f5c4/64 scope link
valid_lft forever preferred_lft forever
[root@lb-1 check_nginx]#
# lb-2 now holds both VIPs: the VIP has moved
[root@lb-2 check_nginx]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:5b:79:20 brd ff:ff:ff:ff:ff:ff
inet 192.168.17.151/24 brd 192.168.17.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.17.199/32 scope global ens33
valid_lft forever preferred_lft forever
inet 192.168.17.188/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe5b:7920/64 scope link
valid_lft forever preferred_lft forever
[root@lb-2 check_nginx]#
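Example 2:
If the nginx process is found to be down, stop keepalived immediately.
keepalived can run a script whenever this node enters a given state:
# notify_master: script run after the state changes to MASTER
notify_master /mail/master.sh
# notify_backup: script run after the state changes to BACKUP
notify_backup /mail/backup.sh
# notify_stop: script run after VRRP stops
notify_stop /mail/stop.sh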
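The scripts named above are not shown in this article. As a hedged illustration of what such a script might do, the sketch below appends a timestamped line to a log file on each state change. Both `log_state` and the default log path are hypothetical.

```shell
#!/bin/sh
# Append a timestamped state-change line to a log file.
# log_state and the default log path are hypothetical; a script such as
# /mail/master.sh could call `log_state MASTER`.
log_state() {
    # $1 = new state (MASTER, BACKUP or STOP), $2 = log file (optional)
    logfile=${2:-/tmp/keepalived_state.log}
    printf '%s %s entered state %s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$(uname -n)" "$1" >> "$logfile"
}
```

A log like this makes it easier to reconstruct when a failover happened and which node took over.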
Step 1: write the script
Step 2: call the script from the VRRP instance with notify_backup
[root@lb-1 check_nginx]# ls
check_nginx.sh halt_keepalived.sh
[root@lb-1 check_nginx]# cat halt_keepalived.sh
#!/bin/bash
service keepalived stop
# Make it executable
[root@lb-1 check_nginx]# ll
total 8
-rwxr-xr-x 1 root root 82 Mar 26 22:05 check_nginx.sh
-rwxr-xr-x 1 root root 38 Mar 26 13:35 halt_keepalived.sh
[root@lb-1 check_nginx]#
# Call it from the VRRP instance
vrrp_instance VI_1 {
state MASTER
interface ens33
virtual_router_id 58
priority 120
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.17.188
}
track_script {
chk_nginx
}
# Run the script below as soon as this machine becomes backup
notify_backup "/check_nginx/halt_keepalived.sh"
}
# Test the result
[root@lb-1 keepalived]# nginx -s quit
[root@lb-1 keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:7b:f5:c4 brd ff:ff:ff:ff:ff:ff
inet 192.168.17.152/24 brd 192.168.17.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe7b:f5c4/64 scope link
valid_lft forever preferred_lft forever
[root@lb-1 keepalived]# ps aux | grep nginx
root 98761 0.0 0.0 112824 976 pts/0 S+ 22:19 0:00 grep --color=auto nginx
[root@lb-1 keepalived]# ps aux | grep keep
root 98763 0.0 0.0 112824 980 pts/0 R+ 22:19 0:00 grep --color=auto keep
[root@lb-1 keepalived]#
# The effect is clear: when the master becomes backup, keepalived is stopped as well
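VI. Two phenomena
1. VIP failover
When the master goes down, the VIP moves to the backup server.
# After the keepalived service on the first machine is stopped, the other nodes no longer receive its VRRP advertisements, conclude that the master is down and elect a new master, so the VIP moves to the backup.
2. Split brain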
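- What is split brain?
The same VIP is present on more than one machine at the same time.
- Why does split brain happen?
Causes of split brain:
1. The vrid (virtual router ID) differs between the machines
2. Network problems: a firewall in between blocks the VRRP packets (IP protocol 112, multicast 224.0.0.18) used for the election
3. The authentication passwords differ
- Is split brain harmful? If so, how does it affect the business?
In this setup both machines run the same nginx load balancer, so whichever one receives a request can still serve it and access usually keeps working; it can even look like load balancing. It is still a fault condition: two hosts answering ARP for the same IP can make clients switch between them unpredictably.
Recovery from split brain does have an impact: there is a short interruption that affects the business.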
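One way to spot split brain is to collect the addresses held by each node and look for a VIP that appears on more than one of them. The sketch below assumes the data has already been gathered as "host vip" pairs; `find_split_brain` is an illustrative helper name.

```shell
#!/bin/sh
# Read "host vip" pairs on stdin (one per line, e.g. collected by running
# `ip -o -4 addr show` on each node) and print every VIP that more than one
# host currently holds. find_split_brain is an illustrative helper name.
find_split_brain() {
    awk '
        { holders[$2] = holders[$2] " " $1; count[$2]++ }
        END {
            for (vip in count)
                if (count[vip] > 1)
                    print vip ":" holders[vip]
        }
    ' | sort
}

printf '%s\n' \
    'lb-1 192.168.17.188' \
    'lb-2 192.168.17.188' \
    'lb-2 192.168.17.199' | find_split_brain
```

One node holding two different VIPs, as lb-2 does after a failover in the dual-VIP setup, is not split brain; only the same VIP on two nodes is reported.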