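High Availability
A single point of failure is something a system architecture should avoid wherever possible.
Single point of failure: a critical function exists as only one copy, with no backup. For example, an important application runs on a single node; if that node fails, the whole service becomes unavailable.
High availability (HA): every critical component has a backup. At least two nodes provide the service and back each other up, so when one fails another takes over and the core business is largely unaffected.
High availability (H.A.) means improving the availability of systems and applications by minimizing the downtime caused by routine maintenance (planned) and sudden system crashes (unplanned). It differs from fault tolerance, which is regarded as uninterrupted operation. An HA system is one of the most effective ways for an enterprise to keep its core computer systems from going down because of a failure.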
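As a rough illustration of why a second node helps, the sketch below estimates the availability of N redundant nodes. It assumes (optimistically) that nodes fail independently and that failover is instantaneous; the function name `combined_availability` and the 99% figure are illustrative, not measurements.

```shell
# combined_availability SINGLE_NODE_AVAILABILITY NODE_COUNT
# The service is down only when every node is down at the same time:
#   A_total = 1 - (1 - A)^N      (assumes independent failures)
combined_availability() {
    awk -v a="$1" -v n="$2" 'BEGIN { printf "%.4f\n", 1 - (1 - a) ^ n }'
}

combined_availability 0.99 1   # one node:  0.9900
combined_availability 0.99 2   # two nodes: 0.9999
```

In practice the gain is smaller, because failures are rarely fully independent and failover itself takes a few seconds.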
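HA software: keepalived, heartbeat, HAProxy
keepalived official site: https://www.keepalived.org/
Keepalived is routing software written in C. The main goal of the project is to provide simple and robust load-balancing and high-availability facilities for Linux systems and Linux-based infrastructures.
Keepalived is free and open-source software.
The two core features of Keepalived:
1. Load balancing (LB): based on IPVS. The LVS code is already part of the Linux kernel, so it does not need to be installed separately.
2. High availability (HA): based on the VRRP protocol.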
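VRRP (Virtual Router Redundancy Protocol)
A group of routers works together and takes on different roles: one master and one or more backups.
The master router (its interface) carries the actual data traffic.
The backup routers listen to the master's state and take over its work when the master fails, so that traffic switches over smoothly. They are the standbys, ready at any time.
VRRP packets are sent to the fixed multicast address 224.0.0.18.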
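One way to see these advertisements on the wire is to capture them with tcpdump. This is a minimal sketch, assuming tcpdump is installed, the command is run as root, and the interface is ens33 as in the lab later in this article; `watch_vrrp` is just a helper name chosen here.

```shell
# watch_vrrp [INTERFACE] [COUNT]
# Capture a few VRRP advertisements (IP protocol 112) sent to 224.0.0.18.
watch_vrrp() {
    local iface=${1:-ens33} count=${2:-5}
    tcpdump -i "$iface" -nn -c "$count" 'ip proto 112 and dst host 224.0.0.18'
}

# Example (run as root on a load balancer):
#   watch_vrrp ens33 5
```

If no packets show up on the backup node, the advertisements are probably being dropped somewhere between the two machines.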
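How VRRP works:
The election process:
1. All routers or servers send VRRP advertisements to take part in the election. Only nodes with the same VRID and authentication password take part in the same election. The node with the highest priority is elected master; all the others become backups.
2. The master sends VRRP advertisements periodically (the Advertisement Interval) to tell the backups that it is alive. The default interval is 1 second.
3. The backups receive the master's advertisements and use them to judge whether the master is healthy. If no advertisement arrives within the master-down interval (three advertisement intervals plus a small priority-based skew, so a little over 3 seconds at the default setting rather than 1 second), the master is considered down, a new master is elected, and the VIP drifts to the new master.
The VIP is a virtual IP address. It is the address that actually provides the service and the one that is given to users.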
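The priority comparison in step 1 and the timeout in step 3 can be sketched in shell. This is a simplified illustration, not keepalived's code: the timeout formula follows RFC 3768 (VRRPv2), the function names `elect_master` and `master_down_interval` are made up for this example, and tie-breaking by IP address is ignored.

```shell
# elect_master NAME:PRIORITY...
# Print the node with the highest priority. Ties are not handled here;
# real VRRP breaks them by comparing IP addresses.
elect_master() {
    local best_name="" best_prio=-1 node name prio
    for node in "$@"; do
        name=${node%%:*}
        prio=${node##*:}
        if [ "$prio" -gt "$best_prio" ]; then
            best_name=$name
            best_prio=$prio
        fi
    done
    echo "$best_name"
}

# master_down_interval ADVERT_INT PRIORITY
#   Skew_Time            = (256 - Priority) / 256
#   Master_Down_Interval = 3 * Advertisement_Interval + Skew_Time
master_down_interval() {
    awk -v a="$1" -v p="$2" 'BEGIN { printf "%.3f\n", 3 * a + (256 - p) / 256 }'
}

elect_master lb1:120 lb2:100   # prints lb1
master_down_interval 1 100     # prints 3.609 (seconds a priority-100 backup waits)
```

The skew term means a higher-priority backup times out slightly earlier, so it tends to claim the master role first.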
Setup
Lab environment: Linux (CentOS 7.9), keepalived, nginx
LB1:192.168.152.128
LB2:192.168.152.129
web1:192.168.152.130
web2:192.168.152.132
web3:192.168.152.133
NFS:192.168.152.142
Two load balancers are required.
nginx must be installed on both load balancers and is used for layer-7 load balancing.
[root@lb1 ~]# ps aux|grep nginx
root 1060 0.0 0.0 47248 1220 ? Ss 15:46 0:00 nginx: master process /usr/local/nginx99/sbin/nginx
nginx 1064 0.0 0.1 47668 1972 ? S 15:46 0:00 nginx: worker process
root 1605 0.0 0.0 112824 984 pts/0 S+ 15:52 0:00 grep --color=auto nginx
[root@lb2 ~]# ps aux|grep nginx
root 1050 0.0 0.0 47248 1216 ? Ss 15:46 0:00 nginx: master process /usr/local/nginx99/sbin/nginx
nginx 1053 0.0 0.1 47668 1972 ? S 15:46 0:00 nginx: worker process
root 2007 0.0 0.0 112824 988 pts/0 S+ 15:51 0:00 grep --color=auto nginx
[root@lb2 ~]#
1. Install keepalived on both load balancers
[root@lb1 ~]# yum install keepalived -y
[root@lb2 ~]# yum install keepalived -y
2. Edit the configuration file
[root@lb1 conf]# cd /etc/keepalived/
[root@lb1 keepalived]# ls
keepalived.conf
[root@lb1 keepalived]# vim keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id LVS_DEVEL
vrrp_skip_check_adv_addr
# vrrp_strict
vrrp_garp_interval 0
vrrp_gna_interval 0
}
vrrp_instance VI_1 {
state MASTER
interface ens33
virtual_router_id 58
priority 120
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.152.188
}
}
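Meaning of each parameter in the vrrp_instance block (annotated example; the VRID and VIP here are sample values):
vrrp_instance VI_1 { # define a VRRP instance named VI_1 (the first VRRP instance)
state MASTER # initial role: master
interface ens33 # network interface used for VRRP, i.e. the interface the VIP is bound to
virtual_router_id 151 # virtual router ID (VRID), 1~255
priority 120 # priority; the higher value wins (VRRP reserves 0, and 255 means address owner)
advert_int 1 # advertisement interval: 1 second
authentication {
auth_type PASS # password authentication
auth_pass 1111 # the password
}
virtual_ipaddress { # VIP, the virtual IP address
192.168.0.188
}
}
[root@lb-1 keepalived]# cat keepalived.conf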
! Configuration File for keepalived
global_defs {
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id LVS_DEVEL
vrrp_skip_check_adv_addr
# vrrp_strict
vrrp_garp_interval 0
vrrp_gna_interval 0
}
vrrp_instance VI_1 {
state MASTER
interface ens33
virtual_router_id 58
priority 120
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.152.188
}
}
[root@lb2 keepalived]# vim keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
acassen@firewall.loc
failover@firewall.loc
sysadmin@firewall.loc
}
notification_email_from Alexandre.Cassen@firewall.loc
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id LVS_DEVEL
vrrp_skip_check_adv_addr
# vrrp_strict
vrrp_garp_interval 0
vrrp_gna_interval 0
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 58
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.152.188
}
}
[root@lb2 keepalived]#
3. Start the service
LB1 obtains the VIP 192.168.152.188/32
[root@lb1 keepalived]# service keepalived restart
Redirecting to /bin/systemctl restart keepalived.service
[root@lb1 keepalived]#
[root@lb1 keepalived]# ps aux|grep keepalived
root 1887 0.0 0.0 123056 1404 ? Ss 19:31 0:00 /usr/sbin/keepalived -D
root 1888 0.0 0.1 134028 3412 ? S 19:31 0:00 /usr/sbin/keepalived -D
root 1889 0.0 0.1 133896 2672 ? S 19:31 0:00 /usr/sbin/keepalived -D
root 1901 0.0 0.0 112824 992 pts/0 S+ 19:33 0:00 grep --color=auto keepalived
[root@lb1 keepalived]#
[root@lb1 keepalived]# ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:21:94:73 brd ff:ff:ff:ff:ff:ff
inet 192.168.152.128/24 brd 192.168.152.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.152.188/32 scope global ens33 # the VIP has been acquired
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe21:9473/64 scope link
valid_lft forever preferred_lft forever
[root@lb1 keepalived]#
[root@lb2 keepalived]# service keepalived restart
Redirecting to /bin/systemctl restart keepalived.service
[root@lb2 keepalived]#
[root@lb2 keepalived]# ps aux|grep keepalived
root 2317 0.1 0.0 123056 1408 ? Ss 19:31 0:00 /usr/sbin/keepalived -D
root 2318 0.0 0.1 134028 3416 ? S 19:31 0:00 /usr/sbin/keepalived -D
root 2319 0.0 0.1 133896 2676 ? S 19:31 0:00 /usr/sbin/keepalived -D
root 2331 0.0 0.0 112824 988 pts/0 S+ 19:33 0:00 grep --color=auto keepalived
[root@lb2 keepalived]#
[root@lb2 keepalived]# ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:88:8e:b5 brd ff:ff:ff:ff:ff:ff
inet 192.168.152.129/24 brd 192.168.152.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe88:8eb5/64 scope link
valid_lft forever preferred_lft forever
[root@lb2 keepalived]#
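VIP drift:
When the master goes down, the VIP drifts to the backup server.
The VIP is a special IP address: whichever machine is currently the master (the server providing service to the outside) has the VIP configured on it, so the VIP can be thought of as the address that actually serves users.
VIP: virtual IP address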
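To see which machine currently holds the VIP during a failover test, a small helper can look for it in the output of `ip`. This is a sketch: `vip_present` is a hypothetical helper name, and it reads the one-line-per-address output of `ip -4 -o addr show` from standard input.

```shell
# vip_present VIP
# Reads `ip -4 -o addr show` output on stdin; succeeds if VIP is configured.
# -F treats the dots literally, -w avoids matching 192.168.152.18 inside .188
vip_present() {
    grep -qwF "inet $1"
}

# Example (on LB1 or LB2):
#   ip -4 -o addr show | vip_present 192.168.152.188 && echo "VIP is on this host"
```

Stopping keepalived on LB1 and then running the check on LB2 is a simple way to watch the VIP drift.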
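What is split-brain?
The VIP appears on more than one machine at the same time.
Why does split-brain occur?
Causes of split-brain:
1. The VRIDs (virtual router IDs) differ between the nodes.
2. Network communication problems: for example, a firewall between the nodes blocks the VRRP packets used for the election.
3. The authentication passwords differ.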
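Causes 1 and 3 are configuration mismatches, so they can be caught by comparing the two configuration files before the service is started. This is a rough sketch, assuming both files have been copied to one machine; the function names and the file names in the example are hypothetical.

```shell
# vrrp_fingerprint FILE
# Print the virtual_router_id and auth_pass settings, in file order.
vrrp_fingerprint() {
    awk '$1 == "virtual_router_id" || $1 == "auth_pass" { print $1, $2 }' "$1"
}

# same_vrrp_settings FILE_A FILE_B
# Succeeds only if both files use the same VRIDs and passwords.
same_vrrp_settings() {
    [ "$(vrrp_fingerprint "$1")" = "$(vrrp_fingerprint "$2")" ]
}

# Example:
#   same_vrrp_settings lb1-keepalived.conf lb2-keepalived.conf \
#       || echo "mismatch: risk of split-brain"
```

Cause 2 cannot be seen in the configuration; it has to be checked on the network, by confirming that each node actually receives the other's VRRP packets.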
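Is split-brain harmful? If so, how does it affect the business?
In this setup, where both nodes are stateless nginx load balancers, it does little harm: the site is usually still reachable, and traffic may even end up spread across both machines. For stateful services, two active masters can be far more dangerous.
When split-brain recovers there is still an impact: the service is briefly interrupted.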
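keepalived architectures:
Single-VIP architecture: only the master holds the VIP and the backup has none. The master is busy while the backup sits mostly idle, so hardware utilization is low.
Dual-VIP architecture: each machine runs two VRRP instances, acting as master in one and backup in the other. Two VIPs are enabled, one on each machine, and both serve traffic. This avoids the single-VIP situation where one machine is busy and the other idle, and improves hardware utilization.
Steps for the dual-VIP architecture:
1. Enable two VRRP instances on each machine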
[root@lb1 ~]# cd /etc/keepalived/
[root@lb1 keepalived]# ls
keepalived.conf
[root@lb1 keepalived]# vim keepalived.conf
vrrp_instance VI_1 {
state MASTER
interface ens33
virtual_router_id 58
priority 120
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.152.188
}
}
vrrp_instance VI_2 {
state BACKUP
interface ens33
virtual_router_id 60
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.152.199
}
}
[root@lb2 ~]# cd /etc/keepalived/
[root@lb2 keepalived]# ls
keepalived.conf
[root@lb2 keepalived]# vim keepalived.conf
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 58
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.152.188
}
}
vrrp_instance VI_2 {
state MASTER
interface ens33
virtual_router_id 60
priority 120
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.152.199
}
}
2. Restart the service
[root@lb1 keepalived]# service keepalived restart
Redirecting to /bin/systemctl restart keepalived.service
[root@lb2 keepalived]# service keepalived restart
Redirecting to /bin/systemctl restart keepalived.service
[root@lb2 keepalived]#
3. Check the result
[root@lb1 keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:21:94:73 brd ff:ff:ff:ff:ff:ff
inet 192.168.152.128/24 brd 192.168.152.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.152.188/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe21:9473/64 scope link
valid_lft forever preferred_lft forever
[root@lb1 keepalived]#
[root@lb2 keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:88:8e:b5 brd ff:ff:ff:ff:ff:ff
inet 192.168.152.129/24 brd 192.168.152.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.152.199/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe88:8eb5/64 scope link
valid_lft forever preferred_lft forever
[root@lb2 keepalived]#
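Health checks
When keepalived starts normally, it launches three processes:
a parent process that monitors its children, a VRRP child process, and a checkers child process.
Both child processes are supervised by the system watchdog, and each handles its own job.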
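The healthcheck (checkers) child process checks the health of the servers, for example HTTP or LVS. If it detects that the service on the master is unavailable, it notifies the local VRRP child process, which stops sending advertisements, removes the virtual IP, and switches to the BACKUP state.
keepalived is only valuable while nginx is working properly. If nginx fails, the machine is no longer a usable load balancer: it should give up the master role by lowering its priority and yield to another machine. This requires a health-check mechanism.
Example: monitor whether the local nginx process is running. If it is not, immediately lower the priority by 30 and observe whether the VIP drifts.
1. Write a script that monitors nginx
How can we tell whether nginx is running? Either of the following commands returns 0 when an nginx process exists: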
pidof nginx
killall -0 nginx
[root@lb1 keepalived]# mkdir /nginx
[root@lb1 keepalived]# cd /nginx
[root@lb1 nginx]# ls
[root@lb1 nginx]# vim check_nginx.sh
#!/bin/bash
# Check whether nginx is running; the PID list printed by pidof is discarded
if /usr/sbin/pidof nginx &>/dev/null; then
    exit 0
else
    exit 1
fi
[root@lb1 nginx]# echo $?
0
[root@lb1 nginx]# chmod +x check_nginx.sh
[root@lb1 nginx]# ls
check_nginx.sh
[root@lb1 nginx]#
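2. Define the monitoring script in keepalived
# Define the monitoring script
vrrp_script chk_nginx {
# If /nginx/check_nginx.sh exits with 0, weight -30 is not applied; only when the script fails (non-zero exit status) is the priority reduced by 30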
script "/nginx/check_nginx.sh"
interval 1
weight -30
}
vrrp_instance VI_1 {
state MASTER
interface ens33
virtual_router_id 58
priority 120
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.152.188
}
# Invoke the monitoring script
track_script {
chk_nginx
}
}
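The effect of `weight -30` on the election can be worked through with the numbers used in this article (LB1 priority 120, LB2 priority 100). The sketch below mirrors keepalived's rule that a negative weight is applied while the tracked script fails; `effective_priority` is an illustrative name, and positive weights are deliberately left out.

```shell
# effective_priority BASE WEIGHT SCRIPT_EXIT_STATUS
# Negative weight: lower the priority while the tracked script fails.
effective_priority() {
    local base=$1 weight=$2 rc=$3
    if [ "$weight" -lt 0 ] && [ "$rc" -ne 0 ]; then
        echo $((base + weight))
    else
        echo "$base"
    fi
}

effective_priority 120 -30 0   # nginx running on LB1: 120, above LB2's 100
effective_priority 120 -30 1   # nginx stopped on LB1:  90, below LB2's 100, so the VIP drifts
```

This is also why the weight has to be larger than the priority gap between the two nodes: with a weight of -10, LB1 would drop only to 110 and would keep the VIP even though nginx is down.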
3. Restart the service and check the result
[root@lb1 keepalived]# service keepalived restart
Redirecting to /bin/systemctl restart keepalived.service
[root@lb1 keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:21:94:73 brd ff:ff:ff:ff:ff:ff
inet 192.168.152.128/24 brd 192.168.152.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.152.188/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe21:9473/64 scope link
valid_lft forever preferred_lft forever
[root@lb1 keepalived]#
[root@lb1 keepalived]# nginx -s stop
[root@lb1 keepalived]# ps aux|grep nginx
root 2926 0.0 0.0 112824 984 pts/1 S+ 15:16 0:00 grep --color=auto nginx
[root@lb1 keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:21:94:73 brd ff:ff:ff:ff:ff:ff
inet 192.168.152.128/24 brd 192.168.152.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe21:9473/64 scope link
valid_lft forever preferred_lft forever
[root@lb1 keepalived]#
[root@lb1 keepalived]# nginx
[root@lb1 keepalived]# ps aux|grep nginx
root 2986 0.0 0.0 47252 1224 ? Ss 15:16 0:00 nginx: master process nginx
nginx 2987 0.0 0.1 47672 1976 ? S 15:16 0:00 nginx: worker process
root 3001 0.0 0.0 112828 988 pts/1 S+ 15:16 0:00 grep --color=auto nginx
[root@lb1 keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:21:94:73 brd ff:ff:ff:ff:ff:ff
inet 192.168.152.128/24 brd 192.168.152.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.152.188/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe21:9473/64 scope link
valid_lft forever preferred_lft forever
[root@lb1 keepalived]#
notify_master: script executed after the state changes to MASTER
notify_master /mail/master.sh
notify_backup: script executed after the state changes to BACKUP
notify_backup /mail/backup.sh
notify_stop: script executed after VRRP stops
notify_stop /mail/stop.sh
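The article does not show what these scripts contain. As one possibility, each could simply record the transition so that failovers can be traced afterwards. This is a hypothetical sketch: `log_state`, the log path, and the message format are assumptions for illustration. The notify directives themselves go inside the `vrrp_instance` block.

```shell
# log_state STATE [LOGFILE]
# Append a timestamped line describing the new VRRP state.
log_state() {
    local state=$1 logfile=${2:-/var/log/keepalived-state.log}
    echo "$(date '+%F %T') $(uname -n) entered state $state" >> "$logfile"
}

# /mail/master.sh could then be as small as:
#   log_state MASTER
```

The same function could be extended to send a mail or call an alerting webhook, which is presumably what the /mail/ path hints at.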