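keepalived High Availability
I. What Is High Availability
High availability generally means running two machines with exactly the same business system. When one of them goes down, the other takes over quickly and keeps the service running normally, so users who are connected at the time notice nothing.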
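II. High-Availability Tools (What Can Be Used)
1. Hardware: usually **F5**
2. Software: usually **keepalived**
III. How keepalived Implements High Availability
keepalived is built on VRRP, the Virtual Router Redundancy Protocol, which is mainly used to eliminate single points of failure.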

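For example, a company's network reaches the internet through a gateway. If that router fails, the gateway can no longer forward packets and nobody can get online. What then?
The usual approach is to add a backup node for the router. The problem is that if the master gateway fails, every user has to be repointed to the backup by hand, which is very tedious when there are many users.
Question 1: Suppose every user has been repointed to the backup router. What happens once the master router is repaired?
Question 2: Suppose the master gateway fails. Can we simply give the backup gateway the master gateway's IP address?
In fact this does not work. The first time a PC finds the master gateway through an ARP broadcast, it writes the gateway's MAC address and IP address into its ARP cache, and every later connection and forwarded packet uses that cached entry. Even if we move the IP address, MAC addresses are unique, so the PC's packets still go to the master. (Only when the PC's ARP cache entry expires and it sends a new ARP broadcast does it learn the MAC address that now belongs to the IP on the backup.)
So how can failover happen automatically? This is where VRRP comes in. VRRP uses software or hardware to place a virtual MAC address (VMAC) and a virtual IP address (VIP) in front of the master and the backup. A PC then sends its requests to the VIP, and whether the master or the backup handles them, the PC's ARP cache only ever records the VMAC and the VIP.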
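To make the virtual MAC concrete: the VRRP specifications (RFC 3768 and RFC 5798) reserve the prefix 00:00:5e:00:01 for IPv4 virtual routers and use the virtual router ID as the last byte. The sketch below derives that address in shell. The function name `vrrp_vmac` is invented for this illustration. Note also that keepalived, to the best of my knowledge, only uses such a virtual MAC when `use_vmac` is configured; otherwise it keeps the interface's real MAC and announces the move of the VIP with gratuitous ARP.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the IPv4 VRRP virtual MAC for a VRID (1-255).
vrrp_vmac() {
    case $1 in
        # Reject empty input, leading zeros, and anything that is not a digit
        ''|0*|*[!0-9]*)
            echo "VRID must be an integer between 1 and 255" >&2
            return 1 ;;
    esac
    if [ "$1" -gt 255 ]; then
        echo "VRID must be an integer between 1 and 255" >&2
        return 1
    fi
    # Fixed prefix 00:00:5e:00:01 followed by the VRID as two hex digits
    printf '00:00:5e:00:01:%02x\n' "$1"
}

vrrp_vmac 51   # 51 is the virtual_router_id used in this article
```

For `virtual_router_id 51` this prints `00:00:5e:00:01:33`, because 51 is 0x33 in hexadecimal.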
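1. How is it decided which node is the master and which is the backup? (Election by priority.)
2. If the master fails and the backup takes over automatically, does the master seize control again once it recovers? (Preemptive vs. non-preemptive mode.)
3. What goes wrong if both servers believe they are the master? (Split-brain.)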
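The first question, election by priority, can be sketched as a toy model in shell. This is only an illustration of the idea that the node with the highest priority becomes master. The function name `elect_master` and the "name priority" input format are assumptions made for this example, and ties are not handled (real VRRP breaks ties by comparing IP addresses).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "name priority" pairs from stdin, one per line,
# and print the name of the node with the highest priority.
elect_master() {
    awk 'NR == 1 || $2 > best { best = $2; name = $1 } END { print name }'
}

# The two nodes from this article: lb01 (priority 100) and lb02 (priority 50)
printf '%s\n' 'lb01 100' 'lb02 50' | elect_master

# If lb01 is down, only lb02 takes part in the election
printf '%s\n' 'lb02 50' | elect_master
```

With both nodes alive the sketch prints `lb01`; with lb01 removed it prints `lb02`, which is the automatic takeover described above.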
IV. keepalived High-Availability Environment Setup
1. Prepare the environment
| Host | IP | Role |
| --- | --- | --- |
| lb01 | 172.16.1.5 | master |
| lb02 | 172.16.1.6 | backup |
| keepalived | 192.168.15.4 | VIP |
2. Configure the NFS mount point and the shared directory for the nginx configuration
[ root@nfs ~]
172.16.1.0/20(rw,sync,all_squash,anonuid=1000,anongid=1000)
[ root@lb01 ~]
[ root@nfs nfs]
3. Install keepalived (on lb01 and lb02)
[ root@lb01 ~]
[ root@lb02 ~]
4. Configure the nginx configuration file
[ root@lb02 ~]
upstream hzl {
server 172.16.1.7:8081;
server 172.16.1.8:8082;
server 172.16.1.9:8082;
}
server {
listen 443 ssl;
server_name _;
ssl_certificate /etc/nginx/cert/server.crt;
ssl_certificate_key /etc/nginx/cert/server.key;
location / {
proxy_pass http://hzl;
}
}
server {
listen 80;
server_name 192.168.15.5;
rewrite (.*) https://$server_name$request_uri;
}
[ root@lb02 ~]
[ root@lb02 ~]
5. Configure the keepalived nodes
[ root@lb01 ~]
/etc/keepalived/keepalived.conf
/etc/sysconfig/keepalived
[ root@lb01 ~]
global_defs {
router_id lb01
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 100
advert_int 3
authentication {
auth_type PASS
auth_pass 1314
}
virtual_ipaddress {
192.168.15.4
}
}
[ root@lb02 ~]
global_defs {
router_id lb02
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 51
priority 50
advert_int 3
authentication {
auth_type PASS
auth_pass 1314
}
virtual_ipaddress {
192.168.15.4
}
}
192.168.15.4 www.linux.lb.com
6. Configuration differences

| keepalived setting | MASTER node | BACKUP node |
| --- | --- | --- |
| router_id (unique router identifier) | lb01 | lb02 |
| state (role) | MASTER | BACKUP |
| priority | 100 | 50 |
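The `advert_int` and `priority` values in these configurations also determine how quickly the backup notices that the master has gone quiet. Assuming keepalived follows the VRRPv2 timing in RFC 3768, a backup declares the master down after 3 × advert_int plus a skew time of (256 - priority) / 256 seconds. The sketch below only does that arithmetic; the function name `master_down_interval` is invented for this example, and real failover time also depends on factors it does not model.

```shell
#!/usr/bin/env bash
# Hypothetical helper: VRRPv2 master-down interval in seconds.
#   $1 = advert_int (seconds between advertisements)
#   $2 = priority of the backup node doing the waiting
master_down_interval() {
    awk -v adv="$1" -v prio="$2" \
        'BEGIN { printf "%.3f\n", 3 * adv + (256 - prio) / 256 }'
}

# lb02 in this article: advert_int 3, priority 50
master_down_interval 3 50
```

With the values used for lb02 this gives roughly 9.8 seconds, so lowering `advert_int` is the main lever for faster failover.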
7. Start keepalived
[ root@lb02 ~]
[ root@lb02 ~]
[ root@lb01 ~]
[ root@lb01 ~]
8. Configure keepalived logging
(1) Edit /etc/sysconfig/keepalived
Change KEEPALIVED_OPTIONS="-D" to KEEPALIVED_OPTIONS="-D -d -S 0"
(2) Restart the keepalived service
[ root@lb01 ~]
[ root@lb01 ~]
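(3) Configure syslog: edit /etc/syslog.conf (/etc/rsyslog.conf on systems that use rsyslog) and add the following line
local0.* /var/log/keepalived.log
Note: the first character of local0 is a lowercase letter L, not the digit 1.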
V. keepalived High Availability (Preemptive vs. Non-Preemptive)
1. Start the nodes
[ root@lb01 ~]
inet 192.168.15.4 scope global eth0
[ root@lb01 ~]
[ root@lb02 ~]
inet 192.168.15.4/24 scope global eth0
[ root@lb01 ~]
[ root@lb01 ~]
inet 192.168.15.4/24 scope global eth0
2. Configure non-preemptive mode (nopreempt)
1. Change the node state: both nodes must be set to **BACKUP**
2. Add **nopreempt** on both nodes
3. Keep the priorities different
global_defs {
router_id lb01
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 51
priority 100
advert_int 3
nopreempt
authentication {
auth_type PASS
auth_pass 1314
}
virtual_ipaddress {
192.168.15.4
}
}
global_defs {
router_id lb02
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 51
priority 50
advert_int 3
nopreempt
authentication {
auth_type PASS
auth_pass 1314
}
virtual_ipaddress {
192.168.15.4
}
}
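The three rules for non-preemptive mode are easy to get wrong when editing two files by hand, so a quick consistency check can help. The sketch below is a simplified illustration, not a keepalived feature: the function name `check_nopreempt` is invented, and it assumes a configuration with a single `vrrp_instance` and one directive per line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a keepalived.conf from stdin and succeed only if
# it declares "state BACKUP" and also contains a "nopreempt" line.
check_nopreempt() {
    awk '
        $1 == "state" && $2 == "BACKUP" { backup = 1 }
        $1 == "nopreempt"               { nopreempt = 1 }
        END { exit !(backup && nopreempt) }
    '
}

# Example with a trimmed-down instance block
check_nopreempt <<'EOF' && echo "non-preemptive settings look consistent"
vrrp_instance VI_1 {
state BACKUP
priority 100
nopreempt
}
EOF
```

On a real node the same check could be run as `check_nopreempt < /etc/keepalived/keepalived.conf` on both lb01 and lb02.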
3. Verify the MAC address switch from Windows
[ root@lb01 ~]
inet 192.168.15.4/24 scope global eth0
C:\Users\admin> arp -a
[ root@lb01 ~]
[ root@lb02 ~]
inet 192.168.15.4/24 scope global eth0
C:\Users\admin> arp -a
4. Test access
192.168.15.4 www.linux.lb.com
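VI. keepalived High Availability (Split-Brain)
Split-brain occurs when, for some reason, the two keepalived servers cannot detect within the configured time whether the other one is alive. Each then claims the resources and starts serving on its own, even though both servers are in fact still up and working.
1. Causes of split-brain
1. A loose network cable or other network failure
2. Damaged server hardware (hardware failure)
3. A firewall enabled between the master and backup servers
2. Enable the firewall (on both machines)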
[ root@lb01 ~]
[ root@lb02 ~]
3. Test page access
[ root@lb02 ~]
[ root@lb02 ~]
4. Resolve split-brain
[ root@lb02 ~]
[ root@lb01 ~]
[ root@lb01 ~]
[ root@lb01 ~]
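vip=192.168.15.4
lb02_ip=172.16.1.6
while true; do
    # Ask lb02 over SSH whether it currently holds the VIP
    ssh "$lb02_ip" "ip a | grep -q '$vip'" &>/dev/null
    lb02_has_vip=$?
    # Split-brain: lb02 holds the VIP and this node holds it as well
    if [ "$lb02_has_vip" -eq 0 ] && [ "$(ip add | grep -c "$vip")" -eq 1 ]; then
        echo "ha is split brain.warning."
    else
        echo "ha is ok"
    fi
    sleep 3
done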
[ root@lb01 ~]
[ root@lb01 ~]
[ root@lb02 ~]
[ root@lb01 ~]
[ root@lb01 ~]
[ root@lb02 ~]
[ root@lb02 ~]
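VIP="192.168.15.4"
MASTERIP="172.16.1.5"    # lb01 (master)
BACKUPIP="172.16.1.6"    # lb02 (backup)
while true; do
    # Double quotes expand ${VIP} locally, so the remote grep gets the real address
    PROBE="ip a | grep -q '${VIP}'"
    ssh "${MASTERIP}" "${PROBE}" >/dev/null
    MASTER_STATU=$?
    ssh "${BACKUPIP}" "${PROBE}" >/dev/null
    BACKUP_STATU=$?
    # Both nodes hold the VIP: split-brain, so stop keepalived on the backup
    if [[ $MASTER_STATU -eq 0 && $BACKUP_STATU -eq 0 ]]; then
        ssh "${BACKUPIP}" "systemctl stop keepalived.service"
    fi
    sleep 3
done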
[ root@lb01 ~]
-eq equal to
-ne not equal to
-ge greater than or equal to
-gt greater than
-le less than or equal to
-lt less than
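As a quick way to see these integer test operators in action, the sketch below tries all six on a pair of numbers and reports which ones hold. The function name `compare` is invented for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every integer test operator that is true
# for the pair ($1, $2), separated by spaces.
compare() {
    result=""
    for op in -eq -ne -ge -gt -le -lt; do
        if [ "$1" "$op" "$2" ]; then
            # Append the operator, adding a space only between entries
            result="$result${result:+ }$op"
        fi
    done
    printf '%s\n' "$result"
}

compare 3 5   # prints: -ne -le -lt
compare 5 5   # prints: -eq -ge -le
```

The scripts in this article rely on `-eq` in exactly this way, comparing an exit status or a process count against 0 or 1.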
VII. keepalived High Availability with nginx
1. Resolve the domain name to the VIP
1. nginx listens on all IP addresses by default
2. nginx failover script
[ root@lb01 ~]
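# Check for a running nginx process; [n]ginx keeps grep from matching itself
ps -ef | grep '[n]ginx' &>/dev/null
if [ $? -eq 1 ]; then
    # nginx is down: try to start it once
    systemctl start nginx &>/dev/null
    sleep 3
    ps -ef | grep '[n]ginx' &>/dev/null
    if [ $? -eq 1 ]; then
        # Still down: stop keepalived so the VIP fails over to the other node
        systemctl stop keepalived
    fi
fi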
[ root@lb01 ~]
nginxpid=$(ps -C nginx --no-header | wc -l)
if [ "$nginxpid" -eq 0 ]; then
    systemctl start nginx
    sleep 3
    nginxpid=$(ps -C nginx --no-header | wc -l)
    if [ "$nginxpid" -eq 0 ]; then
        systemctl stop keepalived
    fi
fi
[ root@lb01 keepalived]
[ root@lb01 ~]
nginxnum=$(ps -ef | grep '[n]ginx' | wc -l)
if [ "$nginxnum" -eq 0 ]; then
    systemctl start nginx
    sleep 3
    nginxnum=$(ps -ef | grep '[n]ginx' | wc -l)
    if [ "$nginxnum" -eq 0 ]; then
        systemctl stop keepalived.service
    fi
fi
[ root@lb01 keepalived]
3. Invoke the script
[ root@lb01 ~]
global_defs {
router_id lb01
}
vrrp_script check_web {
script "/etc/keepalived/check_web.sh"
interval 5
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1314
}
virtual_ipaddress {
192.168.15.4
}
track_script {
check_web
}
}
[ root@lb01 keepalived]
192.168.15.4 www.linux.lb.com