一.高可用集群
1.1 集群类型
LB:Load Balance 负载均衡
LVS/HAProxy/nginx(http/upstream, stream/upstream)
HA:High Availability 高可用集群 数据库、Redis SPoF: Single Point of Failure,解决单点故障
HPC:High Performance Computing 高性能集群
1.2 系统可用性
SLA:Service-Level Agreement 服务等级协议(提供服务的企业与客户之间就服务的品质、水准、性能 等方面所达成的双方共同认可的协议或契约)
A = MTBF / (MTBF+MTTR) 指标 :99.9%, 99.99%, 99.999%,99.9999%
1.3 系统故障
硬件故障:设计缺陷、wear out(损耗)、非人为不可抗拒因素 软件故障:设计缺陷 bug
1.4 实现高可用
提升系统高用性的解决方案:降低MTTR- Mean Time To Repair(平均故障时间)
解决方案:建立冗余机制
active/passive 主/备
active/active 双主
active --> HEARTBEAT --> passive active <--> HEARTBEAT <--> active
1.5.VRRP:
Virtual Router Redundancy Protocol
虚拟路由冗余协议,解决静态网关单点风险 物理层:路由器、三层交换机 软件层:keepalived
99.95%:(602430)*(1-0.9995)=21.6分钟 #一般按一个月停机时间统计
二.Keepalived 部署
1. keepalived 简介
vrrp 协议的软件实现,原生设计目的为了高可用 ipvs服务 官网:Keepalived for Linux
功能: 基于vrrp协议完成地址流动 为vip地址所在的节点生成ipvs规则(在配置文件中预先定义)
为ipvs集群的各RS做健康状态检测 基于脚本调用接口完成脚本中定义的功能,进而影响集群事务,以此支持nginx、haproxy等服务
2.keepalived虚拟路由管理
实验环境:
首先检查防火墙是否关闭以及selinux=diabled
realserver1:
[root@realserver1 ~]# yum install httpd -y
[root@realserver1 ~]# systemctl stop firewalld
[root@realserver1 ~]# echo realserver1 - 172.25.254.110 > /var/www/html/index.html
[root@realserver1 ~]# systemctl start httpd
realserver2:
[root@realserver2 ~]# yum install httpd -y
[root@realserver2 ~]# systemctl stop firewalld
[root@realserver2 ~]# echo realserver2 - 172.25.254.120 > /var/www/html/index.html
[root@realserver2 ~]# systemctl start httpd
KA1:
测试:
[root@ka1 ~]# curl 172.25.254.110
realserver1 - 172.25.254.110
[root@ka1 ~]# curl 172.25.254.120
realserver2 - 172.25.254.120
下载lkeepalived:
[root@ka1 ~]# yum install keepalived -y
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# systemctl restart keepalived
KA2:
[root@ka2 ~]# yum install keepalived -y
pingVRRP的两种方法:
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# systemctl restart keepalived
3.keepalived日志分离
[root@ka1 ~]# vim /etc/sysconfig/keepalived
[root@ka1 ~]# cat /etc/sysconfig/keepalived
KEEPALIVED_OPTIONS="-D -S 6"
[root@ka1 ~]# vim /etc/rsyslog.conf
[root@ka1 ~]# systemctl restart keepalived
[root@ka1 ~]# systemctl restart rsyslog.service
[root@ka1 ~]# ll /var/log/keepalived.log
-rw-------. 1 root root 4277 8月 12 21:57 /var/log/keepalived.log
[root@ka1 ~]# cat /var/log/keepalived.log
4.keepalived独立子配置文件
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# systemctl restart keepalived
Job for keepalived.service failed because the control process exited with error code. See "systemctl status keepalived.service" and "journalctl -xe" for details.
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# mkdir -p /etc/keepalived/conf.d
[root@ka1 ~]# vim /etc/keepalived//conf.d/172.25.254.100.conf
[root@ka1 ~]# cat /etc/keepalived//conf.d/172.25.254.100.conf
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 100
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
172.25.254.100/24 dev eth0 label eth0:1
}
}
[root@ka1 ~]# systemctl restart keepalived
三.Keepalived 企业应用示例
1.抢占模式和非抢占模式
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# systemctl restart keepalived
[root@ka1 ~]# systemctl stop keepalived
[root@ka1 ~]# ifconfig
eth0:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.25.254.100 netmask 255.255.255.0 broadcast 0.0.0.0
ether 00:0c:29:0c:6c:2d txqueuelen 1000 (Ethernet)
2.单播模式
默认keepalived主机之间利用多播相互通告消息,会造成网络拥塞,可以替换成单播,减少网络流量
注意:启用 vrrp_strict 时,不能启用单播
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# systemctl restart keepalived
[root@ka1 ~]# tcpdump -i eth0 -nn src host 172.25.254.10 and dst 172.25.254.20
[root@ka2 ~]# vim /etc/keepalived/keepalived.conf
[root@ka2 ~]# systemctl restart keepalived
[root@ka2 ~]# tcpdump -i eth0 -nn src host 172.25.254.20 and dst 172.25.254.10
3.邮件通知
[root@ka1 ~]# vim /etc/mail.rc
[root@ka1 ~]# cat /etc/mail.rc
set from=1921145337@qq.com
set smtp=smtp.qq.com
set smtp-auth-user=1921145337@qq.com
set smtp-auth-password=fdvoyibvazmecfbd
set smtp-auth=login
set ssl-verify=ignore
[root@netmask ~]# echo test message | mail -s test 1921145337@qq.com
发邮件的脚本:
[root@ka1 ~]# cat /etc/keepalived/mail.sh
#!/bin/bash
mail_dst="1921145337@qq.com"
send_message()
{
mail_sub="$HOSTNAME to be $1 vip move"
mail_msg="'date +%F\ %T': vrrp move $HOSTNAME chage $!"
echo $mail_msg | mail -s "mail_sub" $mail_dst
}
case $1 in
master)
send_message master
;;
backup)
send_message backup
;;
fault)
send_message fault
;;
*)
;;
esac
4.多主模式
master/slave的单主架构,同一时间只有一个Keepalived对外提供服务,此主机繁忙,而另一台主机却 很空闲,利用率低下,可以使用master/master的双主架构,解决此问题。
master/master 的双主架构: 即将两个或以上VIP分别运行在不同的keepalived服务器,以实现服务器并行提供web访问的目的,提高服务器资源利用率
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# systemctl restart keepalived.service
[root@ka1 ~]# cat restart keepalived.service
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 100
priority 100
advert_int 1
#nopreempt
#preempt_delay 5
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
172.25.254.100/24 dev eth0 label eth0:1
}
unicast_src_ip 172.25.254.10
unicast_peer {
172.25.254.20
}
}
vrrp_instance VI_2 {
state BACKUP
interface eth0
virtual_router_id 200
priority 80
advert_int 1
#nopreempt
#preempt_delay 5
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
172.25.254.200/24 dev eth0 label eth0:2
}
unicast_src_ip 172.25.254.10
unicast_peer {
172.25.254.20
}
}
5.实现IPVS的高可用性
两台机子,其中一台挂掉还是可以运行:
[root@ka1 ~]# yum install ipvsadm -y
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# systemctl restart keepalived.service
[root@ka1 ~]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.25.254.100:80 wrr
-> 172.25.254.110:80 Route 1 0 0
-> 172.25.254.120:80 Route 1 0 0
[root@ka2 ~]# systemctl stop firewalld
[root@ka2 ~]# systemctl restart keepalived
[root@ka2 ~]# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 172.25.254.100:80 wrr
-> 172.25.254.110:80 Route 1 0 0
-> 172.25.254.120:80 Route 1 0 0
两台机子配置相同,一定记得关闭防火墙
realserver1:
[root@realserver1 ~]# ip a a 172.25.254.100/32 dev lo
[root@realserver1 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet 172.25.254.100/32 scope global lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
[root@realserver1 ~]# vim /etc/sysctl.d/arp.conf
[root@realserver1 ~]# sysctl --system
* Applying /usr/lib/sysctl.d/00-system.conf ...
* Applying /usr/lib/sysctl.d/10-default-yama-scope.conf ...
kernel.yama.ptrace_scope = 0
* Applying /usr/lib/sysctl.d/50-default.conf ...
kernel.sysrq = 16
kernel.core_uses_pid = 1
kernel.kptr_restrict = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.promote_secondaries = 1
net.ipv4.conf.all.promote_secondaries = 1
fs.protected_hardlinks = 1
fs.protected_symlinks = 1
* Applying /usr/lib/sysctl.d/60-libvirtd.conf ...
fs.aio-max-nr = 1048576
* Applying /etc/sysctl.d/99-sysctl.conf ...
* Applying /etc/sysctl.d/arp.conf ...
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.lo.arp_ignore = 1
net.ipv4.conf.lo.arp_announce = 2
* Applying /etc/sysctl.conf ...
[root@realserver1 ~]# systemctl stop firewalld
[root@realserver1 ~]# systemctl start httpd
[root@realserver1 ~]# systemctl restart httpd
realserver2中:
[root@realserver2 ~]# vim /etc/sysctl.d/arp.conf
[root@realserver2 ~]# sysctl --system
四.VRRPScript 配置
分为两步:
1.定义脚本
vrrp_script:自定义资源监控脚本,vrrp实例根据脚本返回值,公共定义,可被多个实例调用,定 义在vrrp实例之外的独立配置块,一般放在global_defs设置块之后。 通常此脚本用于监控指定应用的状态。一旦发现应用的状态异常,则触发对MASTER节点的权重减至 低于SLAVE节点,从而实现 VIP 切换到 SLAVE 节点
vrrp_script <SCRIPT_NAME> {
script <STRING>|<QUOTED-STRING> #此脚本返回值为非0时,会触发下面OPTIONS执行OPTIONS
}
2.调用脚本
track_script:调用vrrp_script定义的脚本去监控资源,定义在VRRP实例之内,调用事先定义的
vrrp_script
track_script {
SCRIPT_NAME_1
SCRIPT_NAME_2
}
1.定义 VRRP script
vrrp_script { #定义一个检测脚本,在global_defs 之外配置
script | #shell命令或脚本路径
interval #间隔时间,单位为秒,默认1秒
timeout #超时时间
weight #默认为0,如果设置此值为负数,
\#当上面脚本返回值为非0时
\#会将此值与本节点权重相加可以降低本节点权重, #即表示fall. #如果是正数,当脚本返回值为0,
\#会将此值与本节点权重相加可以提高本节点权重
\#即表示 rise.通常使用负值
fall #执行脚本连续几次都失败,则转换为失败,建议设为2以上
rise #执行脚本连续几次都成功,把服务器从失败标记为成功
user USERNAME [GROUPNAME] #执行监测脚本的用户或组
init_fail #设置默认标记为失败状态,监测成功之后再转换为成功状态
}
2.调用 VRRP script
vrrp_instance test
{ ... ...
track_script {
check_down
}
}
3.VRRP通过脚本控制 vip/keepalived+lvs
[root@ka1 ~]# vim /mnt/lee
[root@ka1 ~]# sh /mnt/lee
[root@ka1 ~]# touch /etc/keepalived/test.sh
[root@ka1 ~]# vim /etc/keepalived/test.sh
[root@ka1 ~]# cat /etc/keepalived/test.sh
#!/bin/bash
[ ! -f "/mnt/lee" ]
把权重小的那个分给ka2了:
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# systemctl restart keepalived.service
[root@ka1 ~]# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.25.254.10 netmask 255.255.0.0 broadcast 172.25.255.255
inet6 fe80::20c:29ff:fe0c:6c2d prefixlen 64 scopeid 0x20<link>
ether 00:0c:29:0c:6c:2d txqueuelen 1000 (Ethernet)
RX packets 75496 bytes 6118881 (5.8 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 128025 bytes 13978589 (13.3 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth0:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.25.254.100 netmask 255.255.255.0 broadcast 0.0.0.0
ether 00:0c:29:0c:6c:2d txqueuelen 1000 (Ethernet)
4.keepalived+haproxy
两台主机和服务机:同时配置以下的服务,当停止其中的
[root@ka1 ~]# vim /etc/sysctl.conf
[root@ka1 ~]# sysctl -system
[root@ka1 ~]# sysctl -p
net.ipv4.ip_nonlocal_bind = 1
[root@ka1 ~]# vim /etc/haproxy/haproxy.cfg
[root@ka1 ~]# systemctl enable haproxy.service
Created symlink from /etc/systemd/system/multi-user.target.wants/haproxy.service to /usr/lib/systemd/system/haproxy.service.
[root@ka1 ~]# systemctl restart haproxy.service
[root@ka1 ~]# vim /etc/keepalived/test.sh
[root@ka1 ~]# chmod +X /etc/keepalived/test.sh
[root@ka1 ~]# cat /etc/keepalived/test.sh
#!/bin/bash
killall -0 haproxy
[root@ka1 ~]# vim /etc/keepalived/keepalived.conf
[root@ka1 ~]# systemctl restart keepalived.service
[root@realserver1 ~]# vim /etc/sysctl.d/arp.conf
[root@realserver1 ~]# ip a d 172.25.254.100/32 dev lo
[root@realserver1 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
[root@realserver1 ~]# vim /etc/sysctl.d/arp.conf
[root@realserver1 ~]# cat /etc/sysctl.d/arp.conf
net.ipv4.conf.all.arp_ignore=0
net.ipv4.conf.all.arp_announce=0
net.ipv4.conf.lo.arp_ignore=0
net.ipv4.conf.lo.arp_announce=0
[root@netmask ~]# systemctl stop haproxy.service
停止以上的服务,下面的运行也不受影响: