Keepalived+haproxy
What Keepalived and HAProxy are is covered along the way.
Approach:
- HA (high availability) keeps a service running: when one server fails, another server starts up and takes over its work in its place.
- LB (load balancing) is the service that the HA layer protects here.
- So the first task is to get HAProxy working, and then add high availability on top of it.
Load-balancing zabbix with HAProxy
zabbix is used here simply because it happened to be at hand; any other service can be substituted, and the approach carries over.
Install from source
[root@server ~]# dnf -y install make gcc pcre-devel bzip2-devel openssl-devel systemd-devel
--
[root@server ~]# useradd -rMs /sbin/nologin haproxy
--
[root@server ~]# ls
anaconda-ks.cfg haproxy-2.1.3.tar.gz to.sh zabbix-6.2.3.tar.gz
[root@server ~]# tar -xf haproxy-2.1.3.tar.gz
[root@server ~]# ls
anaconda-ks.cfg haproxy-2.1.3 haproxy-2.1.3.tar.gz to.sh zabbix-6.2.3.tar.gz
[root@server ~]# cd haproxy-2.1.3
[root@server haproxy-2.1.3]# make clean
[root@server haproxy-2.1.3]# make -j $(grep 'processor' /proc/cpuinfo |wc -l) \
TARGET=linux-glibc \
USE_OPENSSL=1 \
USE_ZLIB=1 \
USE_PCRE=1 \
USE_SYSTEMD=1
[root@server haproxy-2.1.3]# make install PREFIX=/usr/local/haproxy
[root@server haproxy-2.1.3]# cp haproxy /usr/sbin/
[root@server haproxy-2.1.3]# vim /etc/sysctl.conf    # add the two lines shown below
[root@server haproxy-2.1.3]# sysctl -p
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
[root@server haproxy-2.1.3]#
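`net.ipv4.ip_nonlocal_bind = 1` is what later allows HAProxy and keepalived to bind the virtual IP even while that address sits on the other node, and `ip_forward` enables routing between interfaces. As a quick sanity check, the keys can be read back out of the file; a minimal sketch (the `conf_get` helper and the `/tmp` path are invented for illustration):

```shell
#!/bin/bash
# conf_get KEY FILE -- print the value of KEY from a sysctl.conf-style file.
# Hypothetical helper, for illustration only.
conf_get() {
    local key=$1 file=$2
    # Match "key = value" with optional whitespace and print the value part.
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$file" | tail -n1
}

# Demo against a throwaway copy of the two settings:
printf 'net.ipv4.ip_nonlocal_bind = 1\nnet.ipv4.ip_forward = 1\n' > /tmp/sysctl_demo.conf
conf_get net.ipv4.ip_nonlocal_bind /tmp/sysctl_demo.conf   # prints 1
```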
Configure the service
[root@server haproxy-2.1.3]# mkdir /etc/haproxy
[root@server haproxy-2.1.3]# vim /etc/haproxy/haproxy.cfg
global
log 127.0.0.1 local0 info
#log loghost local0 info
maxconn 20480
#chroot /usr/local/haproxy
pidfile /var/run/haproxy.pid
#maxconn 4000
user haproxy
group haproxy
daemon
#---------------------------------------------------------------------
#common defaults that all the 'listen' and 'backend' sections will
#use if not designated in their block
#---------------------------------------------------------------------
defaults
mode http
log global
option dontlognull
option httpclose
option httplog
#option forwardfor
option redispatch
balance roundrobin
timeout connect 10s
timeout client 10s
timeout server 10s
timeout check 10s
maxconn 60000
retries 3
#-------------- stats page configuration ------------------
listen admin_stats
bind 0.0.0.0:82
stats enable
mode http
log global
stats uri /haproxy_stats
stats realm Haproxy\ Statistics
stats auth admin:admin
#stats hide-version
stats admin if TRUE
stats refresh 30s
#--------------- web settings -----------------------
listen webcluster
bind 0.0.0.0:81
mode http
#option httpchk GET /index.html
log global
maxconn 3000
balance roundrobin
cookie SESSION_COOKIE insert indirect nocache
server web01 <your-backend-ip>:80 check inter 2000 fall 5
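The `balance roundrobin` setting above hands each new request to the next `server` line in turn. The selection idea can be sketched in a few lines of shell (an illustration only, not HAProxy's implementation; the `SERVERS` pool and the `next_server` name are invented):

```shell
#!/bin/bash
# Round-robin selection sketch: each call returns the next backend in turn.
SERVERS=("web01" "web02")
RR_INDEX=0

next_server() {
    local s=${SERVERS[$RR_INDEX]}
    # Advance the index modulo the pool size so selection wraps around.
    RR_INDEX=$(( (RR_INDEX + 1) % ${#SERVERS[@]} ))
    echo "$s"
}

next_server   # prints web01
next_server   # prints web02
next_server   # prints web01
```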
# Start HAProxy by creating a systemd service unit file
[root@server haproxy-2.1.3]# vim /usr/lib/systemd/system/haproxy.service
[root@server haproxy-2.1.3]# cat /usr/lib/systemd/system/haproxy.service
[Unit]
Description=HAProxy Load Balancer
After=syslog.target network.target
[Service]
ExecStartPre=/usr/local/haproxy/sbin/haproxy -f /etc/haproxy/haproxy.cfg -c -q
ExecStart=/usr/local/haproxy/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid
ExecReload=/bin/kill -USR2 $MAINPID
[Install]
WantedBy=multi-user.target
[root@server haproxy]# systemctl restart rsyslog.service
[root@server haproxy]# systemctl restart haproxy.service
Now configure the load balancer
Host | IP | Software to install | OS |
---|---|---|---|
centos8-stream8 | 192.168.245.128 | keepalived, haproxy, zabbix-server | CentOS Stream 8 (all hosts) |
stream1 | 192.168.245.131 | haproxy, zabbix-agent | |
stream2 | 192.168.245.129 | zabbix (test backend) | |
stream3 | 192.168.245.130 | zabbix (test backend) | |
#Step 1: turn off all firewalls
Then install any service on stream2 and stream3; anything that serves a viewable page will do.
zabbix is used here.
Edit the HAProxy configuration file
[root@server haproxy]# vim /etc/haproxy/haproxy.cfg
[root@server haproxy]# cat /etc/haproxy/haproxy.cfg
global
log 127.0.0.1 local0 info
#log loghost local0 info
maxconn 20480
#chroot /usr/local/haproxy
pidfile /var/run/haproxy.pid
#maxconn 4000
user haproxy
group haproxy
daemon
#---------------------------------------------------------------------
#common defaults that all the 'listen' and 'backend' sections will
#use if not designated in their block
#---------------------------------------------------------------------
defaults
mode http
log global
option dontlognull
option httpclose
option httplog
#option forwardfor
option redispatch
balance roundrobin
timeout connect 10s
timeout client 10s
timeout server 10s
timeout check 10s
maxconn 60000
retries 3
#-------------- stats page configuration ------------------
listen admin_stats
bind 0.0.0.0:82
stats enable
mode http
log global
stats uri /haproxy_stats
stats realm Haproxy\ Statistics
stats auth admin:admin
#stats hide-version
stats admin if TRUE
stats refresh 30s
#--------------- web settings -----------------------
listen webcluster
bind 0.0.0.0:81
mode http
#option httpchk GET /index.html
log global
maxconn 3000
balance roundrobin
cookie SESSION_COOKIE insert indirect nocache
server web01 192.168.245.129:80 check inter 2000 fall 5
server web02 192.168.245.130:80 check inter 2000 fall 5
[root@server haproxy]# systemctl restart haproxy.service
Install HAProxy on the other host in exactly the same way.
The steps are identical, so they are not repeated here.
[root@agent haproxy]# vim haproxy.cfg
[root@agent haproxy]# systemctl restart haproxy.service
[root@agent haproxy]# ss -antl
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 128 0.0.0.0:81 0.0.0.0:*
LISTEN 0 128 0.0.0.0:82 0.0.0.0:*
LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
LISTEN 0 128 0.0.0.0:10050 0.0.0.0:*
LISTEN 0 128 [::]:22 [::]:*
Since both backends serve zabbix on port 80, the two distinct test pages reached through HAProxy on port 81 confirm that the load balancing works.
The load balancer is now in place.
Setting up high availability
HA environment
Host | IP | Software to install | OS |
---|---|---|---|
centos8-stream8 | 192.168.245.128 | keepalived, haproxy, zabbix-server | CentOS Stream 8 (all hosts) |
stream1 | 192.168.245.131 | keepalived, haproxy, zabbix-agent | |
Install keepalived
[root@server haproxy]# rm -rf /etc/yum.repos.d/CentOS-Strea*
[root@server haproxy]# curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-vault-8.5.2111.repo
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2495 100 2495 0 0 11995 0 --:--:-- --:--:-- --:--:-- 11995
[root@server haproxy]# yum -y install keepalived
#Same operation on both hosts
Write the configuration file first
[root@server haproxy]# vim /etc/keepalived/keepalived.conf
[root@server scripts]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id lb01
}
vrrp_script haproxy_check {
script "/scripts/check_hapro.sh"
interval 1
weight -20
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass yuxuan
}
virtual_ipaddress {
192.168.245.234
}
track_script {
haproxy_check
}
notify_master "/scripts/notify.sh master"
}
virtual_server 192.168.245.234 80 {
delay_loop 6
lb_algo rr
lb_kind DR
persistence_timeout 50
protocol TCP
real_server 192.168.245.128 80 {
weight 1
TCP_CHECK {
connect_port 80
connect_timeout 3
nb_get_retry 3
delay_before_retry 3
}
}
real_server 192.168.245.131 80 {
weight 1
TCP_CHECK {
connect_port 80
connect_timeout 3
nb_get_retry 3
delay_before_retry 3
}
}
}
[root@server haproxy]# mkdir /scripts
[root@server haproxy]# cd /scripts/
[root@server scripts]# vim check_hapro.sh
[root@server scripts]# cat check_hapro.sh
#!/bin/bash
haproxy=$(ps -A|grep -Ev "grep|$0" |grep '\bhaproxy\b'|wc -l)
if [ $haproxy -lt 1 ];then
systemctl stop keepalived.service
fi
[root@server scripts]# chmod +x check_hapro.sh
[root@server scripts]# ls
check_hapro.sh
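The process test inside check_hapro.sh can be factored into a small function and exercised on its own; a sketch, assuming an exact command-name match is good enough (`count_procs` and `haproxy_down` are invented names):

```shell
#!/bin/bash
# count_procs NAME -- count running processes whose command name is exactly NAME.
# Illustrative refactor of the check used in check_hapro.sh.
count_procs() {
    ps -A -o comm= | grep -cx "$1"
}

haproxy_down() {
    # Succeeds (exit 0) when no haproxy process is left: the condition under
    # which check_hapro.sh stops keepalived so the VIP can fail over.
    [ "$(count_procs haproxy)" -lt 1 ]
}

count_procs no_such_process_xyz   # prints 0
```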
Configure the backup host
[root@agent haproxy]# vim /etc/keepalived/keepalived.conf
[root@agent haproxy]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id lb02
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 51
priority 90
advert_int 1
authentication {
auth_type PASS
auth_pass yuxuan
}
virtual_ipaddress {
192.168.245.234
}
notify_master "/scripts/notify.sh master"
notify_backup "/scripts/notify.sh backup"
}
virtual_server 192.168.245.234 80 {
delay_loop 6
lb_algo rr
lb_kind DR
persistence_timeout 50
protocol TCP
real_server 192.168.245.128 80 {
weight 1
TCP_CHECK {
connect_port 80
connect_timeout 3
nb_get_retry 3
delay_before_retry 3
}
}
real_server 192.168.245.131 80 {
weight 1
TCP_CHECK {
connect_port 80
connect_timeout 3
nb_get_retry 3
delay_before_retry 3
}
}
}
[root@agent haproxy]# ss -antl
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
LISTEN 0 128 0.0.0.0:10050 0.0.0.0:*
LISTEN 0 128 [::]:22 [::]:*
Write the notify script
[root@server haproxy]# mkdir /scripts
[root@server haproxy]# cd /scripts/
[root@server scripts]# vim notify.sh
[root@server scripts]# cat notify.sh
#!/bin/bash
case "$1" in
master)
haproxy=$(ps -A|grep -Ev "grep|$0" |grep '\bhaproxy\b'|wc -l)
if [ $haproxy -lt 1 ];then
systemctl start haproxy
fi
;;
backup)
haproxy=$(ps -A|grep -Ev "grep|$0" |grep '\bhaproxy\b'|wc -l)
if [ $haproxy -gt 0 ];then
systemctl stop haproxy
fi
;;
*)
echo "Usage:$0 master|backup VIP"
;;
esac
[root@server scripts]# chmod +x notify.sh
#Both hosts need this
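The case logic in notify.sh boils down to a small decision table, which can be isolated as a pure function and tested on its own (a sketch; `decide_action` is an invented name, with 1/0 standing for haproxy running or not):

```shell
#!/bin/bash
# decide_action ROLE RUNNING -- distilled notify.sh logic: given the new
# keepalived role and whether haproxy is running (1 = running, 0 = not),
# print what should be done with haproxy.
decide_action() {
    local role=$1 running=$2
    case "$role" in
        master) if [ "$running" -eq 0 ]; then echo "start"; else echo "none"; fi ;;
        backup) if [ "$running" -eq 1 ]; then echo "stop"; else echo "none"; fi ;;
        *)      echo "usage" ;;
    esac
}

decide_action master 0   # prints start (a new master must run haproxy)
decide_action backup 1   # prints stop  (a backup must give haproxy up)
```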
Once this is configured, the service is reachable through the master at 192.168.245.128, while 192.168.245.131 does not serve it.
Normal state:
[root@server scripts]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 00:0c:29:7f:82:fa brd ff:ff:ff:ff:ff:ff
inet 192.168.245.128/24 brd 192.168.245.255 scope global dynamic noprefixroute ens33
valid_lft 940sec preferred_lft 940sec
inet 192.168.245.234/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe7f:82fa/64 scope link noprefixroute
valid_lft forever preferred_lft forever
When 128 goes down, the VIP fails over to the backup.
Split-brain
In a high-availability (HA) system, when the "heartbeat" link between the two nodes breaks, what used to be a single coordinated whole splits into two independent nodes. Having lost contact, each assumes the other has failed, and the HA software on both sides, like a split-brain patient, fights over the shared resources and the application services. The consequences are serious: either the shared resources get carved up and neither side can bring the service up, or both sides bring the service up and write to the shared storage at the same time, corrupting data (a classic example is a database's rotating online logs getting damaged).
The generally agreed countermeasures against HA split-brain are roughly these:
- Add redundant heartbeat links (for example a second line, making the heartbeat itself HA) to reduce the odds of split-brain occurring;
- Use disk locks. The side currently serving locks the shared disk, so that during a split-brain the other side cannot take it. The catch is that if the side holding the disk never releases the lock, the other side can never get it; if the serving node suddenly dies or crashes, it cannot run the unlock command, and the standby cannot take over the shared resources and services. Hence the "smart" lock: the serving side only engages the disk lock when it finds all heartbeat links down (it cannot sense the peer at all), and leaves the disk unlocked in normal operation.
- Set up arbitration. For example, pick a reference IP (such as the gateway): when the heartbeat is completely down, both nodes ping the reference IP, and a failed ping means the break is on that node's own end. Since its service-facing network link is down too, starting (or keeping) the application service would be pointless, so it voluntarily gives up and lets the node that can reach the reference IP run the service. To be safer still, the node that cannot ping the reference IP can simply reboot itself, fully releasing any shared resources it may still hold.
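The reference-IP arbitration described above can be sketched as a shell function; the ping command is injectable so the decision logic can be exercised without a network (`arbitrate` is an invented name, and 192.168.245.2 a hypothetical gateway):

```shell
#!/bin/bash
# arbitrate REF_IP [PING_CMD] -- reference-IP arbitration sketch.
# If the reference IP (e.g. the gateway) is unreachable, assume our own
# uplink is broken and step down instead of fighting for the VIP.
arbitrate() {
    local ref_ip=$1
    local ping_cmd=${2:-"ping -c 2 -W 1"}
    if $ping_cmd "$ref_ip" >/dev/null 2>&1; then
        echo "keep-role"    # the link is fine: keep competing for master
    else
        echo "step-down"    # our side is cut off: yield to the peer
    fi
}

arbitrate 192.168.245.2 true    # 'true' stands in for a successful ping: prints keep-role
arbitrate 192.168.245.2 false   # 'false' stands in for a failed ping: prints step-down
```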
Causes of split-brain
In general, split-brain happens for reasons such as:
- The heartbeat link between the HA pair fails, so the nodes can no longer communicate
- The heartbeat cable itself has failed (severed or aged)
- A NIC or its driver is broken, or there are IP configuration conflicts (direct NIC-to-NIC link)
- Equipment along the heartbeat path fails (NICs, switches)
- The arbitration machine fails (in arbitration-based setups)
- iptables on the HA servers blocks the heartbeat traffic
- The heartbeat NIC address or related settings are misconfigured, so heartbeats fail to send
- Other misconfiguration, such as mismatched heartbeat methods, heartbeat broadcast conflicts, or software bugs
Note:
In keepalived, if the two ends of the same VRRP instance configure different virtual_router_id values, split-brain will also occur.
Here split-brain is simulated by changing the VIP.
Monitoring for split-brain
Split-brain monitoring should run on the backup server, via a zabbix user-defined check.
What should it monitor? Whether the VIP is present on the backup.
The VIP can appear on the backup for two reasons:
- a split-brain occurred
- a normal master/backup failover
The monitor therefore only flags the possibility of split-brain; it cannot prove one occurred, because a normal failover also moves the VIP to the backup.
In every case, though, a change in where the VIP sits signals a role change worth alerting on.
The monitoring script:
[root@server ~]# cd /scripts/
[root@server scripts]# vim check_keepalived.sh
[root@server scripts]# cat check_keepalived.sh
#!/bin/bash
if [ $(ip a show ens33 | grep 192.168.245.234|wc -l) -eq 0 ];then
echo "2"
else
echo "1"
fi
[root@server scripts]# chmod +x check_keepalived.sh
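The same 1/2 convention can also be written as a function over arbitrary `ip a` text, which makes the logic easy to verify off-box (a sketch; `vip_state` is an invented name):

```shell
#!/bin/bash
# vip_state VIP TEXT -- print "1" if VIP appears in TEXT (e.g. `ip a` output),
# else "2": the same return convention as check_keepalived.sh.
vip_state() {
    local vip=$1 text=$2
    if printf '%s\n' "$text" | grep -qF "$vip"; then
        echo "1"
    else
        echo "2"
    fi
}

with_vip="inet 192.168.245.234/32 scope global ens33"
without_vip="inet 192.168.245.128/24 brd 192.168.245.255"
vip_state 192.168.245.234 "$with_vip"      # prints 1
vip_state 192.168.245.234 "$without_vip"   # prints 2
```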
Configure zabbix monitoring:
[root@server scripts]# vim /usr/local/etc/zabbix_agentd.conf
UserParameter=check_keepalived,/scripts/check_keepalived.sh
[root@server scripts]# pkill zabbix
[root@server scripts]# ss -antl
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
LISTEN 0 128 0.0.0.0:9000 0.0.0.0:*
LISTEN 0 128 0.0.0.0:81 0.0.0.0:*
LISTEN 0 128 0.0.0.0:82 0.0.0.0:*
LISTEN 0 128 [::]:22 [::]:*
LISTEN 0 80 *:3306 *:*
LISTEN 0 128 *:80 *:*
[root@server scripts]# zabbix_agentd
[root@server scripts]# ss -antl
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
LISTEN 0 128 0.0.0.0:10050 0.0.0.0:*
LISTEN 0 128 0.0.0.0:9000 0.0.0.0:*
LISTEN 0 128 0.0.0.0:81 0.0.0.0:*
LISTEN 0 128 0.0.0.0:82 0.0.0.0:*
LISTEN 0 128 [::]:22 [::]:*
LISTEN 0 80 *:3306 *:*
LISTEN 0 128 *:80 *:*
[root@server scripts]#
#Monitor the backup server the same way
#Test below; the script and the config change are identical
[root@server scripts]# zabbix_get -s 192.168.245.131 -k check_keepalived
2
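For reference, a backup-side trigger expression might look like the following (a sketch, assuming a host named stream1 and Zabbix 6.x trigger syntax; it fires when the VIP shows up on the backup):

```
last(/stream1/check_keepalived)=1
```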
Switch to the web UI to finish the setup
Set up the monitor for the backup first
For the master the setup is the same; just change the trigger value to 2, so it alerts when the VIP disappears.
The check's return value was shown above; a trigger has to be configured on it. The relevant steps were covered in the earlier zabbix monitoring write-up.