1. High-Availability Design
We use three machines for the GlusterFS hot-standby setup because two machines can only form a replica-2 volume, which cannot avoid split-brain, whereas a replica-3 volume greatly reduces the chance of split-brain, similar in principle to building RAID 5 from three disks. For details, see the official GlusterFS documentation on this topic; if split-brain does occur, follow the official GlusterFS procedure to inspect and repair it.
We combine GlusterFS with Keepalived because, without Keepalived's VIP, a client mounting GlusterFS can only point at one of the three node IPs; if that node's network fails, the client loses access even though the other nodes are healthy. Keepalived's VIP feature gives clients a single virtual IP, and as long as the Keepalived monitoring script handles the states to be watched, the VIP automatically floats to a usable node based on GlusterFS status, Keepalived status, and other conditions such as network reachability. From the client's point of view it always mounts a working node, avoiding the loss of access caused by a single node's network failure.
2. Environment
IP | Hostname | Storage | OS | VIP | Software |
---|---|---|---|---|---|
192.168.1.138 | gluster-node-01 | /dev/sdb | CentOS-7.6 | 192.168.1.142 | glusterfs+keepalived |
192.168.1.140 | gluster-node-02 | /dev/sdb | CentOS-7.6 | 192.168.1.142 | glusterfs+keepalived |
192.168.1.141 | gluster-node-03 | /dev/sdb | CentOS-7.6 | 192.168.1.142 | glusterfs+keepalived |
192.168.1.170 | gluster-client | - | CentOS-7.6 | - | glusterfs |
Note: 192.168.1.142 is the virtual IP that exposes the storage service to clients.
3. Base Configuration
# 1. /etc/hosts entries (set on all 3 nodes)
192.168.1.138 gluster-node-01
192.168.1.140 gluster-node-02
192.168.1.141 gluster-node-03
# 2. Set the hostnames; after logging out and back in, the prompt shows the new name
# On node 1:
hostnamectl set-hostname gluster-node-01
# On node 2:
hostnamectl set-hostname gluster-node-02
# On node 3:
hostnamectl set-hostname gluster-node-03
# 3. Create a filesystem on the data disk (all 3 nodes). If there is no spare disk, add one; a dedicated disk is more reliable. If /dev/sdb1 does not exist yet, partition /dev/sdb first (e.g. with fdisk).
mkfs.xfs /dev/sdb1
# 4. Create the GlusterFS storage mount point (all 3 nodes)
mkdir -p /data/glusterfs
echo "/dev/sdb1 /data/glusterfs xfs defaults 0 0" >> /etc/fstab
mount -a
Note: if no dedicated data disk is available, just create this directory and use it directly as the storage location.
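To keep the device, mount point, and filesystem type consistent between the `mkfs`, `/etc/fstab`, and `mount` steps above, a small sketch like the following can compose the fstab line in one place (`make_fstab_entry` is an illustrative helper, not a system command):

```shell
# Compose the fstab entry used above from its three parts,
# so a typo cannot creep in between the mkfs and fstab steps.
make_fstab_entry() {
    # $1=device  $2=mount point  $3=filesystem type
    printf '%s %s %s defaults 0 0\n' "$1" "$2" "$3"
}

make_fstab_entry /dev/sdb1 /data/glusterfs xfs
# then: make_fstab_entry /dev/sdb1 /data/glusterfs xfs >> /etc/fstab && mount -a
```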
4. Deployment Steps
4.1 Install the GlusterFS Packages
# 1. Install the GlusterFS yum repository (all 3 nodes)
yum install centos-release-gluster -y
# 2. Install the GlusterFS packages (all 3 nodes)
yum install -y glusterfs glusterfs-server glusterfs-cli glusterfs-geo-replication glusterfs-rdma
# 3. Start the glusterd services (all 3 nodes)
systemctl start glusterd glusterfsd glusterfssharedstorage
systemctl enable glusterd glusterfsd glusterfssharedstorage
All three Gluster-related services should now be running:
systemctl status glusterd glusterfsd glusterfssharedstorage
# 4. Open the Gluster ports in the firewall (skip this step if the firewall is disabled; otherwise run on all 3 nodes)
# 24007/tcp is the management port; note that each brick also listens on a port from 49152 upward, so open a range for the bricks as well
firewall-cmd --zone=public --add-port=24007/tcp --permanent
firewall-cmd --zone=public --add-port=49152-49251/tcp --permanent
firewall-cmd --reload
# 5. From node 1, probe nodes 2 and 3 as trusted peers
[root@gluster-node-01 ~]# gluster peer probe gluster-node-02
peer probe: success
[root@gluster-node-01 ~]# gluster peer probe gluster-node-03
peer probe: success
# 6. Check the peer status
# Node 1 now treats nodes 2 and 3 as trusted peers;
# likewise, node 2 treats nodes 1 and 3 as trusted peers,
# and node 3 treats nodes 1 and 2 as trusted peers
###################################################
[root@gluster-node-01 ~]# gluster peer status
Number of Peers: 2
Hostname: gluster-node-02
Uuid: 33b41ce0-4e06-44bf-8a5e-2b1568c39825
State: Accepted peer request (Connected)
Hostname: gluster-node-03
Uuid: e4106bd1-2504-4205-8852-ed86ae8ed946
State: Accepted peer request (Connected)
###################################################
[root@gluster-node-02 ~]# gluster peer status
Number of Peers: 2
Hostname: gluster-node-01
Uuid: 6f5fe502-20c6-407d-b337-a349770b4374
State: Peer in Cluster (Connected)
Hostname: gluster-node-03
Uuid: a02576c8-c68b-4bea-8725-ec492de3f40c
State: Peer in Cluster (Connected)
###################################################
[root@gluster-node-03 ~]# gluster peer status
Number of Peers: 2
Hostname: gluster-node-01
Uuid: 6f5fe502-20c6-407d-b337-a349770b4374
State: Peer in Cluster (Connected)
Hostname: gluster-node-02
Uuid: 50ccce86-2528-443b-bc8e-dee243f4ed16
State: Peer in Cluster (Connected)
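Rather than eyeballing three status dumps, the "State:" lines can be checked mechanically. A hedged sketch (`peer_all_connected` is an illustrative helper) that reads `gluster peer status` output on stdin and succeeds only when every peer reports (Connected):

```shell
# Succeed only if every "State:" line in the input reports (Connected).
peer_all_connected() {
    # the inner pipeline finds any State line NOT containing "(Connected)";
    # the leading ! inverts that, so "no bad line found" means success
    ! grep '^State:' | grep -qv '(Connected)'
}

# usage on a node:
# gluster peer status | peer_all_connected && echo "all peers connected"
```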
4.2 Create the Replicated Volume
# 1. Create the brick directory (run the mkdir on all 3 nodes), then create the replica-3 volume from node 1
[root@gluster-node-01 ~]# mkdir -p /data/glusterfs/repvol
[root@gluster-node-01 ~]# gluster volume create repvol replica 3 gluster-node-01:/data/glusterfs/repvol gluster-node-02:/data/glusterfs/repvol gluster-node-03:/data/glusterfs/repvol
The following output indicates the command succeeded:
volume create: repvol: success: please start the volume to access data
# 2. Start the volume
[root@gluster-node-01 ~]# gluster volume start repvol
The following output indicates the volume started:
volume start: repvol: success
# 3. Check the volume status
[root@gluster-node-01 ~]# gluster volume status
Status of volume: repvol
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick gluster-node-01:/data/glusterfs/repvol  49152  0  Y  27968
Brick gluster-node-02:/data/glusterfs/repvol  49152  0  Y  18776
Brick gluster-node-03:/data/glusterfs/repvol  49152  0  Y  20730
Self-heal Daemon on localhost N/A N/A Y 27985
Self-heal Daemon on gluster-node-02 N/A N/A Y 18793
Self-heal Daemon on gluster-node-03 N/A N/A Y 20747
Task Status of Volume repvol
# 4. View the volume info
[root@gluster-node-01 ~]# gluster volume info
Volume Name: repvol
Type: Replicate
Volume ID: f49fafa4-4a00-4793-94fe-b29627659a58
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: gluster-node-01:/data/glusterfs/repvol
Brick2: gluster-node-02:/data/glusterfs/repvol
Brick3: gluster-node-03:/data/glusterfs/repvol
Options Reconfigured:
cluster.granular-entry-heal: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
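The "Number of Bricks: 1 x 3 = 3" line above is the quick way to confirm the replica factor. A small sketch that extracts it from `gluster volume info` output (`replica_count` is an illustrative helper):

```shell
# Print the replica factor from `gluster volume info` output read on stdin.
# The line has the form "Number of Bricks: 1 x 3 = 3"; splitting on 'x' and
# '=' makes the middle field the replica factor.
replica_count() {
    awk -F'[x=]' '/^Number of Bricks:/ {gsub(/ /,"",$2); print $2}'
}

# usage: [ "$(gluster volume info repvol | replica_count)" = 3 ] && echo "replica 3 confirmed"
```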
4.3 Install the GlusterFS Client
# 1. Install the GlusterFS client software on the client machine
# To keep the client version compatible and consistent with the servers, install the same yum repository as on the servers first, then the client package
yum install centos-release-gluster -y
yum install glusterfs-fuse -y
# 2. Describe the GlusterFS nodes on the client
Edit /etc/hosts and add the GlusterFS node entries:
192.168.1.138 gluster-node-01
192.168.1.140 gluster-node-02
192.168.1.141 gluster-node-03
# 3. Describe the client on the GlusterFS nodes
Add the client's IP and name to /etc/hosts on each node, for example:
192.168.1.170 gluster-client
# 4. Trust the client's subnet on the nodes (all 3 nodes; skip if the firewall is disabled)
firewall-cmd --permanent --add-rich-rule "rule family=ipv4 source address=192.168.1.0/24 accept"
firewall-cmd --reload
# 5. Create the mount point on the client
mkdir -p /data/glusterfs/repvol
# 6. Mount the replicated volume on the client
mount -t glusterfs gluster-node-01:repvol /data/glusterfs/repvol
# 7. Create a 100 MB file on the client to exercise the volume
time dd if=/dev/zero of=/data/glusterfs/repvol/hello bs=100M count=1
# 8. Check that the hello file replicated to every node (verify on all 3 nodes)
ls -lh /data/glusterfs/repvol
The following output confirms the replicated volume works:
total 100M
-rw-r--r--. 2 root root 100M Jun 21 14:40 hello
# 9. Tuning
gluster volume set repvol performance.write-behind on
gluster volume set repvol performance.io-thread-count 2
# Note: the second command sets the number of IO threads to improve read/write throughput; size it to your CPU and do not exceed the number of CPU threads.
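The "do not exceed the number of CPU threads" advice can be encoded directly, so the chosen value is capped automatically (`cap_threads` is an illustrative helper; `nproc` is the standard coreutils command for the CPU thread count):

```shell
# Return the desired thread count, capped at the number of available CPU threads.
cap_threads() {
    # $1=desired thread count  $2=available CPU threads
    [ "$1" -gt "$2" ] && echo "$2" || echo "$1"
}

# usage:
# gluster volume set repvol performance.io-thread-count "$(cap_threads 16 "$(nproc)")"
```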
4.4 Install Keepalived
Note: with the steps above, GlusterFS already provides a reasonably reliable replicated-volume storage service. But with three servers and three IPs, a client that mounts GlusterFS via any single IP still loses the storage service if that one server or its service stops. We therefore add Keepalived, which exposes a single VIP for mounting. Its automatic service monitoring and VIP migration, driven by a vrrp_script that checks GlusterFS status, move the VIP to a healthy node and make the GlusterFS storage service dependable.
# 1. Install on each of the three nodes
yum install keepalived -y
# 2. Edit the keepalived configuration file
# Master node (192.168.1.138)
[root@gluster-node-01 keepalived]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    notification_email {
        root@localhost.loc
    }
    notification_email_from root@localhost.loc
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    script_user root
    enable_script_security
    router_id GFS_HA_MASTER
    vrrp_skip_check_adv_addr
    #vrrp_strict
    vrrp_iptables
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}
vrrp_sync_group GFS_HA_GROUP {
    group {
        GFS_HA_1
    }
}
vrrp_script monitor_glusterfs_status {
    script "/usr/libexec/keepalived/monitor_glusterfs_status.sh"
    interval 5
    fall 2
    rise 1
    weight -25
}
vrrp_instance GFS_HA_1 {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    unicast_src_ip 192.168.1.138
    unicast_peer {
        192.168.1.140
        192.168.1.141
    }
    virtual_ipaddress {
        192.168.1.142/24 dev ens33 label ens33:0
    }
    track_script {
        monitor_glusterfs_status
    }
    track_interface {
        ens33
    }
    notify_master "/usr/libexec/keepalived/keepalived_notify.sh master"
    notify_backup "/usr/libexec/keepalived/keepalived_notify.sh backup"
    notify_fault "/usr/libexec/keepalived/keepalived_notify.sh fault"
    notify_stop "/usr/libexec/keepalived/keepalived_notify.sh stop"
}
# 3. Copy the script files to the other nodes (script contents are shown below)
cd /usr/libexec/keepalived
scp keepalived_notify.sh monitor_glusterfs_status.sh 192.168.1.140:/usr/libexec/keepalived/
scp keepalived_notify.sh monitor_glusterfs_status.sh 192.168.1.141:/usr/libexec/keepalived/
The commands above copy the scripts from 192.168.1.138 to 192.168.1.140 and 192.168.1.141.
# 4. Copy the keepalived configuration file and adjust it
scp /etc/keepalived/keepalived.conf 192.168.1.140:/etc/keepalived/
scp /etc/keepalived/keepalived.conf 192.168.1.141:/etc/keepalived/
On 192.168.1.140 and 192.168.1.141, edit keepalived.conf: change state MASTER to BACKUP, set priority to 90 and 85 respectively, set unicast_src_ip to the local node's IP, and set unicast_peer to the other two nodes' IPs.
The key settings are explained below:
script_user root
enable_script_security
These two lines set the user and the security policy for keepalived scripts. If the service does not run as root, configure the appropriate user and make sure it has permission to execute the scripts.
vrrp_script monitor_glusterfs_status
Declares the vrrp_script, i.e. the check keepalived runs to monitor a service; here it monitors GlusterFS. The path after script is where the monitoring script is stored.
fall 2: two consecutive check failures mark the service as down.
rise 1: a single successful check marks the service as healthy again.
weight is the priority adjustment: when the check script returns 0, the priority is unchanged; when it fails, the node's effective priority becomes the configured priority plus the weight, so a failing check here lowers this node's priority by 25.
The script must be placed under /usr/libexec/keepalived; otherwise many commands inside it are not recognized and the script exits with status 127.
interface ens33 is the NIC that VRRP binds to, normally the NIC that carries the virtual IP.
unicast_src_ip is this node's heartbeat IP when heartbeats run in unicast mode; unicast_peer lists the peers' heartbeat IPs.
virtual_ipaddress configures the VIP: the virtual IP, its subnet mask, and the NIC (and label) it binds to.
track_script names the vrrp_script keepalived should track, i.e. the name declared after vrrp_script.
The four notify entries at the end are the scripts executed when this node transitions into the corresponding state.
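The weight arithmetic can be checked with the numbers from this article's setup (priorities 100/90/85, weight -25): a failing check on the master must push its effective priority below the first backup's. A sketch (`effective_priority` is an illustrative helper):

```shell
# Effective VRRP priority: unchanged while the check passes,
# configured priority plus weight while it fails.
effective_priority() {
    # $1=configured priority  $2=weight  $3=check state (0=ok, 1=failed)
    if [ "$3" -eq 0 ]; then echo "$1"; else echo $(( $1 + $2 )); fi
}

effective_priority 100 -25 1   # master with a failing check drops to 75, below the backup's 90
```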
The configuration above references two scripts: the notify script and the service-monitoring script.
[root@gluster-node-01 keepalived]# vim /usr/libexec/keepalived/keepalived_notify.sh
#!/bin/bash
# keepalived notify script for glusterd
# Note: the exit status of systemctl must be captured immediately;
# in the original, logger overwrote $? before the if test read it.
master() {
    logger -is "this server became master, now checking glusterd status..."
    systemctl status glusterd >/dev/null 2>&1
    rc=$?
    logger -is "glusterd status check result: $rc"
    if [ "$rc" != "0" ]; then
        logger -is "glusterd service is not running, starting it..."
        systemctl start glusterd >/dev/null 2>&1
        logger -is "glusterd start result: $?"
    else
        logger -is "glusterd service is running"
    fi
}
backup() {
    logger -is "this server became backup, now checking glusterd status..."
    systemctl status glusterd >/dev/null 2>&1
    rc=$?
    logger -is "glusterd status check result: $rc"
    if [ "$rc" != "0" ]; then
        logger -is "glusterd service is not running, starting it..."
        systemctl start glusterd >/dev/null 2>&1
        logger -is "glusterd start result: $?"
    fi
}
case $1 in
    master)
        master
        ;;
    backup|fault)
        backup
        ;;
    stop)
        backup
        #systemctl restart keepalived
        ;;
    *)
        echo $"Usage: $0 {master|backup|fault|stop}"
esac
################################################################################################
[root@gluster-node-01 keepalived]# vim /usr/libexec/keepalived/monitor_glusterfs_status.sh
#!/bin/bash
# check the glusterd and glusterfsd processes
pidof glusterd >/dev/null
if [ $? -eq 0 ]; then
    pidof glusterfsd >/dev/null
    if [ $? -eq 0 ]; then
        STATUS=0
    else
        logger -is "glusterfsd service is not running, now trying to restart it..."
        systemctl start glusterfsd >/dev/null 2>&1
        pidof glusterfsd >/dev/null
        if [ $? -eq 0 ]; then
            STATUS=0
        else
            logger -is "failed to restart glusterfsd service"
            STATUS=1
        fi
    fi
else
    logger -is "glusterd service is not running, trying to restart it..."
    systemctl start glusterd >/dev/null 2>&1
    pidof glusterd >/dev/null
    if [ $? -eq 0 ]; then
        pidof glusterfsd >/dev/null
        if [ $? -eq 0 ]; then
            STATUS=0
        else
            logger -is "glusterfsd service is not running, now trying to restart it..."
            systemctl start glusterfsd >/dev/null 2>&1
            pidof glusterfsd >/dev/null
            if [ $? -eq 0 ]; then
                STATUS=0
            else
                logger -is "failed to restart glusterfsd service"
                STATUS=1
            fi
        fi
    else
        logger -is "failed to restart glusterd service"
        #pkill keepalived
        STATUS=1
    fi
fi
exit $STATUS
################################################################################################
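The monitor script repeats the same "check, try to start, re-check" pattern three times. A hedged refactor sketch that is equivalent in effect (`ensure_running` is an illustrative helper, not part of keepalived or GlusterFS):

```shell
# Return 0 if the service's process is running, otherwise try to start it
# once and re-check; log and return 1 on failure.
ensure_running() {
    local svc=$1
    pidof "$svc" >/dev/null && return 0
    logger -is "$svc is not running, trying to start it..."
    systemctl start "$svc" >/dev/null 2>&1
    pidof "$svc" >/dev/null || { logger -is "failed to start $svc"; return 1; }
}

# monitor body (uncomment when used as the vrrp_script):
# ensure_running glusterd && ensure_running glusterfsd
# exit $?
```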
# 5. Start the keepalived service (all three nodes)
systemctl start keepalived
systemctl enable keepalived
systemctl status keepalived
# 6. Check the VIP binding
# Node 138 has the highest configured priority and is MASTER, so while everything is healthy the virtual IP is bound to node 138.
[root@gluster-node-01 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:72:10:71 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.138/24 brd 192.168.1.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.1.142/24 scope global secondary ens33:0
valid_lft forever preferred_lft forever
inet6 fe80::95b4:2a39:bac0:53c7/64 scope link noprefixroute
valid_lft forever preferred_lft forever
4.5 High-Availability Test
# 1. Remount using the VIP
umount /data/glusterfs/repvol
mount -t glusterfs 192.168.1.142:repvol /data/glusterfs/repvol
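For a persistent client mount, the glusterfs fuse client also supports the backup-volfile-servers mount option, which lists fallback servers for fetching the volume layout at mount time, so the initial mount survives even a VIP hiccup. A sketch composing such an fstab line (`make_gluster_fstab_entry` is an illustrative helper; the IPs match this article's setup):

```shell
# Compose a glusterfs fstab entry with fallback volfile servers.
make_gluster_fstab_entry() {
    # $1=primary server  $2=volume  $3=mount point  $4=colon-separated backup servers
    printf '%s:/%s %s glusterfs defaults,_netdev,backup-volfile-servers=%s 0 0\n' \
        "$1" "$2" "$3" "$4"
}

make_gluster_fstab_entry 192.168.1.142 repvol /data/glusterfs/repvol 192.168.1.138:192.168.1.140
# then append the output to /etc/fstab on the client
```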
# 2. Verify VIP failover
[root@gluster-node-01 ~]# systemctl stop keepalived
Run the command above on node 138; the virtual IP floats from this node to the node with the next-highest priority.
# On node 138, the VIP is no longer bound
[root@gluster-node-01 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:72:10:71 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.138/24 brd 192.168.1.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet6 fe80::95b4:2a39:bac0:53c7/64 scope link noprefixroute
valid_lft forever preferred_lft forever
# On node 140, the VIP is now bound
[root@gluster-node-02 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:22:e7:20 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.140/24 brd 192.168.1.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.1.142/24 scope global secondary ens33:0
valid_lft forever preferred_lft forever
inet6 fe80::60cc:310d:36d9:946f/64 scope link noprefixroute
valid_lft forever preferred_lft forever
# 3. After the VIP moves, verify GlusterFS is still usable
# Upload or create a new file from the client, wait a moment, then check whether nodes 140 and 141 hold the same file the client wrote
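That verification can be made precise by comparing checksums of the brick copies instead of just listing filenames. A sketch (`all_checksums_match` is an illustrative helper; run it against the brick paths, e.g. over ssh, after the client writes the file):

```shell
# Succeed only when every file argument has the same md5 checksum.
all_checksums_match() {
    # one unique hash among all files means the copies are identical
    [ "$(md5sum "$@" | awk '{print $1}' | sort -u | wc -l)" -eq 1 ]
}

# usage per node: all_checksums_match /data/glusterfs/repvol/hello
# (or gather the brick copies in one place and pass them all at once)
```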
With that, GlusterFS + Keepalived delivers a highly available storage service across the three servers.