Linux Ops in Practice: A Keepalived + GlusterFS Multi-Node Hot-Standby Cluster



1. High-Availability Design

Three servers are used for the GlusterFS hot-standby setup because two servers can only form a replica 2 volume, which cannot avoid split-brain, whereas a replica 3 volume greatly reduces the chance of split-brain, similar in principle to building a RAID 5 array from three disks. For background, see the official GlusterFS documentation; if split-brain does occur, follow the official GlusterFS procedure to diagnose and repair it.

GlusterFS is combined with Keepalived because, without Keepalived's VIP, a client mounting GlusterFS can only point at the IP of one of the three nodes; if that node's network fails, the client loses access to GlusterFS even though the other nodes are healthy. Keepalived's VIP gives clients a single virtual IP, and as long as the Keepalived monitoring script handles the relevant checks, the VIP automatically fails over to an available node based on the GlusterFS state, the Keepalived state, and other conditions such as network reachability. From the client's point of view, the mount always targets a healthy node, so a single node's network failure no longer makes the storage unreachable.


2. Environment

IP              Hostname          Storage    OS          VIP             Software
192.168.1.138   gluster-node-01   /dev/sdb   CentOS-7.6  192.168.1.142   glusterfs + keepalived
192.168.1.140   gluster-node-02   /dev/sdb   CentOS-7.6  192.168.1.142   glusterfs + keepalived
192.168.1.141   gluster-node-03   /dev/sdb   CentOS-7.6  192.168.1.142   glusterfs + keepalived
192.168.1.170   gluster-client    -          CentOS-7.6  -               glusterfs

Note: 192.168.1.142 is the virtual IP that exposes the storage service to clients.


3. Base Configuration

# 1. /etc/hosts entries (set on all 3 nodes)
192.168.1.138 gluster-node-01
192.168.1.140 gluster-node-02
192.168.1.141 gluster-node-03

# 2. Set the hostnames; after logging out and back in, the prompt shows the new hostname
# On node 1:
hostnamectl set-hostname gluster-node-01
# On node 2:
hostnamectl set-hostname gluster-node-02
# On node 3:
hostnamectl set-hostname gluster-node-03

# 3. Create a filesystem on the data disk (if there is none, add a disk; a dedicated data disk is more reliable). Run on all 3 nodes; the command assumes /dev/sdb has already been partitioned into /dev/sdb1.
mkfs.xfs /dev/sdb1
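If /dev/sdb is still an unpartitioned raw disk, one way to create a single partition spanning the whole disk is the sketch below (this assumes the entire disk is dedicated to GlusterFS; adjust to your own disk layout):
# create a GPT label and one partition covering all of /dev/sdb (run on all 3 nodes)
parted -s /dev/sdb mklabel gpt
parted -s /dev/sdb mkpart primary xfs 0% 100%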

# 4. Create the GlusterFS storage mount point (run on all 3 nodes)
mkdir -p /data/glusterfs
echo "/dev/sdb1 /data/glusterfs xfs defaults 0 0" >> /etc/fstab
mount -a

Note: if there is no dedicated data disk, simply create the directory and use it as the data store.
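Before moving on, it is worth confirming that the data disk is mounted with the expected filesystem (a quick sanity check, not part of the original steps):
df -hT /data/glusterfs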


4. Deployment Steps

4.1 Install the GlusterFS packages

# 1. Install the GlusterFS yum repository (run on all 3 nodes)
yum install centos-release-gluster -y

# 2. Install the GlusterFS packages (run on all 3 nodes)
yum install -y glusterfs glusterfs-server glusterfs-cli glusterfs-geo-replication glusterfs-rdma

# 3. Start the Gluster services (run on all 3 nodes)
systemctl start glusterd glusterfsd glusterfssharedstorage
systemctl enable glusterd glusterfsd glusterfssharedstorage

You can then see that all 3 Gluster-related services are running:
systemctl status glusterd glusterfsd glusterfssharedstorage

# 4. Open the Gluster management port in the firewall (skip this step if the firewall is disabled; otherwise run on all 3 nodes)
firewall-cmd --zone=public --add-port=24007/tcp --permanent
firewall-cmd --reload
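Note that 24007/tcp is only the glusterd management port. Each brick process listens on its own port starting at 49152 (the volume status output later in this article shows the bricks on 49152), so with the firewall enabled those ports must be reachable as well; a sketch for a small range:
firewall-cmd --zone=public --add-port=49152-49251/tcp --permanent
firewall-cmd --reload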

# 5. On node 1, probe nodes 2 and 3 as trusted peers
[root@gluster-node-01 ~]# gluster peer probe gluster-node-02
peer probe: success
[root@gluster-node-01 ~]# gluster peer probe gluster-node-03
peer probe: success

# 6. Check the peer status
# Node 1 now treats nodes 2 and 3 as trusted peers;
# likewise, node 2 treats nodes 1 and 3 as trusted peers,
# and node 3 treats nodes 1 and 2 as trusted peers
###################################################
[root@gluster-node-01 ~]# gluster peer status
Number of Peers: 2

Hostname: gluster-node-02
Uuid: 33b41ce0-4e06-44bf-8a5e-2b1568c39825
State: Accepted peer request (Connected)

Hostname: gluster-node-03
Uuid: e4106bd1-2504-4205-8852-ed86ae8ed946
State: Accepted peer request (Connected)
###################################################
[root@gluster-node-02 ~]# gluster peer status
Number of Peers: 2
Hostname: gluster-node-01
Uuid: 6f5fe502-20c6-407d-b337-a349770b4374
State: Peer in Cluster (Connected)

Hostname: gluster-node-03
Uuid: a02576c8-c68b-4bea-8725-ec492de3f40c
State: Peer in Cluster (Connected)
###################################################
[root@gluster-node-03 ~]# gluster peer status
Number of Peers: 2

Hostname: gluster-node-01
Uuid: 6f5fe502-20c6-407d-b337-a349770b4374
State: Peer in Cluster (Connected)

Hostname: gluster-node-02
Uuid: 50ccce86-2528-443b-bc8e-dee243f4ed16
State: Peer in Cluster (Connected)

4.2 Create the replicated volume

# 1. Create the replicated volume from node 1 (the brick directory name must match the paths used in the volume create command)
[root@gluster-node-01 ~]# mkdir -p /data/glusterfs/repvol
[root@gluster-node-01 ~]# gluster volume create repvol replica 3 gluster-node-01:/data/glusterfs/repvol gluster-node-02:/data/glusterfs/repvol gluster-node-03:/data/glusterfs/repvol
Output like the following means the command succeeded:
volume create: repvol: success: please start the volume to access data

# 2. Start the replicated volume
[root@gluster-node-01 ~]# gluster volume start repvol
Output like the following means the volume started successfully:
volume start: repvol: success

# 3. Check the status of the replicated volume
[root@gluster-node-01 ~]# gluster volume status
Status of volume: repvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gluster-node-01:/data/glusterfs/repvo
l                                           49152     0          Y       27968
Brick gluster-node-02:/data/glusterfs/repvo
l                                           49152     0          Y       18776
Brick gluster-node-03:/data/glusterfs/repvo
l                                           49152     0          Y       20730
Self-heal Daemon on localhost               N/A       N/A        Y       27985
Self-heal Daemon on gluster-node-02         N/A       N/A        Y       18793
Self-heal Daemon on gluster-node-03         N/A       N/A        Y       20747
 
Task Status of Volume repvol

# 4. Show the replicated volume information
[root@gluster-node-01 ~]# gluster volume info
 
Volume Name: repvol
Type: Replicate
Volume ID: f49fafa4-4a00-4793-94fe-b29627659a58
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: gluster-node-01:/data/glusterfs/repvol
Brick2: gluster-node-02:/data/glusterfs/repvol
Brick3: gluster-node-03:/data/glusterfs/repvol
Options Reconfigured:
cluster.granular-entry-heal: on
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
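Since the main reason for using replica 3 is avoiding split-brain, it is also worth knowing the heal-status command at this point; it lists any files pending heal or in split-brain (a standard gluster CLI command, not part of the original steps):
gluster volume heal repvol info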

4.3 Install the GlusterFS client software on the client machine

# 1. Install the GlusterFS client software on the client
# To keep the client and server versions compatible and consistent, first install the same yum repository as on the servers, then install the client package
yum install centos-release-gluster
yum install glusterfs-fuse

# 2. Add the GlusterFS node entries on the client
Edit /etc/hosts and add the GlusterFS node entries:
192.168.1.138 gluster-node-01
192.168.1.140 gluster-node-02
192.168.1.141 gluster-node-03

# 3. Add the client entry on the GlusterFS nodes
On each node, add the client's IP and hostname to /etc/hosts, for example:
192.168.1.170 gluster-client

# 4. Trust the client subnet on the nodes (run on all 3 nodes; skip if the firewall is disabled)
firewall-cmd --permanent --add-rich-rule "rule family=ipv4 source address=192.168.1.0/24 accept"
firewall-cmd --reload

# 5. Create the mount point on the client
mkdir -p /data/glusterfs/repvol

# 6. Mount the replicated volume on the client
mount -t glusterfs gluster-node-01:repvol /data/glusterfs/repvol

# 7. Create a 100M file on the client to exercise the replicated volume
time dd if=/dev/zero of=/data/glusterfs/repvol/hello bs=100M count=1

# 8. Check that the hello file has been replicated to every node (verify on all 3 nodes)
ls -lh /data/glusterfs/repvol
Output like the following means the replicated volume is working correctly:
total 100M
-rw-r--r--. 2 root root 100M Jun 21 14:40 hello
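For a stronger check than ls, compare checksums of the file on the client mount and on each node's brick; all of them should match (a quick verification sketch, not part of the original write-up):
md5sum /data/glusterfs/repvol/hello    # run on the client and on each of the 3 nodes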

# 9. Tuning options
gluster volume set repvol write-behind on
gluster volume set repvol io-thread-count 2
# Note: the second command sets the number of I/O threads to improve read/write throughput; choose a value based on the CPU thread count and do not exceed it.
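To see how many CPU threads the node has before choosing the value (standard commands, not part of the original steps):
nproc
lscpu | grep '^CPU(s):'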

4.4 Install Keepalived

Note: with the steps above, GlusterFS already provides a reasonably reliable replicated-volume storage service, but with 3 servers and 3 IPs, whichever node IP the client uses for the mount, the storage becomes unavailable if that particular server or its services stop. We therefore add Keepalived, which exposes a single VIP as the mount address. Keepalived's service monitoring and dynamic VIP migration, driven by a vrrp_script that checks the GlusterFS services, move the VIP to a healthy node automatically and give us a reliable GlusterFS storage service.

# 1. Install on all three nodes
yum install keepalived -y

# 2. Edit the keepalived configuration file
# Master node (192.168.1.138)
[root@gluster-node-01 keepalived]# vim keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
     root@localhost.loc
   }
   notification_email_from root@localhost.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   script_user root
   enable_script_security
   router_id GFS_HA_MASTER
   vrrp_skip_check_adv_addr
   #vrrp_strict
   vrrp_iptables
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}

vrrp_sync_group GFS_HA_GROUP {
   group {
        GFS_HA_1
   }
}

vrrp_script monitor_glusterfs_status {
   script "/usr/libexec/keepalived/monitor_glusterfs_status.sh"
   interval 5
   fall 2
   rise 1
   weight -25
}

vrrp_instance GFS_HA_1 {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    unicast_src_ip 192.168.1.138
    unicast_peer {
        192.168.1.140
        192.168.1.141
    }
    virtual_ipaddress {
        192.168.1.142/24 dev ens33 label ens33:0
    }
    track_script {
        monitor_glusterfs_status
    }
    track_interface {
        ens33
    }
    notify_master "/usr/libexec/keepalived/keepalived_notify.sh master"
    notify_backup "/usr/libexec/keepalived/keepalived_notify.sh backup"
    notify_fault "/usr/libexec/keepalived/keepalived_notify.sh fault"
    notify_stop "/usr/libexec/keepalived/keepalived_notify.sh stop"
}

# 3. Copy the script files to the other nodes (the script contents are shown further below)
cd /usr/libexec/keepalived
scp keepalived_notify.sh monitor_glusterfs_status.sh 192.168.1.140:/usr/libexec/keepalived/
scp keepalived_notify.sh monitor_glusterfs_status.sh 192.168.1.141:/usr/libexec/keepalived/
The commands above copy the scripts from 192.168.1.138 to 192.168.1.140 and 192.168.1.141.

# 4. Copy the keepalived configuration file and adjust it
scp /etc/keepalived/keepalived.conf 192.168.1.140:/etc/keepalived/
scp /etc/keepalived/keepalived.conf 192.168.1.141:/etc/keepalived/
On 192.168.1.140 and 192.168.1.141, edit keepalived.conf: change MASTER to BACKUP, set priority to 90 and 85 respectively, set unicast_src_ip to the local node's IP, and list the other two nodes' IPs in unicast_peer.
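For example, on 192.168.1.140 the lines that differ from the master's configuration would look roughly like this (a sketch based on the values above; 192.168.1.141 is analogous, with priority 85 and its own IP as unicast_src_ip):
    state BACKUP
    priority 90
    unicast_src_ip 192.168.1.140
    unicast_peer {
        192.168.1.138
        192.168.1.141
    }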

The key directives are explained one by one below:
script_user root
enable_script_security
These two lines set the user that runs keepalived scripts and enable script security; if the service does not run as root, configure the appropriate user and make sure it is allowed to execute the scripts.
vrrp_script monitor_glusterfs_status
Declares the vrrp_script, i.e. the service keepalived monitors through a script; here that is the GlusterFS service. The path after script is where the monitoring script is stored.
fall 2 means the service is considered failed after 2 consecutive failed checks.
rise 1 means the service is considered healthy again after 1 successful check.
weight is the priority adjustment: when the check script returns 0, the priority stays unchanged; when it returns non-zero, the node's priority is reduced by 25 (weight -25). For example, if the check fails on the master, its effective priority drops from 100 to 75, below the backups' 90 and 85, so the VIP moves away.


The scripts must be stored under /usr/libexec/keepalived; otherwise many commands inside them are not recognized and the script exits with status 127.
ens33 is the interface that VRRP binds to, normally the interface that carries the virtual IP.
unicast_src_ip is the local heartbeat IP when heartbeats are sent as unicast; unicast_peer lists the peers' heartbeat IPs.
virtual_ipaddress configures the VIP: the virtual IP with its netmask and the interface (and label) it is bound to.
track_script names the script keepalived tracks, i.e. the name declared after vrrp_script.
The four notify entries at the end are the scripts executed when this node transitions into the corresponding state.
The configuration references two script files: the notify script and the service monitoring script, shown below.

[root@gluster-node-01 keepalived]# vim /usr/libexec/keepalived/keepalived_notify.sh 
#!/bin/bash
#keepalived script for glusterd
master() {
        logger -is "this server becomes master, now check glusterd status..."
        systemctl status glusterd >/dev/null 2>&1
        status=$?    # save the exit code before logger overwrites $?
        logger -is "check glusterd status result: ${status}"
        if [ "${status}" != "0" ];then
                logger -is "glusterd service is not running, now start it..."
                systemctl start glusterd >/dev/null 2>&1
                logger -is "start glusterd service result: $?"
        else
                logger -is "glusterd service is running"
        fi
}

backup() {
        logger -is "this server becomes backup, now check glusterd status..."
        systemctl status glusterd >/dev/null 2>&1
        status=$?    # save the exit code before logger overwrites $?
        logger -is "check glusterd status result: ${status}"
        if [ "${status}" != "0" ];then
                logger -is "glusterd service is not running, now start it..."
                systemctl start glusterd >/dev/null 2>&1
                logger -is "start glusterd service result: $?"
        fi
}

case $1 in
        master)
                master
        ;;
        backup)
                backup
        ;;
        fault)
                backup
        ;;
        stop)
                backup
                #systemctl restart keepalived
        ;;
        *)
                echo $"Usage: $0 {master|backup|fault|stop}"
esac
################################################################################################
[root@gluster-node-01 keepalived]# vim /usr/libexec/keepalived/monitor_glusterfs_status.sh 
#!/bin/bash
#check glusterfsd and glusterd process
pidof glusterd
if [ $? -eq 0 ]; then
        pidof glusterfsd
        if [ $? -eq 0 ]; then
                STATUS=0
        else
                logger -is "glusterfsd service is not running, now try to restart it..."
                systemctl start glusterfsd >/dev/null 2>&1
                pidof glusterfsd
                if [ $? -eq 0 ]; then
                        STATUS=0
                else
                        logger -is "failed to restart glusterfsd service"
                        STATUS=1
                fi
        fi
else
        logger -is "glusterd service is not running, so try to restart it"
        systemctl start glusterd >/dev/null 2>&1
        pidof glusterd
        if [ $? -eq 0 ]; then
                pidof glusterfsd
                if [ $? -eq 0 ]; then
                        STATUS=0
                else
                        logger -is "glusterfsd service is not running, now try to restart it..."
                        systemctl start glusterfsd >/dev/null 2>&1
                        pidof glusterfsd
                        if [ $? -eq 0 ]; then
                                STATUS=0
                        else
                                logger -is "failed to restart glusterfsd service"
                                STATUS=1
                        fi
                fi
        else
                logger -is "failed to restart glusterd service"
                #pkill keepalived
                STATUS=1
        fi
fi
exit $STATUS
################################################################################################
# 5. Start the keepalived service (all three nodes)
systemctl start keepalived
systemctl enable keepalived
systemctl status keepalived
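To follow the VRRP state transitions and the output of the notify and monitor scripts (they log through logger, which goes to syslog), one option is:
journalctl -u keepalived -f
tail -f /var/log/messages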

# 6. Check the VIP binding
# Because node 138 is configured as MASTER with the highest priority, the VIP is bound to node 138 while all services are healthy.
[root@gluster-node-01 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:72:10:71 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.138/24 brd 192.168.1.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet 192.168.1.142/24 scope global secondary ens33:0
       valid_lft forever preferred_lft forever
    inet6 fe80::95b4:2a39:bac0:53c7/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

4.5 High-availability test

# 1. Remount using the VIP address (unmount the old node-IP mount first, then mount through the VIP)
umount /data/glusterfs/repvol
mount -t glusterfs 192.168.1.142:repvol /data/glusterfs/repvol
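To confirm that the client is now mounted through the VIP (a quick check, not part of the original steps):
mount | grep repvol
df -hT /data/glusterfs/repvol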

# 2. Verify VIP failover
[root@gluster-node-01 ~]# systemctl stop keepalived
Running the command above on node 138, you can watch the VIP migrate from this node to the node with the next-highest priority.
# On node 138, the VIP is no longer present
[root@gluster-node-01 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:72:10:71 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.138/24 brd 192.168.1.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::95b4:2a39:bac0:53c7/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
# On node 140, the VIP is now bound
[root@gluster-node-02 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:22:e7:20 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.140/24 brd 192.168.1.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet 192.168.1.142/24 scope global secondary ens33:0
       valid_lft forever preferred_lft forever
    inet6 fe80::60cc:310d:36d9:946f/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

# 3. Verify that GlusterFS is still usable after the VIP has moved
# Upload or create a new file from the client, wait a short while, then check on nodes 140 and 141 whether the same file has appeared; a sketch follows
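A concrete way to run this check (a small sketch of the verification described above):
# on the client:
echo "failover-test" > /data/glusterfs/repvol/failover-test.txt
# on node 140 and node 141:
ls -l /data/glusterfs/repvol/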

With this, GlusterFS + Keepalived provides a highly available storage service across the three servers.

