Distributed Storage: GlusterFS

NewRain001

Article link: https://blog.csdn.net/NewRain_wang/article/details/111352436

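Distributed Storage

GlusterFS Distributed File System

Introduction to GlusterFS

PB-scale capacity  High availability  Read/write performance  File-system-level sharing  Distributed

    GlusterFS (GNU Cluster File System) is a fully symmetric, open-source distributed file system. "Fully symmetric" means that GlusterFS uses an elastic hashing algorithm instead of a central node, so all nodes are equal. GlusterFS is easy to configure, stable, and can scale to PB-level capacity and thousands of nodes.
    Gluster was acquired by Red Hat in 2011, which then released Red Hat Storage Server, based on GlusterFS, with many features added for KVM. It can serve as a storage cluster for KVM images, and can also provide storage for LB (load balancing) or HA (high availability) setups.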


Key GlusterFS features:
Fully symmetric architecture
Multiple volume types (similar to RAID 0/1/5/10/01)
Volume-level compression
NFS support
SMB support
Hadoop support
OpenStack support

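Key GlusterFS concepts:
Brick:              The basic unit of GlusterFS, exposed as a directory on a node server.
Volume:             A logical collection of bricks.
Metadata:           Data that describes files, directories, and so on.
Self-heal:          A background process that detects inconsistencies between files and directories in replicated volumes and repairs them.
GlusterFS Server:   A data storage server, that is, a node in the GlusterFS storage cluster.
GlusterFS Client:   A server that uses GlusterFS storage, for example KVM, OpenStack, an LB real server, or an HA node.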

Preparing the environment

4 virtual machines (more nodes are possible)
OS		 IP					Hostname	
CentOS 7.4		192.168.62.203	   node1
CentOS 7.4		192.168.62.204	   node2
CentOS 7.4		192.168.62.135     node3
CentOS 7.4		192.168.62.166     node4

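Stop the firewall and set SELinux to permissive mode on all machines:
# systemctl stop firewalld && setenforce 0
Set the hostname, running one command on each machine respectively:
[root@192 ~]# hostnamectl set-hostname node1
[root@192 ~]# hostnamectl set-hostname node2
[root@192 ~]# hostnamectl set-hostname node3
[root@192 ~]# hostnamectl set-hostname node4
Configure name resolution on all machines. The entries must use your machines' real addresses and cover every hostname used in later commands (both the node1 and the node01 spellings appear); the sample output below was captured with different IPs than the table above: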
[root@192 yum.repos.d]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.62.131 node01
192.168.62.231 node02
192.168.62.168 node03
192.168.62.166 node04
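
Since the same entries have to be added on every machine, a small helper can make the step repeatable. The following is a minimal sketch, not part of the original setup: the function name add_host_entry is made up for illustration, and the addresses in the usage comment are taken from the table above on the assumption that they are the real ones.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME
# Append "IP NAME" to FILE unless NAME already appears as a hostname
# (any whitespace-separated field after the address column).
add_host_entry() {
    file=$1 ip=$2 name=$3
    if awk -v n="$name" '
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$file"; then
        return 0    # already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Usage (as root, on every node):
#   add_host_entry /etc/hosts 192.168.62.203 node1
#   add_host_entry /etc/hosts 192.168.62.204 node2
#   add_host_entry /etc/hosts 192.168.62.135 node3
#   add_host_entry /etc/hosts 192.168.62.166 node4
```

Running it twice leaves the file unchanged, so it is safe to rerun after adding a node.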

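Install the GlusterFS service (all hosts)
[root@node1 ~]# yum install centos-release-gluster glusterfs-server samba rpcbind -y
This command has to be run twice, presumably because the first run only installs the centos-release-gluster repository package, and glusterfs-server can be resolved from that repository on the second run.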

Check the time (the clocks on all nodes should be in sync):
yum install -y ntpdate
ntpdate ntp6.aliyun.com

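If the download fails, edit the GlusterFS yum repository configuration file.

In an intranet environment, you need to obtain these rpm packages and install them locally: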
[root@node1 ~]# ls glusterfs/
glusterfs-3.10.3-1.el7.x86_64.rpm            glusterfs-client-xlators-3.10.3-1.el7.x86_64.rpm  
glusterfs-server-3.10.3-1.el7.x86_64.rpm  glusterfs-api-3.10.3-1.el7.x86_64.rpm  
glusterfs-fuse-3.10.3-1.el7.x86_64.rpm     userspace-rcu-0.7.16-3.el7.x86_64.rpm
glusterfs-cli-3.10.3-1.el7.x86_64.rpm        glusterfs-libs-3.10.3-1.el7.x86_64.rpm

Start the service on all nodes and enable it at boot
[root@node1 ~]# systemctl start glusterd.service
[root@node1 ~]# systemctl enable glusterd.service
[root@node1 ~]# glusterfs -V
glusterfs 7.3

Creating the GlusterFS cluster
Adding nodes is how the cluster is created. Run the commands on node1 only; the local node does not need to be added.
[root@node1 ~]# gluster peer probe node2
peer probe: success. 
[root@node1 ~]# gluster peer probe node3
peer probe: success. 
[root@node1 ~]# gluster peer probe node4
peer probe: success. 
[root@node1 ~]# gluster peer status
Number of Peers: 3

Hostname: node2
Uuid: 1a99d57f-9575-4ba0-9cc9-9c2d3b4b4e3f
State: Peer in Cluster (Connected)

Hostname: node3
Uuid: 06f43aee-3edb-4cc5-b886-fee2e52c8b7f
State: Peer in Cluster (Connected)

Hostname: node4
Uuid: 64556459-b332-463f-9e2c-4067650544e0
State: Peer in Cluster (Connected)
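
When scripting the cluster setup, it helps to wait until every probed peer is actually connected before creating volumes. Here is a rough sketch that parses the peer status output shown above. The function names are hypothetical, and it assumes the "State: Peer in Cluster (Connected)" line format stays as in this transcript.

```shell
#!/bin/sh
# count_connected_peers: read `gluster peer status` output on stdin and
# print how many peers are in the "Peer in Cluster (Connected)" state.
count_connected_peers() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -cF 'State: Peer in Cluster (Connected)' || true
}

# wait_for_peers EXPECTED [TRIES]: poll every 2 seconds until at least
# EXPECTED peers are connected; fail after TRIES attempts (default 10).
wait_for_peers() {
    expected=$1 tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        if [ "$(gluster peer status | count_connected_peers)" -ge "$expected" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 2
    done
    return 1
}

# Usage on node1, after probing node2..node4:
#   wait_for_peers 3 && echo "all peers connected"
```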

Removing a node from the cluster
[root@node1 ~]# gluster peer detach node4
All clients mounted through the peer which is getting detached need to be remounted using one of the other active peers in the trusted storage pool to ensure client gets notification on any changes done on the gluster configuration and if the same has been done do you want to proceed? (y/n) y
peer detach: success
[root@node1 ~]# gluster peer status
Number of Peers: 2

Hostname: node2
Uuid: 1a99d57f-9575-4ba0-9cc9-9c2d3b4b4e3f
State: Peer in Cluster (Connected)

Hostname: node3
Uuid: 06f43aee-3edb-4cc5-b886-fee2e52c8b7f
State: Peer in Cluster (Connected)
[root@node1 ~]# gluster peer probe node4  # add it back again
peer probe: success.

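GlusterFS volume types
Basic types: striped, replicated, and distributed (hash). Combining them in pairs, plus using all three together, gives 7 types in total; newer versions also have dispersed volumes.

Distributed volume

A distributed volume is also called a hash volume: files are placed across multiple bricks according to a hash algorithm.
A hash volume resembles load balancing (though in practice the balance is not perfect). It spreads the set of files across the bricks; each individual file is stored whole on a single brick.
Use case: large numbers of small files
Pros: good read/write performance
Cons: if a disk or server fails, the data on it is lost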
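
To get a feel for hash-based placement, here is a toy illustration. It is not GlusterFS's actual elastic hash: the real implementation uses its own hash function and assigns hash ranges to bricks, while this sketch just takes a checksum of the file name modulo the brick count. The function name pick_brick is made up.

```shell
#!/bin/sh
# pick_brick NAME BRICK_COUNT
# Print a brick index in 0..BRICK_COUNT-1 for the given file name.
# Toy stand-in for the elastic hash: CRC of the name modulo brick count.
pick_brick() {
    sum=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    echo $((sum % $2))
}

# Usage: see where three file names would land on a 4-brick volume.
#   for f in fenbu.txt fenbu2.txt fenbu3.txt; do
#       echo "$f -> brick $(pick_brick "$f" 4)"
#   done
```

The point to take away is that placement depends only on the file name, so the same name always maps to the same brick, and nothing guarantees that a handful of files spread evenly.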

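Creating the data directory
On every server node, create the /data0/gluster directory; this is where the bricks live and where data is stored.
# mkdir -p /data0/gluster

Create the volume, working on the control node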
[root@node1 yum.repos.d]# gluster
Welcome to gluster prompt, type 'help' to see the available commands.
gluster> volume create datavol1 transport tcp node1:/data0/gluster/data1 node2:/data0/gluster/data1 node3:/data0/gluster/data1 node4:/data0/gluster/data1 force

volume create: datavol1: success: please start the volume to access data
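
The brick arguments follow a node:path pattern, so with more nodes it may be easier to generate them. A small sketch, with a hypothetical helper name brick_list; the force flag is copied from the command above (my guess is that it is needed here because the bricks sit on the root partition).

```shell
#!/bin/sh
# brick_list PATH NODE...
# Print "NODE:PATH" for every node, space-separated, on one line.
brick_list() {
    path=$1
    shift
    out=
    for node in "$@"; do
        out="${out:+$out }$node:$path"
    done
    printf '%s\n' "$out"
}

# Usage (the unquoted $(...) is intentional so each brick becomes its own argument):
#   gluster volume create datavol1 transport tcp \
#       $(brick_list /data0/gluster/data1 node1 node2 node3 node4) force
```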

Start the volume
Distributed (hash) is the default type, so no volume type was specified. The volume datavol1 has 4 bricks, spread across the 4 peer nodes.
gluster> volume start datavol1
volume start: datavol1: success

View the volume information
gluster> volume info datavol1
 
Volume Name: datavol1
Type: Distribute
Volume ID: c01c8a3e-5554-4d09-8e1c-c6531f4a6e1f
Status: Started
Snapshot Count: 0
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: node1:/data0/gluster/data1
Brick2: node2:/data0/gluster/data1
Brick3: node3:/data0/gluster/data1
Brick4: node4:/data0/gluster/data1
Options Reconfigured:
transport.address-family: inet
storage.fips-mode-rchecksum: on
nfs.disable: on

View the volume status
gluster> volume status datavol1
Status of volume: datavol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick node1:/data0/gluster/data1            49152     0          Y       1858 
Brick node2:/data0/gluster/data1            49152     0          Y       1784 
Brick node3:/data0/gluster/data1            49152     0          Y       1781 
Brick node4:/data0/gluster/data1            49152     0          Y       1781 
 
Task Status of Volume datavol1
------------------------------------------------------------------------------
There are no active volume tasks
===================================================================================
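Deleting a volume (for reference; skip this step if you want to continue with the mount test below)
The volume must be stopped first.
gluster> volume stop datavol1  # stop
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: datavol1: success
gluster> volume delete datavol1  # delete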
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: datavol1: success

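Use a virtual machine as the client and mount the volume
## A standalone client needs the client package installed first:
## yum -y install glusterfs-fuse
[root@node4 ~]# mount -t glusterfs node01:/datavol1 /mnt
[root@node4 ~]# touch  /mnt/fenbu.txt      # placed on whichever node the file name hashes to
================================================================
Check each node; the file may land on any of them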
[root@node1 ~]# ls /data0/gluster/data1/
fenbu.txt
[root@node4 ~]# touch /mnt/fenbu2.txt
[root@node4 ~]# ls /data0/gluster/data1/
fenbu2.txt

[root@node4 ~]# touch /mnt/fenbu3.txt  # create one more
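
The manual mount above does not survive a reboot. If a persistent mount is wanted, an /etc/fstab entry is the usual route. The sketch below only prints such a line; the helper name fstab_line is made up, and the choice of the _netdev and backup-volfile-servers mount options is my assumption about what suits this setup, so check the options against your version before relying on them.

```shell
#!/bin/sh
# fstab_line SERVER VOLUME MOUNTPOINT [BACKUP_SERVERS]
# Print an /etc/fstab line for a GlusterFS FUSE mount.
# BACKUP_SERVERS is an optional colon-separated list, e.g. "node2:node3".
fstab_line() {
    opts="defaults,_netdev"
    if [ -n "$4" ]; then
        opts="$opts,backup-volfile-servers=$4"
    fi
    printf '%s:/%s %s glusterfs %s 0 0\n' "$1" "$2" "$3" "$opts"
}

# Usage (as root, on the client):
#   fstab_line node1 datavol1 /mnt node2:node3 >> /etc/fstab
#   mount -a
```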

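The volume status output shown earlier is the volume's state information. After a volume is started, gluster automatically starts the related processes on each node; the Port column shows the ports they listen on.
At that point, ps shows 3 processes:
    glusterd      # management daemon
    glusterfsd    # brick process; only one, because this machine has a single brick
    glusterfs     # the NFS protocol process; it can be turned off (the volume info above shows nfs.disable: on)
The other nodes run the same processes.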

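Expanding and shrinking a volume

Shrinking a volume

Note: when a brick is removed with start, its data is migrated to the remaining bricks before the removal is committed.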

[root@node1 ~]# ls /data0/gluster/data1/
fenbu.txt
[root@node4 ~]# ls /data0/gluster/data1/
fenbu2.txt  fenbu3.txt
[root@node1 glusterfs]# gluster
gluster> volume remove-brick datavol1 node4:/data0/gluster/data1 start # start the migration
It is recommended that remove-brick be run with cluster.force-migration option disabled to prevent possible data corruption. Doing so will ensure that files that receive writes during migration will not be migrated and will need to be manually copied after the remove-brick commit operation. Please check the value of the option and update accordingly. 
Do you want to continue with your current cluster.force-migration settings? (y/n) y
volume remove-brick start: success
ID: bf71f743-7d85-4366-b084-e98ecf6d09ce
gluster> volume remove-brick datavol1 node4:/data0/gluster/data1 status	   # check the migration status; wait until it reports completed
gluster> volume remove-brick datavol1 node4:/data0/gluster/data1 commit    # commit
volume remove-brick commit: success
gluster> volume info  datavol1	   # check again; node4 is no longer listed

The data is migrated automatically to bricks on the other nodes

[root@node3 ~]# ls /data0/gluster/data1/    # the files moved to node3
fenbu2.txt  fenbu3.txt
[root@node4 ~]# ls /data0/gluster/data1/    # node4 is now empty
[root@node4 ~]#
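
Committing before the migration has finished risks losing files that are still on the removed brick, so a script should check the status first. A minimal sketch follows; the function name migration_done is hypothetical, and it assumes the status table uses the words "completed", "in progress" and "failed" in its status column, which is how I remember the output but is not shown in this article.

```shell
#!/bin/sh
# migration_done: read `gluster volume remove-brick ... status` output on
# stdin. Succeeds only if at least one node reports "completed" and no
# node reports "in progress" or "failed".
migration_done() {
    awk '
        /in progress|failed/ { bad = 1 }
        /completed/          { ok = 1 }
        END { exit !(ok && !bad) }
    '
}

# Usage (polls forever if the migration fails, so add a timeout in real use):
#   brick=node4:/data0/gluster/data1
#   until gluster volume remove-brick datavol1 "$brick" status | migration_done; do
#       sleep 5
#   done
#   gluster --mode=script volume remove-brick datavol1 "$brick" commit
```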

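Expanding a volume

gluster> volume add-brick datavol1 node4:/data0/gluster/data1 force # expand; existing data is not moved onto the new brick until the volume is rebalanced
volume add-brick: success
gluster> volume info datavol1	    # check the volume info again; node4 is listed

Rebalancing a volume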

gluster> volume rebalance datavol1 start
volume rebalance: datavol1: success: Rebalance on datavol1 has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: dfcd2f01-59c8-425b-8fe2-c8ad1fed7c13
gluster> volume rebalance datavol1 status

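Replicated volume

Files are replicated across multiple bricks. The number of bricks must equal the number of copies, and the bricks should be placed on different servers.
For replicated and striped volumes the volume type must be specified. In a replicated volume every brick holds the same data, a complete copy of everything written, which is equivalent to RAID 1,
so with two copies the usable capacity is halved, and there is some performance cost as well.

Use case:   scenarios that demand high reliability and read performance
Pros:       good read performance
Cons:       poor write performance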
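
The capacity cost is easy to work out: usable space is the total brick space divided by the replica count. A quick sketch, assuming equally sized bricks; the function name and the 100 GB brick size in the comments are made up for illustration.

```shell
#!/bin/sh
# usable_capacity BRICKS BRICK_SIZE_GB REPLICA
# Print the usable capacity in GB, assuming all bricks have the same size.
usable_capacity() {
    bricks=$1 size_gb=$2 replica=$3
    if [ $((bricks % replica)) -ne 0 ]; then
        echo "brick count must be a multiple of the replica count" >&2
        return 1
    fi
    echo $((bricks * size_gb / replica))
}

# usable_capacity 2 100 2   -> 100  (2 bricks, replica 2: half of the 200 GB raw)
# usable_capacity 4 100 2   -> 200  (two replica pairs, 400 GB raw)
# usable_capacity 4 100 1   -> 400  (plain distributed volume, no copies)
```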

[root@node4 ~]# gluster
Welcome to gluster prompt, type 'help' to see the available commands.

Create the replicated volume
gluster> volume create datavol2 replica 2 transport tcp node1:/data0/gluster/data2 node2:/data0/gluster/data2 force
volume create: datavol2: success: please start the volume to access data

Start the volume
gluster> volume start datavol2
volume start: datavol2: success

View the volume status
gluster> volume status
Status of volume: datavol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick node1:/data0/gluster/data1            49152     0          Y       1995 
Brick node2:/data0/gluster/data1            49152     0          Y       1913 
Brick node3:/data0/gluster/data1            49152     0          Y       1908 
Brick node4:/data0/gluster/data1            49152     0          Y       2263 
 
Task Status of Volume datavol1
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : 2529bbe2-0c50-4401-abed-5c60ffc19895
Status               : completed           
 
Status of volume: datavol2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick node1:/data0/gluster/data2            49153     0          Y       11961
Brick node2:/data0/gluster/data2            49153     0          Y       11858
Self-heal Daemon on localhost               N/A       N/A        Y       12031
Self-heal Daemon on node3                   N/A       N/A        Y       11857
Self-heal Daemon on node2                   N/A       N/A        Y       11879
Self-heal Daemon on node1                   N/A       N/A        Y       11982
 
Task Status of Volume datavol2
------------------------------------------------------------------------------
There are no active volume tasks
 
View the volume information
gluster> volume info  datavol2
 
Volume Name: datavol2
Type: Replicate
Volume ID: cb872ec7-6ce0-4635-bea2-d2e8d08dfac5
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: node1:/data0/gluster/data2
Brick2: node2:/data0/gluster/data2
Options Reconfigured:
transport.address-family: inet
storage.fips-mode-rchecksum: on
nfs.disable: on
performance.client-io-threads: off

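The files are only visible through a mount
About the data: writes must go through a client mount in order to be replicated to every brick; files created directly inside a brick directory on a server are not synchronized to the other bricks.
[root@node3 ~]# mount -t glusterfs node1:/datavol2/ /mnt
[root@node3 ~]# ls /mnt/
[root@node3 ~]# cd /mnt/
[root@node3 mnt]# touch a.txt  # create test data

Verify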
[root@node1 ~]# ls /data0/gluster/data2/
a.txt

[root@node2 ~]# ls /data0/gluster/data2
a.txt
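
Listing the brick directories works for a single file, but for a real volume the self-heal status is a better check that the replicas agree. The sketch below sums the pending entries reported by gluster volume heal datavol2 info. The function name pending_heals is hypothetical, and it assumes each brick section contains a "Number of entries: N" line.

```shell
#!/bin/sh
# pending_heals: read `gluster volume heal VOLNAME info` output on stdin
# and print the total of all "Number of entries:" lines.
pending_heals() {
    awk -F': *' '/^Number of entries:/ { total += $2 }
                 END { print total + 0 }'
}

# Usage:
#   n=$(gluster volume heal datavol2 info | pending_heals)
#   [ "$n" -eq 0 ] && echo "replicas are in sync" || echo "$n entries pending heal"
```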

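Striped volume

Files are split into stripes stored across multiple bricks; the default stripe size is 128K. This is equivalent to RAID 0.
Striped volumes help somewhat with large files: a file is split into several parts that are stored on two stripes, that is, on two bricks. In practice they are rarely used.

Use case: large files
Pros: suited to storing large files
Cons: low reliability; a brick failure loses all the data

Creating a striped volume
gluster> volume create datavol3 stripe 2 node01:/data0/gluster/data3 node02:/data0/gluster/data3 force
stripe option not supported
The attempt fails: the GlusterFS version used here (7.3) no longer seems to support striped volumes.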

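Composite volumes

Composite volumes are distributed-replicated and distributed-striped volumes; these two are the common ones. Distributed striped replicated volumes, which mix all three types, are rarely used.
In the single-type volumes above, the replica or stripe count equals the number of bricks. When the number of bricks is a multiple of the replica or stripe count, the volume automatically becomes distributed-replicated or distributed-striped.

Distributed replicated volume

Files are hashed across multiple nodes and replicated across multiple bricks.

Use case:   scenarios with heavy file reads and high reliability requirements
Pros:       high reliability, high read performance
Cons:       sacrifices storage space, poor write performance

Here we use 4 bricks.
A distributed replicated volume is made of replica pairs, so each pair should be built from bricks on different nodes. That way the copies of a file live on different nodes, and whichever node goes down, the other still holds a complete copy.

Create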
[root@node3 ~]# gluster
Welcome to gluster prompt, type 'help' to see the available commands.
gluster> volume create data_rd replica 2 node1:/data0/gluster/data_rd_1 node2:/data0/gluster/data_rd_1 node1:/data0/gluster/data_rd_2 node2:/data0/gluster/data_rd_2 force
volume create: data_rd: success: please start the volume to access data
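
Bricks are grouped into replica sets in the order they are listed on the command line, so the order decides whether the two copies end up on different nodes. The following sketch checks a brick list before it is handed to volume create; the function name check_replica_sets is made up for illustration.

```shell
#!/bin/sh
# check_replica_sets REPLICA BRICK...
# Group the bricks, in order, into sets of REPLICA. Fail if any set
# contains two bricks on the same host, or if the brick count is not a
# multiple of REPLICA.
check_replica_sets() {
    replica=$1
    shift
    i=0 hosts=
    for brick in "$@"; do
        host=${brick%%:*}          # text before the first colon
        case " $hosts " in
            *" $host "*)
                echo "replica set repeats host $host" >&2
                return 1 ;;
        esac
        hosts="$hosts $host"
        i=$((i + 1))
        if [ $((i % replica)) -eq 0 ]; then
            hosts=                 # start the next replica set
        fi
    done
    [ $((i % replica)) -eq 0 ]
}

# The order used above passes: the pairs are (node1, node2) and (node1, node2).
#   check_replica_sets 2 node1:/data0/gluster/data_rd_1 node2:/data0/gluster/data_rd_1 \
#                        node1:/data0/gluster/data_rd_2 node2:/data0/gluster/data_rd_2
# Listing both node1 bricks first would fail, because the first pair
# would keep both copies on node1.
```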

View the volume information
gluster> volume info data_rd
 
Volume Name: data_rd
Type: Distributed-Replicate
Volume ID: 2d206595-edb9-4bbe-9db6-071c4229bea2
Status: Created
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4  # two replica sets of two bricks each; files are hashed across the two sets
Transport-type: tcp
Bricks:
Brick1: node1:/data0/gluster/data_rd_1
Brick2: node2:/data0/gluster/data_rd_1
Brick3: node1:/data0/gluster/data_rd_2
Brick4: node2:/data0/gluster/data_rd_2
Options Reconfigured:
transport.address-family: inet
storage.fips-mode-rchecksum: on
nfs.disable: on
performance.client-io-threads: off


Start the volume
gluster> volume start data_rd
volume start: data_rd: success

View the volume status
gluster> volume status data_rd
Status of volume: data_rd
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick node1:/data0/gluster/data_rd_1        49154     0          Y       12150
Brick node2:/data0/gluster/data_rd_1        49154     0          Y       12041
Brick node1:/data0/gluster/data_rd_2        49155     0          Y       12170
Brick node2:/data0/gluster/data_rd_2        49155     0          Y       12061
Self-heal Daemon on localhost               N/A       N/A        Y       11857
Self-heal Daemon on node4                   N/A       N/A        Y       12031
Self-heal Daemon on node1                   N/A       N/A        Y       11982
Self-heal Daemon on node2                   N/A       N/A        Y       11879
 
Task Status of Volume data_rd
------------------------------------------------------------------------------
There are no active volume tasks

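Test:
Use any virtual machine that can reach every node of the gluster cluster.
Here node3 acts as the client and mounts the volume
[root@node3 ~]# umount /mnt/   # first unmount the previously mounted volume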
[root@node3 ~]# mount -t glusterfs node1:/data_rd /mnt
[root@node3 ~]# ls /mnt/
[root@node3 ~]# touch /mnt/test.txt

Check whether node1 and node2 have test.txt
[root@node1 ~]# ls /data0/gluster/data_rd_1/  # the file is here
test.txt
[root@node1 ~]# ls /data0/gluster/data_rd_2/  # but this directory is empty
[root@node2 ~]# ls /data0/gluster/data_rd_1    # the file is here too
test.txt
[root@node2 ~]# ls /data0/gluster/data_rd_2    # this directory is empty as well

Create another file
[root@node3 ~]# touch /mnt/test2.txt
Check again
[root@node1 ~]# ls /data0/gluster/data_rd_1/
test.txt
[root@node1 ~]# ls /data0/gluster/data_rd_2/
test2.txt
[root@node2 ~]# ls /data0/gluster/data_rd_1/
test.txt
[root@node2 ~]# ls /data0/gluster/data_rd_2/
test2.txt
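This shows the distributed replicated volume at work: each file hashes to one replica pair and is stored on both bricks of that pair.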

Distributed striped volume

gluster> volume create fentiao stripe 2 node03:/data/gluster/data5 node04:/data/gluster/data5 force
stripe option not supported    # also no longer supported

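Summary of volume types

Distributed volume: files are spread across the GlusterFS machines, each file on one brick.
        Pros: fast reads
        Cons: if a brick fails, the files on it are lost
Replicated volume: every file is stored on every GlusterFS machine in the volume. Equivalent to RAID 1.
        Pros: files have multiple copies; if one brick fails, no files are lost, because the other machines' bricks hold copies
        Cons: consumes more storage
Striped volume: each file is split and its parts are stored across the GlusterFS machines. Equivalent to RAID 0.
        Pros: fast reads and writes for large files
        Cons: if one brick fails, the files are damaged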

Configuring high availability for GlusterFS

Deploy the high-availability service (keepalived) on node1 and node2

! Configuration File for keepalived

global_defs {
	router_id LVS_DEVEL
}

vrrp_script chk_gluster {
	script "/etc/keepalived/check_gluster.sh"
	interval 2
}

vrrp_instance VI_1 {
	state MASTER    # set to BACKUP on the standby server
	interface ens33
	virtual_router_id 51
	priority 100  # use a lower priority on the standby server
	advert_int 1

	authentication {
		auth_type PASS
		auth_pass 1111
	}

	track_script {
		chk_gluster
	}

	virtual_ipaddress {
		192.168.200.100
	}
}
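
The standby server needs the same file with state BACKUP and a lower priority. Rather than editing by hand, the standby variant can be derived from the master file. A small sketch, assuming the file lives at the usual /etc/keepalived/keepalived.conf; the function name to_backup_conf and the priority value 90 are my own choices.

```shell
#!/bin/sh
# to_backup_conf [PRIORITY]
# Read the MASTER keepalived configuration on stdin and print the standby
# variant: state BACKUP and the given priority (default 90).
to_backup_conf() {
    prio=${1:-90}
    sed -e 's/state MASTER/state BACKUP/' \
        -e "s/priority [0-9][0-9]*/priority $prio/"
}

# Usage on node1, then copy the result to node2:
#   to_backup_conf 90 < /etc/keepalived/keepalived.conf > /tmp/keepalived.backup.conf
#   scp /tmp/keepalived.backup.conf node2:/etc/keepalived/keepalived.conf
```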
-------------------------------------------
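# Prepare the check script
[root@serverE ~]#  vim /etc/keepalived/check_gluster.sh

#!/bin/bash
# If glusterd is not listening, try to start it; if it is still not
# listening, stop keepalived so the virtual IP moves to the other node.

num=$(netstat -lnupt | grep -c glusterd)
if [ "$num" -eq 0 ]; then
        systemctl start glusterd
        if [ "$(netstat -lnupt | grep -c glusterd)" -eq 0 ]; then
                systemctl stop keepalived
        fi
fi
# Make the script executable
chmod +x /etc/keepalived/check_gluster.sh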

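Troubleshooting:
/usr/sbin/keepalived: error while loading shared libraries: /lib64/libsensors.so.4: file too short

Copying this library file (/lib64/libsensors.so.4) over from another server where it is intact fixes the error.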
