1. Building a dual-node Lustre high-availability cluster:
1. Environment:
Hostname | OS | Targets (shared disks) | IP addresses | Lustre fsname | Memory |
---|---|---|---|---|---|
mds001 | CentOS 7.9 | 1 MGS, 2 MDT, 2 OST (shared disks) | 192.168.10.21 / 192.168.209.21 | global | 1G |
mds002 | CentOS 7.9 | 1 MGS, 2 MDT, 2 OST (shared disks) | 192.168.10.22 / 192.168.209.22 | global | 1G |
client | CentOS 7.9 | none | 192.168.10.41 | none | 1G |
mds001 and mds002 act as both MDS and OSS. Five shared disks are created (1 MGS, 2 MDT, 2 OST) to build a highly available Lustre service cluster.
2. Sharing disks between the two virtual machines:
- Prerequisite: neither virtual machine has any snapshots.
- On the mds001 VM, add five 5 GB disks:
  SCSI > Create a new virtual disk > specify the disk capacity, allocate all disk space now, store the virtual disk as a single file
- In the VM directories of both mds001 and mds002, find the file ending in .vmx and append the following to it:
```
scsi1.sharedBus = "virtual"
disk.locking = "false"
diskLib.dataCacheMaxSize = "0"
diskLib.dataCacheMaxReadAheadSize = "0"
diskLib.dataCacheMinReadAheadSize = "0"
diskLib.dataCachePageSize = "4096"
diskLib.maxUnsyncedWrites = "0"
disk.EnableUUID = "TRUE"
```
- Reboot both virtual machines; the disks have been added successfully:
```
[root@mds001 ~]# lsblk
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0   20G  0 disk
├─sda1            8:1    0    1G  0 part /boot
└─sda2            8:2    0   19G  0 part
  ├─centos-root 253:0    0   17G  0 lvm  /
  └─centos-swap 253:1    0    2G  0 lvm  [SWAP]
sdb               8:16   0    5G  0 disk
sdc               8:32   0    5G  0 disk
sdd               8:48   0    5G  0 disk
sde               8:64   0    5G  0 disk
sdf               8:80   0    5G  0 disk
sr0              11:0    1  9.5G  0 rom  /mnt/cdrom
```
3. Installing and configuring Lustre:
- Both hosts need to download the packages required by the OSS server. The e2fsprogs packages simply add Lustre support on top of the stock Ext4 RPMs:
```
mkdir ~/e2fsprogs && cd ~/e2fsprogs
wget -c -r -nd https://downloads.whamcloud.com/public/e2fsprogs/1.44.5.wc1/el7/RPMS/x86_64/
rm -rf index.html* unknown.gif *.gif sha256sum
```
- Install all of the RPMs:
```
[root@mds001 e2fsprogs]# cd ~/e2fsprogs && rpm -Uvh *
Preparing...                          ################################# [100%]
Updating / installing...
   1:libcom_err-1.42.12.wc1-4.el7.cent################################# [  8%]
   2:e2fsprogs-libs-1.42.12.wc1-4.el7.################################# [ 15%]
   3:libcom_err-devel-1.42.12.wc1-4.el################################# [ 23%]
   4:libss-1.42.12.wc1-4.el7.centos   ################################# [ 31%]
   5:e2fsprogs-1.42.12.wc1-4.el7.cento################################# [ 38%]
   6:libss-devel-1.42.12.wc1-4.el7.cen################################# [ 46%]
   7:e2fsprogs-devel-1.42.12.wc1-4.el7################################# [ 54%]
   8:e2fsprogs-static-1.42.12.wc1-4.el################################# [ 62%]
   9:e2fsprogs-debuginfo-1.42.12.wc1-4################################# [ 69%]
Cleaning up / removing...
  10:e2fsprogs-1.42.9-19.el7          ################################# [ 77%]
  11:e2fsprogs-libs-1.42.9-19.el7     ################################# [ 85%]
  12:libss-1.42.9-19.el7              ################################# [ 92%]
  13:libcom_err-1.42.9-19.el7         ################################# [100%]
```
- Both hosts also need to download the packages required by the MDS server:

wget option | Description |
---|---|
-c | resume interrupted downloads |
-r | download recursively |
-nd | no directory hierarchy; all files go into the current directory |

RPM package | Description |
---|---|
kernel-*.el7_lustre.x86_64.rpm | Linux kernel with the Lustre patches |
kmod-lustre-*.el7.x86_64.rpm | Lustre patched kernel modules |
kmod-lustre-osd-ldiskfs-*.el7.x86_64.rpm | ldiskfs-based Lustre backend filesystem tools |
lustre-*.el7.x86_64.rpm | Lustre command-line tools |
lustre-osd-ldiskfs-mount-*.el7.x86_64.rpm | mount.lustre and mkfs.lustre helpers for the ldiskfs backend |

```
mkdir ~/lustre2.12.1 && cd ~/lustre2.12.1
yum install -y wget
wget \
https://downloads.whamcloud.com/public/lustre/lustre-2.12.1/el7/server/RPMS/x86_64/kernel-3.10.0-957.10.1.el7_lustre.x86_64.rpm \
https://downloads.whamcloud.com/public/lustre/lustre-2.12.1/el7/server/RPMS/x86_64/kmod-lustre-2.12.1-1.el7.x86_64.rpm \
https://downloads.whamcloud.com/public/lustre/lustre-2.12.1/el7/server/RPMS/x86_64/kmod-lustre-osd-ldiskfs-2.12.1-1.el7.x86_64.rpm \
https://downloads.whamcloud.com/public/lustre/lustre-2.12.1/el7/server/RPMS/x86_64/lustre-2.12.1-1.el7.x86_64.rpm \
https://downloads.whamcloud.com/public/lustre/lustre-2.12.1/el7/server/RPMS/x86_64/lustre-osd-ldiskfs-mount-2.12.1-1.el7.x86_64.rpm
```
- Install the dependencies (otherwise rpm fails with "error: Failed dependencies:"):
```
yum clean all && yum repolist
yum install -y linux-firmware dracut selinux-policy-targeted kexec-tools libyaml perl
```
- Install all of the RPMs (force the install if it cannot be installed normally):
```
cd ~/lustre2.12.1 && rpm -ivh *.rpm --force
```
- Reboot the server:
```
init 6
```
- Check the kernel:
```
[root@mds001 ~]# uname -r
3.10.0-957.10.1.el7_lustre.x86_64
```
- Load the Lustre modules (this is a one-off load and does not survive a reboot):
```
[root@mds001 ~]# modprobe lustre && lsmod | grep lustre
lustre                758679  0
lmv                   177987  1 lustre
mdc                   232938  1 lustre
lov                   314581  1 lustre
ptlrpc               2264705  7 fid,fld,lmv,mdc,lov,osc,lustre
obdclass             1962422  8 fid,fld,lmv,mdc,lov,osc,lustre,ptlrpc
lnet                  595941  6 lmv,osc,lustre,obdclass,ptlrpc,ksocklnd
libcfs                421295  11 fid,fld,lmv,mdc,lov,osc,lnet,lustre,obdclass,ptlrpc,ksocklnd
```
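The modprobe above is a one-off. If you want the lustre module loaded automatically at every boot, one option (an assumption, not a step from the original) is a systemd modules-load.d drop-in:
```
# load the lustre module automatically at boot via systemd-modules-load
cat << eof > /etc/modules-load.d/lustre.conf
lustre
eof
```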
- Check the Lustre version:
```
[root@mds001 ~]# modinfo lustre
filename:       /lib/modules/3.10.0-957.10.1.el7_lustre.x86_64/extra/lustre/fs/lustre.ko
license:        GPL
version:        2.12.1
description:    Lustre Client File System
author:         OpenSFS, Inc. <http://www.lustre.org/>
retpoline:      Y
rhelversion:    7.6
srcversion:     E50D950B04B4044ABCBCFA3
depends:        obdclass,ptlrpc,libcfs,lnet,lmv,mdc,lov
vermagic:       3.10.0-957.10.1.el7_lustre.x86_64 SMP mod_unload modversions
```
- An MGS, MDT or OST is only actually created once it has been mounted on one of the nodes.
- Format all five disks on mds001; because they are shared disks, the filesystem type is also visible from mds002:

mkfs.lustre option | Description |
---|---|
--fsname | Name of the Lustre filesystem (the cluster identifier); must be unique |
--servicenode | LNet NID of a node that can serve this target (used for failover) |
--mgsnode | LNet NID of the MGS node |
--mgs | Format the device as the MGS; the MGS (ManaGement Server) records the state of the whole Lustre filesystem |
--mdt | Format the device as an MDT; the MDT (MetaData Target) stores the Lustre metadata |
--ost | Format the device as an OST; the OST (Object Storage Target) stores the Lustre file data |
--index | Index of the target within the filesystem, e.g. --mdt --index=1 gives LABEL="global-MDT0001"; must be unique within the filesystem |
--reformat | Skip the check that normally prevents reformatting a device that already contains data |
--replace | Replace an existing target |

Note: the --index values for MDTs and OSTs must start from 0, otherwise /var/log/messages reports errors about waiting for the missing index.
```
mkfs.lustre --mgs --servicenode=192.168.209.21@tcp2 --servicenode=192.168.209.22@tcp2 --backfstype=ldiskfs --reformat /dev/sdb
mkfs.lustre --fsname global --mdt --index=0 --servicenode=192.168.209.21@tcp2 --servicenode=192.168.209.22@tcp2 --mgsnode=192.168.209.21@tcp2 --mgsnode=192.168.209.22@tcp2 --backfstype=ldiskfs --reformat /dev/sdc
mkfs.lustre --fsname global --mdt --index=1 --servicenode=192.168.209.21@tcp2 --servicenode=192.168.209.22@tcp2 --mgsnode=192.168.209.21@tcp2 --mgsnode=192.168.209.22@tcp2 --backfstype=ldiskfs --reformat /dev/sdd
mkfs.lustre --fsname global --ost --index=0 --servicenode=192.168.209.21@tcp2 --servicenode=192.168.209.22@tcp2 --mgsnode=192.168.209.21@tcp2 --mgsnode=192.168.209.22@tcp2 --backfstype=ldiskfs --reformat /dev/sde
mkfs.lustre --fsname global --ost --index=1 --servicenode=192.168.209.21@tcp2 --servicenode=192.168.209.22@tcp2 --mgsnode=192.168.209.21@tcp2 --mgsnode=192.168.209.22@tcp2 --backfstype=ldiskfs --reformat /dev/sdf
```
- The result of the formatting can be seen from mds002:
```
[root@mds002 ~]# blkid /dev/sdb /dev/sdc
/dev/sdc: LABEL="global:MDT0000" UUID="d99a724e-d199-4f59-9a4e-7164882f754b" TYPE="ext4"
```
- Create the mount-point directories that the mount resources created in corosync will use:
```
[root@mds001 ~]# mkdir /mnt/mgs;mkdir /mnt/mdt1;mkdir /mnt/mdt2;mkdir /mnt/ost1;mkdir /mnt/ost2
[root@mds002 ~]# mkdir /mnt/mgs;mkdir /mnt/mdt1;mkdir /mnt/mdt2;mkdir /mnt/ost1;mkdir /mnt/ost2
```
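Because a target only comes into existence when it is first mounted, it can be useful to test-mount each device by hand once before handing it over to Pacemaker (an optional check, not part of the original steps). For example, on mds001:
```
# sanity-check the MGS target, then unmount it again so Pacemaker can manage it
mount -t lustre /dev/sdb /mnt/mgs
df -hT /mnt/mgs
umount /mnt/mgs
```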
4. Setting up time synchronization with chrony:
- Both nodes synchronize to an upstream (Aliyun) time server. Install chrony on both:
```
[root@mds001 ~]# yum -y install chrony
```
- Enable and start the service:
```
[root@mds001 ~]# systemctl enable chronyd;systemctl start chronyd
```
- Edit the configuration file /etc/chrony.conf:
```
[root@mds001 ~]# sed -i '/^server [0-9]/d' /etc/chrony.conf
[root@mds001 ~]# sed -i '2a\server 192.168.10.24 iburst\' /etc/chrony.conf
[root@mds001 ~]# sed -i 's/#allow 192.168.0.0\/16/allow 192.168.10.0\/24/' /etc/chrony.conf
[root@mds001 ~]# sed -i 's/#local stratum 10/local stratum 10/' /etc/chrony.conf
```
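After these edits, the relevant part of /etc/chrony.conf should look roughly like this (a sketch derived from the sed commands above; 192.168.10.24 is the upstream server they configure):
```
server 192.168.10.24 iburst   # upstream NTP server added by the sed command
allow 192.168.10.0/24         # serve time to the cluster network
local stratum 10              # keep serving time even if the upstream is unreachable
```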
- Restart the service:
```
[root@mds001 ~]# systemctl restart chronyd
```
- Check the time synchronization status:
```
[root@mds001 ~]# timedatectl status
      Local time: Wed 2023-07-26 23:02:14 EDT
  Universal time: Thu 2023-07-27 03:02:14 UTC
        RTC time: Thu 2023-07-27 03:02:14
       Time zone: America/New_York (EDT, -0400)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: no
      DST active: yes
 Last DST change: DST began at
                  Sun 2023-03-12 01:59:59 EST
                  Sun 2023-03-12 03:00:00 EDT
 Next DST change: DST ends (the clock jumps one hour backwards) at
                  Sun 2023-11-05 01:59:59 EDT
                  Sun 2023-11-05 01:00:00 EST
```
- Enable network time synchronization:
```
[root@mds001 ~]# timedatectl set-ntp true
```
- Check the synchronization sources in detail:
```
[root@mds001 ~]# chronyc sources -v
210 Number of sources = 1

  .-- Source mode  '^' = server, '=' = peer, '#' = local clock.
 / .- Source state '*' = current synced, '+' = combined , '-' = not combined,
| /   '?' = unreachable, 'x' = time may be in error, '~' = time too variable.
||                                                 .- xxxx [ yyyy ] +/- zzzz
||      Reachability register (octal) -.           |  xxxx = adjusted offset,
||      Log2(Polling interval) --.      |          |  yyyy = measured offset,
||                                \     |          |  zzzz = estimated error.
||                                 |    |           \
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* 203.107.6.88                  2   6    17     2  +2105us[+1914us] +/-   20ms
```
5. Passwordless SSH between the nodes:
- Generate a key pair on each node:
```
# ssh-keygen - generates, manages and converts authentication keys; -t selects the key type
[root@mds001 ~]# ssh-keygen -t rsa -b 1024
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:jRwgxJr1/bIRZ630UbN4QJtlVI1qsYkqQ+SOe++pwS4 root@mds001
The key's randomart image is:
+---[RSA 1024]----+
| oo .     ...+oo|
|  o...    o=+  .|
| + + ..   ooO o |
|  o +.o+= O o   |
|    + SB.+ o    |
|   ..+ + o .    |
|    .oo +       |
|     E.....     |
|      oo++      |
+----[SHA256]-----+
[root@mds002 ~]# ssh-keygen -t rsa -b 1024
Generating public/private rsa key pair.
....
```
- On the host being logged into, create the directory and file:
```
mkdir ~/.ssh;touch ~/.ssh/authorized_keys
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.10.51
```
- Each node copies its public key to the other node:
```
[root@mds001 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.10.22
The authenticity of host '192.168.10.22 (192.168.10.22)' can't be established.
ECDSA key fingerprint is SHA256:8GotQw1f08FF9REsxJKn9ObpvvOib0h1W2sfJNClXwk.
ECDSA key fingerprint is MD5:a4:7d:50:65:13:8d:17:dd:8e:b9:10:6f:64:e8:9b:d6.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.10.22' (ECDSA) to the list of known hosts.
root@192.168.10.22's password:
id_rsa.pub                              100%  225   125.2KB/s   00:00
[root@mds002 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.10.21
The authenticity of host '192.168.10.21 (192.168.10.21)' can't be established.
...
```
- Log in to the peer node from each local node:
```
[root@mds001 ~]# ssh root@192.168.10.22
Last login: Wed Jul 26 21:24:53 2023 from 192.168.10.1
[root@mds002 ~]#
[root@mds002 ~]# ssh root@192.168.10.21
Last login: Wed Jul 26 23:23:22 2023 from 192.168.10.1
[root@mds001 ~]#
```
6. Building the Corosync/Pacemaker HA cluster:
- On both nodes configure the hostname mappings (the second mapping below covers the mds004/mds005 pair that appears in the failover test later), stop the firewall, and disable SELinux:
```
cat << eof >> /etc/hosts
192.168.10.21 mds001
192.168.10.22 mds002
eof

cat << eof >> /etc/hosts
192.168.10.24 mds004
192.168.10.25 mds005
eof

systemctl stop firewalld
systemctl disable firewalld
sed -i -e 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
```
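The SELINUX=disabled edit only takes effect after a reboot; to put SELinux into permissive mode immediately as well (an extra step, not in the original), run:
```
setenforce 0
```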
- Install on every node:

Package | Description |
---|---|
pacemaker | The most widely used open-source cluster resource manager (failure detection and resource recovery, keeps cluster services highly available) |
pcs | Command-line tool for managing corosync/pacemaker |
psmisc | Process management utilities |
policycoreutils-python | Python package for operating on and managing SELinux (Security-Enhanced Linux) policy |

```
[root@mds001 ~]# yum install pacemaker pcs policycoreutils-python -y
```
- Enable the pcsd service at boot on every node:
```
[root@mds001 ~]# systemctl enable pcsd;systemctl restart pcsd
```
- Set the same password for the hacluster user on every cluster node:
```
[root@mds001 ~]# echo "110119" |passwd --stdin hacluster
Changing password for user hacluster.
passwd: all authentication tokens updated successfully.
```
- Authenticate the nodes with the hacluster user and its password (110119); this only needs to be done on one node:
```
[root@mds001 ~]# pcs cluster auth mds00{1,2}
Username: hacluster
Password:
mds001: Authorized
mds002: Authorized
```
- Create the cluster:
```
[root@mds001 ~]# pcs cluster setup --name mylustre mds00{1,2}
Destroying cluster on nodes: mds001, mds002...
mds001: Stopping Cluster (pacemaker)...
mds002: Stopping Cluster (pacemaker)...
mds002: Successfully destroyed cluster
mds001: Successfully destroyed cluster

Sending 'pacemaker_remote authkey' to 'mds001', 'mds002'
mds001: successful distribution of the file 'pacemaker_remote authkey'
mds002: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
mds001: Succeeded
mds002: Succeeded

Synchronizing pcsd certificates on nodes mds001, mds002...
mds001: Success
mds002: Success
Restarting pcsd on the nodes in order to reload the certificates...
mds001: Success
mds002: Success
```
- Start the cluster:
```
[root@mds001 ~]# pcs cluster start --all
mds001: Starting Cluster (corosync)...
mds002: Starting Cluster (corosync)...
mds001: Starting Cluster (pacemaker)...
mds002: Starting Cluster (pacemaker)...
```
- At this point a two-node corosync + pacemaker cluster has been created and started.
- Check on each node that both corosync and pacemaker are running. (The pacemaker log on one of the nodes reports an error: the stonith option is enabled but no stonith device has been found. The fix is to deal with the stonith option later, either by disabling it or by configuring fencing devices as in section 7 below.)
```
[root@mds001 ~]# systemctl status corosync pacemaker
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/usr/lib/systemd/system/corosync.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2023-07-26 22:21:09 EDT; 2min 41s ago
     Docs: man:corosync
           man:corosync.conf
           man:corosync_overview
  Process: 67381 ExecStart=/usr/share/corosync/corosync start (code=exited, status=0/SUCCESS)
 Main PID: 67389 (corosync)
   CGroup: /system.slice/corosync.service
           └─67389 corosync
Jul 26 22:21:09 mds001 corosync[67389]: [TOTEM ] A new membership (192.168.10.21:9) was for...: 2
Jul 26 22:21:09 mds001 corosync[67389]: [CPG   ] downlist left_list: 0 received
Jul 26 22:21:09 mds001 corosync[67389]: [CPG   ] downlist left_list: 0 received
Jul 26 22:21:09 mds001 corosync[67389]: [CPG   ] downlist left_list: 0 received
Jul 26 22:21:09 mds001 corosync[67389]: [VOTEQ ] Waiting for all cluster members. Current v...: 2
Jul 26 22:21:09 mds001 corosync[67389]: [QUORUM] This node is within the primary component ...ce.
Jul 26 22:21:09 mds001 corosync[67389]: [QUORUM] Members[2]: 1 2
Jul 26 22:21:09 mds001 corosync[67389]: [MAIN  ] Completed service synchronization, ready t...ce.
Jul 26 22:21:09 mds001 corosync[67381]: Starting Corosync Cluster Engine (corosync): [  OK  ]
Jul 26 22:21:09 mds001 systemd[1]: Started Corosync Cluster Engine.
● pacemaker.service - Pacemaker High Availability Cluster Manager
   Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2023-07-26 22:21:10 EDT; 2min 40s ago
     Docs: man:pacemakerd
           https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/index.html
 Main PID: 67424 (pacemakerd)
   CGroup: /system.slice/pacemaker.service
           ├─67424 /usr/sbin/pacemakerd -f
           ├─67425 /usr/libexec/pacemaker/cib
           ├─67426 /usr/libexec/pacemaker/stonithd
           ├─67427 /usr/libexec/pacemaker/lrmd
           ├─67428 /usr/libexec/pacemaker/attrd
           ├─67429 /usr/libexec/pacemaker/pengine
           └─67430 /usr/libexec/pacemaker/crmd
Jul 26 22:21:11 mds001 crmd[67430]: notice: Node mds002 state is now member
Jul 26 22:21:11 mds001 stonith-ng[67426]: notice: Node mds002 state is now member
Jul 26 22:21:11 mds001 crmd[67430]: notice: The local CRM is operational
Jul 26 22:21:11 mds001 crmd[67430]: notice: State transition S_STARTING -> S_PENDING
Jul 26 22:21:11 mds001 attrd[67428]: notice: Node mds002 state is now member
Jul 26 22:21:11 mds001 attrd[67428]: notice: Recorded local node as attribute writer (was unset)
Jul 26 22:21:13 mds001 crmd[67430]: notice: Fencer successfully connected
Jul 26 22:21:32 mds001 crmd[67430]: warning: Input I_DC_TIMEOUT received in state S_PENDIN...pped
Jul 26 22:21:32 mds001 crmd[67430]: notice: State transition S_ELECTION -> S_PENDING
Jul 26 22:21:32 mds001 crmd[67430]: notice: State transition S_PENDING -> S_NOT_DC
Hint: Some lines were ellipsized, use -l to show in full.
```
- Check the cluster status:
```
[root@mds001 ~]# pcs cluster status
Cluster Status:
 Stack: corosync
 Current DC: mds002 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
 Last updated: Wed Jul 26 22:24:52 2023
 Last change: Wed Jul 26 22:21:32 2023 by hacluster via crmd on mds002
 2 nodes configured
 0 resource instances configured

PCSD Status:
  mds002: Online
  mds001: Online
```
- Validate the cluster configuration (errors are reported; the stonith option needs to be dealt with):
```
[root@mds001 ~]# crm_verify -LV
   error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
   error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
   error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
```
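If you want crm_verify to pass before any fencing devices exist, one option (an assumption; section 7 below configures real fencing and turns the option back on) is to disable STONITH temporarily:
```
# disable fencing until stonith devices are configured (re-enabled in section 7)
pcs property set stonith-enabled=false
crm_verify -LV   # should now report no errors
```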
7. Configuring fencing (STONITH):
- Install all of the fence agents:
```
[root@mds001 resource.d]# yum install -y fence-agents-all
```
- List the fence agents available on this host:
```
[root@mds001 resource.d]# pcs stonith list
fence_amt_ws - Fence agent for AMT (WS)
fence_apc - Fence agent for APC over telnet/ssh
fence_apc_snmp - Fence agent for APC, Tripplite PDU over SNMP
fence_bladecenter - Fence agent for IBM BladeCenter
fence_brocade - Fence agent for HP Brocade over telnet/ssh
fence_cisco_mds - Fence agent for Cisco MDS
fence_cisco_ucs - Fence agent for Cisco UCS
fence_compute - Fence agent for the automatic resurrection of OpenStack compute instances
fence_drac5 - Fence agent for Dell DRAC CMC/5
fence_eaton_snmp - Fence agent for Eaton over SNMP
fence_emerson - Fence agent for Emerson over SNMP
fence_eps - Fence agent for ePowerSwitch
fence_evacuate - Fence agent for the automatic resurrection of OpenStack compute instances
fence_heuristics_ping - Fence agent for ping-heuristic based fencing
fence_hpblade - Fence agent for HP BladeSystem
fence_ibmblade - Fence agent for IBM BladeCenter over SNMP
fence_idrac - Fence agent for IPMI
fence_ifmib - Fence agent for IF MIB
fence_ilo - Fence agent for HP iLO
fence_ilo2 - Fence agent for HP iLO
fence_ilo3 - Fence agent for IPMI
fence_ilo3_ssh - Fence agent for HP iLO over SSH
fence_ilo4 - Fence agent for IPMI
fence_ilo4_ssh - Fence agent for HP iLO over SSH
fence_ilo5 - Fence agent for IPMI
fence_ilo5_ssh - Fence agent for HP iLO over SSH
fence_ilo_moonshot - Fence agent for HP Moonshot iLO
fence_ilo_mp - Fence agent for HP iLO MP
fence_ilo_ssh - Fence agent for HP iLO over SSH
fence_imm - Fence agent for IPMI
fence_intelmodular - Fence agent for Intel Modular
fence_ipdu - Fence agent for iPDU over SNMP
fence_ipmilan - Fence agent for IPMI
fence_kdump - fencing agent for use with kdump crash recovery service
fence_mpath - Fence agent for multipath persistent reservation
fence_redfish - I/O Fencing agent for Redfish
fence_rhevm - Fence agent for RHEV-M REST API
fence_rsa - Fence agent for IBM RSA
fence_rsb - I/O Fencing agent for Fujitsu-Siemens RSB
fence_sbd - Fence agent for sbd
fence_scsi - Fence agent for SCSI persistent reservation
fence_virt - Fence agent for virtual machines
fence_vmware_rest - Fence agent for VMware REST API
```
- Enable the stonith option:
```
pcs property set stonith-enabled=true
```
- Configure fence_heuristics_ping:
```
pcs stonith create stonith-ping-mds001 fence_heuristics_ping ping_targets=192.168.10.21
pcs stonith create stonith-ping-mds002 fence_heuristics_ping ping_targets=192.168.10.22
```
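Note that fence_heuristics_ping only performs a ping heuristic and is normally combined with a real power-fencing agent in a fencing topology; on its own it cannot actually power a node off. Either way, you can confirm the devices were created (a quick check, not shown in the original):
```
pcs stonith show                                 # list the configured stonith resources
pcs property list --all | grep stonith-enabled   # confirm the stonith option is on
```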
8. Configuring the LNet network:
- Two network interfaces are needed in total (ens33 and ens38).
- Edit the configuration file /etc/modprobe.d/lustre.conf:
```
cat << eof > /etc/modprobe.d/lustre.conf
options lnet networks="tcp(ens33),tcp2(ens38)"
eof
```
- Copy the lustre.conf file to the other node:
```
scp /etc/modprobe.d/lustre.conf root@mds001:/etc/modprobe.d
```
- Reload the modules on both nodes:
```
lustre_rmmod && modprobe -v lustre
```
- Check the NIDs:
```
[root@mds001 ~]# lctl list_nids
192.168.10.21@tcp
192.168.209.21@tcp2
[root@mds002 ~]# lctl list_nids
192.168.10.22@tcp
192.168.209.22@tcp2
```
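Before creating the Lustre resources it is worth confirming that the two nodes can reach each other over both LNet networks (an extra check, not in the original):
```
# from mds001, ping mds002 over both LNet networks
lctl ping 192.168.10.22@tcp
lctl ping 192.168.209.22@tcp2
```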
9. Adding the targets with the Lustre resource agent:
- ocf:lustre:Lustre
  - Because its scope is narrower, it is simpler than ocf:heartbeat:Filesystem and therefore better suited to managing Lustre storage resources.
  - It was developed specifically for Lustre OSDs; the resource agent is distributed by the Lustre project and is available in Lustre 2.10.0 and later.
- ocf:heartbeat:ZFS: has to be installed separately and works with ZFS storage pools.
- Download and install it (note: both nodes need it):
```
wget https://downloads.whamcloud.com/public/lustre/lustre-2.12.1/el7/server/RPMS/x86_64/lustre-resource-agents-2.12.1-1.el7.x86_64.rpm
rpm -ivh lustre-resource-agents-2.12.1-1.el7.x86_64.rpm
```
- Create the resources:

Parameter | Description |
---|---|
<resource name> | Name of the resource |
target= | Path to the block device backing the target |
mountpoint= | Mount point of the OSD |

```
pcs resource create global-mgs ocf:lustre:Lustre target=/dev/sdb mountpoint=/mnt/mgs
pcs resource create global-mdt1 ocf:lustre:Lustre target=/dev/sdc mountpoint=/mnt/mdt1
pcs resource create global-mdt2 ocf:lustre:Lustre target=/dev/sdd mountpoint=/mnt/mdt2
pcs resource create global-ost1 ocf:lustre:Lustre target=/dev/sde mountpoint=/mnt/ost1
pcs resource create global-ost2 ocf:lustre:Lustre target=/dev/sdf mountpoint=/mnt/ost2
```
- Configure location preferences. Do not put the resources into a resource group: a group would tie them all to a single node and make the preferences meaningless:
```
pcs constraint location add global-constraint-mgs global-mgs mds001 10
pcs constraint location add global-constraint-mdt1 global-mdt1 mds001 10
pcs constraint location add global-constraint-mdt2 global-mdt2 mds002 10
pcs constraint location add global-constraint-ost1 global-ost1 mds001 10
pcs constraint location add global-constraint-ost2 global-ost2 mds002 10
```
- Set the start order (mgs > mdt > ost):
```
pcs constraint order start global-mgs then start global-mdt1
pcs constraint order start global-mgs then start global-mdt2
pcs constraint order start global-mgs then start global-ost1
pcs constraint order start global-mgs then start global-ost2
```
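Note that the order constraints above only force the MGS to start before everything else; they do not order the MDTs before the OSTs, so additional `pcs constraint order` rules would be needed for that (an observation, not part of the original). The configured constraints and current resource placement can be checked with:
```
pcs constraint        # list all location and order constraints
pcs status resources  # show where each resource is currently running
```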
10. Creating the LNet monitoring resource:
- Pacemaker can be configured to monitor various aspects of the cluster servers to help determine overall system health. This provides additional data points for deciding where resources should run.
- Lustre 2.10 introduced two monitoring resource agents:
  - ocf:lustre:healthLNET, for monitoring LNet connectivity (requires the LNet interfaces to be configured)
  - ocf:lustre:healthLUSTRE, for monitoring Lustre health (has to be installed)
- Create it (it is not tied to any particular node):

Parameter | Description |
---|---|
lctl | Tells the resource agent to use lctl ping to monitor the LNet NIDs; if unset, the ordinary system ping command is used |
multiplier | A positive integer that is multiplied by the number of machines that answered the ping; the result needs to be larger than the resource-stickiness value |
device | The network device to monitor, e.g. eth1 or ib0 |
host_list | Space-separated list of LNet NIDs to ping; if lctl=false, host_list should contain ordinary hostnames or IP addresses |
--clone | Tells Pacemaker to start an instance of the resource on every node in the cluster |

```
pcs resource create ping-lnet ocf:lustre:healthLNET \
    lctl=true \
    multiplier=1001 \
    device=ens33 \
    host_list="192.168.209.21@tcp2 192.168.209.22@tcp2" \
    --clone
```
11. Creating the Lustre monitoring resource:
- ocf:lustre:healthLUSTRE follows the same implementation model as ocf:lustre:healthLNET, except that instead of monitoring LNet NIDs it monitors the contents of the Lustre health_check file and maintains a node attribute named lustred.
- Create it:
```
pcs resource create global-healthLUSTRE ocf:lustre:healthLUSTRE --clone
```
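Each clone maintains a node attribute (typically pingd for healthLNET, lustred for healthLUSTRE); their current values can be inspected together with the cluster state (a quick check, assuming the standard crm_mon options):
```
crm_mon -A1   # one-shot cluster status including node attributes
```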
12. Failover test:
- Cluster state with both nodes online:
```
[root@mds004 ~]# pcs status
Cluster name: MyUpdate
Stack: corosync
Current DC: mds004 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Thu Aug 10 06:09:44 2023
Last change: Thu Aug 10 06:07:46 2023 by root via cibadmin on mds004
2 nodes configured
11 resource instances configured
Online: [ mds004 mds005 ]
Full list of resources:
Clone Set: ping-lnet-clone [ping-lnet]
Started: [ mds004 mds005 ]
Clone Set: global-healthLUSTRE-clone [global-healthLUSTRE]
Started: [ mds004 mds005 ]
stonith-ping-mds004 (stonith:fence_heuristics_ping): Started mds004
stonith-ping-mds005 (stonith:fence_heuristics_ping): Started mds005
global-mgs (ocf::lustre:Lustre): Started mds004
global-mdt1 (ocf::lustre:Lustre): Started mds004
global-mdt2 (ocf::lustre:Lustre): Started mds005
global-ost1 (ocf::lustre:Lustre): Started mds004
global-ost2 (ocf::lustre:Lustre): Started mds005
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
```
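The original does not show how the node failure was triggered; one way to simulate it (an assumption) is to stop the cluster services on the active node, or simply power that VM off:
```
# simulate a failure of mds004 (run from the surviving node, mds005)
pcs cluster stop mds004
```
With mds004 down, pcs status on mds005 shows all Lustre resources running on the surviving node: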
```
[root@mds005 lustre2.12.1]# pcs status
Cluster name: MyUpdate
Stack: corosync
Current DC: mds005 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Thu Aug 10 06:11:16 2023
Last change: Thu Aug 10 06:07:46 2023 by root via cibadmin on mds004
2 nodes configured
11 resource instances configured
Online: [ mds005 ]
OFFLINE: [ mds004 ]
Full list of resources:
Clone Set: ping-lnet-clone [ping-lnet]
Started: [ mds005 ]
Stopped: [ mds004 ]
Clone Set: global-healthLUSTRE-clone [global-healthLUSTRE]
Started: [ mds005 ]
Stopped: [ mds004 ]
stonith-ping-mds004 (stonith:fence_heuristics_ping): Started mds005
stonith-ping-mds005 (stonith:fence_heuristics_ping): Started mds005
global-mgs (ocf::lustre:Lustre): Started mds005
global-mdt1 (ocf::lustre:Lustre): Started mds005
global-mdt2 (ocf::lustre:Lustre): Started mds005
global-ost1 (ocf::lustre:Lustre): Started mds005
global-ost2 (ocf::lustre:Lustre): Started mds005
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
```
2. Client-side remote test:
1. Download and install:
- Download:

RPM package | Description |
---|---|
kernel-*.el7_lustre.x86_64.rpm | Linux kernel with the Lustre patches |
kernel-devel-*.el7_lustre.x86_64.rpm | Parts of the kernel tree needed to build third-party modules (e.g. network drivers) |
kernel-headers-*.el7_lustre.x86_64.rpm | Header files under /usr/include, used to build user-space code and code that interfaces with the kernel |
lustre-client-dkms-*.el7.noarch.rpm | Client RPM that replaces kmod-lustre-client, with Dynamic Kernel Module Support (DKMS) |
lustre-client-*.el7.x86_64.rpm | Client command-line tools |

```
mkdir lustre2.12 && cd lustre2.12
wget \
https://downloads.whamcloud.com/public/lustre/lustre-2.12.1/el7/server/RPMS/x86_64/kernel-3.10.0-957.10.1.el7_lustre.x86_64.rpm \
https://downloads.whamcloud.com/public/lustre/lustre-2.12.1/el7/server/RPMS/x86_64/kernel-devel-3.10.0-957.10.1.el7_lustre.x86_64.rpm \
https://downloads.whamcloud.com/public/lustre/lustre-2.12.1/el7/server/RPMS/x86_64/kernel-headers-3.10.0-957.10.1.el7_lustre.x86_64.rpm \
https://downloads.whamcloud.com/public/lustre/lustre-2.12.1/el7/client/RPMS/x86_64/lustre-client-dkms-2.12.1-1.el7.noarch.rpm \
https://downloads.whamcloud.com/public/lustre/lustre-2.12.1/el7/client/RPMS/x86_64/lustre-client-2.12.1-1.el7.x86_64.rpm
```
- Install (gcc and other dependencies are required):
```
[root@client lustre2.12]# rpm -ivh *.rpm
error: Failed dependencies:
	/usr/bin/expect is needed by lustre-client-dkms-2.12.1-1.el7.noarch
	dkms >= 2.2.0.3-28.git.7c3e7c5 is needed by lustre-client-dkms-2.12.1-1.el7.noarch
	gcc is needed by lustre-client-dkms-2.12.1-1.el7.noarch
	kernel-devel is needed by lustre-client-dkms-2.12.1-1.el7.noarch
	libyaml-devel is needed by lustre-client-dkms-2.12.1-1.el7.noarch
```
- Install the EPEL repository:
```
yum install -y epel-release
```
- Install the dependencies:
```
yum install -y perl expect gcc kernel-devel libyaml-devel dkms
```
- Install the kernel RPMs:
```
[root@client lustre2.12]# rpm -ivh kernel-*.rpm --force
```
- The installation succeeds, but the logs show that the Lustre kernel modules were not built automatically, because the installed kernel headers do not match the running kernel version:
```
[root@client ~]# rpm -qa | grep kernel
kernel-3.10.0-1160.el7.x86_64
kernel-tools-3.10.0-1160.el7.x86_64
kernel-headers-3.10.0-957.10.1.el7_lustre.x86_64
kernel-3.10.0-957.10.1.el7_lustre.x86_64
kernel-tools-libs-3.10.0-1160.el7.x86_64
kernel-devel-3.10.0-957.10.1.el7_lustre.x86_64
kernel-devel-3.10.0-1160.92.1.el7.x86_64
```
- Reboot the host and check the kernel:
```
[root@client ~]# init 6
[root@client ~]# uname -r
3.10.0-957.10.1.el7_lustre.x86_64
```
- Install the client RPMs:
```
[root@client ~]# rpm -ivh lustre-client-dkms-2.12.1-1.el7.noarch.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:lustre-client-dkms-2.12.1-1.el7  ################################# [100%]
Loading new lustre-client-2.12.1 DKMS files...
Deprecated feature: REMAKE_INITRD (/usr/src/lustre-client-2.12.1/dkms.conf)
Building for 3.10.0-957.10.1.el7_lustre.x86_64
Building initial module for 3.10.0-957.10.1.el7_lustre.x86_64
Deprecated feature: REMAKE_INITRD (/var/lib/dkms/lustre-client/2.12.1/source/dkms.conf)
configure: WARNING: No selinux package found, unable to build selinux enabled tools
Done.
Deprecated feature: REMAKE_INITRD (/var/lib/dkms/lustre-client/2.12.1/source/dkms.conf)
Deprecated feature: REMAKE_INITRD (/var/lib/dkms/lustre-client/2.12.1/source/dkms.conf)
lnet_selftest.ko.xz:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/3.10.0-957.10.1.el7_lustre.x86_64/extra/
lnet.ko.xz:
.....
[root@client ~]# rpm -ivh lustre-client-2.12.1-1.el7.x86_64.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:lustre-client-2.12.1-1.el7       ################################# [100%]
```
- Load the modules:
```
[root@client ~]# modprobe lustre
[root@client ~]# lsmod | grep lustre
lustre                754256  0
lmv                   177987  1 lustre
mdc                   237615  1 lustre
lov                   314554  1 lustre
ptlrpc               1345789  7 fid,fld,lmv,mdc,lov,osc,lustre
obdclass             1741312  8 fid,fld,lmv,mdc,lov,osc,lustre,ptlrpc
lnet                  600632  6 lmv,osc,lustre,obdclass,ptlrpc,ksocklnd
libcfs                415252  11 fid,fld,lmv,mdc,lov,osc,lnet,lustre,obdclass,ptlrpc,ksocklnd
```
- Check the Lustre version:
```
[root@client ~]# modinfo lustre
filename:       /lib/modules/3.10.0-957.10.1.el7_lustre.x86_64/extra/lustre/fs/lustre.ko
license:        GPL
version:        2.12.1
description:    Lustre Client File System
author:         OpenSFS, Inc. <http://www.lustre.org/>
retpoline:      Y
rhelversion:    7.6
srcversion:     E50D950B04B4044ABCBCFA3
depends:        obdclass,ptlrpc,libcfs,lnet,lmv,mdc,lov
vermagic:       3.10.0-957.10.1.el7_lustre.x86_64 SMP mod_unload modversions
```
2. Mounting the client against the dual-node Lustre filesystem:
- Create the directories and mount the filesystem remotely:
```
[root@client ~]# mkdir /mnt/global-client1;mount -t lustre 192.168.10.21@tcp:192.168.10.22@tcp:/global /mnt/global-client1
[root@client ~]# mkdir /mnt/global-client2;mount -t lustre 192.168.10.21@tcp:192.168.10.22@tcp:/global /mnt/global-client2
```
- Check:
```
[root@client ~]# mount | grep lustre
192.168.10.21@tcp:192.168.10.22@tcp:/global on /mnt/global-client1 type lustre (rw,seclabel,lazystatfs)
192.168.10.21@tcp:192.168.10.22@tcp:/global on /mnt/global-client2 type lustre (rw,seclabel,lazystatfs)
[root@client ~]# lfs df
UUID                   1K-blocks        Used   Available Use% Mounted on
global-MDT0000_UUID      2330824       16548     2106120   1% /mnt/global-client1[MDT:0]
global-MDT0002_UUID      2330824       16468     2106200   1% /mnt/global-client1[MDT:2]
global-OST0000_UUID      3865564       34084     3605384   1% /mnt/global-client1[OST:0]
global-OST0002_UUID      3865564       34088     3605380   1% /mnt/global-client1[OST:2]
filesystem_summary:      7731128       68172     7210764   1% /mnt/global-client1

UUID                   1K-blocks        Used   Available Use% Mounted on
global-MDT0000_UUID      2330824       16548     2106120   1% /mnt/global-client2[MDT:0]
global-MDT0002_UUID      2330824       16468     2106200   1% /mnt/global-client2[MDT:2]
global-OST0000_UUID      3865564       34084     3605384   1% /mnt/global-client2[OST:0]
global-OST0002_UUID      3865564       34088     3605380   1% /mnt/global-client2[OST:2]
filesystem_summary:      7731128       68172     7210764   1% /mnt/global-client2
```
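To have the client mount the filesystem automatically at boot, an /etc/fstab entry like the following can be used (a sketch, not part of the original steps):
```
# /etc/fstab - both MGS NIDs are listed so the client can fail over between them
192.168.10.21@tcp:192.168.10.22@tcp:/global  /mnt/global-client1  lustre  defaults,_netdev  0 0
```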
3. Test:
- On the /mnt/global-client1 mount, create a file and write some data to it:
```
[root@client ~]# echo "Hello Lustre,I am Centos7.0-client client1" > /mnt/global-client1/client-test
```
- On the /mnt/global-client2 mount, the file is visible (along with the earlier first_file) and contains the same data:
```
[root@client ~]# ll /mnt/global-client2
total 5
-rw-r--r--. 1 root root 44 Jul 26 05:16 client-test
-rw-r--r--. 1 root root 26 Jul 26 02:51 first_file
[root@client ~]# cat /mnt/global-client2/client-test
Hello Lustre,I am Centos7.0-client client1
```
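To see how files are distributed over the OSTs, the lfs striping commands can be used (an optional extra test; the striped_file name is made up for illustration):
```
# show the stripe layout of the test file
lfs getstripe /mnt/global-client1/client-test
# create a file striped across both OSTs with a 1 MiB stripe size, then verify
lfs setstripe -c 2 -S 1M /mnt/global-client1/striped_file
lfs getstripe /mnt/global-client1/striped_file
```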
At this point the highly available Lustre cluster is fully set up.