Operating Environment
CentOS 7
DRBDADM_API_VERSION=2
DRBD_KERNEL_VERSION=9.0.14
DRBDADM_VERSION_CODE=0x090301
DRBDADM_VERSION=9.3.1
targetcli version 2.1.fb46
Corosync Cluster Engine, version '2.4.3'
Pacemaker 1.1.18-11.el7_5.3
crm 3.0.0
Network Topology
The drbd/corosync/pacemaker/targetcli/crm services will be configured on drbd-node1 and drbd-node3.
Procedure
Configuring DRBD
Follow the DRBD setup steps in "Installing and configuring DRBD9 on CentOS 6". Note that both VMs here are configured with dual NICs, which the cluster will use later. The DRBD disk resource configuration file is as follows:
[root@drbd-node1 ~]# vi /etc/drbd.d/scsivol.res
resource scsivol {
on drbd-node1 {
device /dev/drbd0;
disk /dev/vdb;
address 10.10.200.228:7789;
meta-disk internal;
}
on drbd-node3 {
device /dev/drbd0;
disk /dev/vdb;
address 10.10.200.226:7789;
meta-disk internal;
}
}
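With the resource file in place on both nodes, the DRBD volume still has to be initialized and activated; a minimal sketch, assuming the resource name scsivol from the file above (run `primary --force` only once, on the node chosen as the source of the initial sync):

```shell
# Run on BOTH nodes: write DRBD metadata and activate the resource
drbdadm create-md scsivol
drbdadm up scsivol

# Run on ONE node only: force it to Primary to start the initial sync
drbdadm primary --force scsivol

# Watch the connection and sync state
drbdadm status scsivol
```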
Installing Corosync/Pacemaker/Crmsh
1. Configure the yum repository files as follows (the exclude lines in the base and updates repos keep the stock CentOS packages from shadowing the cluster packages):
[root@drbd-node1 yum.repos.d]# vi crm.repo
[network_ha-clustering_Stable]
name=Stable High Availability/Clustering packages (CentOS_CentOS-7)
type=rpm-md
baseurl=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/
gpgcheck=1
gpgkey=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/repodata/repomd.xml.key
enabled=1
[root@drbd-node1 yum.repos.d]# vi drbd.repo
[base]
name=CentOS-$releasever - Base
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$ba[...]
#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
exclude=pacemaker* corosync* cluster* drbd* resource-agents
#released updates
[updates]
name=CentOS-$releasever - Updates
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$ba[...]
#baseurl=http://mirror.centos.org/centos/$releasever/updates/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
exclude=pacemaker* corosync* cluster* drbd* resource-agents
2. Install corosync, pacemaker, crmsh and targetcli:
[root@drbd-node1 yum.repos.d]# yum -y install corosync pacemaker crmsh targetcli
During installation, some dependencies may fail to install because of network problems; in that case, locate the RPMs yourself and install them manually.
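Installing such an RPM by hand can be sketched as follows; the package name here is a placeholder for whatever dependency yum reported as missing:

```shell
# Download the missing RPM from a mirror first, then install it locally;
# localinstall still resolves the remaining dependencies from the configured repos
yum -y localinstall ./some-missing-dependency.rpm
```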
Configuring Corosync
1. Edit the corosync configuration file as follows:
[root@drbd-node1 yum.repos.d]# vi /etc/corosync/corosync.conf
totem {
version: 2
secauth: off
cluster_name: cluster
transport: udpu
rrp_mode: passive
}
nodelist {
node {
ring0_addr: 192.168.0.228
ring1_addr: 10.10.200.228
nodeid: 1
}
node {
ring0_addr: 192.168.0.226
ring1_addr: 10.10.200.226
nodeid: 2
}
}
quorum {
provider: corosync_votequorum
two_node: 1
}
logging {
to_syslog: yes
}
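Once corosync is running on both nodes, the redundant rings defined above can be sanity-checked with the standard corosync tools (output will differ per site):

```shell
# Show the status of both totem rings (ring 0 and ring 1 under rrp_mode: passive)
corosync-cfgtool -s

# Show quorum state; with two_node: 1 the two-node cluster stays quorate
corosync-quorumtool -s
```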
2. Enable and start corosync and pacemaker on both nodes:
[root@drbd-node1 ~]# systemctl enable corosync
[root@drbd-node1 ~]# systemctl enable pacemaker
[root@drbd-node1 ~]# systemctl start corosync
[root@drbd-node1 ~]# systemctl start pacemaker
3. Check the corosync status:
[root@drbd-node1 ~]# systemctl status corosync
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/usr/lib/systemd/system/corosync.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2018-07-06 22:29:14 CST; 2min 22s ago
Docs: man:corosync
man:corosync.conf
man:corosync_overview
Process: 1114 ExecStart=/usr/share/corosync/corosync start (code=exited, status=0/SUCCESS)
Main PID: 1279 (corosync)
CGroup: /system.slice/corosync.service
└─1279 corosync
Jul 06 22:29:14 drbd-node1 corosync[1279]: [TOTEM ] adding new UDPU member {10.10.200.228}
Jul 06 22:29:14 drbd-node1 corosync[1279]: [TOTEM ] adding new UDPU member {10.10.200.226}
Jul 06 22:29:14 drbd-node1 corosync[1279]: [TOTEM ] A new membership (192.168.0.228:3392) was formed. Members joined: 1
Jul 06 22:29:14 drbd-node1 corosync[1279]: [TOTEM ] A new membership (192.168.0.226:3396) was formed. Members joined: 2
Jul 06 22:29:14 drbd-node1 corosync[1279]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Jul 06 22:29:14 drbd-node1 corosync[1279]: [QUORUM] This node is within the primary component and will provide service.
Jul 06 22:29:14 drbd-node1 corosync[1279]: [QUORUM] Members[2]: 2 1
Jul 06 22:29:14 drbd-node1 corosync[1279]: [MAIN ] Completed service synchronization, ready to provide service.
Jul 06 22:29:14 drbd-node1 corosync[1114]: Starting Corosync Cluster Engine (corosync): [ OK ]
Jul 06 22:29:14 drbd-node1 systemd[1]: Started Corosync Cluster Engine.
4. Check the pacemaker status:
[root@drbd-node1 ~]# systemctl status pacemaker
● pacemaker.service - Pacemaker High Availability Cluster Manager
Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2018-07-06 22:29:14 CST; 2min 29s ago
Docs: man:pacemakerd
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Explained/index.html
Main PID: 1304 (pacemakerd)
CGroup: /system.slice/pacemaker.service
├─1304 /usr/sbin/pacemakerd -f
├─1349 /usr/libexec/pacemaker/cib
├─1351 /usr/libexec/pacemaker/stonithd
├─1352 /usr/libexec/pacemaker/lrmd
├─1353 /usr/libexec/pacemaker/attrd
├─1354 /usr/libexec/pacemaker/pengine
└─1355 /usr/libexec/pacemaker/crmd
Jul 06 22:29:21 drbd-node1 crmd[1355]: notice: Result of start operation for p_drbd_r0 on drbd-node1: 0 (ok)
Jul 06 22:29:22 drbd-node1 crmd[1355]: notice: Result of notify operation for p_drbd_r0 on drbd-node1: 0 (ok)
Jul 06 22:29:22 drbd-node1 crmd[1355]: notice: Result of notify operation for p_drbd_r0 on drbd-node1: 0 (ok)
Jul 06 22:29:22 drbd-node1 crmd[1355]: notice: Result of promote operation for p_drbd_r0 on drbd-node1: 0 (ok)
Jul 06 22:29:22 drbd-node1 crmd[1355]: notice: Result of notify operation for p_drbd_r0 on drbd-node1: 0 (ok)
Jul 06 22:29:22 drbd-node1 crmd[1355]: notice: Result of start operation for p_iscsi_portblock_on_drbd0 on drbd-node1: 0 (ok)
Jul 06 22:29:27 drbd-node1 crmd[1355]: notice: Result of start operation for p_iscsi_ip0 on drbd-node1: 0 (ok)
Jul 06 22:29:29 drbd-node1 crmd[1355]: notice: Result of start operation for p_iscsi_target_drbd0 on drbd-node1: 0 (ok)
Jul 06 22:29:30 drbd-node1 crmd[1355]: notice: Result of start operation for p_iscsi_lun_drbd0 on drbd-node1: 0 (ok)
Jul 06 22:29:31 drbd-node1 crmd[1355]: notice: Result of start operation for p_iscsi_portblock_off_drbd0 on drbd-node1: 0 (ok)
Configuring the Cluster with CRM
1. Add the following properties to the crm configuration. (Disabling STONITH is acceptable for a test setup like this one; a production cluster should have fencing configured.)
# crm configure property no-quorum-policy=ignore
# crm configure property stonith-enabled=false
2. Configure DRBD in crm. Enter the crm configure shell; the value of drbd_resource is the DRBD resource name defined in scsivol.res under the drbd.d directory.
[root@drbd-node1 ~]# crm
crm(live)# configure
crm(live)configure# primitive p_drbd_r0 ocf:linbit:drbd \
params drbd_resource="scsivol" \
op start timeout=240 \
op promote timeout=90 \
op demote timeout=90 \
op stop timeout=100 \
op monitor interval="29" role="Master" \
op monitor interval="31" role="Slave"
3. Create the Master/Slave resource associated with the DRBD resource scsivol:
crm(live)configure# ms ms_drbd_r0 p_drbd_r0 \
meta master-max=1 master-node-max=1 \
notify=true clone-max=2 clone-node-max=1
4. Create the cluster virtual IP:
crm(live)configure# primitive p_iscsi_ip0 ocf:heartbeat:IPaddr2 \
params ip="10.10.200.235" cidr_netmask="24" \
op start timeout=20 \
op stop timeout=20 \
op monitor interval="10s"
5. Configure the iSCSI target name:
crm(live)configure# primitive p_iscsi_target_drbd0 ocf:heartbeat:iSCSITarget \
params iqn="iqn.2017-10.com.example1:edrbd0" \
implementation=lio-t portals="10.10.200.235:3260" \
op start timeout=20 \
op stop timeout=20 \
op monitor interval=20 timeout=40
6. Configure the LUN:
crm(live)configure# primitive p_iscsi_lun_drbd0 ocf:heartbeat:iSCSILogicalUnit \
params target_iqn="iqn.2017-10.com.example1:edrbd0" \
implementation=lio-t lun=0 path="/dev/drbd0" \
op start timeout=20 \
op stop timeout=20 \
op monitor interval=20 timeout=40
7. Configure port blocking and unblocking. This prevents iSCSI initiators from receiving "connection refused" errors before the iSCSI target has finished failing over:
crm(live)configure# primitive p_iscsi_portblock_on_drbd0 ocf:heartbeat:portblock \
params ip=10.10.200.235 portno=3260 protocol=tcp action=block \
op start timeout=20 \
op stop timeout=20 \
op monitor timeout=20 interval=20
crm(live)configure# primitive p_iscsi_portblock_off_drbd0 ocf:heartbeat:portblock \
params ip=10.10.200.235 portno=3260 protocol=tcp action=unblock \
op start timeout=20 \
op stop timeout=20 \
op monitor timeout=20 interval=20
8. Create the resource group associated with the iSCSI target:
crm(live)configure# group g_iscsi_drbd0 \
p_iscsi_portblock_on_drbd0 \
p_iscsi_ip0 p_iscsi_target_drbd0 p_iscsi_lun_drbd0 \
p_iscsi_portblock_off_drbd0
9. Finally, make sure the resource group and the DRBD resource run on the same node, and that this node is the Primary:
crm(live)configure# colocation cl_g_iscsi_drbd0-with-ms_drbd_r0 \
inf: g_iscsi_drbd0:Started ms_drbd_r0:Master
crm(live)configure# order o_ms_drbd_r0-before-g_iscsi_drbd0 \
inf: ms_drbd_r0:promote g_iscsi_drbd0:start
10. Commit the configuration:
crm(live)configure# commit
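The committed configuration can be reviewed and validated from the shell; a typical check using standard crmsh/pacemaker tools:

```shell
# Print the full CIB configuration back as crm shell syntax
crm configure show

# Validate the live CIB: -L reads the running cluster, -V prints warnings
crm_verify -LV
```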
11. Check the cluster status:
crm(live)# status
Stack: corosync
Current DC: drbd-node3 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Fri Jul 6 22:50:01 2018
Last change: Fri Jul 6 21:31:38 2018 by root via cibadmin on drbd-node3
2 nodes configured
7 resources configured
Online: [ drbd-node1 drbd-node3 ]
Full list of resources:
Master/Slave Set: ms_drbd_r0 [p_drbd_r0]
Masters: [ drbd-node1 ]
Slaves: [ drbd-node3 ]
Resource Group: g_iscsi_drbd0
p_iscsi_portblock_on_drbd0 (ocf::heartbeat:portblock): Started drbd-node1
p_iscsi_ip0 (ocf::heartbeat:IPaddr2): Started drbd-node1
p_iscsi_target_drbd0 (ocf::heartbeat:iSCSITarget): Started drbd-node1
p_iscsi_lun_drbd0 (ocf::heartbeat:iSCSILogicalUnit): Started drbd-node1
p_iscsi_portblock_off_drbd0 (ocf::heartbeat:portblock): Started drbd-node1
Testing the iSCSI Connection
[root@kvm-node ~]# iscsiadm -m discovery -t st -p 10.10.200.235
10.10.200.235:3260,1 iqn.2017-10.com.example1:edrbd0
[root@kvm-node ~]# iscsiadm -m node -T iqn.2017-10.com.example1:edrbd0 -l
Logging in to [iface: default, target: iqn.2017-10.com.example1:edrbd0, portal: 10.10.200.235,3260] (multiple)
Login to [iface: default, target: iqn.2017-10.com.example1:edrbd0, portal: 10.10.200.235,3260] successful.
[root@kvm-node ~]# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 02 Id: 00 Lun: 00
Vendor: DELL Model: PERC H700 Rev: 2.10
Type: Direct-Access ANSI SCSI revision: 05
Host: scsi0 Channel: 02 Id: 01 Lun: 00
Vendor: DELL Model: PERC H700 Rev: 2.10
Type: Direct-Access ANSI SCSI revision: 05
Host: scsi0 Channel: 02 Id: 02 Lun: 00
Vendor: DELL Model: PERC H700 Rev: 2.10
Type: Direct-Access ANSI SCSI revision: 05
Host: scsi0 Channel: 02 Id: 03 Lun: 00
Vendor: DELL Model: PERC H700 Rev: 2.10
Type: Direct-Access ANSI SCSI revision: 05
Host: scsi0 Channel: 02 Id: 04 Lun: 00
Vendor: DELL Model: PERC H700 Rev: 2.10
Type: Direct-Access ANSI SCSI revision: 05
Host: scsi0 Channel: 02 Id: 05 Lun: 00
Vendor: DELL Model: PERC H700 Rev: 2.10
Type: Direct-Access ANSI SCSI revision: 05
Host: scsi1 Channel: 00 Id: 00 Lun: 00
Vendor: TEAC Model: DVD-ROM DV-28SW Rev: R.2B
Type: CD-ROM ANSI SCSI revision: 05
Host: scsi3 Channel: 00 Id: 00 Lun: 00
Vendor: LIO-ORG Model: p_iscsi_lun_drb Rev: 4.0
Type: Direct-Access ANSI SCSI revision: 05
The device at Host: scsi3 with Vendor: LIO-ORG is the configured iSCSI device.
Testing Failover
Before testing failover, make sure the DRBD state is UpToDate/UpToDate on both nodes; otherwise the switchover will fail:
[root@drbd-node1 ~]# drbd-overview
NOTE: drbd-overview will be deprecated soon.
Please consider using drbdtop.
0:scsivol/0 Connected(2*) Primar/Second UpToDa/UpToDa
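Since drbd-overview is deprecated in DRBD 9, the same information is available from drbdadm itself; an equivalent check:

```shell
# DRBD 9 native status: shows role (Primary/Secondary) and
# disk state (UpToDate) for the local node and its peer
drbdadm status scsivol
```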
When both nodes of the iSCSI cluster are healthy, the status is as follows; all services run on drbd-node1:
[root@drbd-node1 ~]# crm
crm(live)# status
Stack: corosync
Current DC: drbd-node3 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Sat Jul 7 16:40:34 2018
Last change: Fri Jul 6 21:31:38 2018 by root via cibadmin on drbd-node3
2 nodes configured
7 resources configured
Online: [ drbd-node1 drbd-node3 ]
Full list of resources:
Master/Slave Set: ms_drbd_r0 [p_drbd_r0]
Masters: [ drbd-node1 ]
Slaves: [ drbd-node3 ]
Resource Group: g_iscsi_drbd0
p_iscsi_portblock_on_drbd0 (ocf::heartbeat:portblock): Started drbd-node1
p_iscsi_ip0 (ocf::heartbeat:IPaddr2): Started drbd-node1
p_iscsi_target_drbd0 (ocf::heartbeat:iSCSITarget): Started drbd-node1
p_iscsi_lun_drbd0 (ocf::heartbeat:iSCSILogicalUnit): Started drbd-node1
p_iscsi_portblock_off_drbd0 (ocf::heartbeat:portblock): Started drbd-node1
Now simulate a crash of drbd-node1 and check the cluster status on the surviving node; all services have failed over to drbd-node3:
[root@drbd-node3 ~]# crm
crm(live)# status
Stack: corosync
Current DC: drbd-node3 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Sat Jul 7 16:51:39 2018
Last change: Fri Jul 6 21:31:38 2018 by root via cibadmin on drbd-node3
2 nodes configured
7 resources configured
Online: [ drbd-node3 ]
OFFLINE: [ drbd-node1 ]
Full list of resources:
Master/Slave Set: ms_drbd_r0 [p_drbd_r0]
Masters: [ drbd-node3 ]
Stopped: [ drbd-node1 ]
Resource Group: g_iscsi_drbd0
p_iscsi_portblock_on_drbd0 (ocf::heartbeat:portblock): Started drbd-node3
p_iscsi_ip0 (ocf::heartbeat:IPaddr2): Started drbd-node3
p_iscsi_target_drbd0 (ocf::heartbeat:iSCSITarget): Started drbd-node3
p_iscsi_lun_drbd0 (ocf::heartbeat:iSCSILogicalUnit): Started drbd-node3
p_iscsi_portblock_off_drbd0 (ocf::heartbeat:portblock): Started drbd-node3
Test the iSCSI connection again with the client used in "Testing the iSCSI Connection" above; the session is still healthy:
[root@kvm-node ~]# iscsiadm -m session
tcp: [1] 10.10.200.235:3260,1 iqn.2017-10.com.example1:edrbd0 (non-flash)
List the SCSI devices; device [3:0:0:0] is the iSCSI device:
[root@kvm-node ~]# lsscsi
[0:2:0:0] disk DELL PERC H700 2.10 /dev/sda
[0:2:1:0] disk DELL PERC H700 2.10 /dev/sdb
[0:2:2:0] disk DELL PERC H700 2.10 /dev/sdc
[0:2:3:0] disk DELL PERC H700 2.10 /dev/sdd
[0:2:4:0] disk DELL PERC H700 2.10 /dev/sde
[0:2:5:0] disk DELL PERC H700 2.10 /dev/sdf
[1:0:0:0] cd/dvd TEAC DVD-ROM DV-28SW R.2B /dev/sr0
[3:0:0:0] disk LIO-ORG p_iscsi_lun_drb 4.0 /dev/sdg
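To observe the failover from the initiator's point of view, keep I/O running against the LUN while drbd-node1 goes down; a rough sketch, assuming /dev/sdg is the LIO device from the listing above (DESTRUCTIVE: it overwrites the LUN's contents). Thanks to the portblock resources, writes should stall briefly during the switchover rather than fail, then resume:

```shell
# Write one direct-I/O megabyte per second to the iSCSI LUN and log the result;
# during failover the dd call blocks until the target is back on the new node
while true; do
    dd if=/dev/zero of=/dev/sdg bs=1M count=1 oflag=direct 2>/dev/null \
        && echo "$(date +%T) write ok" \
        || echo "$(date +%T) write FAILED"
    sleep 1
done
```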