Building a Simple HA Service with SanLock

Basic Configuration

Three VMware virtual machines, each running CentOS 7 x86_64 with 4 local disks; the network is configured as follows:

Hostname   IP Addresses

Target     192.168.195.131   192.168.162.131

Test1      192.168.195.132   192.168.162.132

Test2      192.168.195.133   192.168.162.133

On the Target host, /dev/sdd1 is used as the LVM partition, /dev/sdd2 as the NFS partition, and /dev/sdd3 as the RAW partition.

Disable SELinux and the Firewall (all nodes)

# Show Status
$ sestatus

# Temp Disable
$ setenforce 0

$ sed -i '/SELINUX/s/enforcing/disabled/' /etc/selinux/config

$ systemctl disable firewalld.service
$ systemctl stop firewalld.service
$ systemctl status firewalld.service

$ reboot

Configuring iSCSI

Configure the Target Service (Target node)

$ yum install -y scsi-target-utils

$ vi /etc/tgt/targets.conf
<target iqn.2016-12.org.lr:lvm>
  backing-store /dev/sdd1
  initiator-address 192.168.0.0/16
</target>

<target iqn.2016-12.org.lr:nfs>
  backing-store /dev/sdd2
  initiator-address 192.168.0.0/16
</target>

<target iqn.2016-12.org.lr:raw>
  backing-store /dev/sdd3
  initiator-address 192.168.0.0/16
</target>

$ systemctl enable tgtd.service
$ systemctl restart tgtd.service
$ systemctl status tgtd.service

$ tgtadm -L iscsi -m target -o show

Configure the Initiator Clients (Test1 and Test2 nodes)

$ yum install -y iscsi-initiator-utils

$ systemctl enable iscsid.service
$ systemctl restart iscsid.service
$ systemctl status iscsid.service

$ iscsiadm -m discovery -t st -p 192.168.195.131
$ iscsiadm -m discovery -t st -p 192.168.162.131

$ iscsiadm -m node -T iqn.2016-12.org.lr:lvm -p 192.168.195.131:3260 -l
$ iscsiadm -m node -T iqn.2016-12.org.lr:lvm -p 192.168.162.131:3260 -l
$ iscsiadm -m node -T iqn.2016-12.org.lr:nfs -p 192.168.195.131:3260 -l
$ iscsiadm -m node -T iqn.2016-12.org.lr:nfs -p 192.168.162.131:3260 -l
$ iscsiadm -m node -T iqn.2016-12.org.lr:raw -p 192.168.195.131:3260 -l
$ iscsiadm -m node -T iqn.2016-12.org.lr:raw -p 192.168.162.131:3260 -l

$ iscsiadm -m node -l
192.168.195.131:3260,1 iqn.2016-12.org.lr:nfs
192.168.162.131:3260,1 iqn.2016-12.org.lr:nfs
192.168.195.131:3260,1 iqn.2016-12.org.lr:raw
192.168.162.131:3260,1 iqn.2016-12.org.lr:raw
192.168.195.131:3260,1 iqn.2016-12.org.lr:lvm
192.168.162.131:3260,1 iqn.2016-12.org.lr:lvm

Configure Multipath (Test1 and Test2 nodes)


$ yum install -y device-mapper-multipath
$ modprobe dm-multipath

$ /lib/udev/scsi_id -g -u /dev/sdf
$ /lib/udev/scsi_id -g -u /dev/sdg
$ /lib/udev/scsi_id -g -u /dev/sdh
$ /lib/udev/scsi_id -g -u /dev/sdi
$ /lib/udev/scsi_id -g -u /dev/sdj
$ /lib/udev/scsi_id -g -u /dev/sdk

$ vi /etc/multipath.conf
blacklist {
  devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
  devnode "^sd[a-d][0-9]*"
}

defaults {
  user_friendly_names yes
  path_grouping_policy multibus
  failback immediate
  no_path_retry fail
}

multipaths {
  multipath {
    wwid 360000000000000000e00000000010001
    alias lvm
  }
  multipath {
    wwid 360000000000000000e00000000020001
    alias nfs
  }
  multipath {
    wwid 360000000000000000e00000000030001
    alias raw
  }
}


$ systemctl enable multipathd.service
$ systemctl start multipathd.service
$ systemctl status multipathd.service

$ multipath -F
$ multipath -v2
$ multipath -ll
nfs (360000000000000000e00000000020001) dm-1 IET     ,VIRTUAL-DISK    
size=100G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 3:0:0:1 sdf 8:80  active ready running
  `- 4:0:0:1 sdh 8:112 active ready running
lvm (360000000000000000e00000000010001) dm-0 IET     ,VIRTUAL-DISK    
size=100G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 5:0:0:1 sdg 8:96  active ready running
  `- 6:0:0:1 sdi 8:128 active ready running
raw (360000000000000000e00000000030001) dm-4 IET     ,VIRTUAL-DISK    
size=50G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  |- 7:0:0:1 sdj 8:144 active ready running
  `- 8:0:0:1 sdk 8:160 active ready running

Configuring NFS

Configure the NFS Server (Target node)

$ mkfs.xfs /dev/sdd2
$ mkdir /mnt/nfs

$ vi /etc/fstab
/dev/sdd2    /mnt/nfs   xfs   defaults   1 1

$ mount /dev/sdd2 /mnt/nfs

$ yum -y install nfs-utils rpcbind

$ vi /etc/exports
/mnt/nfs *(rw,sync,no_root_squash,no_subtree_check)

$ systemctl start rpcbind
$ systemctl enable rpcbind
$ systemctl start nfs
$ systemctl enable nfs

Configure the NFS Clients (Test1 and Test2 nodes)

$ showmount -e 192.168.195.131
Export list for 192.168.195.131:
/mnt/nfs *

$ mkdir /mnt/nfs
$ mount -t nfs 192.168.195.131:/mnt/nfs /mnt/nfs

$ vi /etc/fstab
192.168.195.131:/mnt/nfs    /mnt/nfs   nfs   defaults   1 1

LVM Initialization (Target node)

$ pvcreate /dev/sdd1
$ vgcreate storage /dev/sdd1

Install sanlock (Test1 and Test2 nodes)

$ yum install -y sanlock sanlock-python

# Disable the watchdog (acceptable for this test setup; note that without a
# watchdog, sanlock cannot forcibly reset a host whose lease renewal has failed):
$ vi /etc/sanlock/sanlock.conf
use_watchdog = 0

$ systemctl enable sanlock && systemctl start sanlock && systemctl status sanlock

Create Lockspaces (Test1 or Test2 node)

Create a lockspace named "LS" in the NFS directory. The argument to -s has the form lockspace_name:host_id:path:offset; host_id is given as 0 here because "direct init" only writes the on-disk lease structures:

$ mkdir -pv /mnt/nfs/sanlock
$ dd if=/dev/zero bs=1048576 count=1 of=/mnt/nfs/sanlock/idLease

$ sanlock direct init -s LS:0:/mnt/nfs/sanlock/idLease:0
init done 0

$ chown sanlock:sanlock /mnt/nfs/sanlock/idLease

Create a lockspace named "RAW" on the shared storage device /dev/mapper/raw:

$ chown sanlock:sanlock /dev/mapper/raw

$ sanlock direct init -s RAW:0:/dev/mapper/raw:0
init done 0

Create a lockspace named "LVM" on an LVM logical volume backed by the shared storage /dev/mapper/lvm:

$ lvcreate -L 1M -n lockspace storage
  Rounding up size to full physical extent 4.00 MiB
  Logical volume "lockspace" created.

$ chown sanlock:sanlock /dev/storage/lockspace

$ sanlock direct init -s LVM:0:/dev/storage/lockspace:0
init done 0

# On the other node, refresh the LVM metadata and activate the LV:
$ pvscan --cache
$ lvchange -ay storage/lockspace

Create Resources (Test1 or Test2 node)

Create a resource named "leader" in the NFS directory, belonging to the lockspace "LS":

$ dd if=/dev/zero bs=1048576 count=1 of=/mnt/nfs/sanlock/leaderLease

$ sanlock direct init -r LS:leader:/mnt/nfs/sanlock/leaderLease:0
init done 0

$ chown sanlock:sanlock /mnt/nfs/sanlock/leaderLease

Create a resource named "leader" on the shared storage device /dev/mapper/raw, belonging to the lockspace "RAW". The lockspace already occupies the first 1 MiB of the device (offset 0), so the resource lease is placed right after it, at byte offset 1048576:

$ sanlock direct init -r RAW:leader:/dev/mapper/raw:1048576
init done 0

Create a resource named "leader" on an LVM logical volume backed by the shared storage /dev/mapper/lvm, belonging to the lockspace "LVM":

$ lvcreate -L 1M -n leader storage
  Rounding up size to full physical extent 4.00 MiB
  Logical volume "leader" created.

$ chown sanlock:sanlock /dev/storage/leader

$ sanlock direct init -r LVM:leader:/dev/storage/leader:0
init done 0

# On the other node, refresh the LVM metadata and activate the LV:
$ pvscan --cache
$ lvchange -ay storage/leader

Create the Test Script (Test1 and Test2 nodes)

$ vi simpleHA.py

Add the following content:

#!/usr/bin/python
import sys
import time
from multiprocessing import Process

import sanlock


def serviceMain(hostId, lockspacePath, leasePath):
    sfd = sanlock.register()  # register this process with the sanlock daemon
    while True:
        try:
            # If this host has not yet joined the sanlock lockspace, try to
            # acquire the delta lease for hostId and join the lockspace
            if not sanlock.inq_lockspace("LS", hostId, lockspacePath):
                print "Try to acquire host id LS:%s:%s:0" % (hostId,
                                                             lockspacePath)
                print time.strftime("%Y %m %Z %H:%M:%S"), "Enter Lockspace Begin"
                sanlock.add_lockspace("LS", hostId, lockspacePath)
                print time.strftime("%Y %m %Z %H:%M:%S"), "Enter Lockspace End"

            # Try to acquire the Paxos lease on the "leader" resource
            print "Try to acquire leader lease LS:leader:%s:0" % leasePath
            sanlock.acquire("LS", "leader", [(leasePath, 0)], sfd)
        except sanlock.SanlockException:
            # Could not join the lockspace or could not acquire the lease;
            # retry after 10 seconds
            print "Failed to acquire leases, try again in 10s."
            time.sleep(10)
        else:
            break  # Paxos lease acquired, stop retrying

    # We have joined the sanlock lockspace and hold the Paxos lease;
    # run the actual application service
    serve()

# A mock application service that "crashes" on its own after a while
def serve():
    for i in range(6):
        print time.strftime("%Y %m %Z %H:%M:%S"), "Service is running"
        time.sleep(10)

    print time.strftime("%Y %m %Z %H:%M:%S"), "Service crashed"

if __name__ == "__main__":
    try:
        hostId = int(sys.argv[1])
        lockspacePath = sys.argv[2]
        leasePath = sys.argv[3]
    except Exception:
        sys.stderr.write(
            "Usage: %s host_id lockspace_path lease_path\n" % sys.argv[0])
        sys.exit(1)

    # Run serviceMain in a child process; each time it exits, wait 15 seconds
    # and start it again
    while True:
        p = Process(target=serviceMain,
                    args=(hostId, lockspacePath, leasePath))
        p.start()
        p.join()
        time.sleep(15)

Make it executable:

$ chmod a+x simpleHA.py
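
The script above never gives the lease back: when serve() "crashes", the Paxos lease simply expires and the other node takes over only after the expiry delay. For a controlled shutdown the lease can also be released explicitly. Below is a minimal sketch of such a cleanup step, using sanlock.release() and sanlock.rem_lockspace() with the same "LS"/"leader" names as the NFS test; the helper name releaseLeader is only illustrative and is not part of the original script.

# releaseLeader -- minimal sketch of a clean shutdown path.
# Assumes the "LS"/"leader" names used above and that sfd is the descriptor
# returned by sanlock.register() in serviceMain.
import sanlock

def releaseLeader(hostId, lockspacePath, leasePath, sfd):
    # Give back the Paxos lease on the "leader" resource first ...
    sanlock.release("LS", "leader", [(leasePath, 0)], slkfd=sfd)
    # ... then drop this host's delta lease and leave the "LS" lockspace.
    sanlock.rem_lockspace("LS", hostId, lockspacePath)

Releasing explicitly lets the standby node acquire the lease immediately, instead of waiting out the expiry that the failover logs below show.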

Run the Test Script (Test1 and Test2 nodes)

Test on NFS

  • Test1 node:
$ ./simpleHA.py 1 /mnt/nfs/sanlock/idLease /mnt/nfs/sanlock/leaderLease
Try to acquire host id LS:1:/mnt/nfs/sanlock/idLease:0
2017 04 CST 10:13:04 Enter Lockspace Begin
2017 04 CST 10:15:45 Enter Lockspace End
Try to acquire leader lease LS:leader:/mnt/nfs/sanlock/leaderLease:0
2017 04 CST 10:15:45 Service is running
2017 04 CST 10:15:55 Service is running
2017 04 CST 10:16:05 Service is running
2017 04 CST 10:16:15 Service is running
2017 04 CST 10:16:25 Service is running
2017 04 CST 10:16:35 Service is running
2017 04 CST 10:16:45 Service crashed
Try to acquire leader lease LS:leader:/mnt/nfs/sanlock/leaderLease:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease LS:leader:/mnt/nfs/sanlock/leaderLease:0
Failed to acquire leases, try again in 10s.
  • Test2 node:
$ ./simpleHA.py 2 /mnt/nfs/sanlock/idLease /mnt/nfs/sanlock/leaderLease
Try to acquire host id LS:2:/mnt/nfs/sanlock/idLease:0
2017 04 CST 10:13:13 Enter Lockspace Begin
2017 04 CST 10:15:59 Enter Lockspace End
Try to acquire leader lease LS:leader:/mnt/nfs/sanlock/leaderLease:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease LS:leader:/mnt/nfs/sanlock/leaderLease:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease LS:leader:/mnt/nfs/sanlock/leaderLease:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease LS:leader:/mnt/nfs/sanlock/leaderLease:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease LS:leader:/mnt/nfs/sanlock/leaderLease:0
2017 04 CST 10:16:48 Service is running
2017 04 CST 10:16:58 Service is running
2017 04 CST 10:17:08 Service is running
2017 04 CST 10:17:18 Service is running

Check the current lease state:

$ sanlock direct dump /mnt/nfs/sanlock/idLease
  offset                            lockspace                                         resource  timestamp  own  gen lver
00000000                                   LS       a6f7177b-5a75-4cb4-bcac-9291f6f623ec.Test1 0000004803 0001 0009
00000512                                   LS       1ef4e72c-2672-4096-bce5-c7bb2102ea8f.Test2 0000003295 0002 0006

$ sanlock direct dump /mnt/nfs/sanlock/leaderLease
  offset                            lockspace                                         resource  timestamp  own  gen lver
00000000                                   LS                                           leader 0000004787 0001 0009 1059
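
The same record can also be read from a script. The following is a small sketch that assumes the sanlock Python binding shipped with CentOS 7 exposes read_resource() (it is not used in the original script); it returns the lockspace name, resource name and lease version, matching the lver column of the dump above.

# Minimal sketch; read_resource() availability is an assumption.
import sanlock

# Read the on-disk record of the leader lease (the same data that
# "sanlock direct dump" prints for this file).
info = sanlock.read_resource("/mnt/nfs/sanlock/leaderLease", 0)
print info   # e.g. {'lockspace': 'LS', 'resource': 'leader', 'version': ...}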

Test directly on the shared storage /dev/mapper/raw

In simpleHA.py, change every "LS" to "RAW", and change:

sanlock.acquire("RAW", "leader", [(leasePath, 0)], sfd)

to:

sanlock.acquire("RAW", "leader", [(leasePath, 1048576)], sfd)
  • Test1 node:
$ ./simpleHA.py 1 /dev/mapper/raw /dev/mapper/raw
Try to acquire host id RAW:1:/dev/mapper/raw:0
2017 04 CST 19:35:28 Enter Lockspace Begin
2017 04 CST 19:37:43 Enter Lockspace End
Try to acquire leader lease RAW:leader:/dev/mapper/raw:0
2017 04 CST 20:11:57 Service is running
2017 04 CST 20:12:07 Service is running
2017 04 CST 20:12:17 Service is running
2017 04 CST 20:12:27 Service is running
2017 04 CST 20:12:37 Service is running
2017 04 CST 20:12:47 Service is running
2017 04 CST 20:12:57 Service crashed
  • Test2 node:
$ ./simpleHA.py 2 /dev/mapper/raw /dev/mapper/raw
Try to acquire host id RAW:2:/dev/mapper/raw:0
2017 04 CST 19:35:21 Enter Lockspace Begin
2017 04 CST 19:37:45 Enter Lockspace End
Try to acquire leader lease RAW:leader:/dev/mapper/raw:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease RAW:leader:/dev/mapper/raw:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease RAW:leader:/dev/mapper/raw:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease RAW:leader:/dev/mapper/raw:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease RAW:leader:/dev/mapper/raw:0
2017 04 CST 20:12:59 Service is running
2017 04 CST 20:13:09 Service is running
2017 04 CST 20:13:19 Service is running

Check the current lease state:

$ sanlock direct dump /dev/mapper/raw:0
  offset                            lockspace                                         resource  timestamp  own  gen lver
00000000                                  RAW       ab6d9d16-54ba-4638-8d4a-f9f4c59e969c.Test1 0000000508 0001 0001
00000512                                  RAW       9ef6aac6-90a0-43ab-bc7e-736f1708b5c6.Test2 0000000479 0002 0001
01048576                                  RAW                                           leader 0000000459 0002 0001 2

$ sanlock direct dump /dev/mapper/raw:1048576
  offset                            lockspace                                         resource  timestamp  own  gen lver
00000000                                  RAW                                           leader 0000000459 0002 0001 2

Test on an LVM volume on the shared storage /dev/mapper/lvm

In simpleHA.py, change every "LS" to "LVM".

  • Test1 node:
$ ./simpleHA.py 1 /dev/storage/lockspace /dev/storage/leader
Try to acquire host id LVM:1:/dev/storage/lockspace:0
2017 04 CST 19:46:58 Enter Lockspace Begin
2017 04 CST 19:47:19 Enter Lockspace End
Try to acquire leader lease LVM:leader:/dev/storage/leader:0
2017 04 CST 19:47:19 Service is running
2017 04 CST 19:47:29 Service is running
2017 04 CST 19:47:39 Service is running
2017 04 CST 19:47:49 Service is running
2017 04 CST 19:47:59 Service is running
2017 04 CST 19:48:09 Service is running
2017 04 CST 19:48:19 Service crashed
Try to acquire leader lease LVM:leader:/dev/storage/leader:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease LVM:leader:/dev/storage/leader:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease LVM:leader:/dev/storage/leader:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease LVM:leader:/dev/storage/leader:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease LVM:leader:/dev/storage/leader:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease LVM:leader:/dev/storage/leader:0
2017 04 CST 19:49:24 Service is running
2017 04 CST 19:49:34 Service is running
2017 04 CST 19:49:44 Service is running
  • Test2 node:
$ ./simpleHA.py 2 /dev/storage/lockspace /dev/storage/leader
Try to acquire host id LVM:2:/dev/storage/lockspace:0
2017 04 CST 19:47:19 Enter Lockspace Begin
2017 04 CST 19:47:40 Enter Lockspace End
Try to acquire leader lease LVM:leader:/dev/storage/leader:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease LVM:leader:/dev/storage/leader:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease LVM:leader:/dev/storage/leader:0
2017 04 CST 19:48:21 Service is running
2017 04 CST 19:48:31 Service is running
2017 04 CST 19:48:41 Service is running
2017 04 CST 19:48:51 Service is running
2017 04 CST 19:49:01 Service is running
2017 04 CST 19:49:11 Service is running
2017 04 CST 19:49:21 Service crashed
Try to acquire leader lease LVM:leader:/dev/storage/leader:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease LVM:leader:/dev/storage/leader:0
Failed to acquire leases, try again in 10s.
Try to acquire leader lease LVM:leader:/dev/storage/leader:0
Failed to acquire leases, try again in 10s.

Check the current lease state:

$ sanlock direct dump /dev/storage/lockspace 
  offset                            lockspace                                         resource  timestamp  own  gen lver
00000000                                  LVM       91d8405e-a0e1-40c0-8705-8af8c8e93bcc.Test1 0000001758 0001 0001
00000512                                  LVM       72de2b1f-4626-42f1-b058-0137c9f982e1.Test2 0000001284 0002 0001

$ sanlock direct dump /dev/storage/leader
  offset                            lockspace                                         resource  timestamp  own  gen lver
00000000                                  LVM                                           leader 0000001243 0002 0001 2

Summary

  • The test results are the same whether the leases live on NFS, directly on a shared-storage LUN, or on an LVM volume on shared storage.
  • Joining a lockspace usually takes a fairly long time: at least about 20 seconds and up to two or three minutes, and even longer if several hosts compete for the same host ID.

Reprinted from: https://my.oschina.net/LastRitter/blog/1538793
