drbd+pacemaker

weixin_33849942

Original link: http://blog.51cto.com/xz159065974/1399993

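This article describes how to use pacemaker to switch drbd roles automatically.

When drbd is combined with pacemaker, the mount point must have the same name on every node.

The mount point must be specified when the resource is defined.

Resources defined through an RA (resource agent) come in four types: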

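Resource types:
primitive (native): a primitive resource; it can run on only one node at a time
group: a group resource
clone: a clone resource
    parameters: the total number of clones, and the maximum number of clones that may run on each node
    typical uses: stonith, cluster filesystem
master/slave: a master/slave resource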

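Defining a master/slave resource:

A clone must first exist as a primitive resource; only then can it be cloned.

Likewise, to define a master/slave resource, define the primitive resource first.

After one copy has been promoted to master, the device still has to be mounted before it can be used, so the drbd master node must also run a Filesystem resource agent to mount it.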

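Clone resource options:

clone-max: how many copies of the clone may run in the whole cluster

clone-node-max: how many copies may run on each node (usually one per node)

notify: whether to notify the other nodes running copies of the clone when a copy is stopped or started

For example, in a cluster filesystem each node must notify the others when it takes a lock.

globally-unique: whether each copy of the clone gets its own unique name on the node where it runs (true/false)

ordered: whether the copies are started one after another, in order, rather than in parallel (true/false)

interleave: whether an ordering constraint between two clones waits only for the copy on the same node, rather than for every copy of the other clone (true/false)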

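master/slave clone resources

master/slave is a special kind of clone resource. With two nodes there are exactly two copies: one master and one slave.

master-max: how many copies may be promoted to master in the whole cluster

master-node-max: how many copies may be promoted to master on a single node

How to run the drbd cluster on top of a corosync cluster so that the two work together: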

Step 1: Install corosync and pacemaker

[root@node1 ~]# yum -y install corosync pacemaker

Step 2: Install crmsh.x86_64 (the configuration interface for the CRM)

Install the following packages:

crmsh-1.2.6-4.el6.x86_64.rpm pssh-2.3.1-2.el6.x86_64.rpm
[root@node1 ~]# yum -y install  crmsh-1.2.6-4.el6.x86_64.rpm pssh-2.3.1-2.el6.x86_64.rpm

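Step 3: Hand drbd over to the cluster

1. Unmount drbd0.

2. Demote every node that is primary to secondary.

3. Stop the service and disable it at boot. (In a high-availability cluster the service must not start by itself; the LRM starts it.)

[root@node2 ~]# umount /mydata/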
[root@node2 ~]# drbdadm secondary mystore
[root@node2 ~]# service drbd stop
Stopping all DRBD resources: .
[root@node2 ~]# chkconfig drbd off

Perform the steps above on both nodes.
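
The preparation steps above can be collected into a small script, shown here as a minimal sketch. The `run` wrapper and the `prepare_drbd_node` function are hypothetical helpers, not part of drbd; the mount point /mydata and the resource name mystore are taken from the commands above. By default the script only prints the commands; set DRY_RUN=0 to execute them on the node.

```shell
#!/bin/sh
# Hypothetical helper: print a command in dry-run mode, otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: hand a node's drbd resource over to cluster control.
# $1 = mount point, $2 = drbd resource name
prepare_drbd_node() {
    # umount fails harmlessly on a node where the device is not mounted
    run umount "$1"
    run drbdadm secondary "$2"
    run service drbd stop
    run chkconfig drbd off
}

prepare_drbd_node /mydata mystore
```

Running the script once on each node, first with the default dry run and then with DRY_RUN=0, keeps the two nodes in the same state before corosync takes over.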

Step 4: Configure corosync

1. Edit the corosync configuration file

Copy the sample configuration file to corosync.conf in the same directory, then edit it:

[root@node1 ~]# cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
[root@node1 ~]# vim /etc/corosync/corosync.conf

# Please read the corosync.conf.5 manual page
compatibility: whitetank

totem {
        version: 2
        # enable secure authentication
        secauth: on
        # number of threads to use
        threads: 0
        interface {
                ringnumber: 0
                # network address
                bindnetaddr: 172.16.0.0
                # multicast address
                mcastaddr: 226.94.10.122
                mcastport: 5405
                ttl: 1
        }
}

logging {
        fileline: off
        to_stderr: no
        to_logfile: yes
        # do not also log to /var/log/messages; the logfile setting below already writes to /var/log/cluster/corosync.log
        to_syslog: no
        logfile: /var/log/cluster/corosync.log   
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}

amf {
        mode: disabled
}

# added by hand: the service that corosync should start
service {
        name:   pacemaker
        ver:    0
}
# the user and group to run as
aisexec {
        user:   root
        group:  root
}
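
The bindnetaddr value is the network address of the interface that corosync should bind to, that is, the host IP address ANDed with its netmask. As a small sketch, the hypothetical function below derives it in plain shell; the host address 172.16.100.11 and the netmask 255.255.0.0 are assumed values chosen to match the 172.16.0.0 network used above.

```shell
#!/bin/sh
# Hypothetical helper: print the network address for an IPv4 address and a
# dotted netmask. $1 = IPv4 address, $2 = netmask
network_addr() {
    old_ifs=$IFS
    IFS=.
    # Split both arguments on dots: $1..$4 are the address octets,
    # $5..$8 are the netmask octets.
    set -- $1 $2
    IFS=$old_ifs
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

# Assumed host address and netmask; prints 172.16.0.0
network_addr 172.16.100.11 255.255.0.0
```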

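2. Set up corosync authentication between the nodes by copying the authkey and the configuration file to the other node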

[root@node1 corosync]# scp -p authkey corosync.conf node2.corosync.com:/etc/corosync/
root@node2.corosync.com's password: 
authkey                                                                      100%  128     0.1KB/s   00:00    
corosync.conf                                                                100%  521     0.5KB/s   00:00
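
The authkey file copied above has to exist before it can be copied. On corosync 1.x it is normally created with corosync-keygen, which writes /etc/corosync/authkey. That step is not shown in the session above, so treat the sketch below as an assumption about how the key was produced. The `check_authkey` function is a hypothetical helper that only verifies that the file is present and readable by its owner alone.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the key file exists with mode 0400.
check_authkey() {
    [ -f "$1" ] || return 1
    perms=$(ls -l "$1" | cut -c1-10)
    [ "$perms" = "-r--------" ]
}

# On the first node (left as comments because they need root and an
# installed corosync):
#   corosync-keygen
#   scp -p /etc/corosync/authkey /etc/corosync/corosync.conf node2.corosync.com:/etc/corosync/

if check_authkey /etc/corosync/authkey; then
    echo "authkey looks fine"
else
    echo "authkey missing or too permissive"
fi
```

The -p option of scp preserves the file mode, so the key should pass the same check on the second node.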

3. Start the service

Start it on both nodes:

[root@node1 corosync]# service corosync start
Starting Corosync Cluster Engine (corosync):               [  OK  ]

Check the node information:

[root@node1 corosync]# crm status
Last updated: Tue Apr 22 00:20:09 2014
Last change: Tue Apr 22 00:20:05 2014 via crmd on node1.corosync.com
Stack: classic openais (with plugin)
Current DC: node1.corosync.com - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
0 Resources configured


Online: [ node1.corosync.com node2.corosync.com ]

If only one node is started, the cluster reports that it does not have quorum, so remember to start corosync on both nodes.
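
A two-node cluster also loses quorum as soon as one node goes away. In test setups this is commonly handled by telling pacemaker to ignore the loss of quorum, and by disabling STONITH when no fencing device is configured. Neither setting appears in the session above, so the sketch below is an assumption suitable for a lab only; a production drbd cluster should use real fencing. The `run` wrapper and `set_two_node_properties` are hypothetical helpers; commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper: print a command in dry-run mode, otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Assumed lab settings for a two-node cluster without a fencing device.
set_two_node_properties() {
    run crm configure property no-quorum-policy=ignore
    run crm configure property stonith-enabled=false
}

set_two_node_properties
```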

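Step 5: Define the resources

The examples from here on use a drbd resource named web and hosts named node1.a.org and node2.a.org; substitute your own resource and host names.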

[root@node1 ~]# crm
crm(live)# configure
crm(live)configure# primitive webdrbd ocf:linbit:drbd params drbd_resource=web op monitor role=Master interval=50s timeout=30s op monitor role=Slave interval=60s timeout=30s
crm(live)configure# master MS_Webdrbd webdrbd meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"

crm(live)configure# show webdrbd
primitive webdrbd ocf:linbit:drbd \
 params drbd_resource="web" \
 op monitor interval="15s"
crm(live)configure# show MS_Webdrbd
ms MS_Webdrbd webdrbd \
 meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
crm(live)configure# verify
crm(live)configure# commit


Check the current state of the cluster:
# crm status
============
Last updated: Fri Jun 17 06:24:03 2011
Stack: openais
Current DC: node2.a.org - partition with quorum
Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
2 Nodes configured, 2 expected votes
1 Resources configured.
============

Online: [ node2.a.org node1.a.org ]

 Master/Slave Set: MS_Webdrbd
 Masters: [ node2.a.org ]
 Slaves: [ node1.a.org ]

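The output above shows that the Primary node of the drbd service is currently node2.a.org and the Secondary node is node1.a.org. You can also run the following command on node2 to verify that this host has become the Primary node for the web resource: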
# drbdadm role web
Primary/Secondary
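
The same check is easy to script. `drbdadm role` prints the local role first and the peer role second, separated by a slash, as in the output above. The hypothetical `is_primary` helper below tests only that string, so it can be tried without a running drbd.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a "local/peer" role string says that the
# local node is Primary.
is_primary() {
    case "$1" in
        Primary/*) return 0 ;;
        *)         return 1 ;;
    esac
}

# On a real node the string would come from: role=$(drbdadm role web)
role="Primary/Secondary"   # sample value copied from the output above

if is_primary "$role"; then
    echo "this node is Primary for web"
else
    echo "this node is not Primary for web"
fi
```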

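Create an automatically mounted cluster service for the web resource on the Primary node

The Master node of MS_Webdrbd is the Primary node of the drbd web resource. Its device /dev/drbd0 can be mounted and used, and a cluster service built on it needs that mount to happen automatically. Assume here that the web resource is a shared filesystem that provides web page files to a web server cluster, and that it must be mounted at /www (this directory must already exist on both nodes).

In addition, this auto-mounting cluster resource must run on the Master node of the drbd service, and it may start only after drbd has set that node to Primary. The two resources therefore need a colocation constraint and an order constraint.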

# crm
crm(live)# configure
crm(live)configure# primitive WebFS ocf:heartbeat:Filesystem params device="/dev/drbd0" directory="/www" fstype="ext3"
crm(live)configure# colocation WebFS_on_MS_webdrbd inf: WebFS MS_Webdrbd:Master
crm(live)configure# order WebFS_after_MS_Webdrbd inf: MS_Webdrbd:promote WebFS:start
crm(live)configure# verify
crm(live)configure# commit

Check the state of the resources in the cluster:
# crm status
============
Last updated: Fri Jun 17 06:26:03 2011
Stack: openais
Current DC: node2.a.org - partition with quorum
Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ node2.a.org node1.a.org ]

 Master/Slave Set: MS_Webdrbd
 Masters: [ node2.a.org ]
 Slaves: [ node1.a.org ]
 WebFS (ocf::heartbeat:Filesystem): Started node2.a.org

The output above shows that WebFS and the drbd Primary role are both on node2.a.org. On node2, copy a file into /www (the mount point); after the failover, check whether it is present in /www on node1.
# cp /etc/rc.d/rc.sysinit /www

Next, simulate a failure of node2 and see whether these resources move to node1 correctly.

Run the following commands on node2:
# crm node standby
# crm status
============
Last updated: Fri Jun 17 06:27:03 2011
Stack: openais
Current DC: node2.a.org - partition with quorum
Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Node node2.a.org: standby
Online: [ node1.a.org ]

 Master/Slave Set: MS_Webdrbd
 Masters: [ node1.a.org ]
 Stopped: [ webdrbd:0 ]
 WebFS (ocf::heartbeat:Filesystem): Started node1.a.org

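The output above shows that node2 has entered standby mode and its drbd service has stopped, while the failover has completed and all resources have moved to node1.

On node1, the data that was written to /www while node2 was the primary node is present as a copy.

Bring node2 back online: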
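
One way to confirm this on node1 is to compare the replicated file with the local original byte for byte. The sketch below uses cmp; the `same_file` function name is hypothetical, and it assumes that /etc/rc.d/rc.sysinit is identical on both nodes, which is typical for two hosts installed from the same release but not guaranteed.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the two files have identical contents.
same_file() {
    cmp -s "$1" "$2"
}

src=/etc/rc.d/rc.sysinit   # local original on node1 (assumed identical to node2's)
dst=/www/rc.sysinit        # copy written on node2 and replicated by drbd

if [ -f "$src" ] && [ -f "$dst" ]; then
    if same_file "$src" "$dst"; then
        echo "replicated copy matches"
    else
        echo "replicated copy differs"
    fi
else
    echo "nothing to compare on this host"
fi
```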
# crm node online
[root@node2 ~]# crm status
============
Last updated: Fri Jun 17 06:30:05 2011
Stack: openais
Current DC: node2.a.org - partition with quorum
Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
2 Nodes configured, 2 expected votes
2 Resources configured.
============

Online: [ node2.a.org node1.a.org ]

 Master/Slave Set: MS_Webdrbd
 Masters: [ node1.a.org ]
 Slaves: [ node2.a.org ]
 WebFS (ocf::heartbeat:Filesystem): Started node1.a.org

Reposted from: https://blog.51cto.com/xz159065974/1399993
