DRBD+heartbeat+nfs

weixin_34290390

Original link: http://blog.51cto.com/liufan0321/1123196

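I. Background:

1 Introduction:

Distributed Replicated Block Device (DRBD) is a software-based, shared-nothing storage replication solution that mirrors the contents of block devices between servers.

2 How it works:

DRBD's core functionality is implemented as a Linux kernel module that sits near the bottom of the system's I/O stack. Because it works at the block level, it cannot magically add upper-layer features, such as detecting corruption of an ext3 file system.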

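3 DRBD tools:

3.1 drbdadm: the high-level administration tool. It reads /etc/drbd.conf and issues commands to drbdsetup and drbdmeta.

3.2 drbdsetup: configures the DRBD module loaded into the kernel; rarely used directly.

3.3 drbdmeta: manages the DRBD metadata structures; rarely used directly.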

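4 DRBD modes:

4.1 Single-primary mode: the typical high-availability cluster setup.

4.2 Dual-primary mode: requires a shared cluster file system such as GFS or OCFS2. It is used when both nodes need concurrent access to the data, and it has to be configured explicitly.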

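5 Replication modes (three protocols):

5.1 Protocol A: asynchronous replication. A write is reported complete as soon as the local write has finished and the data has been placed in the send buffer, so data still in the buffer may be lost.

5.2 Protocol B: memory-synchronous (semi-synchronous) replication. A write is reported complete once the local write has finished and the data has reached the peer. Data may be lost if both nodes lose power at the same time.

5.3 Protocol C: synchronous replication. A write is reported complete only after both the local and the remote write have been confirmed. Data can be lost only if both nodes, or their disks, are irreversibly destroyed at the same time.

Protocol C is the usual choice. The choice of protocol affects replication traffic and, with it, the network latency that each write has to wait for.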
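As a quick reference, the three protocols can be summarized in a small shell helper. This is only an illustrative sketch: the function name `drbd_protocol_desc` is made up for this example and is not part of the DRBD tools.

```shell
#!/bin/sh
# Hypothetical helper (not part of DRBD): map a protocol letter to the
# point at which a write is reported as complete to the application.
drbd_protocol_desc() {
    case "$1" in
        A) echo "asynchronous: local write done, data placed in send buffer" ;;
        B) echo "semi-synchronous: local write done, data reached the peer" ;;
        C) echo "synchronous: local and peer writes both confirmed" ;;
        *) echo "unknown protocol: $1" >&2; return 1 ;;
    esac
}

drbd_protocol_desc C
```

The later the completion point, the safer the data and the longer each write waits on the network, which is the trade-off described above.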

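If the primary server goes down, the resulting loss can be severe. To keep the service running without interruption, the server needs redundancy. Among the many server redundancy solutions, heartbeat provides an inexpensive, scalable high-availability cluster solution. Here we use heartbeat and DRBD to build a high-availability (HA) cluster on Linux.
DRBD is a block device that can be used in high-availability (HA) setups. It works like RAID-1 over the network: when data is written to the local file system, it is also sent to another host on the network and recorded there in the same form. The data on the local host (primary node) and the remote host (secondary node) is kept synchronized in real time. If the local system fails, the remote host still holds an identical copy of the data, which can continue to be used. In an HA setup, DRBD can replace a shared disk array, because the data exists on both the local and the remote host. On failover, the remote host simply continues serving from its own copy.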

Example:

Node1 configuration

On node1:

[root@localhost ~]# uname -r

2.6.18-164.el5

[root@localhost ~]# vim /etc/sysconfig/network

NETWORKING=yes

NETWORKING_IPV6=no

HOSTNAME=node1.a.com

[root@localhost ~]# vim /etc/hosts

# Do not remove the following line, or various programs

# that require network functionality will fail.

127.0.0.1 localhost.localdomain localhost

::1 localhost6.localdomain6 localhost6

192.168.145.99 node1.a.com

192.168.145.100 node2.a.com
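Both DRBD (the `on <hostname>` sections) and heartbeat (the `node` directives) identify nodes by the name that `uname -n` reports, so it is worth confirming that both names resolve on each node. The following is a minimal sketch; the helper name `hosts_has_all` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every given host name appears
# as a name or alias on a non-comment line of the hosts file.
hosts_has_all() {
    hosts_file=$1; shift
    for name in "$@"; do
        awk -v n="$name" '
            /^[[:space:]]*#/ { next }              # skip comment lines
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit(found ? 0 : 1) }
        ' "$hosts_file" || { echo "missing: $name" >&2; return 1; }
    done
}

# Usage on a real node (file path and names taken from the example above):
#   hosts_has_all /etc/hosts node1.a.com node2.a.com && echo "hosts OK"
```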

Create the partition:

On node1:

[root@node1 ~]# fdisk /dev/sdb

Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel

Building a new DOS disklabel. Changes will remain in memory only,

until you decide to write them. After that, of course, the previous

content won't be recoverable.

The number of cylinders for this disk is set to 2610.

There is nothing wrong with that, but this is larger than 1024,

and could in certain setups cause problems with:

1) software that runs at boot time (e.g., old versions of LILO)

2) booting and partitioning software from other OSs

(e.g., DOS FDISK, OS/2 FDISK)

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n

Command action

e extended

p primary partition (1-4)

Partition number (1-4):

Value out of range.

Partition number (1-4): 1

First cylinder (1-2610, default 1):

Using default value 1

Last cylinder or +size or +sizeM or +sizeK (1-2610, default 2610):

Using default value 2610

Command (m for help): p

Disk /dev/sdb: 21.4 GB, 21474836480 bytes

255 heads, 63 sectors/track, 2610 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System

/dev/sdb1 1 2610 20964793+ 83 Linux

Command (m for help): w

The partition table has been altered!

Calling ioctl() to re-read partition table.

Syncing disks.
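The block count that fdisk prints for /dev/sdb1 can be reproduced from the disk geometry. The sketch below uses the values from the output above and assumes that the partition starts at sector 63 (after the first track), which is typical for this fdisk generation but is not shown in the transcript.

```shell
#!/bin/sh
# Geometry values copied from the fdisk output above.
cylinders=2610
sectors_per_cylinder=16065      # 255 heads * 63 sectors/track
sector_bytes=512
start_sector=63                 # ASSUMPTION: partition 1 starts after the first track

total_sectors=$(( cylinders * sectors_per_cylinder ))
part_sectors=$(( total_sectors - start_sector ))

echo "cylinder-aligned bytes: $(( total_sectors * sector_bytes ))"
echo "sdb1 1K-blocks:         $(( part_sectors * sector_bytes / 1024 ))"
echo "odd sector left over:   $(( part_sectors % 2 ))"
```

Under that assumption the result is 20964793 one-kilobyte blocks with one sector left over, which is what the trailing "+" in "20964793+" indicates.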

On node1, install the DRBD packages:

[root@node1 ~]# rpm -ivh drbd83-8.3.8-1.el5.centos.i386.rpm

warning: drbd83-8.3.8-1.el5.centos.i386.rpm: Header V3 DSA signature: NOKEY, key ID e8562897

Preparing... ########################################### [100%]

1:drbd83 ########################################### [100%]

[root@node1 ~]# rpm -ivh kmod-drbd83-8.3.8-1.el5.centos.i686.rpm

warning: kmod-drbd83-8.3.8-1.el5.centos.i686.rpm: Header V3 DSA signature: NOKEY, key ID e8562897

Preparing... ########################################### [100%]

1:kmod-drbd83 ########################################### [100%]

View the main configuration file:

[root@node1 ~]# vim /etc/drbd.conf

Edit global_common.conf; it is best to back it up before editing:

[root@node1 ~]# cd /etc/drbd.d/

[root@node1 drbd.d]# ll

total 4

-rwxr-xr-x 1 root root 1418 Jun 4 2010 global_common.conf

[root@node1 drbd.d]# cp -p global_common.conf global_common.conf.bak

[root@node1 drbd.d]# vim global_common.conf

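global {
    usage-count yes;
    # minor-count dialog-refresh disable-ip-verification
}

common {
    protocol C;                      # protocol C: synchronous replication

    startup {
        wfc-timeout 120;             # seconds to wait for the peer to connect at startup
        degr-wfc-timeout 120;        # same wait when the cluster was already degraded
    }

    disk {
        on-io-error detach;          # detach the backing disk when an I/O error occurs
        fencing resource-only;
    }

    net {
        cram-hmac-alg "sha1";        # HMAC algorithm for peer authentication (sha1)
        shared-secret "mydrbdlab";   # shared secret used for peer authentication
    }

    syncer {
        rate 100M;                   # resynchronization rate limit
    }
}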

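Define the resource manually:

[root@node1 drbd.d]# vim /etc/drbd.d/web.res

resource web {                           # resource name
    on node1.a.com {                     # node host name, as reported by uname -n
        device    /dev/drbd0;            # DRBD logical device name
        disk      /dev/sdb1;             # local backing device that DRBD replicates
        address   192.168.145.99:7789;   # IP address and port of node1
        meta-disk internal;
    }

    on node2.a.com {                     # node host name, as reported by uname -n
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   192.168.145.100:7789;  # IP address and port of node2
        meta-disk internal;
    }
}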

Initialize the resource web:

8 16 20971520 sdb

[root@node1 drbd.d]# drbdadm create-md web

Writing meta data...

initializing activity log

NOT initialized bitmap

New drbd meta data block successfully created.

Start the service:

[root@node1 drbd.d]# service drbd start

Check which node is active:

[root@node1 drbd.d]# cat /proc/drbd

version: 8.3.8 (api:88/proto:86-94)

GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16

0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r----

ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:2080284

Both nodes are in the Secondary role and nothing has been synchronized yet, so one node has to be promoted to primary by hand.
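The cs (connection state), ro (roles) and ds (disk states) fields of the device line can also be picked out with a short script, which is handy for monitoring. This is a sketch only; `drbd_field` is a hypothetical helper, and the sample line is copied from the output above.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one "key:value" field
# (cs, ro or ds) from a /proc/drbd device line read on stdin.
drbd_field() {
    tr ' ' '\n' | awk -F: -v k="$1" '$1 == k { print $2; exit }'
}

# Sample line copied from the output above.
line=' 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r----'

echo "$line" | drbd_field cs
echo "$line" | drbd_field ro
echo "$line" | drbd_field ds
```

In each local/peer pair, the value before the slash describes the local node and the value after it describes the peer.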

Node2 configuration

On node2:

[root@localhost ~]# uname -r

2.6.18-164.el5

[root@localhost ~]# vim /etc/sysconfig/network

NETWORKING=yes

NETWORKING_IPV6=no

HOSTNAME=node2.a.com

[root@localhost ~]# vim /etc/hosts

# Do not remove the following line, or various programs

# that require network functionality will fail.

127.0.0.1 localhost.localdomain localhost

::1 localhost6.localdomain6 localhost6

192.168.145.99 node1.a.com

192.168.145.100 node2.a.com

Create the partition:

[root@node2 ~]# fdisk /dev/sdb

Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel

Building a new DOS disklabel. Changes will remain in memory only,

until you decide to write them. After that, of course, the previous

content won't be recoverable.

The number of cylinders for this disk is set to 2610.

There is nothing wrong with that, but this is larger than 1024,

and could in certain setups cause problems with:

1) software that runs at boot time (e.g., old versions of LILO)

2) booting and partitioning software from other OSs

(e.g., DOS FDISK, OS/2 FDISK)

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n

Command action

e extended

p primary partition (1-4)

Partition number (1-4): 1

First cylinder (1-2610, default 1):

Using default value 1

Last cylinder or +size or +sizeM or +sizeK (1-2610, default 2610):

Using default value 2610

Command (m for help): p

Disk /dev/sdb: 21.4 GB, 21474836480 bytes

255 heads, 63 sectors/track, 2610 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System

/dev/sdb1 1 2610 20964793+ 83 Linux

Command (m for help): w

The partition table has been altered!

Calling ioctl() to re-read partition table.

Syncing disks.

[root@node2 ~]# rpm -ivh drbd83-8.3.8-1.el5.centos.i386.rpm

warning: drbd83-8.3.8-1.el5.centos.i386.rpm: Header V3 DSA signature: NOKEY, key ID e8562897

Preparing... ########################################### [100%]

1:drbd83 ########################################### [100%]

[root@node2 ~]# rpm -ivh kmod-drbd83-8.3.8-1.el5.centos.i686.rpm

warning: kmod-drbd83-8.3.8-1.el5.centos.i686.rpm: Header V3 DSA signature: NOKEY, key ID e8562897

Preparing... ########################################### [100%]

1:kmod-drbd83 ########################################### [100%]

[root@node2 ~]# cd /etc/drbd.d/

[root@node2 drbd.d]# ll

total 4

-rwxr-xr-x 1 root root 1418 Jun 4 2010 global_common.conf

[root@node2 drbd.d]# cp -p global_common.conf global_common.conf.bak

[root@node2 drbd.d]# vim global_common.conf

global {
    usage-count yes;
    # minor-count dialog-refresh disable-ip-verification
}

common {
    protocol C;

    startup {
        wfc-timeout 120;
        degr-wfc-timeout 120;
    }

    disk {
        on-io-error detach;
        fencing resource-only;
    }

    net {
        cram-hmac-alg "sha1";
        shared-secret "mydrbdlab";
    }

    syncer {
        rate 100M;
    }
}

[root@node2 drbd.d]# vim /etc/drbd.d/web.res

resource web {
    on node1.a.com {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   192.168.145.99:7789;
        meta-disk internal;
    }

    on node2.a.com {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   192.168.145.100:7789;
        meta-disk internal;
    }
}

8 16 20971520 sdb

[root@node2 drbd.d]# drbdadm create-md web

Writing meta data...

initializing activity log

NOT initialized bitmap

New drbd meta data block successfully created.

Start DRBD:

[root@node2 drbd.d]# service drbd start

Promote node1 to primary:

[root@node1 drbd.d]# drbdadm -- --overwrite-data-of-peer primary web

Check on node1:

[root@node1 drbd.d]# drbd-overview

0:web SyncSource Primary/Secondary UpToDate/Inconsistent C r----

[====>...............] sync'ed: 25.4% (1554108/2080284)K delay_probe: 40

Check on node2:

[root@node2 drbd.d]# drbd-overview

0:web SyncTarget Secondary/Primary Inconsistent/UpToDate C r----

[=============>......] sync'ed: 71.5% (595644/2080284)K queue_delay: 0.1 ms
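The initial synchronization should be allowed to finish before the cluster is relied on. A minimal sketch for reading the progress from the status text follows; `sync_percent` is a hypothetical helper, and the sample line is copied from the node1 output above.

```shell
#!/bin/sh
# Hypothetical helper: extract the "sync'ed" percentage from
# drbd-overview (or /proc/drbd) text read on stdin.
sync_percent() {
    sed -n "s/.*sync'ed:[[:space:]]*\([0-9.]*\)%.*/\1/p" | head -n 1
}

sample="[====>...............] sync'ed: 25.4% (1554108/2080284)K delay_probe: 40"
echo "$sample" | sync_percent

# On a real node one might poll until no resync line is reported:
#   while drbd-overview | grep -q "sync'ed"; do sleep 5; done
```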

node1 is now the primary node.

Create the file system (on the primary node only):

[root@node1 drbd.d]# mkfs -t ext3 -L drbdweb /dev/drbd0

[root@node1 drbd.d]# mkdir /web

[root@node1 drbd.d]# mount /dev/drbd0 /web/

[root@node1 drbd.d]# cd /web

[root@node1 web]# echo "hello">index.html

[root@node1 web]# ll

total 20

-rw-r--r-- 1 root root 6 Jan 20 07:31 index.html

drwx------ 2 root root 16384 Jan 20 07:28 lost+found

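Test a manual switchover: demote node1 to secondary and promote node2 to primary. On node1, /web must first be unmounted and the resource demoted (drbdadm secondary web); on node2, the resource must be promoted (drbdadm primary web) before the device can be mounted.

On node2: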

[root@node2 ~]# mount /dev/drbd0 /web

[root@node2 ~]# cd /web

[root@node2 web]# ll

total 20

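-rw-r--r-- 1 root root 6 Jan 20 07:31 index.html

drwx------ 2 root root 16384 Jan 20 07:28 lost+found

The file index.html created on node1 is visible on node2. Likewise, anything created on node2 will be visible on node1 once the roles are switched back: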

[root@node2 web]# mkdir liufan

[root@node2 web]# ll

total 24

-rw-r--r-- 1 root root 6 Jan 20 07:31 index.html

drwxr-xr-x 2 root root 4096 Jan 20 07:37 liufan

drwx------ 2 root root 16384 Jan 20 07:28 lost+found
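The manual switchover tested above follows a fixed order: the old primary releases the device before the new primary takes it. The sketch below only prints that order and executes nothing; `switchover_plan` and the "old-primary"/"new-primary" labels are made up for this example, while the drbdadm subcommands are the standard ones.

```shell
#!/bin/sh
# Hypothetical helper: print, in order, the commands for a manual
# switchover of one DRBD resource. It only prints; nothing is executed.
switchover_plan() {
    res=$1; dev=$2; mnt=$3
    echo "old-primary: umount $mnt"
    echo "old-primary: drbdadm secondary $res"
    echo "new-primary: drbdadm primary $res"
    echo "new-primary: mount $dev $mnt"
}

switchover_plan web /dev/drbd0 /web
```

In single-primary mode the device can be mounted on only one node at a time, which is why the unmount and demotion come first.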

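The experiment above shows that a node's data can be accessed only after that node has been manually promoted to primary. That is clearly not automatic enough. With heartbeat we can manage DRBD as a resource, so that the switch between the active node and the standby node happens automatically.

Install NFS on both nodes and export the partition so that clients can access its contents.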

1 Mount the installation disc on both nodes

# mkdir /mnt/cdrom

# mount /dev/cdrom /mnt/cdrom

2 Edit the NFS export list on both nodes

On node1:

[root@node1 ~]# vim /etc/exports

/web 192.168.145.0/24(rw,sync)

On node2:

[root@node2 ~]# vim /etc/exports

/web 192.168.145.0/24(rw,sync)

[root@node1 ~]# exportfs -rv    # re-export everything listed in /etc/exports

exporting 192.168.145.0/24:/web

[root@node2 ~]# exportfs -rv

exporting 192.168.145.0/24:/web

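3 Modify the NFS init script on both nodes

# vim /etc/rc.d/init.d/nfs

On line 122 of this file, set the signal sent to nfsd to 9 (SIGKILL) so that nfsd is stopped immediately:

killproc nfsd -9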
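The same edit can be scripted so that it is identical on both nodes. This sketch assumes that the stock line reads `killproc nfsd -2`; check the file first, since the original text does not show the line before the change. `patch_nfs_init` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: switch the signal that the nfs init script sends
# to nfsd. ASSUMPTION: the stock line reads "killproc nfsd -2".
# A backup copy of the file is kept with a .bak suffix.
patch_nfs_init() {
    sed -i.bak 's/killproc nfsd -2/killproc nfsd -9/' "$1"
}

# Usage on a real node:
#   patch_nfs_init /etc/rc.d/init.d/nfs
#   grep -n 'killproc nfsd' /etc/rc.d/init.d/nfs
```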

4 Start the service

# service nfs start

5 Configure the yum repositories on both nodes

# vim /etc/yum.repos.d/rhel-debuginfo.repo

[rhel-server]
name=Red Hat Enterprise Linux server
baseurl=file:///mnt/cdrom/Server
enabled=1
gpgcheck=1
gpgkey=file:///mnt/cdrom/RPM-GPG-KEY-redhat-release

[rhel-cluster]
name=Red Hat Enterprise Linux cluster
baseurl=file:///mnt/cdrom/Cluster
enabled=1
gpgcheck=1
gpgkey=file:///mnt/cdrom/RPM-GPG-KEY-redhat-release

6 Install heartbeat on both nodes

Upload the heartbeat packages:

heartbeat-2.1.4-9.el5.i386.rpm

heartbeat-pils-2.1.4-10.el5.i386.rpm

heartbeat-stonith-2.1.4-10.el5.i386.rpm

libnet-1.1.4-3.el5.i386.rpm

perl-MailTools-1.77-1.el5.noarch.rpm

# ll

total 3084

-rw------- 1 root root 885 Nov 20 01:24 anaconda-ks.cfg

drwxr-xr-x 2 root root 4096 Nov 19 17:28 Desktop

-rwxrw-rw- 1 root root 221868 May 5 2012 drbd83-8.3.8-1.el5.centos.i386.rpm

-rwxrw-rw- 1 root root 1637238 Mar 13 2010 heartbeat-2.1.4-9.el5.i386.rpm

-rwxrw-rw- 1 root root 293349 Mar 13 2010 heartbeat-devel-2.1.4-9.el5.i386.rpm

-rwxrw-rw- 1 root root 230890 Mar 13 2010 heartbeat-gui-2.1.4-9.el5.i386.rpm

-rwxrw-rw- 1 root root 111742 Mar 13 2010 heartbeat-ldirectord-2.1.4-9.el5.i386.rpm

-rwxrw-rw- 1 root root 92070 Mar 13 2010 heartbeat-pils-2.1.4-10.el5.i386.rpm

-rwxrw-rw- 1 root root 179199 Mar 13 2010 heartbeat-stonith-2.1.4-10.el5.i386.rpm

-rw-r--r-- 1 root root 28538 Nov 20 01:24 install.log

-rw-r--r-- 1 root root 3812 Nov 20 01:22 install.log.syslog

-rwxrw-rw- 1 root root 125974 May 5 2012 kmod-drbd83-8.3.8-1.el5.centos.i686.rpm

-rwxrw-rw- 1 root root 56817 Mar 13 2010 libnet-1.1.4-3.el5.i386.rpm

-rwxrw-rw- 1 root root 92071 Mar 13 2010 perl-MailTools-1.77-1.el5.noarch.rpm

Install on both nodes:

# yum localinstall heartbeat-2.1.4-9.el5.i386.rpm heartbeat-pils-2.1.4-10.el5.i386.rpm heartbeat-stonith-2.1.4-10.el5.i386.rpm libnet-1.1.4-3.el5.i386.rpm perl-MailTools-1.77-1.el5.noarch.rpm --nogpgcheck -y

7 Copy the sample files and configure them on both nodes

# cd /usr/share/doc/heartbeat-2.1.4/

# cp -p authkeys haresources ha.cf /etc/ha.d/

# cd /etc/ha.d/

# vim ha.cf

bcast eth0

node node1.a.com

node node2.a.com

# vim authkeys

#auth 1

#1 crc

#2 sha1 HI!

#3 md5 Hello!

auth 3

3 md5 hellp

# chmod 600 authkeys

The permissions must be changed to 600; otherwise heartbeat may fail to start.
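Instead of a guessable key such as the one above, a random key can be generated. The following sketch writes an authkeys file with the same method (md5) and key index (3) and sets the required permissions; `write_authkeys` is a hypothetical helper, and the key source (/dev/urandom read with od) is a choice made for this example.

```shell
#!/bin/sh
# Hypothetical helper: write a heartbeat authkeys file with a random
# md5 key (same method and key index 3 as above), readable by owner only.
write_authkeys() {
    # 16 random bytes printed as 32 hexadecimal characters.
    key=$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')
    ( umask 077; printf 'auth 3\n3 md5 %s\n' "$key" > "$1" )
    chmod 600 "$1"
}

# Usage on a real node (copy the same file to the peer afterwards):
#   write_authkeys /etc/ha.d/authkeys
```

Both nodes must use an identical authkeys file, so generate it once and copy it rather than running the helper on each node.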

# vim haresources

node1.a.com IPaddr::192.168.145.101/24/eth0 drbddisk::web Filesystem::/dev/drbd0::/web::ext3 killnfsd
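A haresources line names the preferred node first, followed by the resources, each written as a script name with its arguments separated by `::`. Heartbeat starts the resources from left to right and stops them from right to left. The sketch below just splits the line for readability; `explain_haresources` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: split one haresources line into the preferred
# node and its resources, one "script arg1 arg2 ..." entry per line.
explain_haresources() {
    # $1 is the whole line; the unquoted expansion splits it on whitespace.
    set -- $1
    echo "preferred node: $1"; shift
    for r in "$@"; do
        echo "resource: $(echo "$r" | sed 's/::/ /g')"
    done
}

explain_haresources 'node1.a.com IPaddr::192.168.145.101/24/eth0 drbddisk::web Filesystem::/dev/drbd0::/web::ext3 killnfsd'
```

Read this way, the line brings up the virtual IP 192.168.145.101, promotes the DRBD resource web, mounts /dev/drbd0 on /web as ext3, and finally runs the killnfsd script.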

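# vim resource.d/killnfsd

#!/bin/bash
# Kill any remaining nfsd processes, then restart the NFS service.
killall -9 nfsd
/etc/init.d/nfs restart
exit 0

The script must be executable so that heartbeat can run it:

# chmod 755 resource.d/killnfsd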

Start heartbeat on both nodes:

# service heartbeat start

Client test:

When heartbeat is stopped on node1, the test machine can still access the NFS share:

[root@node1 ha.d]# service heartbeat stop

Stopping High-Availability services:

[ OK ]

[root@node1 ha.d]#
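To watch the failover from the client side, the export can be probed repeatedly while heartbeat is stopped on node1. The following is a sketch under assumptions: `wait_until` is a hypothetical helper, and the probe shown in the usage comment queries the virtual IP from haresources with showmount.

```shell
#!/bin/sh
# Hypothetical helper: run a probe command up to N times, one second
# apart, and succeed as soon as the probe succeeds.
wait_until() {
    tries=$1; shift
    while [ "$tries" -gt 0 ]; do
        "$@" >/dev/null 2>&1 && return 0
        tries=$(( tries - 1 ))
        [ "$tries" -gt 0 ] && sleep 1
    done
    return 1
}

# Usage from the client (192.168.145.101 is the virtual IP from haresources):
#   wait_until 60 showmount -e 192.168.145.101 && echo "export reachable"
```

Clients should mount the share through the virtual IP rather than a node's own address, since only the virtual IP moves to the surviving node.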
