Note: this content draws on posts by Vbird (鸟哥) and other forum users, and is published after I verified the steps myself. Thanks to Vbird and the community for their technical write-ups.
Questions and discussion welcome: QQ 346456
1 DRBD installation and configuration
1.
2.
3. Edit /etc/hosts on both machines. Make sure it lists the hostnames and IP addresses of both the primary and the secondary server, for example:
192.168.1.101 host01
192.168.1.102 host02
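As a quick optional check, confirm that each host can resolve and reach the other by name (run the mirror-image command on host02):
ping -c 3 host02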
4.
5.
6.
7. Install DRBD
7.1
7.2
7.3
7.4
7.5
7.6
7.7
7.8
7.9
7.10 chkconfig --add drbd
7.11 chkconfig drbd on
Load the DRBD kernel module:
7.12 modprobe drbd
Check that the DRBD module is loaded into the kernel:
7.13 lsmod | grep drbd
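If the module is loaded, lsmod prints a line beginning with "drbd", and /proc/drbd exists; its first line reports the module version (8.4.3 in this setup):
cat /proc/drbd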
8. Configure DRBD
8.1 Edit the DRBD configuration (/etc/drbd.conf, or a resource file such as /etc/drbd.d/r0.res) on both nodes.
Add the following content (the timeout and rate values, and port 7788, are typical examples; adjust them to your environment):
resource r0 {
    protocol C;
    startup { wfc-timeout 120; degr-wfc-timeout 120; }  # example timeouts
    disk    { on-io-error detach; }
    net     { }                                         # left at defaults
    syncer  { rate 100M; }                              # example sync-rate cap
    on host01 {
        device /dev/drbd0; disk /dev/sdb1;
        address 192.168.1.101:7788; meta-disk internal;
    }
    on host02 {
        device /dev/drbd0; disk /dev/sdb1;
        address 192.168.1.102:7788; meta-disk internal;
    }
}
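The DRBD configuration must be identical on both nodes, so copy it to the peer instead of retyping it (adjust the path if you used a file under /etc/drbd.d/):
scp /etc/drbd.conf host02:/etc/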
8.2 Create the device metadata on the first node:
drbdadm create-md r0
If metadata creation fails (typically because the partition already carries a filesystem signature), zero out the start of the partition with dd and run create-md again:
dd if=/dev/zero of=/dev/sdb1 bs=1M count=100
After a short wait, "success" indicates that the DRBD metadata block was created successfully:
----------------
Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
As with nodes, we count the total number of devices mirrored by DRBD
at http://usage.drbd.org.
The counter works anonymously. It creates a random number to identify
the device and sends that random number, along with the kernel and
DRBD version, to usage.drbd.org.
http://usage.drbd.org/cgi-bin/insert_usage.pl?
nu=716310175600466686&ru=15741444353112217792&rs=1085704704
* If you wish to opt out entirely, simply enter 'no'.
* To continue, just press [RETURN]
success
----------------
8.3 Create the metadata on the second node in the same way; if prompted, type 'yes' to confirm:
----------------
[need to type 'yes' to confirm] yes
Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
----------------
9. Start DRBD
DRBD has to be started on both the primary and the secondary node (at roughly the same time) for the connection to come up:
service drbd start
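Because the init script waits for the peer to connect, it is convenient to kick off the peer's start in the background first (this sketch assumes root ssh access from host01 to host02):
ssh host02 'service drbd start' &
service drbd start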
10. Check the DRBD status
# service drbd status
----------------
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a
2013-05-27 20:45:19
m:res  cs         ro                   ds                         p  mounted  fstype
0:r0   Connected  Secondary/Secondary  Inconsistent/Inconsistent  C
----------------
Here ro:Secondary/Secondary means both hosts are in the secondary role; ds is the disk state, and it reads "Inconsistent" because DRBD cannot tell which side is the primary, i.e. which side's disk data should be treated as authoritative.
If ro shows Secondary/Unknown, the two machines are not connected properly.
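The same fields can be read straight from /proc/drbd, which is handy when comparing the two nodes:
cat /proc/drbd    # look for cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent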
11. Promote one node to primary
On the node that should act as the primary, force it into the primary role; this kicks off the initial full sync (--force is needed because both disks are still Inconsistent):
drbdsetup /dev/drbd0 primary --force   (or: drbdadm primary --force r0)
Then check the DRBD status on both nodes again:
service drbd status
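The initial full sync now runs in the background; you can watch it until ds reaches UpToDate/UpToDate:
watch -n1 cat /proc/drbd    # sync progress is shown as a percentage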
12. Create a filesystem and mount it (on the primary node only)
mkfs.ext4 /dev/drbd0
mkdir /data_drbd    # create the mount point first if it does not exist
mount /dev/drbd0 /data_drbd
Note: the secondary node's DRBD device is not usable — it can be neither written nor read — so all reads and writes must happen on the primary node.
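You can verify this yourself: the same mount attempted on the secondary node simply fails (the exact error text varies by distribution):
mount /dev/drbd0 /data_drbd    # on the secondary this is refused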
13. Test a manual switchover
On the primary:
cd /data_drbd
touch host01
cd ..
umount /data_drbd
drbdsetup /dev/drbd0 secondary
On the secondary:
drbdsetup /dev/drbd0 primary   (or: drbdadm primary r0)
mount /dev/drbd0 /data_drbd
cd /data_drbd
touch host02
ls
Both files, host01 and host02, are listed: the directory shows everything written from either node.
Note: right after the initial primary/secondary configuration, both DRBD nodes default to the secondary role, and a secondary is read-only, so mounting /dev/drbd0 at that point produces an error; you must activate one node as primary before mounting. A mount attempted while the node is still secondary is simply refused.
2 Heartbeat installation and configuration
2.1 Required packages
heartbeat-3.0.4, heartbeat-devel, heartbeat-libs, Cluster-glue-libs, cluster-glue (libnet, PyXML, perl-TimeDate.noarch, Quota and cifs-utils are included on the installation DVD; the remaining packages can be downloaded from rpmfind.net).
It is recommended to install the rpm packages with yum install: when a package pulls in dependencies, the system checks the configured yum repositories first and installs them automatically if they are available there.
2.2 Install heartbeat
yum install xxxx... (process omitted; the yum repository and the IP addresses need to be configured beforehand)
2.3 Configure heartbeat
Edit /etc/ha.d/ha.cf and add:
debugfile /var/log/ha-debug    # file for heartbeat debug messages
logfile /var/log/ha-log    # file for heartbeat log messages
logfacility local0    # syslog facility
keepalive 2    # heartbeat (monitoring) interval; the default unit is seconds
warntime 10    # late-heartbeat warning time, usually half of deadtime
deadtime 30    # declare the peer dead after 30 seconds without a heartbeat
initdead 60    # startup grace period, at least twice deadtime
hopfudge 1    # optional: for ring topologies, total number of hops between cluster nodes
udpport 694    # use UDP port 694
bcast eth0    # broadcast heartbeats on eth0
ucast eth0 192.168.1.102    # unicast heartbeats; the IP is the peer host's
### on RHEL 6.4 heartbeat may fail to start, possibly because the kernel version is too old; verified working on CentOS 6.6
auto_failback on    # on: when the resource's preferred owner recovers, the resource migrates back to it
node host01    # cluster node; the name must match uname -n
node host02    # node 2
ping 192.168.1.254    # ping a node outside the cluster (a third IP, used as an arbitration point) to test network connectivity
respawn root /usr/lib64/heartbeat/ipfail
apiauth ipfail gid=root uid=root    # permissions for the specified spawned process
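ha.cf is needed on both nodes, and since the ucast line always points at the peer, that single line differs on host02 (everything else can be copied verbatim):
ucast eth0 192.168.1.101    # host02's copy points back at host01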
Edit /etc/ha.d/haresources and add:
host01 IPaddr::192.168.1.200/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/data_drbd::ext4 killnfsd
## 192.168.1.200 is the virtual IP that heartbeat manages
## drbddisk manages DRBD resource r0; the filesystem is /dev/drbd0, mounted at /data_drbd with format ext4; killnfsd is the NFS restart script
Edit /etc/ha.d/resource.d/killnfsd and add:
killall -9 nfsd; /etc/init.d/nfs restart; exit 0
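Heartbeat can only run this script if it is executable, so set the execute bit:
chmod 755 /etc/ha.d/resource.d/killnfsd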
Edit /etc/ha.d/authkeys and add:
auth 1
1 crc
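heartbeat refuses to start if authkeys is readable by anyone other than root, so tighten the permissions:
chmod 600 /etc/ha.d/authkeys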
After a yum install, the drbddisk resource script has to be created and edited by hand:
vi /etc/ha.d/resource.d/drbddisk
#!/bin/bash
#
# This script is intended to be used as a resource script by heartbeat
#
# Copyright 2003-2008 LINBIT Information Technologies
# Philipp Reisner, Lars Ellenberg
#
###
DEFAULTFILE="/etc/default/drbd"
DRBDADM="/sbin/drbdadm"
if [ -f $DEFAULTFILE ]; then
. $DEFAULTFILE
fi
if [ "$#" -eq 2 ]; then
RES="$1"
CMD="$2"
else
RES="all"
CMD="$1"
fi
drbd_set_role_from_proc_drbd()
{
local out
if ! test -e /proc/drbd; then
ROLE="Unconfigured"
return
fi
dev=$( $DRBDADM sh-dev $RES )
minor=${dev#/dev/drbd}
if [[ $minor = *[!0-9]* ]] ; then
# sh-minor is only supported since drbd 8.3.1
minor=$( $DRBDADM sh-minor $RES )
fi
if [[ -z $minor ]] || [[ $minor = *[!0-9]* ]] ; then
ROLE=Unknown
return
fi
if out=$(sed -ne "/^ *$minor: cs:/ { s/:/ /g; p; q; }" /proc/drbd); then
set -- $out
ROLE=${5%/**}
: ${ROLE:=Unconfigured} # if it does not show up
else
ROLE=Unknown
fi
}
case "$CMD" in
start)
# try several times, in case heartbeat deadtime
# was smaller than drbd ping time
try=6
while true; do
$DRBDADM primary $RES && break
let "--try" || exit 1 # LSB generic error
sleep 1
done
;;
stop)
# heartbeat (haresources mode) will retry failed stop
# for a number of times in addition to this internal retry.
try=3
while true; do
$DRBDADM secondary $RES && break
# We used to lie here, and pretend success for anything != 11,
# to avoid the reboot on failed stop recovery for "simple
# config errors" and such. But that is incorrect.
# Don't lie to your cluster manager.
# And don't do config errors...
let --try || exit 1 # LSB generic error
sleep 1
done
;;
status)
if [ "$RES" = "all" ]; then
echo "A resource name is required for status inquiries."
exit 10
fi
ST=$( $DRBDADM role $RES )
ROLE=${ST%/**}
case $ROLE in
Primary|Secondary|Unconfigured)
# expected
;;
*)
drbd_set_role_from_proc_drbd
esac
case $ROLE in
Primary)
echo "running (Primary)"
exit 0 # LSB status "service is OK"
;;
Secondary|Unconfigured)
echo "stopped ($ROLE)"
exit 3 # LSB status "service is not running"
;;
*)
echo "cannot determine status, may be running ($ROLE)"
exit 4 # LSB status "service status is unknown"
;;
esac
;;
*)
echo "Usage: drbddisk [resource] {start|stop|status}"
exit 1
;;
esac
exit 0
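Make this script executable as well, and give it a quick standalone test (r0 is the resource defined in drbd.conf; the status output follows the LSB wording in the script above):
chmod 755 /etc/ha.d/resource.d/drbddisk
/etc/ha.d/resource.d/drbddisk r0 status    # prints "running (Primary)" or "stopped (Secondary)"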
3 NFS configuration
3.1 Install the NFS server packages (nfs-utils and rpcbind).
3.2 Edit /etc/exports and add:
/data_drbd 192.168.1.0/24(rw,no_root_squash)
## /data_drbd is the NFS shared directory
## 192.168.1.0/24 is the client network allowed to mount the share
## rw grants read and write access
## no_root_squash: the default (root_squash) maps a client's root user to nobody when accessing the share; with no_root_squash, root keeps root privileges. With the default, root would have no permission on the NFS directory.
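After editing /etc/exports you can re-export and check the share (assuming the nfs service is already running on this node):
exportfs -r
showmount -e localhost    # should list /data_drbd 192.168.1.0/24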
3.3 Start heartbeat (on both nodes):
service heartbeat start
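To verify the whole stack end to end, check that the active node holds the virtual IP from haresources, then mount the share from a client (/mnt here is just an example mount point):
ip addr show eth0 | grep 192.168.1.200    # on the active node
mount -t nfs 192.168.1.200:/data_drbd /mnt    # on an NFS client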