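System requirements
Two machines, each with two network cards and a minimal install of CentOS-6.3-i386. The primary and standby NICs of both machines are connected to the switch.
Set the IP address on the primary host
vi /etc/sysconfig/network-scripts/ifcfg-eth0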
DEVICE="eth0"
BOOTPROTO="static"
HWADDR="00:0C:29:02:C7:A1"
IPADDR="192.168.1.200"
NETMASK="255.255.255.0"
NM_CONTROLLED="yes"
ONBOOT="yes"
TYPE="Ethernet"
UUID="25f5290a-d5ce-49d0-84ef-9bfc9aff2629"
Restart the network service
service network restart
Install heartbeat-2.0.7
Create the user and group that HA uses
# groupadd haclient
Creates the haclient group.
# useradd -g haclient hacluster
Creates the hacluster user and puts it in the haclient group.
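The install step later fails if this account is missing, so it can help to check for it before building. The following is only a sketch: ha_account_ready is a hypothetical helper name, not part of heartbeat, and it reads the passwd and group files directly, so it assumes local accounts rather than LDAP or NIS.

```shell
# Return 0 if user hacluster exists and its primary group is haclient.
# Both arguments are optional (hypothetical conveniences for trying the
# check against copies of the files); the defaults are the system files.
ha_account_ready() {
    passwd_file=${1:-/etc/passwd}
    group_file=${2:-/etc/group}
    # The third field of a group entry is the numeric GID.
    gid=$(awk -F: '$1 == "haclient" { print $3; exit }' "$group_file")
    [ -n "$gid" ] || return 1
    # The fourth field of a passwd entry is the user's primary GID.
    awk -F: -v gid="$gid" '
        $1 == "hacluster" && $4 == gid { found = 1 }
        END { exit !found }
    ' "$passwd_file"
}
```

For example, `ha_account_ready || echo "create haclient/hacluster first"` could be run before make.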
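Use the ISO image as the yum repository
Rename the network yum repo file so that the network repository is not used
mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bak
Note: to use a network yum repository instead, replace the original file with the CentOS-Base.repo from 163: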
# CentOS-Base.repo
#
# The mirror system uses the connecting IP address of the client and the
# update status of each mirror to pick mirrors that are updated to and
# geographically close to the client. You should use this for CentOS updates
# unless you are manually picking other mirrors.
#
# If the mirrorlist= does not work for you, as a fall back you can try the
# remarked out baseurl= line instead.
#
#
[base]
name=CentOS-$releasever - Base - 163.com
baseurl=http://mirrors.163.com/centos/$releasever/os/$basearch/
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os
gpgcheck=1
gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-6
#released updates
[updates]
name=CentOS-$releasever - Updates - 163.com
baseurl=http://mirrors.163.com/centos/$releasever/updates/$basearch/
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=updates
gpgcheck=1
gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-6
#additional packages that may be useful
[extras]
name=CentOS-$releasever - Extras - 163.com
baseurl=http://mirrors.163.com/centos/$releasever/extras/$basearch/
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=extras
gpgcheck=1
gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-6
#additional packages that extend functionality of existing packages
[centosplus]
name=CentOS-$releasever - Plus - 163.com
baseurl=http://mirrors.163.com/centos/$releasever/centosplus/$basearch/
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=centosplus
gpgcheck=1
enabled=0
gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-6
#contrib - packages by Centos Users
[contrib]
name=CentOS-$releasever - Contrib - 163.com
baseurl=http://mirrors.163.com/centos/$releasever/contrib/$basearch/
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=contrib
gpgcheck=1
enabled=0
gpgkey=http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-6
This file can be downloaded from http://mirrors.163.com/.help/; the download link for CentOS 6 is
http://mirrors.163.com/.help/CentOS6-Base-163.repo
Upload CentOS-6.3-i386-bin-DVD1.iso to /mnt
Create a mount point: mkdir /media/CentOS/
Mount the image on that directory: mount -o loop /mnt/CentOS-6.3-i386-bin-DVD1.iso /media/CentOS/
Edit the media repo file: vi /etc/yum.repos.d/CentOS-Media.repo
# CentOS-Media.repo
#
# This repo is used to mount the default locations for a CDROM / DVD on
# CentOS-6. You can use this repo and yum to install items directly off the
# DVD ISO that we release.
#
# To use this repo, put in your DVD and use it with the other repos too:
# yum --enablerepo=c6-media [command]
#
# or for ONLY the media repo, do this:
#
# yum --disablerepo=\* --enablerepo=c6-media [command]
[c6-media]
name=CentOS-$releasever - Media
baseurl=file:///media/CentOS/
        file:///media/cdrom/
        file:///media/cdrecorder/
gpgcheck=1
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-6
Point baseurl at file:///media/CentOS/ and set enabled to 1.
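One way to confirm the edit is to read the values back from the file. This is a sketch under stated assumptions: repo_value is a hypothetical helper (not a yum command), and for a multi-line baseurl it prints only the first line.

```shell
# Print the value of KEY inside [SECTION] of a yum .repo file.
# Usage: repo_value FILE SECTION KEY
repo_value() {
    awk -F= -v section="[$2]" -v key="$3" '
        /^\[/ { in_section = ($0 == section) }   # track the current section
        in_section && $1 == key {
            sub(/^[^=]*=/, "")                   # drop "key=", keep the value
            print
            exit
        }
    ' "$1"
}
```

With the file above, `repo_value /etc/yum.repos.d/CentOS-Media.repo c6-media enabled` should print 1 and the baseurl lookup should print file:///media/CentOS/. The comment in the repo file itself gives the yum form for using only this repository: yum --disablerepo=\* --enablerepo=c6-media [command].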
Install the libnet dependency
Upload libnet-1.1.2.1-2.2.el6.rf.i686.rpm to /root
cd /root
rpm -ivh libnet-1.1.2.1-2.2.el6.rf.i686.rpm
Install the dependency packages heartbeat needs
yum -y install glib2-devel libtool-ltdl-devel net-snmp-devel bzip2-devel ncurses-devel openssl-devel libtool libxml2-devel gettext bison flex zlib-devel mailx which libxslt docbook-dtds docbook-style-xsl PyXML shadow-utils opensp autoconf automake gcc make gcc-c++
Start installing heartbeat-2.0.7
Upload heartbeat-2.0.7 to /root
cd /root/heartbeat-2.0.7
Grant permissions: chmod -R 777 *
Configure: ./ConfigureMe configure (an error here means some dependency packages are missing)
Build and install: make && make install
If you see
cc1: warnings being treated as errors
pils.c:245: error: initialization from incompatible pointer type
pils.c:246: error: initialization from incompatible pointer type
gmake[2]: *** [pils.lo] Error 1
gmake[2]: Leaving directory `/root/heartbeat-2.0.7/lib/pils'
gmake[1]: *** [all-recursive] Error 1
gmake[1]: Leaving directory `/root/heartbeat-2.0.7/lib'
make: *** [all-recursive] Error 1
remove every -Werror from the Makefile in /root/heartbeat-2.0.7/lib/pils.
If you see
cc1: warnings being treated as errors
client_lib.c:1850: error: 'display_orderQ' defined but not used
gmake[2]: *** [client_lib.lo] Error 1
gmake[2]: Leaving directory `/root/heartbeat-2.0.7/lib/hbclient'
gmake[1]: *** [all-recursive] Error 1
gmake[1]: Leaving directory `/root/heartbeat-2.0.7/lib'
make: *** [all-recursive] Error 1
remove every -Werror from the Makefile in /root/heartbeat-2.0.7/lib/hbclient.
If you see
cc1: warnings being treated as errors
stonith_signal.h:34: error: 'stonith_signal_set_simple_handler' defined but not used
gmake[4]: *** [apcmaster.lo] Error 1
gmake[4]: Leaving directory `/root/heartbeat-2.0.7/lib/plugins/stonith'
gmake[3]: *** [all-recursive] Error 1
gmake[3]: Leaving directory `/root/heartbeat-2.0.7/lib/plugins/stonith'
gmake[2]: *** [all-recursive] Error 1
gmake[2]: Leaving directory `/root/heartbeat-2.0.7/lib/plugins'
gmake[1]: *** [all-recursive] Error 1
gmake[1]: Leaving directory `/root/heartbeat-2.0.7/lib'
make: *** [all-recursive] Error 1
remove every -Werror from the Makefile in /root/heartbeat-2.0.7/lib/plugins/stonith.
If you see
cc1: warnings being treated as errors
conf_lex.c:1195: error: 'input' defined but not used
gmake[2]: *** [recoverymgrd-conf_lex.o] Error 1
gmake[2]: Leaving directory `/root/heartbeat-2.0.7/telecom/recoverymgrd'
gmake[1]: *** [all-recursive] Error 1
gmake[1]: Leaving directory `/root/heartbeat-2.0.7/telecom'
make: *** [all-recursive] Error 1
remove every -Werror from the Makefile in /root/heartbeat-2.0.7/telecom/recoverymgrd.
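Editing the four Makefiles by hand is repetitive, so the removal can be scripted. This is a sketch, not the only way to do it: strip_werror is a hypothetical helper name, it assumes the generated file is called Makefile (capital M), and it relies on GNU sed's -i option as shipped with CentOS. It also removes variants such as -Werror=something.

```shell
# Remove every -Werror flag from the Makefile in each given directory,
# keeping the untouched file as Makefile.orig next to it.
strip_werror() {
    for dir in "$@"; do
        cp -p "$dir/Makefile" "$dir/Makefile.orig" &&
        sed -i 's/-Werror[^[:space:]]*//g' "$dir/Makefile"
    done
}
```

Run from /root/heartbeat-2.0.7, the call would be `strip_werror lib/pils lib/hbclient lib/plugins/stonith telecom/recoverymgrd`, followed by make && make install again.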
If you see
chown hacluster /var/lib/heartbeat/cores/hacluster
chown: invalid user: "hacluster"
gmake[2]: [install-exec-local] Error 1 (ignored)
chmod 700 /var/lib/heartbeat/cores/hacluster
gmake[2]: Nothing to be done for `install-data-am'.
gmake[2]: Leaving directory `/root/heartbeat-2.0.7'
gmake[1]: Leaving directory `/root/heartbeat-2.0.7'
the user and group have not been added. Add them as described above, then build and install again.
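If the build ends with
chown hacluster /var/lib/heartbeat/cores/hacluster
chmod 700 /var/lib/heartbeat/cores/hacluster
gmake[2]: Nothing to be done for `install-data-am'.
gmake[2]: Leaving directory `/root/heartbeat-2.0.7'
gmake[1]: Leaving directory `/root/heartbeat-2.0.7'
the installation succeeded.
Perform the same installation on the standby host.
HA configuration
Configure the IP addresses (these come from the actual environment and differ slightly from the ones above)
Configure the first NIC on the primary host (mainly for external service)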
vi /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.8.32.12
NETMASK=255.255.255.0
TYPE=Ethernet
USERCTL=no
IPV6INIT=no
PEERDNS=yes
GATEWAY=10.8.32.201
Configure the second NIC on the primary host (mainly for the internal heartbeat)
vi /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
ONBOOT=yes
BOOTPROTO=static
IPADDR=2.2.2.6
NETMASK=255.255.255.0
TYPE=Ethernet
USERCTL=no
IPV6INIT=no
PEERDNS=yes
Configure the first NIC on the standby host
vi /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.8.32.13
NETMASK=255.255.255.0
TYPE=Ethernet
USERCTL=no
IPV6INIT=no
PEERDNS=yes
GATEWAY=10.8.32.201
Configure the second NIC on the standby host
vi /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
ONBOOT=yes
BOOTPROTO=static
IPADDR=2.2.2.8
NETMASK=255.255.255.0
TYPE=Ethernet
USERCTL=no
IPV6INIT=no
PEERDNS=yes
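The two ends of the heartbeat link (eth1 on each host) have to sit in the same subnet, otherwise the nodes cannot reach each other directly over that link. The sketch below does the netmask arithmetic; ip_to_int and same_subnet are hypothetical helper names, and only dotted-quad IPv4 input is assumed.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if ADDR1 and ADDR2 are in the same network under NETMASK.
# Usage: same_subnet ADDR1 ADDR2 NETMASK
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    m=$(ip_to_int "$3")
    [ $((a & m)) -eq $((b & m)) ]
}
```

For instance, `same_subnet 2.2.2.6 2.2.2.8 255.255.255.0` succeeds, while comparing a 10.8.32.x address with a 2.2.2.x address under the same mask fails.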
Set the hostname
Primary
vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=ywgl-rz-db01    # hostname
Standby
vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=ywgl-rz-db02    # hostname
Map hostnames to IP addresses
Primary
vi /etc/hosts
10.8.32.12 ywgl-rz-db01
2.2.2.6 ywgl-rz-db01
10.8.32.13 ywgl-rz-db02
2.2.2.8 ywgl-rz-db02
Standby
vi /etc/hosts
10.8.32.12 ywgl-rz-db01
2.2.2.6 ywgl-rz-db01
10.8.32.13 ywgl-rz-db02
2.2.2.8 ywgl-rz-db02
Test connectivity between the two hosts
ping ywgl-rz-db01
ping ywgl-rz-db02
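ping only shows that a name resolves to something reachable. A copy-paste slip that gives one address to both host names is easy to miss, so a quick scan of the hosts file can complement the ping test. This is a sketch; conflicting_ips is a hypothetical helper name, and it compares only the first name on each line.

```shell
# Print every IP address that appears on more than one line of a hosts
# file with a different first host name. Defaults to /etc/hosts.
conflicting_ips() {
    awk '
        /^[[:space:]]*(#|$)/ { next }     # skip comments and blank lines
        ($1 in name) && name[$1] != $2 && !reported[$1]++ { print $1 }
        !($1 in name) { name[$1] = $2 }   # remember the first name seen
    ' "${1:-/etc/hosts}"
}
```

An empty result means no address is claimed by two different names.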
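Add the three HA configuration files
Configure the heartbeat authentication file: /etc/ha.d/authkeys
vi /etc/ha.d/authkeys    # create the file if it does not exist
auth 3
3 md5 Hello!
The authkeys file sets the authentication method heartbeat uses. Three methods are available: crc, md5, and sha1. Their security increases in that order, and so does their resource cost. If the cluster runs on a secure network, crc is enough; if every node has plenty of hardware headroom, sha1 is recommended because it is the most secure; md5 is the middle ground between network security and resource use. Note that whatever number follows auth must appear again as the index on a following line: if you write "auth 6", there must also be a line "6 <method>".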
or
auth 1
1 crc
or
auth 2
2 sha1 yourpassword
yourpassword is the key used to digitally sign heartbeat packets (crc takes no key). The primary and standby must use the same key.
Make sure authkeys can be read only by root:
chmod 600 /etc/ha.d/authkeys    # the mode must be 600
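Both rules above (the auth index must be defined on a key line, and the mode must be 600) can be checked mechanically before starting heartbeat. The following is a sketch only: check_authkeys is a hypothetical helper, not a heartbeat tool, and it uses GNU stat's -c option as found on CentOS.

```shell
# Return 0 if the authkeys file has mode 600 and the index on its "auth"
# line is defined on a following line with a known method.
check_authkeys() {
    file=${1:-/etc/ha.d/authkeys}
    [ "$(stat -c %a "$file")" = "600" ] ||
        { echo "mode must be 600" >&2; return 1; }
    awk '
        /^[[:space:]]*#/ { next }                       # skip comments
        $1 == "auth" { want = $2; next }                # selected index
        want != "" && $1 == want && $2 ~ /^(crc|md5|sha1)$/ { ok = 1 }
        END { exit !ok }
    ' "$file" ||
        { echo "auth index has no matching key line" >&2; return 1; }
}
```

Running `check_authkeys` with no argument checks /etc/ha.d/authkeys; any message on stderr points at the rule that failed.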
Configure the HA resource file: haresources
In heartbeat, shared resources are configured through the haresources file
vi /etc/ha.d/haresources    # create the file if it does not exist
ywgl-rz-db01 IPaddr::10.8.32.20/24/eth0 mysqld
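The haresources file contains a single line. It makes ywgl-rz-db01 the primary node; when ywgl-rz-db01 goes down, the standby node ywgl-rz-db02 takes over automatically and provides the service.
The cluster IP address is 10.8.32.20. mysqld is the script started when the cluster starts; it must be in /etc/ha.d/resource.d/ or /etc/init.d. (If you do not want the mysql process on the standby to start and stop with the heartbeat service, leave mysqld out on the standby.)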
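The line packs several things into one: the preferred node, then resources, where a resource script can carry arguments separated by "::" and the IPaddr argument has the form address/prefix/interface. The sketch below only unpacks that layout for reading; describe_haresources is a hypothetical helper name and its output format is made up for illustration.

```shell
# Print the parts of each haresources line in labelled form.
# Defaults to /etc/ha.d/haresources.
describe_haresources() {
    awk '
        /^[[:space:]]*(#|$)/ { next }            # skip comments and blanks
        {
            printf "node=%s\n", $1
            for (i = 2; i <= NF; i++) {
                n = split($i, part, "::")        # script::arg1::arg2...
                if (part[1] == "IPaddr" && n >= 2) {
                    split(part[2], a, "/")       # address/prefix/interface
                    printf "vip=%s prefix=%s iface=%s\n", a[1], a[2], a[3]
                } else {
                    printf "service=%s\n", part[1]
                }
            }
        }
    ' "${1:-/etc/ha.d/haresources}"
}
```

For the line above it prints node=ywgl-rz-db01, then vip=10.8.32.20 prefix=24 iface=eth0, then service=mysqld.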
Configure the heartbeat configuration file ha.cf
vi /etc/ha.d/ha.cf    # create the file if it does not exist
bcast eth0 eth1
keepalive 2
warntime 10
deadtime 30
initdead 60
udpport 694
auto_failback off
node ywgl-rz-db01
node ywgl-rz-db02
logfile /var/log/ha-log.log
debugfile /var/log/ha-debug.log
watchdog /dev/watchdog
ping_group group1 10.8.32.14 10.8.32.15
respawn root /usr/lib64/heartbeat/ipfail
apiauth ipfail gid=root uid=root
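Parameter explanation
bcast eth0 eth1 # broadcast heartbeats on eth0 and eth1
keepalive 2 # send a heartbeat every 2 seconds
warntime 10 # heartbeat-delay warning threshold of 10 seconds. If the standby receives no heartbeat from the primary (the currently active server) for 10 seconds, a warning is written to the log, but services are not switched yet.
deadtime 30 # if no heartbeat arrives for 30 seconds, switch services immediately
initdead 60 # on some systems the network needs a while to come up after a boot or reboot; this option covers that interval. It should be at least twice deadtime.
udpport 694 # use UDP port 694 for heartbeat broadcast traffic
auto_failback off # whether the primary node automatically takes its services back from the standby when it comes up again
node ywgl-rz-db01 # node 1; must match the output of uname -n
node ywgl-rz-db02 # node 2
respawn root /usr/lib64/heartbeat/ipfail
# Optional. Lists a process that is started and stopped together with heartbeat, usually a plugin integrated with heartbeat; such a process is restarted automatically if it fails. The most common one is ipfail, which detects and handles network failures and relies on the ping nodes named by the ping directives to test connectivity. The user field, normally hacluster, is the identity ipfail runs as; here it is changed to root to make sure it has permission. The ipfail path must be correct; the 64-bit path is used here as an example, so adjust it to where ipfail is actually installed.
#apiauth ipfail gid=haclient uid=hacluster # the user and group used when the IP is switched over.
If the hacluster user and haclient group were created after heartbeat was installed, run the following commands to fix the ownership of the heartbeat directories:
find / -type d -name "heartbeat" -exec chown -R hacluster {} \;
find / -type d -name "heartbeat" -exec chgrp -R haclient {} \;
#watchdog /dev/watchdog # Optional. Lets heartbeat monitor the running state of the system. To use it, the "softdog" kernel module must be loaded so that the actual device file can be created; if the kernel does not have this module, select it and rebuild the kernel. After the build, run "insmod softdog" to load the module. Then run "grep misc /proc/devices" (should show 10) and "cat /proc/misc | grep watchdog" (should show 130). Finally, create the device file with "mknod /dev/watchdog c 10 130". The feature is then ready to use.
ping_group group1 10.8.32.14 10.8.32.15
# ping_group defines a pseudo cluster member and must be used together with ipfail. Its purpose is to monitor the physical link: if a cluster node cannot reach these pseudo members, that node has no right to take over resources or services and releases the ones it holds.
#ping 10.8.32.1 # selects a ping node. The better the ping node is chosen, the more robust the HA cluster; a fixed router is a good choice, but avoid using a cluster member, because ping nodes exist only to test network connectivity. Do not enable ping and ping_group at the same time: when the ping line is active, ping_group must be commented out, and the other way round, otherwise failover does not work properly.
logfile /var/log/ha-log.log
# Location of the HA log file. Create the directory by hand if it does not exist.
debugfile /var/log/ha-debug.log
# Location of the HA debug log. Create the directory by hand if it does not exist.
Copy the same configuration to the standby host.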
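Two of the rules above are easy to get wrong when the file is copied between hosts: initdead should be at least twice deadtime, and a node line must match what uname -n prints on that host. The sketch below checks both, plus the assumption (implied by the explanation, not stated as a hard rule there) that warntime is smaller than deadtime. hacf_get and check_hacf are hypothetical helper names, and the timing values are assumed to be plain seconds.

```shell
# Print the first value of directive KEY in an ha.cf-style file.
# Usage: hacf_get FILE KEY
hacf_get() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# Check timing relations and that HOST (default: uname -n) is a node.
# Usage: check_hacf [FILE] [HOST]
check_hacf() {
    file=${1:-/etc/ha.d/ha.cf}
    host=${2:-$(uname -n)}
    warn=$(hacf_get "$file" warntime)
    dead=$(hacf_get "$file" deadtime)
    init=$(hacf_get "$file" initdead)
    [ "$warn" -lt "$dead" ] ||
        { echo "warntime should be below deadtime" >&2; return 1; }
    [ "$init" -ge $((dead * 2)) ] ||
        { echo "initdead should be at least 2 x deadtime" >&2; return 1; }
    awk -v host="$host" '
        $1 == "node" { for (i = 2; i <= NF; i++) if ($i == host) found = 1 }
        END { exit !found }
    ' "$file" ||
        { echo "no node line matches $host" >&2; return 1; }
}
```

Running `check_hacf` on each host after copying the file reports the first rule that does not hold.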
Enable the heartbeat service at boot
# chkconfig --level 2345 heartbeat on
Check that the setting took effect
# chkconfig --list | grep heart
heartbeat 0:off 1:off 2:on 3:on 4:on 5:on 6:off
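Start the heartbeat service
Run the following command on both the primary and the standby to start the HA service
# /etc/init.d/heartbeat start
Check the HA status
Make sure the heartbeat service is running normally on both the primary and the standby
On the primary:
# /etc/init.d/heartbeat status
heartbeat OK [pid 2899 et al] is running on ywgl-rz-db01 [ywgl-rz-db01]...
The HA service is running normally.
On the standby:
# /etc/init.d/heartbeat status
heartbeat OK [pid 2899 et al] is running on ywgl-rz-db02 [ywgl-rz-db02]...
The HA service is running normally.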
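HA two-node failover test
Stop the heartbeat service on the primary
/etc/init.d/heartbeat stop
Wait 60 seconds
On the standby, check that the IP address has been assigned correctly
# ifconfig
eth0 Link encap:Ethernet HWaddr 00:9C:02:A6:1C:3C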
inet addr:10.8.32.13 Bcast:10.8.32.255 Mask:255.255.255.0
inet6 addr: fe80::29c:2ff:fea6:1c3c/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:32570 errors:0 dropped:0 overruns:0 frame:0
TX packets:15568 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:5784246 (5.5 MiB) TX bytes:2798357 (2.6 MiB)
Interrupt:16 Memory:f4000000-f4012800
eth0:0 Link encap:Ethernet HWaddr 00:9C:02:A6:1C:3C
inet addr:10.8.32.20 Bcast:10.8.32.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:16 Memory:f4000000-f4012800
eth1 Link encap:Ethernet HWaddr 00:9C:02:A6:1C:3E
inet addr:2.2.2.8 Bcast:2.2.2.255 Mask:255.255.255.0
inet6 addr: fe80::29c:2ff:fea6:1c3e/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:20902 errors:0 dropped:0 overruns:0 frame:0
TX packets:4066 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:4258406 (4.0 MiB) TX bytes:1006032 (982.4 KiB)
Interrupt:17 Memory:f2000000-f2012800
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:36 errors:0 dropped:0 overruns:0 frame:0
TX packets:36 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:14848 (14.5 KiB) TX bytes:14848 (14.5 KiB)
This is the expected result: the cluster IP 10.8.32.20 now appears on eth0:0 of the standby.
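Instead of waiting a fixed time and reading the ifconfig output by eye, the check can be polled. This is a sketch: has_addr and wait_for_vip are hypothetical helper names, and the match assumes the "inet addr:" output format of the ifconfig shown here.

```shell
# Read ifconfig output on stdin; succeed if ADDR is configured.
# The trailing space keeps 10.8.32.20 from matching 10.8.32.200.
has_addr() {
    grep -qF "inet addr:$1 "
}

# Poll every 2 seconds until ADDR shows up or TRIES (default 30) run out.
# Usage: wait_for_vip ADDR [TRIES]
wait_for_vip() {
    vip=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        ifconfig | has_addr "$vip" && return 0
        sleep 2
        tries=$((tries - 1))
    done
    return 1
}
```

On the standby, `wait_for_vip 10.8.32.20 && echo "cluster IP taken over"` would return as soon as the address has moved.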
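Check whether mysql has started
/etc/init.d/mysqld status
mysqld (pid 13531) is running...
If it has not, start it:
/etc/init.d/mysqld start
Restart the primary, then shut down the standby
On the primary, check that the IP address has been assigned correctly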
eth0 Link encap:Ethernet HWaddr 00:9C:02:A6:1D:2C
inet addr:10.8.32.12 Bcast:10.8.32.255 Mask:255.255.255.0
inet6 addr: fe80::29c:2ff:fea6:1d2c/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:341023 errors:0 dropped:0 overruns:0 frame:0
TX packets:191446 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:64592218 (61.5 MiB) TX bytes:37011658 (35.2 MiB)
Interrupt:16 Memory:f4000000-f4012800
eth0:0 Link encap:Ethernet HWaddr 00:9C:02:A6:1D:2C
inet addr:10.8.32.20 Bcast:10.8.32.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:16 Memory:f4000000-f4012800
eth1 Link encap:Ethernet HWaddr 00:9C:02:A6:1D:2E
inet addr:2.2.2.6 Bcast:2.2.2.255 Mask:255.255.255.0
inet6 addr: fe80::29c:2ff:fea6:1d2e/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:213148 errors:0 dropped:0 overruns:0 frame:0
TX packets:58640 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:45669655 (43.5 MiB) TX bytes:14570509 (13.8 MiB)
Interrupt:17 Memory:f2000000-f2012800
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:1735 errors:0 dropped:0 overruns:0 frame:0
TX packets:1735 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:271944 (265.5 KiB) TX bytes:271944 (265.5 KiB)
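This is the expected result: the cluster IP 10.8.32.20 is on eth0:0 again.
Check whether mysql has started
/etc/init.d/mysqld status
mysqld (pid 13531) is running...
If it has not, start it:
/etc/init.d/mysqld start
If the primary and standby servers behave as described above, heartbeat is working properly and the two nodes can fail over normally.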