1. Environment
Server A (primary): 192.85.1.175
Server B (backup): 192.85.1.176
MySQL version: 5.1.61
OS: Ubuntu 10.10 x86
2. Installing heartbeat
1) Install heartbeat
sudo apt-get install heartbeat
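2) Configuration files
heartbeat keeps its configuration in /etc/ha.d.
After installation it needs three configuration files there: ha.cf, haresources, and authkeys.
These files do not exist yet and must be created. Sample versions of all three are in
/usr/share/doc/heartbeat; copy them into
/etc/ha.d.
Samples shipped as *.gz files must be decompressed with gunzip.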
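The copy step can be sketched as a small shell function. This is only an illustration: the function name install_ha_samples is hypothetical, and it assumes each sample is present either as a plain file or as a .gz file in the source directory.

```shell
#!/bin/sh
# Hypothetical helper: copy the three sample files from SRC to DEST,
# decompressing any sample that is shipped as .gz.
# Defaults are the directories named in the text above.
install_ha_samples() {
    src=${1:-/usr/share/doc/heartbeat}
    dest=${2:-/etc/ha.d}
    for name in ha.cf haresources authkeys; do
        if [ -f "$src/$name.gz" ]; then
            gunzip -c "$src/$name.gz" > "$dest/$name"
        elif [ -f "$src/$name" ]; then
            cp "$src/$name" "$dest/$name"
        else
            echo "missing sample: $name" >&2
            return 1
        fi
    done
}
```

Writing into /etc/ha.d requires root, so the function would be called from a root shell, for example after `sudo -s`.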
3. Configuration on server 192.85.1.175:
(1) /etc/hosts contents:
192.85.1.175 primary # Added by NetworkManager
(2) ha.cf contents (main configuration file):
#
# There are lots of options in this file. All you have to have is a set
# of nodes listed {"node ...} one of {serial, bcast, mcast, or ucast},
# and a value for "auto_failback".
#
# ATTENTION: As the configuration file is read line by line,
# THE ORDER OF DIRECTIVE MATTERS!
#
# In particular, make sure that the udpport, serial baud rate
# etc. are set before the heartbeat media are defined!
# debug and log file directives go into effect when they
# are encountered.
#
# All will be fine if you keep them ordered as in this example.
#
#
# Note on logging:
# If all of debugfile, logfile and logfacility are not defined,
# logging is the same as use_logd yes. In other case, they are
# respectively effective. if detering the logging to syslog,
# logfacility must be "none".
#
# File to write debug messages to
debugfile /var/log/ha-debug # debug log file
#
#
# File to write other messages to
#
logfile /var/log/ha-log # runtime log file
#
#
# Facility to use for syslog()/logger
#
logfacility local0 # syslog facility used for logging
#
#
# A note on specifying "how long" times below...
#
# The default time unit is seconds
# 10 means ten seconds
#
# You can also specify them in milliseconds
# 1500ms means 1.5 seconds
#
#
# keepalive: how long between heartbeats?
#
keepalive 2 # heartbeat interval: 2 means 2 seconds; 200ms would mean 200 milliseconds
#
# deadtime: how long-to-declare-host-dead?
#
# If you set this too low you will get the problematic
# split-brain (or cluster partition) problem.
# See the FAQ for how to use warntime to tune deadtime.
#
deadtime 30 # node dead time: if no heartbeat arrives within 30 seconds, the peer node is declared dead
#
# warntime: how long before issuing "late heartbeat" warning?
# See the FAQ for how to use warntime to tune deadtime.
#
warntime 10 # time before a "late heartbeat" warning is issued
#
#
# Very first dead time (initdead)
#
# On some machines/OSes, etc. the network takes a while to come up
# and start working right after you've been rebooted. As a result
# we have a separate dead time for when things first come up.
# It should be at least twice the normal dead time.
#
initdead 120 # dead time used right after startup
#
#
# What UDP port to use for bcast/ucast communication?
#
udpport 694 # UDP port used for heartbeat traffic
#
# What interfaces to broadcast heartbeats over?
#
bcast eth0 # Linux # send heartbeats by UDP broadcast; recommended when there is more than one backup node
#bcast eth1 eth2 # Linux
#bcast le0 # Solaris
#bcast le1 le2 # Solaris
#
# Set up a multicast heartbeat medium
# mcast [dev] [mcast group] [port] [ttl] [loop]
#
# [dev] device to send/rcv heartbeats on
# [mcast group] multicast group to join (class D multicast address
# 224.0.0.0 - 239.255.255.255)
# [port] udp port to sendto/rcvfrom (set this value to the
# same value as "udpport" above)
# [ttl] the ttl value for outbound heartbeats. this effects
# how far the multicast packet will propagate. (0-255)
# Must be greater than zero.
# [loop] toggles loopback for outbound multicast heartbeats.
# if enabled, an outbound packet will be looped back and
# received by the interface it was sent on. (0 or 1)
# Set this value to zero.
#
#
#mcast eth0 225.0.0.1 694 1 0
#
# Set up a unicast / udp heartbeat medium
# ucast [dev] [peer-ip-addr]
#
# [dev] device to send/rcv heartbeats on
# [peer-ip-addr] IP address of peer to send packets to
#
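ucast eth0 192.85.1.176 # unicast heartbeats to the peer (server B)
auto_failback on # when the primary node recovers it takes the resources back; with off, the primary only reclaims them after the backup node goes down
watchdog /dev/watchdog # watchdog: if this node fails to send a heartbeat for more than one minute, it reboots itself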
#
# Tell what machines are in the cluster
# node nodename ... -- must match uname -n
node primary # primary node name; must match the output of uname -n
node backup # backup node name
#
# Less common options...
#
# Treats 10.10.10.254 as a psuedo-cluster-member
# Used together with ipfail below...
# note: don't use a cluster node as ping node
#
ping 192.85.1.1 # ping the gateway to check that the network link is healthy
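The timing values in ha.cf depend on each other, so a quick sanity check can help before starting the service. The sketch below is not part of heartbeat; the function name check_ha_timing is hypothetical. The "initdead at least twice deadtime" rule comes from the sample file's own comments, while the "keepalive < warntime < deadtime" ordering is an assumption about what a sensible configuration looks like.

```shell
#!/bin/sh
# Hypothetical helper: read timing directives from an ha.cf file and
# report combinations that look inconsistent. Exits non-zero on a problem.
check_ha_timing() {
    awk '
        # Convert "1500ms" to 1.5; plain numbers are already seconds.
        function secs(v) { if (v ~ /ms$/) { sub(/ms$/, "", v); return v / 1000 } return v + 0 }
        { sub(/#.*/, "") }                      # strip comments
        $1 == "keepalive" { k = secs($2) }
        $1 == "warntime"  { w = secs($2) }
        $1 == "deadtime"  { d = secs($2) }
        $1 == "initdead"  { i = secs($2) }
        END {
            if (!(k < w && w < d)) { print "expected keepalive < warntime < deadtime"; bad = 1 }
            if (i < 2 * d)         { print "initdead should be at least twice deadtime"; bad = 1 }
            exit bad + 0
        }
    ' "$1"
}
```

With the values used in this article (keepalive 2, warntime 10, deadtime 30, initdead 120), `check_ha_timing /etc/ha.d/ha.cf` prints nothing and succeeds.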
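(3) haresources (resource configuration file)
# Virtual IP and the resources served through it; resources are separated by spaces,
# and each name must match a script in /etc/ha.d/resource.d or /etc/init.d
primary 192.85.1.177/24 http mysql phpmyadmin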
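heartbeat starts each resource in haresources by running a script of the same name, so a misspelled resource name only shows up at failover time. The following sketch lists resources that have no matching script. The function name missing_resources is hypothetical, and it assumes the simple layout used here: node name first, virtual IP second, resources after that.

```shell
#!/bin/sh
# Hypothetical helper: missing_resources FILE DIR...
# Prints every resource in FILE that has no executable script in any DIR.
missing_resources() {
    file=$1; shift
    status=0
    # Fields 3..NF of each non-comment line are treated as resource names.
    for res in $(awk '{ sub(/#.*/, "") } NF > 2 { for (i = 3; i <= NF; i++) print $i }' "$file"); do
        name=${res%%::*}            # drop "::argument" suffixes, if any
        found=no
        for dir in "$@"; do
            [ -x "$dir/$name" ] && found=yes
        done
        if [ "$found" = no ]; then
            echo "no script for resource: $name"
            status=1
        fi
    done
    return $status
}
```

A typical call would be `missing_resources /etc/ha.d/haresources /etc/ha.d/resource.d /etc/init.d`.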
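(4) authkeys (authentication configuration file)
# Shared authentication key; the file contents must be identical on both machines
auth 3
3 md5 Hello
# authkeys must be readable and writable by its owner only: chmod 600 ./authkeys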
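Creating authkeys and setting its permissions can be done in one step. This is a minimal sketch: the function name write_authkeys is hypothetical, and it reuses the key index 3 and the md5 method from the file shown above.

```shell
#!/bin/sh
# Hypothetical helper: write_authkeys FILE SECRET
# Writes an authkeys file in the same layout as above, owner-only access.
write_authkeys() {
    file=$1
    secret=$2
    (
        umask 077                   # never create the file group/world readable
        printf 'auth 3\n3 md5 %s\n' "$secret" > "$file"
    )
    chmod 600 "$file"               # also fixes the mode if the file already existed
}
```

For example, `write_authkeys /etc/ha.d/authkeys Hello` reproduces the file above. A random secret is a better choice than a dictionary word, as long as the same file is then copied to both nodes.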
4. Configuration on server 192.85.1.176:
(1) /etc/hosts contents:
192.85.1.176 backup # Added by NetworkManager
(2) ha.cf contents:
#
# There are lots of options in this file. All you have to have is a set
# of nodes listed {"node ...} one of {serial, bcast, mcast, or ucast},
# and a value for "auto_failback".
#
# ATTENTION: As the configuration file is read line by line,
# THE ORDER OF DIRECTIVE MATTERS!
#
# In particular, make sure that the udpport, serial baud rate
# etc. are set before the heartbeat media are defined!
# debug and log file directives go into effect when they
# are encountered.
#
# All will be fine if you keep them ordered as in this example.
#
#
# Note on logging:
# If all of debugfile, logfile and logfacility are not defined,
# logging is the same as use_logd yes. In other case, they are
# respectively effective. if detering the logging to syslog,
# logfacility must be "none".
#
# File to write debug messages to
debugfile /var/log/ha-debug # debug log file
#
#
# File to write other messages to
#
logfile /var/log/ha-log # runtime log file
#
#
# Facility to use for syslog()/logger
#
logfacility local0 # syslog facility used for logging
#
#
# A note on specifying "how long" times below...
#
# The default time unit is seconds
# 10 means ten seconds
#
# You can also specify them in milliseconds
# 1500ms means 1.5 seconds
#
#
# keepalive: how long between heartbeats?
#
keepalive 2 # heartbeat interval: 2 means 2 seconds; 200ms would mean 200 milliseconds
#
# deadtime: how long-to-declare-host-dead?
#
# If you set this too low you will get the problematic
# split-brain (or cluster partition) problem.
# See the FAQ for how to use warntime to tune deadtime.
#
deadtime 30 # node dead time: if no heartbeat arrives within 30 seconds, the peer node is declared dead
#
# warntime: how long before issuing "late heartbeat" warning?
# See the FAQ for how to use warntime to tune deadtime.
#
warntime 10 # time before a "late heartbeat" warning is issued
#
#
# Very first dead time (initdead)
#
# On some machines/OSes, etc. the network takes a while to come up
# and start working right after you've been rebooted. As a result
# we have a separate dead time for when things first come up.
# It should be at least twice the normal dead time.
#
initdead 120 # dead time used right after startup
#
#
# What UDP port to use for bcast/ucast communication?
#
udpport 694 # UDP port used for heartbeat traffic
#
# What interfaces to broadcast heartbeats over?
#
bcast eth0 # Linux # send heartbeats by UDP broadcast; recommended when there is more than one backup node
#bcast eth1 eth2 # Linux
#bcast le0 # Solaris
#bcast le1 le2 # Solaris
#
# Set up a multicast heartbeat medium
# mcast [dev] [mcast group] [port] [ttl] [loop]
#
# [dev] device to send/rcv heartbeats on
# [mcast group] multicast group to join (class D multicast address
# 224.0.0.0 - 239.255.255.255)
# [port] udp port to sendto/rcvfrom (set this value to the
# same value as "udpport" above)
# [ttl] the ttl value for outbound heartbeats. this effects
# how far the multicast packet will propagate. (0-255)
# Must be greater than zero.
# [loop] toggles loopback for outbound multicast heartbeats.
# if enabled, an outbound packet will be looped back and
# received by the interface it was sent on. (0 or 1)
# Set this value to zero.
#
#
#mcast eth0 225.0.0.1 694 1 0
#
# Set up a unicast / udp heartbeat medium
# ucast [dev] [peer-ip-addr]
#
# [dev] device to send/rcv heartbeats on
# [peer-ip-addr] IP address of peer to send packets to
#
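ucast eth0 192.85.1.175 # unicast heartbeats to the peer (server A)
auto_failback on # when the primary node recovers it takes the resources back; with off, the primary only reclaims them after the backup node goes down
watchdog /dev/watchdog # watchdog: if this node fails to send a heartbeat for more than one minute, it reboots itself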
#
# Tell what machines are in the cluster
# node nodename ... -- must match uname -n
node primary # primary node name; must match the output of uname -n
node backup # backup node name
#
# Less common options...
#
# Treats 10.10.10.254 as a psuedo-cluster-member
# Used together with ipfail below...
# note: don't use a cluster node as ping node
#
ping 192.85.1.1 # ping the gateway to check that the network link is healthy
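Because every node directive has to match the output of uname -n on the corresponding machine, it is worth confirming that on each host. A minimal sketch follows; the function name node_listed is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: node_listed FILE [NAME]
# Succeeds if NAME (default: this machine's uname -n) appears in a
# "node" directive of the given ha.cf file.
node_listed() {
    file=$1
    name=${2:-$(uname -n)}
    awk -v want="$name" '
        { sub(/#.*/, "") }                      # strip comments
        $1 == "node" { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
        END { exit !found }
    ' "$file"
}
```

Running `node_listed /etc/ha.d/ha.cf || echo "hostname not listed in ha.cf"` on both servers catches a mismatch between the hostname and the node lines.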
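(3) haresources
# Virtual IP and the resources served through it; resources are separated by spaces,
# and each name must match a script in /etc/ha.d/resource.d or /etc/init.d
primary 192.85.1.177/24 http mysql phpmyadmin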
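(4) authkeys
# Shared authentication key; the file contents must be identical on both machines
auth 3
3 md5 Hello
# authkeys must be readable and writable by its owner only: chmod 600 ./authkeys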
5. Starting, stopping, and testing the HA service
Start HA: service heartbeat start or /etc/init.d/heartbeat start
Stop HA: service heartbeat stop or /etc/init.d/heartbeat stop
The system already starts heartbeat automatically at boot.
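Testing heartbeat with the HTTP service
First start the httpd service (on Ubuntu the Apache service is named apache2):
# service httpd start
On each host, create its own test HTML file and place it in /var/www/html/.
Start heartbeat on node1 (the primary) and check it with: service heartbeat status
For example, opening http://192.85.1.177/phpmyadmin should then let you log in and manage the database.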
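During a failover test it helps to see which node currently holds the virtual IP. The sketch below filters the output of `ip -o -4 addr show`; the function name holds_vip is hypothetical, and the check assumes the address appears in that output as "inet ADDRESS/PREFIX".

```shell
#!/bin/sh
# Hypothetical helper: holds_vip ADDRESS
# Reads `ip -o -4 addr show` output on stdin and succeeds if ADDRESS
# is assigned to one of the listed interfaces.
holds_vip() {
    grep -qF "inet $1/"
}
```

On each server, `ip -o -4 addr show | holds_vip 192.85.1.177 && echo "this node holds the virtual IP"` shows where the address lives. After stopping heartbeat on the primary, the address should show up on the backup.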