Part III. Administrative Tasks
Table of Contents
- 5. Configuring and Initializing Heartbeat
- 6. Upgrading Heartbeat
  - 6.1. Upgrading Heartbeat 2.1 clusters that do not use the CRM
  - 6.2. Upgrading CRM-enabled Heartbeat 2.1 clusters
    - 6.2.1. Placing the cluster in unmanaged mode
    - 6.2.2. Backing up the CIB
    - 6.2.3. Stopping Heartbeat services
    - 6.2.4. Wiping files related to the CRM
    - 6.2.5. Restoring the CIB
    - 6.2.6. Upgrading software
    - 6.2.7. Restarting Heartbeat services
    - 6.2.8. Returning the cluster to managed mode
    - 6.2.9. Upgrading the CIB schema
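Chapter 5. Configuring and Initializing Heartbeat
A Heartbeat cluster uses the following configuration files:
- /etc/ha.d/ha.cf: the global cluster configuration file.
- /etc/ha.d/authkeys: the file holding the keys that cluster nodes use to authenticate each other.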
5.1. The ha.cf file
An example ha.cf file:
autojoin none
mcast bond0 239.0.0.43 694 1 0
bcast eth2
warntime 5
deadtime 15
initdead 60
keepalive 2
node alice
node bob
pacemaker respawn
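This example assumes that bond0 is the cluster's interface to the shared network, and that eth2 is the interface dedicated to DRBD replication between the two nodes. Thus, bond0 can be used for multicast heartbeat, whereas broadcast is acceptable on eth2 because that network is not shared.
The next options configure node failure detection. They set the time after which Heartbeat issues a warning that an unreachable peer node may be dead (warntime), the time after which Heartbeat considers a node confirmed dead (deadtime), and the maximum time it waits for other nodes to check in at cluster startup (initdead). keepalive sets the interval at which Heartbeat sends keep-alive packets. All of these options are given in seconds.
The node option identifies cluster members. The values listed here must exactly match the host names of the cluster nodes as reported by uname -n.
pacemaker respawn enables the Pacemaker cluster manager and ensures that Pacemaker is restarted automatically if it fails.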
Note | |
---|---|
Prior to Heartbeat version 3.0.4, the pacemaker keyword was named crm. |
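As a rough illustration of the options described above, the following sketch writes the example configuration to a scratch file, extracts the node names, and checks that the failure-detection timers are ordered sensibly. The scratch path, the helper name get_opt, and the ordering rule (keepalive < warntime < deadtime <= initdead) are assumptions made for this example, derived from the descriptions above rather than from an official requirement.

```shell
#!/bin/sh
# Scratch copy of the example ha.cf (hypothetical path; the real file is /etc/ha.d/ha.cf).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
hacf="$workdir/ha.cf"

cat > "$hacf" <<'EOF'
autojoin none
mcast bond0 239.0.0.43 694 1 0
bcast eth2
warntime 5
deadtime 15
initdead 60
keepalive 2
node alice
node bob
pacemaker respawn
EOF

# Print the value of a single-valued directive such as "deadtime".
get_opt() {
    awk -v key="$1" '$1 == key { print $2; exit }' "$hacf"
}

# Collect every host name given in a "node" directive, space-separated.
nodes=$(awk '$1 == "node" { for (i = 2; i <= NF; i++) { printf "%s%s", sep, $i; sep = " " } }' "$hacf")

keepalive=$(get_opt keepalive)
warntime=$(get_opt warntime)
deadtime=$(get_opt deadtime)
initdead=$(get_opt initdead)

# Assumed sanity rule: keepalive < warntime < deadtime <= initdead.
if [ "$keepalive" -lt "$warntime" ] && [ "$warntime" -lt "$deadtime" ] && [ "$deadtime" -le "$initdead" ]; then
    timers_ok=yes
else
    timers_ok=no
fi

echo "cluster nodes: $nodes"
echo "timers ordered sensibly: $timers_ok"

# Each node name must equal the output of "uname -n" on that host.
case " $nodes " in
    *" $(uname -n) "*) echo "this host is listed as a cluster node" ;;
    *)                 echo "this host ($(uname -n)) is not listed in ha.cf" ;;
esac
```

On a real node you would point hacf at /etc/ha.d/ha.cf instead of generating the sample file.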
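5.2. The authkeys file
/etc/ha.d/authkeys contains the pre-shared secrets that cluster nodes use to authenticate each other. It should be readable only by root and follows this format: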
auth <num>
<num> <algorithm> <secret>
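num is a simple key index, starting at 1. Usually, you will have only one key in your authkeys file.
algorithm is the signature algorithm to use. You may use either md5 or sha1; crc (a simple cyclic redundancy check, which is not secure) is not recommended.
secret is the actual authentication key.
You can create an authkeys file containing a generated secret with the following shell commands: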
( echo -ne "auth 1\n1 sha1 "; \
  dd if=/dev/urandom bs=512 count=1 | openssl md5 ) \
  > /etc/ha.d/authkeys
chmod 0600 /etc/ha.d/authkeys
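The following variant is a minimal sketch of the same idea that writes to a scratch directory instead of /etc/ha.d, so it can be tried without touching a live configuration. It assumes a system that provides /dev/urandom, head -c, and od; the scratch path and the use of od for hex encoding are choices made for this example. Some OpenSSL versions prefix the digest with a label such as (stdin)=, which would end up in the file; hex-encoding the random bytes directly avoids that. Setting the umask first means the file is never readable by other users, even briefly.

```shell
#!/bin/sh
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
authkeys="$workdir/authkeys"   # stand-in for /etc/ha.d/authkeys

# Create the file with owner-only permissions from the start.
umask 077

# 16 random bytes, hex-encoded, give a 32-character secret
# (the same length as an MD5 hex digest).
secret=$(head -c 16 /dev/urandom | od -An -v -tx1 | tr -d ' \n')

# "auth 1" selects key index 1, which is defined on the following line.
printf 'auth 1\n1 sha1 %s\n' "$secret" > "$authkeys"
chmod 0600 "$authkeys"

# Report on the result without printing the secret itself.
line_count=$(wc -l < "$authkeys" | tr -d ' ')
perms=$(ls -l "$authkeys" | cut -c1-10)
echo "wrote $line_count lines, permissions $perms"
```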
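5.3. Propagating the cluster configuration to cluster nodes
To propagate the contents of the ha.cf and authkeys configuration files, you can use the ha_propagate command, invoked as either
/usr/lib/heartbeat/ha_propagate
or
/usr/lib64/heartbeat/ha_propagate
This utility uses scp to copy the configuration files to each node listed in /etc/ha.d/ha.cf. It then connects to each node using ssh and runs chkconfig so that the Heartbeat service is enabled at system startup.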
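To make the procedure concrete, the sketch below prints the kind of commands that the description above implies, one set per remote node. It is a dry run only and is not the ha_propagate implementation; the node list, the local node name, the root login, and the exact chkconfig invocation are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical inputs: in practice the nodes come from the "node" lines in ha.cf
# and the local name from "uname -n".
nodes="alice bob"
local_node="alice"

plan=""
for node in $nodes; do
    # The local node already has the files.
    [ "$node" = "$local_node" ] && continue
    plan="${plan}scp /etc/ha.d/ha.cf /etc/ha.d/authkeys root@${node}:/etc/ha.d/
ssh root@${node} chkconfig heartbeat on
"
done

# Print the plan instead of executing it.
printf '%s' "$plan"
```

Printing the plan first makes it easy to review which hosts would be touched before running anything against them.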
5.4. Starting Heartbeat services
Heartbeat services are started like any other system service on your machine. Depending on your platform, you may use one of the following commands:
/etc/init.d/heartbeat start
service heartbeat start
rcheartbeat start
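With the pacemaker entry in your ha.cf, Heartbeat now starts the Pacemaker daemon (named crmd for historical reasons) along with its other services. After a few seconds, you should be able to see the Heartbeat processes in your process table: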
# ps -AHfww | grep heartbeat
root      2772  1639  0 14:27 pts/0    00:00:00 grep heartbeat
root      4175     1  0 Nov08 ?        00:37:57 heartbeat: master control process
root      4224  4175  0 Nov08 ?        00:01:13 heartbeat: FIFO reader
root      4227  4175  0 Nov08 ?        00:01:28 heartbeat: write: bcast eth2
root      4228  4175  0 Nov08 ?        00:01:29 heartbeat: read: bcast eth2
root      4229  4175  0 Nov08 ?        00:01:35 heartbeat: write: mcast bond0
root      4230  4175  0 Nov08 ?        00:01:32 heartbeat: read: mcast bond0
102       4233  4175  0 Nov08 ?        00:03:37 /usr/lib/heartbeat/ccm
102       4234  4175  0 Nov08 ?        00:15:02 /usr/lib/heartbeat/cib
root      4235  4175  0 Nov08 ?        00:17:14 /usr/lib/heartbeat/lrmd -r
root      4236  4175  0 Nov08 ?        00:02:48 /usr/lib/heartbeat/stonithd
102       4237  4175  0 Nov08 ?        00:00:54 /usr/lib/heartbeat/attrd
102       4238  4175  0 Nov08 ?        00:08:32 /usr/lib/heartbeat/crmd
102       5724  4238  0 Nov08 ?        00:04:47 /usr/lib/heartbeat/pengine
Finally, verify that the cluster is up and running with the following command:
# crm_mon -1
============
Last updated: Mon Dec 13 14:29:36 2010
Stack: Heartbeat
Current DC: alice (083146b9-6e26-4ac8-a705-317095d0ba57) - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, unknown expected votes
24 Resources configured.
============
Online: [ alice bob ]
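A check like this can be scripted. The sketch below parses the Online line from captured crm_mon -1 output and compares it with an expected node list. The sample text is copied from the output shown in this section; on a live cluster you would substitute the output of crm_mon -1 itself. The variable names and the expected node list are assumptions for this example.

```shell
#!/bin/sh
expected="alice bob"

# Stand-in for: status=$(crm_mon -1)
status=$(cat <<'EOF'
============
Last updated: Mon Dec 13 14:29:36 2010
Stack: Heartbeat
Current DC: alice (083146b9-6e26-4ac8-a705-317095d0ba57) - partition with quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, unknown expected votes
24 Resources configured.
============
Online: [ alice bob ]
EOF
)

# Strip "Online: [ " and " ]" to leave the space-separated node names.
online=$(printf '%s\n' "$status" | sed -n 's/^Online: \[ \(.*\) ]$/\1/p')

# Record every expected node that does not appear in the Online list.
missing=""
for node in $expected; do
    case " $online " in
        *" $node "*) ;;
        *) missing="$missing $node" ;;
    esac
done

if [ -z "$missing" ]; then
    echo "all expected nodes are online: $online"
else
    echo "nodes not online:$missing"
fi
```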
5.5. Where to go from here
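Now that you have a working Heartbeat configuration, you will want to continue by configuring Pacemaker and adding cluster resources. The following documents are available for further reading: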
- Clusters From Scratch is an excellent guide for configuring Pacemaker clusters. The document primarily covers Pacemaker on Corosync, but everything from its chapter "Using Pacemaker Tools" onward applies identically to Heartbeat clusters.
- The DRBD User’s Guide has a chapter dedicated to integrating DRBD with Pacemaker clusters.
- Several Technical Guides covering Heartbeat/Pacemaker clusters for a variety of applications are available from the LINBIT web site.
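Chapter 6. Upgrading Heartbeat
6.1. Upgrading Heartbeat 2.1 clusters that do not use the CRM
Upgrading a Heartbeat 2.1 cluster that does not use the CRM (that is, a cluster configured with a haresources file) to 3.0 requires converting the current configuration into one suitable for Pacemaker.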
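Note | |
---|---|
The upgrade procedure entails a brief period of downtime. However, if the upgrade is carefully planned, tested, and executed, the downtime can be reduced to minutes or even seconds, depending on the configuration. |
You should start the upgrade procedure on your current standby node, that is, the cluster node that is not currently running any resources. If your cluster uses an active/active configuration (resources running on both nodes), pick one node and issue the following command to transfer all resources to the peer node: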
# hb_standby
Then, on that node only, stop Heartbeat services:
# /etc/init.d/heartbeat stop
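While upgrading, it is important to recall that the monolithic Heartbeat 2.1 tree has been split up into modular parts. Thus you will replace Heartbeat with three individual pieces of software: Cluster Glue, Pacemaker, and Heartbeat 3, which comprises just the cluster messaging layer.
- Upgrading from source: In the unpacked archive that you installed Heartbeat 2.1 from, run make uninstall. Then install Cluster Glue and Heartbeat.
- Upgrading using locally built packages: When installing packages manually, uninstall the heartbeat package first. Then install cluster-glue, the version 3 heartbeat package, resource-agents, and pacemaker.
- Upgrading using a package repository: When upgrading using an APT, YUM, or Zypper repository, you should be able to run the install command for Heartbeat 3 and Pacemaker, and the dependencies will be resolved automatically.
Do not restart Heartbeat services at this point.
You must now instruct the cluster messaging layer to start Pacemaker at cluster startup. To do so, add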
crm respawn
to your ha.cf configuration file.
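When scripting this change, it helps to make it idempotent so that running it twice does not add the line twice. The sketch below works on a scratch copy of ha.cf and appends the directive only if no crm or pacemaker line is present. The scratch path, the sample contents, and the function name enable_crm are assumptions for this example.

```shell
#!/bin/sh
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
hacf="$workdir/ha.cf"   # stand-in for /etc/ha.d/ha.cf

# Sample pre-upgrade configuration (hypothetical contents).
cat > "$hacf" <<'EOF'
bcast eth2
keepalive 2
deadtime 15
node alice
node bob
EOF

# Append "crm respawn" unless a crm or pacemaker directive already exists.
# An existing line with a different value is reported, not rewritten.
enable_crm() {
    if grep -Eq '^[[:space:]]*(crm|pacemaker)[[:space:]]' "$1"; then
        echo "CRM directive already present in $1"
    else
        echo "crm respawn" >> "$1"
        echo "added 'crm respawn' to $1"
    fi
}

enable_crm "$hacf"
enable_crm "$hacf"   # the second call leaves the file unchanged

crm_lines=$(grep -c '^crm respawn$' "$hacf")
echo "crm respawn lines: $crm_lines"
```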
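Important | |
---|---|
At this point, you should also check your ha.cf file against the ha.cf(5) man page and remove any obsolete options. |
When your ha.cf modifications are complete, copy the file to the peer node.
Your cluster will now be restarted in Pacemaker-enabled mode. Proceed as follows:
- Run /etc/init.d/heartbeat stop on your active node. This shuts down your cluster resources.
- Run /etc/init.d/heartbeat start on your standby node (the one where you created your CIB). This starts the local Heartbeat instance and Pacemaker. The node then waits for the other cluster node to check in.
- Run /etc/init.d/heartbeat start on your other node. This starts the local Heartbeat instance and Pacemaker, retrieves the CIB automatically, and then starts your applications.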
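6.2. Upgrading CRM-enabled Heartbeat 2.1 clusters
This section outlines the steps necessary to upgrade a Heartbeat 2.1 cluster with the built-in CRM enabled to Heartbeat 3.0 with Pacemaker.
When properly planned and executed, the upgrade procedure can be completed in a matter of minutes, with no application downtime. We strongly recommend that you read and understand the steps described in this section before attempting an upgrade on a production cluster. All commands in this section must be run as root. Do not perform the individual steps on all cluster nodes in parallel. Instead, complete the procedure on each node before moving on to the next.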
6.2.1. Placing the cluster in unmanaged mode
With this step, the cluster temporarily relinquishes control of its resources. This means that the cluster no longer monitors its nodes or resources for the duration of the upgrade, and will not rectify any application or node failures during this time. Currently running resources, however, will continue to run.
# crm_attribute -t crm_config -n is_managed_default -v false
In most configurations, individual resources do not set the is_managed attribute individually, and hence the cluster-wide attribute is_managed_default applies to all of them.
If in your specific configuration you do have resources that have this attribute set, you should remove it to make sure the default applies:
# crm_resource -t primitive -r <resource-id> -p is_managed -D
6.2.2. Backing up the CIB
At this point, it is important to save a copy of the Cluster Information Base (CIB). The CIB to save is stored in a file named cib.xml, normally located in /var/lib/heartbeat/crm.
# cp /var/lib/heartbeat/crm/cib.xml ~
You need to perform this step on only one node currently connected to the cluster. Do not delete this file; it will be restored later.
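A slightly more defensive version of this step keeps the file's mode and timestamps and confirms that the copy is identical before moving on. The sketch below rehearses this on a scratch directory standing in for /var/lib/heartbeat/crm; the paths and the placeholder file contents are assumptions for this example.

```shell
#!/bin/sh
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Stand-ins for /var/lib/heartbeat/crm and the backup location.
crm_dir="$workdir/crm"
backup_dir="$workdir/backup"
mkdir -p "$crm_dir" "$backup_dir"

# Placeholder content; a real cib.xml is written by the cluster.
printf '<cib>placeholder</cib>\n' > "$crm_dir/cib.xml"

# Copy, preserving mode and timestamps, then compare byte for byte.
cp -p "$crm_dir/cib.xml" "$backup_dir/cib.xml"
if cmp -s "$crm_dir/cib.xml" "$backup_dir/cib.xml"; then
    backup_ok=yes
    echo "backup verified: $backup_dir/cib.xml"
else
    backup_ok=no
    echo "backup differs from source, do not continue" >&2
fi
```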
6.2.3. Stopping Heartbeat services
You may now stop Heartbeat with /etc/init.d/heartbeat stop or the preferred command to stop a system service on your distribution (service heartbeat stop, rcheartbeat stop, etc.).
If you are running a legacy version of Heartbeat affected by a shutdown bug, graceful crmd shutdown will not work properly in unmanaged mode.
In this case, after you have initiated a graceful service shutdown with the above command, kill the crmd process manually:
- use ps -AHfww to retrieve the process ID of crmd;
- kill crmd with a TERM signal.
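The lookup can be scripted as in the following sketch, which pulls the crmd process ID out of ps -AHfww style output and prints the kill command rather than running it. The sample lines are taken from the process listing in Section 5.4; the column positions assume that output format.

```shell
#!/bin/sh
# Stand-in for: ps_output=$(ps -AHfww)
ps_output=$(cat <<'EOF'
root      4175     1  0 Nov08 ?        00:37:57 heartbeat: master control process
102       4237  4175  0 Nov08 ?        00:00:54 /usr/lib/heartbeat/attrd
102       4238  4175  0 Nov08 ?        00:08:32 /usr/lib/heartbeat/crmd
102       5724  4238  0 Nov08 ?        00:04:47 /usr/lib/heartbeat/pengine
EOF
)

# Column 2 is the PID; column 8 is the start of the command.
crmd_pid=$(printf '%s\n' "$ps_output" | awk '$8 ~ /\/crmd$/ { print $2; exit }')

if [ -n "$crmd_pid" ]; then
    # Dry run: print the command instead of sending the signal.
    echo "kill -TERM $crmd_pid"
else
    echo "no crmd process found"
fi
```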
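6.2.4. Wiping files related to the CRM
Warning | |
---|---|
Before proceeding with this section, make sure that you have created a backup copy of the CIB on one of your cluster nodes, as described in Section 6.2.2, “Backing up the CIB”. |
You must now wipe the local CRM-related files from your node. To do so, remove all files from the directory where the CRM stores CIB information, normally /var/lib/heartbeat/crm.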
# rm /var/lib/heartbeat/crm/*
6.2.5. Restoring the CIB
Note | |
---|---|
You should only proceed with this step if Heartbeat is still stopped on all cluster nodes, and all cluster nodes have had their CIB contents wiped. If you still have remaining nodes that have a residual CIB configuration, proceed as outlined in Section 6.2.4, “Wiping files related to the CRM”. |
Restoring the CIB means copying the CIB backup described in Section 6.2.2, “Backing up the CIB” to the /var/lib/heartbeat/crm directory.
# cp ~/cib.xml /var/lib/heartbeat/crm/cib.xml
# chown hacluster:haclient /var/lib/heartbeat/crm/cib.xml
# chmod 0600 /var/lib/heartbeat/crm/cib.xml
You must perform this step on one node only, namely the first node on which you are about to upgrade the cluster software. On all other nodes, the /var/lib/heartbeat/crm directory must remain empty; Pacemaker distributes the CIB automatically.
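Before restarting Heartbeat on a node other than the first, a quick check that the directory is really empty can catch a leftover file. The following is a minimal sketch that uses a scratch directory in place of /var/lib/heartbeat/crm; the function name dir_is_empty and the leftover file name are made up for this example.

```shell
#!/bin/sh
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Succeeds only if the directory contains no entries, hidden files included.
dir_is_empty() {
    [ -z "$(ls -A "$1")" ]
}

crm_dir="$workdir/crm"   # stand-in for /var/lib/heartbeat/crm
mkdir "$crm_dir"

if dir_is_empty "$crm_dir"; then first_check=empty; else first_check=not-empty; fi

# Simulate a file left behind from the old CIB (hypothetical name).
: > "$crm_dir/cib.xml.sig"
if dir_is_empty "$crm_dir"; then second_check=empty; else second_check=not-empty; fi

echo "before: $first_check, after leftover file: $second_check"
```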
6.2.6. Upgrading software
While upgrading, it is important to recall that the monolithic Heartbeat 2.1 tree has been split up into modular parts. Thus you will replace Heartbeat with three individual pieces of software: Cluster Glue, Pacemaker, and Heartbeat 3, which comprises just the cluster messaging layer.
- Upgrading from source: In the unpacked archive that you installed Heartbeat 2.1 from, run make uninstall. Then, install Cluster Glue and Heartbeat.
- Upgrading using locally built packages: When installing packages manually, uninstall the heartbeat package first. Then install cluster-glue, the version 3 heartbeat package, resource-agents, and pacemaker.
- Upgrading using a package repository: When upgrading using an APT, YUM, or Zypper repository, you should just be able to run the install command for heartbeat version 3 and pacemaker, and the dependencies will be resolved automatically.
If this is the last node to be upgraded in your cluster, and your package management system did not restart Heartbeat services after the software upgrade, you should now proceed to Section 6.2.7, “Restarting Heartbeat services”. Otherwise, you should move to the next node and proceed as outlined in Section 6.2.3, “Stopping Heartbeat services” through Section 6.2.6, “Upgrading software”.
6.2.7. Restarting Heartbeat services
Note | |
---|---|
This step may be omitted in case your package management system automatically restarts Heartbeat services during post-install. |
First, restart Heartbeat on the node where you restored the CIB (see Section 6.2.5, “Restoring the CIB”) with /etc/init.d/heartbeat start. Then, repeat this command on the remaining cluster nodes. At this time,
- the cluster is still in unmanaged mode (meaning it does not start, stop, or monitor any resources),
- the cluster redistributes the old CIB among its nodes, and
- the cluster is still using the pre-upgrade CIB schema.
6.2.8. Returning the cluster to managed mode
Once the cluster software has been upgraded, it is recommended to return the cluster to managed mode:
# crm_attribute -t crm_config -n is_managed_default -v true
6.2.9. Upgrading the CIB schema
Although an upgraded cluster can theoretically operate on the pre-upgrade CIB schema indefinitely, it is strongly recommended to upgrade the CIB to the current schema. To do so, run the following command after cluster communications between all nodes have been re-established:
# cibadmin --upgrade --force