Configuring HACMP 5.3 Using IP Aliases
1. Define the network adapter addresses of both nodes and edit /etc/hosts
/etc/hosts
10.0.0.70 lpar01_boot
10.0.0.71 lpar02_boot
10.1.0.71 lpar2_stb
10.1.0.70 lpar1_stb
18.0.45.10 lpar1_svc
18.0.45.11 lpar2_svc
18.0.45.100 lpar1_per
18.0.45.101 lpar2_per
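Both nodes need the same entries, so it can help to confirm that every label is present before going further. The following is only a minimal sketch and not part of the original procedure; the function name missing_labels and its arguments are hypothetical.

```shell
#!/bin/sh
# missing_labels HOSTS_FILE LABEL...
# Prints each LABEL that has no entry in HOSTS_FILE, one per line.
# Hypothetical helper for illustration; run it on both nodes.
missing_labels() {
    hosts_file=$1
    shift
    for label in "$@"; do
        # Strip comments, then look for the label among the names
        # that follow the IP address on each line.
        if ! awk -v want="$label" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$hosts_file"; then
            echo "$label"
        fi
    done
}
```

A possible call would be: missing_labels /etc/hosts lpar01_boot lpar02_boot lpar1_stb lpar2_stb lpar1_svc lpar2_svc lpar1_per lpar2_per. No output means every label was found.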
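2. Create the shared VGs and the heartbeat VG (this example uses a dedicated heartbeat VG for the disk heartbeat; it is never varied on during operation)
Create the VG for the disk heartbeat. This VG must be a concurrent VG.
On lpar01: mkvg -C -V 44 -y hb_vg hdisk4
Note: under a 64-bit kernel, only enhanced concurrent volume groups can be used.
On lpar02: importvg -V 44 -y hb_vg hdisk4
Note: the VG's major number must be the same on both nodes. Use the lvlstmajor command to list the free major numbers, and assign the VG one that is not in use.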
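One way to confirm that the major numbers really match is to compare the VG device entries on the two nodes. The sketch below assumes that ls -l on the device prints the usual character-device listing, where the major number is the field followed by a comma; the helper name major_of is hypothetical.

```shell
#!/bin/sh
# major_of: read one `ls -l` line for a device on stdin and print its
# major number (the first field that looks like "NN," or "NN,MM").
# Hypothetical helper; assumes the common "major, minor" listing format.
major_of() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9]+,/) { sub(/,.*$/, "", $i); print $i; exit }
    }'
}
```

Running ls -l /dev/hb_vg | major_of on lpar01 and on lpar02 should then print the same number (44 in this example).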
Create the VGs controlled by the HA resource groups.
On lpar01:
# mkvg -n -V 45 -y sharevg1 hdisk5
# mkvg -n -V 46 -y sharevg2 hdisk6
On lpar02:
# importvg -V 45 -y sharevg1 hdisk5
# chvg -a n sharevg1
# importvg -V 46 -y sharevg2 hdisk6
# chvg -a n sharevg2
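The same three commands repeat for every shared VG, with only the name, major number, and disk changing. As a rough illustration, the pattern could be written as a small generator that prints the commands without running them; shared_vg_cmds is a hypothetical name, and the flags are the ones used above.

```shell
#!/bin/sh
# shared_vg_cmds VG MAJOR DISK
# Prints (does not run) the commands used above for one shared VG.
# Hypothetical helper for illustration only.
shared_vg_cmds() {
    vg=$1
    major=$2
    disk=$3
    echo "# on lpar01"
    echo "mkvg -n -V $major -y $vg $disk"
    echo "# on lpar02"
    echo "importvg -V $major -y $vg $disk"
    echo "chvg -a n $vg"
}
```

For example, shared_vg_cmds sharevg1 45 hdisk5 prints the sharevg1 commands, which can be reviewed before being typed on each node.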
3. Configure the cluster with smitty hacmp
(1) Extended Configuration
-> Extended Topology Configuration
-> Configure an HACMP Cluster -> Add/Change/Show an HACMP Cluster
* Cluster Name [cluster]
(2) Extended Configuration
-> Extended Topology Configuration
-> Configure HACMP Nodes -> Add a Node to the HACMP Cluster
* Node Name [lpar01]
Communication Path to Node [lpar01_boot] +
(3) Extended Configuration
-> Extended Topology Configuration
-> Configure HACMP Nodes -> Add a Node to the HACMP Cluster
* Node Name [lpar02]
Communication Path to Node [lpar02_boot]
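(4) Extended Configuration
-> Discover HACMP-related Information from Configured Nodes
If warnings or communication errors are reported, check /usr/es/sbin/cluster/etc/rhosts on each node and add lpar01_boot and lpar02_boot to the file on the other node. (To save trouble, you can simply put all of the labels into this file.)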
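Adding the labels by hand on two nodes is easy to get wrong, so the update can be made repeatable. The following is a minimal sketch that appends only the entries that are missing; add_rhosts_entries is a hypothetical helper, not an HACMP tool.

```shell
#!/bin/sh
# add_rhosts_entries FILE ENTRY...
# Appends each ENTRY to FILE unless an identical line is already there,
# so the function can be run repeatedly without creating duplicates.
# Hypothetical helper for illustration only.
add_rhosts_entries() {
    rhosts=$1
    shift
    touch "$rhosts"
    for entry in "$@"; do
        # -q: quiet, -x: match the whole line, -F: fixed string
        grep -qxF -e "$entry" "$rhosts" || echo "$entry" >> "$rhosts"
    done
}
```

On each node this might be called as add_rhosts_entries /usr/es/sbin/cluster/etc/rhosts lpar01_boot lpar02_boot, optionally followed by the remaining labels.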
(5) Extended Configuration
-> Extended Topology Configuration -> Configure HACMP Networks
# Discovered IP-based Network Types
> ether
Add an IP-Based Network to the HACMP Cluster
* Network Name [net_ether_01]
* Network Type ether
* Netmask [255.255.255.0]
* Enable IP Address Takeover via IP Aliases [Yes]
IP Address Offset for Heartbeating over IP Aliases [10.10.10.1] (optional)
(6) Extended Configuration
-> Extended Topology Configuration -> Configure HACMP Networks
# Discovered Serial Device Types
> diskhb
  tmssa
Add a Serial Network to the HACMP Cluster
[Entry Fields]
* Network Name [net_diskhb_01]
* Network Type diskhb
(7) Extended Configuration
-> Extended Topology Configuration -> Configure HACMP Communication Interfaces/Devices -> Add Discovered Communication Interface and Devices -> Communication Interfaces -> net_ether_01
# Node / Network
# Interface IP Label IP Address
# net_ether_01 / lpar01
> en0 lpar01_boot 10.0.0.70
> en1 lpar1_stb 10.1.0.70
# net_ether_01 / lpar02
> en2 lpar02_boot 10.0.0.71
> en3 lpar2_stb 10.1.0.71
(8) Extended Configuration
-> Extended Topology Configuration -> Configure HACMP Communication Interfaces/Devices -> Add Discovered Communication Interface and Devices -> Communication Devices
# Node Device Device Path Pvid
lpar01 hdisk4 /dev/hdisk4 0000176f7b3
lpar02 hdisk4 /dev/hdisk4 0000176f7b3
lpar01 hdisk5 /dev/hdisk5 0000176f5b5
lpar02 hdisk5 /dev/hdisk5 0000176f5b5
lpar02 tmssa1 /dev/tmssa1
lpar01 tmssa2 /dev/tmssa2
lpar01 tty0 /dev/tty0
lpar01 tty1 /dev/tty1
(9) Extended Configuration
-> Extended Topology Configuration -> Configure HACMP Persistent Node IP Label/Addresses -> Add a Persistent Node IP Label/Address -> lpar01
[Entry Fields]
* Node Name lpar01
* Network Name [net_ether_01] +
* Node IP Label/Address [lpar1_per] +
(10) Extended Configuration
-> Extended Topology Configuration -> Configure HACMP Persistent Node IP Label/Addresses -> Add a Persistent Node IP Label/Address -> lpar02
[Entry Fields]
* Node Name lpar02
* Network Name [net_ether_01] +
* Node IP Label/Address [lpar2_per] +
(11) Extended Configuration
-> Extended Resource Configuration -> HACMP Extended Resources Configuration -> Configure HACMP Applications -> Configure HACMP Application Servers
[Entry Fields]
* Server Name [lpar01_app]
* Start Script [/hascript/hastart1]
* Stop Script [/hascript/hastop1]
Application Monitor Name(s)
(12) Extended Configuration
-> Extended Resource Configuration -> HACMP Extended Resources Configuration -> Configure HACMP Applications -> Configure HACMP Application Servers
[Entry Fields]
* Server Name [lpar02_app]
* Start Script [/hascript/hastart2]
* Stop Script [/hascript/hastop2]
Application Monitor Name(s)
(13) Extended Configuration
-> Extended Resource Configuration -> HACMP Extended Resources Configuration -> Configure HACMP Service IP Labels/Addresses -> Add a Service IP Label/Address -> Configurable on Multiple Nodes -> net_ether_01
[Entry Fields]
* IP Label/Address lpar1_svc +
* Network Name net_ether_01
Alternate Hardware Address to accompany IP Label/Address []
Note: to use IPAT via replacement instead, enter the adapter's MAC address here. With replacement, the contents of /etc/hosts differ slightly, so take care.
(14) Extended Configuration
-> Extended Resource Configuration -> HACMP Extended Resources Configuration -> Configure HACMP Service IP Labels/Addresses -> Add a Service IP Label/Address -> Configurable on Multiple Nodes -> net_ether_01
[Entry Fields]
* IP Label/Address lpar2_svc +
* Network Name net_ether_01
Alternate Hardware Address to accompany IP Label/Address []
(15) Extended Configuration
-> Extended Resource Configuration -> HACMP Extended Resource Group Configuration -> Add a Resource Group
[Entry Fields]
* Resource Group Name [lpar1_resource]
* Participating Nodes (Default Node Priority) [lpar01 lpar02] +
Startup Policy Online On Home Node Only +
Fallover Policy Fallover To Next Priority Node In The Lis> +
Fallback Policy Fallback To Higher Priority Node In The L> +
(16) Extended Configuration
-> Extended Resource Configuration -> HACMP Extended Resource Group Configuration -> Add a Resource Group
[Entry Fields]
* Resource Group Name [lpar2_resource]
* Participating Nodes (Default Node Priority) [lpar02 lpar01]
Startup Policy Online On Home Node Only +
Fallover Policy Fallover To Next Priority Node In The Lis> +
Fallback Policy Fallback To Higher Priority Node In The L> +
(17) Extended Configuration
-> Extended Resource Configuration -> HACMP Extended Resource Group Configuration -> Change/Show Resources and Attributes for a Resource Group -> lpar1_resource
Service IP Labels/Addresses [lpar1_svc] +
Application Servers [lpar01_app] +
Volume Groups [sharevg1]
(18) Extended Configuration
-> Extended Resource Configuration -> HACMP Extended Resource Group Configuration -> Change/Show Resources and Attributes for a Resource Group -> lpar2_resource
Service IP Labels/Addresses [lpar2_svc] +
Application Servers [lpar02_app] +
Volume Groups [sharevg2]
(19) Extended Configuration
-> Extended Verification and Synchronization
[Entry Fields]
* Verify, Synchronize or Both [Both] +
* Automatically correct errors found during [No] +
verification?
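Selecting Yes here lets the system automatically correct the problems found during verification and synchronization. It is better to choose No first and resolve the reported problems yourself.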
(20) # smitty clstart
Start Cluster Services
[Entry Fields]
* Start now, on system restart or both now
Start Cluster Services on these nodes [lpar01]
BROADCAST message at startup? true
Startup Cluster Information Daemon? false
Reacquire resources after forced down ? false
Ignore verification errors? false
Automatically correct errors found during No
cluster start?
Note: in HACMP 5.3, the following process remains active after the machine boots and even after HACMP has been stopped:
# lssrc -g cluster
Subsystem Group PID Status
clstrmgrES cluster 417996 active (note: this differs from earlier versions)
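When this check is scripted, the status column can be pulled out of the lssrc output. The snippet below is a small sketch that reads the output on standard input; subsys_status is a hypothetical name, and it assumes the status is the last field of the line, as in the listing above.

```shell
#!/bin/sh
# subsys_status NAME: read `lssrc` output on stdin and print the status
# (the last field) of the line whose first field is NAME.
# Hypothetical helper for illustration only.
subsys_status() {
    awk -v name="$1" '$1 == name { print $NF }'
}
```

A possible use is lssrc -g cluster | subsys_status clstrmgrES, which should print active on a node like the one shown above.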
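4. Problems reported during verification and synchronization
(1) On a freshly installed system, HA verification asks you to change the values of two network options:
Network option "nonlocsrcroute" is set to 0, needs to be set to 1
Network option "ipsrcrouterecv" is set to 0, needs to be set to 1
Change them with the following commands:
no -o nonlocsrcroute=1
no -o ipsrcrouterecv=1
(2) An error message reports that the time stamps of the shared volume groups differ between the two nodes. You can let the system correct this automatically.
(3) A message reports that /usr/es/sbin/cluster/etc/clhosts.client does not contain the labels used by HA. On inspection, the file did not exist at all. With automatic correction, the system created the file and added the required contents.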
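For the network options in item (1), the current values can be checked before verification is run. The sketch below assumes that no -o NAME prints a line of the form "name = value"; the helper name needs_change is hypothetical, and it only prints the commands that would be needed.

```shell
#!/bin/sh
# needs_change: read "name = value" lines on stdin and print a
# `no -o name=1` command for every option whose value is not 1.
# Hypothetical helper; it prints commands and changes nothing itself.
needs_change() {
    awk '{
        n = split($0, kv, "=")
        if (n != 2) next
        gsub(/[ \t]/, "", kv[1])   # strip blanks around the name
        gsub(/[ \t]/, "", kv[2])   # strip blanks around the value
        if (kv[2] != "1") print "no -o " kv[1] "=1"
    }'
}
```

It might be driven like this: for opt in nonlocsrcroute ipsrcrouterecv; do no -o "$opt"; done | needs_change. An empty result means both options are already set to 1.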
5. Checking the disk heartbeat
The hb_vg used for the disk heartbeat does not need to be varied on on either node.
(1) # lssrc -ls topsvcs
Subsystem Group PID Status
topsvcs topsvcs 450634 active
Network Name Indx Defd Mbrs St Adapter ID Group ID
net_ether_01_0 [ 0] 2 0 D 10.1.0.80
net_ether_01_0 [ 0] en1
HB Interval = 1.000 secs. Sensitivity = 10 missed beats
Missed HBs: Total: 0 Current group: 0
Packets sent : 0 ICMP 0 Errors: 0 No mbuf: 0
Packets received: 0 ICMP 0 Dropped: 0
NIM's PID: 512104
net_ether_01_1 [ 1] 2 2 S 10.0.0.80 10.0.0.85
net_ether_01_1 [ 1] en0 0x44ce904a 0x40086f06
HB Interval = 1.000 secs. Sensitivity = 10 missed beats
Missed HBs: Total: 0 Current group: 0
Packets sent : 667 ICMP 0 Errors: 0 No mbuf: 0
Packets received: 959 ICMP 0 Dropped: 0
NIM's PID: 311506
diskhb_0 [ 2] 2 2 S 255.255.10.0 255.255.10.1
diskhb_0 [ 2] rhdisk4 0x84ce9049 0x80086f0b
HB Interval = 2.000 secs. Sensitivity = 4 missed beats
Missed HBs: Total: 0 Current group: 0
Packets sent : 313 ICMP 0 Errors: 0 No mbuf: 0
Packets received: 313 ICMP 0 Dropped: 0
NIM's PID: 442424
2 locally connected Clients with PIDs:
haemd(516282) hagsd(495686)
Dead Man Switch Enabled:
reset interval = 1 seconds
trip interval = 20 seconds
Configuration Instance = 20
Daemon employs no security
Segments pinned: Text Data.
Text segment size: 769 KB. Static data segment size: 981 KB.
Dynamic data segment size: 3841. Number of outstanding malloc: 157
User time 0 sec. System time 1 sec.
Number of page faults: 253. Process swapped out 0 times.
Number of nodes up: 2. Number of nodes down: 0.
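The listing is long, and the lines that matter most are the ones carrying the Defd, Mbrs, and St columns for each heartbeat network. The following sketch condenses them, assuming that Defd is the number of defined adapters and Mbrs the number of current group members; hb_summary is a hypothetical name, and the OK/CHECK verdict is only a convenience added here.

```shell
#!/bin/sh
# hb_summary: read `lssrc -ls topsvcs` output on stdin and print one line
# per heartbeat network: name, Defd, Mbrs, St, and OK when Mbrs equals
# Defd, otherwise CHECK. Hypothetical helper for illustration only.
hb_summary() {
    awk '{
        line = $0
        # Drop the "[ n]" index column so the remaining fields line up.
        if (!sub(/\[ *[0-9]+\]/, "", line)) next
        n = split(line, f, " ")
        # Only the first line of each network carries Defd, Mbrs and St.
        if (n < 4 || f[2] !~ /^[0-9]+$/ || f[3] !~ /^[0-9]+$/) next
        if (f[4] !~ /^[A-Z]$/) next
        verdict = (f[3] + 0 == f[2] + 0) ? "OK" : "CHECK"
        print f[1], f[2], f[3], f[4], verdict
    }'
}
```

Used as lssrc -ls topsvcs | hb_summary, the output above would be reduced to three lines, with net_ether_01_0 (2 defined, 0 members, state D) flagged as CHECK and the other two networks reported as OK.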
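(2) Check the logs in the /var/ha/log directory.
Files with hdisk in the name are disk heartbeat logs; files with en in the name are network heartbeat logs.
For example: nim.topsvcs.en0.ha53 nim.topsvcs.en1.ha53 nim.topsvcs.rhdisk4.ha53
tail -f nim.topsvcs.rhdisk4.ha53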
08/04 16:28:42.482: nim error successfully sent.
08/04 16:28:42.486: Received a SEND MSG command. Dst: .
08/04 16:28:42.486: Received a SEND MSG command. Dst: .
08/04 16:28:52.524: Received a SEND MSG command. Dst: .
08/04 16:29:02.571: Received a SEND MSG command. Dst: .
08/04 16:29:12.155: Receive thread blocked for 15 seconds