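How HACMP works: it monitors the network, network adapters, and related state to provide IP takeover and application monitoring (when the primary IHS is detected as failed, the backup IHS takes over the service IP, and IHS is stopped and started accordingly).
1. Prerequisite configuration
Back up all hosts files and network settings.
a. Open the ports for 10.97 on 0.90 and 0.80 (already open) - 10.97 OK (the commands below can be run on 10.97)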
telnet 192.168.0.90 9062    # WC_defaulthost
telnet 192.168.0.90 9088    # WC_defaulthost
telnet 192.168.0.80 9109    # WC_defaulthost
telnet 192.168.0.80 9127    # WC_defaulthost
b. Configure the hosts files <manually back up the relevant hosts files first>
192.168.0.94:
(
comment out:
#192.168.10.92 test-v-web01 test-v-web01.jeff.com shop shop.jeff.com
add:
192.168.10.91 test-v-web01 test-v-web01.jeff.com shop shop.jeff.com
)
or
(
add:
192.168.10.97 test-v-web02 test-v-web02.jeff.com shop shop.jeff.com
)
192.168.0.106: (must be edited manually each time; to switch back to the Machine A IHS, comment out 10.97 and uncomment 10.92)
(
comment out:
#192.168.10.92 test-v-web01 test-v-web01.jeff.com shop shop.jeff.com
add:
192.168.10.91 test-v-web01 test-v-web01.jeff.com shop shop.jeff.com
)
or
(
add:
192.168.10.97 test-v-web02 test-v-web02.jeff.com shop shop.jeff.com
)
192.168.10.92:
192.168.10.A1 test-v-web01 test-v-web01.jeff.com hostweb01_boot1
192.168.10.92 test-v-web01 test-v-web01.jeff.com hostweb01_boot2
192.168.10.89 test-v-web01 test-v-web01.jeff.com media.jeff.com hostweb01_svc_media
192.168.10.90 test-v-web01 test-v-web01.jeff.com www.jeff.com hostweb01_svc_www
192.168.10.91 test-v-web01 test-v-web01.jeff.com shop shop.jeff.com hostweb01_svc_shop
192.168.10.B1 hostweb02_boot1
192.168.10.97 hostweb02_boot2
192.168.10.98 hostweb02_svc_media
192.168.10.99 hostweb02_svc_www
192.168.10.100 hostweb02_svc_shop
192.168.10.97:
comment out:
#192.168.10.95 test-v-wcs02 test-v-wcs02.jeff.com
#192.168.0.96 test-v-wcs02 test-v-wcs02.jeff.com
#192.168.10.98 test-v-web02 test-v-web02.jeff.com uatshop uatshop.jeff.com shop.jeff.com
#192.168.10.97 test-v-web02 test-v-web02.jeff.com
add:
192.168.10.B1 test-v-web02 test-v-web02.jeff.com hostweb02_boot1
192.168.10.97 test-v-web02 test-v-web02.jeff.com hostweb02_boot2
192.168.10.98 test-v-web02 test-v-web02.jeff.com media.jeff.com hostweb02_svc_media
192.168.10.99 test-v-web02 test-v-web02.jeff.com www.jeff.com hostweb02_svc_www
192.168.10.100 test-v-web02 test-v-web02.jeff.com shop shop.jeff.com hostweb02_svc_shop
192.168.10.A1 hostweb01_boot1
192.168.10.92 hostweb01_boot2
192.168.10.89 hostweb01_svc_media
192.168.10.90 hostweb01_svc_www
192.168.10.91 hostweb01_svc_shop
192.168.0.90 test-v-wcs01 test-v-wcs01.jeff.com
192.168.0.94 test-v-wcs01 test-v-wcs01.jeff.com
192.168.10.103 test-v-wcsuat01 test-v-wcsuat01.jeff.com
192.168.0.104 test-v-wcsuat01 test-v-wcsuat01.jeff.com
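(Note:
192.168.10.A1 : the new physical IP assigned to the physical adapter in machine A LPAR2 that previously held 10.89; 10.89 is kept as an alias and acts as a service IP
192.168.10.B1 : the new physical IP assigned to the physical adapter in machine A LPAR2 that previously held 10.98; 10.98 is kept as an alias and acts as a service IP
)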
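After the hosts files are edited, it can help to confirm that every cluster label still appears on an active (uncommented) line on each machine. The following is a minimal sketch and not part of the original procedure; the helper name `missing_labels` is hypothetical, and it only inspects the file, it does not test name resolution.

```shell
#!/bin/sh
# Sketch: print each label that has no active (uncommented) entry in a
# hosts file. missing_labels is a hypothetical helper name.
# usage: missing_labels HOSTS_FILE LABEL...
missing_labels() {
    hosts_file=$1
    shift
    for label in "$@"; do
        # Drop comments, then search the name columns (fields 2..NF).
        if ! awk -v want="$label" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
            END { exit !found }
        ' "$hosts_file"; then
            echo "$label"
        fi
    done
}
```

For example, `missing_labels /etc/hosts hostweb01_boot1 hostweb01_boot2 hostweb01_svc_shop` prints nothing when all three labels are present.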
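c. Replace httpd.conf (on 10.97)
On 10.97, rename /usr/IBM/HTTPServer/conf/httpd.conf_probackup to httpd.conf, replacing the existing httpd.conf
d. Replace plugin-cfg.xml (on 10.97)
On 10.97, rename /usr/IBM/HTTPServer/Plugins/config/test-v-web01/plugin-cfg.xml_probackup to plugin-cfg.xml, replacing the existing plugin-cfg.xml
e. If IHS serves applications that use a file system, i.e. there are storage directories that must be shared, mount those resources from a shared disk that both machines can access
2. Install HACMP for AIX (give IBM the AIX configuration so they can confirm the HACMP version; matching the system version is recommended)
Check for duplicate user IDs: lsuser -a id ALL
Check for file system conflicts: df -k
a. Prerequisites (verify with lslpp -l "rsct.*" and so on): - OK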
rsct.*
bos.adt.lib
bos.adt.libm
bos.adt.syscalls
bos.net.tcp.client
bos.net.tcp.server
bos.rte.SRC
bos.rte.libc
bos.rte.libcfg
bos.rte.libcur
bos.rte.libpthreads
bos.rte.odm
bos.data
bos.rte.lvm.rte
bos.clvm.enh
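Checking the prerequisite list one fileset at a time is easy to get wrong by hand. The sketch below loops over the names and prints the ones that `lslpp -l` does not report; it assumes `lslpp -l` returns a non-zero status for a fileset that is not installed, and `missing_filesets` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print each prerequisite fileset that lslpp does not report.
# Assumes "lslpp -l NAME" exits non-zero when NAME is not installed.
# missing_filesets is a hypothetical helper name.
# usage: missing_filesets FILESET...
missing_filesets() {
    for fileset in "$@"; do
        if ! lslpp -l "$fileset" >/dev/null 2>&1; then
            echo "$fileset"
        fi
    done
}
```

A call such as `missing_filesets 'rsct.*' bos.adt.lib bos.adt.libm bos.rte.odm bos.clvm.enh` would be run on each node before installing; empty output means nothing from the list is missing.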
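b. Installation - from CD (see IBM's steps; HACMP must be installed on all nodes, i.e. both IHS machines). Details omitted. (The installer reports failed at the end; patches must then be applied.)
* Insert the CD and run smitty install_latest --- Install Software
Note in particular: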
INPUT device / directory for software /dev/cd0
ACCEPT new license agreements? YES
When the installation finishes it reports failed; check that everything is installed except the following packages:
Version 6
cluster.doc.en_US.pprc.pdf
cluster.es.cgpprc.rte
cluster.es.pprc.cmds
cluster.es.spprc.*
cluster.es.sr.*
cluster.es.svcpprc.*
cluster.xd.*
glvm.rpv.*
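Version 5
cluster.hativoli
cluster.haview
netview
* Apply patches: smitty install_latest, install everything
When the installation finishes it reports failed; check that everything is installed except the following packages: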
glvm.rpv.*
cluster.xd.glvm
cluster.es.tc.*
cluster.es.svcpprc.*
cluster.es.sr.rte.*
cluster.es.spprc.*
cluster.es.pprc.*
cluster.es.genxd.*
cluster.es.cgpprc.*
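Reference: http://www.ibm.com/developerworks/cn/aix/library/0804_xinmin_hacmp/1.html
or
http://www.docin.com/p-483562102.html
Note: installation steps found online differ from one another; follow whatever applies to the actual environment
* Online patches: download the latest patch for your version from the IBM website. For version 6.1, for example: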
www14.software.ibm.com/webapp/set2/sas/f/hacmp/download/aix61.html (Latest service pack)
(
* Install HACMP Smart Assist (confirm with IBM whether it ships on the HACMP CD; optional)
Insert the CD and run: smit install_all --- Install and Update from ALL Available Software
Fill in INPUT device/directory for software and press Enter
For details see: http://publib.boulder.ibm.com/infocenter/aix/v7r1/topic/com.ibm.aix.hacmp.websphere/ha_ws_install_cdrom.htm
)
* Verify the installation:
Check the inittab entry: egrep -i "hacmp" /etc/inittab
Output: hacmp:2:once:/usr/es/sbin/cluster/etc/rc.init >/dev/console 2>&1
Check the installed filesets and patches, mainly: lslpp -l "cluster.*"
Confirm clcomdES is running: lssrc -s clcomdES
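The checks above can be wrapped so that a script, rather than a person reading output, decides pass or fail. This is a minimal sketch under two assumptions: that `lssrc -s` prints a header line followed by a line whose last column is the status, and that the inittab entry starts with `hacmp:` as in the output shown above. Both helper names are hypothetical.

```shell
#!/bin/sh
# Sketch: succeed only if lssrc reports the named subsystem as active.
# Assumes the status is the last column of the line after the header.
# subsystem_active is a hypothetical helper name.
subsystem_active() {
    lssrc -s "$1" 2>/dev/null |
        awk 'NR > 1 && $NF == "active" { ok = 1 } END { exit !ok }'
}

# Sketch: succeed only if the inittab file has an entry labelled "hacmp".
# The file argument is optional and defaults to /etc/inittab.
# inittab_has_hacmp is a hypothetical helper name.
inittab_has_hacmp() {
    grep -q '^hacmp:' "${1:-/etc/inittab}"
}
```

Typical use on each node would be `inittab_has_hacmp && subsystem_active clcomdES || echo "HACMP install check failed"`.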
c. Check hosts and edit .rhosts (/usr/es/sbin/cluster/etc/rhosts) (important <prerequisite>):
test-v-web01
hostweb01_boot1
hostweb01_boot2
hostweb01_svc_media
hostweb01_svc_www
hostweb01_svc_shop
test-v-web02
hostweb02_boot1
hostweb02_boot2
hostweb02_svc_media
hostweb02_svc_www
hostweb02_svc_shop
Set permissions: chmod 644 /.rhosts
#
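The service IP is the address that the application's clients connect to. When the host goes down, this IP moves from that host to standby machine B, which takes it over; name resolution for this relies on the hosts file.
Check the network: netstat -i
d. Configure HACMP (Note: do the configuration on the primary IHS, i.e. Machine A LPAR2; when adding resource groups, add the Machine A LPAR2 resource group first)
Configuration references: http://www.docin.com/p-483562102.html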
http://blog.vsharing.com/zwj/A1089308.html
d.10 Write the start and stop scripts
media:
www:
#start www: wwwStart
cd /usr/local/apache2/bin
./apachectl -k start
#stop www: wwwStop
cd /usr/local/apache2/bin
./apachectl -k stop
shop:
#start IHS: shopStart
cd /usr/IBM/HTTPServer/bin/
./apachectl start
#stop IHS: shopStop
cd /usr/IBM/HTTPServer/bin/
./apachectl stop
Set permissions: chmod 755 shopStart
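The start and stop scripts above do not check whether apachectl exists or report whether it succeeded. One way to make that visible is a small shared helper, sketched below. The name `ihs_ctl` and the `IHS_BIN` override are hypothetical; the default path is the one used in shopStart above.

```shell
#!/bin/sh
# Sketch of a shared helper for shopStart / shopStop.
# IHS_BIN defaults to the directory used above; ihs_ctl is a hypothetical name.
IHS_BIN=${IHS_BIN:-/usr/IBM/HTTPServer/bin}

# usage: ihs_ctl start|stop
ihs_ctl() {
    action=$1
    if [ ! -x "$IHS_BIN/apachectl" ]; then
        echo "apachectl not found in $IHS_BIN" >&2
        return 1
    fi
    # The function's return status is apachectl's exit status.
    "$IHS_BIN/apachectl" "$action"
}
```

With this helper, shopStart would end with `ihs_ctl start` and shopStop with `ihs_ctl stop`, so a failed start or stop shows up as a non-zero exit status of the script.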
d.11 Configure the tty network heartbeat
smitty tty->Change / add a TTY->rs232->sa->port number : 0
Verify:
host1: cat /etc/hosts > /dev/tty0
host2: cat < /dev/tty0
d.12 Create the cluster
smitty hacmp->Extended Configuration->Extended Topology Configuration->Configure an HACMP Cluster->Add/Change/Show an HACMP Cluster (web_cluster)
d.13 Add nodes
smitty hacmp-> Extended Configuration->Extended Topology Configuration->Configure HACMP Nodes->Add a Node to the HACMP Cluster
Node Name : enter manually; this is the machine's hostname: test-v-web01 / test-v-web02
Communication Path to Node : press F4 and select the host's boot address hostweb01_boot1 / hostweb01_boot2
Add the second node the same way
d.14 Create the IP networks and interfaces
smitty hacmp-> Extended Configuration-> Extended Topology Configuration->Configure HACMP Networks->Add a Network to the HACMP Cluster->ether
Network Name : net_ether_01
Network Type : ether
Netmask.. : 255.255.255.0
Enable IP Address: YES
IP Add.. :
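Enable IP Address Takeover via IP Aliases: this option determines how HACMP moves the IP. Note that IP Aliases must be chosen only when the three IPs "boot1/boot", "boot2/standby", and "svc/service" are on three different subnets.
If either "boot1/boot" or "boot2/standby" is on the same subnet as "svc/service", IP Replace must be used and this option should be set to "NO"
Create the net_ether_02 network the same way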
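The selection rule stated above can be written down as a small check. This is only a sketch of the rule as this note words it, assuming the 255.255.255.0 netmask used above so that the subnet is the first three octets; `subnet24` and `ipat_mode` are hypothetical helper names, and the case where boot1 and boot2 share a subnet is reported as undetermined because the rule above does not cover it.

```shell
#!/bin/sh
# Sketch of the takeover-mode rule described in this note.
# Assumes a 255.255.255.0 netmask: the subnet is the first three octets.
# subnet24 and ipat_mode are hypothetical helper names.
subnet24() {
    echo "${1%.*}"
}

# usage: ipat_mode BOOT1_IP BOOT2_IP SERVICE_IP
ipat_mode() {
    b1=$(subnet24 "$1")
    b2=$(subnet24 "$2")
    svc=$(subnet24 "$3")
    if [ "$svc" = "$b1" ] || [ "$svc" = "$b2" ]; then
        echo "replace"        # service shares a subnet with a boot address
    elif [ "$b1" != "$b2" ]; then
        echo "alias"          # three distinct subnets
    else
        echo "undetermined"   # boot1 and boot2 share a subnet
    fi
}
```

The addresses passed in must be fully numeric; the placeholder forms 192.168.10.A1 and 192.168.10.B1 used in this note need to be replaced with the real boot addresses first.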
d.15 Add the boot-address network interfaces to these networks
smitty hacmp-> Extended Configuration-> Extended Topology Configuration->Configure HACMP Communication Interfaces/Devices->Add Communication Interfaces/Devices->Add Pre-defined Communication Interfaces and Devices->Communication Interfaces
Select the net_ether_01 network created earlier and add the boot addresses
IP Label/Address : [hostweb01_boot1]
Node Name : [test-v-web01]
------
IP Label/Address : [hostweb01_boot2]
Node Name : [test-v-web01]
Add the other boot addresses the same way
d.15 Add the heartbeat network
smitty hacmp-> Extended Configuration-> Extended Topology Configuration->Configure HACMP Networks->Add a Network to the HACMP Cluster->rs232
Network Name e.g. net_rs232_01
Network Type rs232
Add the heartbeat device interface
smitty hacmp-> Extended Configuration-> Extended Topology Configuration->Configure HACMP Communication Interfaces/Devices->Add Communication Interfaces/Devices->Add Pre-defined Communication Interfaces and Devices-> Communication Devices
Select the net_rs232_01 network created earlier
Device Name : [tty0]
Device Path : [/dev/tty0]
Node Name : [test-v-web01] / [test-v-web02]
d.16 View and verify the topology
smit hacmp->Extended Configuration->Extended Topology Configuration->Show HACMP Topology->Show Cluster Topology
Cluster Name: web_cluster
Cluster Connection Authentication Mode: Standard
Cluster Message Authentication Mode: None
Cluster Message Encryption: None
Use Persistent Labels for Communication: No
NODE host1:
Network net_ether_01
hostweb01_boot1 192.168.10.A1
Network net_ether_02
hostweb01_boot2 192.168.10.92
Network net_rs232_01
NODE host2:
Network net_ether_01
hostweb02_boot1 192.168.10.B1
Network net_ether_02
hostweb02_boot2 192.168.10.97
Network net_rs232_01
d.17 Add application servers
smitty hacmp ->Extended Configuration->Extended Resource Configuration->HACMP Extended Resources Configuration->Configure HACMP Applications->Configure HACMP Application Servers->Add an Application Server
------
* Server Name [test-v-web01_app_ihs]
* Start Script [/usr/sbin/cluster/app/shopStart]
* Stop Script [/usr/sbin/cluster/app/shopStop]
------
* Server Name [test-v-web01_app_apache]
* Start Script [/usr/sbin/cluster/app/wwwStart]
* Stop Script [/usr/sbin/cluster/app/wwwStop]
------
* Server Name [test-v-web02_app_ihs]
* Start Script [/usr/sbin/cluster/app/shopStart]
* Stop Script [/usr/sbin/cluster/app/shopStop]
------
* Server Name [test-v-web02_app_apache]
* Start Script [/usr/sbin/cluster/app/wwwStart]
* Stop Script [/usr/sbin/cluster/app/wwwStop]
......
d.18 Add service IPs
smitty hacmp ->Extended Configuration->Extended Resource Configuration->HACMP Extended Resources Configuration->Configure HACMP Service IP Labels/Addresses->Add a Service IP Label/Address->Configurable on Multiple Nodes
Select net_ether_01
* IP Label/Address hostweb01_svc_shop
* Network Name net_ether_01
Alternate HW Address to accompany IP Label/Address []
Add the other service IP addresses the same way
d.19 Create resource groups
smitty hacmp->Extended Configuration-> Extended Resource Configuration->HACMP Extended Resource Group Configuration-> Add a Resource Group
------
* Resource Group Name [test-v-web01_RG]
* Participating Nodes (Default Node Priority) [test-v-web01 test-v-web02]
-----
* Resource Group Name [test-v-web02_RG]
* Participating Nodes (Default Node Priority) [test-v-web02 test-v-web01]
d.20 Configure the resource groups
smitty hacmp->Extended Configuration->Extended Resource Configuration->HACMP Extended Resource Group Configuration->Change/Show Resources and Attributes for a Resource Group
Select test-v-web01_RG
Service IP Labels/Addresses [hostweb01_svc_shop hostweb01_svc_www hostweb01_svc_media]
Application Servers [test-v-web01_app_ihs test-v-web01_app_apache ...]
Volume Groups []
Use forced varyon of volume groups, if necessary false
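Configure test-v-web02_RG the same way
d.21 Verify and synchronize the HACMP configuration
Note: all of the configuration above is done on test-v-web01. Synchronize at least twice; the first pass is a forced synchronization to test-v-web02
smitty hacmp ->Extended Configuration->Extended Verification and Synchronization
* First (forced) synchronization: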
Automatically correct errors found during verification? [Yes]
Force synchronization if verification fails? [Yes]
* Second synchronization:
Automatically correct errors found during verification? [Yes]
Force synchronization if verification fails? [No]
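Note: continue only if the result here is OK; otherwise use the error messages to find and fix the cause, as described in the later troubleshooting section
d.22 Change the data flush frequency of the syncd daemon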
smitty hacmp -> HACMP Extended Configuration-> Extended Performance Tuning Parameters Configuration-> Change/Show syncd frequency
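Set it to 10
d.23 Configure clinfo
Note: on a two-node cluster, cluster monitoring tools such as clstat depend on the clinfoES service, which must be running on every node.
1) On each machine, edit /usr/es/sbin/cluster/etc/clhosts and confirm that it reads: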
127.0.0.1 loopback localhost
192.168.10.A1 hostweb01_boot1
192.168.10.92 hostweb01_boot2 test-v-web01
192.168.10.89 hostweb01_svc_media
192.168.10.90 hostweb01_svc_www
192.168.10.91 hostweb01_svc_shop
192.168.10.B1 hostweb02_boot1
192.168.10.97 hostweb02_boot2 test-v-web02
192.168.10.98 hostweb02_svc_media
192.168.10.99 hostweb02_svc_www
192.168.10.100 hostweb02_svc_shop
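* Switch SNMP v3 to SNMP v1: /usr/sbin/snmpv3_ssw -1
* Change and start clinfoES:
chssys -s clinfoES -a "-a"
startsrc -s clinfoES
/usr/es/sbin/cluster/clstat should then run without errors.
* Note: do not skip this step. clinfo must be running normally once it is set up; otherwise the later cluster status checks with cldump and clstat will all fail, and the cluster state cannot be checked or monitored
If Smart Assist is installed, it can be configured as follows
d.31 Configure HACMP Smart Assist
Steps: http://publib.boulder.ibm.com/infocenter/aix/v7r1/topic/com.ibm.aix.hacmp.websphere/ha_ws_config_cluster_resource.htm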
d.32 Configuring the HACMP cluster and nodes(SA)
Initialization and Standard Configuration > Configure an HACMP Cluster and Nodes
d.33 Discovering and configuring WebSphere components(SA)
Initialization and Standard Configuration > Configuration Assistants > Make Applications Highly Available (use Smart Assists) > Add an Application to the HACMP™ Configuration
WebSphere® Smart Assistant > IBM HTTP Server > Add an IBM HTTP Server to the Cluster
Steps: http://publib.boulder.ibm.com/infocenter/aix/v7r1/topic/com.ibm.aix.hacmp.websphere/ha_ws_add_http_server.htm
Reference:
Resources: http://publib.boulder.ibm.com/infocenter/aix/v7r1/index.jsp?topic=%2Fcom.ibm.aix.hacmp.websphere%2Fha_ws_http_server_hacmp.htm
Resource group
HACMP application server
HACMP custom monitor
Start script
Stop script
Monitoring: http://publib.boulder.ibm.com/infocenter/aix/v7r1/topic/com.ibm.aix.hacmp.websphere/ha_ws_set_app_mon.htm
Appendix:
Create the cluster: use the following path to open the add-cluster screen, then enter the cluster name.
smitty hacmp->Extended Configuration->Extended Topology Configuration ->Configure an HACMP Cluster->Add/Change/Show an HACMP Cluster
Add nodes: use the following path to open the add-node screen, then enter the node name and the node's communication interface (use the boot IP mentioned above).
smitty hacmp->Extended Configuration->Extended Topology Configuration->Configure HACMP Nodes->Add a Node to the HACMP Cluster
Collect HACMP-related information on both nodes (optional): use the following path to run cluster discovery
smitty hacmp->Extended Configuration->Discover HACMP-related Information from Configured Nodes
Add networks
smitty hacmp->Extended Configuration->Extended Topology Configuration->Configure HACMP Networks->Add a Network to the HACMP Cluster
Add communication interfaces
smitty hacmp->Extended Configuration->Extended Topology Configuration->Configure HACMP Communication Interfaces/Devices->Add Communication Interfaces/Devices->Add Discovered Communication Interface and Devices->Communication Interfaces->ALL - Select Point-to-Point Pair of Discovered Communication Devices to Add
Add communication devices (used for the serial heartbeat and the disk heartbeat)
smitty hacmp->Extended Configuration->Extended Topology Configuration->Configure HACMP Communication Interfaces/Devices->Add Communication Interfaces/Devices->Add Discovered Communication Interface and Devices->Communication Devices
Add application servers: enter the application name in Server Name, and the paths of the application's start and stop scripts in Start Script and Stop Script.
smitty hacmp->Extended Configuration->Extended Resource Configuration->HACMP Extended Resources Configuration->Configure HACMP Applications->Configure HACMP Application Servers->Add an Application Server
Add service IPs
smitty hacmp->Extended Configuration->Extended Resource Configuration->HACMP Extended Resources Configuration->Configure HACMP Service IP Labels/Addresses->Add a Service IP Label/Address->Configurable on Multiple Nodes
Add resource groups
smitty hacmp->Extended Configuration->Extended Resource Configuration->HACMP Extended Resource Group Configuration->Add a Resource Group
Change resource group attributes
smitty hacmp->Extended Configuration->Extended Resource Configuration->HACMP Extended Resource Group Configuration->Change/Show Resources and Attributes for a Resource Group
Verify the configuration and synchronize it with the other nodes in the cluster
smitty hacmp->Extended Configuration->Extended Verification and Synchronization