AIX 5.3 + HACMP 5.3 + Oracle 10g RAC: Installation Error Notes
Environment: AIX 5312 + HACMP 5.3 + Oracle 10.2.0.4
1. Problem 1: While installing CRS, running root.sh on the first node hung at
Startup will be queued to init within 30 seconds.
The root.sh run looked like this:
root@ykcs1:[/oracle/app/crs]#sh root.sh
WARNING: directory '/oracle/app' is not owned by root
WARNING: directory '/oracle' is not owned by root
Checking to see if Oracle CRS stack is already configured
Checking to see if any 9i GSD is up
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/oracle/app' is not owned by root
WARNING: directory '/oracle' is not owned by root
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: ykcs1 ykcs1-pri ykcs1
node 2: ykcs2 ykcs2-pri ykcs2
Creating OCR keys for user 'root', privgrp 'system'..
Operation successful.
Now formatting voting device: /dev/rzvote_512m
Format of 1 voting devices complete.
Startup will be queued to init within 30 seconds.
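Investigation showed the cause: the hosts file had been modified while configuring the CRS IPs, and the host had not been rebooted afterwards.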
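Before re-running root.sh, it can help to confirm that every name in the node list above actually appears in the hosts file. The following is a minimal sketch, not the author's procedure: the function name missing_hosts is hypothetical, and it only checks that each name is present as a host field. It does not replace the reboot that was needed here.

```shell
#!/bin/sh
# Sketch: print every given host name that has no entry in a hosts file.
# Usage (names taken from the root.sh node list above):
#   missing_hosts /etc/hosts ykcs1 ykcs1-pri ykcs2 ykcs2-pri
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Drop comments, then look for the name as a whole field after the address.
        if ! sed 's/#.*//' "$hosts_file" |
            awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                              END { exit !found }'
        then
            echo "$name"
        fi
    done
}
```

If the function prints nothing, all names were found. Any name it prints is missing or commented out.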
2. Problem 2: The CRS installation hung at
Expecting the CRS daemons to be up within 600 seconds.
After 10 minutes it failed with the following error:
Failure at final check of Oracle CRS stack.
10
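This was my first time working with 10g RAC, so I was not very good at troubleshooting it. I tried many fixes found online, such as checking permissions, checking the shared disks, and cleaning out /var/tmp/.oracle, and none of them helped. Finally, with a colleague's guidance, I checked the processes and found that the CSSD process had failed to start. The background log was as follows: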
oracle@ykcs1_/oracle/app/crs/log/ykcs1/cssd$ more ocssd.log
Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.
[CSSD]2013-08-05 10:06:06.826 >USER: Oracle Database 10g CSS Release 10.2.0.1.0 Production Copyright 1996, 2004 Oracle. All rights reserved.
[CSSD]2013-08-05 10:06:06.826 >USER: CSS daemon log for node ykcs1, number 1, in cluster crs
[ clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=ykcs1DBG_CSSD))
[CSSD]2013-08-05 10:06:06.915 [1] >TRACE:clssscmain: local-only set to false
[CSSD]2013-08-05 10:06:07.063 [1] >TRACE:clssnmReadNodeInfo: added node 1 (ykcs1) to cluster
[CSSD]2013-08-05 10:06:07.075 [1] >TRACE:clssnmReadNodeInfo: added node 2 (ykcs2) to cluster
[CSSD]2013-08-05 10:06:07.173 [1029] >TRACE:clssnm_skgxninit: initialized skgxn version (2/0/IBM AIX skgxn)
[CSSD]2013-08-05 10:06:07.358 [1029] >ERROR:clssnm_skgxnmon: Failure 0 registering.(1/1 [HA_GS_NOT_OK]/sskgxn_gs_in)
[CSSD]2013-08-05 10:06:07.360 [1] >TRACE:clssnmInitNMInfo: misscount set to 600
[CSSD]2013-08-05 10:06:07.365 [1] >TRACE:clssnmDiskStateChange: state from 1 to 2 disk (0//dev/rzvote_512m)
[CSSD]2013-08-05 10:06:09.389 [1030] >TRACE:clssnmDiskStateChange: state from 2 to 4 disk (0//dev/rzvote_512m)
[CSSD]2013-08-05 10:06:09.472 [1] >TRACE:clssscSclsFatal: read value of disable
[CSSD]2013-08-05 10:06:09.472 [1544] >TRACE:clssnmFatalThread: spawned
[CSSD]2013-08-05 10:06:09.472 [1] >TRACE:clssscSclsFatal: read value of disable
[CSSD]2013-08-05 10:06:09.472 [1801] >TRACE:clssnmconnect: connecting to node 1, flags 0x0001, connector 1
[CSSD]2013-08-05 10:06:09.528 [2058] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_1))
[CSSD]2013-08-05 10:06:09.528 [2058] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_ykcs1_crs))
[CSSD]2013-08-05 10:06:09.549 [3600] >TRACE:clssnmRcfgMgrThread: Connection complete
[CSSD]2013-08-05 10:06:09.549 [3086] >TRACE:clssnmPollingThread: Connection complete
[CSSD]2013-08-05 10:06:09.549 [3343] >TRACE:clssnmSendingThread: Connection complete
[CSSD]2013-08-05 10:06:09.549 [3600] >TRACE:clssnmRcfgMgrThread: Local Join
[CSSD]2013-08-05 10:06:09.549 [3600] >TRACE:clssnmDoSyncUpdate: Initiating sync 1
[CSSD]2013-08-05 10:06:09.549 [3600] >TRACE:clssnmSetupAckWait: Ack message type (11)
[CSSD]2013-08-05 10:06:09.549 [3600] >TRACE:clssnmSetupAckWait: node(1) is ALIVE
[CSSD]2013-08-05 10:06:09.549 [3600] >TRACE:clssnmSendSync: syncSeqNo(1)
[CSSD]2013-08-05 10:06:09.549 [1801] >TRACE:clssnmHandleSync: Acknowledging sync: src[1] srcName[ykcs1] seq[1] sync[1]
[CSSD]2013-08-05 10:06:09.550 [3600] >TRACE:clssnmWaitForAcks: Ack message type(11), ackCount(1)
[CSSD]2013-08-05 10:06:09.649 [1] >USER: NMEVENT_SUSPEND [00][00][00][00]
[CSSD]2013-08-05 10:06:10.550 [3600] >TRACE:clssnmWaitForAcks: done, msg type(11)
[CSSD]2013-08-05 10:06:10.550 [3600] >TRACE:clssnmSetupAckWait: Ack message type (13)
[CSSD]2013-08-05 10:06:10.550 [3600] >TRACE:clssnmSetupAckWait: node(1) is ACTIVE
[CSSD]2013-08-05 10:06:10.550 [3600] >TRACE:clssnmSendVote: syncSeqNo(1)
[CSSD]2013-08-05 10:06:10.550 [3600] >TRACE:clssnmWaitForAcks: Ack message type(13), ackCount(1)
[CSSD]2013-08-05 10:06:10.550 [1801] >TRACE:clssnmSendVoteInfo: node(1) syncSeqNo(1)
[CSSD]2013-08-05 10:06:11.550 [3600] >TRACE:clssnmWaitForAcks: done, msg type(13)
[CSSD]2013-08-05 10:06:11.550 [3600] >TRACE:clssnmCheckDskInfo: Checking disk info...
[CSSD]2013-08-05 10:06:12.550 [3600] >ERROR:clssnmCheckDskInfo: We appear to be dead skgxn 0
[CSSD]2013-08-05 10:06:12.550 [3600] >ERROR:clssnmDoSyncUpdate: checkDskInfo signaled shutdown
[CSSD]2013-08-05 10:06:12.550 [3600] >TRACE:clssscctx: dump of 0x11000ddf0, len 3752
[CSSD]2013-08-05 10:06:12.550 [3600] >TRACE: 0x11000ddf0 00 00 00 01 10 9a 08 b0 - 00 00 00 01 10 95 6e 50 ..............nP
[CSSD]2013-08-05 10:06:12.550 [3600] >TRACE: 0x11000de00 00 00 00 00 00 00 00 00 - 00 00 00 01 10 00 d9 d0 ................
[CSSD]2013-08-05 10:06:12.550 [3600] >TRACE: 0x11000de10 00 00 00 01 10 01 74 f0 - 00 00 00 01 10 00 ec b0 ......t.........
[CSSD]2013-08-05 10:06:12.550 [3600] >TRACE: 0x11000de20 00 00 00 70 00 00 00 00 - 00 00 00 01 10 00 dd f0 ...p............
[CSSD]2013-08-05 10:06:12.550 [3600] >TRACE: 0x11000de30 00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00 ................
ocssd.log (0%)
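This identified the problem as an HA (HACMP) issue. After searching for HA_GS_NOT_OK, I determined that the oracle user had not been added to the hagsuser group.
After adding oracle to the hagsuser group, CRS was created normally.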
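The two checks in this diagnosis can be sketched as small shell functions: look for HA_GS_NOT_OK in the CSSD log, then check whether the user is in the group. The function names are hypothetical, and the sketch assumes a POSIX id command. How the membership is changed (SMIT or a command-line tool) is deliberately left out.

```shell
#!/bin/sh
# Sketch: return 0 if the CSSD log contains the group services
# registration failure seen in the log above.
# Usage: has_gs_failure /oracle/app/crs/log/ykcs1/cssd/ocssd.log
has_gs_failure() {
    grep -q 'HA_GS_NOT_OK' "$1"
}

# Sketch: return 0 if USER is a member of GROUP according to `id -Gn`.
# Usage: user_in_group oracle hagsuser
user_in_group() {
    id -Gn "$1" 2>/dev/null | tr ' ' '\n' | grep -qx "$2"
}
```

For example, `has_gs_failure "$log" && ! user_in_group oracle hagsuser` would flag the situation described here.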
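3. Problem 3: After configuring the VIPs and installing the database software, I configured the listeners, but the listener service on node 2 would not start. Checking with netstat -in showed that both VIPs were on node 1.
The cause: when the operating system was installed, node 2 was configured with adapters en1 and en2 because its en0 was unusable, whereas node 1 was configured with en0 and en1. Once the VIPs were configured, node 2's VIP failed over to node 1 because en0 was unusable on node 2.
In the end, the only option was to reconfigure the network adapters of the AIX virtual machine. In future, make sure the IPs on both machines map to the same adapter names.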
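The adapter-name mismatch can be caught before configuring the VIPs by comparing the interface names on the two nodes. This is a minimal sketch under stated assumptions: the output of netstat -in from each node has been saved to a file, the first line is a header, and the first column is the interface name. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch: print the sorted, unique interface names from saved `netstat -in` output.
# Assumes line 1 is a header and column 1 holds the interface name.
iface_names() {
    # Some netstat implementations mark a down interface with a trailing '*'.
    awk 'NR > 1 && $1 != "" { sub(/\*$/, "", $1); print $1 }' "$1" | sort -u
}

# Sketch: return 0 if both saved outputs list the same interface names.
# Usage: same_ifaces node1_netstat.txt node2_netstat.txt
same_ifaces() {
    [ "$(iface_names "$1")" = "$(iface_names "$2")" ]
}
```

In the situation described above, node 1 would list en0 and en1 while node 2 would list en1 and en2, so same_ifaces would return non-zero.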
From the "ITPUB Blog", link: http://blog.itpub.net/26739940/viewspace-767746/. If you repost this, please credit the source; otherwise legal liability will be pursued.