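Environment: 10.2.0.3 RAC on AIX 5.3.
We have noticed that one of the nodes reboots from time to time. The ocssd.log from that node is below.
I can't tell what is causing it. Any help would be appreciated.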
[ CSSD]2014-09-15 13:29:01.766 [4889] >TRACE: clssgmMasterCMSync: Synchronizing group/lock status
[ CSSD]2014-09-15 13:29:01.772 [4889] >TRACE: clssgmMasterSendDBDone: group/lock status synchronization complete
[ CSSD]CLSS-3000: reconfiguration successful, incarnation 8 with 2 nodes
[ CSSD]CLSS-3001: local node number 2, master node number 2
[ CSSD]2014-09-15 13:29:01.772 [4889] >TRACE: clssgmReconfigThread: completed for reconfig(8), with status(1)
[ CSSD]2014-10-05 10:55:30.375 >USER: Copyright 2014, Oracle version 10.2.0.4.0
[ CSSD]2014-10-05 10:55:30.375 >USER: CSS daemon log for node sbtdb2, number 2, in cluster crs
[ CSSD]2014-10-05 10:55:30.399 [1] >TRACE: clssscmain: local-only set to false
[ clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=sbtdb2DBG_CSSD))
[ CSSD]2014-10-05 10:55:30.430 [1] >TRACE: clssnmReadNodeInfo: added node 1 (sbtdb1) to cluster
[ CSSD]2014-10-05 10:55:30.451 [1] >TRACE: clssnmReadNodeInfo: added node 2 (sbtdb2) to cluster
[ CSSD]2014-10-05 10:55:30.464 [1029] >TRACE: clssnm_skgxninit: Compatible vendor clusterware not in use
[ CSSD]2014-10-05 10:55:30.464 [1029] >TRACE: clssnm_skgxnmon: skgxn init failed
[ CSSD]2014-10-05 10:55:30.470 [1] >TRACE: clssnmNMInitialize: misscount set to (30)
[ CSSD]2014-10-05 10:55:30.473 [1] >TRACE: clssnmNMInitialize: Network heartbeat thresholds are: impending reconfig 15000 ms, reconfig start (misscount) 30000 ms
[ CSSD]2014-10-05 10:55:30.484 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/rhdiskpower2)
[ CSSD]2014-10-05 10:55:30.487 [1030] >TRACE: clssnmvDPT: spawned for disk 0 (/dev/rhdiskpower2)
[ CSSD]2014-10-05 10:55:30.493 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (1//dev/rhdiskpower3)
[ CSSD]2014-10-05 10:55:30.493 [1287] >TRACE: clssnmvDPT: spawned for disk 1 (/dev/rhdiskpower3)
[ CSSD]2014-10-05 10:55:30.498 [1] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (2//dev/rhdiskpower4)
[ CSSD]2014-10-05 10:55:30.501 [1544] >TRACE: clssnmvDPT: spawned for disk 2 (/dev/rhdiskpower4)
[ CSSD]2014-10-05 10:55:32.598 [1287] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (1//dev/rhdiskpower3)
[ CSSD]2014-10-05 10:55:32.598 [1030] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/rhdiskpower2)
[ CSSD]2014-10-05 10:55:32.601 [1801] >TRACE: clssnmvKillBlockThread: spawned for disk 1 (/dev/rhdiskpower3) initial sleep interval (1000)ms
[ CSSD]2014-10-05 10:55:32.601 [1544] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (2//dev/rhdiskpower4)
[ CSSD]2014-10-05 10:55:32.604 [2058] >TRACE: clssnmvKillBlockThread: spawned for disk 0 (/dev/rhdiskpower2) initial sleep interval (1000)ms
[ CSSD]2014-10-05 10:55:32.607 [1287] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(10) wrtcnt(1706838) LATS(565400123) Disk lastSeqNo(1706838)
[ CSSD]2014-10-05 10:55:32.607 [2315] >TRACE: clssnmvKillBlockThread: spawned for disk 2 (/dev/rhdiskpower4) initial sleep interval (1000)ms
[ CSSD]2014-10-05 10:55:32.610 [1] >TRACE: clssnmFatalInit: fatal mode enabled
[ CSSD]2014-10-05 10:55:32.610 [1030] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(10) wrtcnt(1706838) LATS(565400126) Disk lastSeqNo(1706838)
[ CSSD]2014-10-05 10:55:32.696 [2829] >TRACE: clssnmClusterListener: Listening on (ADDRESS=(PROTOCOL=tcp)(HOST=sbtdb2-priv)(PORT=49895))
[ CSSD]2014-10-05 10:55:32.700 [2829] >TRACE: clssnmconnect: connecting to node(1), con(1112de510), flags 0x0003
[ CSSD]2014-10-05 10:55:32.744 [2829] >TRACE: clssnmConnComplete: MSGSRC 1, type 6, node 1, flags 0x0003, con 1112de510, probe 0
[ CSSD]2014-10-05 10:55:32.744 [2829] >TRACE: clssnmConnComplete: node 1, sbtdb1, con(1112de510), probcon(0), ninfcon(1112de510), node unique 1410758938, prev unique 0, msg unique 1410758938 node state 0
[ CSSD]2014-10-05 10:55:32.744 [2829] >TRACE: clssnmConnComplete: connected to node 1 (con 1112de510), ninfcon (1112de510), state (0), flag (1039)
[ CSSD]2014-10-05 10:55:32.767 [3086] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=Oracle_CSS_LclLstnr_crs_2))
[ CSSD]2014-10-05 10:55:32.767 [3086] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_sbtdb2_crs))
[ CSSD]2014-10-05 10:55:32.776 [3857] >TRACE: clssgmPeerListener: Listening on (ADDRESS=(PROTOCOL=tcp)(DEV=25)(HOST=199.169.10.18)(PORT=32791))
[ CSSD]2014-10-05 10:55:32.860 [2829] >TRACE: clssnmHandleSync: diskTimeout set to (27000)ms
[ CSSD]2014-10-05 10:55:32.860 [2829] >TRACE: clssnmHandleSync: Acknowledging sync: src[1] srcName[sbtdb1] seq[5] sync[10]
[ CSSD]2014-10-05 10:55:32.860 [4628] >TRACE: clssnmRcfgMgrThread: initial lastleader(1) unique(1410758938)
[ CSSD]2014-10-05 10:55:32.860 [2829] >TRACE: clssnmSendVoteInfo: node(1) syncSeqNo(10)
[ CSSD]2014-10-05 10:55:32.861 [2829] >TRACE: clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
[ CSSD]2014-10-05 10:55:32.861 [2829] >TRACE: clssnmUpdateNodeState: node 1, state (4/3) unique (1410758938/1410758938) prevConuni(0) birth (0/8) (old/new)
[ CSSD]2014-10-05 10:55:32.861 [2829] >TRACE: clssnmUpdateNodeState: node 2, state (1/3) unique (1412477730/1412477730) prevConuni(0) birth (0/10) (old/new)
[ CSSD]2014-10-05 10:55:32.861 [2829] >USER: clssnmHandleUpdate: SYNC(10) from node(1) completed
[ CSSD]2014-10-05 10:55:32.861 [2829] >USER: clssnmHandleUpdate: NODE 1 (sbtdb1) IS ACTIVE MEMBER OF CLUSTER
[ CSSD]2014-10-05 10:55:32.861 [2829] >USER: clssnmHandleUpdate: NODE 2 (sbtdb2) IS ACTIVE MEMBER OF CLUSTER
[ CSSD]2014-10-05 10:55:32.861 [2829] >TRACE: clssnmHandleUpdate: diskTimeout set to (200000)ms
[ CSSD]2014-10-05 10:55:32.885 [1] >USER: NMEVENT_SUSPEND [00][00][00][00]
[ CSSD]2014-10-05 10:55:32.888 [4885] >TRACE: clssgmReconfigThread: started for reconfig (10)
[ CSSD]2014-10-05 10:55:32.888 [4885] >USER: NMEVENT_RECONFIG [00][00][00][06]
[ CSSD]2014-10-05 10:55:32.888 [4885] >TRACE: clssgmEstablishConnections: 2 nodes in cluster incarn 10
[ CSSD]2014-10-05 10:55:32.889 [3857] >TRACE: clssgmInitialRecv: (111f108f0) accepted a new connection from node 1 born at 8 active (2, 2), vers (10,3,1,2)
[ CSSD]2014-10-05 10:55:32.889 [3857] >TRACE: clssgmInitialRecv: conns done (2/2)
[ CSSD]2014-10-05 10:55:32.889 [4885] >TRACE: clssgmEstablishMasterNode: MASTER for 10 is node(1) birth(8)
[ CSSD]2014-10-05 10:55:32.889 [4885] >TRACE: clssgmChangeMasterNode: requeued 0 RPCs
[ CSSD]2014-10-05 10:55:32.994 [3857] >TRACE: clssgmHandleDBDone(): src/dest (1/65535) size(72) incarn 10
[ CSSD]CLSS-3000: reconfiguration successful, incarnation 10 with 2 nodes
[ CSSD]CLSS-3001: local node number 2, master node number 1
[ CSSD]2014-10-05 10:55:32.995 [4885] >TRACE: clssgmReconfigThread: completed for reconfig(10), with status(1)
[ CSSD]2014-10-05 10:55:33.826 [3857] >TRACE: clssgmCommonAddMember: clsomon joined (2/0x1000000/#CSS_CLSSOMON)
[ CSSD]2014-10-05 12:02:45.211 [3857] >TRACE: clscsendx: (111f4edd0) Connection not active
[ CSSD]2014-10-05 12:02:45.211 [3857] >TRACE: clssgmSendClient: Send failed rc 6, con (111f4edd0), client (111f4f1b0), proc (0)
[ CSSD]2014-10-05 12:02:45.211 [3857] >TRACE: clscsendx: (111f4f4d0) Connection not active
[ CSSD]2014-10-05 12:02:45.211 [3857] >TRACE: clssgmSendClient: Send failed rc 6, con (111f4f4d0), client (111f4f950), proc (0)
[ CSSD]2014-10-05 12:02:45.232 [3857] >TRACE: clscsendx: (111f5cbd0) Connection not active
[ CSSD]2014-10-05 12:02:45.232 [3857] >TRACE: clssgmSendClient: Send failed rc 6, con (111f5cbd0), client (111f5c290), proc (0)
[ CSSD]2014-10-05 12:02:45.234 [3857] >TRACE: clscsendx: (111f5d1f0) Connection not active
[ CSSD]2014-10-05 12:02:45.234 [3857] >TRACE: clssgmSendClient: Send failed rc 6, con (111f5d1f0), client (111f5d5d0), proc (0)
[ CSSD]2014-10-05 12:02:45.238 [3857] >TRACE: clscsendx: (111f52b70) Connection not active
[ CSSD]2014-10-05 12:02:45.238 [3857] >TRACE: clssgmSendClient: Send failed rc 6, con (111f52b70), client (111f52f50), proc (0)
[ CSSD]2014-10-05 12:02:45.238 [3857] >TRACE: clscsendx: (111f676d0) Connection not active
[ CSSD]2014-10-05 12:02:45.238 [3857] >TRACE: clssgmSendClient: Send failed rc 6, con (111f676d0), client (111f67ab0), proc (0)
[ CSSD]2014-10-05 12:02:45.265 [3857] >TRACE: clscsendx: (111f4b8f0) Connection not active
[ CSSD]2014-10-05 12:02:45.265 [3857] >TRACE: clssgmSendClient: Send failed rc 6, con (111f4b8f0), client (111f4b010), proc (0)
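
To line the reboots up against other logs, it may help to pull the daemon start times out of ocssd.log. The following is only a rough sketch, not an Oracle tool: it assumes that every ocssd start writes a ">USER: Copyright" banner line like the one in the excerpt above, and that timestamps keep the format shown there. The function name `find_restarts` is made up for this example.

```python
import re
from datetime import datetime

# Timestamp layout as seen in the excerpt, e.g. "[ CSSD]2014-10-05 10:55:30.375".
TS_RE = re.compile(r"\[\s*CSSD\](\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")

# Assumption: the first line ocssd writes on startup is the copyright banner.
START_MARKER = ">USER: Copyright"


def find_restarts(lines):
    """Return (last_timestamp_before_start, start_timestamp) for each daemon start.

    Lines without a timestamp (for example the CLSS-3000 messages) are skipped.
    The first element is None if the log begins with a start banner.
    """
    restarts = []
    last_ts = None
    for line in lines:
        match = TS_RE.search(line)
        if match is None:
            continue
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if START_MARKER in line:
            restarts.append((last_ts, ts))
        last_ts = ts
    return restarts
```

Calling `find_restarts(open(path))` on the log file (the path depends on the installation) gives, for each start, the last entry logged before it. The window between the two timestamps is the period to check in the OS error log and in the other node's ocssd.log.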