Oracle 11g (11.2.0.3) RAC
[grid@RLB01-DB02 ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   94fd4da8e8724f0fbf6e26dceebae53b (/dev/oracleasm/disks/VOTE) [VOTE_OCR]
Located 1 voting disk(s).
[grid@lb01-db02 ~]$ /u01/app/11.2.0/grid/bin/crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
      1        OFFLINE OFFLINE
ora.cluster_interconnect.haip
      1        OFFLINE OFFLINE
ora.crf
      1        ONLINE  ONLINE       lb01-db02
ora.crsd
      1        OFFLINE OFFLINE
ora.cssd
      1        ONLINE  OFFLINE                               STARTING
ora.cssdmonitor
      1        ONLINE  ONLINE       lb01-db02
ora.ctssd
      1        OFFLINE OFFLINE
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        OFFLINE OFFLINE
ora.gipcd
      1        ONLINE  ONLINE       lb01-db02
ora.gpnpd
      1        ONLINE  ONLINE       lb01-db02
ora.mdnsd
      1        ONLINE  ONLINE       lb01-db02
[grid@RLB01-DB02 ~]$ vim /u01/app/11.2.0/grid/log/rlb01-db02/cssd/ocssd.log
2019-02-25 10:19:50.798: [ CSSD][1098787136]clssnmvDHBValidateNCopy: node 1, rlb01-db01, has a disk HB, but no network HB, DHB has rcfg 251082553, wrtcnt, 167308331, LATS 148031504, lastSeqNo 167308330, uniqueness 1523724667, timestamp 1551061593/1562825448
2019-02-25 10:19:51.315: [ CSSD][1089845568]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
2019-02-25 10:19:51.520: [ CSSD][1102690624]clssscSelect: cookie accept request 0x2aaaac024f00
2019-02-25 10:19:51.520: [ CSSD][1102690624]clssgmAllocProc: (0xf6ebf00) allocated
2019-02-25 10:19:51.521: [ CSSD][1102690624]clssgmClientConnectMsg: properties of cmProc 0xf6ebf00 - 1,2,3,4,5
2019-02-25 10:19:51.521: [ CSSD][1102690624]clssgmClientConnectMsg: Connect from con(0x5049) proc(0xf6ebf00) pid(18920) version 11:2:1:4, properties: 1,2,3,4,5
2019-02-25 10:19:51.521: [ CSSD][1102690624]clssgmClientConnectMsg: msg flags 0x0000
2019-02-25 10:19:51.522: [ CSSD][1102690624]clssscSelect: cookie accept request 0xf6ebf00
2019-02-25 10:19:51.522: [ CSSD][1102690624]clssscevtypSHRCON: getting client with cmproc 0xf6ebf00
2019-02-25 10:19:51.522: [ CSSD][1102690624]clssgmRegisterClient: proc(4/0xf6ebf00), client(1/0xf6ec070)
2019-02-25 10:19:51.523: [ CSSD][1102690624]clssgmJoinGrock: global grock CRF- new client 0xf6ec070 with con 0x5078, requested num -1, flags 0x4000e00
2019-02-25 10:19:51.523: [ CSSD][1102690624]clssgmJoinGrock: ignoring grock join for client not requiring fencing until group information has been received from the master; group name CRF-, member number -1, flags 0x4000e00
2019-02-25 10:19:51.523: [ CSSD][1102690624]clssgmDiscEndpcl: gipcDestroy 0x5078
2019-02-25 10:19:51.523: [ CSSD][1102690624]clssgmDeadProc: proc 0xf6ebf00
2019-02-25 10:19:51.523: [ CSSD][1102690624]clssgmDestroyProc: cleaning up proc(0xf6ebf00) con(0x5049) skgpid ospid 18920 with 0 clients, refcount 0
2019-02-25 10:19:51.524: [ CSSD][1102690624]clssgmDiscEndpcl: gipcDestroy 0x5049
2019-02-25 10:19:51.801: [ CSSD][1098787136]clssnmvDHBValidateNCopy: node 1, rlb01-db01, has a disk HB, but no network HB, DHB has rcfg 251082553, wrtcnt, 167308333, LATS 148032504, lastSeqNo 167308331, uniqueness 1523724667, timestamp 1551061595/1562827288
2019-02-25 10:19:51.822: [ CSSD][1105844544]clssnmSendingThread: sending join msg to all nodes
2019-02-25 10:19:51.822: [ CSSD][1105844544]clssnmSendingThread: sent 4 join msgs to all nodes
2019-02-25 10:19:52.317: [ CSSD][1089845568]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
2019-02-25 10:19:53.319: [ CSSD][1089845568]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
2019-02-25 10:19:53.816: [ CSSD][1107421504]clssnmRcfgMgrThread: Local Join
2019-02-25 10:19:53.816: [ CSSD][1107421504]clssnmLocalJoinEvent: begin on node(2), waittime 193000
2019-02-25 10:19:53.816: [ CSSD][1107421504]clssnmLocalJoinEvent: set curtime (148034524) for my node
2019-02-25 10:19:53.816: [ CSSD][1107421504]clssnmLocalJoinEvent: scanning 32 nodes
2019-02-25 10:19:53.816: [ CSSD][1107421504]clssnmLocalJoinEvent: Node rlb01-db01, number 1, was shut down
2019-02-25 10:19:53.818: [ CSSD][1107421504]clssnmLocalJoinEvent: Starting initial cluster reconfig
2019-02-25 10:19:53.818: [ CSSD][1107421504]clssnmDoSyncUpdate: Initiating sync 0
2019-02-25 10:19:53.818: [ CSSD][1107421504]clssscCompareSwapEventValue: changed NMReconfigInProgress val 2, from -1, changes 1
2019-02-25 10:19:53.818: [ CSSD][1107421504]clssnmDoSyncUpdate: local disk timeout set to 200000 ms, remote disk timeout set to 200000
2019-02-25 10:19:53.818: [ CSSD][1107421504]clssnmDoSyncUpdate: new values for local disk timeout and remote disk timeout will take effect when the sync is completed.
2019-02-25 10:19:53.818: [ CSSD][1107421504]clssnmSetFirstIncarn: Node 1 incarnation 251082553
2019-02-25 10:19:53.818: [ CSSD][1107421504]clssnmSetFirstIncarn: Node 2 incarnation 251082552
2019-02-25 10:19:53.818: [ CSSD][1107421504]clssnmSetFirstIncarn: Incarnation set to 251082554
2019-02-25 10:19:53.818: [ CSSD][1107421504]clssnmDoSyncUpdate: Starting cluster reconfig with incarnation 251082554
2019-02-25 10:19:53.818: [ CSSD][1107421504]clssnmSetupAckWait: Ack message type (11)
2019-02-25 10:19:53.818: [ CSSD][1107421504]clssnmSetupAckWait: node(2) is ALIVE
2019-02-25 10:19:53.818: [ CSSD][1107421504]clssnmSendSync: syncSeqNo(251082554), indicating EXADATA fence initialization incomplete
2019-02-25 10:19:53.818: [ CSSD][1107421504]List of nodes that have ACKed my sync: NULL
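The recurring symptom in the excerpt above is the clssnmvDHBValidateNCopy message: node 2 sees a disk heartbeat from node 1 (rlb01-db01) but no network heartbeat. Below is a minimal sketch for counting these messages per remote node in an ocssd.log file. The function name count_missing_network_hb is hypothetical and not part of Grid Infrastructure; it only assumes the message layout shown in the log lines above.

```shell
#!/bin/sh
# Hypothetical helper (not an Oracle tool): for each remote node, count how
# often ocssd.log reports "has a disk HB, but no network HB".
# Assumes the message layout seen in the excerpt:
#   ...clssnmvDHBValidateNCopy: node <number>, <hostname>, has a disk HB, ...
count_missing_network_hb() {
    # $1: path to an ocssd.log file
    grep 'has a disk HB, but no network HB' "$1" |
        sed -n 's/.*clssnmvDHBValidateNCopy: node \([0-9][0-9]*\), \([^,]*\),.*/\2/p' |
        sort |
        uniq -c |
        awk '{ print $2, $1 }'   # print "<hostname> <count>"
}

# Example call, using the log path from this note:
# count_missing_network_hb /u01/app/11.2.0/grid/log/rlb01-db02/cssd/ocssd.log
```

A count that keeps growing while ora.cssd stays in STARTING suggests the node still cannot reach its peer over the interconnect.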
------ Tentatively diagnosed as a bug, possibly bug 13334158 and bug 13811209.
Both bugs are fixed in 11.2.0.3 GI PSU3 and later PSUs, so installing the latest 11.2.0.3 GI PSU5 (patch 14727310) is recommended.
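Before patching, or after it, it can help to confirm whether a given patch number already appears in the Grid home inventory. The sketch below reads `opatch lsinventory` output on standard input and looks for the number as a whole word. The function name patch_listed is hypothetical, the patch number is the one quoted in this note, and no particular lsinventory layout is assumed.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the patch number given as $1
# appears as a whole word in the inventory text read from standard input.
patch_listed() {
    grep -q -w -- "$1"
}

# Example call (Grid home path taken from this note; run as the grid owner):
# GRID_HOME=/u01/app/11.2.0/grid
# "$GRID_HOME/OPatch/opatch" lsinventory -oh "$GRID_HOME" | patch_listed 14727310 \
#     && echo "patch listed" || echo "patch not listed"
```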
[root@RLB01-DB01 ~]# /u01/app/11.2.0/grid/bin/crsctl stop cluster -all
[root@RLB01-DB01 ~]# /u01/app/11.2.0/grid/bin/crsctl start cluster -all