This is somewhat similar to this bug:
Bug 18971331 : DB NODE#1 IS CRASHED WITH HUGE CONSUMPTION OF CPU RESOURCE TO OCSSD.BIN
Hdr: 18971331 11.2.0.4 PCW 11.2.0.4 CSS PRODID-5 PORTID-226
Abstract: DB NODE#1 IS CRASHED WITH HUGE CONSUMPTION OF CPU RESOURCE TO OCSSD.BIN
*** 06/12/14 07:19 pm ***
PROBLEM:
--------
The customer's problem is that DB node#1 of the Exadata rack crashed twice
within two months. Here we focus on the crash of DB node#1 at 05:23 on
May 25. According to the Linux team's analysis, the ocssd.bin process was
consuming 2287.2% CPU while the problem was occurring.
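The Linux team's measurement can be reproduced with ps (a sketch; %CPU above 100% is normal on a multi-core host because it sums across a process's threads):

```shell
# List the top CPU consumers; on a DB node showing this problem,
# ocssd.bin would appear near the top with a very high %CPU.
ps -eo pcpu,pid,comm --sort=-pcpu | head -n 10

# CPU usage of ocssd.bin specifically (message if not running).
ps -C ocssd.bin -o pcpu= 2>/dev/null || echo "ocssd.bin not running"
```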
DIAGNOSTIC ANALYSIS:
--------------------
<<< ocssd.log of DB Node#1 >>>
...
2014-05-25 05:25:46.049: [ CSSD][764115264]clssgmShutDown: Received
abortive shutdown request from client.
2014-05-25 05:25:46.049: [
CSSD][764115264]###################################
2014-05-25 05:25:46.049: [ CSSD][764115264]clssscExit: CSSD aborting from
thread GMClientListener
2014-05-25 05:25:46.049: [
CSSD][764115264]###################################
2014-05-25 05:25:46.049: [ CSSD][764115264](:CSSSC00012
clssscExit: A
fatal error occurred and the CSS daemon is terminating abnormally
bcexadb01, number 1, has experienced a failure in thread number 15 and is
shutting down
2014-05-25 05:25:46.050: [ CSSD][764115264]clssgmThreadRecovery:recovering
clntlsnr mutex
2014-05-25 05:25:46.050: [ CSSD][764115264]clssnmQueueClientEvent:
Sending Event(6), type 6, incarn 288609881
2014-05-25 05:25:46.050: [ CSSD][764115264]clssnmQueueClientEvent: Node[1]
state = 3, birth = 288609879, unique = 1396727807
2014-05-25 05:25:46.050: [ CSSD][764115264]clssnmQueueClientEvent: Node[2]
state = 3, birth = 288609877, unique = 1393111836
2014-05-25 05:25:46.050: [ CSSD][764115264]clssscExit: Starting CRSD
cleanup
...
<<< ocssd.log of DB Node#2 >>>
...
2014-05-25 05:26:45.950: [ CSSD][3088668992]clssnmWaitOnEviction: Node
kill could not be performed. Admin or connection validation failed
2014-05-25 05:26:45.950: [ CSSD][3102861632]clssnmvDiskEvict: Kill block
write, file o/192.168.10.7;192.168.10.8/DBFS_DG_CD_02_bcexaceladm02 flags
0x00010004, kill block unique 1396727807, stamp 3554575728/3554575728
2014-05-25 05:26:45.950: [ CSSD][3098130752]clssnmvDiskEvict: Kill block
write, file o/192.168.10.9;192.168.10.10/DBFS_DG_CD_02_bcexaceladm03 flags
0x00010004, kill block unique 1396727807, stamp 3554575728/3554575728
2014-05-25 05:26:45.950: [ CSSD][3107592512]clssnmvDiskEvict: Kill block
write, file o/192.168.10.5;192.168.10.6/DBFS_DG_CD_03_bcexaceladm01 flags
0x00010004, kill block unique 1396727807, stamp 3554575728/3554575728
2014-05-25 05:26:45.950: [ CSSD][3088668992]clssnmWaitOnEvictions: node 1,
undead 1, EXADATA fence handle 13 kill reqest id 0, last DHB (1400963145,
4235069254, 4232085), seedhbimpd TRUE
2014-05-25 05:26:45.950: [ CSSD][3088668992]clssnmrCheckEDFenceStatus:
Node bcexadb01, number 1, EXADATA fence handle 13 has status 1, status
request return code ORA-0
2014-05-25 05:26:45.950: [ CSSD][3088668992]clssnmCheckKillStatus: Node 1,
bcexadb01, down, due to timeout of DHB; last NHB TOD invariant clock
4235069294, TOD invariant clock 4235069254
2014-05-25 05:26:45.950: [ CSSD][3088668992]clssnmWaitOnEviction: node(1)
exceeded graceful shutdown period, IPMI-kill allowed if configured
2014-05-25 05:26:45.950: [ CSSD][3219560160]clssgmQueueGrockEvent:
groupName(DGBCDB0) count(2) master(1) event(2), incarn 4, mbrc 2, to member
1, events 0x0, state 0x0
2014-05-25 05:26:45.950: [ CSSD][3088668992]clssnmWaitOnEviction: Node
kill could not be performed. Admin or connection validation failed
...
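The shutdown/eviction milestones quoted above can be pulled from both nodes' logs with a simple grep (a sketch; the log path below is a typical 11.2 Grid home layout and is an assumption, so adjust it to the environment):

```shell
# Hypothetical default location of the CSS daemon log on 11.2 Grid
# Infrastructure; pass another path as $1 if it differs.
LOG=${1:-/u01/app/11.2.0/grid/log/$(hostname -s)/cssd/ocssd.log}

# Extract only the eviction-related messages seen in this bug trace.
if [ -f "$LOG" ]; then
    grep -E 'clssscExit|clssgmShutDown|clssnmWaitOnEviction|clssnmvDiskEvict|clssnmCheckKillStatus' "$LOG"
else
    echo "log not found: $LOG" >&2
fi
```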
WORKAROUND:
-----------
NONE.
RELATED BUGS:
-------------