APPLIES TO:
Oracle Database - Enterprise Edition - Version 11.2.0.2 to 11.2.0.4 [Release 11.2]
Information in this document applies to any platform.
SYMPTOMS
The Grid Infrastructure (GI) log file "ocssd.log" grows beyond the configured log file size limit.
Log file rotation fails with error LFI-00142: Unable to delete an existing file [ocssd][l10] not owned by Oracle.
This problem can exhaust file system space.
Example: in the listing below, ocssd.log in $GRID_HOME/log/<hostname>/cssd/ has grown to more than 5 GB.
-rw-r--r-- 1 grid oinstall 5517323953 Jun 6 07:57 ocssd.log ----> Here!
-rw-rw-r-- 1 grid oinstall 483269 Jun 5 15:37 cssdOUT.log
-rw------- 1 grid oinstall 74092544 Jan 31 10:36 core.9352
-rw-r--r-- 1 grid oinstall 158399110 Jan 29 14:35 ocssd.l01
-rw-r--r-- 1 grid oinstall 158423139 Jan 24 17:00 ocssd.l02
-rw-r--r-- 1 grid oinstall 158422898 Jan 18 06:26 ocssd.l03
-rw-r--r-- 1 grid oinstall 158402809 Jan 11 20:54 ocssd.l04
-rw-r--r-- 1 grid oinstall 158413241 Jan 5 13:07 ocssd.l05
-rw-r--r-- 1 grid oinstall 158413772 Dec 30 17:44 ocssd.l06
-rw-r--r-- 1 grid oinstall 158404163 Dec 24 15:57 ocssd.l07
-rw-r--r-- 1 grid oinstall 158391594 Dec 18 16:22 ocssd.l08
-rw-r--r-- 1 grid oinstall 158406347 Dec 12 22:09 ocssd.l09
-rw-r--r-- 1 grid oinstall 158420893 Dec 6 12:09 ocssd.l10
In the above example, no log file should exceed 150 MB (157286400 bytes). Query CSS as follows to confirm the log file size limit:
% crsctl get css logfilesize
CRS-4676: Successful get logfilesize 157286400 for Cluster Synchronization Services.
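The comparison between the limit reported by crsctl and the actual file size can be automated. The following is a minimal sketch only: the function name check_log_size is hypothetical, and the limit is passed in by hand rather than parsed from the crsctl output.

```shell
# Sketch: report whether a log file has outgrown a given size limit.
# check_log_size is a hypothetical helper, not part of Grid Infrastructure.
# Usage: check_log_size <file> <limit_bytes>
check_log_size() {
    file=$1
    limit=$2
    # Reading from stdin makes wc print only the byte count.
    size=$(wc -c < "$file" | tr -d ' ')
    if [ "$size" -gt "$limit" ]; then
        echo "OVER: $file is $size bytes (limit $limit)"
        return 1
    fi
    echo "OK: $file is $size bytes (limit $limit)"
    return 0
}
```

For example, `check_log_size $GRID_HOME/log/<hostname>/cssd/ocssd.log 157286400` would report OVER for the 5 GB file in the listing above.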
CHANGES
None
CAUSE
The problem is caused by unpublished Bug 18700935 - CLOUD:ACLDX0085 OCSSD LOG IS NOT ROTATED
At some point, the Clusterware alert log reports a failed log file rotation attempt.
As a result, the oldest rotated log file, 'ocssd.l10', is never deleted. This may happen because the file is open when the delete is attempted, or because of a permissions problem on the file itself.
The ocssd.bin thread that performs log file rotation, 'clsd_logThread', encounters the delete failure and from then on never deletes or rotates the log files, so ocssd.log grows continually.
Extract of the error from $GRID_HOME/log/<hostname>/alert<hostname>.log:
[cssd(29355)]CRS-1713:CSSD daemon is started in clustered mode
2014-06-05 15:37:44.512:
[cssd(29355)]CRS-0009:log file "/u01/app/11.2.0.3/grid/log/ed28db01/cssd/ocssd.log" reopened
2014-06-05 15:37:44.512:
[cssd(29355)]CRS-0019:file rotation terminated. log file: "/u01/app/11.2.0.3/grid/log/ed28db01/cssd/ocssd.log"
[cssd(29355)]CRS-0014:An error occurred while attempting to delete file "/u01/app/11.2.0.3/grid/log/ed28db01/cssd/ocssd.l10" during log file rotation. Additional diagnostics: LFI-00142: Unable to delete an existing file [ocssd][l10] not owned by Oracle.
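To check whether a node has hit this failure, the alert log can be searched for the CRS-0014 / LFI-00142 pair. The following is a sketch under the assumption that both codes appear on the same line, as in the extract above; the function name find_rotation_failures is hypothetical.

```shell
# Sketch: list the files that log rotation failed to delete, based on
# CRS-0014 entries carrying LFI-00142 in a Clusterware alert log.
# find_rotation_failures is a hypothetical helper name.
# Usage: find_rotation_failures <alert_log>
find_rotation_failures() {
    # Keep only lines with both codes, then extract the quoted path
    # that follows "delete file" and drop duplicate paths.
    grep 'CRS-0014' "$1" | grep 'LFI-00142' |
        sed -n 's/.*delete file "\([^"]*\)".*/\1/p' | sort -u
}
```

It would be called as `find_rotation_failures $GRID_HOME/log/<hostname>/alert<hostname>.log`; no output means no matching entry was found.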
SOLUTION
The CSSD thread that encountered the LFI-00142 error needs to be restarted to ensure log rotation works again.
Manually deleting the logfile will not resolve the log rotation problem.
1). Shut down CRS on the node reporting the problem.
# crsctl stop crs
2). Once CRS is down, proceed to manually delete the 'ocssd.l10' file, or copy the logfile to another location if you need to keep a backup.
# rm $GRID_HOME/log/<hostname>/cssd/ocssd.l10
3). Start Clusterware again.
# crsctl start crs
If you are NOT able to schedule downtime and log file growth in the Grid home is causing a space issue, copy the logs to another location and then truncate the file as follows:
% echo 0 > ocssd.l10
Please note this does not resolve the log rotation problem but only allows you to free up some space.
4). Bug 18700935 is fixed in the 11.2.0.4.5 PSU for Unix/Linux platforms and in the 11.2.0.4.12 Bundle for Windows platforms. Please apply the patch if required.
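The no-downtime workaround described above (copy the log elsewhere, then truncate it) could be scripted along the following lines. This is a sketch only: backup_and_truncate is a hypothetical helper, the timestamp suffix on the backup name is an assumption, and the truncation uses the same `echo 0 >` command as the note. As stated above, it frees space but does not restore log rotation.

```shell
# Sketch: copy a log file to a backup directory, verify the copy,
# then truncate the original with the command used in this note.
# backup_and_truncate is a hypothetical helper name.
# Usage: backup_and_truncate <log_file> <backup_dir>
backup_and_truncate() {
    src=$1
    dest_dir=$2
    # Assumed naming scheme: original name plus a timestamp suffix.
    dest=$dest_dir/$(basename "$src").$(date +%Y%m%d%H%M%S)
    # Stop before truncating if the copy fails or does not match.
    cp -p "$src" "$dest" || return 1
    cmp -s "$src" "$dest" || { echo "backup mismatch: $dest" >&2; return 1; }
    echo 0 > "$src"
    # Print the backup location for the caller.
    echo "$dest"
}
```

The copy is verified with cmp before the original is overwritten, so a full or missing backup directory leaves the log untouched.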