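A user reported that two nodes of a 4-node RAC cluster had been unable to start for about a month.
I. Symptoms
1. An on-site check found two of the four node servers (node2 and node4) offline.
2. On node2 and node4 there were no CRS-related processes; Oracle CRS was not running.
3. Running crsctl start has as root on node2 and node4 failed to start the stack.
4. Network connectivity between the nodes was normal.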
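The process check in item 2 can be sketched as a small shell helper. This is only an illustration: the function names are hypothetical, and the default daemon names (ohasd, crsd, ocssd, evmd) are the usual Grid Infrastructure daemons, assumed here rather than taken from the original notes.

```shell
#!/usr/bin/env bash
# Sketch: report whether any Clusterware daemons are running on this node.
# The default names below are an assumption; adjust them for your version.

crs_daemons_running() {
    # $1: extended regex of daemon names (optional)
    local pattern="${1:-ohasd|crsd|ocssd|evmd}"
    # List the command name of every process and keep the matching ones.
    ps -eo comm= | grep -E "$pattern" | sort -u
}

report_crs_state() {
    local found
    found="$(crs_daemons_running "$@")"
    if [ -n "$found" ]; then
        echo "running: $(echo "$found" | tr '\n' ' ')"
    else
        echo "no matching daemons running"
    fi
}

# Example:
#   report_crs_state
```

On a healthy node the helper lists the daemons it finds; on node2 and node4 in this case it would have reported none. As root, crsctl check has and crsctl check crs give the stack's own view of the same state.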
II. Troubleshooting
1. On node2, running crsctl start res ora.crsd -init to bring up the stack produced the following errors:
[root@cloud4 bin]# ./crsctl start res ora.crsd -init
CRS-2672: Attempting to start 'ora.crf' on 'cloud4'
CRS-2672: Attempting to start 'ora.cssd' on 'cloud4'
CRS-2672: Attempting to start 'ora.diskmon' on 'cloud4'
CRS-2676: Start of 'ora.diskmon' on 'cloud4' succeeded
CRS-2676: Start of 'ora.crf' on 'cloud4' succeeded
CRS-2676: Start of 'ora.cssd' on 'cloud4' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'cloud4'
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'cloud4'
CRS-2676: Start of 'ora.ctssd' on 'cloud4' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'cloud4' succeeded
CRS-2679: Attempting to clean 'ora.asm' on 'cloud4'
ORA-09925: Unable to create audit trail file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9925
ORA-09925: Unable to create audit trail file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9925
CRS-2681: Clean of 'ora.asm' on 'cloud4' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'cloud4'
ORA-09925: Unable to create audit trail file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9925
ORA-09925: Unable to create audit trail file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9925
CRS-5017: The resource action "ora.asm start" encountered the following error:
ORA-09925: Unable to create audit trail file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9925
ORA-09925: Unable to create audit trail file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9925
. For details refer to "(:CLSN00107:)" in "/u01/app/grid/diag/crs/cloud4/crs/trace/ohasd_oraagent_grid.trc".
CRS-2674: Start of 'ora.asm' on 'cloud4' failed
CRS-2679: Attempting to clean 'ora.asm' on 'cloud4'
ORA-09925: Unable to create audit trail file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9925
ORA-09925: Unable to create audit trail file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9925
CRS-2681: Clean of 'ora.asm' on 'cloud4' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'cloud4'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'cloud4' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'cloud4'
CRS-2677: Stop of 'ora.ctssd' on 'cloud4' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'cloud4'
CRS-2677: Stop of 'ora.cssd' on 'cloud4' succeeded
CRS-4000: Command Start failed, or completed with errors.
ORA-09925: Unable to create audit trail file
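This points to a file permission problem: the audit file cannot be created.
2. As the grid user, a check of $GRID_HOME/rdbms/audit showed that the directory was owned by root, with group root, and had only r-xr----- permissions, so the grid user could not write to it.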
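The directory check described above might look like the sketch below. The helper name check_dir_access is hypothetical, and it assumes GNU stat (for the -c format option) as found on typical Linux installations.

```shell
#!/usr/bin/env bash
# Sketch: show owner, group and mode of a directory, and whether the
# current user can create files in it.

check_dir_access() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    # GNU stat: %U owner, %G group, %A symbolic mode, %n name
    stat -c '%U:%G %A %n' "$dir"
    # Creating an audit file needs write and search (execute) permission.
    if [ -w "$dir" ] && [ -x "$dir" ]; then
        echo "writable by $(id -un)"
    else
        echo "NOT writable by $(id -un)"
        return 2
    fi
}

# Example (run as the grid user; GRID_HOME is assumed to be set):
#   check_dir_access "$GRID_HOME/rdbms/audit"
```

Run it as the grid user rather than root, because root passes the write test regardless of the mode bits.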
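3. Further checks showed that the owner and group of many directories under both the grid and the oracle installation directories had been changed. For the grid home, the permissions can be repaired with root.sh, or copied over from a healthy node.
For example, on node1, export the ownership and permissions (-R recurses into subdirectories, -p keeps absolute path names):
getfacl -pR /u01/app/12.1.0/grid > grid.ora
Copy grid.ora to the failing node and, as root, restore the ownership and permissions from it:
setfacl --restore=grid.ora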
4. Check the directory permissions on node4 in the same way, then repair them.
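One way to find which paths differ from a healthy node is to take an ownership and mode listing on each node and compare the two. The sketch below is not part of the original procedure; perm_snapshot and perm_diff are hypothetical helper names, and it assumes GNU find (for -printf).

```shell
#!/usr/bin/env bash
# Sketch: record owner, group and octal mode for every path under a
# directory, so listings from two nodes can be compared with diff.

perm_snapshot() {
    # $1: directory to scan. Paths are printed relative to $1 (%P) so
    # that snapshots taken on different nodes line up.
    # LC_ALL=C keeps the sort order identical across nodes.
    find "$1" -printf '%P\t%u:%g\t%m\n' | LC_ALL=C sort
}

perm_diff() {
    # $1: snapshot from the healthy node, $2: snapshot from the failing node
    # Lines starting with '<' are the expected values, '>' the actual ones.
    diff "$1" "$2"
}

# Example (file names are placeholders):
#   perm_snapshot /u01/app/12.1.0/grid > node1.snap   # on the healthy node
#   perm_snapshot /u01/app/12.1.0/grid > node4.snap   # on the failing node
#   perm_diff node1.snap node4.snap
```

An empty diff after the repair suggests the ownership and modes now match the healthy node. Files that legitimately exist on only one node, such as logs and traces, will also show up and can be ignored.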
5. On node2 and node4, start the services with crsctl start res ora.crsd -init.
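After the start command returns, crsctl check crs can confirm that the stack is up. The helper below is a hypothetical sketch that reads that output on stdin and succeeds only if every line reports a service as online. It assumes the usual message wording ("... is online"), which may vary between versions.

```shell
#!/usr/bin/env bash
# Sketch: decide from `crsctl check crs` output whether every listed
# service reports "is online". Reads the output on stdin.

all_services_online() {
    local line bad=0 seen=0
    while IFS= read -r line; do
        [ -z "$line" ] && continue
        seen=1
        case "$line" in
            *"is online"*) ;;                        # healthy service
            *) echo "not online: $line"; bad=1 ;;    # anything else is a problem
        esac
    done
    # No input at all also counts as a failure.
    [ "$seen" -eq 1 ] && [ "$bad" -eq 0 ]
}

# Example (as root on the repaired node):
#   ./crsctl check crs | all_services_online && echo "stack is up"
```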
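III. Root cause
Once Oracle GI and Oracle Database are installed, avoid changing ownership or permissions on the installation directories. Such changes can prevent the cluster or the database from starting and can cause a range of other problems.
In this case, the user had changed the owner and group of the cluster installation directory, which prevented the cluster from starting.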
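IV. Recommendations
After installation, Oracle mainly involves the following users:
root, the HA (high availability) administration user
grid, the ASM administration user
oracle, the database administration user
Tighten control over these accounts: people who do not need it must not log in as root, and nobody should create their own directories or change directory permissions under the installation directories.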