Database instance crashes on its own with ORA-27157, ORA-27300, and related errors


After a 12c RAC database was installed on RHEL 7.2, one of the database instances frequently crashed on its own. The alert log showed the following errors:

Errors in file /d12/app/oracle/diag/rdbms/rac12c/rac12c2/trace/rac12c2_j000_21047.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
Fri Sep 09 16:50:53 2016
Errors in file /d12/app/oracle/diag/rdbms/rac12c/rac12c2/trace/rac12c2_rmv0_20798.trc:
ORA-27157: OS post/wait facility removed
Fri Sep 09 16:50:53 2016
Errors in file /d12/app/oracle/diag/rdbms/rac12c/rac12c2/trace/rac12c2_q005_21328.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1

 

Cause:

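In RHEL 7.2, the systemd-logind service introduced a new feature: once a user has fully logged out of the OS, all IPC objects belonging to that user are removed.
The feature is controlled by the RemoveIPC option in the /etc/systemd/logind.conf configuration file. See man logind.conf(5) for details.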

In RHEL 7.2, RemoveIPC defaults to yes.
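To see which value a given host is configured with, the option can be read straight from the configuration file. The following is a minimal sketch, not an official tool: the function name `removeipc_setting` is hypothetical, it looks only at the single file passed in (drop-in files under a logind.conf.d directory are ignored), and it reports `yes` when the option is absent or commented out, on the assumption that the RHEL 7.2 default described above applies.

```shell
#!/bin/sh
# Print the RemoveIPC value configured in a logind.conf-style file.
# Assumption: when the option is absent or commented out, the RHEL 7.2
# default of "yes" described in this article is reported instead.
removeipc_setting() {
    conf="${1:-/etc/systemd/logind.conf}"
    # Keep only uncommented "RemoveIPC=<value>" lines; the last one wins.
    value=$(sed -n 's/^[[:space:]]*RemoveIPC[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*$/\1/p' "$conf" 2>/dev/null | tail -n 1)
    printf '%s\n' "${value:-yes}"
}
```

Calling `removeipc_setting` with no argument reads /etc/systemd/logind.conf; passing a path lets the same check run against a copy of the file.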

As a result, when the last session of the oracle or grid user ends, the operating system removes that user's shared memory segments and semaphores.
The SGA of Oracle ASM and of the database lives in shared memory segments, so removing those segments crashes the Oracle ASM and database instances.
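One way to observe this is to count the IPC objects owned by the oracle or grid user before and after that user's last session ends. The sketch below is illustrative only: `count_ipc_owned_by` is a hypothetical helper that reads `ipcs`-style output from stdin, and it assumes the column layout printed by the util-linux `ipcs -m` and `ipcs -s` commands, where data rows start with a `0x...` key and the owner is the third column.

```shell
#!/bin/sh
# Count IPC objects owned by a given user in `ipcs`-style output read
# from stdin. Assumes the util-linux layout: data rows begin with a
# 0x... key and carry the owner name in the third column.
count_ipc_owned_by() {
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner { n++ } END { print n + 0 }'
}
```

For example, `ipcs -s | count_ipc_owned_by oracle` counts the semaphore arrays owned by oracle, and `ipcs -m | count_ipc_owned_by grid` counts the shared memory segments owned by grid. A count that drops to 0 while an instance is still supposed to be running matches the behavior described above.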

See Red Hat bug 1264533 - https://bugzilla.redhat.com/show_bug.cgi?id=1264533

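The problem affects every application that uses shared memory segments and semaphores, so both Oracle ASM instances and Oracle Database instances are affected.
To avoid the problem, Oracle Linux 7.2 explicitly sets RemoveIPC to no in the /etc/systemd/logind.conf configuration file.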

 

Symptoms caused by this problem:

1) Installing 11.2 and 12c GI/CRS fails, because ASM crashes towards the end of the installation.
2) Upgrading to 11.2 and 12c GI/CRS fails.
3) After Red Hat Linux is upgraded to 7.2, 11.2 and 12c ASM and database instances crash.

systemd-logind can remove the IPC objects at any time, so the errors that end up in the logs differ from case to case. For example:

The most common error is the following, found in the ASM or database alert.log:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
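On Linux, status 43 is the errno value EIDRM ("Identifier removed"), which is what semop reports when the semaphore set it is waiting on has been deleted. A quick way to check whether a log carries this signature is a plain grep; the sketch below assumes nothing beyond the error codes quoted in this article, and the function name `find_ipc_removed_errors` is hypothetical.

```shell
#!/bin/sh
# Print the lines (prefixed with their line numbers) of a log file that
# carry the ORA-27157 / ORA-27300 / ORA-27301 / ORA-27302 signature.
find_ipc_removed_errors() {
    grep -n -E 'ORA-27157|ORA-2730[0-2]' "$1"
}
```

It takes the path of an alert log as its only argument, for example the alert log under the diag trace directory of the affected instance. ORA-27300 through ORA-27302 also accompany other OS-level failures, so the ORA-27157 line and the "Identifier removed" message are the parts that point specifically at removed IPC objects.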

  

The second observed error occurs during installation and upgrade when asmca fails with the following error:
KFOD-00313: No ASM instances available. CSS group services were successfully initilized by kgxgncin
KFOD-00105: Could not open pfile 'init@.ora'

  

The third observed error occurred during installation and upgrade:
Creation of ASM password file failed. Following error occurred: Error in Process: /d12/app/12.1.0/grid/bin/orapwd

Enter password for SYS:

OPW-00009: Could not establish connection to Automatic Storage Management instance

2015/11/20 21:38:45 CLSRSC-184: Configuration of ASM failed
2015/11/20 21:38:46 CLSRSC-258: Failed to configure and start ASM

  

The fourth observed error is the following message, found in the /var/log/messages file around the time that the ASM or database instance crashed:
Nov 20 21:38:43 testc201 kernel: traps: oracle[24861] trap divide error ip:3896db8 sp:7ffef1de3c40 error:0 in oracle[400000+ef57000]

  

Fix:

1) Set RemoveIPC=no in /etc/systemd/logind.conf.
2) Reboot the server or restart systemd-logind.

To restart systemd-logind:

# systemctl daemon-reload
# systemctl restart systemd-logind
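Step 1 can also be scripted. The following is a rough sketch under stated assumptions rather than a vetted procedure: the function name `set_removeipc_no` is hypothetical, it edits only the one file it is given, and when it has to append the option it assumes that `[Login]` is the last section of the file, as in the stock logind.conf. A `.bak` copy of the original is kept next to the file.

```shell
#!/bin/sh
# Set RemoveIPC=no in a logind.conf-style file, keeping a .bak copy.
# An existing RemoveIPC line (commented out or not) is rewritten in
# place; if there is none, the option is appended, which assumes that
# [Login] is the last section of the file.
set_removeipc_no() {
    conf="${1:-/etc/systemd/logind.conf}"
    cp -p "$conf" "$conf.bak" || return 1
    if grep -q -E '^[[:space:]]*#?[[:space:]]*RemoveIPC[[:space:]]*=' "$conf"; then
        # Write back through a redirect so the file keeps its inode and mode.
        sed -E 's/^[[:space:]]*#?[[:space:]]*RemoveIPC[[:space:]]*=.*/RemoveIPC=no/' "$conf.bak" > "$conf"
    else
        printf 'RemoveIPC=no\n' >> "$conf"
    fi
}
```

Run as root, `set_removeipc_no` with no argument edits /etc/systemd/logind.conf. The change takes effect only after the reboot or the systemd-logind restart from step 2.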

  

MOS Doc:

ALERT: Setting RemoveIPC=yes on Redhat 7.2 Crashes ASM and Database Instances as Well as Any Application That Uses a Shared Memory Segment (SHM) or Semaphores (SEM) (Doc ID 2081410.1)
