Problem description
1)
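Node RAC1 hung: an Oracle bug drove CPU usage up, and the cluster then started to fence (evict) the node, but CPU usage was too high for the fencing to succeed.
Root cause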
1)
Bug 21286665 - "Streams AQ: enqueue blocked on low memory" waits with fix 18828868 - superseded (Doc ID 21286665.8)
The log is shown in the figure:
Solution
1)
Shut down the primary with shutdown abort and moved the workload to the standby, where it ran normally.
2)
Examined the database logs to locate the problem.
3)
Identified the problem and found the bug:
1)
Bug 21286665 - "Streams AQ: enqueue blocked on low memory" waits with fix 18828868
4)
Download the patch that fixes the bug:
p22502456_112040_Linux-x86-64.zip
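Upgrade OPatch: download the latest OPatch release, p6880880_112000_Linux-x86-64.zip
Patching procedure:
The database must be shut down before the patch is applied.
1. Download the patch, upload it to the $ORACLE_HOME/OPatch directory, and unzip it there.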
2. [oracle@xxx OPatch]$ pwd
/home/app/oracle/product/11.2.0/OPatch
[oracle@xxx OPatch]$ cd 22502456/
[oracle@xxx 22502456]$ ../OPatch/opatch apply
3. Verify: ../OPatch/opatch lsinventory
4. Alternatively, connect to the database, load the patch SQL, and then verify with a query:
@?/rdbms/admin/catbundle.sql psu apply
select * from dba_registry_history;
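The OPatch steps above can be collected into a small wrapper. This is only a sketch under stated assumptions: the function names `run` and `apply_patch` and the `DRY_RUN` switch are made up for illustration, the conflict check (`opatch prereq CheckConflictAgainstOHWithDetail`) is an extra step that is not part of the original procedure, and the paths in the usage comment are inferred from the session above. With `DRY_RUN=1` (the default) the commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch: print (DRY_RUN=1, default) or execute (DRY_RUN=0) the OPatch steps.
# Run as the oracle software owner, with the database already shut down.

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf 'DRY-RUN: %s\n' "$*"
  else
    "$@"
  fi
}

# $1 = directory of the unzipped patch, $2 = path to the opatch executable
apply_patch() {
  local patch_dir="$1" opatch="$2"
  run "$opatch" version || return 1                        # confirm the upgraded OPatch is used
  run "$opatch" prereq CheckConflictAgainstOHWithDetail -ph "$patch_dir" || return 1  # added step (assumption)
  run "$opatch" apply "$patch_dir" || return 1             # apply the patch
  run "$opatch" lsinventory                                # verify the patch is listed
}

# Usage (paths are assumptions based on the session above):
#   DRY_RUN=1 apply_patch "$ORACLE_HOME/OPatch/22502456" "$ORACLE_HOME/OPatch/OPatch/opatch"
```

Keeping the dry run as the default makes it possible to review the exact command lines before anything touches the Oracle home.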
2) After the primary was patched, starting the database with startup failed with the following errors:
ORA-15025: could not open disk "/dev/asm_ssd3"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
ORA-15025: could not open disk "/dev/asm_ssd4"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
ORA-15025: could not open disk "/dev/asm_ssd5"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
SUCCESS: diskgroup SSDDATA was dismounted
ERROR: diskgroup SSDDATA was not mounted
ORA-15025: could not open disk "/dev/asm_ssd2"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
ORA-15025: could not open disk "/dev/asm_ssd3"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
ORA-15025: could not open disk "/dev/asm_ssd4"
ORA-27041: unable to open file
Linux-x86_64 Error: 13: Permission denied
Additional information: 9
ORA-15025: could not open disk "/dev/asm_ssd5"
ORA-27041: unable to open file
The database has to be started with the cluster commands (srvctl) after logging in as the grid user:
[grid@shdbrac1 ~]$ srvctl status database -d shdbrac
Instance shdbrac1 is running on node shdbrac1
Instance shdbrac2 is running on node shdbrac2
srvctl status instance -d shdbrac -i shdbrac1
srvctl stop instance -d shdbrac -i shdbrac1
srvctl start instance -d shdbrac -i shdbrac1
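The ORA-15025 / "Error: 13: Permission denied" messages above mean that the process trying to mount the disk group could not open the ASM devices. A quick way to see this from the shell is to test read/write access to each device as the OS user that attempted the startup. The following is a minimal sketch; the function name `check_disk_access` is made up for illustration, and it only reports the symptom, not why the permissions differ between users.

```shell
#!/usr/bin/env bash
# Sketch: list every path the current OS user cannot open for read and write.
# Prints one line per problem path; returns non-zero if any problem was found.
check_disk_access() {
  local rc=0 dev
  for dev in "$@"; do
    if [ ! -e "$dev" ]; then
      echo "MISSING: $dev"
      rc=1
    elif [ ! -r "$dev" ] || [ ! -w "$dev" ]; then
      echo "NO ACCESS: $dev"
      rc=1
    fi
  done
  return "$rc"
}

# Usage (device names taken from the log above; run once as oracle, once as grid):
#   check_disk_access /dev/asm_ssd2 /dev/asm_ssd3 /dev/asm_ssd4 /dev/asm_ssd5
```

Comparing the output for the two users shows whether the failure is tied to the account used for the startup.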
After both RAC nodes were patched, the Data Guard (DG) archive log shipping reported errors:
SUCCESS: diskgroup SSDDATA was dismounted
ERROR: diskgroup SSDDATA was not mounted
ORA-00210: cannot open the specified control file
ORA-00202: control file: '+SSDDATA/shdbrac/controlfile/current.257.946912997'
ORA-17503: ksfdopn:2 Failed to open file +SSDDATA/shdbrac/controlfile/current.257.946912997
ORA-15001: diskgroup "SSDDATA" does not exist or is not mounted
...skipping...
returning error ORA-16191
------------------------------------------------------------
PING[ARC2]: Heartbeat failed to connect to standby 'stbdb'. Error is 16191.
Error 1017 received logging on to the standby
------------------------------------------------------------
Check that the primary and standby are using a password file
and remote_login_passwordfile is set to SHARED or EXCLUSIVE,
and that the SYS password is same in the password files.
returning error ORA-16191
After the DG standby database was shut down and restarted, the following errors occurred:
ORA-16136 signalled during: alter database recover managed standby database cancel...
alter database open read only
AUDIT_TRAIL initialization parameter is changed to OS, as DB is NOT compatible for database opened with read-only access
Fri Apr 03 21:22:56 2020
Beginning Standby Crash Recovery.
Serial Media Recovery started
Managed Standby Recovery starting Real Time Apply
Media Recovery Log /oradata/stbdb/archivelog/1_63869_946912997.dbf
Media Recovery Waiting for thread 2 sequence 29011
Fri Apr 03 21:23:53 2020
Standby Crash Recovery aborted due to error 1013.
Errors in file /home/app/oracle/diag/rdbms/stbdb/stbdb/trace/shdbstb_ora_11251.trc:
ORA-01013: user requested cancel of current operation
Recovery interrupted!
Some recovered datafiles maybe left media fuzzy
Media recovery may continue but open resetlogs may fail
Completed Standby Crash Recovery.
Errors in file /home/app/oracle/diag/rdbms/stbdb/stbdb/trace/stbdb_ora_11251.trc:
ORA-10458: standby database requires recovery
ORA-01196: file 1 is inconsistent due to a failed media recovery session
ORA-01110: data file 1: '/oradata/stbdb/datafile/system.266.946913027'
ORA-10458 signalled during: alter database open read only...
alter database open
Beginning Standby Crash Recovery.
Serial Media Recovery started
Managed Standby Recovery starting Real Time Apply
Media Recovery Log /oradata/stbdb/archivelog/1_63869_946912997.dbf
Media Recovery Waiting for thread 2 sequence 29011
Fri Apr 03 21:24:12 2020
Standby Crash Recovery aborted due to error 1013.
Errors in file /home/app/oracle/diag/rdbms/stbdb/stbdb/trace/stbdb_ora_11251.trc:
ORA-01013: user requested cancel of current operation
Recovery interrupted!
Some recovered datafiles maybe left media fuzzy
Media recovery may continue but open resetlogs may fail
Completed Standby Crash Recovery.
Errors in file /home/app/oracle/diag/rdbms/stbdb/stbdb/trace/stbdb_ora_11251.trc:
ORA-10458: standby database requires recovery
ORA-01196: file 1 is inconsistent due to a failed media recovery session
ORA-01110: data file 1: '/oradata/stbdb/datafile/system.266.946913027'
ORA-10458 signalled during: alter database open...
Shutting down instance (abort)
License high water mark = 7
USER (ospid: 11251): terminating the instance
Fri Apr 03 21:24:16 2020
opiodr aborting process Unknown ospid (11280) as a result of ORA-1092
Fri Apr 03 21:24:16 2020
ORA-1092 : opitsk aborting process
Instance terminated by USER, pid = 11251
Fri Apr 03 21:24:19 2020
Instance shutdown complete
Fri Apr 03 21:25:38 2020
Starting ORACLE instance (normal)
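Investigation showed that the cause was an incorrect password file.
After the password file under $ORACLE_HOME/dbs on node 1 was copied to node 2 and to the DG standby, the problem was resolved.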
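Since the fix was to make the password files identical, it can help to confirm that the copies really match. The sketch below compares files by checksum. The function name `same_content`, the host name `node2`, and the file names in the usage comment are assumptions for illustration; on Linux the password file is normally named orapw<ORACLE_SID> under $ORACLE_HOME/dbs.

```shell
#!/usr/bin/env bash
# Sketch: return 0 if every file given has the same content as the first one,
# 1 if one differs (its name is printed), 2 if a file cannot be read.
same_content() {
  local first sum f
  first="$(cksum < "$1")" || return 2
  shift
  for f in "$@"; do
    sum="$(cksum < "$f")" || return 2
    if [ "$sum" != "$first" ]; then
      echo "DIFFERS: $f"
      return 1
    fi
  done
  return 0
}

# Usage (host and file names are hypothetical):
#   scp node2:/home/app/oracle/product/11.2.0/dbs/orapwshdbrac2 /tmp/orapw.node2
#   same_content "$ORACLE_HOME/dbs/orapwshdbrac1" /tmp/orapw.node2
```

Reading each file through standard input keeps the file name out of the `cksum` output, so only the content is compared.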