Oracle 11g ORA-7445 [opiaba] 和 ORA-600 [17147]错误案例分享

本文详细描述了一个Oracle11g数据库在RAC集群环境中遇到ORA-7445和ORA-600错误的案例,涉及大量UPDATE语句和超过65535个绑定变量引发的问题,提供了解决方案包括打补丁和优化SQL语句。
摘要由CSDN通过智能技术生成

        本期将为大家分享“Oracle 11g ORA-7445 [opiaba] 和 ORA-600 [17147]错误”的解决方案。

        关键词:ORA-7445 [opiaba] 、ORA-600 [17147]、65535 binds、ORA-01006: bind variable

        数据库版本:11.2.0.4.181016,操作系统版本:redhat linux 7.4,两节点的RAC集群。其中一个数据库实例出现异常重启,经排查是PL/SQL语句块中嵌入多个UPDATE语句,SQL语句中绑定变量数量超过了65535个限制触发BUG问题。

        1、收到数据库告警信息后,第一时间登录数据库服务器,然后检查数据库运行状态。

集群环境:
srvctl status database -d ywzd2 -v
Instance ywzd21 is running on node ywzd-db03. Instance status: Open.
Instance ywzd22 is running on node ywzd-db04. Instance status: Open.

单机环境:
sqlplus / as sysdba
SQL> select startup_time,status from v$instance;
STARTUP_TIME        STATUS
----------------------- ------------
06-DEC-2023 12:29:59    OPEN

        2、检查数据库故障期间的alert日志,可以看到实例遇到ORA-07445[opiaba()+639]和ORA-00600[17147]内部错误,接着核心PMON 进程崩溃,导致了 DB 实例重启。

第一段日志出现ORA-07445[opiaba()+639]错误
Wed Dec 06 12:29:21 2023
Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x185806F, opiaba()+639] [flags: 0x0, count: 1]
Errors in file /u01/app/oracle/diag/rdbms/ywzd2/ywzd21/trace/ywzd21_ora_538783.trc  (incident=418676):
ORA-07445: exception encountered: core dump [opiaba()+639] [SIGSEGV] [ADDR:0x0] [PC:0x185806F] [SI_KERNEL(general_protection)] []
Incident details in: /u01/app/oracle/diag/rdbms/ywzd2/ywzd21/incident/incdir_418676/ywzd21_ora_538783_i418676.trc
第二段日志出现ORA-00600[17147]错误
Wed Dec 06 12:29:47 2023
Errors in file /u01/app/oracle/diag/rdbms/ywzd2/ywzd21/trace/ywzd21_pmon_512337.trc  (incident=411292):
ORA-00600: internal error code, arguments: [17147], [0x14DAE09578], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/rdbms/ywzd2/ywzd21/incident/incdir_411292/ywzd21_pmon_512337_i411292.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed Dec 06 12:29:48 2023
Dumping diagnostic data in directory=[cdmp_20231206122948], requested by (instance=1, osid=512337 (PMON)), summary=[incident=411292].
Errors in file /u01/app/oracle/diag/rdbms/ywzd2/ywzd21/trace/ywzd21_pmon_512337.trc:
ORA-00600: internal error code, arguments: [17147], [0x14DAE09578], [], [], [], [], [], [], [], [], [], []
PMON (ospid: 512337): terminating the instance due to error 472
System state dump requested by (instance=1, osid=512337 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/ywzd2/ywzd21/trace/ywzd21_diag_512357_20231206122952.trc
Instance terminated by PMON, pid = 512337
第三段日志集群实例自动重启
Wed Dec 06 12:29:59 2023
Starting ORACLE instance (normal)

批量检查命令:
grep -ib1E 'opiaba|\[17147|terminating the instance due to error 472' /u01/app/oracle/diag/rdbms/ywzd2/ywzd21/trace/alert_ywzd21.log

        3、进一步检查用户进程的跟踪文件,该进程正在执行一条非常长的、使用了超过 10 万个绑定变量的SQL语句。该PLSQL语句块包含大量的UPDATE语句,并且绑定变量个数超过65535个,触发ora-7445 [opiaba]错误。

========= Dump for incident 328879 (ORA 600 [17147]) ========
----- Beginning of Customized Incident Dump(s) -----
********** Internal heap ERROR 17147 addr=0x12b0accc78 *********
...

dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- Current SQL Statement for this session (sql_id=btng01wnhnvnw) -----
begin

update 业务表1
set 字段1 = :1 ,
字段2 = :2 ,
字段3 = :3 ,
字段4 = :4 ,
字段5 = :5 ,
字段6 = :6 ,
字段7 = '1'
where 字段8 = :7
and 字段9 = :8
and 字段10 = :9 ;

....(sql 太长,部分省略)...

update 业务表N
set 字段1 = :100381 ,
字段2 = :100382 ,
字段3 = :100383
where 字段4 = :100384
and 字段5 = :100385
and 字段6 = :100386 ;
end;

        4、进一步检查PMON进程的跟踪文件,堆栈信息显示数据库遇到“[03]: kgherror [KGH]”错误。

dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.
...

----- Incident Context Dump -----
Address: 0x7fff22dbad78
Incident ID: 327263
Problem Key: ORA 600 [17147]
Error: ORA-600 [17147] [0x12B0ACCC78] [] [] [] [] [] [] [] [] [] []
[00]: dbgexExplicitEndInc [diag_dde]
[01]: dbgeEndDDEInvocationImpl [diag_dde]
[02]: dbgeEndDDEInvocation [diag_dde]
[03]: kgherror [KGH]<-- Signaling
[04]: kghfrmrg [KGH]
[05]: kghfre [KGH]
[06]: kghsfx [KGH]
[07]: kghunfhp [KGH]
[08]: kghfrh_internal [KGH]
[09]: kghfrh [KGH]
[10]: kksFreeHeap [cursor]
[11]: kksLockDelete [cursor]
[12]: kksLockOperation [cursor]
[13]: kgllkdl [LIBCACHE]
[14]: kgllkds [LIBCACHE]
[15]: kssxdl [VOS]
[16]: kssdel [VOS]
[17]: kssdch [VOS]
[18]: ksuxds [ksu]
[19]: kssxdl [VOS]
[20]: kssdel [VOS]
[21]: kssdch [VOS]
[22]: ksudlp [ksu]
[23]: kssxdl [VOS]
[24]: kssdel [VOS]
[25]: ksuxdl [ksu]
[26]: ksuxda_del_proc [ksu]
[27]: ksucln_dpc_cleanup [ksu]
[28]: ksucln_dpc_dfs [ksu]
[29]: ksucln_dpc_main [ksu]
[30]: ksucln_dpc [ksu]
[31]: ksucln [ksu]
[32]: ksbrdp [background_proc]
[33]: opirip []

        5、对这种批量更新的大量使用绑定变量的长SQL/PLSQL,DB 系统在设计时未预料到此种情形,从而触发Oracle BUG导致数据库重启。结合官方文档1466343.1找到两种解决方式,分别为应用补丁12578873或修改应用程序减少绑定变量。

        1、打补丁可以避免实例崩溃,跟业务部门沟通补丁修复窗口。

        2、业务部门优化SQL语句,减少绑定变量个数,将同一条SQL/PLSQL的绑定变量个数控制在65535以内。

        3、排查数据库是否还存在相似的问题,通过SQL审查数据库是否存在绑定变量大于1000个的情况。

select sql_id, count(*) bind_count
  from v$sql_bind_capture t
 where t.CHILD_NUMBER = 0
 group by t.SQL_ID
having count(*) > 1000
 order by count(*);

        4、堆栈调用函数学习

-------------堆栈调用函数-------------------
kgdsdst   
skdstdst                operating system dependent kernel dump a stack trace
ksedst1                 kernel service (VOS) error debug dump the call stack stack trace 1 helper
ksedst                  kernel service (VOS) error debug dump the call stack
dbkedDefDump            debug and diagnostics RDBMS diagnostic data extractor Dumper do DDE default dump
ksedmp                  kernel service (VOS) error debug dump process state
ssexhd                  operating system dependent system dump routine
__sighandler        
opiaba
opiprs                  oracle program interface parse
kksParseChildCursor     kernel compile shared objects (cursor) parse child cursor
rpiswu2                 recursive program interface switch user in recursive sql
kksLoadChild            kernel compile shared objects (cursor) load child
kxsGetRuntimeLock       kernel execution shared cursor get runtime lock
kksfbc                  kernel compile shared objects (cursor) find bound cursor
kkspsc0                 kernel compile shared objects (cursor) parse shared cursor
kksParseCursor          kernel compile shared objects (cursor) parse cursor
opiosq0                 oracle program interface prepare to parse a sql command 0
kpooprx                 kernel programmatic interface oracle open, parse, and optionally execute
kpoal8                  kernel programmatic interface oracle V8 bundled execution
opiodr                  oracle program interface oracle code request driver, route the current request
ttcpip                  two task common pipe read/write
opitsk                  oracle program interface two task function dispatcher
opiino                  oracle program interface initialize opi
opiodr 
opidrv                  oracle program interface route current request driver, entry side into two task interface
sou2o                   main oracle executable entry point
opimai_real             oracle program interface main real oracle start point
ssthrdmain              operating system dependent system main for every thread in a threaded oracle
main                    general c starting function

        对于一个提交的SQL语句,存在两种可选的解析过程, 一种叫做硬解析,一种叫做软解析。 一个硬解析需要经解析、制定执行路径、优化访问计划等许多的步骤。硬解释不仅仅耗费大量的cpu,更重要的是会占据重要的闩(latch)资源,严重的影响系统的规模的扩大(即限制了系统的并发行)。当一个sql语句提交后,首先检查一下共享缓冲池(shared pool)里有没有与之完全相同的语句,如果有的话只须执行软解析即可,否则就得进行硬解析。 而唯一使得oracle 能够重复利用执行计划的方法就是采用绑定变量。绑定变量的实质就是用于替代sql语句中的常量的替代变量。绑定变量能够使得每次提交的sql语句都完全一样。

        数据库每15分钟会捕获一次绑定变量,但不会捕获每次SQL执行的变量值。同时当SQL语句硬解析时,SQL对应的绑定变量会被捕获。通过视图v$sql_bind_capture查询捕获绑定变量。

1、创建测试表
create table test01 as select * from dba_tables;
2、修改共享游标参数值为SIMILAR
alter session set CURSOR_SHARING=SIMILAR; 
3、第一次执行语句1
select table_name from test01 where table_name='OBJ$';
4、查看语句1的SQL_ID和执行次数
set linesize 1000
col sql_text for a80
select sql_id,EXECUTIONS,LOADS,sql_text from v$sql 
where sql_text like 'select table_name from test01 where table_name%';
SQL_ID          EXECUTIONS      LOADS SQL_TEXT
------------- ---------- ---------- ----------------------------------------------------------------------------------------------------
9dhppyxtr9qck           1      1 select table_name from test01 where table_name=:"SYS_B_0"
5、通过SQL_ID获取语句1的绑定变量值
col value_string for a30
select child_address,to_char(LAST_CAPTURED,'dd-mm-yyyy hh24:mi:ss'),VALUE_STRING 
from v$sql_bind_capture where sql_id='9dhppyxtr9qck';
CHILD_ADDRESS     TO_CHAR(LAST_CAPTUR VALUE_STRING
---------------- ------------------- ------------------------------
0000000068BD8220 17-03-2024 20:32:52 OBJ$
6、查看参数_cursor_bind_capture_interval的值为900秒,默认15分钟
set linesize 1000
col name for a50
col value for a30
select nam.ksppinm NAME,val.KSPPSTVL VALUE 
from x$ksppi nam, x$ksppsv val 
where nam.indx = val.indx 
and nam.ksppinm like '%_cursor_bind_capture_interval%' 
order by 1;
7、修改动态参数为10秒捕获一次
alter system set "_cursor_bind_capture_interval"=10;
8、再次执行语句1,参数值不一样,传入参数须为4个字节,避免硬解析
select table_name from test01 where table_name='TAB$';
9、再次查询绑定变值,已从COL$变为TAB$
CHILD_ADDRESS     TO_CHAR(LAST_CAPTUR VALUE_STRING
---------------- ------------------- ------------------------------
0000000068BD8220 17-03-2024 20:35:56 TAB$

  • Bug 12578873 - ORA-7445 [opiaba] when using ore than 65535 bind variables (文档 ID 12578873.8)
  • Instance crashed after ORA-7445 [opiaba] and ORA-600 [17147] (Doc ID 1466343.1)
  • ORA-7445 [opiaba] (Doc ID 737378.1)
  • Not Every Bind Values Captured in v$sql_bind_capture (Doc ID 1370816.1)

        以上就是本期关于“Oracle 11g ORA-7445 [opiaba] 和 ORA-600 [17147]错误”的案例分享。希望能给大家带来帮助!

        欢迎关注“数据库运维之道”公众号,一起学习数据库技术!

  • 32
    点赞
  • 19
    收藏
    觉得还不错? 一键收藏
  • 1
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值