记一次ORA-7445 [opiaba] when using more than 65535 bind variables引起的宕库事件
【背景说明】
数据库版本:oracle 11.2.0.4 使用Rose HA做的集群
系统版本:Red Hat Enterprise Linux Server release 6.3 (Santiago)
问题发现:使用绑定变量多于65535会导致数据库异常关闭
BUG号:bug 12578873
解决办法:1)打补丁避免使用绑定变量大于65535而导致数据库关闭
2)或者改写SQL使得使用的绑定变量少于65535
【问题说明及分析】
一、数据库日志记录
1.1 alert日志
点击(此处)折叠或打开
- Sat Sep 10 08:54:25 2016
- Errors in file /u01/app/oracle/diag/rdbms/hd_asp_001/HDORCL001/trace/HDORCL001_ora_100658.trc (incident=400505):
- ORA-04030: out of process memory when trying to allocate 32792 bytes(kxs-heap-b,bind var buf)
- Incident details in: /u01/app/oracle/diag/rdbms/hd_asp_001/HDORCL001/incident/incdir_400505/HDORCL001_ora_100658_i400505.trc
- Use ADRCI or Support Workbench to package the incident.
- See Note 411.1 at My Oracle Support for error and packaging details.
- Sat Sep 10 08:54:50 2016
- Errors in file /u01/app/oracle/diag/rdbms/hd_asp_001/HDORCL001/trace/HDORCL001_ora_100658.trc (incident=400506):
- ORA-04030: out of process memory when trying to allocate 824504 bytes (pga heap,kco buffer)
- ORA-04030: out of process memory when trying to allocate 32792 bytes (kxs-heap-b,bind var buf)
- Incident details in: /u01/app/oracle/diag/rdbms/hd_asp_001/HDORCL001/incident/incdir_400506/HDORCL001_ora_100658_i400506.trc
- Use ADRCI or Support Workbench to package the incident.
- See Note 411.1 at My Oracle Support for error and packaging details.
- Use ADRCI or Support Workbench to package the incident.
- See Note 411.1 at My Oracle Support for error and packaging details.
- Use ADRCI or Support Workbench to package the incident.
- See Note 411.1 at My Oracle Support for error and packaging details.
- Errors in file /u01/app/oracle/diag/rdbms/hd_asp_001/HDORCL001/incident/incdir_400505/HDORCL001_ora_100658_i400505.trc:
- ORA-04030: out of process memory when trying to allocate 824504 bytes (pga heap,kco buffer)
- ORA-04030: out of process memory when trying to allocate 32792 bytes (kxs-heap-b,bind var buf)
- Errors in file /u01/app/oracle/diag/rdbms/hd_asp_001/HDORCL001/trace/HDORCL001_ora_100658.trc (incident=400507):
- ORA-04030: out of process memory when trying to allocate 82456 bytes (pga heap,control file i/o buffer)
- ORA-04030: out of process memory when trying to allocate 32792 bytes (kxs-heap-b,bind var buf)
- Incident details in: /u01/app/oracle/diag/rdbms/hd_asp_001/HDORCL001/incident/incdir_400507/HDORCL001_ora_100658_i400507.trc
- Use ADRCI or Support Workbench to package the incident.
- See Note 411.1 at My Oracle Support for error and packaging details.
- Sat Sep 10 08:54:52 2016
- Sweep [inc][400507]: completed
- Sweep [inc][400506]: completed
- Sweep [inc][400505]: completed
- Sweep [inc2][400506]: completed
- Sat Sep 10 08:55:14 2016
- Errors in file /u01/app/oracle/diag/rdbms/hd_asp_001/HDORCL001/incident/incdir_400505/HDORCL001_ora_100658_i400505.trc:
- ORA-04030: out of process memory when trying to allocate 82456 bytes (pga heap,control file i/o buffer)
- ORA-04030: out of process memory when trying to allocate 32792 bytes (kxs-heap-b,bind var buf)
- Sat Sep 10 08:55:15 2016
- Dumping diagnostic data in directory=[cdmp_20160310085515], requested by (instance=1, osid=100658), sumSepy=[incident=400507].
- Sat Sep 10 08:55:53 2016
- Sweep [inc2][400507]: completed
- Sweep [inc2][400505]: completed
- Sat Sep 10 08:57:56 2016
- Exception [type: SIGSEGV, SI_KERNEL(general_protection)] [ADDR:0x0] [PC:0x186016B, opiaba()+639] [flags: 0x0, count: 1]
- Errors in file /u01/app/oracle/diag/rdbms/hd_asp_001/HDORCL001/trace/HDORCL001_ora_61897.trc (incident=400881):
- ORA-07445: exception encountered: core dump [opiaba()+639] [SIGSEGV] [ADDR:0x0] [PC:0x186016B] [SI_KERNEL(general_protection)] []
- Incident details in: /u01/app/oracle/diag/rdbms/hd_asp_001/HDORCL001/incident/incdir_400881/HDORCL001_ora_61897_i400881.trc
- Use ADRCI or Support Workbench to package the incident.
- See Note 411.1 at My Oracle Support for error and packaging details.
- Sat Sep 10 08:58:22 2016
- Dumping diagnostic data in directory=[cdmp_20160310085822], requested by (instance=1, osid=61897), sumSepy=[incident=400881].
- Sat Sep 10 08:58:23 2016
- Sweep [inc][400881]: completed
- Sweep [inc2][400881]: completed
- Sat Sep 10 08:59:01 2016
- Errors in file /u01/app/oracle/diag/rdbms/hd_asp_001/HDORCL001/trace/HDORCL001_pmon_10332.trc (incident=400025):
- ORA-00600: internal error code, arguments: [17147], [0x820336C18], [], [], [], [], [], [], [], [], [], []
- Incident details in: /u01/app/oracle/diag/rdbms/hd_asp_001/HDORCL001/incident/incdir_400025/HDORCL001_pmon_10332_i400025.trc
- Use ADRCI or Support Workbench to package the incident.
- See Note 411.1 at My Oracle Support for error and packaging details.
- Errors in file /u01/app/oracle/diag/rdbms/hd_asp_001/HDORCL001/trace/HDORCL001_pmon_10332.trc:
- ORA-00600: internal error code, arguments: [17147], [0x820336C18], [], [], [], [], [], [], [], [], [], []
- PMON (ospid: 10332): terminating the instance due to error 472
- Sat Sep 10 08:59:03 2016
- System state dump requested by (instance=1, osid=10332 (PMON)), sumSepy=[abnormal instance termination].
- System State dumped to trace file /u01/app/oracle/diag/rdbms/hd_asp_001/HDORCL001/trace/HDORCL001_diag_10360_20160310085903.trc
- Sat Sep 10 08:59:07 2016
- ORA-1092 : opitsk aborting process
- Sat Sep 10 08:59:07 2016
- License high water Sepk = 154
- Instance terminated by PMON, pid = 10332
- USER (ospid: 66661): terminating the instance
- Instance terminated by USER, pid = 66661
下面为alert提到的trc文件:
1)HDORCL001_ora_100658_i400505.trc:
点击(此处)折叠或打开
- *** 2016-09-10 08:54:25.296
- dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
- ----- Current SQL Statement for this session (sql_id=50j572vctadfr) -----
- begin
-
- update bill_om_divide_dtl odd set odd.group_no=:1
- where
- odd.locno=:2
- and odd.owner_no=:3 and
- odd.divide_no=:4 and
- odd.box_no= :5
- ;
- update bill_om_divide_dtl odd set odd.group_no=:6
- where
- odd.locno=:7
- and odd.owner_no=:8 and
- odd.divide_no=:9 and
- odd.box_no= :10
- ;
- update bill_om_divide_dtl odd set odd.group_no=:11
- where
- odd.locno=:12
- and odd.owner_no=:13 and
- odd.divide_no=:14 and
- …….
- odd.box_no= :64940
- ;
- update bill_om_divide_dtl odd set odd.group_no=:64941
- where
- odd.locno=:64942
- and odd.owner_no=:64943 and
- odd.divide_no=:64944 and
- odd.box_no= :64945
-
- ;end
2 )HDORCL001_ora_61897_i400881.trc
点击(此处)折叠或打开
- *** 2016-09-10 08:57:57.009
- dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x3, level=3, mask=0x0)
- ----- Current SQL Statement for this session (sql_id=81xr3cba5q52v) -----
- begin
-
- update bill_om_divide_dtl odd set odd.group_no=:1
- where
- odd.locno=:2
- and odd.owner_no=:3 and
- odd.divide_no=:4 and
- odd.box_no= :5
- ;
- update bill_om_divide_dtl odd set odd.group_no=:6
- where
- odd.locno=:7
- and odd.owner_no=:8 and
- odd.divide_no=:9 and
- odd.box_no= :10
- ;
- update bill_om_divide_dtl odd set odd.group_no=:11
- where
- odd.locno=:12
- and odd.owner_no=:13 and
- odd.divide_no=:14 and
- odd.box_no= :15
- ;
- update bill_om_divide_dtl odd set odd.group_no=:16
- …..
- ;
- update bill_om_divide_dtl odd set odd.group_no=:70046
- where
- odd.locno=:70047
- and odd.owner_no=:70048 and
- odd.divide_no=:70049 and
- odd.box_no= :70050
- ;
- update bill_om_divide_dtl odd set odd.group_no=:70051
- where
- odd.locno=:70052
- and odd.owner_no=:70053 and
- odd.divide_no=:70054 and
- odd.box_no= :70055
-
- ;end
点击(此处)折叠或打开
- User=0x979c2fb18 Session=0x979c2fb18 ReferenceCount=1 Flags=CNB/[0001] SavepointNum=54fe410d
- LibraryHandle: Address=0x99d9dcbc8 Hash=d45b145b LockMode=N PinMode=0 LoadLockMode=0 Status=VALD
- ObjectName: Name=begin
-
- update bill_om_divide_dtl odd set odd.group_no=:1
- where
- odd.locno=:2
- and odd.owner_no=:3 and
- odd.divide_no=:4 and
- odd.box_no= :5
- ;
- update bill_om_divide_dtl odd set odd.group_no=:6
- where
- odd.locno=:7
- and odd.owner_no=:8 and
- odd.divide_no=:9 and
- odd.box_no= :10
- ;
- update bill_om_divide_dtl odd set odd.group_no=:11
- where
- odd.locno=:12
- and odd.owner_no=:13 and
- ….
二、数据库日志分析
1、宕机之前有ORA-07445: exception encountered: core dump [opiaba()+639]的出现,以及都存在pmon进程相关的ORA-600核心错误;
经查Oracle support网站资料找到id 737378.1的文档,筛选数据库当前版本11.2.0.4相关数据,得到以下提示:
![](http://img.blog.itpub.net/blog/attachment/201609/18/31324175_1474193625xCzf.png?x-oss-process=style/bb)
![](http://img.blog.itpub.net/blog/attachment/201609/18/31324175_147419353991FZ.png?x-oss-process=style/bb)
3、 宕机跟踪日志都出现 update bill_om_divide_dtl 拼装的长 SQL ;
三、结论
通过分析alert和trc日志,发现 此次宕机都与 SQL 密切相关,由两方面原因引起: SQL 绑定变量过多导致内存分配出现异常,触发 bug; 绑定变量过多导致触发 bug 12578873, 该补丁只是避免类似SQL时引起宕机。
为避免打上该补丁触发其他情况, 建议修改SQL 使用少于65535的绑定变量个数。本案例建议使用loop 修改, 另外注意大量的绑定变量使用在SQL解析时需要更久的DB TIME。
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/31324175/viewspace-2125110/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/31324175/viewspace-2125110/