最近遇到这样一个事情,在动态的调整sga的一个参数的时候,数据库就down了:
alter system set DB_KEEP_CACHE_SIZE = 1 g
*
ERROR at line 1 :
ORA - 03113 : end - of - file on communication channel
SQL >
对应的,在alertlog中的报错为:
Errors in file /u01/app/oracle/product/10.2.0/admin/MICPRD/bdump/micprd_mman_3264732.trc:
ORA-00600: internal error code, arguments: [kmgs_update_target_size_1], [0], [], [], [], [], [], []
Sat Nov 13 03:56:01 2010
Errors in file /u01/app/oracle/product/10.2.0/admin/MICPRD/bdump/micprd_mman_3264732.trc:
ORA-00600: internal error code, arguments: [kmgs_update_target_size_1], [0], [], [], [], [], [], []
Sat Nov 13 03:56:01 2010
MMAN: terminating instance due to error 822
Sat Nov 13 03:56:01 2010
Errors in file /u01/app/oracle/product/10.2.0/admin/MICPRD/bdump/micprd_mmon_2969868.trc:
ORA-00600: internal error code, arguments: [kmgs_perform_parameter_updates_1], [0], [], [], [], [], [], []
Sat Nov 13 03:56:03 2010
Errors in file /u01/app/oracle/product/10.2.0/admin/MICPRD/bdump/micprd_mmon_2969868.trc:
ORA-00600: internal error code, arguments: [kmgs_parameter_update_timeout_1], [600], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kmgs_perform_parameter_updates_1], [0], [], [], [], [], [], []
Instance terminated by MMAN, pid = 3264732
对应的tace文件中的call stack为:
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedst+001c bl ksedst1 000000000 ? FFFFFFFFFFFC7D4 ?
ksedmp+0290 bl ksedst 1048DFFC0 ?
ksfdmp+0018 bl 03F5B014
kgerinv+00dc bl _ptrgl
kgeasnmierr+004c bl kgerinv FFFFFFFFFFFCB00 ? 000000000 ?
101045830 ? 000000010 ?
FFFFFFFFFFFCBA0 ?
kmgs_update_target_ bl kgeasnmierr 110190110 ? 110440040 ?
size+006c 104A818A0 ? 100000001 ?
000000000 ? 000000000 ?
12E0BE826D694B2F ?
000000000 ?
kmgs_update_paramet bl kmgs_update_target_ 900000000083144 ?
er_ctx+0354 size FFFFFFFFFFFCDA0 ?
kmgs_update_param_m bl kmgs_update_paramet 000000002 ? FFFFFFFFFFFD468 ?
anual_helper+0078 er_ctx 010008000 ? 110011970 ?
kmgs_update_param_m bl kmgs_update_param_m 000000000 ? 000000001 ?
anual+0064 anual_helper 000000002 ?
kmgsdrv+16cc bl kmgs_update_param_m 9000000000F6F64 ?
anual
ksbabs+03a8 bl _ptrgl
ksbrdp+0408 bl _ptrgl
opirip+03fc bl 03F59B54
opidrv+0448 bl opirip 110296850 ? 410298190 ?
FFFFFFFFFFFF960 ?
sou2o+0090 bl opidrv 3202AB393C ? 4A0144924 ?
FFFFFFFFFFFF960 ?
opimai_real+0150 bl 01FC4EF4
main+0098 bl opimai_real 000000000 ? 000000000 ?
__start+0098 bl main 000000000 ? 000000000 ?
经查metalink,应该是这以下的2个bug有关:
- Bug 6737235: DATABASE WITH SGA_TARGET CRASH WHEN ALTER SGA COMPONENTS
- Bug 6146397: ORA-00600 [KMGS_UPDATE_TARGET_SIZE_1], [0] INSTANCE TERMINATED BY MMAN
我操作的数据库是aix上oracle 10.2.0.3数据库。目前这2个bug都还没有在10203上的fix,只是建议升级到10204或者用workaround的解决。workaround是用alter system scope=spfile的方式修改,再重启数据库生效。
这让我感觉10g上的自动内存管理不太靠谱,虽然自动内存管理可以动态调整buffer cache和其他几个pool如share pool之间的比例,但是如果一个系统是成熟的系统,很好的进行了绑定变量,那么就不太会发生各个池子之间比例的变动。如果连动态的调整一下sga中的一个组件,都会导致MMAN将数据库down掉的话,那oracle是不是太脆弱了呢?
另外,在查询关于这个ora-600的相关报错信息的时候,还查到了另一个有意思的bug:
- MMON Terminates Instance With ORA-600 [kmgs_pre_process_request_6] or ORA-600 [kmgs_update_target_size_1] When Resizing Caches [ID 373802.1]
这个是Bug 4433838,影响10201至10203的所有平台,这个bug虽然和我的情况有点出入,文档说是MMON进程终止实例,我的情况是MMAN进程终止实例。但文档说触发这个bug的cause是sga的大小为4G的整数倍。呵呵,碰巧,我的sga_target和sga_max_size为12G,正是4G的倍数。而且ora-600之后的参数和trace文件中的信息也和我类似。
因此,综上,在10204以下的版本设置调整SGA的时候,需要注意以下几点:
1、SGA大小不要设置成4G的整数倍。
2、调整SGA内的组件的时候,不要动态的调整,利用scope=spfile再重启数据库进行修改。
附:
显示 Bug 属性 Bug 属性
类型 B - Defect 已在产品版本中修复 -
严重性 2 - Severe Loss of Service 产品版本 10.2.0.3
状态 36 - Duplicate Bug. To Filer 平台 23 - Oracle Solaris on SPARC (64-bit)
创建时间 08-Jan-2008 平台版本 -
更新时间 11-Aug-2009 基本 Bug 6192679
数据库版本 10.2.0.3
影响平台 Generic
产品源 Oracle
显示相关产品 相关产品
产品线 Oracle Database Products 系列 Oracle Database
区域 Oracle Database 产品 5 - Oracle Server - Enterprise Edition
Hdr: 6737235 10.2.0.3 RDBMS 10.2.0.3 MEMORY MGMT PRODID-5 PORTID-23 6192679
Abstract: DATABASE WITH SGA_TARGET CRASH WHEN ALTER SGA COMPONENTS
*** 01/08/08 10:47 pm ***
TAR:
----
PROBLEM:
--------
The parameter setting are as follows:
NAME TYPE VALUE
------------------------------------ ----------- ---------
shared_pool_size big integer 608M
sga_max_size big integer 12G
sga_target big integer 12G
db_cache_size big integer 9G
*.java_pool_size=209715200
*.large_pool_size=209715200
*.streams_pool_size=50331648
DEFAULT buffer cache in the v$sga_dynamic_components shows the below value :
CURRENT_SIZE MIN_SIZE USER_SPECIFIED_SIZE
------------ -------- -------------------
1.1744E+10 1.1744E+10 9663676416
When try to increase the shared_pool_size to 700M :
SQL> alter system set shared_pool_size=700M;
alter system set shared_pool_size=700M
*
ERROR at line 1:
ORA-2097: parameter cannot be modified because specified value is
invalid
ORA-4033: Insufficient memory to grow pool
SQL> select name from v$database;
select name from v$database
*
ERROR at line 1:
ORA-3135: connection lost contact
After the above error message the database crashes.
below is the message in the alert log :
Errors in file
/u01/app/oracle/admin/LSTDDB1/bdump/lstddb1_mman_12427.trc:
ORA-600: internal error code, arguments: [kmgs_update_target_size_1],
[0], [], [], [], [], [], []
When try to decrease the size of the default buffer cache getting:
SQL> alter system set db_cache_size=10G;
alter system set db_cache_size=10G
*
ERROR at line 1:
ORA-3113: end-of-file on communication channel
and the database crashes.
the alert shows the following:
Errors in file /u01/app/oracle/admin/LSTDDB1/bdump/lstddb1_mman_11825.trc:
ORA-600: internal error code, arguments: [kmgs_pre_process_request_6], [6],
[700], [576], [3], [0x67C7B9398], [], []
DIAGNOSTIC ANALYSIS:
--------------------
WORKAROUND:
-----------
None
RELATED BUGS:
-------------
Bug 6192679
REPRODUCIBILITY:
----------------
Everytime at customer site
TEST CASE:
----------
STACK TRACE:
------------
SUPPORTING INFORMATION:
-----------------------
24 HOUR CONTACT INFORMATION FOR P1 BUGS:
----------------------------------------
DIAL-IN INFORMATION:
--------------------
IMPACT DATE:
------------
*** 01/08/08 10:49 pm ***
*** 01/08/08 10:53 pm ***
*** 01/08/08 10:53 pm *** (CHG: Sta->16)
*** 01/09/08 07:51 am *** (CHG: Asg->NEW OWNER OWNER)
*** 01/09/08 08:53 am *** (CHG: Sta->10 SubComp->MEMORY MGMT)
*** 01/09/08 08:53 am ***
*** 01/10/08 08:20 pm ***
*** 01/10/08 08:20 pm *** (CHG: Sta->16)
*** 01/11/08 08:33 am *** (CHG: Sta->10)
*** 01/11/08 08:33 am ***
*** 01/28/08 10:43 pm ***
*** 01/28/08 10:43 pm *** (CHG: Sta->16)
*** 01/29/08 06:17 am *** (CHG: Sta->10)
*** 01/29/08 06:17 am ***
*** 01/29/08 06:42 am *** (CHG: Sta->16)
*** 01/29/08 06:42 am ***
*** 01/29/08 06:44 am *** (CHG: Sta->10)
*** 01/29/08 06:44 am *** (CHG: Sta->16)
*** 01/29/08 10:00 am *** (CHG: Sta->11)
*** 01/29/08 10:00 am ***
*** 01/29/08 10:00 am *** (ADD: Impact/Symptom->INTERNAL ERROR )
*** 01/29/08 10:01 am *** (CHG: Asg->NEW OWNER OWNER)
*** 01/29/08 10:01 am ***
*** 02/13/08 06:07 pm ***
*** 05/06/08 06:21 pm ***
*** 05/06/08 06:29 pm ***
*** 05/06/08 06:31 pm ***
*** 08/07/08 10:57 am *** (CHG: Sta->36)
显示 Bug 属性 Bug 属性
类型 B - Defect 已在产品版本中修复 -
严重性 3 - Minimal Loss of Service 产品版本 10.2.0.3.0
状态 33 - Suspended, Req'd Info not Avail 平台 212 - IBM AIX on POWER Systems (64-bit)
创建时间 21-Jun-2007 平台版本 -
更新时间 17-Sep-2007 基本 Bug -
数据库版本 10.2.0.3
影响平台 Generic
产品源 Oracle
显示相关产品 相关产品
产品线 Oracle Database Products 系列 Oracle Database
区域 Oracle Database 产品 5 - Oracle Server - Enterprise Edition
Hdr: 6146397 10.2.0.3 RDBMS 10.2.0.3.0 MEMORY MGMT PRODID-5 PORTID-212
Abstract: ORA-600 [KMGS_UPDATE_TARGET_SIZE_1], [0] INSTANCE TERMINATED BY MMAN
*** 06/21/07 05:05 pm ***
TAR:
----
PROBLEM:
--------
When dynamically allocating memory for running multiple sized buffer pools
statement fails and instance terminates
DIAGNOSTIC ANALYSIS:
--------------------
ALTER SYSTEM SET db_16k_cache_size='256M' SCOPE=BOTH;
ORA-600: internal error code, arguments: [kmgs_update_target_size_1], [0],
[],
..
MMAN: terminating instance due to error 822
WORKAROUND:
-----------
1) ALTER SYSTEM SET db_16k_cache_size=' < size > ' scope=spfile
2) Restart instance
RELATED BUGS:
-------------
REPRODUCIBILITY:
----------------
TEST CASE:
----------
STACK TRACE:
------------
ksedst ksedmp ksfdmp kgerinv kgeasnmierr kmgs_update_target_size
kmgs_update_parameter_ctx kmgs_update_param_manual_helper
kmgs_update_param_manual kmgsdrv ksbabs ksbrdp opirip opidrv
sou2o opimai_real main start
SUPPORTING INFORMATION:
-----------------------
irp3_mman_1794226.trc
alert_IRP3.log
24 HOUR CONTACT INFORMATION FOR P1 BUGS:
----------------------------------------
DIAL-IN INFORMATION:
--------------------
IMPACT DATE:
------------
*** 06/21/07 05:13 pm *** (CHG: Sta->16)
*** 06/21/07 10:32 pm *** (CHG: Asg->NEW OWNER OWNER)
*** 06/22/07 06:26 am ***
*** 06/22/07 06:28 am *** (CHG: Sta->10)
*** 09/17/07 10:27 pm *** (CHG: Sta->33 SubComp->MEMORY MGMT)
*** 09/17/07 10:27 pm ***
修改时间 23-MAR-2010 类型 PROBLEM 状态 PUBLISHED
In this Document
Symptoms
Cause
Solution
References
Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 10.2.0.3
Information in this document applies to any platform.
Symptoms
The MMAN process terminates the instance with the the following error is reported in the database alert.log file:
ORA-00600: internal error code, arguments: [kmgs_pre_process_request_6]
OR
ORA-00600: internal error code, arguments: [kmgs_update_target_size_1]
The call stack from the ORA-00600 trace file will be similar to:
ksddoa < - ksdpcg < - ksdpec < - ksfpec < - kgerinv < - kgeasnmierr < - quest < - kmgsdrv
or
kmgs_update_target_size < - kmgs_update_parameter_ctx < - kmgs_update_param_manual_helper < - kmgs_update_param_manual
Cause
This is due to Bug 4433838
Similar issues have been filed as bugs which have also been determined to be duplicates of Bug 4433838 .
The error occurs when the parameter SGA_TARGET is set to an exact multiple of 4Gb .
For instance , if the SGA_TARGET value in HEXADECIMAL has 00000000 as the last 4 bytes then you can hit this problem . The bug indicates this happens on an exact multiple of 4G .
Example:
SGA_TARGET ( 20 Gb ) = 21474836480 -- > HEXADECIMAL -->( 0x500000000) --->(value in HEXADECIMAL has 00000000 as the last 4 bytes).
SGA_TARGET (4 Gb) = 4294967296 --> HEXADECIMAL -->( 0x100000000) --->(value in HEXADECIMAL has 00000000 as the last 4 bytes).
Solution
1. Upgrade to 10.2.0.4 or higher.
OR
2 Apply the one-off Patch 4433838 depending on its availability for your patchset-level and platform
OR
3. Use the workaround:
Set your SGA_TARGET parameter to any non-multiple values of 4 Gb.
In other words, ensure that your SGA_TARGET value in HEXADECIMAL does not have 00000000 as the last 4 bytes.
Example:
If your SGA_TARGET is set to 20 Gb (21474836480) = 0x500000000 (in Hexadecimal). The workaround consists of changing its value to 0x510000000 (in Hexadecimal) = 21743271936
If your SGA_TARGET is set to 4 Gb (4294967296) = 0x100000000 (in Hexadecimal). The workaround consists of changing its value to 0x100000001 (in Hexadecimal) = 4294967297
References
BUG:4433838 - ORA-600 [KMGS_PRE_PROCESS_REQUEST_6] TERMINATES INSTANCE WHEN RESIZING CACHES
BUG:6192679 - MMAN ORA-600[KMGS_UPDATE_TARGET_SIZE_1],[0] AND INSTANCE DOWN
显示相关信息 相关的
产品
* Oracle Database Products > Oracle Database > Oracle Database > Oracle Server - Enterprise Edition