Errors in file /oracle/oracle/admin/maclean/udump/u1_ora_13757.trc: ORA-00603: ORACLE server session terminated by fatal error ORA-00604: error occurred at recursive SQL level 1 ORA-04031: unable to allocate 1048 bytes of shared memory ("shared pool","select name,online$,contents...","Typecheck","kgghteInit") ORA-00604: error occurred at recursive SQL level 1 ORA-04031: unable to allocate 1048 bytes of shared memory ("shared pool","select name,online$,contents...","Typecheck","seg:kggfaAllocSeg") Thu Oct 13 08:25:05 2011 Log from www.oracledatabase12g.com & www.askmaclean.com Errors in file /oracle/oracle/admin/maclean/udump/u1_ora_1444.trc: ORA-00603: ORACLE server session terminated by fatal error ORA-00604: error occurred at recursive SQL level 1 ORA-04031: unable to allocate 4120 bytes of shared memory ("shared pool","select name,online$,contents...","Typecheck","kgghtInit") ORA-00604: error occurred at recursive SQL level 1 ORA-04031: unable to allocate 4120 bytes of shared memory ("shared pool","select name,online$,contents...","Typecheck","kgghtInit")以上错误出现的同时实例出现大量的row cache lock字典缓存和cursor:pin S wait on X等待事件,说明共享池中的row cache字典缓存和SQL area 执行计划因为Free Memory不足而被不断换出,导致硬解析增多并SQL解析性能下降,进一步造成了应用程序挂起,赶到现场后对该ORA-04031错误进行了分析。 SGA中的内存池包含不同大小的内存块。当数据库启动时,会有一块大的内存被分配并使用Free list的空闲列表追踪。随着时间推移,这些内存被不断分配和释放,内存块(chunk)被按照其大小在不同的Fress list中移动,当SGA里任何一个内存池出现不能满足内部分配一整块连续的内存块请求时,就可能出现ORA-04031错误。实际使用中造成ORA-04031错误的原因可能是Oracle软件bug、产品缺陷、应用程序设计不当、Oracle内存参数设置不当。 这里出现ORA-04031错误的内存池是shared pool即共享池,为了搞清楚ORA-04031错误发生的实际原因,我们通过AWR报告分析共享池的使用情况。 以下是 ORA-04031 问题发生前一天AWR报告中的共享池内存使用情况:
Pool | Name | Begin MB | End MB | % Diff |
---|---|---|---|---|
large | free memory | 112.00 | 112.00 | 0.00 |
shared | ASH buffers | 25.60 | 25.60 | 0.00 |
shared | CCursor | 19.44 | 20.16 | 3.70 |
shared | Checkpoint queue | 5.87 | 5.87 | 0.00 |
shared | PCursor | 10.57 | 11.14 | 5.38 |
shared | event statistics per sess | 7.72 | 7.72 | 0.00 |
shared | free memory | 32.99 | 33.00 | 0.02 |
shared | gcs resources | 78.75 | 78.75 | 0.00 |
shared | gcs shadows | 49.61 | 49.61 | 0.00 |
shared | ges big msg buffers | 15.03 | 15.03 | 0.00 |
shared | ges reserved msg buffers | 7.86 | 7.86 | 0.00 |
shared | ges resource | 5.28 | 5.28 | 0.00 |
shared | kglsim heap | 16.63 | 16.63 | 0.00 |
shared | kglsim object batch | 25.63 | 25.63 | 0.00 |
shared | library cache | 21.32 | 22.01 | 3.23 |
shared | row cache | 7.13 | 7.13 | 0.00 |
shared | sql area | 64.06 | 61.55 | -3.91 |
streams | free memory | 64.00 | 64.00 | 0.00 |
buffer_cache | 3,936.00 | 3,936.00 | 0.00 | |
fixed_sga | 2.08 | 2.08 | 0.00 | |
log_buffer | 3.09 | 3.09 | 0.00 |
Pool | Name | Begin MB | End MB | % Diff |
---|---|---|---|---|
large | free memory | 112.00 | 112.00 | 0.00 |
shared | ASH buffers | 25.60 | 25.60 | 0.00 |
shared | Checkpoint queue | 5.87 | 5.87 | 0.00 |
shared | KCL name table | 9.00 | 9.00 | 0.00 |
shared | event statistics per sess | 7.72 | 7.72 | 0.00 |
shared | free memory | 25.56 | 25.52 | -0.12 |
shared | gcs resources | 143.39 | 143.39 | 0.00 |
shared | gcs shadows | 90.33 | 90.33 | 0.00 |
shared | ges big msg buffers | 15.03 | 15.03 | 0.00 |
shared | ges reserved msg buffers | 7.86 | 7.86 | 0.00 |
shared | library cache | 7.59 | 7.65 | 0.80 |
shared | row cache | 7.13 | 7.13 | 0.00 |
shared | sql area | 8.70 | 7.35 | -15.57 |
streams | free memory | 64.00 | 64.00 | 0.00 |
buffer_cache | 7,168.00 | 7,168.00 | 0.00 | |
fixed_sga | 2.09 | 2.09 | 0.00 | |
log_buffer | 3.09 | 3.09 | 0.00 |
‘gcs_resources’ = initial_allocation * 120 bytes = “_gcs_resources parameter” * 120 bytes ‘gcs_shadows’ = initial_allocation * 72 bytes = “_gcs_shadow_locks parameter” * 72 bytes select * from v$resource_limit where resource_name like '%gcs%'; RESOURCE_NAME CURRENT_UTILIZATION MAX_UTILIZATION INITIAL_ALLOCATION LIMIT_VALUE ------------------------------ ------------------- --------------- --- gcs_resources 507772 514607 976083 976083 gcs_shadows 133862 139927 976083 976083
我们可以通过现有的v$resource_limit视图中的INITIAL_ALLOCATION估算Buffer cache增加后的INITIAL_ALLOCATION数量,例如我们准备将db_cache_size从10g增加到20g,那么可以通过下列公式算出有必要增加的共享池大小: add_to_shared_pool_size= 140 * Buffer_cache增加的兆数 * 192 bytes * 1.6 = 140 * 10* 1024 * 192 * 1.6 = 440401920 = 420M 问题总结 由于RAC环境中Oracle 使用共享池中的gcs resource/shadow 资源管理 全局缓存 , 当实例的Buffer Cache总量增加时gcs resource/shadow 这些资源的数目也会相应上升 , 这导致共享池中可用的剩余空间大幅下降,又因为 gcs 全局缓存资源在共享池中享有较高的优先级( perm ,且在10.2中 gcs资源不能和其他如row cache或library cache 共享一个Extent的内存区间) , 引发了大量的row/dictionary cache字典缓存和SQL执行计划被换出共享池, 引发大量的解析等待: cursor pin s on x 和 row cache lock ,
本文转自maclean_007 51CTO博客,原文链接:http://blog.51cto.com/maclean/1278146