ORA-4031

--ORA-04031是个经典错误,一般情况是由于shared pool被耗尽导致,例如:未绑定变量导致的hard parse或大并发的shared pool本身太小,一般情况ORA-04031发生时会同时又相关等待事件出现
cursor: pin S wait on X;library cache lock;latch: library cache;latch: shared pool等。

AWR里shared Pool Advisory的各项指标说明,此区域可以说明是否需要调节shared_pool
Est LC===> Estimate library cache
Shared Pool Size(M)==>用以评估的shared pool共享池大小,在AMM /ASMM环境下shared_pool大小都可能浮动
SP Size Factr==>共享池大小的比例因子,(Shared Pool Size for Estim/SHARED_POOL_SIZE),当此值为1时,shared pool size是实际的大小
Est LC Size(M)==>评估的library cache大小,单位为MB,因为是shared pool中包含library cache当然还有其他例如dictionary cache
Est LC Mem Obj==>评估指定大小共享池内的library cache memory object的数量
Est LC Time Saved(s)==>指定的共享池大小情况下可找到需要的library cache memory objects从而节约的解析时间,这些节约的解析时间也是花费在共享池内重复加载需要的对象(reload),
这些对象可能因为共享池没有足够的free memory而被刷出去
Est LC Time Saved Factr==>Est LC Time Saved(s)的比例因子(Est LC Time Saved(s)/Current LC Time Saved(s))
Est LC Load Time (s)==>在指定的共享池大小情况下解析的耗时
Est LC Load Time Factr==>Est LC Load Time(s)的比例因子,(Est LC Load Time(s)/Current LC Load Time(s))
Est LC  Mem Obj Hits(K)==>在指定的共享池大小情况下需要的library cache memory object正好在共享池中被找到的次数

相关信息来自v$shared_pool_advice视图



---此语句就对应着awr上Library Cache Activity区域---此区域可以判定是否是shared pool小导致,一般shared Pool Advisory也可反映出来这个问题
SELECT l.NAMESPACE,--library cache里相关对象
       l.GETS,--相关对象library cache lock被申请的次数
       (1 - l.GETHITRATIO) * 100  " get  misses pct",--丢失,即未申请到local
       l.GETHITS,--相关对象正好在library cache handle找到的次数
       l.PINS,--这个对象上的被申请pin锁的次数
       (1 - l.PINHITRATIO) * 100 "pins misses pct",--丢失,即未申请到latch
       l.PINHITS,--pin对象的heap metadata正好在shared pool上的次数
       l.RELOADS,--从object handle被创建建开始不是第一次PIN该对象的PIN ,且该次PIN要求对象从磁盘上读取加载的次数 ;Reloads值较高的情况 建议增大shared_pool_size
       l.INVALIDATIONS--对象被修改导致该命名空间所包含对象被标记为无效的次数
  FROM V$LIBRARYCACHE l;


某节点直接因ORA-4031错误hang死实例,
Errors in file /u01/app/oracle/diag/rdbms/querydb/querydb2/trace/querydb2_m000_21164.trc:
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","unknown object","KKSSP^2372","kglseshtSegs")
Wed Oct 08 21:44:11 2014
Process m000 died, see its trace file
Wed Oct 08 21:44:15 2014
Errors in file /u01/app/oracle/diag/rdbms/querydb/querydb2/trace/querydb2_smon_7226.trc:
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","unknown object","KGLH0^e949e218","kglHeapInitialize:temp")
Wed Oct 08 21:44:31 2014
Errors in file /u01/app/oracle/diag/rdbms/querydb/querydb2/trace/querydb2_smon_7226.trc:
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","lock table sys.mon_mods$ in ...","SQLA","tmp")
Wed Oct 08 21:44:47 2014
Errors in file /u01/app/oracle/diag/rdbms/querydb/querydb2/trace/querydb2_smon_7226.trc:
ORA-04031: unable to allocate 32 bytes of shared memory ("shared pool","unknown object","KGLH0^e949e218","kglHeapInitialize:temp")

---从上面的错误可以看到问题发生在shared pool上,而不是large pool 或java pool,调去awr报告来看
              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:     10031 04-Oct-14 20:00:18       457       8.7
  End Snap:     10032 04-Oct-14 21:00:01       791       9.3
   Elapsed:               59.72 (mins)
   DB Time:            6,091.04 (mins)==>数据库相应时间超高
   

Cache Sizes                       Begin        End
~~~~~~~~~~~                  ---------- ----------
               Buffer Cache:    40,832M    40,832M  Std Block Size:         8K
           Shared Pool Size:     4,224M     4,224M      Log Buffer:    44,656K

Load Profile              Per Second    Per Transaction   Per Exec   Per Call
~~~~~~~~~~~~         ---------------    --------------- ---------- ----------
      DB Time(s):              102.0               12.7       0.14       0.18
       DB CPU(s):                2.0                0.3       0.00       0.00
       Redo size:          282,965.6           35,170.7
   Logical reads:           31,776.1            3,949.6
   Block changes:            1,168.5              145.2
  Physical reads:              940.6              116.9
 Physical writes:              186.2               23.2
      User calls:              573.1               71.2
          Parses:              213.4               26.5
     Hard parses:               66.4                8.3==>同比和环比下,此时间段的hard parse太高
W/A MB processed:                4.4                0.6
          Logons:                0.2                0.0
        Executes:              717.9               89.2
       Rollbacks:                0.0                0.0
    Transactions:                8.1


Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                           Avg
                                                          wait   % DB
Event                                 Waits     Time(s)   (ms)   time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
cursor: pin S wait on X              34,869     204,419   5862   55.9 Concurrenc==>
SGA: allocation forcing compon    1,562,520      91,720     59   25.1 Other
library cache: mutex X               27,076      30,436   1124    8.3 Concurrenc
db file sequential read           1,821,755       8,849      5    2.4 User I/O
DB CPU                                            7,280           2.0

-> "Pct Misses"  should be very low
                         Get    Pct            Pin    Pct             Invali-
Namespace           Requests   Miss       Requests   Miss    Reloads  dations
--------------- ------------ ------ -------------- ------ ---------- --------
ACCOUNT_STATUS         1,128    0.4              0    N/A          0        0
BODY                     471    8.9            764   16.6         84        0
CLUSTER                6,040    0.2          6,112    0.2          0        0
DBLINK                 1,195    0.0              0    N/A          0        0
EDITION                  552    0.0          1,087    0.0          0        0
INDEX                216,927   24.3        216,480   24.7        789        0
LOCATION                  84   17.9             84   17.9          0        0
OBJECT ID                136  100.0              0    N/A          0        0
QUEUE                      6   33.3          1,009    0.5          3        0
RULESET                    0    N/A             14   14.3          2        0
SCHEMA                 5,826    0.0              0    N/A          0        0
SQL AREA             811,075    6.7      3,813,011   23.3    482,756   13,900==>shared pool 太小,从而pin对象reload很大,导致前面的硬解析很高,从而
SQL AREA BUILD       537,350   11.6              0    N/A          0        0
SQL AREA STATS       209,927   95.8        209,258   95.8         61        0
SUBSCRIPTION               0    N/A             84   26.2         22        0
TABLE/PROCEDURE      261,995    0.9      1,092,520   19.6     59,542        0
TRIGGER                3,526   42.7          3,526   42.8          4        0
          -------------------------------------------------------------

-----通过查询发生ORA-04031时间段的cursor:pin S wait on X等待,发现有有2k多个
SELECT event, blocking_session, sql_id, COUNT(*)
  FROM dba_hist_active_sess_history ash
 WHERE sample_time >=
       to_timestamp('2014-10-04 20:00:00', 'yyyy-mm-dd hh24:mi:ss')
   AND sample_time <=
       to_timestamp('2014-10-04 21:10:00', 'yyyy-mm-dd hh24:mi:ss')
   AND event = 'cursor: pin S wait on X'
 GROUP BY event, blocking_session, sql_id;


SELECT event, blocking_session, sql_id, COUNT(*)
  FROM dba_hist_active_sess_history ash
 WHERE sample_time >=
       to_timestamp('2014-10-04 20:00:00', 'yyyy-mm-dd hh24:mi:ss')
   AND sample_time <=
       to_timestamp('2014-10-04 21:10:00', 'yyyy-mm-dd hh24:mi:ss')
   AND ash.session_id = 4284
 GROUP BY event, blocking_session, sql_id;

----原因就是:shared pool太小,正好此时间段是数据库的业务高峰期,从而当后面查询的对象信息在前面已经加载到了shared pool中,
由于太小,从而很快被刷出去,导致后面一次的访问,要重新做硬解析,这点从解析和reload次数看出;并发高,导致要以X模式pin住library cache object,
这就意味这个process acquires the mutex in exclusive mode即library cache latch以X模式,多个session是这样,从而产生cursor: pin S wait on X等待。
shared pool太小&&并发高===导致shared pool里的library cache共享游标被刷出去,产生大量硬解析==>很多session以library cache latch以X模式===>产生大量不同sql_id的cursor: pin S wait on X===>需要调整shared_pool_size



内存中shared pool结构及机制
Shared Pool当中主要包含了2部分:library cache和dictionary cache;
Library cache包含了共享SQL区(shared SQL areas),私有SQL区(private SQLareas,如果配置了共享服务器),PL/SQL存储过程以及包,还有一些控制信息,比如说locks以及library cache handles。
Dictionary cache包含了表,视图的依赖信息,比如表结构,它的用户,Oracle在解析SQL的时候就会频繁的访问dictionary cache。

library chache结构,是有多个hash bucket组成的hash table
多个hash bucket<====有很多的lch组成的lch list======>library cache handle<==lch里有大量对象指针,通过此指针,指到相关对象========>library cache handle<==========>....
lch指向的library cache object对象大概有
dependency table:指向本对象所依赖的对象。
child table:指向本对象的子对象,比如某个游标的子游标
authorization table:对象的授权信息。
type:shared cursor,index,table,cluster,view,synonym,sequence,procedure,function,package,table body,package body,trigger等等。
data blocks:data block是一个指针,指向了data heap,即存放真实数据的地方

相关锁发生场景:
library cache lock latch获得lock后,则就释放lock latch,必须先获得latch===>library cache lock发生在查找lch时====通过lch找lco时==>===先申请library cache pin latch,获得pin后,则就释放pin latch,必须先获得latch===>library cache pin发生在查找访问lco的heap上

Library Cache lock有3中模式:
Share(S):当读取一个library cache object的时候获得
Exclusive(X):当创建/修改一个library cache object的时候获得
Null(N):用来确保对象依赖性
Library Cache pin有2种模式:
Share(S):读取object heap
Exclusive(X):修改object heap

从Oracle10g开始,Oracle正在逐步用mutex取代library cache中的latch,
cursor:pin S和cursor:pin X相当于share和exclusive类型的library cache pin,
cursor:pin S wait on X则表示share方式正在等待exclusive锁定

详情相关信息,请参考oracle11g核心技术

评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值