11g等待事件之library cache: mutex X

最新推荐文章于 2024-02-22 14:13:16 发布

xiaoxiao树

最新推荐文章于 2024-02-22 14:13:16 发布

阅读量1.4w

点赞数

分类专栏： oracle 体系结构

oracle 体系结构专栏收录该内容

19 篇文章 1 订阅

订阅专栏

http://www.xifenfei.com/2012/05/library-cache-latch%e7%ad%89%e5%be%85%e4%ba%8b%e4%bb%b6.html

产生library cache latch原因

 
        The library cache latches protect the cached SQL statements and objects' definitions held  
       
        in 
         the library cache within the shared pool. The library cache latch must be acquired  
       
        in 
         order to add a new statement to the library cache. During a parse, Oracle searches  
       
        the library cache  
        for 
        a matching statement. If one is not found,  
        then 
        Oracle will parse  
       
        the SQL statement, obtain the library cache latch and insert the new SQL.

每一个sql被执行之前，先要到library cache中根据hash_value查找parent cursor,这就需要先获得library cache latch；找到parent cursor后，就会去查找对应的child cursor,当发现无法找到时，就会释放library cache latch，获得share pool latch分配空间给硬解析后的产生的执行计划；然后再次获得library cache latch进行把执行计划放入share pool,转入library cache pin+lock(null模式)开始执行sql.library cache latch 的个数有限(与CPU_COUNT参数相关)，当数据库中出现大量硬解析的时候，某一个sql无法得到library cache latch就会开始spin,达到spin count后还没得到，就会开始sleep,达到sleep时间后,醒来还再次试图过的library cache latch得不到就在spin再得不到又sleep…依此类推.
综上可知: 在sql执行的过程中可以看出在出现High Versions Count和Hard Parse的情况下都有可能出现library cache latch等待.

http://www.dbafree.net/?p=810

library cache: mutex X替代了之前的library cache latch，主要作用是在hash bucket中定位handle时使用。（比如SQL硬解析时，需要往hash bucket中新增一个cursor时，需要library cache latch）。如下图所示:

文档上面的解释如下：
The library cache mutex is acquired for similar purposes that the library cache latches were acquired in prior versions of Oracle. In 10g, mutexes were introduced for certain operations in the library cache. Starting with 11g, the library cache latches were replaced by mutexes, hence this new wait event.
Mutexes are a lighter-weight and more granular concurrency mechanism than latches. Mutexes take advantage of CPU architectures that offer the compare and swap instructions (or similar). The reason for obtaining a mutex in the first place, is to ensure that certain operations are properly managed for concurrency. E.g., if one session is changing a data structure in memory, then another session must wait to acquire the mutex before it can make a similar change – this prevents unintended changes that would lead to corruptions or crashes if not serialized.
This wait event is present whenever a library cache mutex is held in exclusive mode by a session and other sessions need to wait for it to be released. There are many different operations in the library cache that will require a mutex, so its important to recognize which “location” (in Oracle’s code) is involved in the wait. “Location” is useful to Oracle Support engineers for diagnosing the cause for this wait event.
Individual Waits:

Parameters:
P1 = “idn” = Unique Mutex Identifier
P2 = “value”
P3 = “where” = location in code (internal identifier) where mutex is being waited for
@The meaning of the code for “where” can be found by looking in kgl0.h for entries with the prefix “”kglml_XXX”. For example, if P3=2, then it corresponds to “kglml_kglget2″. You can then search source code for this symbol to see where the mutex is acquired.

测试一个硬解析的场景如下：

conn scott/tiger

create table testlib1 (id number) ;
create table testlib2 (id number) ;
create table testlib3 (id number) ;
create table testlib4 (id number) ;
create table testlib5 (id number) ;
create table testlib6 (id number) ;
create table testlib7 (id number) ;
create table testlib8 (id number) ;
create table testlib9 (id number) ;
create table testlib10 (id number) ;
create table testlib11 (id number) ;
create table testlib12 (id number) ;
create table testlib13 (id number) ;
create table testlib14 (id number) ;
create table testlib15 (id number) ;
create table testlib16 (id number) ;
create table testlib17 (id number) ;
create table testlib18 (id number) ;
create table testlib19 (id number) ;
create table testlib20 (id number) ;

vi test4.sh

#!/bin/ksh
i="$1"
while true
do
echo $i
sqlplus scott/tiger << EOF
select * from testlib$i where id = $RANDOM;
exit
EOF
done

alter system flush shared_pool;
exec DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();

nohup sh test4.sh 1 &
nohup sh test4.sh 2 &
nohup sh test4.sh 3 &
nohup sh test4.sh 4 &
nohup sh test4.sh 5 &
nohup sh test4.sh 6 &
nohup sh test4.sh 7 &
nohup sh test4.sh 8 &
nohup sh test4.sh 9 &
nohup sh test4.sh 10 &
nohup sh test4.sh 11 &
nohup sh test4.sh 12 &
nohup sh test4.sh 13 &
nohup sh test4.sh 14 &
nohup sh test4.sh 15 &
nohup sh test4.sh 16 &
nohup sh test4.sh 17 &
nohup sh test4.sh 18 &
nohup sh test4.sh 19 &
nohup sh test4.sh 20 &

exec DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();

生成awr report，查看等待事件：

Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                           Avg
                                                          wait   % DB
Event                                 Waits     Time(s)   (ms)   time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
DB CPU                                              104         111.9
library cache: mutex X                1,920           3      1    2.9 Concurrenc
latch: shared pool                      237           1      5    1.2 Concurrenc
latch: row cache objects                146           1      5     .9 Concurrenc
cursor: pin S wait on X                  42           1     13     .6 Concurrenc
^LHost CPU (CPUs:   16 Cores:    8 Sockets:    2)
~~~~~~~~         Load Average
               Begin       End     %User   %System      %WIO     %Idle
           --------- --------- --------- --------- --------- ---------
                1.12      6.89      25.0      18.5       2.7      56.1

我们的hard parse

Load Profile              Per Second    Per Transaction   Per Exec   Per Call
~~~~~~~~~~~~         ---------------    --------------- ---------- ----------
      DB Time(s):                2.8                4.0       0.00       0.00
       DB CPU(s):                3.1                4.5       0.00       0.00
       Redo size:           31,740.9           46,051.8
   Logical reads:            3,625.8            5,260.6
   Block changes:               83.5              121.1
  Physical reads:                0.2                0.4
 Physical writes:                0.2                0.2
      User calls:            2,719.4            3,945.5
          Parses:            1,856.1            2,693.0
     Hard parses:              137.4              199.4
W/A MB processed:                0.8                1.2
          Logons:              129.7              188.1
        Executes:            1,830.6            2,656.0
       Rollbacks:                0.2                0.3
    Transactions:                0.7

查了下metalink，与"library cache: mutex X"等待相关的BUG大约有30多个，会在11G各个版本之间发生，所以碰到这个等待事件，可以先确认下是不是碰上BUG了。以下两个BUG，在网上好象大家经常碰到，给出metalink相关的描述：

Bug 5928271 - Excessive waits on "library cache: mutex X" [ID 5928271.8]

Versions >= 11.1.0.6 but BELOW 11.2
This fix alleviates some waits on "library cache: mutex X" when looking
up a library cache object.

Bug 9530750 High waits for 'library cache: mutex X' for cursor Build lock
Versions BELOW 12.1
High waits may be seen for "library cache: mutex X" for
a build lock mutex with a call stack including kksGetBuildLock.

Rediscovery Notes:
The mutex is for an object with a name like "$BUILD$.xxxxxxxx"

http://blog.itpub.net/18474/viewspace-1060805/

library cache: mutex等待事件分析方法及案例

mutex是11g中采用的轻量级锁，替代latch(更轻量，更细粒度)。其中mutex X表示eXclusive模式，如果是mutex s则是共享模式。引起library cache mutex冲突的原因很多。

[@more@]

一、 Library cache: mutex等待事件产生常见原因：

l os资源不足(cpu、内存)

l sga设置不合理，shared_pool不足，或动态调整导致hard parse

l hot object contention

l hard parse

l sql high version count 由于子游标太多，扫描时会形成锁

l library cache object失效导致重编译

l bug

二、诊断方法

1. 首先检查os资源如cpu内存是否存在不足

2. 其次，检查是否存在Hot object contention

方法一：通过ash报告

Top sql with top events最活跃的sql及其top wait events

Top sql with top row sources 最活跃的sql及其详细运行计划信息，可以凭此定位sql中

哪一部分最消耗时间。

Top event p1/p2/p3 values显示wait event详细参数

方法二：通过sql

通过sql发现hot objects:

select KGLNAOBJ, KGLNAOWN, KGLHDNSP, KGLOBTYP, KGLOBT23, KGLOBT24

from x$kglob where KGLOBT23 > 1000000 or KGLOBT24 > 1000000

order by KGLOBT24;

也可以用以下sql找到相应sql，hot objects，以及每句sql等待事件占比

select sw.p1,s.sql_id, count(*),(ratio_to_report(count(*)) over ())*100 pct

from v$session s,v$session_wait sw

where s.event like '%mutex%'and s.sid=sw.sid

group by sw.p1,s.sql_id

order by count(*) desc;

select sql_id,sql_text from v$sql where sql_id=…;

如果要查看历史信息，可以从历史表查询：

select * from (

select p1, sql_id,

count(*),

(ratio_to_report(count(*)) over ())*100 pct

from dba_hist_active_sess_history

where event='library cache: mutex X'

and snap_id between and

and dbid =

group by p1, sql_id

order by count(*) desc)

where rownum <= 10;

然后根据p1查热点对象

select KGLNAOBJ, KGLNAOWN, KGLHDNSP, KGLOBTYP

from x$kglob where KGLNAHSH= ;

3. 如果不存在hot object contention，则检查AWR报告（和baseline做比较）

1) Hard parse

检查awr的load profile中的hard parse再检查sql orderd by pase call）

检查是否发生了shared_pool size的调整

2) high version count高

检查sql orderd by version count是否正常

cursor_sharing参数是否设成了SIMILAR(建议用FORCE取代)

3) Library cache object失效

检查 Library Cache Activity的invalidations. 如果失效很多，检查是否执行了ddl操作

check ddl's performed during the time such as truncate, drop, grants, dbms_stats

4. mutex sleeps

检查AWR中'Mutex Sleep Summary'，主要关注location和times，

如果Sleeps次数多但time少，则影响不大。

也可通过sql查询（注意是instance启动以来的，而不是某个时段内）

select to_char(sysdate, 'HH:MI:SS') time, KGLNAHSH hash, sum(sleeps) sleeps,location,MUTEX_TYPE

, substr(KGLNAOBJ,1,40) object

from x$kglob , v$mutex_sleep --也可查历史表v$mutex_sleep_history

where kglnahsh=mutex_identifier

group by KGLNAOBJ,KGLNAHSH,location,MUTEX_TYPE

order by sleeps

xiaoxiao树

关注

0
点赞
踩
7

收藏

觉得还不错? 一键收藏
0
评论
11g等待事件之library cache: mutex X

http://www.xifenfei.com/2012/05/library-cache-latch%e7%ad%89%e5%be%85%e4%ba%8b%e4%bb%b6.html产生library cache latch原因The library cache latches protect the cached SQL statements and o
复制链接

扫一扫