library cache: mutex X等待事件是当一个会话以 exclusive mode持有library cache mutex时,另一个会话当前无论何时申请此mutex时必须等待其释放。很多时候对 library cache做不同的操作时都需要申请一个mutex, 所以最重要的是确认mutex的位置”location”, 该位置有助于分析该类等待事件的原因。
对应参数:
P1 = “idn” = Unique Mutex Identifier
P2 = Mutex “value” = in order to find the mutex value, read note 1298015.1
P3 = “where” = location in code (internal identifier) where mutex is being waited for
P1 is idn – can be used for finding the cursor related to mutex
For library cache: mutex X
1. P1 represents library cache hash bucket number (if idn <= 131072)
2. P1 represents hash value of the library cache object under protection (if idn > 131072)
mutex的问题排查较为复杂,通常一个现象有多个原因,并且mutex问题又会引起其它级连问题。常见原因:
1. 频繁的hard parse
2. High sql version count,需要检查一个很长的chain of versions
3. 引为某些原因导致的Invalidations and reloads
4. shared pool配置过小
5. cursor_sharing=similar和session_cached_cursors配置不当
6. BUGs
该类问题时通常需要引集的信息:
• AWR Report
• ASH Report
• Hanganalyze trace
• Systemstate dump (contain call stack level)
系统级可用的相关视图:GV$MUTEX_SLEEP 和GV$MUTEX_SLEEP_HISTORY
分析该事件的相关SQL:
-- which library cache objects are the target of the operation.
select *
from
(select
case when (kglhdadr = kglhdpar)
then 'Parent'
else 'Child '||kglobt09 end cursor,
kglhdadr ADDRESS,
substr(kglnaobj,1,20) NAME,
kglnahsh HASH_VALUE,
kglobtyd TYPE,
kglobt23 LOCKED_TOTAL,
kglobt24 PINNED_TOTAL,
kglhdexc EXECUTIONS,
kglhdnsp NAMESPACE
from
x$kglob
-- where kglobtyd != 'CURSOR'
order by
kglobt24 desc)
where
rownum <= 10;
-- to find blocker session
SELECT count (*), to_number(substr(to_char(rawtohex(p2raw)), 1, 8),'XXXXXXXX') blocking_sid
FROM v$session
WHERE event = 'library cache: mutex X'
group by to_number(substr(to_char(rawtohex(p2raw)),1, 8), 'XXXXXXXX');
-- to find mutex from ash
select event, p1, count(1)
from v$active_session_history
where sample_time > (sysdate - 20/1440)
and event = 'library cache: mutex X'
group by
event, p1
order by 3;
-- if idn > 131072, the P1 is the object the mutex is against.For the topmost P1 run:
select KGLNAOBJ, KGLNAOWN, KGLHDNSP, KGLOBTYP
from x$kglob
where KGLNAHSH= {value of P1}
最近遇到过一个案例数据库几个session 在执行相同的insert values sql时library cache: mutex X争用严重.
SQL> select child_number,count(*) from v$sql where sql_id='b92jmb1qngsyd' group by child_number;
CHILD_NUMBER COUNT(*)
------------ ----------
0 47
1 47
2 47
3 47
4 47
5 46
6 46
7 47
8 48
9 46
10 47
11 47
12 47
13 47
14 47
15 48
16 47
17 47
...
83 47
84 47
85 47
86 47
87 47
88 47
89 47
90 47
91 47
92 47
93 46
94 47
95 47
96 47
97 47
98 47
99 47
100 rows selected.
sql> @no_shared
SQL_ID NONSHARED_REASON COUNT(*)
------------- ----------------------------- ----------
b92jmb1qngsyd BIND_MISMATCH 4730
b92jmb1qngsyd PURGED_CURSOR 2
b92jmb1qngsyd HASH_MATCH_FAILED 1
b92jmb1qngsyd BIND_LENGTH_UPGRADEABLE 952
select /*+rule*/m.position,m.bind_name , m.max_length,count(*) child_cursor_count
from v$sql s, v$sql_bind_metadata m
where s.sql_id = 'b92jmb1qngsyd'
and s.child_address = m.address group by m.position,m.bind_name , m.max_length
order by 1, 2;
-- 确认存在部分变量varchar长度区间问题
SQL> select mutex_type,location_id,location,sleeps from x$mutex_sleep where location_id=85;
MUTEX_TYPE LOCATION_ID LOCATION SLEEPS
-------------------------------- ----------- ---------------------------------------- ----------
Library Cache 85 kgllkdl1 85 15001254
当时有做SSD 但级别不够(10), 确认了blocker 是no in wait, 并且是on cpu, 执行的是相同的insert sql, 该sql是应用同步数据使用执行频率较高。
当时也有做errorstack
----- Abridged Call Stack Trace -----
146297 ksedsts()+544
-ksdxcb()+3216
146298
146299
----- End of Abridged Call Stack Trace -----
以call stack和event为关键字,在MOS中发现一篇Session Spin on Kglrdtin Holding ‘Library Cache: Mutex X’ (文档 ID 2219897.1)
和当前的版本现象较为相似因当前的平台hpux无相关one off patch,且无升级11.2.0.4计划记划,这里只记录一下该问题。
— another case
SQL> select count(*) from v$sql where sql_id='4uqp4jczbf1qt';
COUNT(*)
----------
21372
SQL> select version_count from v$sqlarea where sql_id='4uqp4jczbf1qt';
VERSION_COUNT
-------------
74
SQL> @no_shared
Enter value for sqlid: 4uqp4jczbf1qt
SQL_ID NONSHARED_REASON COUNT(*)
------------- ----------------------------- ----------
4uqp4jczbf1qt BIND_MISMATCH 21428
4uqp4jczbf1qt HASH_MATCH_FAILED 290
4uqp4jczbf1qt BIND_LENGTH_UPGRADEABLE 17489
SQL> @pd obso
Show all parameters and session values from x$ksppi/x$ksppcv...
INDX I_HEX NAME VALUE DESCRIPTION
---------- ----- -------------------------------- ---------- ------------------------------------------------
1723 6BB _cursor_obsolete_threshold 100 Number of cursors per parent before obsoletion.
SQL> select /*+rule*/ m.position,m.bind_name , m.max_length,count(*) child_cursor_count
2 from v$sql s, v$sql_bind_metadata m
3 where s.sql_id = '4uqp4jczbf1qt'
4 and m.datatype=1
5 and s.child_address = m.address group by m.position,m.bind_name , m.max_length
6 order by 1, 2;
POSITION BIND_NAME MAX_LENGTH CHILD_CURSOR_COUNT
---------- ------------------------------ ---------- ------------------
1 V0 32 22192
2 V1 32 22192
7 V6 32 22192
8 V7 32 22192
9 V8 32 22192
10 V9 32 22192
12 V11 32 22192
13 V12 32 18682
13 V12 128 3188
13 V12 2000 322
16 V15 32 22192
17 V16 32 21977
17 V16 128 215
18 V17 32 22192
19 V18 32 22192
20 V19 32 22192
21 V20 32 22192
22 V21 32 22192
30 V29 32 22192
31 V30 32 22192
32 V31 32 22192
33 V32 32 22192
34 V33 32 22192
35 V34 32 22192
36 V35 32 22192
37 V36 32 22192
38 V37 32 22192
39 V38 32 20037
39 V38 128 2155
42 V41 32 22192
43 V42 32 22192
47 V46 32 22192
Event 10503 level
EVENT 10503 enables users to specify a character bind buffer length. Depending on the length used, the character binds in the child cursor can all be created using the same bind length, skipping bind graduation and keeping the child chain relatively small. This helps to alleviate a potential cursor-sharing problem related to graduated binds.
SQL> alter system set events='10503 trace name context forever, level 2000';
System altered.
SQL> oradebug setmypid
Statement processed.
SQL> oradebug eventdump system
10503 trace name context forever, level 2000
10949 trace name context forever,level 1
28401 trace name context forever,level 1
SQL> select address,hash_value from v$sqlarea where sql_id='4uqp4jczbf1qt';
ADDRESS HASH_VALUE
---------------- ----------
C000001F574C4F90 1052182233
SQL> exec dbms_shared_pool.purge('C000001F574C4F90,1052182233','C',65);
PL/SQL procedure successfully completed.
SQL> select address,hash_value,last_load_time from v$sqlarea where sql_id='4uqp4jczbf1qt';
ADDRESS HASH_VALUE LAST_LOAD_TIME
---------------- ---------- -----------------
C000001F574C4F90 1052182233 20190211 15:00:02
SQL> select count(*) from v$sql where sql_id='4uqp4jczbf1qt';
COUNT(*)
----------
22700
SQL> alter system flush shared_pool;
System altered.
SQL> select count(*) from v$sql where sql_id='4uqp4jczbf1qt';
COUNT(*)
----------
0
SQL> select /*+rule*/ m.position,m.bind_name , m.max_length,count(*) child_cursor_count
2 from v$sql s, v$sql_bind_metadata m
3 where s.sql_id = '4uqp4jczbf1qt'
4 and m.datatype=1
5 and s.child_address = m.address group by m.position,m.bind_name , m.max_length
6 order by 1, 2;
POSITION BIND_NAME MAX_LENGTH CHILD_CURSOR_COUNT
---------- ------------------------------ ---------- ------------------
1 V0 2000 86
2 V1 2000 86
7 V6 2000 86
8 V7 2000 86
9 V8 2000 86
10 V9 2000 86
12 V11 2000 86
13 V12 2000 86
16 V15 2000 86
17 V16 2000 86
18 V17 2000 86
19 V18 2000 86
20 V19 2000 86
21 V20 2000 86
22 V21 2000 86
30 V29 2000 86
31 V30 2000 86
32 V31 2000 86
33 V32 2000 86
34 V33 2000 86
35 V34 2000 86
36 V35 2000 86
37 V36 2000 86
38 V37 2000 86
39 V38 2000 86
42 V41 2000 86
43 V42 2000 86
47 V46 2000 86
打赏
微信扫一扫,打赏作者吧~