oracle10503事件,library cache: mutex X等待事件, blocker session on cpu

白咸明

于 2021-04-04 16:55:58 发布

阅读量548

点赞数

文章标签： oracle10503事件

library cache: mutex X等待事件是当一个会话以 exclusive mode持有library cache mutex时，另一个会话当前无论何时申请此mutex时必须等待其释放。很多时候对 library cache做不同的操作时都需要申请一个mutex, 所以最重要的是确认mutex的位置”location”, 该位置有助于分析该类等待事件的原因。

对应参数：

P1 = “idn” = Unique Mutex Identifier

P2 = Mutex “value” = in order to find the mutex value, read note 1298015.1

P3 = “where” = location in code (internal identifier) where mutex is being waited for

P1 is idn – can be used for finding the cursor related to mutex

For library cache: mutex X

1. P1 represents library cache hash bucket number (if idn <= 131072)

2. P1 represents hash value of the library cache object under protection (if idn > 131072)

mutex的问题排查较为复杂，通常一个现象有多个原因，并且mutex问题又会引起其它级连问题。常见原因：

1. 频繁的hard parse

2. High sql version count,需要检查一个很长的chain of versions

3. 引为某些原因导致的Invalidations and reloads

4. shared pool配置过小

5. cursor_sharing=similar和session_cached_cursors配置不当

6. BUGs

该类问题时通常需要引集的信息：

• AWR Report

• ASH Report

• Hanganalyze trace

• Systemstate dump (contain call stack level)

系统级可用的相关视图：GV$MUTEX_SLEEP 和GV$MUTEX_SLEEP_HISTORY

分析该事件的相关SQL:

-- which library cache objects are the target of the operation.

select *

from

(select

case when (kglhdadr = kglhdpar)

then 'Parent'

else 'Child '||kglobt09 end cursor,

kglhdadr ADDRESS,

substr(kglnaobj,1,20) NAME,

kglnahsh HASH_VALUE,

kglobtyd TYPE,

kglobt23 LOCKED_TOTAL,

kglobt24 PINNED_TOTAL,

kglhdexc EXECUTIONS,

kglhdnsp NAMESPACE

from

x$kglob

-- where kglobtyd != 'CURSOR'

order by

kglobt24 desc)

where

rownum <= 10;

-- to find blocker session

SELECT count (*), to_number(substr(to_char(rawtohex(p2raw)), 1, 8),'XXXXXXXX') blocking_sid

FROM v$session

WHERE event = 'library cache: mutex X'

group by to_number(substr(to_char(rawtohex(p2raw)),1, 8), 'XXXXXXXX');

-- to find mutex from ash

select event, p1, count(1)

from v$active_session_history

where sample_time > (sysdate - 20/1440)

and event = 'library cache: mutex X'

group by

event, p1

order by 3;

-- if idn > 131072, the P1 is the object the mutex is against.For the topmost P1 run:

select KGLNAOBJ, KGLNAOWN, KGLHDNSP, KGLOBTYP

from x$kglob

where KGLNAHSH= {value of P1}

最近遇到过一个案例数据库几个session 在执行相同的insert values sql时library cache: mutex X争用严重.

SQL> select child_number,count(*) from v$sql where sql_id='b92jmb1qngsyd' group by child_number;

CHILD_NUMBER COUNT(*)

------------ ----------

0 47

1 47

2 47

3 47

4 47

5 46

6 46

7 47

8 48

9 46

10 47

11 47

12 47

13 47

14 47

15 48

16 47

17 47

...

83 47

84 47

85 47

86 47

87 47

88 47

89 47

90 47

91 47

92 47

93 46

94 47

95 47

96 47

97 47

98 47

99 47

100 rows selected.

sql> @no_shared

SQL_ID NONSHARED_REASON COUNT(*)

------------- ----------------------------- ----------

b92jmb1qngsyd BIND_MISMATCH 4730

b92jmb1qngsyd PURGED_CURSOR 2

b92jmb1qngsyd HASH_MATCH_FAILED 1

b92jmb1qngsyd BIND_LENGTH_UPGRADEABLE 952

select /*+rule*/m.position,m.bind_name , m.max_length,count(*) child_cursor_count

from v$sql s, v$sql_bind_metadata m

where s.sql_id = 'b92jmb1qngsyd'

and s.child_address = m.address group by m.position,m.bind_name , m.max_length

order by 1, 2;

-- 确认存在部分变量varchar长度区间问题

SQL> select mutex_type,location_id,location,sleeps from x$mutex_sleep where location_id=85;

MUTEX_TYPE LOCATION_ID LOCATION SLEEPS

-------------------------------- ----------- ---------------------------------------- ----------

Library Cache 85 kgllkdl1 85 15001254

当时有做SSD 但级别不够(10)，确认了blocker 是no in wait, 并且是on cpu, 执行的是相同的insert sql, 该sql是应用同步数据使用执行频率较高。

当时也有做errorstack

----- Abridged Call Stack Trace -----

146297 ksedsts()+544

-ksdxcb()+3216

146298

146299

----- End of Abridged Call Stack Trace -----

以call stack和event为关键字，在MOS中发现一篇Session Spin on Kglrdtin Holding ‘Library Cache: Mutex X’ (文档 ID 2219897.1)

和当前的版本现象较为相似因当前的平台hpux无相关one off patch,且无升级11.2.0.4计划记划，这里只记录一下该问题。

— another case

SQL> select count(*) from v$sql where sql_id='4uqp4jczbf1qt';

COUNT(*)

----------

21372

SQL> select version_count from v$sqlarea where sql_id='4uqp4jczbf1qt';

VERSION_COUNT

-------------

SQL> @no_shared

Enter value for sqlid: 4uqp4jczbf1qt

SQL_ID NONSHARED_REASON COUNT(*)

------------- ----------------------------- ----------

4uqp4jczbf1qt BIND_MISMATCH 21428

4uqp4jczbf1qt HASH_MATCH_FAILED 290

4uqp4jczbf1qt BIND_LENGTH_UPGRADEABLE 17489

SQL> @pd obso

Show all parameters and session values from x$ksppi/x$ksppcv...

INDX I_HEX NAME VALUE DESCRIPTION

---------- ----- -------------------------------- ---------- ------------------------------------------------

1723 6BB _cursor_obsolete_threshold 100 Number of cursors per parent before obsoletion.

SQL> select /*+rule*/ m.position,m.bind_name , m.max_length,count(*) child_cursor_count

2 from v$sql s, v$sql_bind_metadata m

3 where s.sql_id = '4uqp4jczbf1qt'

4 and m.datatype=1

5 and s.child_address = m.address group by m.position,m.bind_name , m.max_length

6 order by 1, 2;

POSITION BIND_NAME MAX_LENGTH CHILD_CURSOR_COUNT

---------- ------------------------------ ---------- ------------------

1 V0 32 22192

2 V1 32 22192

7 V6 32 22192

8 V7 32 22192

9 V8 32 22192

10 V9 32 22192

12 V11 32 22192

13 V12 32 18682

13 V12 128 3188

13 V12 2000 322

16 V15 32 22192

17 V16 32 21977

17 V16 128 215

18 V17 32 22192

19 V18 32 22192

20 V19 32 22192

21 V20 32 22192

22 V21 32 22192

30 V29 32 22192

31 V30 32 22192

32 V31 32 22192

33 V32 32 22192

34 V33 32 22192

35 V34 32 22192

36 V35 32 22192

37 V36 32 22192

38 V37 32 22192

39 V38 32 20037

39 V38 128 2155

42 V41 32 22192

43 V42 32 22192

47 V46 32 22192

Event 10503 level

EVENT 10503 enables users to specify a character bind buffer length. Depending on the length used, the character binds in the child cursor can all be created using the same bind length, skipping bind graduation and keeping the child chain relatively small. This helps to alleviate a potential cursor-sharing problem related to graduated binds.

SQL> alter system set events='10503 trace name context forever, level 2000';

System altered.

SQL> oradebug setmypid

Statement processed.

SQL> oradebug eventdump system

10503 trace name context forever, level 2000

10949 trace name context forever,level 1

28401 trace name context forever,level 1

SQL> select address,hash_value from v$sqlarea where sql_id='4uqp4jczbf1qt';

ADDRESS HASH_VALUE

---------------- ----------

C000001F574C4F90 1052182233

SQL> exec dbms_shared_pool.purge('C000001F574C4F90,1052182233','C',65);

PL/SQL procedure successfully completed.

SQL> select address,hash_value,last_load_time from v$sqlarea where sql_id='4uqp4jczbf1qt';

ADDRESS HASH_VALUE LAST_LOAD_TIME

---------------- ---------- -----------------

C000001F574C4F90 1052182233 20190211 15:00:02

SQL> select count(*) from v$sql where sql_id='4uqp4jczbf1qt';

COUNT(*)

----------

22700

SQL> alter system flush shared_pool;

System altered.

SQL> select count(*) from v$sql where sql_id='4uqp4jczbf1qt';

COUNT(*)

----------

SQL> select /*+rule*/ m.position,m.bind_name , m.max_length,count(*) child_cursor_count

2 from v$sql s, v$sql_bind_metadata m

3 where s.sql_id = '4uqp4jczbf1qt'

4 and m.datatype=1

5 and s.child_address = m.address group by m.position,m.bind_name , m.max_length

6 order by 1, 2;

POSITION BIND_NAME MAX_LENGTH CHILD_CURSOR_COUNT

---------- ------------------------------ ---------- ------------------

1 V0 2000 86

2 V1 2000 86

7 V6 2000 86

8 V7 2000 86

9 V8 2000 86

10 V9 2000 86

12 V11 2000 86

13 V12 2000 86

16 V15 2000 86

17 V16 2000 86

18 V17 2000 86

19 V18 2000 86

20 V19 2000 86

21 V20 2000 86

22 V21 2000 86

30 V29 2000 86

31 V30 2000 86

32 V31 2000 86

33 V32 2000 86

34 V33 2000 86

35 V34 2000 86

36 V35 2000 86

37 V36 2000 86

38 V37 2000 86

39 V38 2000 86

42 V41 2000 86

43 V42 2000 86

47 V46 2000 86

打赏

微信扫一扫，打赏作者吧～

白咸明

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
oracle10503事件,library cache: mutex X等待事件, blocker session on cpu

library cache: mutex X等待事件是当一个会话以exclusive mode持有library cache mutex时，另一个会话当前无论何时申请此mutex时必须等待其释放。很多时候对 library cache做不同的操作时都需要申请一个mutex, 所以最重要的是确认mutex的位置”location”, 该位置有助于分析该类等待事件的原因。对应参数：P1 = “id...
复制链接

扫一扫