一次library cache pin故障的解决过程

最新推荐文章于 2024-09-01 11:55:12 发布

congxiadui7455

最新推荐文章于 2024-09-01 11:55:12 发布

阅读量139

点赞数

文章标签：数据库

看了 崔大师的关于一次library cache pin故障的解决过程，觉得很好，转过来与大家分享，分享大师的思路。

原文链接：

http://www.dbsnake.net/solve-library-cache-pin.html

今天接到同事的电话，说他的一个存储过程已经run了一个多小时了，还在继续run，他觉得极不正常，按道理说不应该run这么长时间。

我说那我去看一下吧。

这个库是一个AIX上的10.2.0.4，我采集了一下问题时间段的AWR报告：

Begin Snap:	13302	11-Jun-10 12:00:56	109	4.7
End Snap:	13303	11-Jun-10 13:00:02	97	4.9
Elapsed:		59.10 (mins)
DB Time:		113.98 (mins)

Top 5事件为：

Event	Waits	Time(s)	Avg Wait(ms)	% Total Call Time	Wait Class
library cache pin	1,252	3,656	2,920	53.5	Concurrency
library cache lock	989	2,890	2,922	42.3	Concurrency
CPU time		219		3.2
db file sequential read	5,694	12	2	.2	User I/O
log file parallel write	1,467	11	8	.2	System I/O

从AWR报告结果里我们可以看出在出问题的时间段，系统在经历严重的library cache pin以及library cache lock等待。

从Load Profile中我们又可以看到：

	Per Second	Per Transaction
Parses:	12.83	65.83
Hard parses:	0.05	0.25

也就是说导致上述library cache pin和library cache lock的并不是hard parse。

对于library cache pin等待来说，AWR报告的作用有限，最有效的方式就是找到持有library cache pin以及等待library cache pin的session，然后看看他们在做什么：

SQL> SELECT s.sid, kglpnmod “Mode”, kglpnreq “Req”, SPID “OS Process”

2 FROM v$session_wait w, x$kglpn p, v$session s ,v$process o

3 WHERE p.kglpnuse=s.saddr

4 AND kglpnhdl=w.p1raw

5 and w.event like ‘%library cache pin%’

6 and s.paddr=o.addr

7 /

SID Mode Req OS Process

———- ———- ———- ————

396 0 2 6381970

341 2 0 4092132

363 0 2 3514690

304 0 2 3977478

354 0 3 3137874

20 rows selected

我那位run存储过程的同事所在的session是396，从上述结果里我们可以看出来396现在想以Share模式（即Req=2）去持有library cache pin，同时现在持有上述library cache pin的是session 341，且341的持有模式也是Share（即Mode=2）。

本来Share和Share是可以共享的，但不幸的是在396之前，session 354想以Exclusive模式（即Req=3）去持有上述library cache pin，这直接导致了396需要处于等待的Queue中，同时处于Queue中的还有363和304。

我为什么这么说呢，因为oracle对library cache pin的解释中有一句非常经典的话：

An X request (3) will be blocked by any pins held S mode (2) on the object.
An S request (2) will be blocked by any X mode (3) pin held, or may queue behind some other X request.

所以从AWR报告和上述查询结果中我们可以得出如下结论：

1、我那位run存储过程的同事为什么run了1个多小时还没有run完是因为这个存储过程正在经历严重的library cache pin等待；

2、而为什么会导致严重的library cache pin等待是因为session 341和354联手达到了这一效果，即341以Share模式持有library cache pin，接着354想以Exclusive模式持有，这直接导致所有的后续请求全部被处于等待的Queue中。也就是说341阻塞了354，而354又间接阻塞了396。

既然知道了原因，那我们去看一下session 341在做什么事情：

SQL> select decode(sql_hash_value,0,prev_hash_value,sql_hash_value) from v$session where sid=341;

DECODE(SQL_HASH_VALUE,0,PREV_H

——————————

784727971

间隔10秒钟后再次执行：

SQL> select decode(sql_hash_value,0,prev_hash_value,sql_hash_value) from v$session where sid=341;

DECODE(SQL_HASH_VALUE,0,PREV_H

——————————

784727971

间隔10秒钟后再次执行：

SQL> select decode(sql_hash_value,0,prev_hash_value,sql_hash_value) from v$session where sid=341;

DECODE(SQL_HASH_VALUE,0,PREV_H

——————————

784727971

SQL> select sql_text from v$sqltext where hash_value=784727971 order by piece;

SQL_TEXT

—————————————————————-

begin – Call the procedure p_adj_rrp_main(o_vc_flag => _vc_flag); end;

从结果里可以看到341一直在run一个存储过程。

给持有341的那位大姐打电话，问她在做什么，她告诉我说她从昨晚就开始run这个存储过程，今早来看发现死掉了，所以她就没管了。

知道原因后处理起来还是很容易的，当我把session 341干掉后，整个系统的library cache pin一下子就降下来了，接着我那位同事的run了一个多小时的存储过程过了没多久就run完了。

来自 “ ITPUB博客 ” ，链接：http://blog.itpub.net/28278387/viewspace-746693/，如需转载，请注明出处，否则将追究法律责任。

转载于:http://blog.itpub.net/28278387/viewspace-746693/

congxiadui7455

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
一次library cache pin故障的解决过程

看了崔大师的关于一次library cache pin故障的解决过程，觉得很好，转过来与大家分享，分享大师的思路。原文链接：http://www.dbsnake.net/solve-library-cache-...
复制链接

扫一扫