library cache lock 案例分析!
今天是2013-11-4,早上8点对所维护系统进行巡检,意外的事情发生了,昨天刚刚深入研究了一下library cache lock这个等待事件,今天我的系统居然给我出了这么一个问题。
首先查看awr信息如下:
可以看到在top5事件中就出现了这个问题。
查看sql信息发现 sql 运行事件有一个sql特别的长,而且在进行truncate操作。看上去堵塞了。
随即查看ash报告,如下:
找到了相对应的sql,那么在深入查看一下问题:
执行:
select
distinct
ses.ksusenum sid, ses.ksuseser serial#, ses.ksuudlna username,KSUSEMNM module,
ob.kglnaown obj_owner, ob.kglnaobj obj_name
,lk.kgllkcnt lck_cnt, lk.kgllkmod lock_mode, lk.kgllkreq lock_req
, w.state, w.event, w.wait_Time, w.seconds_in_Wait
from
x$kgllk lk, x$kglob ob,x$ksuse ses
, v$session_wait w
where lk.kgllkhdl in
(select kgllkhdl from x$kgllk where kgllkreq >0 )
and ob.kglhdadr = lk.kgllkhdl
and lk.kgllkuse = ses.addr
and w.sid = ses.indx
order by seconds_in_wait desc
/
SID SERIAL# USERNAME MODULE OBJ_OWNER OBJ_NAME LCK_CNT LOCK_MODE LOCK_REQ STATE EVENT WAIT_TIME SECONDS_IN_WAIT
--- ---------- ----------------------------------------------------- ---------- ----------------------------------------------- ---------- ---------------
1029 38558 xxxxxxxWORK r1 xxxWORK RPT_PRJ_STAGE_INFO_T 1 2 0 WAITING db file parallel read 0 260435
1017 5910 xxxxxxxWORK r2 xxxxxxxWORK RPT_PRJ_STAGE_INFO_T 0 0 3 WAITING library cache lock 0 204437
1057 23077 xxxxxxxWORK r1 xxxxxxxWORK RPT_PRJ_STAGE_INFO_T 1 2 0 WAITING read by other session 0 1336
1023 39503 xxxxxxxWORK r1 xxxxxxxWORK RPT_PRJ_STAGE_INFO_T 1 2 0 WAITING read by other session 0 1276
941 10980 xxxxxxxWORK r1 xxxxxxxWORK RPT_PRJ_STAGE_INFO_T 1 2 0 WAITING read by other session 0 947
可以看到刚刚那条语句造成了如上的结果,目前一共有5个会话,1029持有了library cache lock,但是1017确要获得该对象的library cache object handle上的exclusive lock,由于1029没有被释放,因此1017也将会等待,这个时候其他会话在对该表进行的操作将进入sequence。
解决办法:
1)经过确认可以杀掉该1029会话。(可是我非常想抓取这个时间段的sql,在进行分析一下。可是没有时间来及做这个事情,希望可以通过awr的内容视图获得,在此就不写了)
随即alter system kill session ‘1029,38558’;
但是提示:
09:07:19 sys@DB>alter system kill session '1029,38558';
alter system kill session '1029,38558'
*
第一行出现错误:
ORA-00031: 标记要终止的会话。
2)查找spid进行kill
select spid, osuser, s.program from v$session s, v$process p where s.paddr = p.addr and s.sid =1029;
SPID OSUSER PROGRAM
------------ ------------------------------ ------------------------------------------------
12191 webpms2 JDBC Thin Client
kill -9 12191
问题得到解决:
SQL> select
2 distinct
3 ses.ksusenum sid, ses.ksuseser serial#, ses.ksuudlna username,KSUSEMNM module,
4 ob.kglnaown obj_owner, ob.kglnaobj obj_name
5 ,lk.kgllkcnt lck_cnt, lk.kgllkmod lock_mode, lk.kgllkreq lock_req
6 , w.state, w.event, w.wait_Time, w.seconds_in_Wait
7 from
8 x$kgllk lk, x$kglob ob,x$ksuse ses
9 , v$session_wait w
10 where lk.kgllkhdl in
11 (select kgllkhdl from x$kgllk where kgllkreq >0 )
12 and ob.kglhdadr = lk.kgllkhdl
13 and lk.kgllkuse = ses.addr
14 and w.sid = ses.indx
15 order by seconds_in_wait desc
16 /
SID SERIAL# USERNAME MODULE OBJ_OWNER OBJ_NAME LCK_CNT LOCK_MODE LOCK_REQ STATE EVENT WAIT_TIME SECONDS_IN_WAIT
---------- ---------- ------------------------------ ---------------------------------------------------------------- ---------------------------------------------------------------- -------------------------------------------------------------------------------- ---------- ---------- ---------- ------------------- ---------------------------------------------------------------- ---------- ---------------
没有数据了。
总结,该问题是由于一个sql*plus登录然后执行单一sql导致。对于这种问题,需要定位导致此问题的所有sql语句,并查看堵塞其他会话的真正原因。当然如果和我这次一样得到确认的时候,可以将该会话删除。