author:skate
time:2010/09/25
今天早晨刚到公司,同事就说数据库有问题,在添加赛程时,数据库没有响应(数据库前几天宕机,修复了一大堆600的错误)。
触发器无法禁止.
我边打开包拿出电脑边想,数据库没有响应,比较常见的原因是:
1. 数据库归档满了,会导致数据库无法进行任何操作(数据库处于禁止状态)
2. 数据库有等待事件,一般是部分错误无法进程
3. 数据库bug等原因
这个数据库是非归档模式,不会存在归档满的问题。这时候我的电脑也打开了,我登录数据库简单看了下,基本操作都可以。
又看了下alertlog日志文件,也没有报错信息,我尝试在pl/sql里disable触发器,结果pl/sql无响应。我用sqlplus登录数据
库,执行如下sql:
SYS@orcl> alter trigger tri_tb_ware_aiur disable;
alter trigger tri_tb_ware_aiur disable
*
ERROR at line 1:
ORA-01013: user requested cancel of current operation
也是半天无响应,最后ctrl+c终止会话。
于是用10046看看数据库在等待什么?
SYS@orcl> alter session set events ' 10046 trace name context forever, level 10';
Session altered.
SYS@orcl> alter trigger tri_tb_ware_aiur disable;
在user_dump_dest目路下找到刚刚生成的trace文件,文件内容如下:
/u01/app/oracle/admin/orcl/udump/orcl_ora_31868.trc:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.4.0 - Production
ORACLE_HOME = /u01/app/oracle/product/9.2.0
System name: Linux
Node name: localhost.localdomain
Release: 2.6.9-78.ELsmp
Version: #1 SMP Thu Jul 24 23:54:48 EDT 2008
Machine: x86_64
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 11
Unix process pid: 31868, image: oracle@localhost.localdomain (TNS V1-V3)
*** 2010-09-25 10:14:48.212
*** SESSION ID:(40.369) 2010-09-25 10:14:48.203
APPNAME mod='SQL*Plus' mh=3669949024 act='' ah=216172790438528344
=====================
PARSING IN CURSOR #1 len=70 dep=0 uid=80 oct=42 lid=216172786408751184 tim=125525477364476
0 hv=1280192016 ad='85be9bc0'
alter session set events ' 10046 trace name context forever, level 10'
END OF STMT
EXEC #1:c=0,e=31,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=4,tim=1255254773636089
WAIT #1: nam='SQL*Net message to client' ela= 3 p1=1650815232 p2=1 p3=0
*** 2010-09-25 10:14:58.590
WAIT #1: nam='SQL*Net message from client' ela= 10134574 p1=1650815232 p2=1 p3=0
XCTEND rlbk=0, rd_only=1
=====================
PARSING IN CURSOR #1 len=38 dep=0 uid=80 oct=60 lid=80 tim=1255254783779741 hv=2807330300
ad='884b7448'
alter trigger tri_tb_ware_aiur disable
END OF STMT
PARSE #1:c=0,e=167,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=4,tim=1255254783779738
WAIT #1: nam='library cache pin' ela= 2998918 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999772 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998696 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998836 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998878 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999887 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998685 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998857 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998877 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999853 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998537 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999071 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999074 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999068 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999015 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999789 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998746 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998707 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999766 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998729 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998794 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998853 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999827 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998696 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998840 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998870 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999721 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998785 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998881 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998923 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999847 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998788 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998865 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998874 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999819 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998709 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998867 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999756 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998705 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998775 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998844 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999813 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998764 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998721 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999897 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998883 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998819 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998687 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999729 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998712 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2998820 p1=6502926968 p2=6481288088 p3=303
WAIT #1: nam='library cache pin' ela= 2999842 p1=6502926968 p2=6481288088 p3=303
从trace文件内容可以看到,有都是'library cache pin',这个等待事件通常是发生在编译或重新编译pl/sql,view,procedure,type等object时出现的。
解决这个问题基本思路如下:
1. v$session_wait.p1raw 可以知道library cache pin等待的对象的handle的地址
2. 通过等待对象的handle可以和x$kglob.kglhdadr相关联,可以了解这个等待对象的基本信息,那这个对象的锁(pin)的情况是如何?
3. 再用x$kglob.kglhdadr关联x$kglpn.kglpnhdl,知道谁在持有这个pin(锁),在用x$kglpn.kglpnuse和v$session.saddr关联,了解持有锁(pin)的会话信息。如果要了解这个会话在等待什么,可以通过v$session_wait知道,也可以通v$session.sql_hash_value和v$session.sql_address等字段关联v$sqltext,v$sqlarea视图了解这个会话的动作.
因为通过10046事件已经知道数据库现在正在等待library cache pin,所以查询v$seission_wait
SQL> select sid,
2 seq#,
3 event,
4 p1,
5 p1raw,
6 p1text,
7 p2,
8 p2raw,
9 p2text,
10 p3,
11 p3raw,
12 p3text,
13 wait_time,
14 seconds_in_wait,
15 state
16 from v$session_wait
17 where event like 'library%';
SID SEQ# EVENT P1 P1RAW P1TEXT P2 P2RAW P2TEXT P3 P3RAW P3TEXT WAIT_TIME SECONDS_IN_WAIT STATE
---- ---------- --------------------------- ---------------- ---------------------------------------- ---------------- ---------------------------------------------------------------- ---------- ---------------- -----------------
23 481 library cache pin 6540622432 0000000185D9FA60 handle address 6482206672 00000001825E9FD0 pin address 301 000000000000012D 100*mode+namespace 0 32 WAITING
SQL>
P1RAW='0000000185D9FA60'是会话等待的对象的handle的值,通过这个值可以在x$kglob里查看这个对象的基本信息,如下
SQL> select addr,kglhdadr,kglhdpar,kglnaown,kglnaobj,kglnahsh,kglhdobj from sys.x$kglob where kglhdadr='0000000185D9FA60';
ADDR KGLHDADR KGLHDPAR KGLNAOWN KGLNAOBJ KGLNAHSH KGLHDOBJ
---------------- ---------------- ---------------- ----------------------------------------------------------------
0000002A965B38D0 0000000185D9FA60 0000000185D9FA60 SYS PINNG 1626462215 00000001843E6F68
SQL> select a.PADDR,
2 a.sid,
3 a.SERIAL#,
4 a.PROGRAM,
5 a.SQL_ADDRESS,
6 a.STATUS,
7 a.SQL_HASH_VALUE,
8 b.addr,
9 b.kglpnadr,
10 b.kglpnuse,
11 b.kglpnses,
12 b.kglpnhdl,
13 b.kglpnlck,
14 b.kglpnmod,
15 b.kglpnreq
16 from x$kglpn b, v$session a
17 where a.SADDR = b.kglpnuse
18 and b.kglpnhdl = '0000000185D9FA60'
19 and b.kglpnmod <> 0
20 ;
PADDR SID SERIAL# PROGRAM SQL_ADDRESS STATUS SQL_HASH_VALUE ADDR KGLPNADR KGLPNUSE KGLPNSES KGLPNHDL KGLPNLCK KGLPNMOD KGLPNREQ
---------------- ---------- ---------- ------------------------------------------------ ---------------- -------- -------------- ---------------- ---------------- ---------------- ---------------- ---------------- ------
0000000180433138 33 18 plsqldev.exe 0000000185D8ACD0 ACTIVE 3937586639 0000002A965CF1D0 00000001825EA050 00000001804A8D50 00000001804A8D50 0000000185D9FA60 00000001825E8E98 2 0
SQL>
查看持有该锁(pin)的会话的基本信息,到这个有如下几种分析情况:
1. 看字段status的值,如果这个值是“killed”,表明这个会话被kill了,那大家要问了,既然已经被kill,那为什么还要等待呢?因为虽然会话被标记为“killed”但会话持有的资源可没有释放,这个时候要查看视图v$process,查询会话对应os进程是否还存在。
select * from v$process where addr in ('&paddr'); ---这里变量paddr等于‘0000000180433138’
如果这个查询里有值,说明os进程没有释放资源,可以通过os命令: kill -9 spid 杀掉进程(spid是v$process对应os的服务器进程)。我这次遇到的问题就是这个原因,同事在跑程序的时候由于触发器里dblink的问题,导致操作中断,这个时候资源没有及时释放,而同事又执行sql:alter trigger trigger_name disable;无响应时,直接kill了这个会话,致使这个会话被标记为“killed”,但资源没有释放。所以才会出现“library cache pin”
2. 如果字段status的值为“ACTIVE”,说明持有pin的会话是活跃的,那就看看这个会话在等待什么,在执行什么操作
查看会话在等待什么?
select * from v$session_wait where sid=33
查看会话的动作是什么?
select * from v$sqlarea where hash_value='&sql_hash_value';
到此,library cache pin的原因已经找到剩下就是如何处理,一般快速处理是先kill掉看看是否可以解决,如果无法解决,那就看应用
相关快捷代码:
获得library cache pin 等待对象的信息:
select addr, kglhdadr, kglhdpar, kglnaown, kglnaobj, kglnahsh, kglhdobj
from x$kglob
where kglhdadr in
(select p1raw from v$session_wait where event like 'library%')
获得持有等待对象的session信息:
select a.PADDR,
a.sid,
a.SERIAL#,
a.PROGRAM,
a.SQL_ADDRESS,
a.STATUS,
a.SQL_HASH_VALUE,
b.addr,
b.kglpnadr,
b.kglpnuse,
b.kglpnses,
b.kglpnhdl,
b.kglpnlck,
b.kglpnmod,
b.kglpnreq
from x$kglpn b, v$session a
where a.SADDR = b.kglpnuse
and b.kglpnmod <> 0
and b.kglpnhdl in (select p1raw from v$session_wait where event like 'library%');
获得持有等待对象的会话执行的代码:
select *
from v$sqltext
where (address, hash_value) in
(select sql_address, sql_hash_value
from v$session
where sid in (select a.SID
from x$kglpn b, v$session a
where a.SADDR = b.kglpnuse
and b.kglpnmod <> 0
and b.kglpnhdl in
(select p1raw
from v$session_wait
where event like 'library%')))
order by piece
--------end--------