hanganalye使用示例
Hanganalyze工具使用内核调用检测会话在等待什么资源,报告出占有者和等待者的相互关系。另外,它还会将一些比较”interesting”的进程状态dump出来,这个取决于我们使用hanganalyze的分析级别。
hanganalyze 可以用于查找数据库hang的原因, 也可以用于其他一些等待事件的查找例如 library cache pin 等等地事件.
1、
介绍下
hanganalyze
级别的定义10 Dump all processes (IGN state)
5 Level 4 + Dump all processes involved in wait chains (NLEAF state)
4
Level 3 + Dump leaf nodes (blockers) in wait chains (LEAF,LEAF_NW,IGN_DMP state)
3 Level 2 + Dump only processes thought to be in a hang (IN_HANG state)
1-2
Only HANGANALYZE output, no process dump at all
一般情况下,我们都定义到
级别
3
,
metalink
上说,级别
3
以上,需要
oracle support
的人告知后,才能用。
2、数据库版本
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - 64bit Production
PL/SQL Release 11.1.0.6.0 - Production
CORE 11.1.0.6.0 Production
TNS for Linux: Version 11.1.0.6.0 - Production
NLSRTL Version 11.1.0.6.0 - Production
3、创建测试表
SQL> create table wytest (id number,name varchar2(10)) tablespace users;
4、插入测试数据
SQL> insert into wytest values(1,'111');
1 row created.
SQL> insert into wytest values(2,'222');
1 row created.
SQL> commit;
Commit complete.
5、分别打开3个session 170,137,129,用session170,137模拟会话阻塞,session129执行hanganalye工具生成跟踪文件,如下
session 170:
SQL> update wytest set name='test' where id = 1;
1 row updated.
--未commit
session 137:
SQL> update wytest set name = 'test1' where id = 1;
--会话阻塞
session 129:
--查看系统锁,结果显示:sid=170的session阻塞了sid=137的session
SQL> SELECT SID, TYPE, id1, id2, lmode, request, BLOCK
2 FROM v$lock
3 WHERE request <> 0 OR BLOCK <> 0
4 ORDER BY SID;
SID TY ID1 ID2 LMODE REQUEST BLOCK
---------- -- ---------- ---------- ---------- ---------- ----------
137 TX 65554 649 0 6 0
170 TX 65554 649 6 0 1
--执行hanganalye,生成的trace文件直接显示出来,比较人性化,呵呵
SQL> oradebug setmypid
Statement processed.
SQL> oradebug setinst all
Statement processed.
SQL> oradebug hanganalye level 3;
Hang Analysis in /u01/app/oracle/diag/rdbms/lndtest/lndtest/trace/lndtest_ora_3556.trc
也可用下面的语句:
ALTER SESSION SET EVENTS 'immediate trace name HANGANALYZE level <level>';
以下语句会显示trace文件路径
SQL>oradebug tracefile_name
--查看生成的跟踪文件
SQL> !more /u01/app/oracle/diag/rdbms/lndtest/lndtest/trace/lndtest_ora_3556.trc
Trace file /u01/app/oracle/diag/rdbms/lndtest/lndtest/trace/lndtest_ora_3556.trc
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.1.0.6.0/db_1
System name: Linux
Node name: rctvm
Release: 2.6.18-164.el5
Version: #1 SMP Tue Aug 18 15:51:48 EDT 2009
Machine: x86_64
Instance name: lndtest
Redo thread mounted by this instance: 1
Oracle process number: 34
Unix process pid: 3556, image: oracle@rctvm (TNS V1-V3)
*** 2012-05-25 17:17:05.926
*** SESSION ID:(129.5) 2012-05-25 17:17:05.926
*** CLIENT ID:() 2012-05-25 17:17:05.926
*** SERVICE NAME:(SYS$USERS) 2012-05-25 17:17:05.926
*** MODULE NAME:(sqlplus@rctvm (TNS V1-V3)) 2012-05-25 17:17:05.926
*** ACTION NAME:() 2012-05-25 17:17:05.926
Processing Oradebug command 'setmypid'
*** 2012-05-25 17:17:05.926
Oradebug command 'setmypid' console output: <none>
*** 2012-05-25 17:17:20.072
Processing Oradebug command 'setinst all'
*** 2012-05-25 17:17:20.145
Oradebug command 'setinst all' console output: <none>
*** 2012-05-25 17:18:01.228
Processing Oradebug command 'hanganalyze 3'
*** 2012-05-25 17:18:02.799
===============================================================================
HANG ANALYSIS:
instances (db_name.oracle_sid): lndtest.lndtest
oradebug_node_dump_level: 3
analysis initiated by oradebug
===============================================================================
Chains most likely to have caused the hang:
[a] Chain 1 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
Chain 1 Signature Hash: 0x38c48850
===============================================================================
Non-intersecting chains:
-------------------------------------------------------------------------------
Chain 1:
从此处开始正式进入我们想要的跟踪信息
-------------------------------------------------------------------------------
Oracle session identified by:
{
instance: 1 (lndtest.lndtest)
os id: 3458
process id: 27, oracle@rctvm (TNS V1-V3)
session id: 137
--被阻塞的sid
session serial #: 1
}
is waiting for 'enq: TX - row lock contention' with wait info:
--正在等待行锁
{
p1: 'name|mode'=0x54580006
p2: 'usn<<16 | slot'=0x10012
p3: 'sequence'=0x289
time in wait: 146.133448 secs
timeout after: never
wait id: 20
blocking: 0 sessions
current sql: update wytest set name = 'test1' where id = 1
--被阻塞的sql
short stack: <-ksedsts()+315<-ksdxfstk()+32<-ksdxcb()+1764<-sspuser()+112<-__restore_rt()<-semtimedop()+10<-sskgpwwait()+265<-skgpwwait()+135<-ksli
wat()+1396<-kslwaitctx()+162<-ksqcmi()+4799<-ksqgtlctx()+1159<-ksqgelctx()+571<-ktcwit1()+333<-kdddgb()+4338<-kdusru()+11909<-updrowFastPath()+1217<-qerupFetch(
)+3665<-updaul()+1349<-updThreePhaseExe()+332<-updexe()+611<-opiexe()+14570<-kpoal8()+2805<-opiodr()+1178<-ttcpip()+1211<-opitsk()+1455<-opiino()+1026<-opiodr()
+1178<-opidrv()+580<-sou2o()+90<-opimai_real()+145<-ssthrdmain()+
wait history:
1. event: 'SQL*Net message from client'
wait id: 19 p1: 'driver id'=0x62657100
time waited: 96.104246 secs p2: '#bytes'=0x1
2. event: 'SQL*Net message to client'
wait id: 18 p1: 'driver id'=0x62657100
time waited: 0.000004 secs p2: '#bytes'=0x1
3. event: 'log file sync'
wait id: 17 p1: 'buffer#'=0x8b
time waited: 0.000407 secs
}
and is blocked by
--被哪个session阻塞
=> Oracle session identified by:
{
instance: 1 (lndtest.lndtest)
os id: 3425
process id: 18, oracle@rctvm (TNS V1-V3)
session id: 170
--产生阻塞的sid
session serial #: 5
}
which is waiting for 'SQL*Net message from client' with wait info:
{
p1: 'driver id'=0x62657100
p2: '#bytes'=0x1
time in wait: 192.963281 secs
timeout after: never
wait id: 3657
blocking: 1 session
current sql: <none>
short stack: <-ksedsts()+315<-ksdxfstk()+32<-ksdxcb()+1764<-sspuser()+112<-__restore_rt()<-read()+14<-sntpread()+35<-ntpfprd()+115<-nsbasic_brc()+1
338<-nsbrecv()+129<-nioqrc()+2820<-__PGOSF19_opikndf2()+782<-opitsk()+672<-opiino()+1026<-opiodr()+1178<-opidrv()+580<-sou2o()+90<-opimai_real()+145<-ssthrdmain
()+177<-main()+215<-__libc_start_main()+244<-_start()+41
wait history:
1. event: 'SQL*Net message to client'
wait id: 3656 p1: 'driver id'=0x62657100
time waited: 0.000005 secs p2: '#bytes'=0x1
2. event: 'SQL*Net message from client'
wait id: 3655 p1: 'driver id'=0x62657100
time waited: 31.961731 secs p2: '#bytes'=0x1
3. event: 'SQL*Net message to client'
wait id: 3654 p1: 'driver id'=0x62657100
time waited: 0.000002 secs p2: '#bytes'=0x1
}
Chain 1 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
Chain 1 Signature Hash: 0x38c48850
-------------------------------------------------------------------------------
===============================================================================
Extra information that will be dumped at higher levels:
[level 4] : 1 node dumps -- [LEAF] [LEAF_NW]
[level 5] : 1 node dumps -- [NO_WAIT] [INVOL_WT] [SINGLE_NODE] [NLEAF] [SINGLE_NODE_NW]
State of ALL nodes
([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist]):
[136]/1/137/1/0x9c8d98b0/3458/NLEAF/[169]
[169]/1/170/5/0x9c931a68/3425/LEAF/
*** 2012-05-25 17:18:02.800
===============================================================================
END OF HANG ANALYSIS
===============================================================================
*** 2012-05-25 17:18:02.800
===============================================================================
HANG ANALYSIS DUMPS:
oradebug_node_dump_level: 3
===============================================================================
State of LOCAL nodes
([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist]):
[136]/1/137/1/0x9c8d98b0/3458/NLEAF/[169]
[169]/1/170/5/0x9c931a68/3425/LEAF/
No processes qualify for dumping.
===============================================================================
HANG ANALYSIS DUMPS: END
===============================================================================
*** 2012-05-25 17:18:02.800
Oradebug command 'hanganalyze 3' console output:
Hang Analysis in /u01/app/oracle/diag/rdbms/lndtest/lndtest/trace/lndtest_ora_3556.trc
6、结论:
通过trace信息很清楚的看到被阻塞的session,sql语句,以及产生阻塞的session信息,其它后者才是我们要找的目标,在紧急情况下KILL掉产生阻塞的session
SQL> alter system kill session '170,5';