检查alert.log,发现有死锁发生:
Thu Nov 18 09:32:59 2010
Global Enqueue Services Deadlock detected. More info in file
/oracle/app/oracle/admin/tradeb/bdump/tradeb1_lmd0_2306186.trc.
打开该trace 文件显示,查找BLOCKED,显示:
BLOCKED 700000f03f0edf0 5 wq 2 cvtops x1 [0x39002f][0x76b4e],[TX] [1034-0346-00000003] 0
BLOCKER 700000f03f0eca0 5 wq 1 cvtops x8 [0x39002f][0x76b4e],[TX] [1027-027B-00000003] 0
BLOCKED 700000f019a5e08 5 wq 2 cvtops x1 [0x1b6000d][0x2],[TX] [1027-027B-00000003] 0
BLOCKER 700000f019a5b68 5 wq 1 cvtops x8 [0x1b6000d][0x2],[TX] [1034-0346-00000003] 0
第一列为:阻塞/被阻塞
第二列为:lock pointer
第三列为:锁模式
第四列为:事务信息 将前2部分,补足8位连起来就是xid
最后一列为:节点信息,从0开始,1就是节点2
打开该trace 文件 tradeb1_lmd0_2306186.trc,查找deadlock关键字,看到sql了:
user session for deadlock lock 700000f019a5e08
pid=635 serial=5 audsid=954186 user: 33/HS_ASSET
O/S info: user: hundsun, term: pts/2, ospid: 30881, machine: htsc
program: hsserver@htsc (TNS V1-V3)
O/S info: user: hundsun, term: pts/2, ospid: 30881, machine: htsc
program: hsserver@htsc (TNS V1-V3)
application name: hsserver@htsc (TNS V1-V3), hash value=1000408464
Current SQL Statement:
update banktransfer set bktrans_status='1',report_time=:b0,bank_operator= case bktrans_status when 'P' then '?
?' else bank_operator end ,remark=substrb((trim(remark)||'|'),1,255) where ((serial_no=:b1 and init_date=:b2)
and (bktrans_status='0' or bktrans_status='P'))
user session for deadlock lock 700000f019a5b68
pid=838 serial=12 audsid=954383 user: 33/HS_ASSET
O/S info: user: hundsun, term: pts/2, ospid: 28379, machine: EESBG04M00
program: hsserver@EESBG04M00 (TNS V1-V3)
application name: hsserver@EESBG04M00 (TNS V1-V3), hash value=4199011500
Current SQL Statement:
update banktransfer set bktrans_status='1',report_time=:b0,bank_operator= case bktrans_status when 'P' then '?
?' else bank_operator end ,remark=substrb((trim(remark)||'|'),1,255) where ((serial_no=:b1 and init_date=:b2)
and (bktrans_status='0' or bktrans_status='P'))
使用Flashback Transaction Query,查找对应的事务信息:
select * from FLASHBACK_TRANSACTION_QUERY where xid=hextoraw('XXX');
注意这儿一定要用xid=hextoraw('XXX')而不能直接xid=xxx
使用xid=hextoraw('XXX')则会使用索引X$KTUQQRY(ind:1),很快;xid=xxx会是全表扫描SYS.X$KTUQQRY,非常慢。
[0x1b6000d][0x2]
select * from flashback_transaction_query where xid=hextoraw('01B6000D00000002')
查找历史快照,看该事务的信息:
[0x39002f][0x76b4e]
select * from dba_hist_active_sess_history where xid='0039002F00076B4E'
[0x1b6000d][0x2]
select * from dba_hist_active_sess_history where xid='01B6000D00000002'
根据得到的sql_id,也可以查到sql
0hbz88j3hxzh5
select * from v$sqltext_with_newlines where sql_id='0hbz88j3hxzh5' order by piece
查对应的绑定变量
select * from dba_hist_sqlbind where snap_id=11588 and instance_number=1 and sql_id='0hbz88j3hxzh5'
用logmnr挖掘:
先添加要挖掘的日志到挖掘列表
DBMS_LOGMNR.ADD_LOGFILE(LOGFILENAME=>'归档日志全路径',OPTIONS=> DBMS_LOGMNR.NEW);
DBMS_LOGMNR.ADD_LOGFILE(LOGFILENAME=>'归档日志全路径');
开始挖掘
DBMS_LOGMNR.START_LOGMNR();
查看结果
V$LOGMNR_CONTENTS
结束挖掘
DBMS_LOGMNR.END_LOGMNR
添加要logmnr的日志
DBMS_LOGMNR.ADD_LOGFILE (
LogFileName IN VARCHAR2,
options IN BINARY_INTEGER default ADDFILE );
options
DBMS_LOGMNR.NEW : Starts a new LogMiner session and a new list of redo log files for analysis.Implicitly calls the DBMS_LOGMNR.END_LOGMNR procedure to end the current LogMiner session and then creates a new session. The new session starts a new list of redo log files to be analyzed, beginning with the redo log file you specify
DBMS_LOGMNR.ADDFILE : Default. Adds a file to an existing list of redo log files for analysis
从要logmnr的日志列表中移除日志
DBMS_LOGMNR.REMOVE_LOGFILE (
LogFileName IN VARCHAR2);
启动logmnr
DBMS_LOGMNR.START_LOGMNR (
startScn IN NUMBER default 0,
endScn IN NUMBER default 0,
startTime IN DATE default '01-jan-1988',
endTime IN DATE default '31-dec-2110',
DictFileName IN VARCHAR2 default '',
Options IN BINARY_INTEGER default 0 );
结束logmnr
DBMS_LOGMNR.END_LOGMNR;
This procedure is called automatically when you log out of a database session or when you call DBMS_LOGMNR.ADD_LOGFILE and specify the NEW option
根据死锁时间和TRACE产生的时间点,我们可以得到这个死锁发生的时候的日志是记录在哪个SEQUENCE的日志中,如果日志已经归档,则需要LOGMINER相应的归档日志,如果还没有归档,则需要LOGMINER当前日志,另外注意,如果死锁发生在1和2两个节点上,需要把两个节点上对应的日志同时LOGMINER。然后使用下面的SQL进行查询:
SELECT *
FROM V$LOGMNR_CONTENTS
WHERE (XIDUSN = TO_NUMBER(SUBSTR('040d002e', 1, 4), 'XXXX') AND
XIDSLT = TO_NUMBER(SUBSTR('040d002e', 5, 8), 'XXXX') AND
XIDSQN = TO_NUMBER('9d0d8', 'XXXXXXXX'))
OR (XIDUSN = TO_NUMBER(SUBSTR('0008002b', 1, 4), 'XXXX') AND
XIDSLT = TO_NUMBER(SUBSTR('0008002b', 5, 8), 'XXXX') AND
XIDSQN = TO_NUMBER('2bc9d9', 'XXXXXXXX'))
ORDER BY 1, RS_ID, SSN;
其中XIDUSN和XIDSLT是由XID的前半部分转换而来的,不足8 位的左边补0够8位;XIDSQN是由XID的后半部分转换得到的,这部分也是8位左补0的,因为这块是否补0对结果没有影响,所以这里简单期间不补0。
根据ORACLE DOCUMENT,V$LOGMNR_CONTENTS中的每个记录是按照RS_ID和SSN组合来标识唯一的,翻了半天也没有找到哪个字段表示SQL执行的顺序,经过简单测试和猜测,如果是同一个节点的,则可以认为RS_ID和SSN排序就是SQL执行的顺序,如果是跨节点的,因为有不同的THREAD 来写REDO LOG,所以排序的时候需要加上第一列,也就是SCN列。
实例:
SQL> exec DBMS_LOGMNR.ADD_LOGFILE(LOGFILENAME=>'/arch01/1_9894_692893719.arc',OPTIONS=> DBMS_LOGMNR.NEW);
PL/SQL procedure successfully completed
SQL> exec DBMS_LOGMNR.START_LOGMNR (startTime=>to_date('201011180925','yyyymmddhh24mi'),endTime=>to_date('201011180940','yyyymmddhh24mi'));
PL/SQL procedure successfully completed
SQL> SELECT *
2 FROM V$LOGMNR_CONTENTS
3 WHERE (XIDUSN = TO_NUMBER(SUBSTR('0039002f', 1, 4), 'XXXX') AND
4 XIDSLT = TO_NUMBER(SUBSTR('0039002f', 5, 8), 'XXXX') AND
5 XIDSQN = TO_NUMBER('00076b4e', 'XXXXXXXX'))
6 OR (XIDUSN = TO_NUMBER(SUBSTR('01b6000d', 1, 4), 'XXXX') AND
7 XIDSLT = TO_NUMBER(SUBSTR('01b6000d', 5, 8), 'XXXX') AND
8 XIDSQN = TO_NUMBER('00000002', 'XXXXXXXX'))
9 ORDER BY 1, RS_ID, SSN;
SCN CSCN TIMESTAMP COMMIT_TIMESTAMP THREAD# LOG_ID XIDUSN XIDSLT XIDSQN PXIDUSN PXIDSLT PXIDSQN RBASQN RBABLK RBABYTE UBAFIL UBABLK UBAREC UBASQN ABS_FILE# REL_FILE# DATA_BLK# DATA_OBJ# DATA_OBJD# SEG_OWNER SEG_NAME TABLE_NAME SEG_TYPE SEG_TYPE_NAME TABLE_SPACE ROW_ID SESSION# SERIAL# USERNAME SESSION_INFO TX_NAME ROLLBACK OPERATION OPERATION_CODE SQL_REDO SQL_UNDO RS_ID SEQUENCE# SSN CSF INFO STATUS REDO_VALUE UNDO_VALUE SQL_COLUMN_TYPE SQL_COLUMN_NAME REDO_LENGTH REDO_OFFSET UNDO_LENGTH UNDO_OFFSET DATA_OBJV# SAFE_RESUME_SCN XID PXID AUDIT_SESSIONID
---------- ---------- ----------- ---------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------------------------------- -------------------------------------------------------------------------------- -------------------------------- ---------- -------------------------------- -------------------------------- ------------------ ---------- ---------- ------------------------------ -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- ---------- -------------------------------- -------------- -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- -------------------------------- ---------- ---------- ---------- -------------------------------- ---------- ---------- ---------- ------------------------------ ------------------------------ ----------- ----------- ----------- ----------- ---------- --------------- ---------------- ---------------- ---------------
5850638177 2010-11-18 1 9894 57 47 486222 57 47 486222 9894 1202378 392 3 0 0 0 0 0 0 0 0 0 AAAAAAAAAAAAAAAAAA 0 0 UNKNOWN UNKNOWN 0 START 6 set transaction read write; 0x0026a6.001258ca.0188 1 0 0 0 520064 520065 0 0 0 0 0 0039002F00076B4E 0039002F00076B4E 0
... ...
5850644940 2010-11-18 1 9894 57 47 486222 57 47 486222 9894 1262907 384 4 17623108 44 10027 14 0 0 246836 254130 0 AAA+CyAAAAAAAAAAAA 0 0 UNKNOWN UNKNOWN 0 INTERNAL 0 0x0026a6.0013453b.0180 1 0 0 0 625030 625031 0 0 0 0 0 0039002F00076B4E 0039002F00076B4E 0
5850644942 5850644942 2010-11-18 2010-11-18 09:33 1 9894 57 47 486222 57 47 486222 9894 1262981 88 4 0 0 0 3 0 0 0 0 0 AAAAAAAAAAAAAAAAAA 0 0 UNKNOWN UNKNOWN 0 COMMIT 7 commit; 0x0026a6.00134585.0058 1 0 0 0 625178 625179 0 0 0 0 0 58506382000 0039002F00076B4E 0039002F00076B4E 0
29 rows selected
SQL> exec DBMS_LOGMNR.END_LOGMNR;
PL/SQL procedure successfully completed
SQL>