客户10.2.0.4 RAC环境,出现大量的library cache lock和cursor: pin S wait on X等待,经分析是由于统计信息收集僵死导致的。
[@more@]
数据库在8点到9点期间,数据库两个节点都存在明显的cursor: pin S wait on X和library cache lock的等待:
Event
Waits
Time(s)
Avg Wait(ms)
% Total Call Time
Wait Class
cursor: pin S wait on X
1,573,056
30,651
19
146.2
Concurrency
library cache lock
31,757
7,009
221
33.4
Concurrency
CPU time
6,416
30.6
DFS lock handle
12,381
2,979
241
14.2
Other
latch: library cache
1,646
1,974
1,199
9.4
Concurrency
Event
Waits
Time(s)
Avg Wait(ms)
% Total Call Time
Wait Class
cursor: pin S wait on X
17,586,451
342,437
19
812.1
Concurrency
library cache lock
63,657
30,153
474
71.5
Concurrency
CPU time
3,820
9.1
db file sequential read
241,761
1,094
5
2.6
User I/O
inactive session
1,105
1,090
986
2.6
Other
两个节点的等待现象基本一致,而节点1上还存在明显的DFS lock handle等待事件。
通过分析ASH信息,发现library cache lock和cursor: pin S wait on X等待基本上都是6点之后才开始出现:
SQL> select trunc(sample_time, 'hh24') time, count(*)
2 from WRH$_ACTIVE_SESSION_HISTORY ash, wrh$_event_name en
3 where ash.event_id = en.event_id
4 and sample_time >= to_timestamp('20130703', 'yyyymmdd')
5 and event_name in ('cursor: pin S wait on X', 'library cache lock')
6 group by trunc(sample_time, 'hh24')
7 order by 1;
TIME COUNT(*)
------------------- ----------
2013-07-03 00:00:00 4
2013-07-03 01:00:00 8
2013-07-03 02:00:00 3
2013-07-03 03:00:00 8
2013-07-03 04:00:00 9
2013-07-03 05:00:00 14
2013-07-03 06:00:00 348
2013-07-03 07:00:00 2138
2013-07-03 08:00:00 41576
2013-07-03 09:00:00 50108
2013-07-03 10:00:00 2452
而观察出现cursor: pin S wait on X和library cache lock等待会话的BLOCKING SESSION,发现大部分处于等待的会话被会话37锁定:
SQL> select to_char(sample_time, 'hh24:mi:ss.ff') time, session_id sid,
2 decode(en.event_name, 'cursor: pin S wait on X', to_number(substr(ltrim(to_char(p2, '0xxxxxxxxxxxxxxx')), 1, 8), 'xxxxxxxx'), blocking_session) b_sid,
3 en.event_name
4 from WRH$_ACTIVE_SESSION_HISTORY ash, wrh$_event_name en
5 where ash.event_id = en.event_id(+)
6 and instance_number = 1
7 and sample_time >= to_timestamp('201307030800', 'yyyymmddhh24mi')
8 and sample_time <= to_timestamp('201307030801', 'yyyymmddhh24mi')
9 order by 1;
TIME SID B_SID EVENT_NAME
------------------ ---------- ---------- ------------------------------
08:00:09.569 457 4294967291 latch: library cache
08:00:09.569 52 981 cursor: pin S wait on X
08:00:09.569 168 73 library cache lock
08:00:09.569 1763 4294967291 SQL*Net message from client
08:00:09.569 123 4294967295 kksfbc child completion
08:00:09.569 476 4294967291 kst: async disk IO
08:00:09.569 500 4294967292 db file sequential read
08:00:09.569 327 4294967291 SQL*Net message to client
08:00:09.569 233 4294967291 db file scattered read
08:00:09.569 73 73 DFS lock handle
08:00:09.569 1889 4294967291 gc current block 2-way
08:00:09.569 1969 981 cursor: pin S wait on X
08:00:09.569 857 981 cursor: pin S wait on X
08:00:09.569 812 168 cursor: pin S wait on X
08:00:09.569 981 73 library cache lock
08:00:19.669 52 981 cursor: pin S wait on X
08:00:19.669 812 168 cursor: pin S wait on X
08:00:19.669 857 981 cursor: pin S wait on X
08:00:19.669 1969 981 cursor: pin S wait on X
08:00:19.669 233 4294967291 db file scattered read
08:00:19.669 500 4294967292 db file sequential read
08:00:19.669 1763 4294967291 db file sequential read
08:00:19.669 476 4294967291 kst: async disk IO
08:00:19.669 457 4294967291 kksfbc child completion
08:00:19.669 73 73 DFS lock handle
08:00:19.669 654 4294967291 SQL*Net message to client
08:00:19.669 168 73 library cache lock
08:00:19.669 981 73 library cache lock
08:00:29.769 1969 981 cursor: pin S wait on X
08:00:29.769 52 981 cursor: pin S wait on X
08:00:29.769 457 4294967291 kksfbc child completion
08:00:29.769 981 73 library cache lock
08:00:29.769 168 73 library cache lock
08:00:29.769 500 4294967292 db file sequential read
08:00:29.769 73 73 DFS lock handle
08:00:29.769 654 4294967291 SQL*Net message to client
08:00:29.769 327 4294967291 SQL*Net message to client
08:00:29.769 1763 4294967291 db file sequential read
08:00:29.769 857 981 cursor: pin S wait on X
08:00:29.769 123 4294967295 latch: library cache
08:00:29.769 196 4294967295 latch: library cache
08:00:29.769 233 4294967295 latch: library cache
08:00:29.769 812 168 cursor: pin S wait on X
08:00:29.769 476 4294967291 kst: async disk IO
08:00:39.879 476 4294967291 kst: async disk IO
08:00:39.879 658 4294967291 db file sequential read
08:00:39.879 500 4294967292 db file sequential read
08:00:39.879 1969 981 cursor: pin S wait on X
08:00:39.879 857 981 cursor: pin S wait on X
08:00:39.879 812 168 cursor: pin S wait on X
08:00:39.879 52 981 cursor: pin S wait on X
08:00:39.879 981 73 library cache lock
08:00:39.879 168 73 library cache lock
08:00:39.879 1849 4294967291 SQL*Net message to client
08:00:39.879 327 4294967291 SQL*Net message to client
08:00:39.879 235 4294967291 rdbms ipc reply
08:00:39.879 19 4294967291 rdbms ipc reply
08:00:39.879 73 73 DFS lock handle
08:00:39.879 123 4294967294 enq: WF - contention
08:00:49.987 476 4294967291 kst: async disk IO
08:00:49.987 500 4294967292 db file sequential read
08:00:49.987 382 4294967291 gc cr block busy
08:00:49.987 1969 981 cursor: pin S wait on X
08:00:49.987 857 981 cursor: pin S wait on X
08:00:49.987 812 168 cursor: pin S wait on X
08:00:49.987 235 4294967295 kksfbc child completion
08:00:49.987 658 4294967295 latch: library cache
08:00:49.987 19 4294967291 latch: library cache
08:00:49.987 981 73 library cache lock
08:00:49.987 168 73 library cache lock
08:00:49.987 1889 4294967291 SQL*Net message from client
08:00:49.987 73 73 DFS lock handle
08:00:49.987 52 981 cursor: pin S wait on X
可以判断,导致数据库实例HANG死的原因是因为SID为73的进程:
SQL> select to_char(sample_time, 'hh24:mi:ss'), program, action, event_name
2 from wrh$_active_session_history ash, wrh$_event_name en
3 where session_id = 73
4 and ash.event_id = en.event_id(+)
5 and sample_time >= to_date('201307030550', 'yyyymmddhh24mi')
6 and sample_time <= to_date('201307030601', 'yyyymmddhh24mi')
7 order by 1;
TO_CHAR( PROGRAM ACTION EVENT_NAME
-------- ---------------------------------------- ------------------------------ ---------------------------
05:50:07 oracle@orcldb5 (J002) GATHER_STATS_JOB db file sequential read
05:50:17 oracle@orcldb5 (J002) GATHER_STATS_JOB db file sequential read
05:50:27 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:50:37 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:50:47 oracle@orcldb5 (J002) GATHER_STATS_JOB gc cr block 2-way
05:50:57 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:51:08 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:51:18 oracle@orcldb5 (J002) GATHER_STATS_JOB db file scattered read
05:51:28 oracle@orcldb5 (J002) GATHER_STATS_JOB gc cr multi block request
05:51:38 oracle@orcldb5 (J002) GATHER_STATS_JOB direct path write temp
05:51:48 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:51:58 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:52:08 oracle@orcldb5 (J002) GATHER_STATS_JOB latch: library cache
05:52:18 or
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/4227/viewspace-1060686/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/4227/viewspace-1060686/