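buffer busy waits are often caused by very frequent inserts, by objects that need to be rebuilt, or by too few rollback segments.
When it occurs:
A block is being read into the buffer cache, or the buffer is in use by another session. The wait appears when the buffer
is held in a non-shareable mode or is still being read into the cache. This value should not exceed 1%.
Solutions:
This can usually be tuned in several ways: increase the data buffer, add freelists, lower pctused, add rollback segments,
increase initrans, consider using LMT, and check whether hot blocks are the cause (if so, consider a reverse-key index or a smaller block size).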
P1 = file# (Absolute File# in Oracle8 onwards)
P2 = block#
P3 = id (Reason Code)
Reason codes:
A block is being read
=====
100 We want to NEW the block but the block is currently being read by another session (most likely for undo).
200 We want to NEW the block but someone else is using the current copy so we have to wait for them to finish.
230 Trying to get a buffer in CR/CRX mode , but a modification has started on the buffer that has not yet been completed.
- A modification is happening on a SCUR or XCUR buffer, but has not yet completed
(dup.) 231 CR/CRX scan found the CURRENT block, but a modification has started on the buffer that has not yet been completed.
130 Block is being read by another session and no other suitable block image was found, so we wait until the read is completed. This may also occur after a buffer cache assumed deadlock. The kernel can't get a buffer in a certain amount of time and assumes a deadlock. Therefore it will read the CR version of the block.
110 We want the CURRENT block either shared or exclusive but the Block is being read into cache by another session, so we have to wait until their read() is completed.
(duplicate) 120 We want to get the block in current mode but someone else is currently reading it into the cache. Wait for them to complete the read. This occurs during buffer lookup.
210 The session wants the block in SCUR or XCUR mode. If this is a buffer exchange or the session is in discrete TX mode, the session waits for the first time and the second time escalates the block as a deadlock and so does not show up as waiting very long. In this case the statistic: "exchange deadlocks" is incremented and we yield the CPU for the "buffer deadlock" wait event.
(duplicate) 220 During buffer lookup for a CURRENT copy of a buffer we have found the buffer but someone holds it in an incompatible mode so we have to wait.
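The reason codes above can be kept in a small lookup table so that a P3 value taken from v$session_wait is quickly translated. The following is a minimal sketch in Python, not part of the original note: the short descriptions are paraphrases of the list above, and the split into two families (1xx versus 2xx) is an assumption read off those descriptions. The names `REASON_CODES` and `describe_reason` are hypothetical.

```python
# Short paraphrases of the P3 reason codes listed above.
REASON_CODES = {
    100: "NEW the block, but another session is reading it (most likely for undo)",
    110: "CURRENT block wanted, but another session is reading it into the cache",
    120: "current-mode get during buffer lookup, but the block is being read in",
    130: "block is being read by another session, no other suitable image found",
    200: "NEW the block, but another session is using the current copy",
    210: "SCUR/XCUR wanted during buffer exchange or discrete TX mode",
    220: "CURRENT copy found during lookup, but held in an incompatible mode",
    230: "CR/CRX get, but a modification on the buffer has not completed",
    231: "CR/CRX scan found the CURRENT block, modification not completed",
}


def describe_reason(code):
    """Return (family, description) for a P3 reason code."""
    description = REASON_CODES.get(code, "unknown reason code")
    # Assumption: in the list above every 1xx code involves a block still
    # being read, and every 2xx code a buffer held or changed by someone else.
    if 100 <= code < 200:
        family = "block is being read"
    elif 200 <= code < 300:
        family = "buffer is held or being modified"
    else:
        family = "unknown"
    return family, description


if __name__ == "__main__":
    for p3 in (130, 220):
        print(p3, describe_reason(p3))
```

Many waits in the first family point toward read I/O, while the second family points toward contention on buffers already in the cache.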
1. Buffer busy waits by the kernel location (kcbwhdes) that had to wait:
SELECT kcbwhdes, why0+why1+why2 "Gets", "OTHER_WAIT"
FROM x$kcbsw s, x$kcbwh w
WHERE s.indx = w.indx
AND s."OTHER_WAIT" > 0
ORDER BY 3;
2. Buffer waits per datafile:
SELECT count, file#, name
FROM x$kcbfwait, v$datafile
WHERE indx + 1 = file#
ORDER BY count;
3. Segments stored in a given file:
SELECT DISTINCT owner, segment_name, segment_type
FROM dba_extents
WHERE file_id = &FILE_ID;
4. Sessions currently waiting, with file, block and reason code:
SELECT p1 "File", p2 "Block", p3 "Reason"
FROM v$session_wait
WHERE event = 'buffer busy waits';
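Query 4 only shows the sessions waiting at one instant, so it is usually sampled several times. The sketch below is an illustration in Python and not part of the original note. It counts how often each (file, block) pair appears across the sampled rows, so that hot-block candidates stand out. The function name `hot_blocks` and the sample rows are hypothetical.

```python
from collections import Counter


def hot_blocks(rows, top=5):
    """Count waits per (file, block).

    rows: iterable of (file_no, block_no, reason) tuples, i.e. the
    P1, P2 and P3 columns returned by query 4.
    """
    counts = Counter((file_no, block_no) for file_no, block_no, _reason in rows)
    return counts.most_common(top)


if __name__ == "__main__":
    # Made-up sample rows gathered from a few runs of query 4.
    sample = [(7, 1201, 220), (7, 1201, 220), (7, 1201, 130), (9, 88, 110)]
    print(hot_blocks(sample))  # [((7, 1201), 3), ((9, 88), 1)]
```

A pair that dominates the result can then be mapped to its segment through dba_extents, as in query 3.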
Detailed discussion of related solutions
================
This document discusses a rare and difficult to diagnose database performance
problem characterized by extremely high buffer busy waits that occur at
seemingly random times. The problem persists even after traditional buffer
busy wait tuning practices are followed (typically, increasing the number of
freelists for an object).
SCOPE & APPLICATION
-------------------
This document is intended for support analysts and customers. It applies to
both Unix and Windows-based systems, although the examples here will be
particular to a Unix-based (Solaris) system.
In addition to addressing a specific buffer busy wait performance problem,
in section II, this document presents various techniques to diagnose and
resolve this problem by using detailed data from a real-world example. The
techniques illustrated here may be used to diagnose other I/O and performance
problems.
RESOLVING INTENSE AND "RANDOM" BUFFER BUSY WAIT PERFORMANCE PROBLEMS
--------------------------------------------------------------------
This document is composed of two sections; a summary section that broadly
discusses the problem and its resolution, and a detailed diagnostics section
that shows how to collect and analyze various database and operating system
diagnostics related to this problem. The detailed diagnostics section is
provided to help educate the reader with techniques that may be useful in
other situations.
I. Summary
~~~~~~~~~~~
1. Problem Description
~~~~~~~~~~~~~~~~~~~~~~~
At seemingly random times without regard to overall load on the database,
the following symptoms may be witnessed:
- Slow response times on an instance-wide level
- Long wait times for "buffer busy waits" in Bstat/Estat or Statspack reports
- large numbers of sessions waiting on buffer busy waits for a group of
objects (identified in v$session_wait)
Some tuning effort may have been spent in identifying the segments
involved in the buffer busy waits and rebuilding those segments with a higher
number of freelists or freelist groups (from 8.1.6 on one can dynamically add
process freelists; segments only need to be rebuilt if changing freelist
groups). Even after adding freelists, the problem continues and is not
diminished in any way (although regular, concurrency-based buffer busy waits
may be reduced).
2. Problem Diagnosis
~~~~~~~~~~~~~~~~~~~~~
The problem may be diagnosed by observing the following:
- The alert.log file shows many occurrences of ORA-600, ORA-7445 and
core dumps during or just before the time of the problem.
- The core_dump_dest directory contains large core dumps during the
time of the problem. There may either be many core dumps or a few
very large core dumps (100s of MB per core file), depending on the
size of the SGA.
(Check the files under cdump and their sizes.)
- sar -d shows devices that are completely saturated and have high
request queues and service times. These devices and/or their
controllers are part of logical volumes used for database files.
(Check disk usage.)
- Buffer busy waits, write complete waits, db file parallel writes and
enqueue waits are high (in top 5 waits, usually in that order).
Note that in environments using Oracle Advanced Replication, the
buffer busy waits may at times appear on segments related to
replication (DEF$_AQCALL table, TRANORDER index, etc...).
3. Problem Resolution
~~~~~~~~~~~~~~~~~~~~~~
The cause for the buffer busy waits and other related waits might be a
saturated disk controller or subsystem impacting the database's ability to read
or write blocks. The disk/controller may be saturated because of the many
core dumps occurring simultaneously requiring hundreds of megabytes each. If
the alert.log or core_dump_dest directory has no evidence of core dumps, then
the source of the I/O saturation must be found. It may be due to non-database
processes, another database sharing the same filesystems, or a poorly tuned
I/O subsystem.
The solution is as follows:
1) Find the root cause for the I/O saturation (core dumps,
another process or database, or poorly performing I/O
subsystem) and resolve it.
OR,
2) If evidence of core dumps is found:
- Find the causes for the core dumps and resolve
them (patch, etc)
- Move the core_dump_dest location to a filesystem
not shared with database files.
- Use the following init.ora parameters to reduce
or avoid the core dumps:
shadow_core_dump = partial
background_core_dump = partial
These core dump parameters can also be set to "none"
but this is not recommended unless the causes for the
core dumps have been identified.
B. SAR Diagnostics
~~~~~~~~~~~~~~~~~~
SAR, IOSTAT, or similar tools are critical to diagnosing this problem because
they show the health of the I/O system during the time of the problem. The
SAR data for the example we are looking at is shown below (shown
using "sar -d -f /var/adm/sa/sa16"):
SunOS prod1 5.6 Generic_105181-23 sun4u 05/16/01
01:00:00  device    %busy  avque  r+w/s blks/s avwait avserv
          sd22        100   72.4   2100   2971    0.0   87.0
          sd22,c        0    0.0      0      0    0.0    0.0
          sd22,d        0    0.0      0      0    0.0    0.0
          sd22,e      100   72.4   2100   2971    0.0   87.0
                             /\
                             ||
          extremely high queue values (usually less than 2 during peak)
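Scanning long sar reports by eye is error-prone, so the check above can be scripted. The following Python sketch is an illustration, not part of the original note. It flags devices whose %busy and avque both exceed a threshold. The queue limit of 2 comes from the remark above; the 90% busy threshold and the function name `saturated_devices` are assumptions.

```python
def saturated_devices(sar_lines, busy_pct=90.0, max_queue=2.0):
    """Return (device, %busy, avque) for devices above both thresholds."""
    flagged = []
    for line in sar_lines:
        fields = line.split()
        # A row may start with a timestamp such as 01:00:00; drop it.
        if fields and ":" in fields[0]:
            fields = fields[1:]
        # Expect: device %busy avque r+w/s blks/s avwait avserv
        if len(fields) != 7:
            continue
        try:
            busy, avque = float(fields[1]), float(fields[2])
        except ValueError:
            continue  # column header row, not data
        if busy >= busy_pct and avque > max_queue:
            flagged.append((fields[0], busy, avque))
    return flagged


if __name__ == "__main__":
    report = [
        "SunOS prod1 5.6 Generic_105181-23 sun4u 05/16/01",
        "01:00:00 device %busy avque r+w/s blks/s avwait avserv",
        "sd22 100 72.4 2100 2971 0.0 87.0",
        "sd22,c 0 0.0 0 0 0.0 0.0",
        "sd22,d 0 0.0 0 0 0.0 0.0",
        "sd22,e 100 72.4 2100 2971 0.0 87.0",
    ]
    print(saturated_devices(report))
```

With the sample report above, only sd22 and its slice sd22,e are returned.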
By mapping the sd22 device back to the device number (c3t8d0) and then back to
the logical volume through to the filesystem (using "df" and Veritas'
utility /usr/sbin/vxprint), it was determined the filesystem shared the same
controller (c3) as several database files (among them were the datafiles for
the SYSTEM tablespace).
By looking within the filesystems using the aforementioned controller (c3),
several very large (1.8 GB) core dumps were found in the core_dump_dest
directory, produced around the time of the problem.
The following lists some key statistics to look at:
Statistic                          Total   per Second    per Trans
----------------------- ---------------- ------------ ------------
consistent changes                43,523         12.1          2.4      Much
free buffer inspected              6,087          1.7          0.3  <== higher
free buffer requested            416,010        115.6         23.1      than
logons cumulative                 15,718          4.4          0.9      normal
physical writes                   24,757          6.9          1.4
write requests                       634          0.2          0.0
iii. Tablespace I/O Summary
The average wait times for tablespaces will be dramatically higher.
Tablespace IO Summary for DB: PROD  Instance: PROD  Snaps: 3578 - 3579
                        Avg Read                  Total Avg Wait
Tablespace        Reads     (ms)      Writes      Waits     (ms)
----------- ----------- -------- ----------- ---------- --------
BOM             482,368      7.0      18,865      3,161    205.9
CONF            157,288      0.6         420        779   9897.3  <= very high
CTXINDEX         36,628      0.5           7          4     12.5
RBS                 613    605.7      23,398      8,253   7694.6  <= very high
SYSTEM           18,360      3.6         286         78    745.5
DB_LOW_DATA      16,560      2.6       1,335         14     24.3
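For example, if the waits are caused by hot blocks, you can raise pctfree to a large value, trading space for performance.
================
Collecting wait event statistics over an interval
================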
CREATE TABLE sinoview.previous_events AS
SELECT SYSDATE timestamp, v$system_event.*
FROM v$system_event;
EXECUTE dbms_lock.sleep (30);
SELECT A.event,
A.total_waits
- NVL (B.total_waits, 0) total_waits,
A.time_waited
- NVL (B.time_waited, 0) time_waited
FROM v$system_event A, sinoview.previous_events B
WHERE A.event NOT IN ('client message',
'dispatcher timer',
'gcs for action',
'gcs remote message',
'ges remote message',
'i/o slave wait',
'jobq slave wait',
'lock manager wait for remote message',
'null event',
'parallel query dequeue',
'pipe get',
'PL/SQL lock timer',
'pmon timer',
'PX Deq Credit: need buffer',
'PX Deq Credit: send blkd',
'PX Deq: Execute Reply',
'PX Deq: Execution Msg',
'PX Deq: Signal ACK',
'PX Deq: Table Q Normal',
'PX Deque wait',
'PX Idle Wait',
'queue messages',
'rdbms ipc message',
'slave wait',
'smon timer',
'SQL*Net message from client',
'SQL*Net message to client',
'SQL*Net more data from client',
'virtual circuit status',
'wakeup time manager' )
AND B.event (+) = A.event
ORDER BY time_waited;
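The query above subtracts a stored snapshot of v$system_event from the current values. The same arithmetic is sketched below in Python, as an illustration only, for cases where the two snapshots have been exported to a script. The function name `wait_deltas` and the sample figures are hypothetical.

```python
def wait_deltas(before, after, idle_events=()):
    """Per-event difference between two v$system_event snapshots.

    before / after: dicts mapping event name -> (total_waits, time_waited).
    idle_events: event names to skip, like the NOT IN list in the query.
    """
    deltas = []
    for event, (waits, time_waited) in after.items():
        if event in idle_events:
            continue
        # An event missing from the first snapshot starts from zero,
        # which mirrors NVL(B.total_waits, 0) with the outer join.
        prev_waits, prev_time = before.get(event, (0, 0))
        deltas.append((event, waits - prev_waits, time_waited - prev_time))
    # Same ordering as the query: ascending by time waited.
    return sorted(deltas, key=lambda row: row[2])


if __name__ == "__main__":
    # Made-up snapshot values for illustration.
    first = {"buffer busy waits": (100, 500), "pmon timer": (10, 3000)}
    second = {
        "buffer busy waits": (160, 900),
        "pmon timer": (20, 6000),
        "write complete waits": (5, 40),
    }
    for row in wait_deltas(first, second, idle_events={"pmon timer"}):
        print(row)
```

Because the result is sorted ascending, the events that cost the most time during the interval appear last, as in the SQL version.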
http://metalink.oracle.com/metal ... OT&p_id=34405.1
http://www.**.org/viewthread.php?tid=64883
From the ITPUB blog, link: http://blog.itpub.net/35489/viewspace-417497/. If you repost this, please cite the source; otherwise legal liability will be pursued.