buffer busy waits

cuixie2370

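Buffer busy waits are often caused by very frequent inserts, by segments that need to be rebuilt, or by too few rollback segments.

When it occurs:
A block is being read into the buffer cache, or the buffer is in use by another session. The wait appears when the buffer is held in a non-shared (incompatible) mode or is still being read into the cache.
This value should not exceed 1%.

How to resolve it:
The situation can usually be tuned in several ways: enlarge the data buffer, add freelists, lower pctused, add rollback segments,
raise initrans, consider using LMT (locally managed tablespaces), and check whether hot blocks are the cause (if so, a reverse key index or a smaller block size may help).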

P1 = file# (Absolute File# in Oracle8 onwards)
P2 = block#
P3 = id (Reason Code)

Reason codes:
A block is being read
=====
100    We want to NEW the block, but the block is currently being read by another session (most likely for undo).
200    We want to NEW the block, but someone else is using the current copy, so we have to wait for them to finish.
230    Trying to get a buffer in CR/CRX mode, but a modification has started on the buffer that has not yet been completed.
      -  A modification is happening on a SCUR or XCUR buffer, but has not yet completed.
      (duplicate)  231  CR/CRX scan found the CURRENT block, but a modification has started on the buffer that has not yet been completed.
130    Block is being read by another session and no other suitable block image was found, so we wait until the read is completed. This may also occur after a buffer cache assumed deadlock. The kernel can't get a buffer in a certain amount of time and assumes a deadlock. Therefore it will read the CR version of the block.
110    We want the CURRENT block either shared or exclusive, but the block is being read into cache by another session, so we have to wait until their read() is completed.
      (duplicate)  120  We want to get the block in current mode, but someone else is currently reading it into the cache. Wait for them to complete the read. This occurs during buffer lookup.
210    The session wants the block in SCUR or XCUR mode. If this is a buffer exchange or the session is in discrete TX mode, the session waits the first time, and the second time escalates the block as a deadlock, so it does not show up as waiting very long. In this case the statistic "exchange deadlocks" is incremented and we yield the CPU for the "buffer deadlock" wait event.
      (duplicate)  220  During buffer lookup for a CURRENT copy of a buffer we have found the buffer, but someone holds it in an incompatible mode, so we have to wait.
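
When reading many rows of P1/P2/P3 values, a small lookup helper can make the reason codes easier to scan. The following is a minimal Python sketch, not part of the original note: the descriptions are shortened from the table above, the function names are hypothetical, and grouping the 1xx codes as "block is being read" waits is inferred from the descriptions rather than stated by Oracle.

```python
# Shortened descriptions of the P3 reason codes listed above.
REASON_CODES = {
    100: "NEW requested, block is being read by another session",
    110: "CURRENT requested, block is being read into cache by another session",
    120: "CURRENT requested, block is being read (seen during buffer lookup)",
    130: "block is being read and no other suitable image was found",
    200: "NEW requested, another session is using the current copy",
    210: "SCUR/XCUR requested (buffer exchange or discrete TX mode)",
    220: "CURRENT requested, buffer is held in an incompatible mode",
    230: "CR/CRX requested, a modification on the buffer is not yet complete",
    231: "CR/CRX scan found the CURRENT block with a modification in progress",
}


def describe_wait(p1, p2, p3):
    """Format one buffer busy wait (P1=file#, P2=block#, P3=reason code)."""
    reason = REASON_CODES.get(p3, "unknown reason code")
    return f"file#={p1} block#={p2} reason={p3}: {reason}"


def is_read_wait(p3):
    """Assumption: codes 100-199 all describe a block that is still being read."""
    return 100 <= p3 < 200


if __name__ == "__main__":
    # Hypothetical sample values, as they might come back from v$session_wait.
    print(describe_wait(4, 1234, 130))
    print(is_read_wait(220))
```

Splitting the waits into "being read" and "held by another session" is a quick first cut: the first group points toward I/O, the second toward concurrency on the block.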

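1. Buffer waits broken down by the kernel location (kcbwhdes) that requested the buffer:
SELECT kcbwhdes, why0+why1+why2 "Gets", "OTHER_WAIT"
FROM x$kcbsw s, x$kcbwh w
WHERE s.indx = w.indx
  AND s."OTHER_WAIT" > 0
ORDER BY 3;

2. Buffer waits per datafile:
SELECT count, file#, name
FROM x$kcbfwait, v$datafile
WHERE indx + 1 = file#
ORDER BY count;

3. Segments that have extents in a given file (&FILE_ID is the file# found above):
SELECT DISTINCT owner, segment_name, segment_type
FROM dba_extents
WHERE file_id = &FILE_ID;

4. Sessions currently waiting, with the file, block and reason code:
SELECT p1 "File", p2 "Block", p3 "Reason"
FROM v$session_wait
WHERE event = 'buffer busy waits';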
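
Query 3 returns every segment in the file. To narrow this down to the one segment that owns a specific block (the P2 value from query 4), the usual check is whether the block number falls inside an extent's range, from `block_id` to `block_id + blocks - 1`. The Python sketch below illustrates that range test on rows shaped like `dba_extents`. The sample extents and the `find_segment` helper are hypothetical and only show the logic.

```python
from typing import NamedTuple, Optional, Sequence


class Extent(NamedTuple):
    """The dba_extents columns needed to locate a block."""
    owner: str
    segment_name: str
    segment_type: str
    file_id: int
    block_id: int  # first block of the extent
    blocks: int    # number of blocks in the extent


def find_segment(extents: Sequence[Extent], file_id: int, block: int) -> Optional[Extent]:
    """Return the extent that contains (file_id, block), or None if there is none."""
    for ext in extents:
        if ext.file_id == file_id and ext.block_id <= block < ext.block_id + ext.blocks:
            return ext
    return None


if __name__ == "__main__":
    # Hypothetical rows; real ones would be selected from dba_extents.
    rows = [
        Extent("APP", "ORDERS", "TABLE", 4, 9, 8),
        Extent("APP", "ORDERS_PK", "INDEX", 4, 17, 8),
    ]
    hit = find_segment(rows, file_id=4, block=20)
    print(hit.segment_name if hit else "no segment found")
```

In SQL the same test is a predicate on `file_id` plus a range condition on `block_id` and `blocks`.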


Detailed Explanation of a Related Resolution
================
This document discusses a rare and difficult to diagnose database performance
problem characterized by extremely high buffer busy waits that occur at
seemingly random times.  The problem persists even after traditional buffer
busy wait tuning practices are followed (typically, increasing the number of
freelists for an object).

SCOPE & APPLICATION
-------------------

This document is intended for support analysts and customers.  It applies to
both Unix and Windows-based systems, although the examples here will be
particular to a Unix-based (Solaris) system.

In addition to addressing a specific buffer busy wait performance problem,
in section II, this document presents various techniques to diagnose and
resolve this problem by using detailed data from a real-world example.  The
techniques illustrated here may be used to diagnose other I/O and performance
problems.

RESOLVING INTENSE AND "RANDOM" BUFFER BUSY WAIT PERFORMANCE PROBLEMS
--------------------------------------------------------------------

This document is composed of two sections: a summary section that broadly
discusses the problem and its resolution, and a detailed diagnostics section
that shows how to collect and analyze various database and operating system
diagnostics related to this problem.  The detailed diagnostics section is
provided to help educate the reader with techniques that may be useful in
other situations.

I.  Summary
~~~~~~~~~~~

1.  Problem Description
~~~~~~~~~~~~~~~~~~~~~~~

At seemingly random times without regard to overall load on the database,
the following symptoms may be witnessed:

- Slow response times on an instance-wide level
- Long wait times for "buffer busy waits" in Bstat/Estat or Statspack reports
- Large numbers of sessions waiting on buffer busy waits for a group of
  objects (identified in v$session_wait)

Some tuning effort may have been spent in identifying the segments
involved in the buffer busy waits and rebuilding those segments with a higher
number of freelists or freelist groups (from 8.1.6 on one can dynamically add
process freelists; segments only need to be rebuilt if changing freelist
groups).  Even after adding freelists, the problem continues and is not
diminished in any way (although regular, concurrency-based buffer busy waits
may be reduced).


2.  Problem Diagnosis
~~~~~~~~~~~~~~~~~~~~~

The problem may be diagnosed by observing the following:

- The alert.log file shows many occurrences of ORA-600, ORA-7445 and
  core dumps during or just before the time of the problem.

- The core_dump_dest directory contains large core dumps during the
  time of the problem.  There may either be many core dumps or a few
  very large core dumps (100s of MB per core file), depending on the
  size of the SGA.
  (Check the files under cdump and their sizes.)

- sar -d shows devices that are completely saturated and have high
  request queues and service times.  These devices and/or their
  controllers are part of logical volumes used for database files.
  (Check disk usage.)

- Buffer busy waits, write complete waits, db file parallel writes and
  enqueue waits are high (in the top 5 waits, usually in that order).
  Note that in environments using Oracle Advanced Replication, the
  buffer busy waits may at times appear on segments related to
  replication (DEF$_AQCALL table, TRANORDER index, etc.).
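
Checking core_dump_dest for large files can be scripted. The sketch below is one possible way to do it in Python, not something the original note prescribes. The `large_files` name and the example path are hypothetical, and the 100 MB default only mirrors the "100s of MB per core file" remark above.

```python
import os


def large_files(directory, min_bytes=100 * 1024 * 1024):
    """Return (size, path) pairs for files of at least min_bytes, largest first."""
    found = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size >= min_bytes:
                found.append((size, path))
    return sorted(found, reverse=True)


if __name__ == "__main__":
    # Hypothetical location; use the instance's actual core_dump_dest value.
    for size, path in large_files("/u01/app/oracle/admin/PROD/cdump"):
        print(f"{size / (1024 * 1024):10.1f} MB  {path}")
```

Comparing the modification times of the files found with the time window of the buffer busy waits shows whether the dumps and the waits coincide.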


3.  Problem Resolution
~~~~~~~~~~~~~~~~~~~~~~

The cause for the buffer busy waits and other related waits might be a
saturated disk controller or subsystem impacting the database's ability to read
or write blocks.  The disk/controller may be saturated because of the many
core dumps occurring simultaneously requiring hundreds of megabytes each.  If
the alert.log or core_dump_dest directory has no evidence of core dumps, then
the source of the I/O saturation must be found.  It may be due to non-database
processes, another database sharing the same filesystems, or a poorly tuned
I/O subsystem.

The solution is as follows:

1) Find the root cause for the I/O saturation (core dumps, another process
   or database, or a poorly performing I/O subsystem) and resolve it.
OR,
2) If evidence of core dumps is found:
   -  Find the causes for the core dumps and resolve them (patch, etc.).
   -  Move the core_dump_dest location to a filesystem not shared with
      database files.
   -  Use the following init.ora parameters to reduce or avoid the core
      dumps:
         shadow_core_dump = partial
         background_core_dump = partial
      These core dump parameters can also be set to "none", but this is
      not recommended unless the causes for the core dumps have been
      identified.


B. SAR Diagnostics
~~~~~~~~~~~~~~~~~~

SAR, IOSTAT, or similar tools are critical to diagnosing this problem because
they show the health of the I/O system during the time of the problem.  The
SAR data for the example we are looking at is shown below (shown
using "sar -d -f /var/adm/sa/sa16"):

SunOS prod1 5.6 Generic_105181-23 sun4u 05/16/01

01:00:00  device   %busy  avque  r+w/s  blks/s  avwait  avserv

          sd22       100   72.4   2100    2971     0.0    87.0
          sd22,c       0    0.0      0       0     0.0     0.0
          sd22,d       0    0.0      0       0     0.0     0.0
          sd22,e     100   72.4   2100    2971     0.0    87.0
                            /\
                            ||
          extremely high queue values (usually less than 2 during peak)
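
With many devices and samples, picking out the saturated ones by eye gets tedious. The following Python sketch parses `sar -d` text in the column layout shown above and flags devices whose %busy and avque exceed given thresholds. It is an illustration under stated assumptions: the function names are hypothetical, the avque limit of 2 comes from the remark above, and the 90% busy limit is an arbitrary choice.

```python
def parse_sar_d(text):
    """Parse 'sar -d' device rows into dicts; non-data lines are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # The first row of each sample may start with an hh:mm:ss timestamp.
        if parts and parts[0].count(":") == 2:
            parts = parts[1:]
        if len(parts) != 7:
            continue
        try:
            busy, avque, rw_s, blks_s, avwait, avserv = (float(p) for p in parts[1:])
        except ValueError:
            continue  # header line such as "device %busy avque ..."
        rows.append({"device": parts[0], "busy": busy, "avque": avque,
                     "rw_s": rw_s, "blks_s": blks_s,
                     "avwait": avwait, "avserv": avserv})
    return rows


def saturated(rows, busy_pct=90.0, avque=2.0):
    """Return device names that are both very busy and have a long queue."""
    return [r["device"] for r in rows
            if r["busy"] >= busy_pct and r["avque"] > avque]


if __name__ == "__main__":
    sample = """\
01:00:00 device %busy avque r+w/s blks/s avwait avserv
         sd22     100  72.4  2100   2971    0.0   87.0
         sd22,c     0   0.0     0      0    0.0    0.0
         sd22,e   100  72.4  2100   2971    0.0   87.0
"""
    print(saturated(parse_sar_d(sample)))
```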

By mapping the sd22 device back to the device number (c3t8d0) and then back to
the logical volume through to the filesystem (using "df" and Veritas'
utility /usr/sbin/vxprint), it was determined that the filesystem shared the
same controller (c3) as several database files (among them were the datafiles
for the SYSTEM tablespace).

By looking within the filesystems using the aforementioned controller (c3),
several very large (1.8 GB) core dumps were found in the core_dump_dest
directory, produced around the time of the problem.

The following lists some key statistics to look at:

Statistic                        Total   per Second   per Trans
-----------------------   ------------   ----------   ---------
consistent changes              43,523         12.1         2.4
free buffer inspected            6,087          1.7         0.3   <== much higher than normal
free buffer requested          416,010        115.6        23.1
logons cumulative               15,718          4.4         0.9
physical writes                 24,757          6.9         1.4
write requests                     634          0.2         0.0

iii.  Tablespace I/O Summary

The average wait times for tablespaces will be dramatically higher.

Tablespace IO Summary for DB: PROD  Instance: PROD  Snaps: 3578 - 3579

                         Avg Read                  Total   Avg Wait
Tablespace        Reads      (ms)      Writes      Waits       (ms)
-----------  ----------  --------  ----------  ---------  ---------
BOM             482,368       7.0      18,865      3,161      205.9
CONF            157,288       0.6         420        779     9897.3  <= very high
CTXINDEX         36,628       0.5           7          4       12.5
RBS                 613     605.7      23,398      8,253     7694.6  <= very high
SYSTEM           18,360       3.6         286         78      745.5
DB_LOW_DATA      16,560       2.6       1,335         14       24.3

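For example, if the waits are caused by hot blocks, pctfree can be set to a larger value so that each block holds fewer rows, trading space for performance.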

================
Collecting wait event statistics for a time interval
================
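-- Snapshot the current cumulative wait statistics.
CREATE TABLE sinoview.previous_events AS
SELECT SYSDATE timestamp, v$system_event.*
FROM v$system_event;
-- Wait for the sampling interval (30 seconds here; SQL*Plus syntax).
EXECUTE dbms_lock.sleep (30);
-- Report the waits accumulated during the interval, excluding idle events.
SELECT A.event,
      A.total_waits
      - NVL (B.total_waits, 0) total_waits,
      A.time_waited
      - NVL (B.time_waited, 0) time_waited
FROM    v$system_event A, sinoview.previous_events B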
WHERE A.event NOT IN ('client message',
                     'dispatcher timer',
                     'gcs for action',
                     'gcs remote message',
                     'ges remote message',
                     'i/o slave wait',
                     'jobq slave wait',
                     'lock manager wait for remote message',
                     'null event',
                     'parallel query dequeue',
                     'pipe get',
                     'PL/SQL lock timer',
                     'pmon timer',
                     'PX Deq Credit: need buffer',
                     'PX Deq Credit: send blkd',
                     'PX Deq: Execute Reply',
                     'PX Deq: Execution Msg',
                     'PX Deq: Signal ACK',
                     'PX Deq: Table Q Normal',
                     'PX Deque wait',
                     'PX Idle Wait',
                     'queue messages',
                     'rdbms ipc message',
                     'slave wait',
                     'smon timer',
                     'SQL*Net message from client',
                     'SQL*Net message to client',
                     'SQL*Net more data from client',
                     'virtual circuit status',
                     'wakeup time manager' )
AND    B.event (+) = A.event
ORDER BY time_waited;
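
The query above subtracts an earlier snapshot from the current counters, treating events missing from the snapshot as zero (the outer join plus NVL). The same calculation can be sketched offline in Python, for example when the two snapshots were spooled to files. The `event_deltas` name and the sample numbers are hypothetical.

```python
def event_deltas(before, after, idle_events=frozenset()):
    """Compute per-event deltas between two v$system_event-style snapshots.

    before/after map event name -> (total_waits, time_waited).
    Events absent from 'before' count from zero, like NVL(B.x, 0) in the SQL.
    The result is sorted by time waited, ascending, like the ORDER BY clause.
    """
    deltas = []
    for event, (waits, waited) in after.items():
        if event in idle_events:
            continue
        prev_waits, prev_waited = before.get(event, (0, 0))
        deltas.append((event, waits - prev_waits, waited - prev_waited))
    return sorted(deltas, key=lambda d: d[2])


if __name__ == "__main__":
    # Hypothetical snapshot values for illustration only.
    before = {"buffer busy waits": (100, 500), "pmon timer": (10, 3000)}
    after = {"buffer busy waits": (160, 900),
             "db file sequential read": (40, 120),
             "pmon timer": (20, 6000)}
    for row in event_deltas(before, after, idle_events={"pmon timer"}):
        print(row)
```

Sorting ascending puts the most expensive events at the bottom of the output, which is convenient when the list scrolls in a terminal.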

http://metalink.oracle.com/metal ... OT&p_id=34405.1

http://www.**.org/viewthread.php?tid=64883

Source: "ITPUB Blog", link: http://blog.itpub.net/35489/viewspace-417497/. If you reproduce this article, please cite the source; otherwise legal liability will be pursued.

Reposted from: http://blog.itpub.net/35489/viewspace-417497/
