Relearning the Log Buffer

傻儿哥

Article link: https://blog.csdn.net/csucxcc/article/details/5885186

Configuring and Using the Redo Log Buffer

Server processes making changes to data blocks in the buffer cache generate redo data
into the log buffer. LGWR begins writing to copy entries from the redo log buffer to
the online redo log if any of the following are true:

■ The log buffer becomes one third full.
■ LGWR is posted by a server process performing a COMMIT or ROLLBACK.
■ DBWR posts LGWR to do so.
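
The three conditions above can be written as a single predicate. This is only a toy model for reasoning about them, not Oracle's internal logic; the function name and its parameters are invented for this sketch.

```python
def lgwr_should_write(used_bytes: int, buffer_bytes: int,
                      commit_or_rollback_posted: bool = False,
                      dbwr_posted: bool = False) -> bool:
    """Return True if any of the three listed conditions holds (toy model)."""
    # Condition 1: the log buffer is at least one third full.
    one_third_full = used_bytes * 3 >= buffer_bytes
    # Conditions 2 and 3: LGWR was posted by a COMMIT/ROLLBACK or by DBWR.
    return one_third_full or commit_or_rollback_posted or dbwr_posted


buffer_bytes = 512 * 1024  # hypothetical 512 KB log buffer
print(lgwr_should_write(100_000, buffer_bytes))   # False: under one third
print(lgwr_should_write(200_000, buffer_bytes))   # True: over one third
print(lgwr_should_write(0, buffer_bytes, commit_or_rollback_posted=True))  # True
```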

When LGWR writes redo entries from the redo log buffer to a redo log file or disk,
user processes can then copy new entries over the entries in memory that have been
written to disk. LGWR usually writes fast enough to ensure that space is available in
the buffer for new entries, even when access to the redo log is heavy.

A larger buffer makes it more likely that there is space for new entries, and also gives
LGWR the opportunity to efficiently write out redo records (too small a log buffer on a
system with large updates means that LGWR is continuously flushing redo to disk so
that the log buffer remains 2/3 empty).

On systems with fast processors and relatively slow disks, the processors might be
filling the rest of the buffer in the time it takes the redo log writer to move a portion of
the buffer to disk. A larger log buffer can temporarily mask the effect of slower disks
in this situation. Alternatively, you can do one of the following:

■ Improve the checkpointing or archiving process
■ Improve the performance of log writer (perhaps by moving all online logs to fast
raw devices)

Good usage of the redo log buffer is a simple matter of:

■ Batching commit operations for batch jobs, so that log writer is able to write redo
log entries efficiently
■ Using NOLOGGING operations when you are loading large quantities of data

The size of the redo log buffer is determined by the initialization parameter
LOG_BUFFER. The log buffer size cannot be modified after instance startup.

Figure 7– Redo Log Buffer

log buffer space
This event occurs when server processes are waiting for free space in the log buffer,
because all the redo is generated faster than LGWR can write it out.

Actions
Modify the redo log buffer size. If the size of the log buffer is already reasonable, then
ensure that the disks on which the online redo logs reside do not suffer from I/O
contention. The log buffer space wait event could be indicative of either disk I/O
contention on the disks where the redo logs reside, or of a too-small log buffer. Check
the I/O profile of the disks containing the redo logs to investigate whether the I/O
system is the bottleneck. If the I/O system is not a problem, then the redo log buffer
could be too small. Increase the size of the redo log buffer until this event is no longer
significant.

log file switch
There are two wait events commonly encountered:
■ log file switch (archiving needed)
■ log file switch (checkpoint incomplete)
In both of the events, the LGWR is unable to switch into the next online redo log, and
all the commit requests wait for this event.

Actions
For the log file switch (archiving needed) event, examine why the archiver is
unable to archive the logs in a timely fashion. It could be due to the following:

■ Archive destination is running out of free space.
■ Archiver is not able to read redo logs fast enough (contention with the LGWR).
■ Archiver is not able to write fast enough (contention on the archive destination, or
not enough ARCH processes). If you have ruled out other possibilities (such as
slow disks or a full archive destination) consider increasing the number of ARCn
processes. The default is 2.

■ If you have mandatory remote shipped archive logs, check whether this process is
slowing down because of network delays or the write is not completing because of
errors.

Depending on the nature of bottleneck, you might need to redistribute I/O or add
more space to the archive destination to alleviate the problem. For the log file
switch (checkpoint incomplete) event:

■ Check if DBWR is slow, possibly due to an overloaded or slow I/O system. Check
the DBWR write times, check the I/O system, and distribute I/O if necessary. See
Chapter 8, "I/O Configuration and Design".

■ Check if there are too few, or too small redo logs. If you have a few redo logs or
small redo logs (for example two x 100k logs), and your system produces enough
redo to cycle through all of the logs before DBWR has been able to complete the
checkpoint, then increase the size or number of redo logs. See "Sizing Redo Log
Files" on page 4-3
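
As a rough back-of-the-envelope check of the "too few, or too small" case, the time needed to cycle through all online logs can be compared with the time a checkpoint takes. The sketch below is a simplified model; the redo rate and checkpoint duration are hypothetical inputs you would have to measure on your own system.

```python
def seconds_to_cycle_logs(log_count: int, log_size_bytes: int,
                          redo_bytes_per_sec: float) -> float:
    """Time to fill every online redo log once at a steady redo rate."""
    return log_count * log_size_bytes / redo_bytes_per_sec


def checkpoint_keeps_up(log_count: int, log_size_bytes: int,
                        redo_bytes_per_sec: float,
                        checkpoint_seconds: float) -> bool:
    """True if the checkpoint finishes before the logs wrap around."""
    cycle = seconds_to_cycle_logs(log_count, log_size_bytes, redo_bytes_per_sec)
    return cycle > checkpoint_seconds


KB, MB = 1024, 1024 * 1024
redo_rate = 50 * KB      # hypothetical: 50 KB of redo per second
checkpoint_time = 10.0   # hypothetical: a checkpoint takes 10 seconds

# Two 100 KB logs (the example from the text) wrap around in 4 seconds.
print(seconds_to_cycle_logs(2, 100 * KB, redo_rate))                 # 4.0
print(checkpoint_keeps_up(2, 100 * KB, redo_rate, checkpoint_time))  # False
# Four 50 MB logs give the checkpoint far more time.
print(checkpoint_keeps_up(4, 50 * MB, redo_rate, checkpoint_time))   # True
```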

log file sync
When a user session commits (or rolls back), the session's redo information must be
flushed to the redo logfile by LGWR. The server process performing the COMMIT or
ROLLBACK waits under this event for the write to the redo log to complete.

Actions
If this event's waits constitute a significant wait on the system or a significant amount
of time waited by a user experiencing response time issues or on a system, then
examine the average time waited.
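
The average time waited is simply total wait time divided by the number of waits, taken over the interval of interest. A minimal sketch, assuming you have two snapshots of the cumulative wait count and cumulative wait time for log file sync (the function name and the millisecond unit are choices made for this example):

```python
def average_wait_ms(waits_before: int, time_ms_before: float,
                    waits_after: int, time_ms_after: float) -> float:
    """Average wait per event between two cumulative snapshots."""
    waits = waits_after - waits_before
    if waits <= 0:
        return 0.0  # no new waits in the interval
    return (time_ms_after - time_ms_before) / waits


# Hypothetical snapshots: 20,000 new waits costing 40,000 ms in total.
print(average_wait_ms(1_000, 5_000.0, 21_000, 45_000.0))  # 2.0
```

Working from the difference between two snapshots rather than the raw cumulative values keeps activity from before the interval out of the average.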

If the average time waited is low, but the number of waits are high, then the
application might be committing after every INSERT, rather than batching COMMITs.

Applications can reduce the wait by committing after 50 rows, rather than every row.
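
The effect of batching on the number of commits is easy to see in a small runnable example. The sketch below uses Python's built-in sqlite3 module as a stand-in, purely to count commits. It is not Oracle, and the table `t` and the helper names are invented for the illustration.

```python
import sqlite3


def load_rows(conn, rows, batch_size):
    """Insert rows, committing once every batch_size rows; return commit count."""
    commits = 0
    pending = 0
    for row in rows:
        conn.execute("INSERT INTO t (id, payload) VALUES (?, ?)", row)
        pending += 1
        if pending == batch_size:
            conn.commit()
            commits += 1
            pending = 0
    if pending:  # commit the final partial batch
        conn.commit()
        commits += 1
    return commits


def demo(batch_size, n_rows=1000):
    """Load n_rows into a fresh in-memory table; return (commits, rows stored)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
    rows = [(i, f"row {i}") for i in range(n_rows)]
    commits = load_rows(conn, rows, batch_size)
    (count,) = conn.execute("SELECT COUNT(*) FROM t").fetchone()
    conn.close()
    return commits, count


print(demo(1))    # (1000, 1000): one commit per row
print(demo(50))   # (20, 1000): one commit per 50 rows
```

In Oracle terms each of those commits is a potential log file sync wait, so the second run would wait 20 times instead of 1000.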

If the average time waited is high, then examine the session waits for the log writer
and see what it is spending most of its time doing and waiting for. If the waits are
because of slow I/O, then try the following:

■ Reduce other I/O activity on the disks containing the redo logs, or use dedicated
disks
■ Alternate redo logs on different disks to minimize the effect of the archiver on the
log writer.
■ Move the redo logs to faster disks or a faster I/O subsystem (for example, switch
from RAID 5 to RAID 1).
■ Consider using raw devices (or simulated raw devices provided by disk vendors)
to speed up the writes.
■ Depending on the type of application, it might be possible to batch COMMITs by
committing every N rows, rather than every row, so that fewer log file syncs are
needed.

Log Buffer Statistics
The statistic REDO BUFFER ALLOCATION RETRIES reflects the number of times a user
process waits for space in the redo log buffer. This statistic can be queried through the
dynamic performance view V$SYSSTAT.

Use the following query to monitor these statistics over a period of time while your
application is running:
SELECT NAME, VALUE
FROM V$SYSSTAT
WHERE NAME = 'redo buffer allocation retries';

The value of redo buffer allocation retries should be near zero over an
interval. If this value increments consistently, then processes have had to wait for
space in the redo log buffer.
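
Because the statistic is cumulative, what matters is how much it grows between samples. A minimal sketch, assuming you have recorded the VALUE column at regular intervals into a list (the sample numbers are made up):

```python
def retry_increments(samples):
    """Per-interval increase of a cumulative counter."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def grows_consistently(samples):
    """True if the counter increased in every sampled interval."""
    deltas = retry_increments(samples)
    return bool(deltas) and all(delta > 0 for delta in deltas)


healthy = [120, 120, 121, 121]   # hypothetical samples: nearly flat
starved = [120, 135, 160, 190]   # hypothetical samples: rising every interval

print(retry_increments(healthy))     # [0, 1, 0]
print(grows_consistently(healthy))   # False
print(grows_consistently(starved))   # True
```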

The wait can be caused by the log buffer being too small
or by checkpointing. Increase the size of the redo log buffer, if necessary, by changing
the value of the initialization parameter LOG_BUFFER.

The value of this parameter is expressed in bytes. Alternatively, improve the checkpointing or archiving process.
Another data source is to check whether the log buffer space wait event is not a
significant factor in the wait time for the instance; if not, the log buffer size is most
likely adequate.

LOG_BUFFER
Property          Description
Parameter type    Integer
Default value     512 KB or 128 KB * CPU_COUNT, whichever is greater
Modifiable        No
Range of values   Operating system-dependent
Basic             No

LOG_BUFFER specifies the amount of memory (in bytes) that Oracle uses when buffering redo entries to a redo log file. Redo log entries contain a record of the changes that have been made to the database block buffers. The LGWR process writes redo log entries from the log buffer to a redo log file.

In general, larger values for LOG_BUFFER reduce redo log file I/O, particularly if transactions are long or numerous. In a busy system, a value 65536 or higher is reasonable.
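
The default value listed in the table (512 KB or 128 KB * CPU_COUNT, whichever is greater) is easy to work out by hand. The sketch below just does that arithmetic; the function name is invented for the example.

```python
KB = 1024


def default_log_buffer_bytes(cpu_count: int) -> int:
    """Default from the table: the greater of 512 KB and 128 KB * CPU_COUNT."""
    return max(512 * KB, 128 * KB * cpu_count)


for cpus in (1, 4, 8):
    print(cpus, default_log_buffer_bytes(cpus))
# 1 524288    (the 512 KB floor applies)
# 4 524288    (4 * 128 KB is exactly 512 KB)
# 8 1048576   (8 * 128 KB = 1 MB)
```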

|||||||||||||||||||||||||||

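Redo Log Buffer Issues

While processing user transactions, the database also records the transaction changes in the log buffer and writes them to disk at certain points; this is the purpose of the redo log buffer. The log writer process LGWR writes the contents of the redo log buffer to the online redo logs under the following conditions:
The log buffer is one third full
A user issues COMMIT or ROLLBACK
DBWR posts LGWR, so that LGWR writes before DBWR does

As a result, the log buffer ideally stays two thirds free. Once LGWR has written to disk, later redo entries can overwrite the space that has already been written.
If at some moment redo is generated faster than it is written to disk, the remaining two thirds of the log buffer gets used. Keeping a larger log buffer therefore
makes it easier for redo to be written into the buffer and lowers how often the one-third-full condition triggers a write.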

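Because redo is also written to disk whenever a user commits, on a busy system redo writes make up a very important part of the database's total I/O.
Tuning writes to the online redo logs is an important aspect of tuning the whole database application. The usual approaches are:

In Oracle, tune the checkpoint process
In Oracle, in ARCHIVELOG mode, improve the archiving setup and tune the archiver processes
At the system (AIX) level, use RAID 0+1 / RAID 1+0 storage
At the system (AIX) level, use striped logical volumes (for example, #mklv -S128K)
At the system (AIX) level, store the logs on logical volumes placed in the middle of the disk (for example, #mklv -a c)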

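Tune LGWR by placing the redo log files on storage with fast write performance, for example raw devices on a RAID 1+0 disk group.
In application code, deliberately reduce the number of commits: in batch jobs, batch the commits as well. Be aware of the side effect, though: this can
put heavy pressure on the rollback segments and cause "snapshot too old" errors.

You can also consider SQL that uses NOLOGGING, but again mind the side effect: check whether your backup and recovery plan supports it.
The size of the redo log buffer is determined by the initialization parameter LOG_BUFFER. It is a static parameter, so a change takes effect only after the instance is restarted. The log buffer
setting is changed as follows: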

SQL>alter system set log_buffer = 10000000 scope=spfile;

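So how is the log buffer usually sized? Oracle uses the formula below to set the initial value. We can collect statistics under normal application load
and analyze them to decide whether tuning is needed.
MAX(0.5M, (128K * number of CPUs))

So for a 4-CPU system the log buffer can be set to 512 KB; for an 8-CPU system, 1 MB is appropriate.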

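The statistic "redo buffer allocation retries" reflects the number of times a user process's transaction had to wait for space to be allocated
in the redo log buffer. A relatively high value may mean that the log buffer should be enlarged. The statistic is queried through the dynamic performance view V$SYSSTAT; the
statement below returns the log buffer allocation retries accumulated over a period of time.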

SQL> select name, value
     from v$sysstat
     where name = 'redo buffer allocation retries';

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
redo buffer allocation retries                                            0

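The absolute value returned means little, but we can sample the statistic over typical time intervals and look at how much it grows, which shows whether the retry count
is rising steadily.

If it is, adjust the initialization parameter LOG_BUFFER, and also consider whether checkpoints need tuning: typically, after a checkpoint is issued, redo cannot be written to disk promptly,
so free space in the log buffer becomes available slowly. Either reduce how often checkpoints occur, or find a way to put the log files on relatively fast disks. Also check whether archiving
can keep up with the rate at which redo is generated.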