RedoLog & Checkpoint & SCN

Redo Byte Address(RDA)

Redolog的redo entry地址用RBA(Redo Byte Address)表示,没有视图查看RDA,需要dump redo log查看。RDA由三部分组成:

  1. the log file sequence number (4 bytes)
  2. the log file block number (4 bytes)
  3. the byte offset into the block at which the redo record starts (2bytes)

LRBA表示buffer cache中脏块第一次更改时产生的redo entry在redo log的地址

HRBA表示buffer cache中脏块最近一次更改时产生的redo entry在redo log的地址

checkpoint RBA表示在执行checkpoint时redo log中最后一个redo entry地址,checkpoint会从上一个checkpoint RBA到当前checkpoint RBA间buffer cache的blocks写入disk

System Change Numbers(SCNs)

A system change number (SCN) is a logical, internal time stamp used by Oracle Database.

Oracle Database increments SCNs in the system global area (SGA).

SCNs occur in a monotonically单调 increasing sequence. Oracle Database can use an SCN like a clock because an observed SCN indicates a logical point in time, and repeated observations return equal or greater values. If one event has a lower SCN than another event, then it occurred at an earlier time in the database. Several events may share the same SCN, which means that they occurred at the same time in the database.

SCNs order events that occur within the database, which is necessary to satisfy the ACID properties of a transaction.

Oracle Database uses SCNs to mark the SCN before which all changes are known to be on disk so that recovery avoids applying unnecessary redo.

The database also uses SCNs to mark the point at which no redo exists for a set of data so that recovery can stop.

控制文件,数据文件,日志文件及block中均有SCN,查看各种SCN:

  1. CURRENT SCN

SELECT current_scn scn FROM v$database;

SELECT DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER scn FROM DUAL;

  1. Database Normal checkpoint SCN(存于控制文件)

select CHECKPOINT_CHANGE# from v$database;

  1. Control file normal checkpoint SCN与data file last normal checkpoint SCN(均存于控制文件)

select file#,checkpoint_change#,last_change#, enabled, status from v$datafile;

last_change#表示表空间最后一次normal checkpoint SCN, 如果表空间为online+read write时此值为NULL。实例非正常关闭则last_change#仍为null

Oracle启动时会比较control file与data file header的CHECKPOINT_CHANGE#, 如果非offline/read only的数据文件与控制文件值不同, 则需要介质恢复;再检查控制文件last_change#,如果有数据文件为null则需要实例恢复

可见控制文件中存储了三个CHECKPOINT SCN,比较查询:

select 'controlfile' "SCN location",name,checkpoint_change#
from v$datafile where name like '%users01%'
union
select 'file header',name,checkpoint_change#
from v$datafile_header where name like '%users01%';
4) Data file normal checkpoint SCN(存于数据文件)

select FILE#,CHECKPOINT_CHANGE# from v$datafile_header;

5) Redo Log SCN

Redo log并不记录checkpoint scn,使用FIRST_CHANGE#与NEXT_CHANGE#分别表示日志文件开始SCN与下个日志开始SCN. Current log中的NEXT_CHANGE#为无穷大

select sequence#, FIRST_CHANGE#,NEXT_CHANGE# from v$log_history;

Checkpoints

Checkpoint position是一个数据结构,存储于控制文件及每个data file header(并不存于redo log中), 它指向redo中对应位置

A data structure that indicates the checkpoint position, which is the SCN in the redo stream where instance recovery must begin. The checkpoint position acts as a pointer to the redo stream and is stored in the control file and in each data file header. 

Checkpoint主要分为normal checkpoint与Incremental checkpoint. 另外还有其它类型:

Other types of checkpoints include instance and media recovery checkpoints and checkpoints when schema objects are dropped or truncated.

  1. 完全检查点(Normal checkpoint)

Normal checkpoint过程:

  1. CKPT开始一个checkpoint event,记录checkpoint RBA与SCN,通常是redo log当前的RBA
  2. CKPT通知DBW将全部在buffer cache中小于此SCN的dirty blocks写入磁盘
  3. CKPT将checkpoint信息更新到data file header与control file, 即v$datafile_header与v$datafile中的checkpoint_change#

Normal checkpoint分为system checkpoint(写入所有dirty blocks)与tablespace checkpoint(写入对应数据文件的dirty blocks, 也称为 partial checkpoint):

system checkpoint occur in the following situations: 

  1. ALTER SYSTEM CHECKPOINT statement
  2. Online redo log switch
  3. ALTER DATABASE BEGIN BACKUP statement
  4. Consistent database shutdown

LOG SWITCH与其它system checkpoint的过程略有不同:

When you force a log switch, Oracle Database begins to perform a checkpoint but returns control to you immediately rather than when the checkpoint is complete.

Specify CHECKPOINT to explicitly force Oracle Database to perform a checkpoint, ensuring that all changes made by committed transactions are written to data files on disk. Oracle Database does not return control to you until the checkpoint is complete.

Tablespace checkpoint occur in the following situations: 

A tablespace checkpoint is a set of data file checkpoints, one for each data file in the tablespace.

Alter tablespace offline normal statement

Alter tablespace read only

Alter database datafile shrink

Alter tablespace begin backup

注:datafile的相关操作不会触发normal checkpoint,如alter database datafile ... online, 这是为什么在执行online datafile时需要media recovery, 而online tablespace不需要

  1. 增量检查点(Incremental checkpoint)

Incremental checkpoint用于避免在log switches时需要写入大量dirty blocks,它决定instance recovery从哪里开始应用日志,减少instance recovery恢复时间

增量或全量checkpoint会触发DBW写入脏块,DBW写入脏块只是推进增量checkpoint,但不会使它发生。增量检查点由CKPT决定,它将增量检查点写入控制文件而不写入data file header

An incremental checkpoint is intended to avoid writing large numbers of blocks at online redo log switches. DBW checks at least every three seconds to determine whether it has work to do. When DBW writes dirty buffers, it advances推进 the checkpoint position, causing CKPT to write the checkpoint position to the control file, but not to the data file headers.

关于data buffer中的链表:

  1. Cache buff chain(CBC): 以块地址的方式把buffer cache的所有块链起来,用于地址查找
  2. Least recently used(LRU): 把used blocks按使用次数多少链起来,用于判断是否可用于读入新的data blocks
  3. LRUW: 按使用次数链接commit blocks,用于DBWn写入disk,使用次数多的脏块会慢于使用次数少的脏块的写入磁盘, 它不是按SCN写blocks,早commit的blocks不一定先写disk
  4. Buffer checkpoint Queues (BCQ): 为了实现增量检查点8i时加入了BCQ(normal checkpoint应该不用BCQ,它使用的是SCN),它把所有commit blocks按LRBA顺序链起来,即队列尾部为最小LRBA, 这样如果dirty blocks再被修改也不会更改BCQ顺序

normal/incremental checkpoint均会触发DBW, DBW不会触发CKPT

  1. Normal checkpoint是写入所有dirty blocks
  2. incremental checkpoint是CKPT每3秒查看BCQ dirty blocks数量判断是否需要进行incremental checkpoint, 只在需要时才会触发DBW
  3. DBW是按LRUW部分写入commit blocks,commit blocks也会包含uncommitted data, DBW写入disk的commit blocks并不会对BCQ相同blocks产生影响, 即DBW只定义如何更高效写blocks,并不定义如何让数据库恢复到一致状态

Incremental checkpoint过程

CKPT会每3秒对BCQ的脏块数量进行检查,如果需要会进行Incremental checkpoint,会触发DBW从BCQ尾部(不是LRUW)开始写入部分dirty block到disk,CKPT把最后写入blocks的LRBA记录为增量检查点存于control file,但不会写入data file header

增量检查点保证在此之前commit的数据均写入disk. 在实例恢复时会到控制文件找到增量检查点对应的LRBA开始应用日志,此时disk中blocks是有比此LRBA更改新的块的,因此应用日志时还会查看blocks的实际SCN

The checkpoint position guarantees that every committed change with an SCN lower than the checkpoint SCN is saved to the data files. During instance recovery, the database must apply the changes that occur between the checkpoint position and the end of the redo thread.

As shown in Figure 13-5, some changes may already have been written to the data files. However, only changes with SCNs lower than the checkpoint position are guaranteed to be on disk.


Incremental checkpoint条件

  1. 满足实例恢复要求相关参数:具体包括FAST_START_MTTR_TARGET, LOG_CHECKPOINT_INTERVAL, LOG_CHECKPOINT_TIMEOUT, FAST_START_IO_TARGET
  2. 最小的日志文件大小smallest redo log(oracle的内部参数,默认为90%,即当日志写到90%时自动触发增量检查点)

  1. 额外说明
  1. ALTER SYSTEM SET LOG_checkpointS_TO_ALERT=TRUE;   

LOG_CHECKPOINTS_TO_ALERT lets you log your checkpoints to the alert log. Doing so is useful for determining whether checkpoints are occurring at the desired frequency. Default False.

当前所知只有在LOG_checkpoint_TIMEOUT设置了非0值之后触发的增量checkpoint会在alert文件中有记录,其他条件触发的增量checkpoint都不会记录在alert文件中

Normal checkpoint:

2020-12-04T13:39:13.465973+08:00

Beginning global checkpoint up to RBA [0x23.2.10], SCN: 4500464

Completed checkpoint up to RBA [0x23.2.10], SCN: 4500464

incremental checkpoint:

2020-12-04T14:19:08.018783+08:00

Incremental checkpoint up to RBA [0x24.e640.0], current log tail at RBA [0x24.e646.0]

可见normal checkpoint会使用SCN与RBA,而incremental checkpoint只使用RBA

  1. alter system switch logfile;

日志切换触发的normal checkpoint是一个优先级别比较低的操作,它不会立即完成,而是在切换日志完成后进行

2020-12-04T14:29:48.549990+08:00

Beginning log switch checkpoint up to RBA [0x25.2.10], SCN: 4502999

2020-12-04T14:29:48.550092+08:00

Thread 1 advanced to log sequence 37 (LGWR switch)

  Current log# 1 seq# 37 mem# 0: /u01/app/oracle/oradata/orcl/redo01.log

2020-12-04T14:31:01.386741+08:00

Completed checkpoint up to RBA [0x25.2.10], SCN: 4502999

可以发现在日志切换时先记录切换时SCN,然后进行切换日志,然后将小于此SCN的dirty block写入disk, 最后在2分钟后完成checkpoint,把开始记录的SCN写于控制及数据文件中。

切换日志从current变为active,在checkpoint完成后变为inactive.

  1. 相关X$视图

X$BH用于查看脏块的LRBA和HRBA(There is also a recovery RBA which is used to record the progress of partial block recovery by PMON)

X$TARGETRBA查看增量checkpoint RBA,target RBA和on-disk RBA。

X$KCCCP这里面也有增量checkpoint RBA,target RBA的信息。

X$KCCRT完全checkpoint(full thread checkpoint)RBA信息。

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值