1)about instance recovery
当数据库突然宕机(如shutdown abort)或系统突然停电宕机,那么数据库实例就会自动的进行应用redo log进行crash recovery。当数据库正常关闭如(shutdown immediate,shutdown,shutdown transactional)那么系统dbwr就会把内存的数据在进行checkpoint之前写入到数据文件磁盘。
然而,如果一个单实例或是多实例(如rac)突然crash了。那么数据库将在下一次启动的世界进行数据库实例恢复,这个回复是通过pmon进行的。当在rac环境下存在多个instance,当一个或是多个实例crash了,那么存活的实例将会进行自动的实例恢复操作。实例恢复分为两个步骤,一个是cache recovery 然后是transaction recovery。
注意只要完成了个了cache recovery 那么数据库就会被打开了,这个地方需要注意,所以提高cache recovery的效率对高可用性非常的重要。
明白几个数语。
1、cache recovery
cache recovery 或者叫做前滚,当数据库在进行恢复的步骤时,它会应用所有已经提交的或是没有提交的redo log 文件去改变数据块。这个时间受检查点到宕机的时间和执行的事务大小有关系。
2、transaction recovery
transaction recovery 又叫回滚,它是为了让数据库达到一致性,没有进行提交的事务认为是无效的需要undone,在这个过程中oracle应用回滚段去重做没有提交的数据改变。
(这里就先不介绍redo和undo的知识了,后续这块是有的。)
2)检查点和cache recovery
定期的,oracle会进行检查点,检查点事最高的scn,例如所有的数据块的scn都要小于或者等于这个scn,如果等于这个scn那么说明所有的数据改变就会写入到了数据文件中了。如果故障发生了,然后只有这个日志记录包含了scn高于检查点需要应用日志进行recovery,恢复的持续时间有两个因素决定,一个是所有改变的数据块高于检查点的scn的数目,另一个是日志块数目需要进行读取和应用这些改变。
3)how checkpoints affect performance
Frequent checkpointing writes dirty buffers to the data files more often than otherwise, and so reduces cache recovery time in the event of an instance failure. If checkpointing is frequent, then applying the redo records in the redo log between the current checkpoint position and the end of the log involves processing relatively few data blocks. This means that the cache recovery phase of recovery is fairly short.
总结一下:可以看出,减少cache recovery的时间,需要经常的让数据库进行检查点,然后可以应用少量的日志去进行故障恢复。但是在一个高的update系统中,频繁的checkpoint 势必会加剧数据库的性能开销。
如果系统运作效率更加重要,那么就需要减少checkpoint给系统带啦的压力了,那么这也会增大了系统在故障后进行的实例恢复时间了。
那么这个时候呢,我们在分析一下:fast_start_mttr_target参数的影响。
开始进入我今天要学习的fast_start_mttr_target 参数的设置与优化影响。
先看一下官方文档对该参数的简单描述:
FAST_START_MTTR_TARGET
Property | Description |
---|---|
Parameter type | Integer |
Default value | 0 |
Modifiable | ALTER SYSTEM |
Range of values | 0 to 3600 seconds |
Basic | No |
Oracle RAC | Multiple instances can have different values, and you can change the values at runtime. |
FAST_START_MTTR_TARGET
enables you to specify the number of seconds the database takes to perform crash recovery of a single instance. When specified, FAST_START_MTTR_TARGET
is overridden by LOG_CHECKPOINT_INTERVAL
.
从这点我们了解到,value最大设置为1小时,这个参数需要用alter system进行设置等等。
在看一下官网是怎么解释的,呵呵:
The Fast-Start Fault Recovery feature reduces the time required for cache recovery, and makes the recovery bounded and predictable by limiting the number of dirty buffers and the number of redo records generated between the most recent redo record and the last checkpoint.
The foundation of Fast-Start Fault Recovery is the Fast-Start checkpointing architecture. Instead of conventional event-driven (that is, log switching) checkpointing, which does bulk writes, fast-start checkpointing occurs incrementally. Each DBWn process periodically writes buffers to disk to advance the checkpoint position. The oldest modified blocks are written first to ensure that every write lets the checkpoint advance. Fast-Start checkpointing eliminates bulk writes and the resultant I/O spikes that occur with conventional checkpointing.
With the Fast-Start Fault Recovery feature, the FAST_START_MTTR_TARGET
initialization parameter simplifies the configuration of recovery time from instance or system failure. FAST_START_MTTR_TARGET
specifies a target for the expected mean time to recover (MTTR), that is, the time (in seconds) that it should take to start up the instance and perform cache recovery. After FAST_START_MTTR_TARGET
is set, the database manages incremental checkpoint writes in an attempt to meet that target. If you have chosen a practical value for FAST_START_MTTR_TARGET
, you can expect your database to recover, on average, in approximately the number of seconds you have chosen.
Note:
You must disable or remove theFAST_START_IO_TARGET
,
LOG_CHECKPOINT_INTERVAL
, and
LOG_CHECKPOINT_TIMEOUT
initialization parameters when using
FAST_START_MTTR_TARGET
. Setting these parameters interferes with the mechanisms used to manage cache recovery time to meet
FAST_START_MTTR_TARGET.
总结一下:你可以给这个参数设置一个value,然后呢数据库会根据这个value进行limit checkpoint之前的diry buffer number和redo number。然后呢这个参数可以优化了以前常规检查点(完全检查点)的一些弊端,而是采用了增量检查点机制,当然了这个参数也有一些注意的地方。(简略了,文档谁都能看懂)。
然后设置这个值有需要看看下面这个说法:
In principle, the minimum value for FAST_START_MTTR_TARGET
is one second. However, the fact that you can set FAST_START_MTTR_TARGET
this low does not mean that this target can be achieved. There are practical limits to the minimum achievable MTTR target, due to such factors as database startup time.
修改这个参数使用如下命令:
alter system set fast_start_mttr_target=value;
The MTTR target that your database can achieve given the current value of FAST_START_MTTR_TARGET
is called the effective MTTR target. You can view your current effective MTTR by viewing the TARGET_MTTR
column of the V$INSTANCE_RECOVERY
view.
LOG_CHECKPOINT_INTERVAL
Property | Description |
---|---|
Parameter type | Integer |
Default value | 0 |
Modifiable | ALTER SYSTEM |
Range of values | 0 to 231 - 1 |
Basic | No |
Oracle RAC | Multiple instances can have different values. |
LOG_CHECKPOINT_INTERVAL
specifies the frequency of checkpoints in terms of the number of redo log file blocks that can exist between an incremental checkpoint and the last block written to the redo log. This number refers to physical operating system blocks, not database blocks.
Regardless of this value, a checkpoint always occurs when switching from one online redo log file to another. Therefore, if the value exceeds the actual redo log file size, checkpoints occur only when switching logs. Checkpoint frequency is one of the factors that influence the time required for the database to recover from an unexpected failure.
Notes:
-
Specifying a value of
0
(zero) forLOG_CHECKPOINT_INTERVAL
has the same effect as setting the parameter to infinity and causes the parameter to be ignored. Only nonzero values of this parameter are considered meaningful. -
Recovery I/O can also be limited by setting the
LOG_CHECKPOINT_TIMEOUT
parameter or by the size specified for the smallest redo log. For information on which mechanism is controlling checkpointing behavior, query theV$INSTANCE_RECOVERY
view.
Property | Description |
---|---|
Parameter type | Integer |
Default value | 1800 |
Modifiable | ALTER SYSTEM |
Range of values | 0 to 231 - 1 |
Basic | No |
Oracle RAC | Multiple instances can have different values. |
LOG_CHECKPOINT_TIMEOUT
specifies (in seconds) the amount of time that has passed since the incremental checkpoint at the position where the last write to the redo log (sometimes called the tail of the log) occurred. This parameter also signifies that no buffer will remain dirty (in the cache) for more than integerseconds.
Specifying a value of 0
for the timeout disables time-based checkpoints. Hence, setting the value to 0
is not recommended unless FAST_START_MTTR_TARGET
is set.
Notes:
-
A checkpoint scheduled to occur because of this parameter is delayed until the completion of the previous checkpoint if the previous checkpoint has not yet completed.
-
Recovery I/O can also be limited by setting the
LOG_CHECKPOINT_INTERVAL
parameter or by the size specified for the smallest redo log. For information on which mechanism is controlling checkpointing behavior, query theV$INSTANCE_RECOVERY
view.
FAST_START_PARALLEL_ROLLBACK
Property | Description |
---|---|
Parameter type | String |
Syntax | FAST_START_PARALLEL_ROLLBACK = { HIGH | LOW | FALSE } |
Default value | LOW |
Modifiable | ALTER SYSTEM |
Basic | No |
FAST_START_PARALLEL_ROLLBACK
specifies the degree of parallelism used when recovering terminated transactions. Terminated transactions are transactions that are active before a system failure. If a system fails when there are uncommitted parallel DML or DDL transactions, then you can speed up transaction recovery during startup by using this parameter.
Values:
-
FALSE
Parallel rollback is disabled
-
LOW
Limits the maximum degree of parallelism to 2 *
CPU_COUNT
-
HIGH
Limits the maximum degree of parallelism to 4 *
CPU_COUNT
If you change the value of this parameter, then transaction recovery will be stopped and restarted with the new implied degree of parallelism.
SQL> create table amy as select * from scott.emp;
Table created.
SQL> begin
2 for i in 1..20 loop
3 insert into amy select * from amy;
4 end loop;
5 end;
6 /
PL/SQL procedure successfully completed.
SQL> select count(*) from amy;
COUNT(*)
----------
14680064
SQL>
SQL> show parameter fast_start_
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
fast_start_io_target integer 0
fast_start_mttr_target integer 0
fast_start_parallel_rollback string LOW
SQL>
SQL> show parameter cpu_count
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cpu_count integer 1
SQL>
SQL> delete from amy;
另一个session:
Connected.
SQL>
SQL> shutdown abort;
ORACLE instance shut down.
SQL>
SQL> startup
ORACLE instance started.
Total System Global Area 471830528 bytes
Fixed Size 2254344 bytes
Variable Size 197134840 bytes
Database Buffers 268435456 bytes
Redo Buffers 4005888 bytes
Database mounted.
Database opened.
SQL> select usn,extents,rssize,writes,xacts,status from v$rollstat where xacts>0;
USN EXTENTS RSSIZE WRITES XACTS STATUS
---------- ---------- ---------- ---------- ---------- ---------------
5 206 526245888 0 1 ONLINE
SQL> select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions;
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL
---------- ---------- ---------- ---------------- -------------- ---------------
CPUTIME
----------
5 27 1329 RECOVERING 19032 64072
39
SQL> set linesize 200
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 19582 64072 44
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 19620 64072 47
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 20383 64072 48
SQL>
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 20416 64072 49
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 20535 64072 51
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 20766 64072 52
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 20961 64072 53
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 21160 64072 54
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 21550 64072 56
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 21937 64072 57
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 22225 64072 58
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 22245 64072 59
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 22491 64072 60
SQL> show parameter fast_start
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
fast_start_io_target integer 0
fast_start_mttr_target integer 0
fast_start_parallel_rollback string LOW
SQL> alter system set fast_start_parallel_rollback=high;
System altered.
SQL> select usn,extents,rssize,writes,xacts,status from v$rollstat where xacts>0;
select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions;
USN EXTENTS RSSIZE WRITES XACTS STATUS
---------- ---------- ---------- ---------- ---------- ---------------
5 206 526245888 0 1 ONLINE
SQL> SQL>
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 626 35288 6
SQL>
SQL> select * from v$fast_start_servers;
STATE UNDOBLOCKSDONE PID XID
----------- -------------- ---------- ----------------
RECOVERING 626 24 05001B0031050000
RECOVERING 0 29 05001B0031050000
RECOVERING 0 31 05001B0031050000
RECOVERING 0 32 05001B0031050000
SQL> select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions;
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 2288 35288 14
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 2530 35288 15
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 2883 35288 16
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 2948 35288 18
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 3091 35288 19
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 3232 35288 21
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 3397 35288 22
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 4257 35288 26
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 4374 35288 27
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 4582 35288 28
SQL> r
1* select usn,slt,seq,state,undoblocksdone,undoblockstotal,cputime from v$fast_start_transactions
USN SLT SEQ STATE UNDOBLOCKSDONE UNDOBLOCKSTOTAL CPUTIME
---------- ---------- ---------- ---------------- -------------- --------------- ----------
5 27 1329 RECOVERING 4811 35288 29
SQL>
可以看到设置high后,数据库的恢复显然要 高出很多。恢复的时间也大大缩短了。
that's all!