http://www.hellodba.com/reader.php?ID=28&lang=cn
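The redo log holds the data that recovery, and certain advanced features, depend on. A redo entry contains all the information about the database changes made by the corresponding operation, and every redo entry is eventually written to the redo log files. The redo log buffer is a memory area allocated in the SGA so that redo file I/O does not become a performance bottleneck. A redo entry is first generated in user memory (the PGA), then copied into the log buffer by the Oracle server process, and finally written to the redo file by the LGWR process once certain conditions are met. Because the log buffer is "shared" memory, it is protected by the redo allocation latch to avoid conflicts: each server process must acquire that latch before it can allocate space in the redo buffer. As a result, on a highly concurrent OLTP system with frequent data changes we commonly observe waits on the redo allocation latch. The complete sequence for writing redo into the redo buffer is:
Generate the redo entry in the PGA -> the server process acquires a redo copy latch (there are several: CPU_COUNT*2) -> the server process acquires the redo allocation latch (only one) -> allocate space in the log buffer -> release the redo allocation latch -> write the redo entry into the log buffer -> release the redo copy latch.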
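The locking order in the sequence above can be sketched as a toy model. This Python is only an illustration, not Oracle's implementation: LogBuffer, write and run_demo are hypothetical names, Python locks stand in for latches, and the number of copy latches is an arbitrary assumption.

```python
import threading


class LogBuffer:
    """Toy model of one shared log buffer guarded by two kinds of latches."""

    def __init__(self, size, copy_latches=4):
        self.data = bytearray(size)
        self.next_free = 0
        # A single allocation latch serializes space allocation.
        self.allocation_latch = threading.Lock()
        # Several copy latches (count is an assumption for this sketch).
        self.copy_latches = [threading.Lock() for _ in range(copy_latches)]

    def write(self, worker_id, entry):
        copy_latch = self.copy_latches[worker_id % len(self.copy_latches)]
        with copy_latch:
            # Hold the allocation latch only long enough to reserve space.
            with self.allocation_latch:
                start = self.next_free
                if start + len(entry) > len(self.data):
                    raise MemoryError("log buffer full")
                self.next_free = start + len(entry)
            # Allocation latch is released; the copy runs under the copy latch only.
            self.data[start:start + len(entry)] = entry
        return start


def run_demo(workers=8, entries_per_worker=50):
    buf = LogBuffer(size=workers * entries_per_worker * 8)

    def work(wid):
        for i in range(entries_per_worker):
            buf.write(wid, wid.to_bytes(4, "big") + i.to_bytes(4, "big"))

    threads = [threading.Thread(target=work, args=(w,)) for w in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buf
```

Because every writer must pass through the single allocation latch, that latch is the point where contention shows up as concurrency grows, which is what the strand mechanisms described next try to relieve.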
shared strand
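To reduce redo allocation latch waits, Oracle 9.2 introduced a parallel mechanism for the log buffer. The basic idea is to divide the log buffer into several smaller buffers called strands (to distinguish them from the private strands that appeared later, they are called shared strands). Each strand is protected by its own redo allocation latch. With multiple shared strands, redo buffer allocation, which used to be serialized, becomes a parallel process, and redo allocation latch waits are reduced.
The initial number of shared strands is controlled by the parameter log_parallelism. In 10g this parameter became a hidden parameter (_log_parallelism), and a new parameter, _log_parallelism_max, was added to control the maximum number of shared strands; _log_parallelism_dynamic controls whether the number of shared strands may change dynamically between _log_parallelism and _log_parallelism_max.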
- HELLODBA.COM>select nam.ksppinm, val.KSPPSTVL, nam.ksppdesc
- 2 from sys.x$ksppi nam,
- 3 sys.x$ksppsv val
- 4 where nam.indx = val.indx
- 5 --AND nam.ksppinm LIKE '_%'
- 6 AND upper(nam.ksppinm) LIKE '%LOG_PARALLE%';
- KSPPINM KSPPSTVL KSPPDESC
- -------------------------- ---------- ------------------------------------------
- _log_parallelism 1 Number of log buffer strands
- _log_parallelism_max 2 Maximum number of log buffer strands
- _log_parallelism_dynamic TRUE Enable dynamic strands
Size of each shared strand = log_buffer / (number of shared strands). Strand information can be queried from x$kcrfstrand (which, from 10g on, contains both the shared strands and the private strands introduced below).
- HELLODBA.COM>select indx,strand_size_kcrfa from x$kcrfstrand where last_buf_kcrfa != '00';
- INDX STRAND_SIZE_KCRFA
- ---------- -----------------
- 0 3514368
- 1 3514368
- HELLODBA.COM>show parameter log_buffer
- NAME TYPE VALUE
- ------------------------------------ ----------- ------------------------------
- log_buffer integer 7028736
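The figures in the transcript above agree with the stated formula. The following is a small worked check in Python; shared_strand_size is a hypothetical helper name, and the inputs are the log_buffer value and the two shared strands shown above.

```python
def shared_strand_size(log_buffer_bytes, strand_count):
    """Size of one shared strand: log_buffer divided evenly among the strands."""
    if strand_count <= 0:
        raise ValueError("strand_count must be positive")
    return log_buffer_bytes // strand_count


# log_buffer = 7028736 with 2 shared strands, as in the transcript
print(shared_strand_size(7028736, 2))  # 3514368
```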
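As for how many shared strands to configure: with up to 16 CPUs the default maximum is 2. When the system shows redo allocation latch waits, consider adding 1 strand for every additional 16 CPUs, and do not exceed 8. In addition, _log_parallelism_max may not be greater than cpu_count.
Note: in 11g the parameter _log_parallelism was removed; the number of shared strands is controlled by _log_parallelism_max, _log_parallelism_dynamic and cpu_count.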
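The sizing guideline for shared strands can be written down as a small function. This is only one possible reading of the rule of thumb (2 strands up to 16 CPUs, one more for each further group of 16 CPUs, never more than 8 and never more than cpu_count); suggested_max_strands is a hypothetical name and the function is not an Oracle algorithm.

```python
def suggested_max_strands(cpu_count):
    """Rule-of-thumb upper bound for shared strands (an interpretation, not Oracle code)."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    # Number of additional groups of 16 CPUs beyond the first 16.
    extra_groups = (cpu_count - 1) // 16
    candidate = 2 + extra_groups
    # Cap at 8 and never exceed the number of CPUs.
    return min(candidate, 8, cpu_count)


for cpus in (1, 16, 32, 128):
    print(cpus, suggested_max_strands(cpus))
```

The guideline only applies when redo allocation latch waits are actually observed, so the result is an upper bound to consider rather than a value to set unconditionally.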
Private strand
To reduce redo buffer contention further, 10g introduced a new strand mechanism: the private strand. Private strands are not carved out of the log buffer; they are memory allocated from the shared pool.
- HELLODBA.COM>select * from V$sgastat where name like '%strand%';
- POOL NAME BYTES
- ------------ -------------------------- ----------
- shared pool private strands 2684928
- HELLODBA.COM>select indx,strand_size_kcrfa from x$kcrfstrand where last_buf_kcrfa = '00';
- INDX STRAND_SIZE_KCRFA
- ---------- -----------------
- 2 66560
- 3 66560
- 4 66560
- 5 66560
- 6 66560
- 7 66560
- 8 66560
- ...
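The introduction of private strands changed Oracle's redo/undo mechanism considerably. Each private strand is protected by its own redo allocation latch and, being "private", serves only one active transaction at a time. A transaction that has obtained a private strand generates its redo in the private strand rather than in the PGA; when the private strand is flushed or the transaction commits, the contents of the private strand are written to the log file in bulk. If a new transaction cannot obtain the redo allocation latch of a private strand, it falls back to the old redo buffer mechanism and requests space in a shared strand. Whether a transaction uses a private strand can be determined from the newly added bit 13 of the column ktcxbflg in x$ktcxb: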
- HELLODBA.COM>select decode(bitand(ktcxbflg, 4096),0,1,0) used_private_strand, count(*)
- 2 from x$ktcxb
- 3 where bitand(ksspaflg, 1) != 0
- 4 and bitand(ktcxbflg, 2) != 0
- 5 group by bitand(ktcxbflg, 4096);
- USED_PRIVATE_STRAND COUNT(*)
- ------------------- ----------
- 1 10
- 0 1
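The bit test used in the query above can be reproduced outside the database. The sketch below mirrors the decode(bitand(ktcxbflg, 4096), 0, 1, 0) expression in Python; the flag values passed to it are made up for illustration, and used_private_strand and summarize are hypothetical helper names.

```python
PRIVATE_STRAND_BIT = 1 << 12  # 4096, i.e. the 13th bit when counting from 1


def used_private_strand(ktcxbflg):
    """Mirror of decode(bitand(ktcxbflg, 4096), 0, 1, 0) from the query."""
    return 1 if (ktcxbflg & PRIVATE_STRAND_BIT) == 0 else 0


def summarize(flags):
    """Count transactions per used_private_strand value, like the GROUP BY."""
    counts = {1: 0, 0: 0}
    for flag in flags:
        counts[used_private_strand(flag)] += 1
    return counts


# Hypothetical flag values: two with bit 13 clear, one with it set.
print(summarize([2, 2, 4096 | 2]))
```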
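A transaction that uses a private strand does not need to acquire a redo copy latch first, nor the redo allocation latch of a shared strand; instead its redo is written to disk in bulk at flush or commit. This reduces the number of redo copy latch and redo allocation latch gets and releases, reduces waits on those latches, and therefore lowers CPU load. The sequence is:
Transaction begins -> request the redo allocation latch of a private strand (if that fails, request the redo allocation latch of a shared strand) -> generate redo entries in the private strand -> flush/commit -> acquire a redo copy latch -> the server process writes the redo entries to the log file in bulk -> release the redo copy latch -> release the private strand's redo allocation latch
Note: a transaction that failed to obtain the redo allocation latch of a private strand will not request a private strand again before it ends, even if other transactions have released theirs in the meantime.
Each private strand is 65K in size. In 10g, the size of the private strands area in the shared pool is the estimated number of active transactions (transactions * _log_private_parallelism_mul / 100) multiplied by 65K. In 11g, an additional 4K of management space is allocated in the shared pool for each private strand, giving count * 69K.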
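The choice between a private strand and the shared path, including the rule that a transaction never retries for a private strand, can be sketched as follows. This is a toy model under stated assumptions: StrandPool and Transaction are hypothetical classes, Python locks stand in for latches, and a Python list stands in for the log.

```python
import threading


class StrandPool:
    """Toy model: a few private strands plus one shared strand."""

    def __init__(self, private_count):
        self.private_latches = [threading.Lock() for _ in range(private_count)]
        self.shared_latch = threading.Lock()
        self.log = []  # each element is one batch written out

    def begin(self):
        # Try each private strand latch once, without waiting.
        for idx, latch in enumerate(self.private_latches):
            if latch.acquire(blocking=False):
                return Transaction(self, idx)
        # No private strand available: this transaction uses the shared path.
        return Transaction(self, None)


class Transaction:
    def __init__(self, pool, private_idx):
        self.pool = pool
        self.private_idx = private_idx  # decided once, never revisited
        self.pending = []
        self.finished = False

    def change(self, entry):
        if self.private_idx is not None:
            self.pending.append(entry)  # redo stays in the private strand
        else:
            with self.pool.shared_latch:  # old path: one latch get per entry
                self.pool.log.append([entry])

    def commit(self):
        if self.finished:
            return
        self.finished = True
        if self.private_idx is not None:
            self.pool.log.append(list(self.pending))  # one bulk write
            self.pending.clear()
            self.pool.private_latches[self.private_idx].release()
```

With a single private strand, a second concurrent transaction ends up on the shared path and stays there even after the first one commits, while a third transaction started later can pick up the released strand.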
- --10g:
- SQL> select * from V$sgastat where name like '%strand%';
- POOL NAME BYTES
- ------------ -------------------------- ----------
- shared pool private strands 1198080
- HELLODBA.COM>select trunc(value * KSPPSTVL / 100) * 65 * 1024
- 2 from (select value from v$parameter where name = 'transactions') a,
- 3 (select val.KSPPSTVL
- 4 from sys.x$ksppi nam, sys.x$ksppsv val
- 5 where nam.indx = val.indx
- 6 AND nam.ksppinm = '_log_private_parallelism_mul') b;
- TRUNC(VALUE*KSPPSTVL/100)*65*1024
- -------------------------------------
- 1198080
- --11g:
- HELLODBA.COM>select * from V$sgastat where name like '%strand%';
- POOL NAME BYTES
- ------------ -------------------------- ----------
- shared pool private strands 706560
- HELLODBA.COM>select trunc(value * KSPPSTVL / 100) * (65 + 4) * 1024
- 2 from (select value from v$parameter where name = 'transactions') a,
- 3 (select val.KSPPSTVL
- 4 from sys.x$ksppi nam, sys.x$ksppsv val
- 5 where nam.indx = val.indx
- 6 AND nam.ksppinm = '_log_private_parallelism_mul') b;
- TRUNC(VALUE*KSPPSTVL/100)*(65+4)*1024
- -------------------------------------
- 706560
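The two byte totals reported above can be checked against the per-strand sizes. In this Python sketch the strand counts 18 and 10 are not printed in the transcripts; they are inferred here by dividing the reported totals by the per-strand size, and the function names are hypothetical.

```python
STRAND_KB = 65        # size of one private strand
OVERHEAD_KB_11G = 4   # extra management space per strand in 11g


def preallocated_strands(transactions, parallelism_mul_pct):
    """trunc(transactions * _log_private_parallelism_mul / 100)."""
    return transactions * parallelism_mul_pct // 100


def private_strand_pool_bytes(strand_count, version="10g"):
    """Shared pool bytes reported under 'private strands' for a given strand count."""
    per_strand_kb = STRAND_KB + (OVERHEAD_KB_11G if version == "11g" else 0)
    return strand_count * per_strand_kb * 1024


print(private_strand_pool_bytes(18, "10g"))  # 1198080, the 10g total above
print(private_strand_pool_bytes(10, "11g"))  # 706560, the 11g total above
```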
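The number of private strands is affected by two factors: the size of the logfile and the number of active transactions.
The parameter _log_private_mul specifies what percentage of logfile space is preallocated to private strands; the default is 5. From the size of the current logfile (minus the space preallocated to the log buffer) we can calculate how many private strands can be preallocated under this constraint: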
- HELLODBA.COM>select bytes from v$log where status = 'CURRENT';
- BYTES
- ----------
- 52428800
- HELLODBA.COM>select trunc(((select bytes from v$log where status = 'CURRENT') - (select to_number(value) from v$parameter where name = 'log_buffer'))*
- 2 (select to_number(val.KSPPSTVL)
- 3 from sys.x$ksppi nam, sys.x$ksppsv val
- 4 where nam.indx = val.indx
- 5 AND nam.ksppinm = '_log_private_mul') / 100 / 66560)
- 6 as "calculated private strands"
- 7 from dual;
- calculated private strands
- --------------------------
- 5
- HELLODBA.COM>select count(1) "actual private strands" from x$kcrfstrand where last_buf_kcrfa = '00';
- actual private strands
- ----------------------
- 5
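As with a checkpoint, the contents of all private strands must be flushed to the logfile before a logfile switch, so the alert log may show the message "Private strand flush not complete" just before the log switch message; this can be ignored. After the switch, the limit on private strands is recalculated from the size of the new current logfile: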
- HELLODBA.COM>alter system switch logfile;
- System altered.
- HELLODBA.COM>select bytes from v$log where status = 'CURRENT';
- BYTES
- ----------
- 104857600
- HELLODBA.COM>select trunc(((select bytes from v$log where status = 'CURRENT') - (select to_number(value) from v$parameter where name = 'log_buffer'))*
- 2 (select to_number(val.KSPPSTVL)
- 3 from sys.x$ksppi nam, sys.x$ksppsv val
- 4 where nam.indx = val.indx
- 5 AND nam.ksppinm = '_log_private_mul') / 100 / 66560)
- 6 as "calculated private strands"
- 7 from dual;
- calculated private strands
- --------------------------
- 13
- HELLODBA.COM>select count(1) "actual private strands" from x$kcrfstrand where last_buf_kcrfa = '00';
- actual private strands
- ----------------------
- 13
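The parameter _log_private_parallelism_mul is used to estimate the percentage of active transactions among the maximum number of transactions; the default is 10. The number of private strands cannot exceed the number of active transactions.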
- HELLODBA.COM>show parameter transactions
- NAME TYPE VALUE
- ------------------------------------ ----------- ------------------------------
- transactions integer 222
- transactions_per_rollback_segment integer 5
- HELLODBA.COM>select trunc((select to_number(value) from v$parameter where name = 'transactions') *
- 2 (select to_number(val.KSPPSTVL)
- 3 from sys.x$ksppi nam, sys.x$ksppsv val
- 4 where nam.indx = val.indx
- 5 AND nam.ksppinm = '_log_private_parallelism_mul') / 100 )
- 6 as "calculated private strands"
- 7 from dual;
- calculated private strands
- --------------------------
- 22
- HELLODBA.COM>select count(1) "actual private strands" from x$kcrfstrand where last_buf_kcrfa = '00';
- actual private strands
- ----------------------
- 22
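Note: when preallocating private strands, Oracle takes the smaller of the two limits above. However, the corresponding shared pool memory and redo allocation latches are preallocated according to the number of active transactions.
Therefore, if the logfile is large enough and _log_private_parallelism_mul roughly matches the actual percentage of active transactions, the introduction of private strands can largely eliminate contention on the redo allocation latch.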
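The two limits and the "take the smaller one" rule can be combined in one small calculation. The Python below is a sketch of the formulas used in the queries above; the function names are hypothetical, and the logfile and log buffer sizes in the example are made-up inputs rather than values from the transcripts.

```python
STRAND_BYTES = 66560  # 65K per private strand


def strands_by_logfile(logfile_bytes, log_buffer_bytes, log_private_mul_pct):
    """trunc((logfile - log_buffer) * _log_private_mul / 100 / 66560)."""
    usable = logfile_bytes - log_buffer_bytes
    return (usable * log_private_mul_pct) // (100 * STRAND_BYTES)


def strands_by_transactions(transactions, parallelism_mul_pct):
    """trunc(transactions * _log_private_parallelism_mul / 100)."""
    return transactions * parallelism_mul_pct // 100


def private_strand_count(logfile_bytes, log_buffer_bytes, log_private_mul_pct,
                         transactions, parallelism_mul_pct):
    """The smaller of the two limits decides how many strands are usable."""
    return min(
        strands_by_logfile(logfile_bytes, log_buffer_bytes, log_private_mul_pct),
        strands_by_transactions(transactions, parallelism_mul_pct),
    )


# Hypothetical sizes: a 10 MiB logfile is the limiting factor ...
print(private_strand_count(10 * 2**20, 2**20, 5, 222, 10))
# ... while with a 100 MiB logfile the transaction-based limit applies.
print(private_strand_count(100 * 2**20, 2**20, 5, 222, 10))
```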
--- Fuyuncat Mark ---
New topic 2: In Memory Undo
http://tech.it168.com/a2009/1118/811/000000811474.shtml
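[IT168 Technical Document] IMU is a new technology introduced in 10g and is patented by Oracle. However, it does not appear to be fully enabled in 10g: the tests below do not work in 10.2.0.3 but do work in 11g.
In a traditional transaction, when a row is updated an undo block is read or allocated in the buffer cache and an undo entry is written to it immediately. If several rows are updated in the same transaction, several undo entries are written to the undo block in the buffer. With IMU, a new memory pool, the IMU pool, is allocated from the shared pool. When a row is updated, an undo block is still read or allocated in the buffer cache, but that block is not updated immediately; instead an IMU node is created in the IMU pool and linked to the row change through the IMU map. If the transaction modifies several rows, several IMU nodes are created in the IMU pool while the undo block in the buffer stays unchanged. Only when an IMU commit or IMU flush occurs is the undo information recorded in the IMU nodes written to the undo block in the buffer, via the IMU map. All of the resulting redo information is then written to the redo log, together with the commit vector, as a single redo entry. Over the whole process, the redo generated for undo is greatly reduced.
The hidden parameter _in_memory_undo switches the IMU feature on or off; it takes effect immediately at session or system level and defaults to true. Another hidden parameter, _IMU_pools, controls the number of IMU pools and defaults to 3. IMU currently also has some restrictions: for example, undo management (undo_management) must be auto, and IMU is not available in RAC.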
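The difference in when redo entries are produced can be illustrated with a toy counter. This Python sketch deliberately ignores the real redo and undo formats; TraditionalTxn and ImuTxn are hypothetical classes that only count entries, on the assumption that the traditional path emits one entry per change plus one at commit, and the IMU path emits a single batched entry at commit.

```python
class TraditionalTxn:
    """Each change writes its own redo entry; commit adds one more."""

    def __init__(self):
        self.redo_entries = 0

    def update(self, row):
        self.redo_entries += 1

    def commit(self):
        self.redo_entries += 1


class ImuTxn:
    """Changes are parked as IMU nodes; commit emits one batched redo entry."""

    def __init__(self):
        self.redo_entries = 0
        self.imu_nodes = []

    def update(self, row):
        self.imu_nodes.append(row)

    def commit(self):
        batched = len(self.imu_nodes)
        self.imu_nodes.clear()
        self.redo_entries += 1
        return batched


traditional, imu = TraditionalTxn(), ImuTxn()
for row in (1124, 1643, 1):
    traditional.update(row)
    imu.update(row)
traditional.commit()
print(traditional.redo_entries, imu.commit(), imu.redo_entries)
```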
HELLODBA.COM > create table ttt (a number, b varchar2(20));
Table created.
HELLODBA.COM > begin
2 for i in 1 .. 2000 loop
3 insert into ttt values (i, '' || i);
4 end loop;
5 commit;
6 end;
7 /
PL/SQL procedure successfully completed.
HELLODBA.COM > select a
2 from (select a, dbms_rowid.rowid_block_number(ROWID) block_id, lag(dbms_rowid.rowid_block_number(ROWID)) over (order by rowid) as pre_block_id from ttt)
3 where block_id != pre_block_id;
A
----------
1124
1643
1
IMU Commit
Let us look at how the redo size changes with an IMU commit compared with a traditional commit. First, the traditional mode:
Connected.
HELLODBA.COM > alter session set "_in_memory_undo" = false;
Session altered.
HELLODBA.COM > update ttt set b = 'X' where a = 1124;
1 row updated.
HELLODBA.COM > select b.name, a.value from v$mystat a, v$statname b where a.statistic# = b.statistic# and b.name in ('redo entries', 'redo size', 'IMU commits');
NAME VALUE
---------------------------------------------------------------- ----------
redo entries 4
redo size 1600
IMU commits 0
HELLODBA.COM > update ttt set b = 'Y' where a = 1643;
1 row updated.
HELLODBA.COM > select b.name, a.value from v$mystat a, v$statname b where a.statistic# = b.statistic# and b.name in ('redo entries', 'redo size', 'IMU commits');
NAME VALUE
---------------------------------------------------------------- ----------
redo entries 5
redo size 1960
IMU commits 0
HELLODBA.COM > update ttt set b = 'Z' where a = 1;
1 row updated.
HELLODBA.COM > select b.name, a.value from v$mystat a, v$statname b where a.statistic# = b.statistic# and b.name in ('redo entries', 'redo size', 'IMU commits');
NAME VALUE
---------------------------------------------------------------- ----------
redo entries 6
redo size 2320
IMU commits 0
HELLODBA.COM > commit;
Commit complete.
HELLODBA.COM > select b.name, a.value from v$mystat a, v$statname b where a.statistic# = b.statistic# and b.name in ('redo entries', 'redo size', 'IMU commits');
NAME VALUE
---------------------------------------------------------------- ----------
redo entries 7
redo size 2416
IMU commits 0
As you can see, every updated row produces one redo entry.
Next we enable IMU and repeat the same transaction:
Connected.
HELLODBA.COM > alter session set "_in_memory_undo" = true;
Session altered.
HELLODBA.COM > update ttt set b = 'X' where a = 1124;
1 row updated.
HELLODBA.COM > select b.name, a.value from v$mystat a, v$statname b where a.statistic# = b.statistic# and b.name in ('redo entries', 'redo size', 'IMU commits');
NAME VALUE
---------------------------------------------------------------- ----------
redo entries 3
redo size 1084
IMU commits 0
HELLODBA.COM > update ttt set b = 'Y' where a = 1643;
1 row updated.
HELLODBA.COM > select b.name, a.value from v$mystat a, v$statname b where a.statistic# = b.statistic# and b.name in ('redo entries', 'redo size', 'IMU commits');
NAME VALUE
---------------------------------------------------------------- ----------
redo entries 3
redo size 1084
IMU commits 0
HELLODBA.COM > update ttt set b = 'Z' where a = 1;
1 row updated.
HELLODBA.COM > select b.name, a.value from v$mystat a, v$statname b where a.statistic# = b.statistic# and b.name in ('redo entries', 'redo size', 'IMU commits');
NAME VALUE
---------------------------------------------------------------- ----------
redo entries 3
redo size 1084
IMU commits 0
HELLODBA.COM > commit;
Commit complete.
HELLODBA.COM > select b.name, a.value from v$mystat a, v$statname b where a.statistic# = b.statistic# and b.name in ('redo entries', 'redo size', 'IMU commits');
NAME VALUE
---------------------------------------------------------------- ----------
redo entries 4
redo size 2176
IMU commits 1
As you can see, the amount of redo does not grow as rows are updated; it grows at the IMU commit instead. IMU can also be used when a single DML statement updates several rows:
Connected.
HELLODBA.COM > alter session set "_in_memory_undo" = true;
Session altered.
HELLODBA.COM > update ttt set b = 'X' where a in (1643, 1124, 1);
3 rows updated.
HELLODBA.COM > select b.name, a.value from v$mystat a, v$statname b where a.statistic# = b.statistic# and b.name in ('redo entries', 'redo size', 'IMU commits');
NAME VALUE
---------------------------------------------------------------- ----------
redo entries 3
redo size 1084
IMU commits 0
HELLODBA.COM > commit;
Commit complete.
HELLODBA.COM > select b.name, a.value from v$mystat a, v$statname b where a.statistic# = b.statistic# and b.name in ('redo entries', 'redo size', 'IMU commits');
NAME VALUE
---------------------------------------------------------------- ----------
redo entries 4
redo size 2344
IMU commits 1
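The counters printed in the three runs above are easier to compare as step-to-step differences. The Python below only does that arithmetic on the numbers shown in the transcripts; RUNS and deltas are hypothetical names, and the first sample of each run has no predecessor, so no delta is computed for it.

```python
# (label, redo entries, redo size) as printed in the three runs above
RUNS = {
    "traditional": [("update 1", 4, 1600), ("update 2", 5, 1960),
                    ("update 3", 6, 2320), ("commit", 7, 2416)],
    "imu, three updates": [("update 1", 3, 1084), ("update 2", 3, 1084),
                           ("update 3", 3, 1084), ("commit", 4, 2176)],
    "imu, one statement": [("update", 3, 1084), ("commit", 4, 2344)],
}


def deltas(samples):
    """Differences in redo entries and redo size between consecutive samples."""
    out = []
    for (_, e0, s0), (label, e1, s1) in zip(samples, samples[1:]):
        out.append((label, e1 - e0, s1 - s0))
    return out


for name, samples in RUNS.items():
    print(name, deltas(samples))
```

In the traditional run each later update adds one entry of 360 bytes and the commit adds 96 bytes, whereas in the IMU runs the updates add nothing and the commit alone adds 1092 or 1260 bytes.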
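You may have noticed in the examples above that although the redo size does not change during the UPDATEs, it changes a great deal at the IMU commit, far more than the redo produced by a commit in traditional mode. This is because the IMU commit contains not only the commit vector but also the data changes made before the commit, and this redo is written in a single bulk write. We can dump this redo entry and look at its contents: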
Connected.
HELLODBA.COM > set serveroutput on
HELLODBA.COM > var v_bt number ;
HELLODBA.COM > var v_et number ;
HELLODBA.COM > alter session set "_in_memory_undo" = false;
Session altered.
HELLODBA.COM > update tt set x = 1 where rownum <= 1 ;
1 row updated.
HELLODBA.COM > update tt set x = 2 where rownum <= 1 ;
1 row updated.
HELLODBA.COM > update tt set x = 3 where rownum <= 1 ;
1 row updated.
HELLODBA.COM > begin
2 select current_scn into :v_bt from v$database;
3 dbms_output.put_line('' || :v_bt);
4 end;
5 /
6328064
PL/SQL procedure successfully completed.
HELLODBA.COM > commit;
Commit complete.
HELLODBA.COM > begin
2 select current_scn into :v_et from v$database;
3 dbms_output.put_line('' || :v_et);
4 end;
5 /
6328067
PL/SQL procedure successfully completed.
HELLODBA.COM > declare
2 v_log varchar2(2000);
3 v_sql varchar2(4000);
4 begin
5 select a.member into v_log from v$logfile a, v$log b where a.group# = b.group# and b.status = 'CURRENT' and rownum <= 1;
6 execute immediate 'alter system switch logfile';
7 v_sql := 'alter system dump logfile ''' || v_log || ''' SCN MIN ' || :v_bt || ' SCN MAX ' || :v_et;
8 execute immediate v_sql;
9 end;
10 /
PL/SQL procedure successfully completed.
As the trace file shows, this redo entry contains multiple changes:
REDO RECORD - Thread: 1 RBA: 0x0000c8.00000f39.0010 LEN: 0x046c VLD: 0x0d
SCN: 0x0000.00608ed4 SUBSCN: 1 11/16/2009 14:59:10
CHANGE #1 TYP: 2 CLS: 1 AFN: 4 DBA: 0x010016cf OBJ: 74952 SCN: 0x0000.00602dc7 SEQ: 4 OP: 11.19
KTB Redo
...
CHANGE #2 TYP: 0 CLS: 17 AFN: 3 DBA: 0x00c00009 OBJ: 4294967295 SCN: 0x0000.00608e9b SEQ: 2 OP: 5.2
...
CHANGE #8 TYP: 0 CLS: 18 AFN: 3 DBA: 0x00c006f7 OBJ: 4294967295 SCN: 0x0000.00608ed4 SEQ: 2 OP: 5.1
...
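When reading a dump like the one above it helps to convert the hexadecimal fields. The Python helpers below are a sketch with hypothetical names: scn_to_int treats the SCN as wrap.base (wrap in the high bits, a 32-bit base in the low bits), and decode_dba splits a data block address into a 10-bit relative file number and a 22-bit block number.

```python
def scn_to_int(text):
    """Convert an SCN printed as 0xWRAP.BASE (hex) to a decimal number."""
    cleaned = text.replace(" ", "").lower().replace("0x", "")
    wrap, base = cleaned.split(".")
    return (int(wrap, 16) << 32) | int(base, 16)


def decode_dba(dba):
    """Split a data block address into (relative file number, block number)."""
    return dba >> 22, dba & 0x3FFFFF


print(scn_to_int("0x0000.00608ed4"))  # 6328020
print(decode_dba(0x010016CF))         # (4, 5839)
print(decode_dba(0x00C00009))         # (3, 9)
```

The file numbers obtained this way, 4 and 3, match the AFN values printed next to the corresponding DBA fields in the trace.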
IMU Flush
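The IMU pool is also managed with an LRU algorithm. When the IMU pool does not have enough free memory to allocate, the buffers at the LRU end of the buffer chain are flushed out. Other events also trigger an IMU flush, such as a logfile switch or a rollback. However, although the IMU pool is allocated from the shared pool, manually flushing the shared pool does not cause an IMU flush. When an IMU flush occurs, the undo and redo data are likewise written in bulk.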
Connected.
HELLODBA.COM > alter session set "_in_memory_undo" = true;
Session altered.
HELLODBA.COM > update tt set x = 1 ;
1 row updated.
HELLODBA.COM > update tt set x = 2 ;
1 row updated.
HELLODBA.COM > update tt set x = 3 ;
1 row updated.
HELLODBA.COM > select b.name, a.value from v$sysstat a, v$statname b where a.statistic# = b.statistic# and b.name like '%IMU%';
NAME VALUE
---------------------------------------------------------------- ----------
IMU commits 320
IMU Flushes 159
IMU contention 19
...
13 rows selected.
HELLODBA.COM > alter system switch logfile;
System altered.
HELLODBA.COM > select b.name, a.value from v$sysstat a, v$statname b where a.statistic# = b.statistic# and b.name like '%IMU%';
NAME VALUE
---------------------------------------------------------------- ----------
IMU commits 320
IMU Flushes 160
IMU contention 20
...
13 rows selected.
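Tips: by dumping the transaction's undo block you can compare it before and after the IMU commit/flush: before the commit/flush, no data has been written to it.
IMU CR
In a traditional transaction, a consistent read reads undo data from the corresponding undo blocks and applies it. With IMU, the undo data has not yet been written to the undo blocks before an IMU commit or IMU flush occurs, so a consistent read at that point reads the undo information from the corresponding IMU nodes in the IMU pool.
HELLODBA.COM > conn demo/demo@ora11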
Connected.
HELLODBA.COM > alter session set "_in_memory_undo" = true;
Session altered.
HELLODBA.COM > update tt set x = 1 ;
1 row updated.
HELLODBA.COM > update tt set x = 2 ;
1 row updated.
HELLODBA.COM > update tt set x = 3 ;
1 row updated.
-- Session 2:
HELLODBA.COM > conn demo/demo@ora11
Connected.
HELLODBA.COM > alter system flush buffer_cache;
System altered.
HELLODBA.COM > alter session set tracefile_identifier = IMU_CR;
Session altered.
HELLODBA.COM > alter session set events '10046 trace name context forever, level 8';
Session altered.
HELLODBA.COM > select * from tt;
X
----------
3
HELLODBA.COM > alter session set events '10046 trace name context off';
Session altered.
HELLODBA.COM > select b.name, a.value from v$mystat a, v$statname b where a.statistic# = b.statistic# and b.name like '%IMU%';
NAME VALUE
---------------------------------------------------------------- ----------
...
IMU CR rollbacks 3
...
13 rows selected.
From "ITPUB Blog", link: http://blog.itpub.net/24383181/viewspace-714487/. If you repost this article, please credit the source; otherwise legal action may be taken.