为什么测试
对于大批量并行插入的表来说,sequence基础上建了的主键,因为索引的“向右”分裂所形成的热点,造成buffer busy wait,会对插入性能有较大的损耗。
反向索引是一个解决方案,但反向索引不支持范围扫描,这也会造成困惑,有人会说,主键上我们又不用范围扫描。但对于插入频繁的表来说,随着时间的推
移,数据量必定巨大。在oltp系统中,必然会将历史数据迁移,一般会采用将历史分区drop的方式来达到快速清理的目的,但如果是反向索引,这将是个艰巨
任务。在oracle 10g中,引入了hash分区索引,我觉得是替代反向索引的一种好方法。
测试思路
通过8个进程并行向三个表中插入数据,比较其插入速度,Segments by Buffer Busy Waits,Transaction/second,CPU time (seconds)等方面的数据进行比较。
测试方法
因为只有并发才会存在争用,如果只是一个进程在运行,我想NOREVERS表现肯定是最好的,因为不管hash还是REVERS都是需要消耗资源的。当然2个并发的,3个并发的。。。每个情况所测试出的结果都是不一致的。选用8个并发也是一个挺极端的情形了(如果有先后就不是并发的,很多情况下,插入并不是并发的)。
测试脚本
13:39:49 david@test>create table t
13:39:51 2 partition by range(id)
13:39:51 3 (
13:39:51 4 partition part1 values less than(2000000),
13:39:51 5 partition part2 values less than(4000000),
13:39:51 6 partition part3 values less than(6000000),
13:39:51 7 partition part4 values less than(8000000),
13:39:51 8 partition part5 values less than(10000000),
13:39:51 9 partition part6 values less than(12000000)
13:39:51 10 )
13:39:51 11 as
13:39:51 12 select 0 id, a.* from all_objects a where 1 = 0;
13:42:24 david@test>alter table T add session_id varchar2(2);
Sequence
13:42:15 david@test>create sequence s cache 1000;主键
13:42:46 david@test>alter table t
13:43:00 2 add constraint t_pk
13:43:00 3 primary key (id)
13:43:00 4 using index (create index t_pk on t(id));存储过程
create or replace procedure do_sql(v_session varchar2)
as
begin
for i in 1..10 loop
for x in ( select rownum r, all_objects.* from all_objects )
loop
insert into t
( id,session_id,OWNER, OBJECT_NAME, SUBOBJECT_NAME,
OBJECT_ID, DATA_OBJECT_ID, OBJECT_TYPE, CREATED,
LAST_DDL_TIME, TIMESTAMP, STATUS, TEMPORARY,
GENERATED, SECONDARY )
values
( s.nextval,v_session, x.OWNER, x.OBJECT_NAME, x.SUBOBJECT_NAME,
x.OBJECT_ID, x.DATA_OBJECT_ID, x.OBJECT_TYPE, x.CREATED,
x.LAST_DDL_TIME, x.TIMESTAMP, x.STATUS, x.TEMPORARY,
x.GENERATED, x.SECONDARY );
if ( mod(x.r,100) = 0 )
then
commit;
end if;
end loop;
end loop;
commit;
end;
/
并行8个do_sql('1')对表进行插入。
测试结果
| NoReverse | Reverse | hash | 数据量 |
Transaction/second | 83.65 | 167.29 | 167.27 | 10044560 |
Buffer Busy Waits/time(s) | 2,789,971/ 3,602 | 88,359/ 54 | 379,480/240 | 10044560 |
CPU time (s) | 3,748 | 997 | 1,160 | 10044560 |
Elapsed time (s) | 5,567 | 1,409 | 1,512 | 10044560 |
Avg_insert(s)(大约) | 8801 | 24000 | 24000 |
|
很明显,noreverse对插入的影响还是比较大的,在这里hash和reverse的表现差不多,reverse的表现稍好。
但我们来看看hash在范围查询的表现。
查询测试
11:25:33 david@test>select count(*) from t where id>1 and id<100;
COUNT(*)
----------
98
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 4152626091
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 4 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
|* 2 | INDEX RANGE SCAN| T_PK | 98 | 1274 | 4 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID">1 AND "ID"<100)
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
515 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
11:25:44 david@test>select count(*) from t_REVERSE where id>1 and id<100;
COUNT(*)
----------
98
Elapsed: 00:00:08.38
Execution Plan
----------------------------------------------------------
Plan hash value: 108326537
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 8088 (1)| 00:01:38 |
| 1 | SORT AGGREGATE | | 1 | 13 | | |
|* 2 | INDEX FAST FULL SCAN| TR_PK | 362 | 4706 | 8088 (1)| 00:01:38 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("ID">1 AND "ID"<100)
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
5 recursive calls
2 db block gets
39210 consistent gets
37119 physical reads
45016 redo size
515 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
11:26:08 david@test>select count(*) from t_hash where id>1 and id<100;
COUNT(*)
----------
98
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 1539008444
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01 | | |
| 1 | SORT AGGREGATE | | 1 | 13 | | | | |
| 2 | PARTITION HASH ALL| | 98 | 1274 | 2 (0)| 00:00:01 | 1 | 16 |
|* 3 | INDEX RANGE SCAN | TH_PK | 98 | 1274 | 2 (0)| 00:00:01 | 1 | 16 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("ID">1 AND "ID"<100)
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
48 consistent gets
0 physical reads
0 redo size
515 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
rows processed
-
在这里,我们单单比较逻辑读
NoReverse 3 consistent gets
Reverse 39210 consistent gets
Hash 48 consistent gets
这里因为统计的是总数,Reverse类型索引还是走了INDEX FAST FULL SCAN,如果是插叙数据的话,很多时候是不会走索引的,应该走的是全表扫描。这里就不做这个实验的,这里仅仅对最简单的查询方式做一下对比,可以看出Reverse索引是不合适范围扫描的。这里简单解释一下,范围扫描为什么hash比NoReverse多了这么多逻辑读。
14:58:09 david@test > alter table t_hash
15:01:48 2 add constraint th_pk
15:01:48 3 primary key (id)
15:01:48 4 using index (create index th_pk on t_hash(id) GLOBAL PARTITION BY HASH (ID) PARTITIONS 16);
这是因为hash有16个分区,我想在这里大家就明白了hash为什么是48了,3*16=48。
清理分区测试
有人会说,我不使用范围扫描。在这里,我们切换到本文开头,建有不同索引的表在清理分区的表现。只比较hash和reverse这两种索引
为了是对比更明显,我找了一个更具有代表性的大表来做这个测试
--反向索引
16:06:28 david@test>alter table test.big drop partition P_20101111 update global indexes;
Table altered.
Elapsed: 00:31:23.40
16:54:40 david@test>create index test.idx_DEALID on test.big(ID) GLOBAL PARTITION BY HASH (DEALID) PARTITIONS 16 tablespace david_ts ;
Index created.
--hash索引。
17:20:06 david@test>alter table test.big drop partition P_20101116 update global indexes;
Table altered.
Elapsed: 00:01:01.51
结论:不言自明。
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/29033984/viewspace-1061719/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/29033984/viewspace-1061719/