REVERSE与hash索引测试

最新推荐文章于 2024-11-01 12:20:35 发布

cszt85637

最新推荐文章于 2024-11-01 12:20:35 发布

阅读量202

点赞数

文章标签：数据库

为什么测试
对于大批量并行插入的表来说，sequence基础上建了的主键，因为索引的“向右”分裂所形成的热点，造成buffer busy wait，会对插入性能有较大的损耗。
反向索引是一个解决方案，但反向索引不支持范围扫描，这也会造成困惑，有人会说，主键上我们又不用范围扫描。但对于插入频繁的表来说，随着时间的推
移，数据量必定巨大。在oltp系统中，必然会将历史数据迁移，一般会采用将历史分区drop的方式来达到快速清理的目的，但如果是反向索引，这将是个艰巨
任务。在oracle 10g中，引入了hash分区索引，我觉得是替代反向索引的一种好方法。

测试思路

通过8个进程并行向三个表中插入数据，比较其插入速度，Segments by Buffer Busy Waits，Transaction/second,CPU time (seconds)等方面的数据进行比较。
测试方法
    因为只有并发才会存在争用，如果只是一个进程在运行，我想NOREVERS表现肯定是最好的，因为不管hash还是REVERS都是需要消耗资源的。当然2个并发的，3个并发的。。。每个情况所测试出的结果都是不一致的。选用8个并发也是一个挺极端的情形了(如果有先后就不是并发的，很多情况下，插入并不是并发的)。
测试脚本


13:39:49 david@test>create table t
13:39:51   2 partition by range(id)
13:39:51   3 (
13:39:51   4      partition part1 values less than(2000000),
13:39:51   5      partition part2 values less than(4000000),
13:39:51   6      partition part3 values less than(6000000),
13:39:51   7      partition part4 values less than(8000000),
13:39:51   8      partition part5 values less than(10000000),
13:39:51   9      partition part6 values less than(12000000)
13:39:51 10 )
13:39:51 11 as
13:39:51 12 select 0 id, a.* from all_objects a where 1 = 0;
13:42:24 david@test>alter table T add session_id varchar2(2);
Sequence
13:42:15 david@test>create sequence s cache 1000;

主键
13:42:46 david@test>alter table t
13:43:00   2 add constraint t_pk
13:43:00   3 primary key (id)
13:43:00   4 using index (create index t_pk on t(id));

存储过程
create or replace procedure do_sql(v_session varchar2)
as
begin
for i in 1..10 loop
for x in ( select rownum r, all_objects.* from all_objects )
loop
insert into t
( id,session_id,OWNER, OBJECT_NAME, SUBOBJECT_NAME,
OBJECT_ID, DATA_OBJECT_ID, OBJECT_TYPE, CREATED,
LAST_DDL_TIME, TIMESTAMP, STATUS, TEMPORARY,
GENERATED, SECONDARY )
values
( s.nextval,v_session, x.OWNER, x.OBJECT_NAME, x.SUBOBJECT_NAME,
x.OBJECT_ID, x.DATA_OBJECT_ID, x.OBJECT_TYPE, x.CREATED,
x.LAST_DDL_TIME, x.TIMESTAMP, x.STATUS, x.TEMPORARY,
x.GENERATED, x.SECONDARY );
if ( mod(x.r,100) = 0 )
then
commit;
end if;
end loop;
end loop;
commit;
end;
/

           并行8个do_sql('1')对表进行插入。
      测试结果

	NoReverse	Reverse	hash	数据量
Transaction/second	83.65	167.29	167.27	10044560
Buffer Busy Waits/time(s)	2,789,971/ 3,602	88,359/ 54	379,480/240	10044560
CPU time (s)	3,748	997	1,160	10044560
Elapsed time (s)	5,567	1,409	1,512	10044560
Avg_insert(s)（大约）	8801	24000	24000

     很明显，noreverse对插入的影响还是比较大的，在这里hash和reverse的表现差不多，reverse的表现稍好。
但我们来看看hash在范围查询的表现。
     查询测试

11:25:33 david@test>select count(*) from t where id>1 and id<100;

COUNT(*)
----------
98

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 4152626091

--------------------------------------------------------------------------
| Id | Operation         | Name | Rows | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |     1 |    13 |     4   (0)| 00:00:01 |
|   1 | SORT AGGREGATE   |      |     1 |    13 |            |          |
|* 2 |   INDEX RANGE SCAN| T_PK |    98 | 1274 |     4   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("ID">1 AND "ID"<100)

Note
-----
- dynamic sampling used for this statement

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          3 consistent gets
          0 physical reads
          0 redo size
        515 bytes sent via SQL*Net to client
        492 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

11:25:44 david@test>select count(*) from t_REVERSE where id>1 and id<100;

COUNT(*)
----------
98

Elapsed: 00:00:08.38

Execution Plan
----------------------------------------------------------
Plan hash value: 108326537

-------------------------------------------------------------------------------
| Id | Operation             | Name | Rows | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |     1 |    13 | 8088   (1)| 00:01:38 |
|   1 | SORT AGGREGATE       |       |     1 |    13 |            |          |
|* 2 |   INDEX FAST FULL SCAN| TR_PK |   362 | 4706 | 8088   (1)| 00:01:38 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("ID">1 AND "ID"<100)

Note
-----
- dynamic sampling used for this statement

Statistics
----------------------------------------------------------
          5 recursive calls
          2 db block gets
      39210 consistent gets
      37119 physical reads
      45016 redo size
        515 bytes sent via SQL*Net to client
        492 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

11:26:08 david@test>select count(*) from t_hash where id>1 and id<100;

COUNT(*)
----------
98

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1539008444

---------------------------------------------------------------------------------------------
| Id | Operation           | Name | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |     1 |    13 |     2   (0)| 00:00:01 |       |       |
|   1 | SORT AGGREGATE     |       |     1 |    13 |            |          |       |       |
|   2 |   PARTITION HASH ALL|       |    98 | 1274 |     2   (0)| 00:00:01 |     1 |    16 |
|* 3 |    INDEX RANGE SCAN | TH_PK |    98 | 1274 |     2   (0)| 00:00:01 |     1 |    16 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("ID">1 AND "ID"<100)

Note
-----
- dynamic sampling used for this statement

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
         48 consistent gets
          0 physical reads
          0 redo size
        515 bytes sent via SQL*Net to client
        492 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
rows processed

在这里，我们单单比较逻辑读
NoReverse 3 consistent gets
Reverse    39210 consistent gets
Hash      48 consistent gets
这里因为统计的是总数，Reverse类型索引还是走了INDEX FAST FULL SCAN，如果是插叙数据的话，很多时候是不会走索引的，应该走的是全表扫描。这里就不做这个实验的，这里仅仅对最简单的查询方式做一下对比，可以看出Reverse索引是不合适范围扫描的。这里简单解释一下，范围扫描为什么hash比NoReverse多了这么多逻辑读。
14:58:09 david@test > alter table t_hash
15:01:48   2 add constraint th_pk
15:01:48   3 primary key (id)
15:01:48   4 using index (create index th_pk on t_hash(id) GLOBAL PARTITION BY HASH (ID) PARTITIONS 16);
这是因为hash有16个分区，我想在这里大家就明白了hash为什么是48了，3*16=48。

     清理分区测试
有人会说，我不使用范围扫描。在这里，我们切换到本文开头，建有不同索引的表在清理分区的表现。只比较hash和reverse这两种索引

为了是对比更明显，我找了一个更具有代表性的大表来做这个测试
--反向索引
16:06:28 david@test>alter table test.big drop partition P_20101111 update global indexes;

Table altered.

Elapsed: 00:31:23.40
16:54:40 david@test>create index test.idx_DEALID on test.big(ID) GLOBAL PARTITION BY HASH (DEALID) PARTITIONS 16 tablespace david_ts ;

Index created.

--hash索引。
17:20:06 david@test>alter table test.big drop partition P_20101116 update global indexes;

Table altered.

Elapsed: 00:01:01.51

结论：不言自明。

来自 “ ITPUB博客 ” ，链接：http://blog.itpub.net/29033984/viewspace-1061719/，如需转载，请注明出处，否则将追究法律责任。

转载于:http://blog.itpub.net/29033984/viewspace-1061719/