假设一个表中有9行记录,分布在3个block上,按照其中列col1的分布情况,分为以下两种情况:
Block 1 Block 2 Block 3
------- ------- --------
A A A B B B C C C
Block 1 Block 2 Block 3
------- ------- --------
A B C A B C A B C
第一种情况,相同的值分布在同一个block里,第二种情况,相同的三个值分布在3个block里,在列col1上创建一个索引。
显然,当进行如下查询时:
select * from t where col1='A';
oracle首先进行索引范围扫描,获取相应的列col1='A’的rowid,之后根据rowid从表中获取数据。对于第一种情况,只需要扫描block 1,即可获得想要的结果,而对于第二种情况,需要扫描block 1,block 2,block 3共3个block。显然,第二种情况下的需要进行的IO次数大于第一种情况。
以下例子,首先根据第一种情况构造一个测试表test1:
SQL> create table test1 (a char(2000),b char(100));
Table created.
SQL> insert into test1 values('A','X');
1 row created.
SQL> /
1 row created.
SQL> /
1 row created.
SQL> COMMIT;
Commit complete.
SQL> select dbms_rowid.rowid_block_number(rowid) from test1;
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------
3442
3442
3442
可见,三行均位于同一个block里。
SQL> insert into test1 values('B','X');
1 row created.
SQL> /
1 row created.
SQL> /
1 row created.
SQL> commit;
Commit complete.
SQL> select dbms_rowid.rowid_block_number(rowid) from test1;
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------
3442
3442
3442
3443
3443
3443
6 rows selected.
SQL> insert into test1 values('C','X');
1 row created.
SQL> /
1 row created.
SQL> /
1 row created.
SQL> commit;
Commit complete.
SQL> select dbms_rowid.rowid_block_number(rowid) from test1;
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------
3442
3442
3442
3443
3443
3443
3444
3444
3444
9 rows selected.
以上构造出第一种情况,即值为'A','B','C'的列分别各自位于同一个block里。再构造第二种情况:
SQL> create table test2(a char(2000),b char(100));
Table created.
SQL> insert into test2 values('A','X');
1 row created.
SQL> insert into test2 values('B','X');
1 row created.
SQL> insert into test2 values('C','X');
1 row created.
SQL> commit;
Commit complete.
SQL> select dbms_rowid.rowid_block_number(rowid) from test2;
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------
3450
3450
3450这样,'A','B','C'这三个不同值的列均位于同一个block里了。然后再构造2个类似的block:
SQL> insert into test2 select * from test2 where rownum<=3;
3 rows created.
SQL> /
3 rows created.
SQL> select dbms_rowid.rowid_block_number(rowid) from test2;
DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------
3450
3450
3450
3451
3451
3451
3452
3452
3452
接下来执行多次的insert into test1 select * from test1和 insert into test2 select * from test2,对这两个表各自构造出大量数据。
SQL> select count(*) from test1;
COUNT(*)
----------
4608
SQL> select count(*) from test2;
COUNT(*)
----------
4608
分别在test1和test2上创建索引:
SQL> create index idx_test1 on test1(a);
Index created.
SQL> create index idx_test2 on test2(a);
Index created.
执行索引范围扫描:
SQL> set autot traceonly statistics
SQL> select /*+ index(test1,idx_test1) */ * from test1 where a like 'A%';
1536 rows selected.
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1237 consistent gets 0 physical reads
0 redo size
23223 bytes sent via SQL*Net to client
1517 bytes received via SQL*Net from client
104 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1536 rows processed
SQL> select /*+ index(test2,idx_test2) */ * from test2 where a like 'A%';
1536 rows selected.
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2158 consistent gets 0 physical reads
0 redo size
23223 bytes sent via SQL*Net to client
1517 bytes received via SQL*Net from client
104 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1536 rows processed
显然,对test1的逻辑读要小于对test2的逻辑读,再看两个索引的聚簇因子以及两张表占用的block数量:
SQL> select index_name,CLUSTERING_FACTOR from user_Indexes where index_name in('IDX_TEST1','IDX_TEST2');
INDEX_NAME CLUSTERING_FACTOR
------------------------------ -----------------
IDX_TEST1 1536
IDX_TEST2 4608
SQL> select count(distinct(dbms_rowid.rowid_block_number(rowid))) from test1;
COUNT(DISTINCT(DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)))
-----------------------------------------------------
1536
SQL> select count(distinct(dbms_rowid.rowid_block_number(rowid))) from test2;
COUNT(DISTINCT(DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)))
-----------------------------------------------------
1536
SQL> select count(*) from test1;
COUNT(*)
----------
4608
SQL> select count(*) from test2;
COUNT(*)
----------
4608
可以看出,上述测试用例模拟了聚簇因子的两种极端情况,IDX_TEST1的聚簇因子与表的blcok数相等,而IDX_TEST2的聚簇因子与表的记录数相等。