oracle 在 dba_indexes 视图中提供一个名为 clustering_factor 的列,通知优化器关于表的行与索引的同步情况。当集簇因子接近数据块的数量时,表的行与索引同步,即列值相同的数据行存放得比较集中,聚集度高。 列值的选择性、db_block_size、avg_row_len 以及集合基数全都协同工作,帮助优化器决定是使用索引还是使用全表扫描。如果数据列具有高度的选择性和低的 clustering_factor,则索引扫描通常是最快的执行方法。即使列具有高度的选择性,高 clustering_factor 和小 avg_row_len 也会表示列值在表中随机分布,而获取这些行需要额外的 I/O。在此情况下,索引范围扫描会导致大量不必要的 I/O;全表扫描则会高效得多。
【实验环境】
操作系统:RHEL 5.5
数据库:Oracle 10.2.0
【实验过程】
1、环境部署
1.1、创建表、添加索引
SCOTT@ prod>create table test as select * from emp;
SCOTT@ prod>create index ind_test_empno on test (empno);
1.2、分析表
SCOTT@ prod>analyze table test compute statistics;
1.3、查看列值的集簇因子:
SYS@ prod>select t.table_name,t.NUM_ROWS,t.BLOCKS,t.AVG_ROW_LEN,i.index_name,
i.CLUSTERING_FACTOR from dba_tables t,dba_indexes i where t.table_name=i.table_name
and t.owner='SCOTT' and t.table_name='TEST';
TABLE_NAME NUM_ROWS BLOCKS AVG_ROW_LEN INDEX_NAME CLUSTERING_FACTOR
---------- ---------- ------- ----------- --------------- -----------------
TEST 14 4 40 IND_TEST_EMPNO 1
NUM_ROWS 总行数14
AVG_ROW_LEN 平均每行长度40 bytes
14*40=560,一个块中14行占了560 bytes字节,一个块的大小是8192bytes(8K)。
560/8192=0.0683,560个字节约占了一个块的7%
1.4、使用索引列进行查询
SCOTT@ prod>set autotrace traceonly SCOTT@ prod>select * from test where empno=7788; Execution Plan ---------------------------------------------------------- Plan hash value: 4043037449 -------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 32 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| TEST | 1 | 32 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IND_TEST_EMPNO | 1 | | 1 (0)| 00:00:01 | -------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("EMPNO"=7788) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 0 consistent gets 0 physical reads 0 redo size 0 bytes sent via SQL*Net to client 0 bytes received via SQL*Net from client 0 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed |
1.5、修改pctfree值,改为93,插入更多值
SCOTT@ prod>alter table test pctfree 93; SCOTT@ prod>insert into test select * from emp; SCOTT@ prod>/ SCOTT@ prod>/ SCOTT@ prod>/ SCOTT@ prod>/ SCOTT@ prod>/ SCOTT@ prod>/ SCOTT@ prod>commit;
|
1.6、分析表、查看执行计划
SCOTT@ prod>analyze table test estimate statistics; SCOTT@ prod>select * from test where empno=7788; 8 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1357081020 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 8 | 256 | 5 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL | TEST | 8 | 256 | 5 (0)| 00:00:01 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("EMPNO"=7788) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 0 consistent gets 0 physical reads 0 redo size 0 bytes sent via SQL*Net to client 0 bytes received via SQL*Net from client 0 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 8 rows processed |
1.7、查看集簇因子
SYS@ prod>select t.table_name,t.NUM_ROWS,t.BLOCKS,t.AVG_ROW_LEN,i.index_name, i.CLUSTERING_FACTOR from dba_tables t,dba_indexes i where t.table_name=i.table_name and t.owner='SCOTT' and t.table_name='TEST';
TABLE_NAME NUM_ROWS BLOCKS AVG_ROW_LEN INDEX_NAME CLUSTERING_FACTOR ---------- ---------- ------- ----------- --------------- ----------------- TEST 112 16 40 IND_TEST_EMPNO 112 |
CLUSTERING_FACTOR的值和NUM_ROWS的值接近,列值相同的行比较分散,走全表扫描。
可以对emp3重新排序,把列值相同的行变的集中,降低集簇因子的值。
2、改变集簇因子
2.1、创建中间表存放数据
SCOTT@ prod>create table test_tmp as select * from test;
|
2.2、truncate原表
SCOTT@ prod>truncate table test; |
2.3、按顺序重新插入数据
SCOTT@ prod>insert into test select * from test_tmp order by empno; SCOTT@ prod>commit; SCOTT@ prod>select * from test where rownum < 10; EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO ------ ----- --------- ----- ------------------- ------- ------ ------- 7369 SMITH CLERK 7902 1980-12-17 00:00:00 800 20 7369 SMITH CLERK 7902 1980-12-17 00:00:00 800 20 7369 SMITH CLERK 7902 1980-12-17 00:00:00 800 20 7369 SMITH CLERK 7902 1980-12-17 00:00:00 800 20 7369 SMITH CLERK 7902 1980-12-17 00:00:00 800 20 7369 SMITH CLERK 7902 1980-12-17 00:00:00 800 20 7369 SMITH CLERK 7902 1980-12-17 00:00:00 800 20 7369 SMITH CLERK 7902 1980-12-17 00:00:00 800 20 7499 ALLEN SALESMAN 7698 1981-02-20 00:00:00 1600 300 30 9 rows selected. |
2.4、重新分析表、查看执行计划
SCOTT@ prod>analyze table test estimate statistics; SCOTT@ prod>select * from test where empno=7788; 8 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 4043037449 -------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 8 | 256 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| TEST | 8 | 256 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IND_TEST_EMPNO | 8 | | 1 (0)| 00:00:01 | -------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("EMPNO"=7788) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 0 consistent gets 0 physical reads 0 redo size 0 bytes sent via SQL*Net to client 0 bytes received via SQL*Net from client 0 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 8 rows processed |
这次执行计划选择了索引
2.5、查看集簇因子的改变
SYS@ prod>select t.table_name,t.NUM_ROWS,t.BLOCKS,t.AVG_ROW_LEN,i.index_name, TABLE_NAME NUM_ROWS BLOCKS AVG_ROW_LEN INDEX_NAME CLUSTERING_FACTOR ---------- ---------- ------- ----------- --------------- ----------------- TEST 112 13 40 IND_TEST_EMPNO 13 |
【实验总结】
1、如果CLUSTERING_FACTOR的值和BLOCKS的值比较接近,则列值相同的行比较集中,则执行计划应该走索引。 2、如果CLUSTERING_FACTOR的值和NUM_ROWS的值比较接近,则列值相同的行比较分散,这样优化器会选择走全表扫描。如果想要走索引,需要对原表进行重排序,降低集簇因子的值。 |
吕星昊
2014.9.4
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/29475508/viewspace-1264262/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/29475508/viewspace-1264262/