- 有这样一个SQL
- select count(*) from t1,t2 where t1.id=t2.id and t1.owner='SCOTT';
- id列选择性很高,owner选择性很低
- 要优化它很简单,只需要在t1表上建立一个组合索引(owner,id),在t2表上建立一个索引(id)
- 现在要讨论的是我们应该怎么建立组合索引,哪一列(owner,id)应该放在最前面?
- 现在来做个实验
- SQL> desc t1
- Name Type Nullable Default Comments
- ----------- ------------- -------- ------- --------
- ID NUMBER Y
- OBJECT_NAME VARCHAR2(128) Y
- OWNER VARCHAR2(30) Y
- SQL> desc t2
- Name Type Nullable Default Comments
- --------- ----------- -------- ------- --------
- ID NUMBER Y
- STATUS VARCHAR2(7) Y
- TEMPORARY VARCHAR2(1) Y
- SQL> create index inx_id on t2(id);
- Index created.
- SQL> create index inx_id_owner on t1(id,owner);
- Index created.
- SQL> select count(*) from t1,t2 where t1.id=t2.id and t1.owner='SCOTT';
- Elapsed: 00:00:00.02
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 2432674005
- ---------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- ---------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 16 | 88 (2)| 00:00:02 |
- | 1 | SORT AGGREGATE | | 1 | 16 | | |
- |* 2 | HASH JOIN | | 2416 | 38656 | 88 (2)| 00:00:02 |
- |* 3 | INDEX FAST FULL SCAN| INX_ID_OWNER | 2416 | 26576 | 50 (0)| 00:00:01 |
- | 4 | INDEX FAST FULL SCAN| INX_ID | 72475 | 353K| 37 (0)| 00:00:01 |
- ---------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 2 - access("T1"."ID"="T2"."ID")
- 3 - filter("T1"."OWNER"='SCOTT')
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 392 consistent gets
- 0 physical reads
- 0 redo size
- 422 bytes sent via SQL*Net to client
- 420 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
- SQL> create index inx_owner_id on t1(owner,id);
- Index created.
- SQL> select count(*) from t1,t2 where t1.id=t2.id and t1.owner='SCOTT';
- Elapsed: 00:00:00.03
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 277464349
- ---------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- ---------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 16 | 47 (3)| 00:00:01 |
- | 1 | SORT AGGREGATE | | 1 | 16 | | |
- |* 2 | HASH JOIN | | 2416 | 38656 | 47 (3)| 00:00:01 |
- |* 3 | INDEX RANGE SCAN | INX_OWNER_ID | 2416 | 26576 | 9 (0)| 00:00:01 |
- | 4 | INDEX FAST FULL SCAN| INX_ID | 72475 | 353K| 37 (0)| 00:00:01 |
- ---------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 2 - access("T1"."ID"="T2"."ID")
- 3 - access("T1"."OWNER"='SCOTT')
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 169 consistent gets
- 0 physical reads
- 0 redo size
- 422 bytes sent via SQL*Net to client
- 420 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
- 如果OWNER作为引导列,逻辑读由以前的392变成现在的169,并且由以前的 index fast full scan 变成index range scan
- 如果强制指定走索引 inx_id_owner
- SQL> select /*+ index(t1 inx_id_owner) */ count(*) from t1,t2 where t1.id=t2.id and t1.owner='SCOTT';
- Elapsed: 00:00:00.03
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 3161475902
- ---------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- ---------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 16 | 259 (1)| 00:00:04 |
- | 1 | SORT AGGREGATE | | 1 | 16 | | |
- |* 2 | HASH JOIN | | 2416 | 38656 | 259 (1)| 00:00:04 |
- |* 3 | INDEX FULL SCAN | INX_ID_OWNER | 2416 | 26576 | 221 (1)| 00:00:03 |
- | 4 | INDEX FAST FULL SCAN| INX_ID | 72475 | 353K| 37 (0)| 00:00:01 |
- ---------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 2 - access("T1"."ID"="T2"."ID")
- 3 - access("T1"."OWNER"='SCOTT')
- filter("T1"."OWNER"='SCOTT')
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 387 consistent gets
- 0 physical reads
- 0 redo size
- 422 bytes sent via SQL*Net to client
- 420 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
- 依然要387个逻辑读。
- 为什么要以owner为引导列?因为ID是join列,并且ID列上面没过滤条件,如果以ID列作为引导列,由于没过滤条件
- 那么CBO只能走 index full scan,或者index fast full scan,因为引导列没过滤条件,走不了index range scan,
- 最多走index skip scan,不过index skip scan代价过高,因为index skip scan要求 引导列选择性很低,但是ID这里选择性很高
- SQL> select /*+ index_ss(t1 inx_id_owner) */ count(*) from t1,t2 where t1.id=t2.id and t1.owner='SYS';
- Elapsed: 00:00:00.10
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 3493079762
- ---------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- ---------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 16 | 72529 (1)| 00:14:31 |
- | 1 | SORT AGGREGATE | | 1 | 16 | | |
- |* 2 | HASH JOIN | | 2416 | 38656 | 72529 (1)| 00:14:31 |
- |* 3 | INDEX SKIP SCAN | INX_ID_OWNER | 2416 | 26576 | 72491 (1)| 00:14:30 |
- | 4 | INDEX FAST FULL SCAN| INX_ID | 72475 | 353K| 37 (0)| 00:00:01 |
- ---------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 2 - access("T1"."ID"="T2"."ID")
- 3 - access("T1"."OWNER"='SYS')
- filter("T1"."OWNER"='SYS')
- Statistics
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 387 consistent gets
- 0 physical reads
- 0 redo size
- 424 bytes sent via SQL*Net to client
- 420 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
- 如果owner列作为引导列,那么优化器就可以选择index range scan,这样相比index full scan, index fast full scan
- 肯定要少扫描很多leaf block,逻辑读就会相对较少。
- 其实到这里,是否可以总结一下建立组合索引的原则呢?
- 引导列要选择过滤条件的列作为引导列,比如 where a.xxx='xxx' 或者 a.xxx> 或者 a.xxx<
- 引导列的选择性越高越好,因为选择性越高,扫描的leaf block就越少,效率就越高
- 尽量把join列放到组合索引最后面
- 这里选择以owner列作为引导列,由于owner选择性很低,所以测试索引压缩对于性能的提升
- SQL> analyze index inx_owner_id validate structure;
- Index analyzed.
- SQL> select height,
- 2 blocks,
- 3 lf_blks,
- 4 br_blks,
- 5 OPT_CMPR_COUNT,
- 6 OPT_CMPR_PCTSAVE
- 7 from index_stats
- 8 where name = 'INX_OWNER_ID';
- HEIGHT BLOCKS LF_BLKS BR_BLKS OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
- ---------- ---------- ---------- ---------- -------------- ----------------
- 2 256 219 1 1 26
- SQL> drop index inx_owner_id;
- Index dropped
- SQL> create index inx_owner_id on t1(owner,id) compress 1;
- Index created
- SQL> analyze index inx_owner_id validate structure;
- Index analyzed
- SQL>
- SQL> select height,
- 2 blocks,
- 3 lf_blks,
- 4 br_blks,
- 5 OPT_CMPR_COUNT,
- 6 OPT_CMPR_PCTSAVE
- 7 from index_stats
- 8 where name = 'INX_OWNER_ID';
- HEIGHT BLOCKS LF_BLKS BR_BLKS OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
- ---------- ---------- ---------- ---------- -------------- ----------------
- 2 256 161 1 1 0
- 索引压缩之后,Leaf block 由原来的219降低到161个,节约了58个block 现在再来看一看执行计划+统计信息
- SQL> select count(*) from t1,t2 where t1.id=t2.id and t1.owner='SCOTT';
- Elapsed: 00:00:00.03
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 277464349
- ---------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- ---------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 16 | 45 (3)| 00:00:01 |
- | 1 | SORT AGGREGATE | | 1 | 16 | | |
- |* 2 | HASH JOIN | | 2416 | 38656 | 45 (3)| 00:00:01 |
- |* 3 | INDEX RANGE SCAN | INX_OWNER_ID | 2416 | 26576 | 7 (0)| 00:00:01 |
- | 4 | INDEX FAST FULL SCAN| INX_ID | 72475 | 353K| 37 (0)| 00:00:01 |
- ---------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 2 - access("T1"."ID"="T2"."ID")
- 3 - access("T1"."OWNER"='SCOTT')
- Statistics
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 169 consistent gets
- 0 physical reads
- 0 redo size
- 422 bytes sent via SQL*Net to client
- 420 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
- 由此可见,索引压缩之后,逻辑读并没有下降,还是169,但是索引的leaf blcok显著减少了,这样减少了存储空间,能降低物理IO
转载:http://blog.csdn.net/robinson1988/article/details/6663503