索引中字段的顺序确实会对访问性能有所影响,不过并不像很多人想象的那么简单。
这一篇讨论选项性高的列作为前缀索引是应用于范围扫描的情况。
索引中字段顺序对访问性能影响(一):http://yangtingkun.itpub.net/post/468/509188
索引中字段顺序对访问性能影响(二):http://yangtingkun.itpub.net/post/468/509265
前面介绍了相等条件时选择性高字段前缀和选择性低字段前缀在性能上没有差别,另外还介绍了选择性低字段前缀索引的一些优点。但是并不是说建立复合索引的时候就一定要将选择性差的字段放在前面,除了考虑是否存在大量单条件访问的情况外,还必须考虑非等值查询的情况。
不考虑全扫描的情况,索引扫描一般由于等值查询或索引字段的范围扫描,如果查询中包括大量的范围扫描,那么情况就不那么简单了。
SQL> drop index ind_t_owner_id;
Index dropped.
SQL> create index ind_t_id_owner
2 on t (object_id, owner);
Index created.
SQL> set autot on
SQL> select object_name
2 from t
3 where owner > 'S'
4 and object_id = 50000;
OBJECT_NAME
-----------------------------------------------------------
MGMT_CURRENT_METRIC_ERRORS
Execution Plan
----------------------------------------------------------
Plan hash value: 1951931306
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 35 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 35 | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IND_T_ID_OWNER | 1 | | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID"=50000 AND "OWNER">'S' AND "OWNER" IS NOT NULL)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
542 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
SQL> select object_name
2 from t
3 where owner > 'S'
4 and object_id in (50000, 50001);
OBJECT_NAME
-------------------------------
MGMT_CURRENT_METRIC_ERRORS
MGMT_TARGET_ASSOC
Execution Plan
----------------------------------------------------------
Plan hash value: 2709690412
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 35 | 4 (0)| 00:00:01 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| T | 1 | 35 | 4 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | IND_T_ID_OWNER | 1 | | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access(("OBJECT_ID"=50000 OR "OBJECT_ID"=50001) AND "OWNER">'S' AND "OWNER" IS
NOT NULL)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
612 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
2 rows processed
如果是后缀字段的范围扫描,对于前缀字段选择性高情况下,查询基本等同于相等的查询。而如果前缀字段是数量不多的IN查询,那么执行计划变成INLIST方式的扫描,方式类似循环中的相等条件查询。
而如果是前缀字段的范围查询:
SQL> select object_name
2 from t
3 where object_id > 45000
4 and wner = 'U1';
OBJECT_NAME
----------------------------------
T_BIG
S1
T
Execution Plan
----------------------------------------------------------
Plan hash value: 1951931306
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 35 | 88 (2)| 00:00:02 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 35 | 88 (2)| 00:00:02 |
|* 2 | INDEX RANGE SCAN | IND_T_ID_OWNER | 1 | | 87 (2)| 00:00:02 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID">45000 AND "OWNER"='U1' AND "OBJECT_ID" IS NOT NULL)
filter("OWNER"='U1')
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
59 consistent gets
0 physical reads
0 redo size
583 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
可以看到,逻辑读有了数据量的提升,而且在Predicate Information中可以看到出现了filter("OWNER"='U1')。
ACCESS方式是指根据查询条件去定位记录,而FILTER方式是指在获得的记录上应用限制条件来进行过滤。
在当前的案例,在ACCESS和FILTER中都出现OWNER = U1这个条件,但是ACCESS中的条件只是确定索引扫描的起点,而在索引扫描的过程中仅根据OBJECT_ID这一个条件来获取记录,因此最终还需要FILTER的步骤来过滤满足后缀字段的限制条件。
为什么当出现了前缀字段的范围查询,后缀字段的查询条件就失去意义呢,这与索引的结构有关。索引的树形结构使得索引支持等值查询,也就是通过索引的根节点、树枝节点最终定位到索引的叶节点。而索引的相邻叶节点直接是双向链表结构,使得索引范围扫描可以实现。当索引根据前缀字段开始范围扫描时,显然没有办法根据后缀字段的值在链表结构中跳跃执行,因此后缀的限制条件只能变成FILTER过滤条件。
SQL> select object_name
2 from t
3 where object_id > 45000
4 and owner in ('U1', 'U2');
OBJECT_NAME
----------------------------------------------
T_BIG
S1
T
Execution Plan
----------------------------------------------------------
Plan hash value: 1951931306
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 35 | 88 (2)| 00:00:02 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 35 | 88 (2)| 00:00:02 |
|* 2 | INDEX RANGE SCAN | IND_T_ID_OWNER | 1 | | 87 (2)| 00:00:02 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID">45000 AND "OBJECT_ID" IS NOT NULL)
filter("OWNER"='U1' OR "OWNER"='U2')
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
59 consistent gets
0 physical reads
0 redo size
583 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
修改一下查询语句,将后缀字段变为IN语句,执行计划、返回结果和逻辑读都没有发生变化,但是ACCESS和FILTER发生了变化。整个后缀字段的限制条件完全出现在FILTER语句中。
可以看到这个查询和上一个查询并没有显著的差别,因为当前的数据分布对于定位数据的时候通过前缀字段还是通过前缀、后缀字段同时进行没有差别。
可以通过下面的例子来构造出二者的区别:
SQL> insert into t
2 (object_id, owner)
3 select 45001, 'A'
4 from dual
5 connect by rownum < 1000;
999 rows created.
SQL> select object_name
2 from t
3 where object_id >= 45001
4 and wner = 'U1';
OBJECT_NAME
----------------------------------------------
T_BIG
S1
T
Execution Plan
----------------------------------------------------------
Plan hash value: 1951931306
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 35 | 88 (2)| 00:00:02 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 35 | 88 (2)| 00:00:02 |
|* 2 | INDEX RANGE SCAN | IND_T_ID_OWNER | 1 | | 87 (2)| 00:00:02 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID">=45001 AND "OWNER"='U1' AND "OBJECT_ID" IS NOT NULL)
filter("OWNER"='U1')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
60 consistent gets
0 physical reads
0 redo size
583 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
SQL> select object_name
2 from t
3 where object_id >= 45001
4 and owner in ('U1', 'U2');
OBJECT_NAME
--------------------------------------------------
T_BIG
S1
T
Execution Plan
----------------------------------------------------------
Plan hash value: 1951931306
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 35 | 88 (2)| 00:00:02 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 1 | 35 | 88 (2)| 00:00:02 |
|* 2 | INDEX RANGE SCAN | IND_T_ID_OWNER | 1 | | 87 (2)| 00:00:02 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID">=45001 AND "OBJECT_ID" IS NOT NULL)
filter("OWNER"='U1' OR "OWNER"='U2')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
62 consistent gets
0 physical reads
0 redo size
583 bytes sent via SQL*Net to client
492 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
SQL> rollback;
Rollback complete.
前面的这些例子主要为了说明执行计划中ACCESS和FILTER的区别,以及对于SQL执行效率的影响。
回到文章的主题,一旦对于高选择性的前缀字段执行了范围扫描,后缀字段的限制条件将变成过滤条件。
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/4227/viewspace-680238/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/4227/viewspace-680238/