回忆
记得刚工作的第一年,在一个市级政府部门驻场,一开发的哥们凑过来问我
“兄弟,我那个页面查询数据有点慢,帮我整整呗”
“查的啥?SQL给我瞧瞧”
“常口的,就是根据身份证号码和姓名去查,有时候一个条件,有时候会有两个条件,咋整?”
“额,我先看看,完事叫你”
然后套路就来了,身份证和姓名分别建索引,这样任意一个条件都不慌,再加一个身份证和姓名的组合索引,这样两个一起查也不慌,索引建完通知他再试试
“整完了,你再试试”
“嘿,比刚才快多了,没问题了,晚上整几杯?”
“who 怕 who啊”
唉,该傻逼的时候,我从没犹豫过
组合索引
组合索引一般包含两个或两个以上的列,创建组合索引时,列的排序也有讲究,第一列称为前导列,如果有可能,前导列最好是选择性高的列,有助于提高组合索引查询效率,如果是单个条件查询,而这个条件列恰好是组合索引的前导列,数据库就有可能使用这个索引,如果条件列是组合索引中的非前导列,也有可能使用这个索引进行索引跳跃扫描,不过效率一般不是很高。
搞清楚这个之后,那个常口查询的就很简单了,姓名列上建个索引,再建(身份证号码,姓名)组合索引,因为身份证号码的选择性比姓名要高,所以身份证号码作为前导列,这样无论是那种查询都能满足,其实,只建一个组合索引也可以,不过索引跳跃扫描效率无法保证。
测试
建立测试表,创建组合索引
SQL> create table test as select * from dba_objects;
Table created.
SQL> create index idx_test_id_name on test(object_id,object_name);
Index created.
前导列查询
SQL> select * from test where object_id = 100;
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 3816852919
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 207 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 1 | 207 | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_TEST_ID_NAME | 1 | | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID"=100)
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
1611 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
非前导列查询
SQL> select * from test where object_name='TEST';
Elapsed: 00:00:00.02
Execution Plan
----------------------------------------------------------
Plan hash value: 1357081020
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 194 | 288 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| TEST | 2 | 194 | 288 (1)| 00:00:04 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_NAME"='TEST')
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
1031 consistent gets
0 physical reads
0 redo size
1611 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
这里没有使用索引跳跃扫描,优化器认为全表扫描效率更高,下面使用hint,让该查询使用跳跃扫描
SQL> select /*+ index_ss(test)*/ * from test where object_name='TEST';
Elapsed: 00:00:00.03
Execution Plan
----------------------------------------------------------
Plan hash value: 3896253852
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 194 | 72058 (1)| 00:14:25 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 2 | 194 | 72058 (1)| 00:14:25 |
|* 2 | INDEX SKIP SCAN | IDX_TEST_ID_NAME | 2 | | 72057 (1)| 00:14:25 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_NAME"='TEST')
filter("OBJECT_NAME"='TEST')
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
408 consistent gets
0 physical reads
0 redo size
1614 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
使用索引跳跃扫描后,逻辑读虽然减少许多,但是cpu耗费比全表扫描高太多。
另外,毫无疑问,组合索引的代价比单列索引更高,无论是空间上还是DML操作上,但是在以查询为主的业务上,who care?