因为B*Tree索引不存储全为NULL的列值,比如(A,B)列的索引,那么只有A与B列同时为NULL才不存储到索引中。那么对这个内容可以进行一个扩展,当我们只关心表中某些枚举值,且这些枚举值行数特别少,比如status状态等,那么可以用下面的技巧。
比如有个表,列status只有有效和无效两种状态,对于无效的占很少,大约最多只有10%,而有效的往往很多,占90%,我们经常查询无效的,有效的查询的机会很少,就算查询也走全表扫描,为了节省维护索引的开销,降低索引存储空间,可以对关心的较少的记录做索引,不关心的较多记录不存储到索引中, 那么可以使用基于函数的索引(对于NULL占少数,非NULL占多数,只关心NULL的也可以这样做),如下:
T表只有两个值,status='VALID'与status='INVALID'。省略部分执行过程:
dingjun123@ORADB> SELECT status,COUNT(*) FROM t GROUP BY status;
STATUS COUNT(*)
------- ----------
INVALID 17056
VALID 1186336
2 rows selected.
INVALID的比例很少,我们只关心INVALID,就算建立普通索引,查找VALID的时候,还是要全表扫描。
建立函数索引将非关心的值置NULL,这样就不存储在索引中了。
CREATE INDEX idx_t ON t(DECODE(status,'INVALID',0,NULL));
CREATE INDEX idx1_t ON t(status);
--分析表和索引
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(ownname => USER,tabname => 'T',method_opt => 'for columns status size 5',estimate_percent => 100,cascade => TRUE);
END;
/
看看索引属性:
dingjun123@ORADB> SELECT index_name, blevel,leaf_blocks FROM User_Indexes WHERE table_name='T';
INDEX_NAME BLEVEL LEAF_BLOCKS
------------------------------ ---------- -----------
IDX_T 1 31
IDX1_T 2 2857
建立此函数索引blevel=1,而普通索引blevel=2,函数索引leaf_blocks=31,普通索引leaf_blocks=2857;让索引小很多,高度也变小了。
看看查询,区别不大,只会更好:
dingjun123@ORADB> set autotrace traceonly
dingjun123@ORADB> SELECT * FROM t WHERE status='INVALID';
17056 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2071967826
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17056 | 1632K| 325 (1)| 00:00:04 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 17056 | 1632K| 325 (1)| 00:00:04 |
|* 2 | INDEX RANGE SCAN | IDX1_T | 17056 | | 43 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("STATUS"='INVALID')
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4936 consistent gets
10 physical reads
0 redo size
1861571 bytes sent via SQL*Net to client
12922 bytes received via SQL*Net from client
1139 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
17056 rows processed
dingjun123@ORADB> SELECT * FROM t WHERE decode(status,'INVALID',0,NULL)=0;
17056 rows selected.
Elapsed: 00:00:00.58
Execution Plan
----------------------------------------------------------
Plan hash value: 1594971208
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17056 | 1632K| 2821 (1)| 00:00:34 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 17056 | 1632K| 2821 (1)| 00:00:34 |
|* 2 | INDEX RANGE SCAN | IDX_T | 17056 | | 31 (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(DECODE("STATUS",'INVALID',0,NULL)=0)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4919 consistent gets
0 physical reads
0 redo size
1013943 bytes sent via SQL*Net to client
12922 bytes received via SQL*Net from client
1139 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
17056 rows processed
充分利用B*Tree所以不存储全为NULL的列值,可以对关心的枚举值,而且枚举值行数很少,查询要走索引,建立函数索引,减少索引的管理与存储开销,并且提高索引的使用效率,是很有效的。