原文地址:http://blog.csdn.net/yidian815/article/details/16891021
位图索引是oracle中非常重要的一种索引形式。本文通过总结有关位图索引的资料,尝试回答如下几个问题:
1:什么是位图索引?
2:位图索引适合什么场景,不适合什么场景?
3:位图索引的性能如何?
什么是位图索引?
位图索引,顾名思义,与“位”有关。大家都知道,计算机中的所有信息最终都是通过“位bit”来运算的, 二进制位运算在计算机中是非常高效的。每一个二进制位都可以取值0或者1,而取值的确切含义是由具体的上下文环境决定的。在oracle位图索引中,每一个二进制位代表了某一行中索引列的取值情况。例如,学生表中性别列的位图索引结构如下:
男:0101001101
女:1010110010
在上面的位图结构中,存储了10条学生记录的性别分布情况,以“男”性别为例,从左到右的第n个二进制位代表了第n条记录是否性别为男,如果二进制位为1,代表true即性别为男,0代表false即性别不为男。以此类推,从图中可以看出,第一条记录的性别为女,第二条记录的性别为男,...第九条记录的性别为女,第十条记录的性别为男。
大家都知道,在oracle中是根据rowid来定位记录的,因此,我们需要引入start rowid和end rowid,通过start rowid ,end rowid 和二进制位的偏移,我们就可以非常快速的计算出二进制位所代表的表记录rowid。位图索引的最终逻辑结构如下图:
位图索引适合什么场景,不适合什么场景?
- CREATE or replace FUNCTION ind_spc_test(rn NUMBER) RETURN NUMBER
- AS
- v_j NUMBER;
- v_dis NUMBER;
- v_bm_sp NUMBER;
- v_bt_sp NUMBER;
- BEGIN
- FOR i IN 1 .. 10LOOP
- EXECUTE immediate 'truncate table t_easy1';
- EXECUTE immediate 'truncate table t_easy2';
- SELECT floor(rn/(11-i)) INTO v_j FROM dual;
- FOR j IN 1 .. rn LOOP
- INSERT INTO t_easy1 VALUES (mod(j,v_j));
- INSERT INTO t_easy2 VALUES (mod(j,v_j));
- END LOOP;
- commit;
- select count(distinct id) into v_j from t_easy1;
- EXECUTE immediate 'analyze index i_easy1 COMPUTE STATISTICS';
- SELECT lEAF_BLOCKS INTO v_bt_sp FROM user_indexes where index_name='I_EASY1';
- EXECUTE immediate 'analyze index i_easy2 COMPUTE STATISTICS';
- SELECT LEAF_BLOCKS INTO v_bm_sp FROM user_indexes where index_name='I_EASY2';
- INSERT INTO bitmap_ind_space VALUES (v_j,v_bm_sp,v_bt_sp,rn );
- COMMIT;
- END LOOP;
- RETURN 0;
- END;
- SQL> select * from bitmap_ind_space order by 1;
- DISTINCT_VAL BITMAP_IND_BLKS BTREE_IND_BLKS ROW_NUM
- ------------ --------------- -------------- ----------
- 10000 139 300 100000
- 11111 79 335 100000
- 12500 89 285 100000
- 14285 103 220 100000
- 16666 120 257 100000
- 20000 146 310 100000
- 25000 183 293 100000
- 33333 246 262 100000
- 50000 371 296 100000
- 100000 408 200 100000
这里的测试比较简单,下面看看大师的实验结果:
位图索引的查询性能如何?
- SQL> create table emp_normal(empno number(10), ename varchar2(30), sal number(10));
- 表已创建。
- Begin
- For i in 1..1000000
- Loop
- Insert into emp_normal
- values(i, dbms_random.string('U',30), dbms_random.value(1000,7000));
- If mod(i, 10000) = 0 then
- Commit;
- End if;
- End loop;
- 10 End;
- 11 /
- PL/SQL 过程已成功完成。
- SQL> create table emp_random as select /* +append */ * from emp_normal order by dbms_random.random;
- SQL> create bitmap index bm_normal on emp_normal(empno);
- 索引已创建。
- SQL> analyze table emp_normal compute statistics for table for all indexes for all indexed columns;
- 表已分析。
- SQL> select index_name,clustering_factor from user_indexes;
- INDEX_NAME CLUSTERING_FACTOR
- ------------------------------ -----------------
- BM_NORMAL 1000000
- SQL> set autot traceonly
- SQL> select * from emp_normal where empno=&empno;
- 输入 empno 的值: 1000
- 原值 1: select * from emp_normal where empno=&empno
- 新值 1: select * from emp_normal where empno=1000
- 执行计划
- ----------------------------------------------------------
- Plan hash value: 1526426521
- -------------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- -------------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 34 | 3 (0)| 00:00:01 |
- | 1 | TABLE ACCESS BY INDEX ROWID | EMP_NORMAL | 1 | 34 | 3 (0)| 00:00:01 |
- | 2 | BITMAP CONVERSION TO ROWIDS| | | | | |
- |* 3 | BITMAP INDEX SINGLE VALUE | BM_NORMAL | | | | |
- -------------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 3 - access("EMPNO"=1000)
- 统计信息
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 5 consistent gets
- 0 physical reads
- 0 redo size
- 702 bytes sent via SQL*Net to client
- 520 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
- SQL> select * from emp_normal where empno=&empno;
- 输入 empno 的值: 2398
- 原值 1: select * from emp_normal where empno=&empno
- 新值 1: select * from emp_normal where empno=2398
- 执行计划
- ----------------------------------------------------------
- Plan hash value: 1526426521
- -------------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- -------------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 34 | 3 (0)| 00:00:01 |
- | 1 | TABLE ACCESS BY INDEX ROWID | EMP_NORMAL | 1 | 34 | 3 (0)| 00:00:01 |
- | 2 | BITMAP CONVERSION TO ROWIDS| | | | | |
- |* 3 | BITMAP INDEX SINGLE VALUE | BM_NORMAL | | | | |
- -------------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 3 - access("EMPNO"=2398)
- 统计信息
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 5 consistent gets
- 0 physical reads
- 0 redo size
- 703 bytes sent via SQL*Net to client
- 520 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
- SQL> select * from emp_normal where empno=&empno;
- 输入 empno 的值: 8545
- 原值 1: select * from emp_normal where empno=&empno
- 新值 1: select * from emp_normal where empno=8545
- 执行计划
- ----------------------------------------------------------
- Plan hash value: 1526426521
- -------------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- -------------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 34 | 3 (0)| 00:00:01 |
- | 1 | TABLE ACCESS BY INDEX ROWID | EMP_NORMAL | 1 | 34 | 3 (0)| 00:00:01 |
- | 2 | BITMAP CONVERSION TO ROWIDS| | | | | |
- |* 3 | BITMAP INDEX SINGLE VALUE | BM_NORMAL | | | | |
- -------------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 3 - access("EMPNO"=8545)
- 统计信息
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 5 consistent gets
- 0 physical reads
- 0 redo size
- 703 bytes sent via SQL*Net to client
- 520 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
- SQL> select * from emp_normal where empno=&empno;
- 输入 empno 的值: 128444
- 原值 1: select * from emp_normal where empno=&empno
- 新值 1: select * from emp_normal where empno=128444
- 执行计划
- ----------------------------------------------------------
- Plan hash value: 1526426521
- -------------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- -------------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 34 | 3 (0)| 00:00:01 |
- | 1 | TABLE ACCESS BY INDEX ROWID | EMP_NORMAL | 1 | 34 | 3 (0)| 00:00:01 |
- | 2 | BITMAP CONVERSION TO ROWIDS| | | | | |
- |* 3 | BITMAP INDEX SINGLE VALUE | BM_NORMAL | | | | |
- -------------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 3 - access("EMPNO"=128444)
- 统计信息
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 5 consistent gets
- 0 physical reads
- 0 redo size
- 704 bytes sent via SQL*Net to client
- 520 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
- SQL> drop index bm_normal;
- 索引已删除。
- SQL> create index bt_normal on emp_normal(empno);
- 索引已创建。
- SQL> analyze table emp_normal compute statistics for table for all indexes for all indexed columns;
- 表已分析。
- SQL> select index_name,clustering_factor from user_indexes;
- INDEX_NAME CLUSTERING_FACTOR
- ------------------------------ -----------------
- BT_NORMAL 6210
- SYS_IL0000076897C00002$$
- PK_EMP 1
- PK_DEPT 1
- SQL> set autot traceonly
- SQL> select * from emp_normal where empno=&empno;
- 输入 empno 的值: 1000
- 原值 1: select * from emp_normal where empno=&empno
- 新值 1: select * from emp_normal where empno=1000
- 执行计划
- ----------------------------------------------------------
- Plan hash value: 733975378
- ------------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- ------------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 34 | 4 (0)| 00:00:01 |
- | 1 | TABLE ACCESS BY INDEX ROWID| EMP_NORMAL | 1 | 34 | 4 (0)| 00:00:01 |
- |* 2 | INDEX RANGE SCAN | BT_NORMAL | 1 | | 3 (0)| 00:00:01 |
- ------------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 2 - access("EMPNO"=1000)
- 统计信息
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 5 consistent gets
- 0 physical reads
- 0 redo size
- 702 bytes sent via SQL*Net to client
- 520 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
- SQL> select * from emp_normal where empno=&empno;
- 输入 empno 的值: 128444
- 原值 1: select * from emp_normal where empno=&empno
- 新值 1: select * from emp_normal where empno=128444
- 执行计划
- ----------------------------------------------------------
- Plan hash value: 733975378
- ------------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- ------------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 34 | 4 (0)| 00:00:01 |
- | 1 | TABLE ACCESS BY INDEX ROWID| EMP_NORMAL | 1 | 34 | 4 (0)| 00:00:01 |
- |* 2 | INDEX RANGE SCAN | BT_NORMAL | 1 | | 3 (0)| 00:00:01 |
- ------------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 2 - access("EMPNO"=128444)
- 统计信息
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 5 consistent gets
- 0 physical reads
- 0 redo size
- 704 bytes sent via SQL*Net to client
- 520 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
- SQL> select * from emp_normal where empno=&empno;
- 输入 empno 的值: 2398
- 原值 1: select * from emp_normal where empno=&empno
- 新值 1: select * from emp_normal where empno=2398
- 执行计划
- ----------------------------------------------------------
- Plan hash value: 733975378
- ------------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- ------------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 34 | 4 (0)| 00:00:01 |
- | 1 | TABLE ACCESS BY INDEX ROWID| EMP_NORMAL | 1 | 34 | 4 (0)| 00:00:01 |
- |* 2 | INDEX RANGE SCAN | BT_NORMAL | 1 | | 3 (0)| 00:00:01 |
- ------------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 2 - access("EMPNO"=2398)
- 统计信息
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 5 consistent gets
- 0 physical reads
- 0 redo size
- 703 bytes sent via SQL*Net to client
- 520 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
BITMAP | EMPNO | B-TREE | ||
Consistent Reads | Physical Reads | Consistent Reads | Physical Reads | |
5 | 0 | 1000 | 5 | 0 |
5 | 0 | 2398 | 5 | 0 |
5 | 0 | 8545 | 5 | 0 |
5 | 0 | 98008 | 5 | 0 |
5 | 0 | 85342 | 5 | 0 |
5 | 0 | 128444 | 5 | 0 |
5 | 0 | 858 | 5 | 0 |
对emp_random表进行实验,得出的结果与之类似,这里不再獒述。从这里可以看出,在唯一列上的等值查询,位图索引与btree索引的效率相当。
下面,我们在针对范围查询来进行测试。
- SQL> create bitmap index bm_random on emp_random(empno);
- 索引已创建。
- SQL> analyze table emp_random compute statistics for table for all indexes for all columns;
- 表已分析。
- SQL> select index_name,clustering_factor from user_indexes;
- INDEX_NAME CLUSTERING_FACTOR
- ------------------------------ -----------------
- BM_RANDOM 1000000
- SQL> set autot traceonly
- SQL> select * from emp_random where empno between &range1 and &range2;
- 输入 range1 的值: 1
- 输入 range2 的值: 2300
- 原值 1: select * from emp_random where empno between &range1 and &range2
- 新值 1: select * from emp_random where empno between 1 and 2300
- 已选择2300行。
- 执行计划
- ----------------------------------------------------------
- Plan hash value: 811843605
- -------------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- -------------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 2299 | 85063 | 418 (1)| 00:00:06 |
- | 1 | TABLE ACCESS BY INDEX ROWID | EMP_RANDOM | 2299 | 85063 | 418 (1)| 00:00:06 |
- | 2 | BITMAP CONVERSION TO ROWIDS| | | | | |
- |* 3 | BITMAP INDEX RANGE SCAN | BM_RANDOM | | | | |
- -------------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 3 - access("EMPNO">=1 AND "EMPNO"<=2300)
- 统计信息
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 2463 consistent gets
- 0 physical reads
- 0 redo size
- 130225 bytes sent via SQL*Net to client
- 2203 bytes received via SQL*Net from client
- 155 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 2300 rows processed
- SQL> select * from emp_random where empno between &range1 and &range2;
- 输入 range1 的值: 8
- 输入 range2 的值: 1980
- 原值 1: select * from emp_random where empno between &range1 and &range2
- 新值 1: select * from emp_random where empno between 8 and 1980
- 已选择1973行。
- 执行计划
- ----------------------------------------------------------
- Plan hash value: 811843605
- -------------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- -------------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1972 | 72964 | 366 (0)| 00:00:05 |
- | 1 | TABLE ACCESS BY INDEX ROWID | EMP_RANDOM | 1972 | 72964 | 366 (0)| 00:00:05 |
- | 2 | BITMAP CONVERSION TO ROWIDS| | | | | |
- |* 3 | BITMAP INDEX RANGE SCAN | BM_RANDOM | | | | |
- -------------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 3 - access("EMPNO">=8 AND "EMPNO"<=1980)
- 统计信息
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 2114 consistent gets
- 0 physical reads
- 0 redo size
- 111758 bytes sent via SQL*Net to client
- 1961 bytes received via SQL*Net from client
- 133 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1973 rows processed
- SQL> select * from emp_random where empno between &range1 and &range2;
- 输入 range1 的值: 28888
- 输入 range2 的值: 31850
- 原值 1: select * from emp_random where empno between &range1 and &range2
- 新值 1: select * from emp_random where empno between 28888 and 31850
- 已选择2963行。
- 执行计划
- ----------------------------------------------------------
- Plan hash value: 811843605
- -------------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- -------------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 2962 | 107K| 513 (0)| 00:00:07 |
- | 1 | TABLE ACCESS BY INDEX ROWID | EMP_RANDOM | 2962 | 107K| 513 (0)| 00:00:07 |
- | 2 | BITMAP CONVERSION TO ROWIDS| | | | | |
- |* 3 | BITMAP INDEX RANGE SCAN | BM_RANDOM | | | | |
- -------------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 3 - access("EMPNO">=28888 AND "EMPNO"<=31850)
- 统计信息
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 3172 consistent gets
- 0 physical reads
- 0 redo size
- 170625 bytes sent via SQL*Net to client
- 2687 bytes received via SQL*Net from client
- 199 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 2963 rows processed
- SQL> drop index bm_random;
- 索引已删除。
- SQL> create index bt_random on emp_random(empno);
- 索引已创建。
- SQL> analyze table emp_random compute statistics for table for all indexes for all columns;
- 表已分析。
- SQL> set autot off
- SQL> select index_name,clustering_factor from user_indexes;
- INDEX_NAME CLUSTERING_FACTOR
- ------------------------------ -----------------
- BT_RANDOM 999834
- SQL> set autot traceonly
- SQL> select * from emp_random where empno between &range1 and &range2;
- 输入 range1 的值: 1
- 输入 range2 的值: 2300
- 原值 1: select * from emp_random where empno between &range1 and &range2
- 新值 1: select * from emp_random where empno between 1 and 2300
- 已选择2300行。
- 执行计划
- ----------------------------------------------------------
- Plan hash value: 731629521
- --------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- --------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 2299 | 85063 | 1735 (1)| 00:00:21 |
- |* 1 | TABLE ACCESS FULL| EMP_RANDOM | 2299 | 85063 | 1735 (1)| 00:00:21 |
- --------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 1 - filter("EMPNO"<=2300 AND "EMPNO">=1)
- 统计信息
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 6410 consistent gets
- 0 physical reads
- 0 redo size
- 121081 bytes sent via SQL*Net to client
- 2203 bytes received via SQL*Net from client
- 155 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 2300 rows processed
- SQL> select * from emp_random where empno between &range1 and &range2;
- 输入 range1 的值: 8
- 输入 range2 的值: 1980
- 原值 1: select * from emp_random where empno between &range1 and &range2
- 新值 1: select * from emp_random where empno between 8 and 1980
- 已选择1973行。
- 执行计划
- ----------------------------------------------------------
- Plan hash value: 731629521
- --------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- --------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1972 | 72964 | 1735 (1)| 00:00:21 |
- |* 1 | TABLE ACCESS FULL| EMP_RANDOM | 1972 | 72964 | 1735 (1)| 00:00:21 |
- --------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 1 - filter("EMPNO"<=1980 AND "EMPNO">=8)
- 统计信息
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 6388 consistent gets
- 0 physical reads
- 0 redo size
- 103922 bytes sent via SQL*Net to client
- 1961 bytes received via SQL*Net from client
- 133 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1973 rows processed
归纳如下,
BITMAP | EMPNO (Range) | B-TREE | ||
Consistent Reads | Physical Reads | Consistent Reads | Physical Reads | |
2463 | 0 | 1-2300 | 6410 | 0 |
2114 | 0 | 8-1980 | 6388 | 0 |
2572 | 0 | 1850-4250 | 6418 | 0 |
3172 | 0 | 28888-31850 | 6456 | 0 |
2762 | 0 | 82900-85478 | 6431 | 0 |
7254 | 0 | 984888-1000000 | 7254 | 0 |
从这里可以看出,位图索引要优于btree索引,这是因为btree索引的cluster factor 较大,从而优化器选择了全表扫描。即便在emp_normal 表下,即clustering factor较小时,位图索引btree索引相当的。因此在distinct cardinality 较大的情况下,范围扫描的效率位图索引也是不逊色与btree索引。
总结如下:
位图索引的查询性能经常是优于btree索引的,即便在distinct cardinality较大的情况下
位图索引不适合与dml频繁的环境
位图索引适用于DSS系统
位图索引可以进行逻辑运算,多个索引和同时在查询语句中发挥作用,这是一个非常重要的地方