本次试验讨论位图索引在即席查询中发挥的巨大作用。
即席查询(多维度报表查询:select * from t where col1=XXX and col2=XXX and col3=XXX...)
1.构造表,该表有性别,年龄范围,出生地等字段。投入约10万条的数据,为即席查询做准备。
SYS@ orcl>drop table t purge;
Table dropped.
SYS@ orcl>create table t (
2 name_id,
3 gender not null,
4 location not null,
5 age_group not null,
6 data
7 )
8 as
9 select rownum,
10 decode(ceil(dbms_random.value(0,2)),
11 1,'M',
12 2,'F')gender,
13 ceil(dbms_random.value(1,50)) location,
14 decode(ceil(dbms_random.value(0,3)),
15 1,'child',
16 2,'young',
17 3,'middle_age',
18 4,'old'),
19 rpad('*',20,'*')
20 from dual
21 connect by rownum<=100000;
Table created.
2.全表扫描的情况下的查询
SYS@ orcl>set linesize 1000
SYS@ orcl>set autotrace traceonly
SYS@ orcl>select *
2 from t
3 where gender='M'
4 and location in (1,10,30)
5 and age_group='child';
663 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 575 | 27025 | 138 (3)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| T | 575 | 27025 | 138 (3)| 00:00:02 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("GENDER"='M' AND ("LOCATION"=1 OR "LOCATION"=10 OR
"LOCATION"=30) AND "AGE_GROUP"='child')
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
653 consistent gets
0 physical reads
0 redo size
13755 bytes sent via SQL*Net to client
865 bytes received via SQL*Net from client
46 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
663 rows processed
3.建立三个列的联合索引,三个列都为高度重复的列。虽然建立了联合索引,但仍然走的是全表扫描。
SYS@ orcl>create index idx_union on t(gender,location,age_group);
Index created.
SYS@ orcl>select *
2 from t
3 where gender='M'
4 and location in (1,10,30)
5 and age_group='child';
663 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 575 | 27025 | 138 (3)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| T | 575 | 27025 | 138 (3)| 00:00:02 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("GENDER"='M' AND ("LOCATION"=1 OR "LOCATION"=10 OR
"LOCATION"=30) AND "AGE_GROUP"='child')
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
653 consistent gets
0 physical reads
0 redo size
13755 bytes sent via SQL*Net to client
865 bytes received via SQL*Net from client
46 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
663 rows processed
4.强制执行联合索引,终于明白为什么走全表扫描了。
SYS@ orcl>select /*+index(t,idx_union)*/*
2 from t
3 where gender='M'
4 and location in (1,10,30)
5 and age_group='child';
663 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 306189815
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 575 | 27025 | 38152 (1)| 00:07:38 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| T | 575 | 27025 | 38152 (1)| 00:07:38 |
|* 3 | INDEX RANGE SCAN | IDX_UNION | 49691 | | 151 (2)| 00:00:02 |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("GENDER"='M' AND ("LOCATION"=1 OR "LOCATION"=10 OR "LOCATION"=30)
AND "AGE_GROUP"='child')
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
564 consistent gets
0 physical reads
0 redo size
34169 bytes sent via SQL*Net to client
865 bytes received via SQL*Net from client
46 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
663 rows processed
5.本次的主角:位图索引。在gender,location,age_group三个字段分别建位图索引。
SYS@ orcl>create bitmap index gender_idx on t(gender);
Index created.
SYS@ orcl>create bitmap index location_idx on t(location);
Index created.
SYS@ orcl>create bitmap index age_group_idx on t(age_group);
Index created.
SYS@ orcl>select *
2 from t
3 where gender='M'
4 and location in (1,10,30)
5 and age_group='41 and over';
663 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 687389132
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 575 | 27025 | 9 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID | T | 575 | 27025 | 9 (0)| 00:00:01 |
| 2 | BITMAP CONVERSION TO ROWIDS | | | | | |
| 3 | BITMAP AND | | | | | |
|* 4 | BITMAP INDEX SINGLE VALUE | GENDER_IDX | | | | |
| 5 | BITMAP OR | | | | | |
|* 6 | BITMAP INDEX SINGLE VALUE| LOCATION_IDX | | | | |
|* 7 | BITMAP INDEX SINGLE VALUE| LOCATION_IDX | | | | |
|* 8 | BITMAP INDEX SINGLE VALUE| LOCATION_IDX | | | | |
|* 9 | BITMAP INDEX SINGLE VALUE | AGE_GROUP_IDX | | | | |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("GENDER"='M')
6 - access("LOCATION"=1)
7 - access("LOCATION"=10)
8 - access("LOCATION"=30)
9 - access("AGE_GROUP"='child')
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
426 consistent gets
0 physical reads
0 redo size
34169 bytes sent via SQL*Net to client
865 bytes received via SQL*Net from client
46 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
663 rows processed
总结:
全表扫描:花费138
组合索引:花费38152(TABLE ACCESS BY INDEX ROWID),花费151(INDEX RANGE SCAN)。即使走索引扫描也比全表扫描高。
位图索引:花费9(全表扫描是其15倍,组合索引是其4239倍。如果多个字段差别更加明显。)