广义上分为3类:B-树索引,位图索引和索引表组织
位图索引实现适用与不经常进行更新,插入,删除的列 。更适合于具有较少唯一值的静态列,典型的例子就是在数据仓库应用中,在一张包含人口统计信息的表中性别列是一个很好的例子。
B-树索引在各类应用中得到了广泛的使用,分区索引,压缩索引,基于函数的索引都是实现为B-树索引。
B-树索引适合于具有较低选择度的列。如果列的选择度不够低,索引扫描就会较慢。并且,选择度不够
的列将会从叶子快中取出大量的行编号从而导致对表进行过多的但数据块访问。
什么时候使用索引
一般来说,如果sql语句中声明的谓语是选择性的,及应用所声明的谓语将会获得很少的数据行,那么基于索引的访问路径性能将会更好。典型的基于索引的访问路径通常包含下面3步:
1.遍历索引树并在将sql语句中的谓语应用到索引列后面收集叶子块的行编号。
2.使用行编号从表数据块中获取行数据。
3.在获取的数据行上应用其余的谓语来得出最终结果集。
如果在第一步中返回了大量的行编号,第二部访问表数据块的代价就会更高。对于来自索引叶子快的每一个行编号,都需要访问表数据块。并且这可能会导致多次物理IO从而引起性能问题。同时,从物理上来说一次只
能访问一个表数据块,进一步放大了性能问题。
例如下面的SQL:
create index sh.sales_fact_c2 on sh.sales_fact(country);
select /*+ index (s sales_fact_c2) */ count(distinct(region)) from sh.sales_fact s where country='Spain';
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 17 | 977 (1)| 00:00:12 |
| 1 | SORT AGGREGATE | | 1 | 17 | | |
| 2 | VIEW | VW_DAG_0 | 7 | 119 | 977 (1)| 00:00:12 |
| 3 | HASH GROUP BY | | 7 | 175 | 977 (1)| 00:00:12 |
| 4 | TABLE ACCESS BY INDEX ROWID| SALES_FACT | 8382 | 204K| 975 (0)| 00:00:12 |
|* 5 | INDEX RANGE SCAN | SALES_FACT_C2 | 8382 | | 27 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("COUNTRY"='Spain')
select /*+ index (s sales_fact_c2) */ count(*) from sh.sales_fact s where country='Spain';
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 17 | 310 (1)| 00:00:04 |
| 1 | SORT AGGREGATE | | 1 | 17 | | |
| 2 | VIEW | VW_DAG_0 | 7 | 119 | 310 (1)| 00:00:04 |
| 3 | HASH GROUP BY | | 7 | 175 | 310 (1)| 00:00:04 |
|* 4 | TABLE ACCESS FULL| SALES_FACT | 8382 | 204K| 309 (1)| 00:00:04 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - filter("COUNTRY"='Spain')
只使用了一个谓语country = ‘Spain’ 来访问sales表的sql语句,第五步返回的数据行数估计为8382行。因此,
预计要从执行步骤中8382个行编号并且表数据块必须至少访问8382次以取出数据记录。如果数据块还没有在缓冲区缓
存中,其中一些表数据块的访问就可能会导致物理IO,因此对于这个特定的例子,基于索引的访问路径性能可能更差。
空值问题
在sql语句中经常会声明IS NULL 谓语,空值不存储在某个单独列的索引中,因此谓语IS NULL 将
不会使用单列索引。但空值是存储在多列索引中的。使用另一个虚拟列来创建多列索引,就可以在is null子句中启动索引。
在N1上建立了一个单列索引T1_n1。优化器没有为具有 n1 is null 谓语的select 语句选择索引访问路径。另一个索引t1_n10建立在表达式(n1,0)上,优化器选择了使用这个索引的访问路径,因为空值存储在这个多列索引中。在索引中加入一个虚拟0值,索引的大小仍然保持较小。
drop table t1 purge;
create table t1 (n1 number, n2 varchar2(100));
insert into t1 select object_id,object_name from dba_objects where rownum <101;
commit;
create index t1_n1 on t1(n1);
select * from t1 where n1 is null;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 65 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 1 | 65 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("N1" IS NULL)
create index t1_n10 on t1(n1,0);
select * from t1 where n1 is null;
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 65 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 65 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_N10 | 5 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("N1" IS NULL)
update t1 set n1='' where n1 between 2 and 100;
commit;
select * from t1 where n1 is null;
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11 | 715 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID | T1 | 11 | 715 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_N10 | 5 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("N1" IS NULL)
drop index T1_N10;
select * from t1 where n1 is null;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11 | 715 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| T1 | 11 | 715 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("N1" IS NULL)
位图索引
位图索引的组织结构和实现方式与B-树索引不同,使用位图来表示列值的行编号。位图索引不适合需要进行大量更新的列或具有较多DML操作的表。位图索引适合大多数对具有较少唯一值的列进行只读运算的数据仓库表。
在country和region上建了两个新的位图索引。select 语句声明了country和region列上的谓语。执行计划显示出了三个主要运算:应用country=‘Spain’ 从位图索引sales_fact_part_bm1中取出了位图,应用谓语region=‘Western Europe’ 从位图索引sales_fact_part_bm2中取出了位图,然后将这两个位图使用BITMAP AND运算来进行计算以得出最终的位图。
这个合成的位图被转化为行编号并使用这些行编号来访问数据行。
select * from dba_indexes where owner='SH';
create table sh.sales_fact_part as select * from sh.sales_fact;
create bitmap index sh.sales_fact_part_bm1 on sh.sales_fact_part (country); --local分区索引
create bitmap index sh.sales_fact_part_bm2 on sh.sales_fact_part (region); --local分区索引
select * from sh.sales_fact_part
where country='Spain' and region ='Western Europe';
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9228 | 1063K| 7 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID | SALES_FACT_PART | 9228 | 1063K| 7 (0)| 00:00:01 |
| 2 | BITMAP CONVERSION TO ROWIDS| | | | | |
| 3 | BITMAP AND | | | | | |
|* 4 | BITMAP INDEX SINGLE VALUE| SALES_FACT_PART_BM1 | | | | |
|* 5 | BITMAP INDEX SINGLE VALUE| SALES_FACT_PART_BM2 | | | | |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("COUNTRY"='Spain')
5 - access("REGION"='Western Europe')
drop index sh.sales_fact_part_bm1;
drop index sh.sales_fact_part_bm2;
select * from sh.sales_fact_part
where country='Spain' and region ='Western Europe';
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9228 | 1063K| 309 (1) | 00:00:04 |
|* 1 | TABLE ACCESS FULL| SALES_FACT_PART | 9228 | 1063K| 309 (1) | 00:00:04 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("COUNTRY"='Spain' AND "REGION"='Western Europe')
如果索引建立在需要通过DML活动进行大量修改的列上,位图索引可能会引起严重的锁定问题。更新一个具有位图索引的列必须同时更新位图,而位图通常覆盖一组数据行。因此,对某一行进行更新可能会锁定位图中的一组数据库行。位图索引主要
在数据仓库应用中使用,在OLTP应用中的使用很有限。
索引组织表
在常规的表中,每一行都有一个行编号。一旦在表中建立了一行数据,他们就不在移动(可能会有行连接或者行迁移,但是每个行的头部不会移动)。不同的是,IOT(索引组织表Index Organized Table)数据行存储的索引结构自身中。因此,数据行可能由于DML运算而迁移到不同的叶子块中,从而引起索引叶子快结构的分裂与 合并。简单来说,IOT中的数据行没有物理行编号,而位于堆表中的数据行都会有一个固定的行编号。
IOT适合于具有下面特点的表:
1.数据行长度较短的表。数据列较少并且很短的表适合与IOT。如果数据行长度更长,索引结构就会过大,导致比堆表使用更多的资源。
2.大多使用主键列进行访问的表。尽管可以在IOT上建立次级索引,如果主键列较长则次级索引也可能会耗占大量资源。
–创建索引组织表
drop table sh.sales_iot;
create table SH.SALES_IOT
(prod_id NUMBER not null,
cust_id NUMBER not null,
time_id DATE not null,
channel_id NUMBER not null,
promo_id NUMBER not null,
quantity_sold NUMBER(10,2) not null,
amount_sold NUMBER(10,2) not null,
primary key (PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID))
organization index;
insert into sh.sales_iot select * from sh.sales;
commit;
select quantity_sold,amount_sold from sh.sales_iot i where
i.prod_id=13 and i.cust_id=2 and i.channel_id=3 and i.prod_id=999;
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 65 | 0 (0)| |
|* 1 | FILTER | | | | | |
|* 2 | INDEX RANGE SCAN| SYS_IOT_TOP_77433 | 1 | 65 | 2 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(NULL IS NOT NULL)
2 - access("I"."PROD_ID"=13 AND "I"."CUST_ID"=2 AND "I"."CHANNEL_ID"=3)
filter("I"."CHANNEL_ID"=3)
--索引组织表的rowid是*开头
select a.*,a.rowid from sh.sales_iot a ;
--IOT中的次级索引
drop index sh.sales_iot_sec;
create index sh.sales_iot_sec on
sh.sales_iot(channel_id,time_id,promo_id,cust_id);
select si.quantity_sold,si.amount_sold,si.time_id from sh.sales_iot si
where si.channel_id=3 and si.promo_id=999 and si.cust_id=12345
and si.time_id='26-2月 -98';
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 74 | 2 (0) | 00:00:01 |
|* 1 | INDEX UNIQUE SCAN| SYS_IOT_TOP_77433 | 1 | 74 | 2 (0) | 00:00:01 |
|* 2 | INDEX RANGE SCAN| SALES_IOT_SEC | 1 | | 2 (0) | 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("SI"."CHANNEL_ID"=3 AND "SI"."TIME_ID"='26-2月 -98' AND
"SI"."PROMO_ID"=999 AND "SI"."CUST_ID"=12345)
2 - access("SI"."CHANNEL_ID"=3 AND "SI"."TIME_ID"='26-2月 -98' AND
"SI"."PROMO_ID"=999 AND "SI"."CUST_ID"=12345)
索引组织表是一种能够有效减少数据行较短且需要进行大量DML和select活动的表中额外索引的特殊结构。但如果IOT的主键列较长,在其中加入次级索引可能会增大索引大小、重做却大小以及撤销区大小。
分区索引
如果局部索引包含分区健列并且如果sql语句声明了分区键列上的谓语,执行计划就仅需要访问一个或很少的索引分区。这个概念通常称为分区消除。如果执行假话在最少的分区中进行搜索,性能就会得到提升。在下面的sql中分区建立 了一张分区表sales_fact_part。在列product和year上创建了分区索引sales_fact_part_n1.
首先,select语句只声明了product列上的谓语而没有声明任何分区键列上的谓语。在这个例子中,所有5个索引分区都必须使用谓语product =‘Xtend Memory’来进行访问。执行计划中的pstrat和pstop列表命了执行这个sql
语句,访问了所有分区。
drop table sh.sales_fact_part;
create table sh.sales_fact_part
partition by range(year)
(partition p_1997 values less than (1998),
partition p_1998 values less than (1999),
partition p_1999 values less than (2000),
partition p_2000 values less than (2001),
partition p_max values less than (maxvalue)
)as
select * from sh.sales_fact;
select * from (select * from sh.sales_fact_part where product= 'Xtend Memory')
where rownum <21;
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20 | 2360 | 7 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE ALL| | 1298 | 149K| 7 (0)| 00:00:01 | 1 | 5 |
|* 3 | TABLE ACCESS FULL | SALES_FACT_PART | 1298 | 149K| 7 (0)| 00:00:01 | 1 | 5 |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<21)
3 - filter("PRODUCT"='Xtend Memory')
create index sh.sales_fact_part_n1 on sh.sales_fact_part(product,year) local;
select * from (select * from sh.sales_fact_part where product= 'Xtend Memory')
where rownum <21;
--------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20 | 2360 | 14 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE ALL | | 1895 | 218K| 14 (0)| 00:00:01 | 1 | 5 |
| 3 | TABLE ACCESS BY LOCAL INDEX ROWID| SALES_FACT_PART | 1895 | 218K| 14 (0)| 00:00:01 | 1 | 5 |
|* 4 | INDEX RANGE SCAN | SALES_FACT_PART_N1 | | | 6 (0)| 00:00:01 | 1 | 5 |
--------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<21)
4 - access("PRODUCT"='Xtend Memory')
select * from (select * from sh.sales_fact_part where product= 'Xtend Memory' and year=1998)
where rownum <21;
--------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20 | 2360 | 7 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE SINGLE | | 453 | 53454 | 7 (0)| 00:00:01 | 2 | 2 |
| 3 | TABLE ACCESS BY LOCAL INDEX ROWID| SALES_FACT_PART | 453 | 53454 | 7 (0)| 00:00:01 | 2 | 2 |
|* 4 | INDEX RANGE SCAN | SALES_FACT_PART_N1 | | | 1 (0)| 00:00:01 | 2 | 2 |
--------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<21)
4 - access("PRODUCT"='Xtend Memory' AND "YEAR"=1998)
drop index sh.sales_fact_part_n1;
-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20 | 2360 | 5 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE SINGLE| | 413 | 48734 | 5 (0)| 00:00:01 | 2 | 2 |
|* 3 | TABLE ACCESS FULL | SALES_FACT_PART | 413 | 48734 | 5 (0)| 00:00:01 | 2 | 2 |
-----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<21)
3 - filter("PRODUCT"='Xtend Memory' AND "YEAR"=1998)
全局索引
全局索引通过关键字GLOBAL来创建。在全局索引中,索引的分区边界与表的分区边界不一定要匹配,并且表和索引的分区键也可以不一样。 下面sql在year列上创建了一个全局索引sles_fact_part_n1。尽管分区列是一样的,表和索引的额分区边界是不同的。接下来的select语句 声明了谓语year=1998来访问表,执行计划显示访问了索引分区1和表分区2。在表和索引级上都进行了分区裁剪。
create index sh.sales_fact_part_n1 on sh.sales_fact_part(year)
global partition by range (year)(
partition p_1998 values less than (1999),
partition p_2000 values less than (2001),
partition p_max values less than (maxvalue)
)
select * from (select * from sh.sales_fact_part where product= 'Xtend Memory' and year=1998)
where rownum <21;
---------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20 | 1340 | 16 (0)| 00:00:01 | | |
|* 1 | COUNT STOPKEY | | | | | | | |
| 2 | PARTITION RANGE SINGLE | | 21 | 1407 | 16 (0)| 00:00:01 | 1 | 1 |
|* 3 | TABLE ACCESS BY GLOBAL INDEX ROWID | SALES_FACT_PART | 21 | 1407 | 16 (0)| 00:00:01 | 2 | 2 |
|* 4 | INDEX RANGE SCAN | SALES_FACT_PART_N1 | | | 4 (0)| 00:00:01 | 1 | 1 |
---------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<21)
3 - filter("PRODUCT"='Xtend Memory')
4 - access("YEAR"=1998)
对全局索引的任何维护都将需要获得对表的较高层级的锁,从而降低了应用的可用性。相反,对于全局部索引的维护可以只在分区级上完成,只会影响相应的表分区。这个例子中,重建索引sales_fact_part_n1将会需要排他模式的表级锁,导致应用停机。
散列分区与范围分区在散列分区方案中,分区键列的值使用散列算法进行散列化来确定存储数据行的分区。这种类型的分区方案适合于分区列使用人造键值填充的情况,例如分区列又顺序生成的值填充。如果劣质的分布是均匀的,那么每个分区将会存储几乎相等数目的数据行。
散列分区方案还有几个额外的有点。范围分区方案有一些管理开支,因为在将来要想放入新的数据行就需要定期加入新的分区。例如,如果分区键是order_date列,那么就必须增加新分区(或者具有最大值的分区进行分裂)来放入将来日期的数据。而在散列分区方案中,这个开支就避免了,因为数据行在使用散列算法进行分区的各个分区上是均匀分布的。如果劣质是均匀分布的则所有分区将保存近似相等数量的数据行,也就没有理由定期增加更多的分区了。
散列分区方案
drop sequence sh.sfseq;
create sequence sh.sfseq cache 200;
drop table sh.sales_fact_part;
create table sh.sales_fact_part
partition by hash (id)
partitions 32
as select sh.sfseq.nextval id , f.* from sh.sales_fact f;
select * from sh.sales_fact_part a where a.id = 1000;
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 131 | 13 (0)| 00:00:01 | | |
| 1 | PARTITION HASH SINGLE| | 1 | 131 | 13 (0)| 00:00:01 | 25 | 25 |
|* 2 | TABLE ACCESS FULL | SALES_FACT_PART | 1 | 131 | 13 (0)| 00:00:01 | 25 | 25 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"."ID"=1000)
create unique index sh.sales_fact_part_n1 on sh.sales_fact_part(id) local;
select * from sh.sales_fact_part a where a.id = 1000;
-------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 131 | 2 (0)| 00:00:01 | | |
| 1 | PARTITION HASH SINGLE | | 1 | 131 | 2 (0)| 00:00:01 | 25 | 25 |
| 2 | TABLE ACCESS BY LOCAL INDEX ROWID| SALES_FACT_PART | 1 | 131 | 2 (0)| 00:00:01 | 25 | 25 |
|* 3 | INDEX UNIQUE SCAN | SALES_FACT_PART_N1 | 1 | | 1 (0)| 00:00:01 | 25 | 25 |
-------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("A"."ID"=1000)
压缩索引
压缩索引是常规B-树索引的变体。这种类型的索引更适合引导列中具有重复值的列。通过将引导列中的重复值在索引叶子块进保存一次来实现压缩。数据行区的指针指向这些前置行,避免在数据行区显示存储这些重复值。如果列具有很多重复值,与常规的索引相比压缩索引更可能小很多。在处理压缩索引时CPU使用率可能会略有提升,这可以和安全地忽略。
create index <index name> on <schema.table_name>
(col1 [,col2....coln])
compress n storage-parameter-clause;
使用compress N 语法 创建压缩索引时,可以声明压缩的前导列数目。例如,在一个3列索引中对两个引导列进行压缩,可以声明子句compress 2 。前两列中的重复值只在前置区保存一次。你只能对引导列进行压缩。例如,你不能压缩列1和列3.
在下面的sql中,通过声明compress 2 压缩子句对两个引导列进行压缩,在列product,year和week上创建了压缩索引sales_fact_c1。在这个例子中,product列和year列的重复值因为声明了compress 2 在叶子块中只保存了一次。由于这两列的重复度较高,压缩这两个引导列后,索引尺寸从6MB(常规索引)降低到了2MB(压缩索引)。
select * from (select product,year,week,sale from sh.sales_fact
order by product,year,week)
where rownum < 21;
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20 | 1320 | | 1431 (1)| 00:00:18 |
|* 1 | COUNT STOPKEY | | | | | | |
| 2 | VIEW | | 111K| 7175K| | 1431 (1)| 00:00:18 |
|* 3 | SORT ORDER BY STOPKEY| | 111K| 4240K| 5688K| 1431 (1)| 00:00:18 |
| 4 | TABLE ACCESS FULL | SALES_FACT | 111K| 4240K| | 309 (1)| 00:00:04 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<21)
3 - filter(ROWNUM<21)
create index sh.sales_fact_c1 on sh.sales_fact(product,year,week);
select * from (select product,year,week,sale from sh.sales_fact
order by product,year,week)
where rownum < 21;
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20 | 1320 | 21 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 20 | 1320 | 21 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| SALES_FACT | 111K| 4240K| 21 (0)| 00:00:01 |
| 4 | INDEX FULL SCAN | SALES_FACT_C1 | 20 | | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<21)
select 'Compressed index size(MB): '|| trunc (bytes/1024/1024,2)
from sh.user_segments where segment_name='SALES_FACT_C1';
--Compressed index size(MB): 6
drop index sh.sales_fact_c1;
create index sh.sales_fact_c1 on sh.sales_fact(product,year,week) compress 2;
select * from (select product,year,week,sale from sh.sales_fact
order by product,year,week)
where rownum < 21;
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20 | 1320 | 21 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 20 | 1320 | 21 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID | SALES_FACT | 111K| 4240K| 21 (0)| 00:00:01 |
| 4 | INDEX FULL SCAN | SALES_FACT_C1 | 20 | | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<21)
select 'Compressed index size(MB): '|| trunc (bytes/1024/1024,2)
from sh.user_segments where segment_name='SALES_FACT_C1';
--Compressed index size(MB): 2
数据自身得特点对压缩比骑着非常重要得作用。如果列值重复得次数越高,则索引压缩就能得到越多得益处。如果没有重复数据,则压缩索引可能比常规索引还要大。因此,压缩索引适合于引导列具有较少唯一值得索引。dba_indexes/user_indexes视图中得compression和prefix_length列显示了索引得压缩属性。
选择对多少列进行压缩取决于列值得分布。为了确定用于进行压缩得最佳列数,可以使用analyze index/validate structure语句。在下面sql中,使用validate structure子句对未压缩得索引sales_fact_c1进行分析。这个分析语句对index_ststs视图进行填充。index_stats.opt_cmpr_count列给出了最优得压缩列数。
对于这个索引来说为2。index_stats.Cmpr_pctsave列显示了对opt_cmpr_count这个列进行压缩锁节约得索引大小。在这个例子中,将会节约67%得索引空间。因此,使用compress 2 子句压缩得索引大小将为常规未压缩索引大小得33%。这个估算值得值为1.98MB。很接近实际得索引大小。
最优压缩列数
drop index sh.sales_fact_c1;
create index sh.sales_fact_c1 on sh.sales_fact(product,year,week);
analyze index sales_fact_c1 validate structure;
select opt_cmpr_count,opt_cmpr_pctsave from index_stats where name = 'SALES_FACT_C1';
--基于函数的索引
drop index sh.sales_fact_part_fbi1;
select * from sh.sales_fact_part where to_char(id)='1000';
------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 131 | 371 (1)| 00:00:05 | | |
| 1 | PARTITION HASH ALL| | 1 | 131 | 371 (1)| 00:00:05 | 1 | 32 |
|* 2 | TABLE ACCESS FULL| SALES_FACT_PART | 1 | 131 | 371 (1)| 00:00:05 | 1 | 32 |
------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_CHAR("ID")='1000')
create index sh.sales_fact_part_fbi1 on sh.sales_fact_part(to_char(id));
select * from sh.sales_fact_part where to_char(id)='1000';
---------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1470 | 219K| 368 (0)| 00:00:05 | | |
| 1 | TABLE ACCESS BY GLOBAL INDEX ROWID| SALES_FACT_PART | 1470 | 219K| 368 (0)| 00:00:05 | ROWID | ROWID |
|* 2 | INDEX RANGE SCAN | SALES_FACT_PART_FBI1 | 588 | | 1 (0)| 00:00:01 | | |
---------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(TO_CHAR("ID")='1000')
--虚拟列与基于函数的索引
select data_default,hidden_column,virtual_column from dba_tab_cols
where table_name='SALES_FACT_PART' and virtual_column='YES';
select b.index_name,b.column_name from dba_ind_columns b
where index_name='SALES_FACT_PART_FBI1';
在增加了基于函数的索引后收集表的统计信息很重要的。如果不收集,新的虚拟列就没有统计信息,这有可能导致性能异常。
execute dbms_stats.gather_table_stats(ownname => user,
table_name =>'SALES_FACT_PART',
cascade =>true);
基于函数的索引也可以显示使用虚拟列实现。在这个虚拟列上也可以加上索引。这种方法额外的好处就是你还可以使用虚拟列作为分区键来应用分区方案。在下面sql中,使用virtual关键字在表中加入了一个新的虚拟列id_char。
然后在id_chhar列上建立了全局分区索引。select语句的执行计划显示表使用新建的索引来访问,并且谓语to_char(id)='1000’被重写为谓语id_char='1000’以使用虚拟列。
drop index sh.sales_fact_part_fbi1;
alter table sh.sales_fact_part add (id_char varchar2(40)
generated always as (to_char(id)) virtual);
drop index sh.sales_fact_part_c1;
create index sh.sales_fact_part_c1 on sh.sales_fact_part(id_char)
global partition by hash(id_char)
partitions 32;
select * from sh.sales_fact_part where to_char(id)='1000';
--------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart | Pstop |
--------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1470 | 219K| 368 (0)| 00:00:05 | | |
| 1 | PARTITION HASH SINGLE | | 1470 | 219K| 368 (0)| 00:00:05 | 1 | 1 |
| 2 | TABLE ACCESS BY GLOBAL INDEX ROWID| SALES_FACT_PART | 1470 | 219K| 368 (0)| 00:00:05 | ROWID | ROWID |
|* 3 | INDEX RANGE SCAN | SALES_FACT_PART_C1 | 588 | | 1 (0)| 00:00:01 | 1 | 1 |
--------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access(TO_CHAR("ID")='1000')
反转键索引
在反转键索引中,列值按照逐个字符的反向顺序存储。例如,列值12345在索引中存储为54321。因为列值是按照反向顺序存储的,连续的列值将会存储爱不同的索引叶子块中,从而避免了右侧增长索引所带来的资源征用问题。但是,在表数据块中,这些列值还是存储为12345的。
反转键索引有两个问题。
1.反转键索引的范围扫描不能使用范围运算符如between,<,>等。这是可以理解的,因为索引范围扫描的基本
假设就是列值按照反转顺序存储,没有按照逻辑键的顺序来维护违反了这个假设。因此索引范围扫描不适用于反转
键索引。
2.反转键索引可能会人为的增加物理读取的次数,因为列值被存储在很多个叶子块中,而这些叶子块可能需要
读取到缓冲区缓存中来修改块。但是,这个IO成本的增加需要与右侧增长索引引起的并发性问题相对照来衡量。
在下面sql中,使用关键字reverse建立了一个反转键索引sales_fact_part_n1。首先,具有谓语id=1000的select语句使用了反转键索引,因为等式谓语可以使用反转键索引。但接下来的具有谓语id between 1000 and 1001 的 select 语句使用了全表扫描访问路径,因为索引范围扫描访问路径不能使用反转键索引。
drop index sh.sales_fact_part_c1;
drop index sh.sales_fact_part_n1;
create unique index sh.sales_fact_part_n1 on sh.sales_fact_part(id) global reverse;
select * from sh.sales_fact_part where id=1000;
-------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 153 | 2 (0)| 00:00:01 | | |
| 1 | TABLE ACCESS BY GLOBAL INDEX ROWID| SALES_FACT_PART | 1 | 153 | 2 (0)| 00:00:01 | 25 | 25 |
|* 2 | INDEX UNIQUE SCAN | SALES_FACT_PART_N1 | 1 | | 1 (0)| 00:00:01 | | |
-------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=1000)
select * from sh.sales_fact_part where id between 1000 and 1001;
------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 29 | 4437 | 371 (1)| 00:00:05 | | |
| 1 | PARTITION HASH ALL| | 29 | 4437 | 371 (1)| 00:00:05 | 1 | 32 |
|* 2 | TABLE ACCESS FULL| SALES_FACT_PART | 29 | 4437 | 371 (1)| 00:00:05 | 1 | 32 |
------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("ID">=1000 AND "ID"<=1001)
Note
------
dynamic sampling used for this statement (level=2)
尤其是在rac中,右侧增长索引可能会引起不能容忍的性能问题。反转键索引被引入来解决性能问题。但有时候你可能应该考虑散列分区索引而不是反转键索引。
降序索引
索引默认按照升序存储列值,但可以使用降序索引来切换为降序存储。如果你的应用按照特定的顺序来获取数据,则在数据行被发送给应用之前需要排序。通过降序索引可以避免这个排序。如果应用按照某个特定的顺序上百万次的获取数据,则这类索引是非常有用的。例如,取自客户交易表的按照时间顺序逆序排列的客户数据。
下面sql中,使用product desc,year desc 及 week desc 声明了3个降序排列的列,在其上添加索引sales_fact_c1,select语句声明了与索引顺寻像匹配的排序顺序prodduct desc,year desc, week desc 来访问表。执行计划显示 即使在select语句中有order by 子句,在语句执行的过程中并没有排序步骤。
自Oracle 11gR2版本开始,降序索引实现为基于函数的索引。
select year,week from sh.sales_fact s
where year in (1998,1999,2000) and week <5 and product = 'Xtend Memory'
order by product desc,year desc, week desc;
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 112 | 3808 | 311 (1)| 00:00:04 |
| 1 | SORT ORDER BY | | 112 | 3808 | 311 (1)| 00:00:04 |
|* 2 | TABLE ACCESS FULL| SALES_FACT | 112 | 3808 | 310 (1)| 00:00:04 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("PRODUCT"='Xtend Memory' AND "WEEK"<5 AND ("YEAR"=1998 OR
"YEAR"=1999 OR "YEAR"=2000))
drop index sh.sales_fact_c1;
create index sh.sales_fact_c1 on sh.sales_fact(product desc,year desc, week desc);
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 112 | 3808 | 5 (0)| 00:00:01 |
|* 1 | INDEX RANGE SCAN| SALES_FACT_C1 | 112 | 3808 | 5 (0)| 00:00:01 |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access(SYS_OP_DESCEND("PRODUCT")=HEXTORAW('A78B9A919BDFB29A92908D86
FF') AND SYS_OP_DESCEND("WEEK")>HEXTORAW('3EF9FF') )
filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("PRODUCT"))='Xtend Memory'
AND SYS_OP_UNDESCEND(SYS_OP_DESCEND("WEEK"))<5 AND
(SYS_OP_UNDESCEND(SYS_OP_DESCEND("YEAR"))=1998 OR
SYS_OP_UNDESCEND(SYS_OP_DESCEND("YEAR"))=1999 OR
SYS_OP_UNDESCEND(SYS_OP_DESCEND("YEAR"))=2000))
不可见索引
在某些场景下,你可能需要增加一个索引来调优sql语句的性能,但你不太确定索引带来的负面影响。不可见索引在以较小的风险来衡量新索引所带来的影响方面非常有用。索引可以加入到数据库中并被表急为不可见,这样优化器就不会选用这个索引。可以在确定某个索引没有负面影响或对执行计划没有负面影响后把它标记为可见。在数据库中加入索引以后,你可以在会话中将optimizer_use_invisible_indexes参数设置为TRUE,这样不会影响应用性能。然后复查sql语句的执行计划。在下面sql中,第一个select语句在执行计划中使用索引sales_fact_c1。
接下来的sql语句将sales_fact_c1索引标记为不可见,从而同一个select语句的第2个执行计划显示该索引被优化器忽略了。
select * from (
select * from sh.sales_fact where product='Xtend Memory' and year= 1998 and week=1
) where rownum < 21;
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7 | 462 | 4 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| SALES_FACT | 7 | 462 | 4 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | SALES_FACT_C1 | 1 | | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
alter index sh.SALES_FACT_C1 invisible; --不可见
alter index sh.SALES_FACT_C1 visible; --可见
select * from (
select * from sh.sales_fact where product='Xtend Memory' and year= 1998 and week=1
) where rownum < 21;
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7 | 462 | 310 (1)| 00:00:04 |
|* 1 | COUNT STOPKEY | | | | | |
|* 2 | TABLE ACCESS FULL| SALES_FACT | 7 | 462 | 310 (1)| 00:00:04 |
---------------------------------------------------------------------------------
不可见索引还有另一个应用场景。这种索引有助于在删除不适用的索引时来降低风险。从生产数据库中删除不使用的索引并不是令人愉快的经历,可能之后却意外的认识到删除的索引在一个很重要的报表中用到了。即使在经过充分的分析后,也有可能被删除的索引在某个商务过程中是必须的,而重建索引可能会导致应用停机。从Oracle 11g以来,你可以将索引标记为不可见,等过了几周后,如果没有任何进程要用到这个索引,则可以较为安全的将其删掉。如果被表急不可见后发现索引是需要的,则可以很快的使用一个sql语句将索引还原为可见状态。
alter session set optimizer_use_invisible_indexes = true; --可见
alter session set optimizer_use_invisible_indexes = false; --不可见
select * from (
select * from sh.sales_fact where product='Xtend Memory' and year= 1998 and week=1
) where rownum < 21;
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7 | 462 | 4 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| SALES_FACT | 7 | 462 | 4 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | SALES_FACT_C1 | 1 | | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
虚拟索引
你曾经有过增加一个索引,但后来却意外的认识到由于数据分布或某种统计信息问题的原因这个索引不会被优化器选中的经历吗?虚拟索引对于查看索引的有效性是很有用的。虚拟索引不会分配存储空间,因此可以很快建立。虚拟索引与不可见索引的不同之处在于不可见索引是有与之相关存储的存储的,只是优化器不选择它们。而虚拟索引没有与之关联的存储空间。由于这个原因,虚拟索引也被称为无段索引。
会话可修改的一个下划线参数_use_nosegment_indexes控制了优化器是否可以考虑选择虚拟索引。这个参数的默认值是FALSE,应用不会选择虚拟索引。可以通过下面这种方法在不影响应用的基础上测试虚拟索引:创建索引,在你的会话中将这个参数设置为TRUE,然后验证sql语句的执行计划。下面sql中,使用nosegment子句创建了一个虚拟索引sales_virt。
在当前会话中讲参数值修改为TRUE之后,检查了SELECT语句的执行计划。执行计划显示优化器讲sql语句选择这个索引。在检查过执行计划之后,可以讲这个索引删掉并重建为常规索引。
create index sh.sales_vit on sh.sales(cust_id,promo_id) nosegment; --在当前会话创建虚拟索引
alter session set "_use_nosegment_indexes"=false; --当前会话关闭虚拟索引
alter session set "_use_nosegment_indexes"=false; --当前会话打开虚拟索引
explain plan for
select * from sh.sales
where cust_id=:b1 and promo_id=:b2;
select * from table(dbms_xplan.display);
位图联结索引
位图联结索引对于数据仓库应用中物化事实表和维度表之间的联结是很有用的。在数据仓库表中,一般来说,事实表比维度表要大的多,并且维度和事实表使用主键进行联结——在它们之间存在外键关系。这种联结的成本由于事实表很大而更高。如果能够预先存储联结结果则这些查询的性能就会得到提高。物化视图是预先计算联结结果的可选项之一,位图联结索
引是另一个可选项。
在下面sql中,给出了一个典型的数据仓库查询及其执行计划。在这个查询中,sales表是一张事实表,其他表是维度表。sales表与其他维度白哦通过维度表上的主键列相联结。自行计划显示在这个联结步骤中sales表是引导表,与其他维度表进行联结来得出最终的结果集。这就是执行计划中的4个联结运算。如果事实表很大的画这个执行计划的成本就会很高。
–典型的数据仓库(DW)查询
select sum(s.quantity_sold) sum_sq,sum(s.amount_sold) sum_sa
from sh.sales s ,sh.products p ,sh.customers c ,sh.channels ch
where s.prod_id=p.prod_id
and s.cust_id=c.cust_id
and s.channel_id=ch.channel_id
and p.prod_name='Y box'
and c.cust_first_name='Abigail'
and ch.channel_desc='Direct_sales';
--------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 75 | 784 (1)| 00:00:10 | | |
| 1 | SORT AGGREGATE | | 1 | 75 | | | | |
|* 2 | HASH JOIN | | 20 | 1500 | 784 (1)| 00:00:10 | | |
|* 3 | TABLE ACCESS FULL | CUSTOMERS | 43 | 516 | 405 (1)| 00:00:05 | | |
| 4 | NESTED LOOPS | | | | | | | |
| 5 | NESTED LOOPS | | 3235 | 199K| 378 (0)| 00:00:05 | | |
| 6 | MERGE JOIN CARTESIAN | | 1 | 43 | 6 (0)| 00:00:01 | | |
|* 7 | TABLE ACCESS FULL | CHANNELS | 1 | 13 | 3 (0)| 00:00:01 | | |
| 8 | BUFFER SORT | | 1 | 30 | 3 (0)| 00:00:01 | | |
|* 9 | TABLE ACCESS FULL | PRODUCTS | 1 | 30 | 3 (0)| 00:00:01 | | |
| 10 | PARTITION RANGE ALL | | | | | | 1 | 28 |
| 11 | BITMAP CONVERSION TO ROWIDS | | | | | | | |
| 12 | BITMAP AND | | | | | | | |
|* 13 | BITMAP INDEX SINGLE VALUE | SALES_PROD_BIX | | | | | 1 | 28 |
|* 14 | BITMAP INDEX SINGLE VALUE | SALES_CHANNEL_BIX | | | | | 1 | 28 |
| 15 | TABLE ACCESS BY LOCAL INDEX ROWID| SALES | 3190 | 63800 | 378 (0)| 00:00:05 | 1 | 1 |
--------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("S"."CUST_ID"="C"."CUST_ID")
3 - filter("C"."CUST_FIRST_NAME"='Abigail')
7 - filter("CH"."CHANNEL_DESC"='Direct_sales')
9 - filter("P"."PROD_NAME"='Y box')
13 - access("S"."PROD_ID"="P"."PROD_ID")
14 - access("S"."CHANNEL_ID"="CH"."CHANNEL_ID")
在下面sql中,建立了一个位图联结索引sales_bji1累预先计算联结结果。索引创建语句与查询中类似的联结了salesman表和维度表。
在创建索引后再次执行了select语句,该select语句的执行计划显示首先访问位图联结索引,接下来不适用任何联结操作访问sales表。在内部,表中增加了3个新的虚拟列,并且在这3个虚拟列上创建了索引。简单来说,位图联结索引通过虚拟列上的索引物化了结果集,从而避免了成本较高的联结操作。
关于位图联结所以有一点局限性,所有维度都需要定义有经过验证的主键或唯一键约束,索引必须是局部的等。下面sql中前三个语句讲约束的状态修改为validated以启用位图联结索引的创建。
alter table sh.products modify primary key validate;
alter table sh.customers modify primary key validate;
alter table sh.channels modify primary key validate;
create bitmap index sh.sales_bji1 on sh.sales(p.prod_name,c.cust_first_name,ch.channel_desc)
from sh.sales s ,sh.products p,sh.customers c ,sh.channels ch
where s.prod_id=p.prod_id
and s.cust_id=c.cust_id
and s.channel_id=ch.channel_id
local;
select sum(s.quantity_sold) sum_sq,sum(s.amount_sold) sum_sa
from sh.sales s ,sh.products p ,sh.customers c ,sh.channels ch
where s.prod_id=p.prod_id
and s.cust_id=c.cust_id
and s.channel_id=ch.channel_id
and p.prod_name='Y box'
and c.cust_first_name='Abigail'
and ch.channel_desc='Direct_sales';
------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 20 | 61 (0)| 00:00:01 | | |
| 1 | SORT AGGREGATE | | 1 | 20 | | | | |
| 2 | PARTITION RANGE ALL | | 19 | 380 | 61 (0)| 00:00:01 | 1 | 28 |
| 3 | TABLE ACCESS BY LOCAL INDEX ROWID| SALES | 19 | 380 | 61 (0)| 00:00:01 | 1 | 28 |
| 4 | BITMAP CONVERSION TO ROWIDS | | | | | | | |
|* 5 | BITMAP INDEX SINGLE VALUE | SALES_BJI1 | | | | | 1 | 28 |
------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("S"."SYS_NC00008$"='Y box' AND "S"."SYS_NC00009$"='Abigail' AND
"S"."SYS_NC00010$"='Direct_sales')
只有好用的数据模型,位图联结索引就可以提高数据仓库的性能。但是这种索引在OLTP的应用中是没用的。