Oracle Index Compression

谭祖爱

Category: [Database SQL Optimization]  Tags: oracle

Article link: https://blog.csdn.net/tanzuai/article/details/42493383

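Oracle index compression (key compression) was introduced in Oracle8i. It compresses repeated key values in an index or an index-organized table, which saves storage space. Nonpartitioned unique indexes (with at least two key columns) and non-unique indexes can both be compressed. Bitmap indexes cannot be compressed.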

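A few terms in Oracle index compression are easy to mix up and need explaining. Compression works by splitting each index key into two parts: the grouping piece, also called the prefix, and the unique piece, also called the suffix. The grouping piece is the part that gets compressed: it is stored once and shared by all the unique pieces that go with it. If the remaining key columns cannot provide a unique piece, Oracle uses the rowid to identify the entry uniquely. Only the leaf blocks of a B-tree index are compressed; branch blocks are not. Compression happens within a single block and never across blocks, so a grouping piece (prefix) and the unique pieces (suffixes) that share it are stored in the same index block.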
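
The block-local nature of compression can be sketched with a small Python model. This is only an illustration under stated assumptions: the function name `prefix_entries_per_block` is hypothetical, the first column is taken as the prefix, and a "block" is modelled as a fixed number of keys, whereas real Oracle leaf blocks fill up by bytes.

```python
def prefix_entries_per_block(sorted_keys, prefix_len, keys_per_block):
    """Count how many prefix entries each leaf block has to store.

    Hypothetical model: a block is a fixed-size slice of the sorted key
    list. A prefix is shared only by keys inside the same block.
    """
    counts = []
    for start in range(0, len(sorted_keys), keys_per_block):
        block = sorted_keys[start:start + keys_per_block]
        # Inside one block, each distinct prefix is stored once.
        distinct_prefixes = {key[:prefix_len] for key in block}
        counts.append(len(distinct_prefixes))
    return counts


keys = sorted([("A", 1), ("A", 2), ("A", 3), ("A", 4), ("B", 1), ("B", 2)])
per_block = prefix_entries_per_block(keys, prefix_len=1, keys_per_block=3)
print(per_block)  # [1, 2]
```

There are only two distinct prefixes ("A" and "B"), but three prefix entries are stored in total, because the prefix "A" spills into the second block and has to be repeated there.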

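So how is a key split into prefix and suffix? For a unique index, the default prefix length is the number of index columns minus 1; for a non-unique index, the default is the number of index columns. The prefix length can also be set explicitly: the maximum is the number of index columns for a non-unique index, and the number of index columns minus 1 for a unique index. For example, suppose an index has three columns:

With a prefix length of 2 (the default for a unique index): prefix (column1,column2) suffix (column3)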

Given the key values (1,2,3), (1,2,4), (1,2,7), (1,3,5), (1,3,4), (1,4,4), the repeated prefixes (1,2) and (1,3) are each compressed down to a single stored copy.
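
The split of these six keys can be sketched in Python. This is a minimal model of the idea, not Oracle's storage format; `compress_keys` is a hypothetical helper name.

```python
def compress_keys(keys, prefix_len):
    """Group index keys by their first prefix_len columns.

    Each distinct prefix appears once as a dictionary key; the suffixes
    that share it are listed under it.
    """
    grouped = {}
    for key in sorted(keys):
        prefix, suffix = key[:prefix_len], key[prefix_len:]
        grouped.setdefault(prefix, []).append(suffix)
    return grouped


keys = [(1, 2, 3), (1, 2, 4), (1, 2, 7), (1, 3, 5), (1, 3, 4), (1, 4, 4)]
grouped = compress_keys(keys, prefix_len=2)
for prefix, suffixes in grouped.items():
    print(prefix, suffixes)
```

Six keys reduce to three stored prefixes: (1,2) is shared by three suffixes, (1,3) by two, and (1,4) has a single suffix, so compressing it saves nothing.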

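Index compression suits indexes whose key values repeat heavily; only then does it actually shrink the keys and save storage space. After compression, one index block holds more keys, so I/O performance improves for an index full scan or an index fast full scan, but the CPU load goes up. Overall performance depends on which effect dominates: the I/O gain or the extra CPU load. I do not think index compression always improves performance; its main value is saving storage space and reducing I/O time.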

SQL> create table objects1 as select object_id,object_name from dba_objects;

Table created.

SQL> create table objects2 as select 100 object_id,object_name from dba_objects;

Table created.

SQL> create table objects3 as select object_id,object_name from dba_objects;

Table created.

SQL> create index objects1_idx on objects1 (object_id) compress 1;

Index created.

SQL> create index objects2_idx on objects2 (object_id) compress 1;

Index created.

SQL> create index objects3_idx on objects3 (object_id);

Index created.  -- an uncompressed index, created for comparison

SQL> select index_name,compression,leaf_blocks

2 from user_indexes

3 where index_name in ('OBJECTS1_IDX','OBJECTS2_IDX','OBJECTS3_IDX');

INDEX_NAME COMPRESS LEAF_BLOCKS

------------------------------ -------- -----------

OBJECTS1_IDX ENABLED 222

OBJECTS2_IDX ENABLED 112

OBJECTS3_IDX DISABLED 161

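As we can see, the object_id values in objects1 and objects3 are all unique, so there is nothing to compress. The compressed index objects1_idx (222 leaf blocks) actually takes more space than the uncompressed objects3_idx (161 leaf blocks), so it would have been better left uncompressed. In objects2 every object_id is the same, and the effect of compression is clear (112 leaf blocks).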
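
Why compressing unique keys can make an index larger can be sketched with a rough size model in Python. Every byte cost below (`col_bytes`, `rowid_bytes`, `entry_overhead`, `prefix_overhead`, `prefix_ref_bytes`) is a made-up placeholder rather than Oracle's real on-disk layout, and the model ignores block boundaries; it only shows the direction of the effect.

```python
def estimated_bytes(keys, prefix_len, col_bytes=4, rowid_bytes=6,
                    entry_overhead=2, prefix_overhead=4, prefix_ref_bytes=2):
    """Rough index size estimate. All byte costs are hypothetical."""
    n_cols = len(keys[0])
    if prefix_len == 0:
        # Uncompressed: every entry carries all key columns plus its rowid.
        return len(keys) * (n_cols * col_bytes + rowid_bytes + entry_overhead)
    # Compressed: each distinct prefix is stored once, with its own overhead.
    distinct_prefixes = {key[:prefix_len] for key in keys}
    prefix_cost = len(distinct_prefixes) * (prefix_len * col_bytes + prefix_overhead)
    # Every entry keeps its suffix and rowid, plus a reference to its prefix.
    suffix_cost = len(keys) * ((n_cols - prefix_len) * col_bytes + rowid_bytes
                               + entry_overhead + prefix_ref_bytes)
    return prefix_cost + suffix_cost


unique_keys = [(i,) for i in range(1000)]   # like objects1: all distinct
repeated_keys = [(100,)] * 1000             # like objects2: all the same

uncompressed = estimated_bytes(unique_keys, prefix_len=0)
unique_compressed = estimated_bytes(unique_keys, prefix_len=1)
repeated_compressed = estimated_bytes(repeated_keys, prefix_len=1)
print(uncompressed, unique_compressed, repeated_compressed)
```

With these placeholder costs the compressed index over unique keys comes out larger than the uncompressed one, and the compressed index over repeated keys comes out smaller. That matches the ordering of the three LEAF_BLOCKS values above, though not their exact ratios.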

Besides compressing an index when it is created, you can also specify compression or decompression when you rebuild the index.

SQL> alter index objects1_idx rebuild nocompress;

Index altered.

SQL> alter index objects1_idx rebuild compress;

Index altered.

Note: compression introduces its own storage overhead. In many cases the space saved is larger than that overhead, so the overall storage cost goes down.

The number after compress is the prefix length, that is, the number of leading columns used for compression.

Let's work through another example to deepen our understanding of compressed indexes.

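First, using the index analysis data from index_stats, we look at the effect of compressing on different numbers of index columns. Then we summarize the results of the experiment.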

1. Create the test table t
create table t as select * from all_objects;

2. Create the index without compression
sec@secooler> create index idx_t_compress_index on t(owner,object_type,object_name);

Index created.

sec@secooler> analyze index idx_t_compress_index validate structure;

Index analyzed.

sec@secooler> select height, lf_blks, br_blks, btree_space, opt_cmpr_count, opt_cmpr_pctsave from index_stats;

HEIGHT LF_BLKS BR_BLKS BTREE_SPACE OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
------ ------- ------- ----------- -------------- ----------------
2 64 1 519772 2 28

3. Try compressing on the first column only
sec@secooler> drop index idx_t_compress_index;

Index dropped.

sec@secooler> create index idx_t_compress_index on t(owner,object_type,object_name) compress 1;

Index created.

sec@secooler> analyze index idx_t_compress_index validate structure;

Index analyzed.

sec@secooler> select height, lf_blks, br_blks, btree_space, opt_cmpr_count, opt_cmpr_pctsave from index_stats;

HEIGHT LF_BLKS BR_BLKS BTREE_SPACE OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
------ ------- ------- ----------- -------------- ----------------
2 56 1 455580 2 18

4. Try compressing on the first two columns
sec@secooler> drop index idx_t_compress_index;

Index dropped.

sec@secooler> create index idx_t_compress_index on t(owner,object_type,object_name) compress 2;

Index created.

sec@secooler> analyze index idx_t_compress_index validate structure;

Index analyzed.

sec@secooler> select height, lf_blks, br_blks, btree_space, opt_cmpr_count, opt_cmpr_pctsave from index_stats;

HEIGHT LF_BLKS BR_BLKS BTREE_SPACE OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
------ ------- ------- ----------- -------------- ----------------
2 46 1 375660 2 0

5. Try compressing on all three columns
sec@secooler> drop index idx_t_compress_index;

Index dropped.

sec@secooler> create index idx_t_compress_index on t(owner,object_type,object_name) compress 3;

Index created.

sec@secooler> analyze index idx_t_compress_index validate structure;

Index analyzed.

sec@secooler> select height, lf_blks, br_blks, btree_space, opt_cmpr_count, opt_cmpr_pctsave from index_stats;

HEIGHT LF_BLKS BR_BLKS BTREE_SPACE OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
------ ------- ------- ----------- -------------- ----------------
2 73 1 591444 2 36

6. Note: the index has only three columns, so compress 4 cannot be used, as you would expect.
sec@secooler> drop index idx_t_compress_index;

Index dropped.

sec@secooler> create index idx_t_compress_index on t(owner,object_type,object_name) compress 4;
create index idx_t_compress_index on t(owner,object_type,object_name) compress 4
*
ERROR at line 1:
ORA-25194: invalid COMPRESS prefix length value
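
The prefix length limits described earlier can be written down as a small Python check. This is a sketch of the rule only; the helper names `prefix_length_range` and `is_valid_prefix` are hypothetical and are not part of any Oracle API.

```python
def prefix_length_range(num_columns, unique):
    """Return (min, max) allowed COMPRESS prefix lengths, or None.

    Rule as stated in the text: at most num_columns for a non-unique
    index, at most num_columns - 1 for a unique index.
    """
    max_len = num_columns - 1 if unique else num_columns
    if max_len < 1:
        return None  # e.g. a single-column unique index: nothing to share
    return (1, max_len)


def is_valid_prefix(num_columns, unique, prefix_len):
    """True if prefix_len falls inside the allowed range."""
    allowed = prefix_length_range(num_columns, unique)
    return allowed is not None and allowed[0] <= prefix_len <= allowed[1]


# The three-column, non-unique index from the example:
print(prefix_length_range(3, unique=False))            # (1, 3)
print(is_valid_prefix(3, unique=False, prefix_len=4))  # False
```

A prefix length of 4 falls outside the range for a three-column index, which is the situation reported by ORA-25194 above.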

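7. Index compression summary
(1) The demonstration above leads to the following conclusions:
1) Compressing on the first two columns gives the best result (46 leaf blocks, against 64 uncompressed).
2) Compressing on all three columns uses more index space than no compression at all (73 leaf blocks). This follows from the compression mechanism: with every column in the prefix, few prefixes are shared, so the overhead of compression outweighs the savings.
3) Test repeatedly in practice to find the best prefix length.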
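
Picking the best prefix length from the measurements can be sketched in Python. The BTREE_SPACE figures are copied from the four index_stats outputs above, with prefix length 0 standing for the uncompressed index. The savings formula is an assumption made for illustration; the source does not state how Oracle computes OPT_CMPR_PCTSAVE.

```python
# BTREE_SPACE per prefix length, copied from the index_stats output above.
# Prefix length 0 stands for the uncompressed index.
btree_space = {0: 519772, 1: 455580, 2: 375660, 3: 591444}

# The best prefix length is the one with the smallest B-tree footprint.
best = min(btree_space, key=btree_space.get)

# Percentage of space each variant would save by switching to the best one.
savings = {
    n: round(100 * (size - btree_space[best]) / size)
    for n, size in btree_space.items()
}

print(best)     # 2
print(savings)  # {0: 28, 1: 18, 2: 0, 3: 36}
```

In this run the results agree with the OPT_CMPR_COUNT (2) and OPT_CMPR_PCTSAVE (28, 18, 0, 36) columns reported by index_stats. Treat that as an observation about this data set rather than a documented formula.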

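(2) Drawbacks of index compression:
1. Index maintenance takes longer, because more computation is needed.
2. Queries take longer to search the index, because more computation is needed.
3. More CPU is needed to process the index.
4. Block contention increases.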

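(3) Benefits of index compression:
1. The index takes less disk space, which is the obvious one.
2. The buffer cache can hold more index entries.
3. The cache hit ratio is higher.
4. There is less physical I/O.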

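Every technique is a trade-off between resources, and index compression shows this clearly: you have to balance disk against CPU and analyze each case on its own terms.
Tip: if the leading columns of a composite index contain a lot of repeated data, index compression is worth trying.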

Compiled from online sources
------------------------------------------------------------------------------
Blog： http://blog.csdn.net/tanzuai