Exadata 之Storage index（摘录加总结）

最新推荐文章于 2023-11-14 11:20:16 发布

cqhiabc50405

最新推荐文章于 2023-11-14 11:20:16 发布

阅读量353

点赞数

Exadata Storage index

存储索引不同于传统意义上的 B-tree 或者bitmap 索引；它不是用来查找数据的的，而是用来过滤数据的。其设计的目的是用来消除磁盘IO：

•Exadata存储索引在Cell节点的内存中自动维护

–存放访问列的最大最小值

–通常以每MB数据来进行索引

•通过最大最小值来过滤WHERE语句中不必要的IO操作

•完全自动维护不透明

例如SELECT cust_state FROM customers WHERE cust_age > 35

当查询时只要根据此索引，访问max value >= 35 相应的存储单元即可；这样就降低了IO。

存储索引最多可以有 8个列，它们位于内存中，并且永远不会被写入磁盘；

Queries using the following comparisons are improved by the storage index:

■ Equality (=)

■ Inequality ()

■ Less than or equal (<=)

■ Greater than or equal (>=)

■ IS NULL

■ IS NOT NULL

存储索引在排序了的数据上更加有效。

I should point out that there are many cases where several columns will benefit from this

behavioral characteristic of Storage Indexes. It’s common in data warehouse environments to have data that is partitioned on a date column. And there are often many other columns that track the partition key such as associated dates (order date, ship date, insert date, return date for example) or sequentially generated numbers like order numbers. Queries against these column are often problematic due the fact that partition eliminate cannot help them. Storage Indexes will provide a similar benefit to partition elimination as long as care is taken to ensure that the data is pre-sorted prior to loading.

监控存储索引：

Statistic（cell physical IO bytes saved by storage index）：

V$sesstat，v$sysstat中的相关信息；

跟踪：

启用隐含参数：_CELL_STORAGE_INDEX_DIAG_MODE=true

统计信息的一个例子：

SQL> select name, value

2 from v$mystat s, v$statname n

3 where n.statistic# = s.statistic#

4 and name like '%storage%';

NAME VALUE

---------------------------------------------------------------- ----------

cell physical IO bytes saved by storage index 0

SQL> select /*+ parallel(a 32) */ sum(POINTS) from test_hcc1 a where POINTS<10;

SUM(POINTS)

-----------

-2.428E+09

SQL> select name, value

2 from v$mystat s, v$statname n

3 where n.statistic# = s.statistic#

4 and name like '%storage%';

NAME VALUE

---------------------------------------------------------------- ----------

cell physical IO bytes saved by storage index 24166400

通过参数_kcfis_storageidx_disabled可以实现对 storage index 的控制：

alter session set "_kcfis_storageidx_disabled"=true;

SQL> alter session set "_kcfis_storageidx_disabled"=true;

Session altered.

Elapsed: 00:00:00.00

SQL> select count(*) from TEST_hcc1 where eid is null;

COUNT(*)

----------

Elapsed: 00:00:00.79

SQL> alter session set "_kcfis_storageidx_disabled"=false;

Session altered.

Elapsed: 00:00:00.00

SQL> select count(*) from TEST_hcc1 where eid is null;

COUNT(*)

----------

Elapsed: 00:00:00.36

SQL>

SQL> @si.sql

NAME VALUE

---------------------------------------------------- ---------------

cell physical IO bytes saved by storage index 2534064128

Elapsed: 00:00:00.01

SQL>

SQL> select count(*) from TEST_hcc1 where eid is null;

COUNT(*)

----------

Elapsed: 00:00:00.35

SQL>

SQL> @si.sql

NAME VALUE

---------------------------------------------------- ---------------

cell physical IO bytes saved by storage index 3526189056

Elapsed: 00:00:00.01

总结

存储索引是只有在数据库可以使用智能扫描的情况下才会使用的一种优化手段。他可以带来显著的性能提升，存储索引可以被看成是另外一种类似分区的解决方案，但是没有普通索引的限制。对于那些通过与主分区键相关联的候选键访问数据的查询，存储索引是非常有效的。数据如何在磁盘上进行物理存储需要慎重考虑。这会对存储索引的效果产生巨大的影响。往Exadata平台中迁移数据的时候，应该确保数据在磁盘上的有序，才能更好的发挥存储索引的作用。

aa.JPG