高效的SQL(Index-Organized Tables优化精确查询和范围查询)
本文3部分
1、创建索引组织表(PCTTHRESHOLD和OVERFLOW的意义)
2、索引组织表中删除大量数据及重建索引(不是index rebuild而是alter table tablenamesmove online)
3、索引组织表查询优化(2种情况:3-1、exact match | 3-2、range search)【还有一种情况exact match和range search同时使用,但我们的实验没有涉及】
1、 create Index-Organized Tables
CREATE TABLE doudou_idx(
token char(20),
doc_id NUMBER,
token_frequency NUMBER,
token_offsets VARCHAR2(2000),
CONSTRAINT pk_admin_docindex PRIMARY KEY (token, doc_id))
ORGANIZATION INDEX
TABLESPACE doudou_index
PCTTHRESHOLD 20
OVERFLOW TABLESPACE users;
This example creates an index-organized table nameddoudou_idx_organized, with a primary key composed of the columnstoken
anddoc_id
.TheOVERFLOW
andPCTTHRESHOLD
clauses specify that if the length of a row exceeds 20% of the index block size, then the column that exceeded that threshold and all columns after it are moved to the overflow segment. The overflow segment is stored in theuser
tablespace
doudou@TEST> CREATE TABLE doudou_idx(
2 token char(20),
3 doc_id NUMBER,
4 token_frequency NUMBER,
5 token_offsets VARCHAR2(2000),
6 CONSTRAINT pk_admin_docindex PRIMARY KEY (token, doc_id))
7 ORGANIZATION INDEX
8 TABLESPACE doudou_index
9 PCTTHRESHOLD 20
10 OVERFLOW TABLESPACE users;
Table created.
doudou@TEST> begin
2 for i in 1..100000 loop
3 insert into doudou_idx (token,doc_id) values (i,i+2);
4 end loop;
5 commit;
6 end ;
7 /
PL/SQL procedure successfully completed.
doudou@TEST> select index_name,index_type,status from user_indexes where table_name='DOUDOU_IDX';
INDEX_NAME INDEX_TYPE STATUS
------------------------------ ------------------------------ ----------------
PK_ADMIN_DOCINDEX IOT - TOP VALID
【索引组织表的索引index_type 是IOT – TOP】
2、deleting substantial amount of rows onindex organized tables
doudou@TEST> analyze index PK_ADMIN_DOCINDEX validate structure;
doudou@TEST> select height, blocks, del_lf_rows from index_stats;
HEIGHT BLOCKS DEL_LF_ROWS
---------- ---------- -----------
3 896 0
doudou@TEST> delete from doudou_idx where mod(doc_id,2)=1;
50000 rows deleted.
doudou@TEST> commit;
Commit complete.
doudou@TEST> analyze index PK_ADMIN_DOCINDEX validate structure;
doudou@TEST> select height, blocks, del_lf_rows from index_stats;
HEIGHT BLOCKS DEL_LF_ROWS
---------- ---------- -----------
3 896 50000
索引组织表中重建索引(move online)【9i以后有的特性】
doudou@TEST> alter table doudou_idx move online;
Table altered.
doudou@TEST> analyze index PK_ADMIN_DOCINDEX validate structure;
doudou@TEST> select height, blocks, del_lf_rows from index_stats;
HEIGHT BLOCKS DEL_LF_ROWS
---------- ---------- -----------
2 256 0
Delete many rows of index organized tables are OK!
3、Using Index-Organized Tables for Performance
Index-organized tables data rows are stored in the index, index-organized tables provide faster key-based access to table data forqueries that involve exact match or range search or both.
3-1:exact match
doudou@TEST> select * from doudou_idx where token='6' and doc_id=8;
Execution Plan
----------------------------------------------------------
Plan hash value: 2472525284
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 1050 | 2 (0)| 00:00:01 |
|* 1 | INDEX UNIQUE SCAN| PK_ADMIN_DOCINDEX | 1 | 1050 | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("TOKEN"='6' AND "DOC_ID"=8)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2 consistent gets
0 physical reads
0 redo size
615 bytes sent via SQL*Net to client
400 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
doudou@TEST> select *from doudou_test where token='6' and doc_id=8;
Execution Plan
----------------------------------------------------------
Plan hash value: 930504952
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 1050 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| DOUDOU_TEST | 1 | 1050 | 2 (0)| 00:00:01 |
|* 2 | INDEX UNIQUE SCAN | PK_TEST | 1 | | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("TOKEN"='6' AND "DOC_ID"=8)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
615 bytes sent via SQL*Net to client
400 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
【索引组织表中2 consistent gets,一般表中3 consistent gets,同样都INDEX UNIQUE SCAN ,为什么一般表会多一次consistent gets呢?多出的一次consistent gets就是附加的索引扫描IO。实验环境效果不是很明显,生产环境中会有效果的。】
结论:索引组织表exact match query性能高
3-2:range search
doudou@TEST> select * from doudou_idx where tokenlike '22%';
556 rows selected.
Elapsed: 00:00:00.03
Execution Plan
----------------------------------------------------------
Plan hash value: 2166667558
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 841 | 862K| 12 (0)| 00:00:01 |
|* 1 | INDEX RANGE SCAN| PK_ADMIN_DOCINDEX | 841 | 862K| 4 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("TOKEN" LIKE '22%')
filter("TOKEN" LIKE '22%')
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
557 recursive calls
0 db block gets
185 consistent gets
0 physical reads
0 redo size
22453 bytes sent via SQL*Net to client
807 bytes received via SQL*Net from client
39 SQL*Net roundtrips to/from client
9 sorts (memory)
0 sorts (disk)
556 rows processed
doudou@TEST> select * from doudou_test where tokenlike '22%';
556 rows selected.
Elapsed: 00:00:00.01
Execution Plan
----------------------------------------------------------
Plan hash value: 502932140
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 556 | 570K| 7 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| DOUDOU_TEST | 556 | 570K| 7 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | PK_TEST | 556 | | 4 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("TOKEN" LIKE '22%')
filter("TOKEN" LIKE '22%')
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
425 recursive calls
0 db block gets
214 consistent gets
0 physical reads
0 redo size
22453 bytes sent via SQL*Net to client
807 bytes received via SQL*Net from client
39 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
556 rows processed
【索引组织表中185 consistent gets,一般表中214 consistent gets。】
结论:索引组织表range search query 性能高
总结:
1、 索引组织表数据存储在索引里,减少了附加的索引扫描IO,提高了性能
2、 索引组织表重建索引是(alter table tablenames move online;)【不是index rebuild 需注意】
3、 索引组织表PCTTHRESHOLD和OVERFLOW的意义:
PCTTHRESHOLD描述了一行长度与索引块大小的关系。
OVERFLOW指明所有列move到的新表空间或说新的段。
附表
Creating Index-Organized Tables
http://docs.oracle.com/cd/E11882_01/server.112/e25494/tables012.htm#ADMIN01506
deleting substantial amount of rows onindex organized tables
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:239614547000 (asktom)
Using Index-Organized Tables for Performance
http://docs.oracle.com/cd/E11882_01/server.112/e16638/data_acc.htm#PFGRF94787
普通表(HOT)与索引组织表(IOT)的不同
Table 3-4 Comparison of Heap-Organized Tables with Index-Organized Tables
Heap-Organized Table | Index-Organized Table |
Therowid uniquely identifies a row. Primary key constraint may optionally be defined. | Primary key uniquely identifies a row. Primary key constraint must be defined. |
Physical rowid inROWID pseudocolumn allows building secondary indexes. | Logical rowid inROWID pseudocolumn allows building secondary indexes. |
Individual rows may be accessed directly by rowid. | Access to individual rows may be achieved indirectly by primary key. |
Sequentialfull table scan returns all rows in some order. | A full index scan or fast full index scan returns all rows in some order. |
Can be stored in atable cluster with other tables. | Cannot be stored in a table cluster. |
Can contain a column of theLONG data type and columns ofLOB data types. | Can contain LOB columns but notLONG columns. |
Can containvirtual columns (only relational heap tables are supported). | Cannot contain virtual columns. |