dbms_stats 包对段表的分析有三个层次:
表自身的分析:包括表中的行数、数据块数、行长等信息。
列的分析:包括列值的重复数、列上的空值、数据在列上的分布情况。
索引的分析:包括索引叶块的数量、索引的深度、索引的聚合因子等。
直方图有时对于CBO非常重要,特别是对于有字段数据非常倾斜的表,做直方图分析尤为重要,比如下面的例子:
SQL> show user
USER 为 "U1"
SQL> create table t3 as select 1 id,object_name name from test;
表已创建。
SQL> update t3 set id = 99 where rownum = 1;
已更新 1 行。
SQL> commit;
提交完成。
SQL> select id,count(*) from t3 group by id;
ID COUNT(*)
---------- ----------
1 402340
99 1
SQL> create index idx_t3_id on t3(id);
索引已创建。
SQL> exec dbms_stats.gather_table_stats(ownname=>'u1',tabname=>'t3',cascade=>true);
PL/SQL 过程已成功完成。
我们创建了一张表,他的id字段数值严重倾斜,除了有一条记录为99外,其他的记录全部是1,然后对表做分析,默认情况下,dbms_stats包会对所有的列做直方图分析,可以从这个视图上查到相关的信息:
SQL> select table_name,column_name,endpoint_number,endpoint_value from user_histograms where table_name = 'T3';
TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE
------------------------------ --------------- --------------- --------------
T3 ID 0 1
T3 NAME 0 2.4504E+35
T3 ID 1 1
T3 NAME 1 6.2511E+35
SQL> select table_name,num_rows,avg_row_len,to_char(last_analyzed,'yyyy-mm-dd hh24:mi:ss') from user_tables;
TABLE_NAME NUM_ROWS AVG_ROW_LEN TO_CHAR(LAST_ANALYZ
------------------------------ ---------- ----------- -------------------
T3 400883 27 2011-12-10 03:41:55
现在来看让id值分布等于1,99的时候,CBO产生的执行计划的情况:
SQL> set autotrace traceonly;
SQL> select * from t3 where id = 99;
执行计划
----------------------------------------------------------
Plan hash value: 3977267011
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 27 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T3 | 1 | 27 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_T3_ID | 1 | | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=99)
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
4 consistent gets
1 physical reads
0 redo size
466 bytes sent via SQL*Net to client
400 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
从这里我们可以看到,CBO根据直方图的信息知道id=99的记录只有一行,所以它选择了索引,这个选择非常正确。
SQL> set autotrace traceonly;
SQL> select * from t3 where id = 1;
已选择402340行。
执行计划
----------------------------------------------------------
Plan hash value: 4161002650
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 400K| 10M| 418 (4)| 00:00:06 |
|* 1 | TABLE ACCESS FULL| T3 | 400K| 10M| 418 (4)| 00:00:06 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ID"=1)
统计信息
----------------------------------------------------------
2 recursive calls
0 db block gets
28495 consistent gets
1728 physical reads
0 redo size
12140483 bytes sent via SQL*Net to client
295442 bytes received via SQL*Net from client
26824 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
402340 rows processed
同样,CBO从直方图的信息中得到id=1的值几乎与表的记录数一样,所以全表扫描被使用。现在让我们将直方图信息删除,保留表和索引的分析信息:
SQL> exec dbms_stats.delete_column_stats(user,'t3','id');
PL/SQL 过程已成功完成。
SQL> select table_name,column_name,endpoint_number,endpoint_value from user_histograms where table_name = 'T3';
TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE
------------------------------ --------------- --------------- --------------
T3 NAME 0 2.4504E+35
T3 NAME 1 6.2963E+35
SQL> select table_name,num_rows,avg_row_len,to_char(last_analyzed,'yyyy-mm-dd hh24:mi:ss') from user_tables where table_name = 'T3';
TABLE_NAME NUM_ROWS AVG_ROW_LEN TO_CHAR(LAST_ANALYZ
------------------------------ ---------- ----------- -------------------
T3 403448 27 2011-12-10 04:12:42
SQL> select blevel,leaf_blocks,distinct_keys,last_analyzed from user_indexes where table_name = 'T3';
BLEVEL LEAF_BLOCKS DISTINCT_KEYS LAST_ANALYZED
---------- ----------- ------------- --------------
2 786 2 10-12月-11
可以看见id字段的信息已经被删除,索引和表的信息仍然存在,并且索引中甚至可以找到重复的数值只有2个,但是CBO却无法得到这两个重复的数值的分布情况,所以依然没有办法选出一个正确的执行计划。
SQL> set autotrace traceonly;
SQL> select * from t3 where id = 1;
已选择402340行。
执行计划
----------------------------------------------------------
Plan hash value: 3977267011
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4034 | 106K| 403 (1)| 00:00:05 |
| 1 | TABLE ACCESS BY INDEX ROWID| T3 | 4034 | 106K| 403 (1)| 00:00:05 |
|* 2 | INDEX RANGE SCAN | IDX_T3_ID | 1614 | | 395 (0)| 00:00:05 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=1)
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
28495 consistent gets
1309 physical reads
0 redo size
12140483 bytes sent via SQL*Net to client
295442 bytes received via SQL*Net from client
26824 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
402340 rows processed
SQL> select * from t3 where id = 99;
执行计划
----------------------------------------------------------
Plan hash value: 3977267011
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 4034 | 106K| 403 (1)| 00:00:05 |
| 1 | TABLE ACCESS BY INDEX ROWID| T3 | 4034 | 106K| 403 (1)| 00:00:05 |
|* 2 | INDEX RANGE SCAN | IDX_T3_ID | 1614 | | 395 (0)| 00:00:05 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=99)
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
466 bytes sent via SQL*Net to client
400 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
可以看见上面的执行计划已经是不正确的了。