Oracle优化器要给出一个最优的执行计划,应该也必须知道
sql语句中所引用的底层对象的详细信息,其中比较重要的概念就是:选择率和基数。
选择率( selectivity)和基数( cardinality).官方文档如下:
SELECTIVITY: This measure represents a fraction of rows from a row set. The selectivity is tied to a query predicate, such as last_name='Smith' , or a combination of predicates.
Cardinality:This measure represents the number of rows in a row set.
翻译成中文:a)选择率就是从一个行的集合中返回行数所占的比率 b)基数就是从行集合中返回的预估行数
若相关表没有直方图,选择率就简单的等于 DENSITY,cardinality基数=selectivity选择率 * number of rows 表的行数,下面通过具体的实验进行说明:
对表收集统计信息但不创建直方图
由于上述表没有收集直方图,优化器无法知道列值的具体倾斜情况,因此选择率就简单的等于
DENSITY(1/NUM_DISTINCT)。下面来查看一下基数评估的情况:
基数评估为909,怎么计算出来的呢。
选择率= DENSITY = 0.090909091,num of rows=10000,因此cardinality基数=10000*0.090909091=909
但实际情况呢?
表中skew=1的行数仅为1,评估的基数为909,为什么会出现这种情况,因为skew列存在严重的倾斜,且表对该列也没有收集直方图,造成评估错误。下面对表收集直方图看看基数
可以看到,在收集直方图后基数的评估为1,和实际情况相符。
只有统计信息准确,数据库才能生成理想的执行计划。
选择率( selectivity)和基数( cardinality).官方文档如下:
SELECTIVITY: This measure represents a fraction of rows from a row set. The selectivity is tied to a query predicate, such as last_name='Smith' , or a combination of predicates.
Cardinality:This measure represents the number of rows in a row set.
翻译成中文:a)选择率就是从一个行的集合中返回行数所占的比率 b)基数就是从行集合中返回的预估行数
若相关表没有直方图,选择率就简单的等于 DENSITY,cardinality基数=selectivity选择率 * number of rows 表的行数,下面通过具体的实验进行说明:
点击(此处)折叠或打开
- drop table t;
- create table t as select rownum all_distinct,10000 skew from dual connect by level<=10000;
- update t set skew=all_distinct where rownum<=10;
- commit;
- select skew,count(*) from t group by skew order by skew;
- SKEW COUNT(*)
---------- ----------
1 1
2 1
3 1
4 1
5 1
6 1
7 1
8 1
9 1
10 1
10000 9990
点击(此处)折叠或打开
- exec dbms_stats.gather_table_stats(USER,'T',method_opt=>'for all columns size 1');
- SELECT COLUMN_NAME,NUM_DISTINCT,utl_raw.cast_to_number(LOW_VALUE),utl_raw.cast_to_number(HIGH_VALUE),DENSITY,NUM_NULLS FROM user_tab_col_statistics WHERE table_name='T';
COLUMN_NAME NUM_DISTINCT UTL_RAW.CAST_TO_NUMBER(LOW_VALUE) UTL_RAW.CAST_TO_NUMBER(HIGH_VALUE) DENSITY NUM_NULLS
--------------- ------------ --------------------------------- ---------------------------------- ---------- ----------
ALL_DISTINCT 10000 1 10000 .0001 0
SKEW 11 1 10000 .090909091 0
点击(此处)折叠或打开
- explain plan for select * from t1 where skew=1;
- SQL> SELECT * FROM table(dbms_xplan.display);
-
- PLAN_TABLE_OUTPUT
- --------------------------------------------------------------------------------
- Plan hash value: 1601196873
-
- --------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- --------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 909 | 6363 | 7 (0)| 00:00:01 |
- |* 1 | TABLE ACCESS FULL| T | 909 | 6363 | 7 (0)| 00:00:01 |
- --------------------------------------------------------------------------
-
- Predicate Information (identified by operation id):
- ---------------------------------------------------
-
- PLAN_TABLE_OUTPUT
- --------------------------------------------------------------------------------
-
- 1 - filter("SKEW"=1)
选择率= DENSITY = 0.090909091,num of rows=10000,因此cardinality基数=10000*0.090909091=909
但实际情况呢?
点击(此处)折叠或打开
- SQL> select count(*) from t where skew=1;
-
- COUNT(*)
- ----------
- 1
点击(此处)折叠或打开
- exec dbms_stats.gather_table_stats(user,'T',method_opt=>'for columns skew size 11');
- SQL> explain plan for select * from t where skew=1;
-
- Explained.
-
- SQL> select * from table(dbms_xplan.display);
-
- PLAN_TABLE_OUTPUT
- --------------------------------------------------------------------------------
- Plan hash value: 1601196873
-
- --------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- --------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 7 | 7 (0)| 00:00:01 |
- |* 1 | TABLE ACCESS FULL| T | 1 | 7 | 7 (0)| 00:00:01 |
- --------------------------------------------------------------------------
-
- Predicate Information (identified by operation id):
- ---------------------------------------------------
-
- PLAN_TABLE_OUTPUT
- --------------------------------------------------------------------------------
-
- 1 - filter("SKEW"=1)
可以看到,在收集直方图后基数的评估为1,和实际情况相符。
只有统计信息准确,数据库才能生成理想的执行计划。
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/29827284/viewspace-2134398/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/29827284/viewspace-2134398/