基本概念
Predicate: 查询中所有的或部分的where子句
Range: 表列中最大值和最小值的范围
NumRows: 表的行数(DBA_TABLES.NUM_ROWS)
Card: Cardinality 满足一个谓词的行数
Sel: 满足需求的行数/可用的总行数
Orig Card: 在用谓词过滤前的行数
Comp Card: 满足谓词的行数
Join Card: Join操作返回的行数
Join Sel: Join Card/笛卡尔join返回的行数
RS: 执行计划中对表或索引等操作所输出的行数
NDV: 列上Distinct值的个数(dba_tab_columns.num_distinct)
NNulls: 列上NULL值的个数(dba_tab_columns.num_nulls)
A4Nulls: NULLS调整
BucketSize: 一个直方图bucket里的行数
Density: 密度dba_tab_columns.density
1. Equality predicate (“col = val”)
A4Nulls = (Orig_Card - NNulls) / Orig_Card
Sel = (1 / NDV) * A4Nulls
Comp_Card = Orig_Card * Sel
创建测试环境
create table t1 as select * from dba_objects;
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(ownname => 'ZYLONG',
tabname => 'T1',
estimate_percent => 100,
method_opt => 'for all columns size 1',
degree => DBMS_STATS.AUTO_DEGREE,
cascade=>TRUE);
END;
/
查看执行计划 Rows (2)
explain plan for select object_id from t1 where object_name = 'EMP';
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 3617692013
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 60 | 278 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| T1 | 2 | 60 | 278 (1)| 00:00:04 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_NAME"='EMP')
查询统计信息
set feed on
col table_name for a30
col column_name for a30
set line 300
set pages 300
select a.table_name,b.column_name,a.num_rows,b.num_distinct,b.num_nulls,b.density,
b.histogram
from user_tables a,user_tab_columns b
where a.table_name=b.table_name
and a.table_name='T1' and b.coLumn_name=upper('&colname');
Enter value for colname: OBJECT_NAME
TABLE_NAME COLUMN_NAME NUM_ROWS NUM_DISTINCT NUM_NULLS DENSITY HISTOGRAM
-------------- ----------------- ---------- ------------ ---------- ---------- ---------------
T1 OBJECT_NAME 86290 52015 0 .000019225 NONE
选择率和基数的计算过程
A4Nulls = (Orig_Card - NNulls) / Orig_Card
= (NUM_ROWS - NUM_NULLS) / NUM_ROWS
= (86290 - 0) / 86290
= 1
Sel = (1 / NDV) * A4Nulls
= (1 / NUM_DISTINCT) * A4Nulls
= (1 / 52015) * 1
= .000019225
Comp_Card = round((1/52015)*(86290-0)/86290*86290) = 2
2.Conjunctive equality predicates (“colA = val1 AND colB = val2”)
A4Nulls_A = (Orig_Card_A - NNulls_A) / Orig_Card_A
A4Nulls_B = (Orig_Card_B - NNulls_B) / Orig_Card_B
SelA1 = (1 / NDV_A) * A4Nulls_A
SelB2 = (1 / NDV_B) * A4Nulls_B
Sel = SelA1 * SelB2
Comp_Card = Orig_Card * Sel
查看执行计划 Rows (1)
explain plan for select object_id from t1 where object_name = 'EMP' and OWNER='GYJ';
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 3617692013
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 36 | 278 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| T1 | 1 | 36 | 278 (1)| 00:00:04 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_NAME"='EMP' AND "OWNER"='GYJ')
查询统计信息
set feed on
col table_name for a30
col column_name for a30
set line 300
set pages 300
select a.table_name,b.column_name,a.num_rows,b.num_distinct,b.num_nulls,b.density,
b.histogram
from user_tables a,user_tab_columns b
where a.table_name=b.table_name
and a.table_name='T1' and (b.coLumn_name=upper('&colname1') OR b.coLumn_name=upper('&colname2'));
Enter value for colname1: OBJECT_NAME
Enter value for colname2: OWNER
TABLE_NAME COLUMN_NAME NUM_ROWS NUM_DISTINCT NUM_NULLS DENSITY HISTOGRAM
-------------- ----------------- ---------- ------------ ---------- ---------- ---------------
T1 OWNER 86290 24 0 .041666667 NONE
T1 OBJECT_NAME 86290 52015 0 .000019225 NONE
选择率和基数的计算过程
A4Nulls_A = (Orig_Card_A - NNulls_A) / Orig_Card_A
= (OBJECT_NAME.NUM_ROWS - OBJECT_NAME.NUM_NULLS) / OBJECT_NAME.NUM_ROWS
= (86290 - 0) / 86290
= 1
A4Nulls_B = (Orig_Card_B - NNulls_B) / Orig_Card_B
= (OWNER.NUM_ROWS - OWNER.NUM_NULLS) / OWNER.NUM_ROWS
= (86290 - 0) / 86290
= 1
SelA1 = (1 / NDV_A) * A4Nulls_A
= (1 / OBJECT_NAME.NUM_DISTINCT) * A4Nulls_A
= (1 / 52015) * 1
= .000019225
SelB2 = (1 / NDV_B) * A4Nulls_B
= (1 / OWNER.NUM_DISTINCT) * A4Nulls_B
= (1 / 24) * 1
= .041666667
Sel = SelA1 * SelB2
= (1 / 52015) * 1 * (1 / 24) * 1
= 8.0105E-07
Comp_Card = ceil((1 / 52015) * 1 * (1 / 24) * 1 * 86290) = 1
3.Disjunctive equality predicates (“colA = val1 OR colB = val2”)
A4Nulls_A = (Orig_Card_A - NNulls_A) / Orig_Card_A
A4Nulls_B = (Orig_Card_B - NNulls_B) / Orig_Card_B
SelA1 = (1 / NDV_A) * A4Nulls_A
SelB2 = (1 / NDV_B) * A4Nulls_B
Sel = SelA1 + SelB2 - (SelA1 * SelB2)
Comp_Card = Orig_Card * Sel
查看执行计划 Rows (3597)
explain plan for select object_id from t1 where object_name = 'EMP' or OWNER='GYJ';
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
Plan hash value: 3617692013
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3597 | 126K| 278 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| T1 | 3597 | 126K| 278 (1)| 00:00:04 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OWNER"='GYJ' OR "OBJECT_NAME"='EMP')
查询统计信息
set feed on
col table_name for a30
col column_name for a30
set line 300
set pages 300
select a.table_name,b.column_name,a.num_rows,b.num_distinct,b.num_nulls,b.density,
b.histogram
from user_tables a,user_tab_columns b
where a.table_name=b.table_name
and a.table_name='T1' and (b.coLumn_name=upper('&colname1') OR b.coLumn_name=upper('&colname2'));
Enter value for colname1: OBJECT_NAME
Enter value for colname2: OWNER
TABLE_NAME COLUMN_NAME NUM_ROWS NUM_DISTINCT NUM_NULLS DENSITY HISTOGRAM
-------------- ----------------- ---------- ------------ ---------- ---------- ---------------
T1 OWNER 86290 24 0 .041666667 NONE
T1 OBJECT_NAME 86290 52015 0 .000019225 NONE
选择率和基数的计算过程
A4Nulls_A = (Orig_Card_A - NNulls_A) / Orig_Card_A
= (OBJECT_NAME.NUM_ROWS - OBJECT_NAME.NUM_NULLS) / OBJECT_NAME.NUM_ROWS
= (86290 - 0) / 86290
= 1
A4Nulls_B = (Orig_Card_B - NNulls_B) / Orig_Card_B
= (OWNER.NUM_ROWS - OWNER.NUM_NULLS) / OWNER.NUM_ROWS
= (86290 - 0) / 86290
= 1
SelA1 = (1 / NDV_A) * A4Nulls_A
= (1 / OBJECT_NAME.NUM_DISTINCT) * A4Nulls_A
= (1 / 52015) * 1
= .000019225
SelB2 = (1 / NDV_B) * A4Nulls_B
= (1 / OWNER.NUM_DISTINCT) * A4Nulls_B
= (1 / 24) * 1
= .041666667
Sel = SelA1 + SelB2 - (SelA1 * SelB2)
= (1 / 52015) * 1 + (1 / 24) * 1 - ((1 / 52015) * 1 * (1 / 24) * 1)
= .041685091
Comp_Card = round(86290 * .041685091) = 3597
4.Disjunctive equality predicates (“colA = val1 OR colA = val2”)
A4Nulls = (Orig_Card - NNulls) / Orig_Card
Sel = (1 / NDV) * A4Nulls * 2
Comp_Card = Orig_Card * Sel
查看执行计划 Rows (3)
explain plan for select object_id from t1 where object_name = 'EMP' or object_name = 'DEPT';
select * from table(dbms_xplan.display);
SQL>
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 3617692013
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 90 | 278 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| T1 | 3 | 90 | 278 (1)| 00:00:04 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_NAME"='DEPT' OR "OBJECT_NAME"='EMP')
查询统计信息
set feed on
col table_name for a30
col column_name for a30
set line 300
set pages 300
select a.table_name,b.column_name,a.num_rows,b.num_distinct,b.num_nulls,b.density,
b.histogram
from user_tables a,user_tab_columns b
where a.table_name=b.table_name
and a.table_name='T1' and b.coLumn_name=upper('&colname');
Enter value for colname: OBJECT_NAME
TABLE_NAME COLUMN_NAME NUM_ROWS NUM_DISTINCT NUM_NULLS DENSITY HISTOGRAM
------------------------------ ------------------------------ ---------- ------------ ---------- ---------- ---------------
T1 OBJECT_NAME 86290 52015 0 .000019225 NONE
选择率和基数的计算过程
A4Nulls = (Orig_Card - NNulls) / Orig_Card
= (NUM_ROWS - NUM_NULLS) / NUM_ROWS
= (86290 - 0) / 86290
= 1
Sel = (1 / NDV) * A4Nulls * 2
= (1 / NUM_DISTINCT) * A4Nulls * 2
= (1 / 52015) * 1 * 2
= .00003845
Comp_Card = round(86290 * .00003845) = 3
5.IN list (“col IN (val1, val2, … , valN)”)
A4Nulls = (Orig_Card - NNulls) / Orig_Card
Sel = (1 / NDV) * A4Nulls * N
Comp_Card = Orig_Card * Sel
查看执行计划 Rows (7)
explain plan for select object_id from t1 where object_name = 'EMP' or object_name = 'DEPT' or object_name = 'IDX_RANGE' or object_name='NESTLOOP1' ;
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------
Plan hash value: 3617692013
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7 | 210 | 278 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| T1 | 7 | 210 | 278 (1)| 00:00:04 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_NAME"='DEPT' OR "OBJECT_NAME"='EMP' OR
"OBJECT_NAME"='IDX_RANGE' OR "OBJECT_NAME"='NESTLOOP1')
查询统计信息
set feed on
col table_name for a30
col column_name for a30
set line 300
set pages 300
select a.table_name,b.column_name,a.num_rows,b.num_distinct,b.num_nulls,b.density,
b.histogram
from user_tables a,user_tab_columns b
where a.table_name=b.table_name
and a.table_name='T1' and b.coLumn_name=upper('&colname');
Enter value for colname: OBJECT_NAME
TABLE_NAME COLUMN_NAME NUM_ROWS NUM_DISTINCT NUM_NULLS DENSITY HISTOGRAM
------------------------------ ------------------------------ ---------- ------------ ---------- ---------- ---------------
T1 OBJECT_NAME 86290 52015 0 .000019225 NONE
选择率和基数的计算过程
A4Nulls = (Orig_Card - NNulls) / Orig_Card
= (NUM_ROWS - NUM_NULLS) / NUM_ROWS
= (86290 - 0) / 86290
= 1
Sel = (1 / NDV) * A4Nulls * 4
= (1 / NUM_DISTINCT) * A4Nulls * 4
= (1 / 52015) * 1 * 4
= .000076901
Comp_Card = round(86290 * .000076901) = 7
6. Unbounded open ("col > val")
A4Nulls = (Orig_Card - NNulls) / Orig_Card
Sel = ((high_value - val) / (high_value - low_value)) * A4Nulls
Comp_Card = Orig_Card * Sel
查看执行计划 Rows (84319)
explain plan for select object_id from t1 where object_id > '2000';
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 3617692013
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 84319 | 411K| 278 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| T1 | 84319 | 411K| 278 (1)| 00:00:04 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_ID">2000)
查询统计信息
set feed on
col table_name for a30
col column_name for a30
set line 300
set pages 300
select a.table_name,b.column_name,a.num_rows,b.num_distinct,b.num_nulls,b.density,b.low_value,b.high_value,b.histogram
from user_tables a,user_tab_columns b
where a.table_name=b.table_name
and a.table_name='T1' and b.coLumn_name=upper('&colname');
Enter value for colname: OBJECT_ID
TABLE_NAME COLUMN_NAME NUM_ROWS NUM_DISTINCT NUM_NULLS DENSITY LOW_VALUE HIGH_VALUE HISTOGRAM
----------------- ----------------- ---------- ------------ ---------- ---------- -------------------- ------------------- ---------------
T1 OBJECT_ID 86290 86290 0 .000011589 C103 C3094B59 NONE
LOW_VALUE(最小值) 和 HIGH_VALUE(最大值) 值是RAW类型,可以用 utl_raw 包的 cast_to_number 函数进行类型转换。
SQL> select utl_raw.cast_to_number('C103') LOW_VALUE, utl_raw.cast_to_number('C3094B59') HIGH_VALUE from dual;
LOW_VALUE HIGH_VALUE
---------- ----------
2 87488
选择率和基数的计算过程
A4Nulls = (Orig_Card - NNulls) / Orig_Card
= (NUM_ROWS - NUM_NULLS) / NUM_ROWS
= (86290 - 0) / 86290
= 1
Sel = ((high_value - val) / (high_value - low_value)) * A4Nulls
= ((87488 - 2000) / (87488 - 2)) * 1
= .97716206
Comp_Card = round(86290 * .97716206) = 84319
7.Unbounded open (“col < val”)
A4Nulls = (Orig_Card - NNulls) / Orig_Card
Sel = ((val - low_value) / (high_value - low_value)) * A4Nulls
Comp_Card = Orig_Card * Sel
查看执行计划 Rows (1971)
explain plan for select object_id from t1 where object_id < '2000';
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------
Plan hash value: 3617692013
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1971 | 9855 | 278 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| T1 | 1971 | 9855 | 278 (1)| 00:00:04 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_ID"<2000)
查询统计信息
set feed on
col table_name for a30
col column_name for a30
set line 300
set pages 300
select a.table_name,b.column_name,a.num_rows,b.num_distinct,b.num_nulls,b.density,b.low_value,b.high_value,b.histogram
from user_tables a,user_tab_columns b
where a.table_name=b.table_name
and a.table_name='T1' and b.coLumn_name=upper('&colname');
Enter value for colname: OBJECT_ID
TABLE_NAME COLUMN_NAME NUM_ROWS NUM_DISTINCT NUM_NULLS DENSITY LOW_VALUE HIGH_VALUE HISTOGRAM
----------------- ----------------- ---------- ------------ ---------- ---------- -------------------- ------------------- ---------------
T1 OBJECT_ID 86290 86290 0 .000011589 C103 C3094B59 NONE
LOW_VALUE(最小值) 和 HIGH_VALUE(最大值) 值是RAW类型,可以用 utl_raw 包的 cast_to_number 函数进行类型转换。
SQL> select utl_raw.cast_to_number('C103') LOW_VALUE, utl_raw.cast_to_number('C3094B59') HIGH_VALUE from dual;
LOW_VALUE HIGH_VALUE
---------- ----------
2 87488
选择率和基数的计算过程
A4Nulls = (Orig_Card - NNulls) / Orig_Card
= (NUM_ROWS - NUM_NULLS) / NUM_ROWS
= (86290 - 0) / 86290
= 1
Sel = ((val - low_value) / (high_value - low_value)) * A4Nulls
= ((2000 - 2) / (87488 - 2)) * 1
= .02283794
Comp_Card = round(86290 * .02283794) = 1971
8.Unbounded closed (“col ≥ val”)
A4Nulls = (Orig_Card - NNulls) / Orig_Card
Sel = ((high_value - val) / (high_value - low_value) + (1 / NDV)) * A4Nulls
Comp_Card = Orig_Card * Sel
查看执行计划 Rows (84320)
explain plan for select object_id from t1 where object_id >= '2000';
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------
Plan hash value: 3617692013
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 84320 | 411K| 278 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| T1 | 84320 | 411K| 278 (1)| 00:00:04 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_ID">=2000)
查询统计信息
set feed on
col table_name for a30
col column_name for a30
set line 300
set pages 300
select a.table_name,b.column_name,a.num_rows,b.num_distinct,b.num_nulls,b.density,b.low_value,b.high_value,b.histogram
from user_tables a,user_tab_columns b
where a.table_name=b.table_name
and a.table_name='T1' and b.coLumn_name=upper('&colname');
Enter value for colname: OBJECT_ID
TABLE_NAME COLUMN_NAME NUM_ROWS NUM_DISTINCT NUM_NULLS DENSITY LOW_VALUE HIGH_VALUE HISTOGRAM
----------------- ----------------- ---------- ------------ ---------- ---------- -------------------- ------------------- ---------------
T1 OBJECT_ID 86290 86290 0 .000011589 C103 C3094B59 NONE
LOW_VALUE(最小值) 和 HIGH_VALUE(最大值) 值是RAW类型,可以用 utl_raw 包的 cast_to_number 函数进行类型转换。
SQL> select utl_raw.cast_to_number('C103') LOW_VALUE, utl_raw.cast_to_number('C3094B59') HIGH_VALUE from dual;
LOW_VALUE HIGH_VALUE
---------- ----------
2 87488
选择率和基数的计算过程
A4Nulls = (Orig_Card - NNulls) / Orig_Card
= (NUM_ROWS - NUM_NULLS) / NUM_ROWS
= (86290 - 0) / 86290
= 1
Sel = ((high_value - val) / (high_value - low_value) + (1 / NDV)) * A4Nulls
= ((87488 - 2000) / (87488 - 2) + (1 / 86290)) * 1
= .977173649
Comp_Card = round(86290 * .977173649) = 84320
9.Unbounded closed (“col ≤ val”)
A4Nulls = (Orig_Card - NNulls) / Orig_Card
Sel = ((val - low_value) / (high_value - low_value) + (1 / NDV)) * A4Nulls
Comp_Card = Orig_Card * Sel
查看执行计划 Rows (1972)
explain plan for select object_id from t1 where object_id <= '2000';
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------
Plan hash value: 3617692013
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1972 | 9860 | 278 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| T1 | 1972 | 9860 | 278 (1)| 00:00:04 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_ID"<=2000)
查询统计信息
set feed on
col table_name for a30
col column_name for a30
set line 300
set pages 300
select a.table_name,b.column_name,a.num_rows,b.num_distinct,b.num_nulls,b.density,b.low_value,b.high_value,b.histogram
from user_tables a,user_tab_columns b
where a.table_name=b.table_name
and a.table_name='T1' and b.coLumn_name=upper('&colname');
Enter value for colname: OBJECT_ID
TABLE_NAME COLUMN_NAME NUM_ROWS NUM_DISTINCT NUM_NULLS DENSITY LOW_VALUE HIGH_VALUE HISTOGRAM
----------------- ----------------- ---------- ------------ ---------- ---------- -------------------- ------------------- ---------------
T1 OBJECT_ID 86290 86290 0 .000011589 C103 C3094B59 NONE
LOW_VALUE(最小值) 和 HIGH_VALUE(最大值) 值是RAW类型,可以用 utl_raw 包的 cast_to_number 函数进行类型转换。
SQL> select utl_raw.cast_to_number('C103') LOW_VALUE, utl_raw.cast_to_number('C3094B59') HIGH_VALUE from dual;
LOW_VALUE HIGH_VALUE
---------- ----------
2 87488
选择率和基数的计算过程
A4Nulls = (Orig_Card - NNulls) / Orig_Card
= (NUM_ROWS - NUM_NULLS) / NUM_ROWS
= (86290 - 0) / 86290
= 1
Sel = ((val - low_value) / (high_value - low_value) + (1 / NDV)) * A4Nulls
= ((2000 - 2) / (87488 - 2) + (1 / 86290)) * 1
= .022849529
Comp_Card = round(86290 * .022849529) = 1972