在查询中使用区间谓词 选择率和基数card计算
先开始,>=,<= 这几种无边界情况(所谓无边界 就是一侧没有边界 )
SQL> create table t5 (a int);
表已创建。
SQL> declare
2 begin
3 for i in 1..10000 loop
4 insert into t5 values(i);
5 end loop;
6 commit;
7 end;
8 /
PL/SQL 过程已成功完成。
SQL> execute dbms_stats.gather_table_stats('SYS','T5');
PL/SQL 过程已成功完成。
SQL> select table_name,density,num_nulls,num_distinct from user_tab_col_statisti
cs where table_name='T5';
TABLE_NAME DENSITY NUM_NULLS NUM_DISTINCT
------------------------------ ---------- ---------- ------------
T5 .0001 0 10000
SQL> select num_rows from dba_tables where table_name='T5';
NUM_ROWS
----------
10000
SQL> set autotrace traceonly explain
SQL> alter session set events '10053 trace name context forever';
会话已更改。
SQL> select count(*) from t5 where a>2;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 9999 | 29997 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A">2)
SQL> select count(*) from t5 where a>=2;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 10000 | 30000 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A">=2)
SQL> select count(*) from t5 where a<2;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 1 | 3 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"<2)
SQL> select count(*) from t5 where a<=2;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 2 | 6 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"<=2)
>=,<=是闭区间, >
分析a>2 是一个无边 开区间(所谓无边界 就是一侧没有边界 ,a>10 and a<100就是一个有边界的 两边开区间)
a>2选择率=(high_value – limit) / (high_value – low_value)
SQL> select (10000-2)/(10000-1) from dual;
(10000-2)/(10000-1)
-------------------
.99989999
card=num_rows*选择率
SQL> select 10000*.99989999 from dual;
10000*.99989999
---------------
9998.9999~~~~~~~~~~~~~~~可以看到10gr2是向上舍入(ceil)
trace
SINGLE TABLE ACCESS PATH
Column (#1): A(NUMBER)
AvgLen: 4.00 NDV: 10000 Nulls: 0 Density: 1.0000e-004 Min: 1 Max: 10000
Table: T5 Alias: T5
Card: Original: 10000 Rounded: 9999 Computed: 9999.00 Non Adjusted: 9999.00
Access Path: TableScan
Cost: 6.14 Resp: 6.14 Degree: 0
Cost_io: 6.00 Cost_cpu: 2142429
Resp_io: 6.00 Resp_cpu: 2142429
Best:: AccessPath: TableScan
Cost: 6.14 Degree: 1 Resp: 6.14 Card: 9999.00 Bytes: 0
分析a>= 是一个无边界,闭区间 ,闭区间的特点就是必须包含闭合值的行(2)
选择率=(high_value – limit) / (high_value – low_value) + 1/num_distinct
SQL> select (10000-2)/(10000-1)+1/10000 from dual;
(10000-2)/(10000-1)+1/10000
---------------------------
.99999999
card=10000*.99999999=10000
SQL> select ceil(10000*.99999999) from dual;
CEIL(10000*.99999999)
---------------------
10000
trace
SINGLE TABLE ACCESS PATH
Column (#1): A(NUMBER)
AvgLen: 4.00 NDV: 10000 Nulls: 0 Density: 1.0000e-004 Min: 1 Max: 10000
Table: T5 Alias: T5
Card: Original: 10000 Rounded: 10000 Computed: 10000.00 Non Adjusted: 10000.00
Access Path: TableScan
Cost: 6.14 Resp: 6.14 Degree: 0
Cost_io: 6.00 Cost_cpu: 2142429
Resp_io: 6.00 Resp_cpu: 2142429
Best:: AccessPath: TableScan
Cost: 6.14 Degree: 1 Resp: 6.14 Card: 10000.00 Bytes: 0
分析a<2无边界 开区间
选择率=(limit – low_value) / (high_value – low_value)**********
card=num_rows*选择率
SQL> select (2-1)/(10000-1) from dual
2 ;
(2-1)/(10000-1)
---------------
.00010001
SQL> select 10000*.00010001 from dual;
10000*.00010001
---------------
1.0001
trace
SINGLE TABLE ACCESS PATH
Column (#1): A(NUMBER)
AvgLen: 4.00 NDV: 10000 Nulls: 0 Density: 1.0000e-004 Min: 1 Max: 10000
Table: T5 Alias: T5
Card: Original: 10000 Rounded: 1 Computed: 1.00 Non Adjusted: 1.00
Access Path: TableScan
Cost: 6.14 Resp: 6.14 Degree: 0
Cost_io: 6.00 Cost_cpu: 2142429
Resp_io: 6.00 Resp_cpu: 2142429
Best:: AccessPath: TableScan
Cost: 6.14 Degree: 1 Resp: 6.14 Card: 1.00 Bytes: 0
分析a<=2 无边界,闭区间 需要将闭合值的行加进来
选择率=(limit – low_value) / (high_value – low_value)+1/num_distinct
card=num_rows*选择率
SQL> select (2-1)/(10000-1)+1/10000 from dual
2 ;
(2-1)/(10000-1)+1/10000
-----------------------
.00020001
SQL> select 10000*.00020001 from dual
2 ;
10000*.00020001
---------------
2.0001
trace
SINGLE TABLE ACCESS PATH
Column (#1): A(NUMBER)
AvgLen: 4.00 NDV: 10000 Nulls: 0 Density: 1.0000e-004 Min: 1 Max: 10000
Table: T5 Alias: T5
Card: Original: 10000 Rounded: 2 Computed: 2.00 Non Adjusted: 2.00
Access Path: TableScan
Cost: 6.14 Resp: 6.14 Degree: 0
Cost_io: 6.00 Cost_cpu: 2142429
Resp_io: 6.00 Resp_cpu: 2142429
Best:: AccessPath: TableScan
Cost: 6.14 Degree: 1 Resp: 6.14 Card: 2.00 Bytes: 0
*******************************
接下来看between and, a>x and a=x and a=x and a<=y ,a>x and a<=y 这几种类型 有边界情况
SQL> set autotrace traceonly explain
SQL> alter session set events '10053 trace name context forever';
会话已更改。
SQL> select count(*) from t5 where a between 2 and 10;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 10 | 30 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"<=10 AND "A">=2)
SQL> select count(*) from t5 where a>2 and a<10;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 8 | 24 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"<10 AND "A">2)
SQL> select count(*) from t5 where a>=2 and a<10;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 9 | 27 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"<10 AND "A">=2)
SQL> select count(*) from t5 where a>2 and a<=10;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 9 | 27 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"<=10 AND "A">2)
SQL> select count(*) from t5 where a>=2 and a<=10;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 10 | 30 | 6 (0)| 00:00:01 |~~~~~~~~~看到与 between and card一样
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"<=10 AND "A">=2)
分析 a between 2 and 10 就等于>= and <=
选择率=(hight_limit-low_limit)/(hight_value-low_value) +1/num_distinct+1/num_distinct(因为是2个闭区间 所以2个闭合值行都要加进来)
card=num_rows*选择率
SQL> select (10-2)/(10000-1)+1/10000+1/10000 from dual;
(10-2)/(10000-1)+1/10000+1/10000
--------------------------------
.00100008
SQL> select 10000*.00100008 from dual;
10000*.00100008
---------------
10.0008
between and 等同于 〉=and<= 所以 a>=2 and a<=10 card也等于10 计算方法也是一样的
trace
SINGLE TABLE ACCESS PATH
Column (#1): A(NUMBER)
AvgLen: 4.00 NDV: 10000 Nulls: 0 Density: 1.0000e-004 Min: 1 Max: 10000
Table: T5 Alias: T5
Card: Original: 10000 Rounded: 10 Computed: 10.00 Non Adjusted: 10.00
Access Path: TableScan
Cost: 6.14 Resp: 6.14 Degree: 0
Cost_io: 6.00 Cost_cpu: 2142929
Resp_io: 6.00 Resp_cpu: 2142929
Best:: AccessPath: TableScan
Cost: 6.14 Degree: 1 Resp: 6.14 Card: 10.00 Bytes: 0
分析a>2 and a<10 有边界,两边都开区间
选择率=(hight_limit-low_limit)/(hight_value-low_value)
card=num_rows*选择率
SQL> select (10-2)/(10000-1) from dual;
(10-2)/(10000-1)
----------------
.00080008
SQL> select 10000*((10-2)/(10000-1)) from dual;
10000*((10-2)/(10000-1))
------------------------
8.00080008
trace
SINGLE TABLE ACCESS PATH
Column (#1): A(NUMBER)
AvgLen: 4.00 NDV: 10000 Nulls: 0 Density: 1.0000e-004 Min: 1 Max: 10000
Table: T5 Alias: T5
Card: Original: 10000 Rounded: 8 Computed: 8.00 Non Adjusted: 8.00
Access Path: TableScan
Cost: 6.14 Resp: 6.14 Degree: 0
Cost_io: 6.00 Cost_cpu: 2142879
Resp_io: 6.00 Resp_cpu: 2142879
Best:: AccessPath: TableScan
Cost: 6.14 Degree: 1 Resp: 6.14 Card: 8.00 Bytes: 0
分析 a>=2 and a<10 ,a>2 and a<=10 这两个都是 有边界,半开区间(有一闭合的,所以要加上闭合值的行(闭合值分别是 2和10))
所以他们公式相同~ card相同
选择率=(hight_limit-low_limit)/(hight_value-low_value)+1/num_distinct
card=num_rows*选择率
SQL> select (10-2)/(10000-1)+1/10000 from dual;
(10-2)/(10000-1)+1/10000
------------------------
.00090008
SQL> select 10000*((10-2)/(10000-1)+1/10000) from dual;
10000*((10-2)/(10000-1)+1/10000)
--------------------------------
9.00080008
SINGLE TABLE ACCESS PATH
Column (#1): A(NUMBER)
AvgLen: 4.00 NDV: 10000 Nulls: 0 Density: 1.0000e-004 Min: 1 Max: 10000
Table: T5 Alias: T5
Card: Original: 10000 Rounded: 9 Computed: 9.00 Non Adjusted: 9.00
Access Path: TableScan
Cost: 6.14 Resp: 6.14 Degree: 0
Cost_io: 6.00 Cost_cpu: 2142929
Resp_io: 6.00 Resp_cpu: 2142929
Best:: AccessPath: TableScan
Cost: 6.14 Degree: 1 Resp: 6.14 Card: 9.00 Bytes: 0
针对bind 变量情况
SQL> variable a number
SQL> variable b number
SQL> set autotrace traceonly explain
SQL> alter session set events '10053 trace name context forever';
会话已更改。
SQL> select count(*) from t5 where a>:a;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 500 | 1500 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A">TO_NUMBER(:A))
SQL> select count(*) from t5 where a>=:a;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 500 | 1500 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A">=TO_NUMBER(:A))
SQL> select count(*) from t5 where a<:a>
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 500 | 1500 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"
SQL> select count(*) from t5 where a<=:a;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 500 | 1500 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"<=TO_NUMBER(:A))
SQL> select count(*) from t5 where a between :a and :b;
执行计划
----------------------------------------------------------
Plan hash value: 4287340780
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | FILTER | | | | | |
|* 3 | TABLE ACCESS FULL| T5 | 25 | 75 | 6 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_NUMBER(:A)<=TO_NUMBER(:B))
3 - filter("A">=TO_NUMBER(:A) AND "A"<=TO_NUMBER(:B))
SQL> select count(*) from t5 where a>:a and a<:b>
执行计划
----------------------------------------------------------
Plan hash value: 4287340780
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | FILTER | | | | | |
|* 3 | TABLE ACCESS FULL| T5 | 25 | 75 | 6 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_NUMBER(:A) 3 - filter("A">TO_NUMBER(:A) AND "A"
SQL> select count(*) from t5 where a>=:a and a<=:b;
执行计划
----------------------------------------------------------
Plan hash value: 4287340780
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | FILTER | | | | | |
|* 3 | TABLE ACCESS FULL| T5 | 25 | 75 | 6 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_NUMBER(:A)<=TO_NUMBER(:B))
3 - filter("A">=TO_NUMBER(:A) AND "A"<=TO_NUMBER(:B))
SQL> select count(*) from t5 where a>:a and a<=:b;
执行计划
----------------------------------------------------------
Plan hash value: 4287340780
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | FILTER | | | | | |
|* 3 | TABLE ACCESS FULL| T5 | 25 | 75 | 6 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_NUMBER(:A) 3 - filter("A">TO_NUMBER(:A) AND "A"<=TO_NUMBER(:B))
SQL> select count(*) from t5 where a>=:a and a<:b>
执行计划
----------------------------------------------------------
Plan hash value: 4287340780
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | FILTER | | | | | |
|* 3 | TABLE ACCESS FULL| T5 | 25 | 75 | 6 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_NUMBER(:A) 3 - filter("A">=TO_NUMBER(:A) AND "A"
分析以上可以看出 规则很简单
只是按 是否具有边界计算的
无边界(,>=,<=) 选择率为5%
card=num_rows*选择率
SQL> select 10000*.05 from dual
2 ;
10000*.05
----------
500
有边界 (between and,> and = and<= ,> and<= ,>= and选择率=5%*5%
card=num_rows*选择率
SQL> select 10000*.0025 from dual
2 ;
10000*.0025
-----------
25
其它情况
首先是超过边界情况
SQL> set autotrace traceonly explain
SQL> alter session set events '10053 trace name context forever';
会话已更改。
SQL> select count(*) from t5 where a>10001;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 1 | 3 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A">10001)
分析超过边界的时候oracle 按边 字面值来计算
选择率=1/num_distinct
card=10000*1/10000=1
SQL> select count(*) from t5 where a>2 and a<=10002;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 9999 | 29997 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A">2 AND "A"<=10002)
这个oracle检测出 半部分超出边界
选择率=(hight_limit-low_limit)/(hight_value-low_value)+1/num_distinct oracle并没有用这个公式
SQL> select (10002-2)/(10000-1)+1/10000 from dual;
(10002-2)/(10000-1)+1/10000
---------------------------
1.00020001~~~~~~~~~~~~~~~~~~~~~~没这样计算 返回的card当然也不对(10000*1.00020001)
而是只用了左边 a>2的来计算
(high_value – limit) / (high_value – low_value)
SQL> select 10000*((10000-2)/(10000-1)) from dual;
10000*((10000-2)/(10000-1))
---------------------------
9998.9999 CARD*选择率
SQL> select count(*) from t5 where a>=2 and a<=10002;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 10000 | 30000 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A">=2 AND "A"<=10002)
这个一样 只用了左边 右边超过边界没算
a>=2
(high_value – limit) / (high_value – low_value)+1/num_distinct
SQL> select 10000*((10000-2)/(10000-1)+1/10000) from dual;
10000*((10000-2)/(10000-1)+1/10000)
-----------------------------------
9999.9999~~~~~~~~~~~~~~~~~card
SQL> select count(*) from t5 where a>=-3 and a<=10;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 10 | 30 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"<=10 AND "A">=(-3))
oracle 发现左边 边界太小
只计算了 a<=10
选择率=(limit – low_value) / (high_value – low_value)+1/num_distinct
SQL> select 10000*((10-1)/(10000-1)+1/10000) from dual;
10000*((10-1)/(10000-1)+1/10000)
--------------------------------
10.0009001
SQL> select count(*) from t5 where a>=-3 and a<10;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 9 | 27 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A"<10 AND "A">=(-3))
同上边一样 发现边界不对,只用了
a<10
选择率=(limit – low_value) / (high_value – low_value)
SQL> select 10000*(10-1)/(10000-1) from dual;
10000*(10-1)/(10000-1)
----------------------
9.00090009
SQL> select count(*) from t5 where a>10001 and a<=10002;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 1 | 3 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("A">10001 AND "A"<=10002)
当超出了 边界 2边 那么 card=num_rows*1/num_distinct=1
另一种情况 or
SQL> select count(*) from t5 where a>3 or a<10 ;
执行计划
----------------------------------------------------------
Plan hash value: 1231860717
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | TABLE ACCESS FULL| T5 | 9998 | 29994 | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
or的 选择率公式
a>3+a<10-(a>3*a<10)
a>3选择率=(high_value – limit) / (high_value – low_value)=.99979998
a<10选择率=(limit – low_value) / (high_value – low_value)=.00020002
a>3*a<10=.00019998
SQL> select (10000-3)/(10000-1) from dual;
(10000-3)/(10000-1)
-------------------
.99979998
SQL> select (3-1)/(10000-1) from dual;
(3-1)/(10000-1)
---------------
.00020002
SQL> select .99979998*.00020002 from dual;
.99979998*.00020002
-------------------
.00019998
SQL> select .99979998+.00020002-.00019998 from dual;
.99979998+.00020002-.00019998
-----------------------------
.99980002
SQL> select 10000*.99980002 from dual;
10000*.99980002
---------------
9998.0002~~~~~~~~~~~~~~~~~~~card
3个or 是这样的
sel(A) + sel(B) + sel(C) – Sel(A)sel(B) – Sel(B)sel(C) – sel(C)sel(A) + Sel(A)Sel(B)Sel(C)
如果要都是 bind
5% (column < :b1) +5% (column > :b2) -0.25% (column < :b1 and column > :b2) =9.75%
混合类~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~简单看下下面类型
SQL> variable a number
SQL> set autotrace traceonly explain
SQL> select count(*) from t5 where a>2 and a<:a>
执行计划
----------------------------------------------------------
Plan hash value: 4287340780
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | FILTER | | | | | |
|* 3 | TABLE ACCESS FULL| T5 | 500 | 1500 | 6 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(2 3 - filter("A"2)
SQL> select count(*) from t5 where a>100 and a<:a>
执行计划
----------------------------------------------------------
Plan hash value: 4287340780
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | FILTER | | | | | |
|* 3 | TABLE ACCESS FULL| T5 | 495 | 1485 | 6 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(100 3 - filter("A"100)
SQL> select count(*) from t5 where a>:a and a<100;
执行计划
----------------------------------------------------------
Plan hash value: 4287340780
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | FILTER | | | | | |
|* 3 | TABLE ACCESS FULL| T5 | 5 | 15 | 6 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_NUMBER(:A)<100)
3 - filter("A"<100 AND "A">TO_NUMBER(:A))
SQL> select count(*) from t5 where a>:a and a<=100;
执行计划
----------------------------------------------------------
Plan hash value: 4287340780
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | FILTER | | | | | |
|* 3 | TABLE ACCESS FULL| T5 | 5 | 15 | 6 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_NUMBER(:A)<100)
3 - filter("A"<=100 AND "A">TO_NUMBER(:A))
上面的 计算采用的是 双谓词计算
选择率=谓词1 选择率+谓词2选择率
例
a>2 and a<:a>A>2无边界开区间 选择率 =(high_value – limit) / (high_value – low_value)=.99989999
SQL> select (10000-2)/(10000-1) from dual;
(10000-2)/(10000-1)
-------------------
.99989999
a<:a>所以a>2 and a<:a>SQL> select 0.05*.99989999 from dual;
0.05*.99989999
--------------
.049995
CARD=NUM_ROWS*选择率
SQL> select 10000*.049995 from dual;
10000*.049995
-------------
499.95
a>:a and a<=100
A<=100是一个无边界 闭区间 选择率=(limit-low_value) / (high_value – low_value)+1/num_distinct=.01000099
SQL> select (100-1)/(10000-1)+1/10000 from dual;
(100-1)/(10000-1)+1/10000
-------------------------
.01000099
a>:a选择率为5%
a>:a and a<=100 选择率=.00050005
SQL> select 0.05*((100-1)/(10000-1)+1/10000) from dual;
0.05*((100-1)/(10000-1)+1/10000)
--------------------------------
.00050005
card=下面值
SQL> select 10000*(0.05*((100-1)/(10000-1)+1/10000)) from dual;
10000*(0.05*((100-1)/(10000-1)+1/10000))
----------------------------------------
5.00049505
SQL>
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/12020513/viewspace-614518/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/12020513/viewspace-614518/