三表 join oracle,Oracle 执行计划(3)-两表连接基数

Oracle 执行计划(3)- 两表连接基数 1 公式: 基数 = 连接选择率 * 过滤条件 1 基数 + 过滤条件 2 的基数 连接选择率 =((num_rows( 表 1)-num_nulls( 表 1 连接字段 ))/num_rows( 表 1))* ((num_rows( 表 2)-num_nulls( 表 2 连接字段 ))/num_rows( 表 2))

Oracle 执行计划(3)-两表连接基数

1公式:

基数=连接选择率*过滤条件1基数+过滤条件2的基数

连接选择率=((num_rows(表1)-num_nulls(表1连接字段))/num_rows(表1))*

((num_rows(表2)-num_nulls(表2连接字段))/num_rows(表2)) /

Greater(num_distinct(表1连接字段),num_distinct(表2连接字段))

create table t1 as

select trunc(dbms_random.value(0,25)) filter1,

trunc(dbms_random.value(0,30)) join1,

lpad(rownum,10) v1,

rpad('x',100) padding1

from all_objects

where rownum<=10000;

create table t2 as

select trunc(dbms_random.value(0,50)) filter2,

trunc(dbms_random.value(0,40)) join2,

lpad(rownum,10) v2,

rpad('x',100) padding2

from all_objects

where rownum<=10000;

select t1.v1,t2.v2

from t1,t2

where t1.join1=t2.join2

and t1.filter=1

and t2.filter2=2

已选择2259行。已用时间: 00: 00: 00.03执行计划----------------------------------------------------------Plan hash value: 2959412835---------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |---------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 2000 | 56000 | 76 (3)| 00:00:01 ||* 1 | HASH JOIN | | 2000 | 56000 | 76 (3)| 00:00:01 ||* 2 | TABLE ACCESS FULL| T2 | 200 | 2800 | 38 (3)| 00:00:01 ||* 3 | TABLE ACCESS FULL| T1 | 400 | 5600 | 38 (3)| 00:00:01 |---------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------1 - access("T1"."JOIN1"="T2"."JOIN2")2 - filter("T2"."FILTER2"=2)3 - filter("T1"."FILTER"=1)统计信息----------------------------------------------------------1 recursive calls0 db block gets504 consistent gets0 physical reads0 redo size60032 bytes sent via SQL*Net to client2035 bytes received via SQL*Net from client152 SQL*Net roundtrips to/from client0 sorts (memory)0 sorts (disk)2259 rows processed

select* from user_tab_col_statistics where table_name='T1'TABLE_NAMECOLUMN_NAMENUM_DISTINCTDENSITYNUM_NULLS

T1FILTER250.040

T1JOIN1300.03333333333333330

T1V1100000.00010

T1PADDING110

TABLE_NAMECOLUMN_NAMENUM_DISTINCTDENSITYNUM_NULLSSAMPLE_SIZE

T2FILTER2500.02010000

T2JOIN2400.025010000

T2V2100000.0001010000

T2PADDING211010000

连接选择率=(10000-0)/10000)*(1000-0)/10000)/greater(30,40)=1/40

连接基数=1/40*(400*200)=2000执行计划当中的 T2 ROWS=200,T1.ROWS=400 HASH JOIN.ROWS=2000|* 1 | HASH JOIN | | 2000 | 56000 | 76 (3)| 00:00:01 ||* 2 | TABLE ACCESS FULL| T2 | 200 | 2800 | 38 (3)| 00:00:01 ||* 3 | TABLE ACCESS FULL| T1 | 400 | 5600 | 38 (3)| 00:00:01 |

2 包含空值情况下

Update t1 set join1=null where mod(to_number(v1),20)=0;

Update t2 set join2=null where mod(to_number(v2),30)=0;

SQL> analyze table t2 compute statistics;

SQL> analyze table t1 compute statistics;

select* from user_tab_col_statistics where table_name='T1'TABLE_NAMECOLUMN_NAMENUM_DISTINCTDENSITYNUM_NULLSSAMPLE_SIZE

T1FILTER250.04010000

T1JOIN1300.033333333333333350010000

T1V1100000.0001010000

T1PADDING11010000

TABLE_NAMECOLUMN_NAMENUM_DISTINCTDENSITYNUM_NULLSSAMPLE_SIZE

T2FILTER2500.02010000

T2JOIN2400.02533310000

T2V2100000.0001010000

T2PADDING211010000

套公式 选择率=((10000-500)/10000)*((10000-333)/10000)/greater(30,40)

=9500/10000*9667/10000/40

=0.95*0.9667/40

=0.022959125

基数=200*400*0.022959125 =1836.73

执行计划:

已选择2042行。

已用时间: 00: 00: 00.03

执行计划

----------------------------------------------------------Plan hash value: 2959412835---------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |---------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1837 | 51436 | 76 (3)| 00:00:01 ||* 1 | HASH JOIN | | 1837 | 51436 | 76 (3)| 00:00:01 ||* 2 | TABLE ACCESS FULL| T2 | 200 | 2800 | 38 (3)| 00:00:01 ||* 3 | TABLE ACCESS FULL| T1 | 400 | 5600 | 38 (3)| 00:00:01 |---------------------------------------------------------------------------

3 过滤基数

基数=基本选择率*(num_rows-nulls)

Update t1 set filter=null where mod(to_number(v1),50)=0;

Update t2 set filter2=null where mod(to_number(v2),100)=0;

200 rows updated

100 rows updated

T1.filter cardinatitly=1/25*(10000-200)=392

T2.FILTER2 CARDINATILTY=1/50(10000-100)=198

连接基数=392*198*0.022959125=1781.995

已选择2000行。

已用时间: 00: 00: 00.06

执行计划

----------------------------------------------------------Plan hash value: 2959412835---------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |---------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1782 | 49896 | 76 (3)| 00:00:01 ||* 1 | HASH JOIN | | 1782 | 49896 | 76 (3)| 00:00:01 ||* 2 | TABLE ACCESS FULL| T2 | 198 | 2772 | 38 (3)| 00:00:01 ||* 3 | TABLE ACCESS FULL| T1 | 392 | 5488 | 38 (3)| 00:00:01 |

4 多连接条件

selectt3.v2,t4.v2

from t3,t4

where t3.join1=t4.join2

and t3.join2=t4.join2

连接公式:=(条件1选择率)*(条件2选择率)

不求证了!

5范围连接选择率

1 Where t1.join1

2 Where t2.join1 between t1.join1-1 and t1.join1+1

1选择率=5%固定选择率

2转化成绑定变量格式,固定选择率相乘.5%*5%

6 不等连接选择率

Where t1.join1!=t2.join2

选择率= 1-( t1.join1=t2.join2选择率)

=1-1/40=39/40

7 AND OR 多连接条件

1 where t1.join1=t2.join1 and t1.join2=t2.join2

2 where t1.join1=t2.join1 OR t1.join2=t2.join2

可以参考单表基数的多谓词选择率

1 join1选择率*join2选择率

2 join1选择率+join2选择率- join1选择率*join2选择率

8 三表连接基数选择率

createtable t3 as

selecttrunc(dbms_random.value(0,50)) filter2,

trunc(dbms_random.value(0,30)) join1,

trunc(dbms_random.value(0,50)) join2,

lpad(rownum,10) v2,

rpad('x',100) padding2

from all_objects

where rownum<=10000;

然后重新生存T1和T2表分析后执行语句

selectt1.v1,t2.v2,t3.v2

from t1,t2,t3

where t1.join1=t2.join2

and t2.join2=t3.join1

and t1.filter1=1and t2.filter2=1

1先做T1和T2的选择率和基数

前面已经获得2000

2 T2和T3做连接

套用公式T2和T3

选择率=(10000-0)/10000)*(10000-0)/10000)/greater(40,30)=1/40

基数=1/40*2000*10000=50,0000

注意 2000是第一个连接的基数,10000是T3无过滤条件的基数.

已用时间: 00: 00: 09.42

执行计划

----------------------------------------------------------Plan hash value: 1184213596----------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |----------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 500K| 19M| 123 (9)| 00:00:02 ||* 1 | HASH JOIN | | 500K| 19M| 123 (9)| 00:00:02 ||* 2 | HASH JOIN | | 2000 | 56000 | 76 (3)| 00:00:01 ||* 3 | TABLE ACCESS FULL| T2 | 200 | 2800 | 38 (3)| 00:00:01 ||* 4 | TABLE ACCESS FULL| T1 | 400 | 5600 | 38 (3)| 00:00:01 || 5 | TABLE ACCESS FULL | T3 | 10000 | 117K| 39 (3)| 00:00:01 |

----------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - access("T2"."JOIN2"="T3"."JOIN1")

2 - access("T1"."JOIN1"="T2"."JOIN2")

3 - filter("T2"."FILTER2"=1)

4 - filter("T1"."FILTER1"=1)

9 传递闭包

createtable t4 as

selecttrunc(dbms_random.value(0,50)) filter2,

trunc(dbms_random.value(0,40)) join1,

trunc(dbms_random.value(0,40)) join2,

lpad(rownum,10) v2,

rpad('x',100) padding2

from all_objects

where rownum<=10000;

selectt3.v2,t4.v2

from t3,t4

where t3.join1=t4.join1

and t3.join2=t4.join2

and t3.join1=20;

传递闭包是这么回事因为T3.JOIN1=20并且T3.JOIN1=T4.JOIN1则T4.JOIN1=20;

执行计划

----------------------------------------------------------Plan hash value: 920528290---------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |---------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 52 | 1456 | 78 (3)| 00:00:01 ||* 1 | HASH JOIN | | 52 | 1456 | 78 (3)| 00:00:01 ||* 2 | TABLE ACCESS FULL| T4 | 250 | 3500 | 39 (3)| 00:00:01 ||* 3 | TABLE ACCESS FULL| T3 | 333 | 4662 | 39 (3)| 00:00:01 |---------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------

1 - access("T3"."JOIN1"="T4"."JOIN1" AND "T3"."JOIN2"="T4"."JOIN2")

2 - filter("T4"."JOIN1"=20)

3 - filter("T3"."JOIN1"=20)

实际上结果集行数是:1554 与52基数相差超大

因为 JOIN1选择率*JOIN2选择率=(10000-0)/10000)*(10000-0)/10000)/greater(30,40)

*(10000-0)/10000)*(10000-0)/10000)/greater(50,40)=1/40*1/50=1/2000

并且因为10G多列完备性检查,选择结果集最小选择率相乘 1/40*1/40=1/1600

基数=1/1600*10000/30*10000/40=52

基数=1/40*10000/30*10000/50=1/40*333*200=1665 与结果集相当.因为该版本没有消除掉连接条件.

本文原创发布php中文网,转载请注明出处,感谢您的尊重!

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值