intersetct使用

使用intersetct
select user_id from aaa_log
where statis_date>=20091001
and statis_date<=20091031
intersect
select user_id from aaa_log
where statis_date>=20091001
and statis_date<=20091031

执行计划:
SELECT STATEMENT, GOAL = ALL_ROWS 147881 352550000 11986700000
PX COORDINATOR
PX SEND QC (RANDOM) SYS :TQ10002
INTERSECTION
SORT UNIQUE 73940 352550000 5993350000
PX RECEIVE 3685 352550000 5993350000
PX SEND HASH SYS :TQ10000 3685 352550000 5993350000
PX BLOCK ITERATOR 3685 352550000 5993350000 7 1 32
TABLE ACCESS FULL DW AAA_LOG 3685 352550000 5993350000 7 6849 7840
SORT UNIQUE 73940 352550000 5993350000
PX RECEIVE 3685 352550000 5993350000
PX SEND HASH SYS :TQ10001 3685 352550000 5993350000
PX BLOCK ITERATOR 3685 352550000 5993350000 12 1 32
TABLE ACCESS FULL DW AAA_LOG 3685 352550000 5993350000 12 6849 7840
使用内连接
select distinct user_id
from aaa_log t1, aa_log t2
where 1=1
and t1.user_id=t2.user_id
and t1.statis_date>=20091001
and t1.statis_date<=20091031
and t2.statis_date>=20090901
and t2.statis_date<=20090930

执行计划:
SELECT STATEMENT, GOAL = ALL_ROWS 600780126 3880 131920
PX COORDINATOR
PX SEND QC (RANDOM) SYS :TQ10001 600780126 3880 131920
HASH UNIQUE 600780126 3880 131920
PX RECEIVE 31480377 38997282262394 1.3259075969214E15
PX SEND HASH SYS :TQ10000 31480377 38997282262394 1.3259075969214E15
PX PARTITION HASH ALL 31480377 38997282262394 1.3259075969214E15 6 1 32
HASH JOIN 31480377 38997282262394 1.3259075969214E15
PX PARTITION RANGE ITERATOR 3685 352550000 5993350000 8 215 245
TABLE ACCESS FULL DW AAA_LOG 3685 352550000 5993350000 8 6849 7840
PX PARTITION RANGE ITERATOR 3672 352550000 5993350000 10 185 214
TABLE ACCESS FULL DW AAA_LOG3672 352550000 5993350000 10 5889 6848
观察这个执行计划发现,使用内连接的消耗远远大于使用intersect。究其原因主要是log表中user_id有重复的,这样导致HASH JOIN的时候有笛卡尔积。
在没有重复记录的时候一般是内连接比intersect快。

使用exits
select distinct user_id
from aaa_log t1
where 1=1
and t1.statis_date>=20091001
and t1.statis_date<=20091031
and exists( select 1 from aaa_log t2
where t2.user_id=t1.user_id
and t2.statis_date>=20090901
and t2.statis_date<=20090930
)
执行计划
SELECT STATEMENT, GOAL = ALL_ROWS 11003 3880 131920
PX COORDINATOR
PX SEND QC (RANDOM) SYS :TQ10001 11003 3880 131920
HASH UNIQUE 11003 3880 131920
PX RECEIVE 8003 352550000 11986700000
PX SEND HASH SYS :TQ10000 8003 352550000 11986700000
PX PARTITION HASH ALL 8003 352550000 11986700000 6 1 32
HASH JOIN SEMI 8003 352550000 11986700000
PX PARTITION RANGE ITERATOR 3691 352550000 5993350000 8 215 245
TABLE ACCESS FULL DW AAA_LOG 3691 352550000 5993350000 8 6849 7840
PX PARTITION RANGE ITERATOR 3672 352550000 5993350000 10 185 214
TABLE ACCESS FULL DW AAA_LOG 3672 352550000 5993350000 10 5889 6848

通过比较返现这种情况下使用exits最快,使用interset次之,但是差距不大,使用内连接特别差,对于海量数据可能几天也算不出来。

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值