in 和exists
in是把外表和内表作hash 连接,而exists 是对外表作loop 循环,每次loop 循环再对内表进行查询。
如果查询的两个表大小相当,那么用in 和exists 差别不大。
如果两个表中一个较小,一个是大表,则子查询表大的用exists,子查询表小的用in:
e.g
新建测试表
create
table
t1
as
select
seq_t1.nextval
id
,
object_name,owner
from
dba_objects;
create
index
idx_t1
on
t1(
id
);
create
table
t2
as
select
seq_t2.nextval
id
,object_name,owner
from
dba_objects;
begin
for
i
in
1
..100
loop
INSERT
INTO
t2
select
seq_t2.nextval
id
,object_name,owner
from
dba_objects;
COMMIT
;
end
loop
;
end
;
create index index_t2 on t2(id);
/*t2表是大表,in时放里面速度较慢,exists 时放里面速度较快*/
select
count
(*)
from
t1
where
id
in
(
select
id
from
t2);--慢
select
count
(*)
from
t1
where
exists
(
select
1
from
t2
where
t1.id= t2.id);--快
not in ,not exists
1.not in 与 not exists 不等价
e.g
新建测试表
create table t3(id number(1));
insert into t3 values(2);
insert into t3 values(3);
insert into t3 values(4);
insert into t3 values(2);
insert into t3 values(null);
commit;
create index idx_t3 on t3(id);
insert into t3 values(2);
insert into t3 values(3);
insert into t3 values(4);
insert into t3 values(2);
insert into t3 values(null);
commit;
create index idx_t3 on t3(id);
然后再次执行两个SQL:
select * from t1 where not exists (select 1 from t3 where t3.id= t1.id);
select
*
from
t1
where
id
not
in
(
select
id
from
t3);
select
*
from
t1
where
id
not
in
(
1
,
2
,
3
,
4
,
null
);
select
*
from
t1
where
id
<>
1
and
id
<>
2
and
id
<>
3
and
id
<>
4
and
id
<>
null
;
因为id <> null是一个永假式,所以最终查出的结果肯定也就是空了。
由此可以得出结论:只要not in的子查询中包含空值,那么最终的结果就为空
2
.not
in
与
not
exists
的性能
update
t3
set
id
=
9
where
id
is
null
;
/*内表大,外表小时*/
select
count
(*)
from
t2
where
id
not
in
(
select
id
from
t3); --快
但是not in 会造成内表的全表扫描,当内表很大时,性能非常低效
select
count
(*)
from
t2
where
not
exists
(
select
1
from
t3
where
t3.id= t2.id); --慢