先看两个查询。
select count(*) from wxh_tbd where object_id not in (1,2,3);
select count(*) from wxh_tbd where object_id not in (1,2,3);
COUNT(*)
----------
20204
select count(*) from wxh_tbd where object_id not in (1,2,3,null);
----------
20204
select count(*) from wxh_tbd where object_id not in (1,2,3,null);
COUNT(*)
----------
0
很多人会觉得奇怪,怎么多了一个NULL值,第二个查询就没返回了?
第二个查询可以拆分为:
select count(*) from wxh_tbd where object_id!=1 and object_id!=2 and object_id!=3 and object_id!=null;
not in拆分后,是and的关系,只要一个不为true,那么整条表达式都是false,返回为空。
----------
0
很多人会觉得奇怪,怎么多了一个NULL值,第二个查询就没返回了?
第二个查询可以拆分为:
select count(*) from wxh_tbd where object_id!=1 and object_id!=2 and object_id!=3 and object_id!=null;
not in拆分后,是and的关系,只要一个不为true,那么整条表达式都是false,返回为空。
因此对于类似这样的查询
select object_id from wxh_tbd where object_id not in ( select object_id from wxh_tbd1);
只要子查询返回的结果集里有空值,那么整个结果集都是空。因为每个wxh_tbd.OBJECT_ID都是要跟NULL值做比较的,做比较的结果是FALSE.
select object_id from wxh_tbd where object_id not in ( select object_id from wxh_tbd1);
只要子查询返回的结果集里有空值,那么整个结果集都是空。因为每个wxh_tbd.OBJECT_ID都是要跟NULL值做比较的,做比较的结果是FALSE.
无论如何in不存在这样的问题。
select count(*) from wxh_tbd where object_id in (1,2,3,null);
可以拆分成:
select count(*) from wxh_tbd where object_id=1 or object_id=2 or object_id=3 or object_id=null;
跟NOT IN不同,IN拆分后是OR的关系,只要有一个为true,那么表达式就是true,就可以返回结果。
select count(*) from wxh_tbd where object_id in (1,2,3,null);
可以拆分成:
select count(*) from wxh_tbd where object_id=1 or object_id=2 or object_id=3 or object_id=null;
跟NOT IN不同,IN拆分后是OR的关系,只要有一个为true,那么表达式就是true,就可以返回结果。
11G以前not in经常会导致问题。拿上面的查询来说:
select object_id from wxh_tbd where object_id not in ( select object_id from wxh_tbd1);
如果wxh_tbd1.object_id,wxh_tbd.object_id上没有非空约束,那么查询计划会走FILTER,而且往往wxh_tbd1上是全表扫描。走不上NL,HASH.之所以走全表扫描可能基于一个简单的原因,NULL值不包含在索引里。
11G以后呢?
explain plan for
select object_id from wxh_tbd where object_id not in ( select object_id from wxh_tbd1);
Explained.
@display
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 4021671869
---------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 4021671869
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20414 | 199K| 131 (1)| 00:00:02 |
|* 1 | HASH JOIN ANTI NA | | 20414 | 199K| 131 (1)| 00:00:02 |
| 2 | TABLE ACCESS FULL| WXH_TBD | 20446 | 99K| 65 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| WXH_TBD1 | 20411 | 99K| 65 (0)| 00:00:01 |
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20414 | 199K| 131 (1)| 00:00:02 |
|* 1 | HASH JOIN ANTI NA | | 20414 | 199K| 131 (1)| 00:00:02 |
| 2 | TABLE ACCESS FULL| WXH_TBD | 20446 | 99K| 65 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| WXH_TBD1 | 20411 | 99K| 65 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
---------------------------------------------------
1 - access("OBJECT_ID"="OBJECT_ID")
其实11G的这种优化在我看来有点迟了,早就应该可以实现,无非是加一些IF ELSE判断。扫描WXH_TBD表,扔掉为空的记录,BUILD出HASH表。跟WXH_TBD1表做HASH,如果WXH_TBD1表里存在NULL值,立马终止查询,返回结果空。
其实11G的这种优化在我看来有点迟了,早就应该可以实现,无非是加一些IF ELSE判断。扫描WXH_TBD表,扔掉为空的记录,BUILD出HASH表。跟WXH_TBD1表做HASH,如果WXH_TBD1表里存在NULL值,立马终止查询,返回结果空。
11G即使表上都没NOT NULL约束,也可以走jion了,执行计划出现了 NA关键字,目前只发现可以走hash join.
--------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
|* 1 | FILTER | | | | |
| 2 | TABLE ACCESS FULL | WXH_TBD | | | |
|* 3 | TABLE ACCESS FULL | WXH_TBD1 | | | |
--------------------------------------------------------------------
11G以前无论如何不能走JION,只能走FILTER.
再看看NOT EXISTS对于空值的返回顺序:
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
|* 1 | FILTER | | | | |
| 2 | TABLE ACCESS FULL | WXH_TBD | | | |
|* 3 | TABLE ACCESS FULL | WXH_TBD1 | | | |
--------------------------------------------------------------------
11G以前无论如何不能走JION,只能走FILTER.
再看看NOT EXISTS对于空值的返回顺序:
select object_id from wxh_tbd where not exists ( select 1 from wxh_tbd1 where wxh_tbd1.object_id=wxh_tbd.object_id);
如果走hash,WXH_TBD为build table最先返回的是空值。道理其实很简单,HASH 表里空值不能被BUILD进去,可以扫描到空值就直接返回结果。如果wxh_tbd1为build table ,空值是混合在记录里的。
如果走hash,WXH_TBD为build table最先返回的是空值。道理其实很简单,HASH 表里空值不能被BUILD进去,可以扫描到空值就直接返回结果。如果wxh_tbd1为build table ,空值是混合在记录里的。
select object_id from wxh_tbd where not exists ( select /*+ swap_join_inputs(wxh_tbd1) */1 from wxh_tbd1 where wxh_tbd1.object_id=wxh_tbd.object_id);
如果走nest loop ,null值是混合在记录里的。
对于not in,如果在wxh_tbd表里存在空值,wxh_tbd1表里没空值,那么整个查询会返回符合条件的结果集,但是空值不会作为结果返回。
对于not exists,如果在wxh_tbd表里存在空值,那么整个查询会返回符合条件的结果集,而且空值会作为结果返回。
对于not exists,如果在wxh_tbd表里存在空值,那么整个查询会返回符合条件的结果集,而且空值会作为结果返回。
理论上,IN 与EXISTS是完全等价的,但是各个ORACLE版本可能执行计划存在差异,(10G以后应该无任何差异了)。
NOT IN 与NOT EXISTS的差异根据有NULL的不同而不同。如果作为条件判断的字段都存在NOT NULL约束,那么这种差异也就不存在了。
NOT IN 与NOT EXISTS的差异根据有NULL的不同而不同。如果作为条件判断的字段都存在NOT NULL约束,那么这种差异也就不存在了。
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/22034023/viewspace-715280/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/22034023/viewspace-715280/