首先,创建两个表main和sub:
main:
ID | TITLE |
1 | main_1 |
2 | main_2 |
3 | main_3 |
4 | main_4 |
sub:
ID | MAINID | TITLE |
1 | 1 | sub_1 |
2 | 1 | sub_2 |
3 | sub_3 | |
4 | sub_4 |
OK,再来分析。这三个SQL是包含不同功能的:
joins:是用来合并多个表的行。(A join is used to combine rows from multiple tables)。
exist:为条件SQL.它的后面跟一个子查询,如果子查询返回至少一条记录,即为true。(exist condition is considered "to be met" if the subquery returns at least one row)。
in:为条件SQL, column1 in (value1, value2, .... value_n),用于替换 column1=value1 or column1 = value2...。比较常见的是在in后面跟个从其他表中取数的子查询。
虽然三个SQL的功能不同,但常常能达到相同的结果,所以在开发中经常混淆三个的使用。然而,不注意他们的区别的话,可能会遇到麻烦:
--从main中取出有sub孩子(main的id存在于sub的mainid中)的数据
--inner join,需用DISTINCT,由于main可以有多个sub孩子,会返回重复的main表数据
SELECT DISTINCT main.id,main.title
FROM main,sub
WHERE main.id=sub.mainid
--exist,同样的结果
SELECT main.id,main.title
FROM main
WHERE EXIST (SELECT sub.id FROM sub WHERE sub.mainid=main.id)
--in,同样的结果
SELECT main.id,main.title
FROM main
WHERE main.id IN (SELECT sub.mainid FROM sub)
--从main中取出没有sub孩子(main的id不存在于sub的mainid中)的数据
--outter join,正确的结果
SELECT main.id,main.title
FROM main,sub
WHERE main.id=sub.mainid(+) AND sub.id IS NULL
--not exist,正确的结果
SELECT main.id,main.title
FROM main
WHERE NOT EXIST (SELECT sub.id FROM sub WHERE sub.mainid=main.id)
--not in,不是期望的结果。是由于in 值域中有NULL。具体原因仍不太懂……
SELECT main.id,main.title
FROM main
WHERE main.id NOT IN (SELECT sub.mainid FROM sub)
--引用其他人归结的具体原因:
-- IN doesn't get correct results.
-- That's because of how IN treats NULLs and the Three-valued logic
-- NULL is treated as an unknown, so if there's a null in the t2.t1id
-- NOT IN will return either NOT TRUE or NOT UNKNOWN. And neither can be TRUE.
-- when there's a NULL in the t1id column of the t2 table the NOT IN query will always return an empty set.