redgate给出的提示
https://documentation.red-gate.com/codeanalysis/performance-rules/pe019
In theory, EXISTS is faster because the search stops as soon as the condition is true, whereas IN has to collect all sub-query results before testing the condition.
In practice, the query optimizer treats EXISTS and IN the same way whenever it can.
Stack Overflow上的解释
https://stackoverflow.com/questions/24929/difference-between-exists-and-in-in-sql
I'm assuming you know what they do, and thus are used differently, so I'm going to understand your question as: When would it be a good idea to rewrite the SQL to use IN instead of EXISTS, or vice versa.
Is that a fair assumption?
Edit: The reason I'm asking is that in many cases you can rewrite an SQL based on IN to use an EXISTS instead, and vice versa, and for some database engines, the query optimizer will treat the two differently.
For instance:
SELECT * FROM Customers WHERE EXISTS ( SELECT * FROM Orders WHERE Orders.CustomerID = Customers.ID )
can be rewritten to:
SELECT * FROM Customers WHERE ID IN ( SELECT CustomerID FROM Orders )
or with a join:
SELECT Customers.* FROM Customers INNER JOIN Orders ON Customers.ID = Orders.CustomerID
So my question still stands, is the original poster wondering about what IN and EXISTS does, and thus how to use it, or does he ask wether rewriting an SQL using IN to use EXISTS instead, or vice versa, will be a good idea?
Example
使用In
SELECT * FROM dbo.CMS_Transformation WHERE TransformationClassID IN ( SELECT ClassID FROM dbo.CMS_Class WHERE ClassName LIKE '%lisa%' );
使用Exists
SELECT * FROM dbo.CMS_Transformation WHERE EXISTS ( SELECT * FROM dbo.CMS_Class WHERE ClassName LIKE '%lisa%' AND ClassID = CMS_Transformation.TransformationClassID );
对比
执行结果,因为数据量较少,看起来似乎是一样的
扩展
Not Exists 和Not In是完全不同的东西
https://stackoverflow.com/questions/129077/not-in-clause-and-null-values
To state it simply, why does query A return a result but B doesn't?
A: select 'true' where 3 in (1, 2, 3, null)
B: select 'true' where 3 not in (1, 2, null)
This was on SQL Server 2005. I also found that calling set ansi_nulls off
causes B to return a result.
Query A is the same as:
select 'true' where 3 = 1 or 3 = 2 or 3 = 3 or 3 = null
Since 3 = 3 is true, you get a result.
Query B is the same as:
select 'true' where 3 <> 1 and 3 <> 2 and 3 <> null
When ansi_nulls is on, 3 <> null is UNKNOWN, so the predicate evaluates to UNKNOWN, and you don't get any rows.
When ansi_nulls is off, 3 <> null is true, so the predicate evaluates to true, and you get a row.