I have a set of consecutive rows I want to get based upon their primary key, which is an auto-incrementing integer. Assuming that there are no holes, is there any performance between between:
SELECT * FROM `theTable` WHERE `id` IN (n, ... nk);
and:
SELECT * FROM `theTable` WHERE `id` BETWEEN n AND nk;
解决方案
BETWEEN should outperform IN in this case (but do measure and check execution plans, too!), especially as n grows and as statistics are still accurate. Let's assume:
m is the size of your table
n is the size of your range
Index can be used (n is tiny compared to m)
In theory, BETWEEN can be implemented with a single "range scan" (Oracle speak) on the primary key index, and then traverse at most n index leaf nodes. The complexity will be O(n + log m)
IN is usually implemented as a series (loop) of n "range scans" on the primary key index. With m being the size of the table, the complexity will always be O(n * log m) ... which is always worse (neglibile for very small tables m or very small ranges n)
Index cannot be used (n is a significant portion of m)
In any case, you'll get a full table scan and evaluate the predicate on each row:
BETWEEN needs to evaluate two predicates: One for the lower and one for the upper bound. The complexity is O(m)
IN needs to evaluate at most n predicates. The complexity is O(m * n) ... which is again always worse, or perhaps O(m) if the database can optimise the IN list to be a hashmap, rather than a list of predicates.