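Sort merge join: the data from row source 1 and row source 2 are each sorted, the two sorted sources are then merged, and the records that satisfy the join condition are placed in the result set. Because sorting needs memory, a sort merge join consumes a fairly large amount of it; if memory (sort_area_size in 8i, the PGA in 9i and later) is insufficient, the temporary tablespace is used, which reduces the efficiency of the sort merge join. The sort merge join is one of the oldest table join methods.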
What is the difference between "Sort Merge" and "Hash" joins? Don't they both do one
FULL scan of each of the joined tables and then join them?
I know Sort Merge is used in the case of the "ALL ROWS" hint and Nested Loops in the case of
the "FIRST ROWS" hint. How about Hash Join? When is it used?
Would really appreciate if you could explain it with a couple of examples.
Thanks in advance.
and we said...
Well, a sort merge of A and B is sort of like this:
read A and sort by join key to temp_a
read B and sort by join key to temp_b
read a record from temp_a
read a record from temp_b
while NOT eof on temp_a and NOT eof on temp_b
    if ( temp_a.key = temp_b.key ) then output joined record, read a record from temp_a and from temp_b
    elsif ( temp_a.key < temp_b.key ) read a record from temp_a
    else read a record from temp_b
(it's more complex than that; the above logic assumes the join key is unique. We really
need to join every match in temp_a to every match in temp_b, but you get the picture)
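To make the duplicate-key case concrete, here is a minimal Python sketch of the same idea. It is only an illustration of the logic, not how the database implements it; the function name sort_merge_join, the tuple layout, and the sample emp/dept rows are all made up for this example.

```python
def sort_merge_join(rows_a, rows_b, key_a, key_b):
    """Join two lists of rows by sorting both on the join key, then merging."""
    temp_a = sorted(rows_a, key=key_a)   # "read A and sort by join key to temp_a"
    temp_b = sorted(rows_b, key=key_b)   # "read B and sort by join key to temp_b"
    out = []
    i = j = 0
    while i < len(temp_a) and j < len(temp_b):
        ka, kb = key_a(temp_a[i]), key_b(temp_b[j])
        if ka < kb:
            i += 1                       # A is behind, advance A
        elif ka > kb:
            j += 1                       # B is behind, advance B
        else:
            # Keys match: find the full run of this key on each side ...
            i_end = i
            while i_end < len(temp_a) and key_a(temp_a[i_end]) == ka:
                i_end += 1
            j_end = j
            while j_end < len(temp_b) and key_b(temp_b[j_end]) == ka:
                j_end += 1
            # ... and join every match in A to every match in B.
            for a in temp_a[i:i_end]:
                for b in temp_b[j:j_end]:
                    out.append((a, b))
            i, j = i_end, j_end
    return out


# Hypothetical sample data: (deptno, name) tuples.
emp = [(10, "KING"), (20, "SMITH"), (10, "CLARK"), (40, "NOBODY")]
dept = [(10, "ACCOUNTING"), (20, "RESEARCH"), (30, "SALES")]

for a, b in sort_merge_join(emp, dept, lambda r: r[0], lambda r: r[0]):
    print(a, b)
```

Both inputs get sorted up front, which is where the memory (or temp space) goes; the merge itself is a single pass over each sorted input.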
The hash join is conceptually like:
create a hash table on the join key of one of A or B (say A), creating temp_a
while NOT eof on B
    read a record from B
    hash the join key and look up into temp_a by that hash key for matching records
    output the matches
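The same steps can be sketched in Python, again purely as an illustration of the concept. The name hash_join and the sample rows are invented for this example, and a Python dict stands in for the hash table.

```python
from collections import defaultdict


def hash_join(rows_a, rows_b, key_a, key_b):
    """Build a hash table on A's join key, then probe it once per row of B."""
    # Build phase: one pass over A, bucketing rows by join key (this is temp_a).
    temp_a = defaultdict(list)
    for a in rows_a:
        temp_a[key_a(a)].append(a)

    # Probe phase: one pass over B, looking up matches by hashed join key.
    out = []
    for b in rows_b:
        for a in temp_a.get(key_b(b), []):   # .get avoids creating empty buckets
            out.append((a, b))
    return out


# Hypothetical sample data: (deptno, name) tuples.
dept = [(10, "ACCOUNTING"), (20, "RESEARCH"), (30, "SALES")]
emp = [(10, "KING"), (20, "SMITH"), (10, "CLARK"), (40, "NOBODY")]

# Build on the smaller input (dept here) and probe with the larger one (emp).
for a, b in hash_join(dept, emp, lambda r: r[0], lambda r: r[0]):
    print(a, b)
```

Neither input is sorted here; only the build side has to be held in the hash table, so it usually pays to build on the smaller input.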
So, a hash join can sometimes be much more efficient (one hash, not two sorts).
In most cases, hash joins are used any time a sort merge might be used. If you don't see
hash joins going on, perhaps you have hash_join_enabled turned off...