Sql三种链接之Hash Join

     1.Hash Join:

     The hash join executes in two phases: build and probe.  During the build phase, it reads all rows from the first input (often called the left or build input), hashes the rows on the equijoin keys, and creates an in-memory hash table.  During the probe phase, it reads all rows from the second input (often called the right or probe input), hashes these rows on the same equijoin keys, and looks or probes for matching rows in the hash table.  Since hash functions can lead to collisions (two different key values that hash to the same value), we typically must check each potential match to ensure that it really joins.

In pseudo-code:

for each row R1 in the build table
    begin
        calculate hash value on R1 join key(s)
        insert R1 into the appropriate hash bucket
    end
for each row R2 in the probe table
    begin
        calculate hash value on R2 join key(s)
        for each row R1 in the corresponding hash bucket
            if R1 joins with R2
                return (R1, R2)
    end

我们执行如下的SQL语句:

ContractedBlock.gif ExpandedBlockStart.gif View Code
 
   
create table T1 (a int , b int , x char ( 200 ))
create table T2 (a int , b int , x char ( 200 ))
create table T3 (a int , b int , x char ( 200 ))


set nocount on
declare @i int
set @i = 0
while @i < 1000
begin
insert T1 values ( @i * 2 , @i * 5 , @i )
set @i = @i + 1
end


declare @i int
set @i = 0
while @i < 10000
begin
insert T2 values ( @i * 3 , @i * 7 , @i )
set @i = @i + 1
end


declare @i int
set @i = 0
while @i < 100000
begin
insert T3 values ( @i * 5 , @i * 11 , @i )
set @i = @i + 1
end

      上面的SQL语句建立了三个表T1,T2,T3.然后对这三个表分别插入1000,10000,100000条数据。上面的哈希算法讲到在执行哈希连接的时候,会使用first input表做为哈希生成表,对连接条件做哈希键,对first input的各个行做哈希,放在内存中。而second input表将逐行根据连接条件对first input在内存中的哈希表进行哈希匹配。接下啦看两条执行语句:

ContractedBlock.gif ExpandedBlockStart.gif View Code
 
   
-- 连接查询1
select *
from T1 join T2 on T1.a = T2.a

SELECT *
FROM T2 JOIN T1 ON T1.a = T2.a

-- 上面的这两条执行语句都是以T1作为first input表,在内存中根据T1.a生成哈希表
--
而T2作为second input表,这时查询优化器会选择数据行较少的数据表作为first input表

       对于多个表的哈希探测还有另外几种情况:

2011050611123194.png

 Left Deep:前面表生成的内存中的hash表是作为后面的哈希build表。

Right Deep:前面表生成的内存中的hash表式作为后面的哈希Probe表。

Bushy:则同时生成两个内存中的哈希探测表,然后再哈希匹配一次。

ContractedBlock.gif ExpandedBlockStart.gif View Code
 
   
-- 连接查询2
select *
from (T1 join T2 on T1.a = T2.a)
join T3 on T1.b = T3.a

| -- Hash Match(Inner Join, HASH:([LearnDemo].[dbo].[T1].[b])=([LearnDemo].[dbo].[T3].[a]), RESIDUAL:([LearnDemo].[dbo].[T1].[b]=[LearnDemo].[dbo].[T3].[a]))
| -- Hash Match(Inner Join, HASH:([LearnDemo].[dbo].[T1].[a])=([LearnDemo].[dbo].[T2].[a]), RESIDUAL:([LearnDemo].[dbo].[T1].[a]=[LearnDemo].[dbo].[T2].[a]))
| | -- Table Scan(OBJECT:([LearnDemo].[dbo].[T1]))
| | -- Table Scan(OBJECT:([LearnDemo].[dbo].[T2]))
| -- Table Scan(OBJECT:([LearnDemo].[dbo].[T3]))
-- 可以看到这个表就是使用Left Deep的查询方式 ,先以T1作为hash build 表
-- T2作为hash probe表,这两个表hash查询后的结果作为hash build表,T3作为
-- hash probe表,最后在查询出结果
ContractedBlock.gif ExpandedBlockStart.gif View Code
 
   
-- 连接查询3
select *
from (T1 join T2 on T1.a = T2.a)
join T3 on T1.b = T3.a
where T1.a < 100

-- |--Hash Match(Inner Join, HASH:([LearnDemo].[dbo].[T2].[a])=([LearnDemo].[dbo].[T1].[a]), RESIDUAL:([LearnDemo].[dbo].[T1].[a]=[LearnDemo].[dbo].[T2].[a]))
-- |--Table Scan(OBJECT:([LearnDemo].[dbo].[T2]), WHERE:([LearnDemo].[dbo].[T2].[a]<(100)))
-- |--Hash Match(Inner Join, HASH:([LearnDemo].[dbo].[T1].[b])=([LearnDemo].[dbo].[T3].[a]), RESIDUAL:([LearnDemo].[dbo].[T1].[b]=[LearnDemo].[dbo].[T3].[a]))
-- |--Table Scan(OBJECT:([LearnDemo].[dbo].[T1]), WHERE:([LearnDemo].[dbo].[T1].[a]<(100)))
-- |--Table Scan(OBJECT:([LearnDemo].[dbo].[T3]))
-- 这个连接查询使用的是Right Deep

   参考文章地址:http://blogs.msdn.com/b/craigfr/archive/2006/08/10/687630.aspx

转载于:https://www.cnblogs.com/wuxiaoqian726/articles/2038678.html

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值