Hash Join,又称为Hybrid Hash Join,通常是inner表根据关联字段被hash分布到内存中,outer表根据被hash的关联字段与内存中的hash表做匹配。若inner表太大时,可能会采用overflow到磁盘的方式做关联,此时性能也将会受到影响。
Hash Join又具体可以细分为以下几种类型:
- HYBRID_HASH_JOIN
- HYBRID_HASH_SEMI_JOIN
- HYBRID_HASH_ANTI_SEMI_JOIN
- LEFT_HYBRID_HASH_JOIN
HYBRID_HASH_JOIN
HYBRID_HASH_JOIN,两个子表做数据关联。关联时,会为inner表创建一个hash表,outer表根据关联字段去匹配hash表中的数据。当inner表太大时内存无法满足时会采用overflow到磁盘做处理。
典型语句为等值连接和交叉连接。
例如以下等值关联即HYBRID_HASH_JOIN,
>>explain options 'f' select a.* from DMA_ENTTYPE_LIST a inner join DMA_ENTTYPE_LIST b
+>on (a.ENTNAME = b.ENTNAME) where a.entname like 'A%';
LC RC OP OPERATOR OPT DESCRIPTION CARD
---- ---- ---- -------------------- -------- -------------------- ---------
3 . 4 root 5.50E+002
2 1 3 hybrid_hash_join 5.50E+002
. . 2 trafodion_scan DMA_ENTTYPE_LIST 1.00E+002
. . 1 trafodion_scan DMA_ENTTYPE_LIST 1.10E+001
详细的关联执行计划如下,
HYBRID_HASH_JOIN ========================== SEQ_NO 3 CHILDREN 2, 1
REQUESTS_IN .............. 1
ROWS_OUT ............... 551
EST_OPER_COST ............ 0.01
EST_TOTAL_COST ........... 0.01
DESCRIPTION
memory_quota ........... 800 MB
max_card_est ....... 2,500
fragment_id ............ 0
parent_frag ............ (none)
fragment_type .......... master
est_memory_per_instance 260.909 KB
record_length ........ 429
join_type .............. inner
join_method ............ hash
hash_join_predicates ... (TRAFODION.DAAS_GX.DMA_ENTTYPE_LIST.ENTNAME =
TRAFODION.DAAS_GX.DMA_ENTTYPE_LIST.ENTNAME)
如果是交叉连接,join_method将显示为以下,
join_method ............ hash (cross product)
HYBRID_HASH_SEMI_JOIN
HYBRID_HASH_SEMI_JOIN与HYBRID_HASH_JOIN的不同之处在于当外表的一行数据匹配到内表的多行数据时,只返回一行匹配的数据。而HYBRID_HASH_JOIN则会返回所有的匹配行。
典型语句为IN或EXISTS子查询。
例如以下IN子查询语句,
>>explain options 'f'
+>select a.* from DMA_ENTTYPE_LIST a
+>where a.ENTNAME in (select b.ENTNAME from DMA_ENTTYPE_LIST b) and a.entname like 'A%';
LC RC OP OPERATOR OPT DESCRIPTION CARD
---- ---- ---- -------------------- -------- -------------------- ---------
3 . 4 root 1.10E+001
2 1 3 hybrid_hash_semi_joi u 1.10E+001
. . 2 trafodion_scan DMA_ENTTYPE_LIST 1.10E+001
. . 1 trafodion_scan DMA_ENTTYPE_LIST 1.00E+002
详细的关联执行计划如下,
HYBRID_HASH_SEMI_JOIN ===================== SEQ_NO 3 CHILDREN 2, 1
REQUESTS_IN .............. 1
ROWS_OUT ................ 11
EST_OPER_COST ............ 0.01
EST_TOTAL_COST ........... 0.01
DESCRIPTION
memory_quota ........... 100 MB
max_card_est .......... 50
fragment_id ............ 0
parent_frag ............ (none)
fragment_type .......... master
est_memory_per_instance 298.188 KB
record_length ........ 429
join_type .............. inner semi
join_method ............ unique-hash
hash_join_predicates ... (TRAFODION.DAAS_GX.DMA_ENTTYPE_LIST.ENTNAME =
TRAFODION.DAAS_GX.DMA_ENTTYPE_LIST.ENTNAME)
可以看出, join_type为inner semi,join_method为unique-hash。
HYBRID_HASH_ANTI_SEMI_JOIN
HYBRID_HASH_ANTI_SEMI_JOIN,与HYBRID_HASH_SEMI_JOIN的区别是,outer表只返回没有与inner表中匹配到的数据,而HYBRID_HASH_SEMI_JOIN是返回与inner表匹配到的数据。
典型语句为NOT IN或NOT EXISTS子查询。
例如以下,
>>explain options 'f'
+>select a.* from DMA_ENTTYPE_LIST a
+>where a.ENTNAME not in (select b.ENTNAME from DMA_ENTTYPE_LIST b) and a.entname like 'A%';
LC RC OP OPERATOR OPT DESCRIPTION CARD
---- ---- ---- -------------------- -------- -------------------- ---------
3 . 4 root 1.10E+001
2 1 3 hybrid_hash_anti_sem 1.10E+001
. . 2 trafodion_scan DMA_ENTTYPE_LIST 1.10E+001
. . 1 trafodion_scan DMA_ENTTYPE_LIST 1.00E+002
--- SQL operation complete.
详细的关联执行计划如下,
HYBRID_HASH_ANTI_SEMI_JOIN ================ SEQ_NO 3 CHILDREN 2, 1
REQUESTS_IN .............. 1
ROWS_OUT ................ 11
EST_OPER_COST ............ 0.01
EST_TOTAL_COST ........... 0.01
DESCRIPTION
memory_quota ........... 800 MB
max_card_est ....... 5,000
fragment_id ............ 0
parent_frag ............ (none)
fragment_type .......... master
est_memory_per_instance 298.188 KB
record_length ........ 429
join_type .............. inner anti-semi
join_method ............ hash
hash_join_predicates ... (TRAFODION.DAAS_GX.DMA_ENTTYPE_LIST.ENTNAME =
TRAFODION.DAAS_GX.DMA_ENTTYPE_LIST.ENTNAME)
check_inner_null_expr TRAFODION.DAAS_GX.DMA_ENTTYPE_LIST.ENTNAME is null
check_outer_null_expr TRAFODION.DAAS_GX.DMA_ENTTYPE_LIST.ENTNAME is null
LEFT_HYBRID_HASH_JOIN
LEFT_HYBRID_HASH_JOIN,与HYBRID_HASH_JOIN的区别在于无论外表是否匹配到内表都会返回外表的相应数据,因此通常用于LEFT JOIN关联。
例如以下LEFT JOIN语句,
>>explain options 'f'
+>select a.* from DMA_ENTTYPE_LIST a left join DMA_ENTTYPE_LIST b on (a.ENTNAME = b.ENTNAME) where a.entname like 'A%';
LC RC OP OPERATOR OPT DESCRIPTION CARD
---- ---- ---- -------------------- -------- -------------------- ---------
3 . 4 root 5.50E+002
2 1 3 left_hybrid_hash_joi 5.50E+002
. . 2 trafodion_scan DMA_ENTTYPE_LIST 1.10E+001
. . 1 trafodion_scan DMA_ENTTYPE_LIST 1.00E+002
--- SQL operation complete.
详细执行计划如下,
LEFT_HYBRID_HASH_JOIN ===================== SEQ_NO 3 CHILDREN 2, 1
REQUESTS_IN .............. 1
ROWS_OUT ............... 550
EST_OPER_COST ............ 0.01
EST_TOTAL_COST ........... 0.01
DESCRIPTION
memory_quota ........... 800 MB
max_card_est ....... 2,500
fragment_id ............ 0
parent_frag ............ (none)
fragment_type .......... master
est_memory_per_instance 298.188 KB
record_length ........ 429
join_type .............. left
join_method ............ hash
hash_join_predicates ... (TRAFODION.DAAS_GX.DMA_ENTTYPE_LIST.ENTNAME =
TRAFODION.DAAS_GX.DMA_ENTTYPE_LIST.ENTNAME)
以上join_type显示为left。
到此,我们简单的描述了Hash Join的几种常见类型,以及不同类型的执行计划细节。