Understanding the STREAMTABLE hint in Hive

Today, while reading someone else's blog, I came across STREAMTABLE. The author described it like this:

Put the large table on the right side of the JOIN; in that case you need to specify /*+ STREAMTABLE(..) */:

 
 
    hive> SELECT /*+ STREAMTABLE(b) */ a.val, b.val, c.val FROM a JOIN b
        > ON (a.key = b.key1) JOIN c ON (c.key = b.key1)

I was a bit confused by this, and it only started to make sense after I read another person's explanation:
From my understanding, when the join happens in the map or reduce phase, the values for a given key from all tables except one are buffered in memory, and the remaining table is streamed (if only two tables are joined on the same key, just one table is buffered). The largest table should usually be the one that is streamed; otherwise the larger data set goes into the in-memory buffer and can cause OOM errors.
The STREAMTABLE hint specifies which table is streamed. By default, the rightmost table is streamed and the others are buffered. If you want a table other than the rightmost one to be streamed, use this hint.
If you are joining more tables on different keys, then for every join step just put the larger table on the right-hand side of that join. No STREAMTABLE hint is needed in that case.
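
The buffering-versus-streaming idea described above can be sketched in a few lines of Python. This is only an illustration of the concept, not Hive's actual implementation: the function name `streamed_join` and the sample rows are made up for this example, and the sketch buffers the whole small table at once, whereas the explanation above says Hive buffers only the rows for the key currently being processed.

```python
from collections import defaultdict


def streamed_join(buffered_rows, streamed_rows):
    """Inner-join two (key, value) row sources on key.

    buffered_rows is held entirely in memory, so it should be the small side.
    streamed_rows is consumed one row at a time and is never stored.
    """
    # Buffer the small side: key -> list of values.
    buffer = defaultdict(list)
    for key, val in buffered_rows:
        buffer[key].append(val)

    # Stream the large side and emit a joined row for every match.
    for key, val in streamed_rows:
        for buffered_val in buffer.get(key, []):
            yield key, buffered_val, val


# Hypothetical data: a small table, and a "large" table given as a generator
# so that it is never materialized in memory.
small = [(1, "a1"), (2, "a2")]
large = ((k % 3, f"b{k}") for k in range(6))

result = list(streamed_join(small, large))
for row in result:
    print(row)
```

Swapping the two arguments gives the same joined pairs but forces the large side into memory, which is the situation the hint, or putting the large table last, is meant to avoid.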