hive多表关联

版权声明:本文为博主原创文章,遵循 CC 4.0 BY-SA 版权协议,转载请附上原文出处链接和本声明。
本文链接:https://blog.csdn.net/kejiaming/article/details/52084898
多表join使用说明


select * from (select userId from table_a where dt=20160731) a join (select userId from table_b where dt=20160731) b  on a.userId=b.userId join (select userId from table_c where dt=20160731) c on a.userId=c.userId
等价于
select * from (select userId from table_a where dt=20160731) a join (select userId from table_b where dt=20160731) b  on a.userId=b.userId join (select userId from table_c where dt=20160731) c on b.userId=c.userId
等价于
select * from (select userId from table_a where dt=20160731) a join (select userId from table_b where dt=20160731) b  on a.userId=b.userId join (select userId from table_c where dt=20160731) c on a.userId=c.userId and b.userId=c.userId




总结:
可以把a与b表join关联看成一个新的表table_j,table_j表有两列a.userId,b.userId与c表进行关联






如果是left outer join 效果相同,只不过是将对最后一个on链接条件来说,是与table_j的第一列相连还是与第二列相连而已


等价于table_j的第二列userId(a.userId)与c进行关联,不去管b表是否有能关联上

select * from 
(select userId from table_a where dt=20160731) a 
left outer join (select userId from table_b where dt=20160731) b  on a.userId=b.userId 
left outer join (select userId from table_c where dt=20160731) c on a.userId=c.userId




等价于table_j的第二列userId(b.userId)与c进行关联,不去管a表是否有能关联上
select * from 
(select userId from table_a where dt=20160731) a 
left outer join (select userId from table_b where dt=20160731) b  on a.userId=b.userId 
left outer join (select userId from table_c where dt=20160731) c on b.userId=c.userId




等价于table_j的第一列与第二列userId(b.userId)与c进行关联,要求同时能关联上
select * from 
(select userId from table_a where dt=20160731) a 
left outer join (select userId from table_b where dt=20160731) b  on a.userId=b.userId  
left outer join (select userId from table_c where dt=20160731) c on a.userId=c.userId and b.userId=c.userId

展开阅读全文

没有更多推荐了,返回首页