Hive joins and complex data types

昨日飞升

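Copyright notice: This is an original article by the author, released under the CC 4.0 BY-SA license. Reposts must include a link to the original source and this notice.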

Article link: https://blog.csdn.net/qzlzwhx/article/details/38340703

The following two tables are joined in Hive:

hive> select * from table1;

OK

1 a

2 b

3 c

hive> select * from table2;

OK

1 e

2 f

4 d

Both tables have the same schema:

hive> desc table1;

OK

id int

char string
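
To reproduce the examples that follow, the two tables could be created and populated roughly like this. This is only a sketch: it assumes Hive 0.14 or later (for INSERT ... VALUES), and it wraps the column name char in backticks because CHAR is a type keyword in newer Hive releases and may need quoting there.

```sql
-- Hypothetical setup for the two sample tables (assumes Hive 0.14+).
create table table1 (id int, `char` string);
create table table2 (id int, `char` string);

-- Same rows as shown in the "select *" output above.
insert into table table1 values (1, 'a'), (2, 'b'), (3, 'c');
insert into table table2 values (1, 'e'), (2, 'f'), (4, 'd');
```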

Join behavior can differ between Hive versions. Left outer join:

select t1.char, t2.char from table1 t1 left outer join table2 t2 on (t1.id = t2.id);

Result:

a e

b f

c NULL

Right outer join:

select t1.char, t2.char from table1 t1 right outer join table2 t2 on (t1.id = t2.id);

Result:

a e

b f

NULL d

Full outer join:

select t1.char, t2.char from table1 t1 full outer join table2 t2 on (t1.id = t2.id);

Result:

a e

b f

c NULL

NULL d

Map-side joins in Hive

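When one of the two tables being joined is very small (small enough to fit in memory), a map-side join can be used. The small table is loaded into the memory of every mapper, so the join completes without a reduce phase. A map-side join generally cannot be used for a full outer join, nor for a right outer join in which the right-hand table is the one held in memory, because every row of that table must be preserved. Also note that Hive handles NULL the same way as standard SQL: NULL = NULL evaluates to NULL rather than true, so rows whose join keys are NULL do not match each other under an ordinary equality condition. To have NULL keys match, use the NULL-safe operator <=>.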
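
As a rough sketch, a map-side join on the sample tables might be requested as follows, treating table2 as the small table. The settings and the MAPJOIN hint are standard Hive features, but their defaults vary by version (newer releases may ignore the hint unless hive.ignore.mapjoin.hint is set to false), so the values here are assumptions. The last query illustrates the NULL-safe operator <=>, which, unlike a plain = in the Hive versions I am aware of, treats two NULL keys as equal.

```sql
-- Let Hive convert joins to map joins automatically when one side is small.
set hive.auto.convert.join=true;
-- Size threshold in bytes below which a table counts as "small" (illustrative value).
set hive.mapjoin.smalltable.filesize=25000000;

-- Explicit hint: load t2 into each mapper's memory and stream t1 past it.
select /*+ MAPJOIN(t2) */ t1.`char`, t2.`char`
from table1 t1
join table2 t2 on (t1.id = t2.id);

-- NULL-safe join condition: rows whose id is NULL on both sides also match.
select t1.`char`, t2.`char`
from table1 t1
join table2 t2 on (t1.id <=> t2.id);
```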

Complex data types in Hive

Hive has three commonly used complex types:

struct, array, map

1. struct: a structure whose fields are accessed with ".", for example:

Sample data:

1,liu:30

2,zhang:30

3,li:20

4,wang:80

Create the table:

create table struct_test (id int, info struct<name:string, age:int>)
row format delimited
fields terminated by ','
collection items terminated by ':';

Query:

select info.age from struct_test ;
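
Assuming the sample rows are saved in a local text file (the path below is hypothetical), loading the data and querying individual struct fields might look like this:

```sql
-- Hypothetical path; point it at the file that holds the sample rows.
load data local inpath '/tmp/struct_test.txt' into table struct_test;

-- Each struct field is addressed as column.field and can also be used in filters.
select id, info.name, info.age
from struct_test
where info.age > 25;
```

With the sample data, the filter should keep the rows for liu, zhang and wang.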

2. array: values of the same type stored together in a single column

Sample data:

liu,1:2:3:4

zhang,5:6

li,7:8:9

Create the table:

create table array_test (name string, student_id_array array<int>)
row format delimited
fields terminated by ','
collection items terminated by ':';

Query:

select student_id_array[1] from array_test;

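Note that Hive array indexes start at 0, so student_id_array[1] returns the second element. There is no out-of-bounds error: if no element exists at the given index, the result is NULL.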
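
A few more ways to work with the array column, as a sketch against the array_test table above. Hive's language manual describes array indexes as zero-based; size, array_contains and explode are built-in Hive functions, while the column aliases are just illustrative names.

```sql
select name,
       student_id_array[0]                 as first_id,   -- first element (zero-based index)
       student_id_array[3]                 as fourth_id,  -- NULL when the array has fewer than 4 elements
       size(student_id_array)              as id_count,   -- number of elements
       array_contains(student_id_array, 5) as has_5       -- true if 5 is present
from array_test;

-- Turn each array element into its own row.
select name, student_id
from array_test
lateral view explode(student_id_array) ids as student_id;
```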

3. map: key/value pairs, accessed by key. Sample data:

1 job:80,team:60,person:70

2 job:60,team:80

3 job:90,team:70,person:100

Create the table:

create table map_test (id int, perf map<string, int>)
row format delimited
fields terminated by '\t'
collection items terminated by ','
map keys terminated by ':';

Query:

select perf['person'] from map_test;

Result:

70

NULL

100
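
Beyond looking up a single key, the map column can be inspected with Hive's built-in collection functions. The following is a sketch against the map_test table above; the aliases are illustrative names.

```sql
select id,
       perf['person']   as person_score,  -- NULL when the key is absent
       map_keys(perf)   as perf_keys,     -- array of the map's keys
       map_values(perf) as perf_values,   -- array of the map's values
       size(perf)       as entry_count    -- number of key/value pairs
from map_test;

-- One output row per key/value pair.
select id, perf_key, perf_value
from map_test
lateral view explode(perf) p as perf_key, perf_value;
```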
