hive数据类型

hive数据类型分为基本数据类型和复杂数据类型(参考官网:https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types)

基本数据类型:常用的
int,bigint,string,float,double,boolean,date,timestamp
不做过多解释了
复杂数据类型:
1.array<dataTyle>
eg:
[hadoop@hadoop001 rzData]$ cat hive_array.txt 
zhangsan        beijing,shanghai,tianjin,hangzhou
lisi    changchu,chengdu,wuhan,beijing
hive (hwzhdb)> create table arrayType(
             > name string,
             > locations array<string>
             > )
             > row format delimited fields terminated by '\t'
             > collection items terminated by ','
             > stored as textfile;
hive (hwzhdb)> select * from arrayType;
OK
arraytype.name  arraytype.locations
zhangsan        ["beijing","shanghai","tianjin","hangzhou"]
lisi    ["changchu","chengdu","wuhan","beijing"]
##如果想取array中的数据可以通过数组名[下标]来取
hive (hwzhdb)> select name,locations[0],locations[1],locations[2] from arrayType;
OK
name    _c1     _c2     _c3
zhangsan        beijing shanghai        tianjin
lisi    changchu        chengdu wuhan
Time taken: 0.192 seconds, Fetched: 2 row(s)
##如果想看数组中是否包含某值,可以使用array_contains函数
hive (hwzhdb)> select array_contains(locations,"tianjin") from arrayType;
OK
_c0
true
false
或者
hive (hwzhdb)> select * from arrayType where array_contains(locations,'tianjin');
OK
arraytype.name  arraytype.locations
zhangsan        ["beijing","shanghai","tianjin","hangzhou"]
 第一条记录的数组中包含'tianjin',而第二条记录中不包含
 ##查看数组大小,使用size() 
 hive (hwzhdb)> select name,size(locations) from arraytype;
OK
name    _c1
zhangsan        4
lisi    4
Time taken: 0.062 seconds, Fetched: 2 row(s)


2.map<keyType,valueTyle>
eg:
[hadoop@hadoop001 rzData]$ cat hive_map.txt 
1,zhangsan,father:xiaoming#mother:xiaohuang#brother:xiaoxu,28
2,lisi,father:mayun#mother:huangyi#brother:guanyu,22
3,wangwu,father:wangjianlin#mother:ruhua#sister:jingtian,29
4,mayun,father:mayongzhen#mother:angelababy,26
创建表
hive (hwzhdb)> create table mapType(
             > id int,
             > name string,
             > members map<string,string>,
             > age int
             > )
             > row format delimited fields terminated by ','
             > collection items terminated by '#'
             > map keys terminated by ':';
OK
Time taken: 0.116 seconds
hive (hwzhdb)> load data local inpath '/home/hadoop/data/rzData/hive_map.txt' into table mapType;
Loading data to table hwzhdb.maptype
Table hwzhdb.maptype stats: [numFiles=1, totalSize=222]
OK
Time taken: 0.379 seconds
hive (hwzhdb)> select * from mapType;
OK
maptype.id      maptype.name    maptype.members maptype.age
1       zhangsan        {"father":"xiaoming","mother":"xiaohuang","brother":"xiaoxu"}   28
2       lisi    {"father":"mayun","mother":"huangyi","brother":"guanyu"}        22
3       wangwu  {"father":"wangjianlin","mother":"ruhua","sister":"jingtian"}   29
4       mayun   {"father":"mayongzhen","mother":"angelababy"}   26
Time taken: 0.075 seconds, Fetched: 4 row(s)
##如果想取map中的数据要通过map名['key']来取
hive (hwzhdb)> select id,name,age,members['father'] from mapType;
OK
id      name    age     _c3
1       zhangsan        28      xiaoming
2       lisi    22      mayun
3       wangwu  29      wangjianlin
4       mayun   26      mayongzhen
Time taken: 0.1 seconds, Fetched: 4 row(s)
##如果想获取全部的key可以使用map_keys()传入一个map就能返回一个数组,该数组包含该map的所有key
hive (hwzhdb)> select id,name,age,map_keys(members) from mapType;
OK
id      name    age     _c3
1       zhangsan        28      ["father","mother","brother"]
2       lisi    22      ["father","mother","brother"]
3       wangwu  29      ["father","mother","sister"]
4       mayun   26      ["father","mother"]
Time taken: 0.114 seconds, Fetched: 4 row(s)
##同理如果你想获取全部的value可以使用map_values()传入一个map就能返回一个数组,该数组包含该map的所有value
hive (hwzhdb)> select id,name,age,map_values(members) from mapType;
OK
id      name    age     _c3
1       zhangsan        28      ["xiaoming","xiaohuang","xiaoxu"]
2       lisi    22      ["mayun","huangyi","guanyu"]
3       wangwu  29      ["wangjianlin","ruhua","jingtian"]
4       mayun   26      ["mayongzhen","angelababy"]
Time taken: 0.096 seconds, Fetched: 4 row(s)
##如果想查看map的大小使用size()传入一个map
hive (hwzhdb)> select id,name,age,size(members) from mapType;
OK
id      name    age     _c3
1       zhangsan        28      3
2       lisi    22      3
3       wangwu  29      3
4       mayun   26      2
Time taken: 0.103 seconds, Fetched: 4 row(s)



3.structs<dataype1,datatype2...> 
[hadoop@hadoop001 rzData]$ cat hive_struct.txt 
192.168.1.1#zhangsan:40
192.168.1.2#lisi:50
192.168.1.3#wangwu:60
创建表
hive (hwzhdb)> create table structType(
             > ip string,
             > info struct<name:string,age:int>
             > )
             > row format delimited fields terminated by '#'
             > collection items terminated by ':';
OK
Time taken: 0.115 seconds
hive (hwzhdb)> load data local inpath '/home/hadoop/data/rzData/hive_struct.txt' into table structType;
Loading data to table hwzhdb.structtype
Table hwzhdb.structtype stats: [numFiles=1, totalSize=88]
OK
Time taken: 0.318 seconds
hive (hwzhdb)> select * from structType;
OK
structtype.ip   structtype.info
192.168.1.1     {"name":"zhangsan","age":40}
192.168.1.2     {"name":"lisi","age":50}
192.168.1.3     {"name":"wangwu","age":60}
192.168.1.4     {"name":"zhaoliu","age":70}
Time taken: 0.069 seconds, Fetched: 4 row(s)
##如果想获取struct中的值,直接通过struct字段.属性名的方式即可
hive (hwzhdb)> select ip,info.name,info.age from structType;
OK
ip      name    age
192.168.1.1     zhangsan        40
192.168.1.2     lisi    50
192.168.1.3     wangwu  60
192.168.1.4     zhaoliu 70
Time taken: 0.076 seconds, Fetched: 4 row(s)

总结:
array:
array中只能存放相同数据类型的数据。
array中获取数据用array[下标]的方式获取,下标从0开始。
计算array的大小可以使用size()函数,传入一个array,返回一个int类型的数值。
如果想看array中是否包含某值,可以使用array_contains()函数,传入一个array,一个某值。

map:
map中的key只能是相同的数据类型。
如果要获取map中某key的值,需要通过map名[‘key’]的方式获取。
如果要获取map中所有的key值,需要通过map_keys()传入一个map,然后返回一个array,包含所有的key。
如果要获取map中所有的value,需要通过map_values()传入一个map,然后返回一个array,包含所有的value。
如果要查看map的大小,使用size()函数传入一个map,返回一个int类型的数值。

struct:
struct中的数据类型可以是多种数据类型。
struct中取值通过struct字段.属性名的方式取值。

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值