总结:
hive中复杂数据类型包括数组(array)、映射(map)和结构体(struct)
array
COLLECTION ITEMS TERMINATED BY ‘,’ 指定数组中每个元素的分割符
size(列) 该array多少元素
列[下标] 取出array这一列的某个元素
map
COLLECTION ITEMS TERMINATED BY ‘-’ MAP KEYS TERMINATED BY ‘:’; 指定map中每个 kv的分割符及每个kv中k和v的分割符
size(列) 该map多少元素
列[k的值] 取出map某个k的v
struct
列.[k的值] 取出某个k的v
比如user是一个STRUCT类型,那么可以通过user.name得到这个用户的名称。 hive 复杂数据类型
hive-udf 函数地址
array类型
建表创建 array类型指定分隔符
vim / export/ datas/ array. txt
zhangsan beijing, shanghai, tianjin
wangwu shanghai, chengdu, wuhan, haerbin
create database db_complex;
use db_complex;
create table if not exists complex_array(
name string,
work_locations array< string>
)
row format delimited fields terminated by '\t'
COLLECTION ITEMS TERMINATED BY ','
;
load data local inpath '/export/datas/array.txt' into table complex_array;
select name, size( work_locations) as numb from complex_array;
select name, work_locations[ 0 ] , work_locations[ 1 ] from complex_array;
array 方法
array(val1, val2, …)
Creates an array with the given elements.
select array( '唱歌' , '跳舞' , '打球' ) ;
[ "唱歌" , "跳舞" , "打球" ]
map类型
建表创建 map 类型指定分隔符
vim / export/ datas/ map. txt
1 , zhangsan, 唱歌:非常喜欢- 跳舞:喜欢- 游泳:一般般
2 , lisi, 打游戏:非常喜欢- 篮球:不喜欢
create table if not exists complex_map(
id int ,
name string,
hobby map< string, string>
)
row format delimited fields terminated by ','
COLLECTION ITEMS TERMINATED BY '-' MAP KEYS TERMINATED BY ':'
;
load data local inpath '/export/datas/map.txt' into table complex_map;
select name, size( hobby) as numb from complex_map;
select name, hobby[ "唱歌" ] as deep from complex_map;
string→map
str_to_map(text[, delimiter1, delimiter2])
Splits text into key-value pairs using two delimiters. Delimiter1 separates text into K-V pairs, and Delimiter2 splits each K-V pair. Default delimiters are ‘,’ for delimiter1 and ‘:’ for delimiter2. 翻译即该函数默认第一个分隔符为每个KeyValue之间的分隔符
,第二个分隔符为KEY和Value之间的分隔符
,也可以自定义.
select str_to_map( '唱歌=喜欢&跳舞=喜欢' , '&' , '=' ) ;
{"跳舞" :"喜欢" , "唱歌" :"喜欢" }
map 方法
map(key1, value1, key2, value2, …)
Creates a map with the given key/value pairs.
select map( "唱歌" , '喜欢' , "跳舞" , '不喜欢' , "打球" , '喜欢' ) ;
{"唱歌" :"喜欢" , "跳舞" :"不喜欢" , "打球" :"喜欢" }
struct 类型
建表创建 struct类型指定分隔符
vim / export/ datas/ struct. txt
1001 , zhangsan:24
1002 , lisi:25
1003 , xiaoming:26
1004 , dongdong:27
create table complex_struct(
id int
, infostruct< name:string, age:int >
)
row format delimited fields terminated by ","
collection items terminated by ":"
;
load data local inpath "/export/datas/struct.txt" into table complex_struct;
select * from complex_struct;
select info. name, info. age from complex_struct;
struct方法
struct(val1, val2, val3, …)
Creates a struct with the given field values. Struct field names will be col1, col2, …
select struct( '唱歌' , '跳舞' , '打球' ) ;
{"col1" :"唱歌" , "col2" :"跳舞" , "col3" :"打球" }
named_struct方法
named_struct(name1, val1, name2, val2, …)
Creates a struct with the given field names and values. (As of Hive 0.8.0.)
select named_struct( "eventType" , '唱歌' , "eventTime" , '1999-01-01' )
;
{"eventtype" :"唱歌" , "eventtime" :"1999-01-01" }
参考文档