LanguageManual UDF - Apache Hive - Apache Software Foundationhttps://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDFexplode (array)
select explode(array('A','B','C'));
select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf;
select tf.*,t.* from (select 0) t lateral view explode(array('A','B','C')) tf as col;
explode (map)
select explode(map('A',10,'B',20,'C',30));
select explode(map('A',10,'B',20,'C',30)) as (key,value); --和上面一样
select tf.* from (select 0) t lateral view explode(map('A',10,'B',20,'C',30)) tf;--和上面一样
select tf.* from (select 0) t lateral view explode(map('A',10,'B',20,'C',30)) tf as key,value;
posexplode
select posexplode(array('A','B','C'));
select posexplode(array('A','B','C')) as (pos,val);
select tf.* from (select 0) t lateral view posexplode(array('A','B','C')) tf;
select tf.* from (select 0) t lateral view posexplode(array('A','B','C')) tf as pos,val;
inline (array of structs)
inline( ARRAY( STRUCT()[,STRUCT()] - explodes and array and struct into a table
select inline(array(struct('A',10,date '2015-01-01'),struct('B',20,date '2016-02-02')));
select inline(array(struct('A',10,date '2015-01-01'),struct('B',20,date '2016-02-02'))) as (col1,col2,col3);
select tf.* from (select 0) t lateral view inline(array(struct('A',10,date '2015-01-01'),struct('B',20,date '2016-02-02'))) tf;
select tf.* from (select 0) t lateral view inline(array(struct('A',10,date '2015-01-01'),struct('B',20,date '2016-02-02'))) tf as col1,col2,col3;
注意col3 的类型 去掉date了就是string了
stack (values)
stack(n, cols...) - turns k columns into n rows of size k/n each
|
注意stack和inline看着很像 其实不一样
首先inline的入参是array of structs 意思就是首先必须是array ,里面必须是struct
stack 则是 stack(n列,n倍的参数)
select stack (3,1,2,3) ,我想把123 分成3行
select stack (3,1,"cc",2,"love",3,"zbf") 我想每一行有两列, 这个很适合与解析有规律的字符串
T | explode(ARRAY<T> a) | Explodes an array to multiple rows. Returns a row-set with a single column (col), one row for each element from the array. |
Tkey,Tvalue | explode(MAP<Tkey,Tvalue> m) | Explodes a map to multiple rows. Returns a row-set with a two columns (key,value) , one row for each key-value pair from the input map. (As of Hive 0.8.0.). |
int,T | posexplode(ARRAY<T> a) | Explodes an array to multiple rows with additional positional column of int type (position of items in the original array, starting with 0). Returns a row-set with two columns (pos,val), one row for each element from the array. |
T1,...,Tn | inline(ARRAY<STRUCT<f1:T1,...,fn:Tn>> a) | Explodes an array of structs to multiple rows. Returns a row-set with N columns (N = number of top level elements in the struct), one row per struct from the array. (As of Hive 0.10.) |
T1,...,Tn/r | stack(int r,T1 V1,...,Tn/r Vn) | Breaks up n values V1,...,Vn into r rows. Each row will have n/r columns. r must be constant. |
string1,...,stringn | json_tuple(string jsonStr,string k1,...,string kn) | Takes JSON string and a set of n keys, and returns a tuple of n values. This is a more efficient version of the |
string 1,...,stringn | parse_url_tuple(string urlStr,string p1,...,string pn) | Takes URL string and a set of n URL parts, and returns a tuple of n values. This is similar to the |
select json_tuple('{"id":2,"name":"cc2"}',"id","name")
select get_json_object('{"id":2,"name":"cc2"}','$.id'),get_json_object('{"id":2,"name":"cc2"}','$.name')