目录
说明,关于内置的hive的udf函数使用,建议直接参考hive 官方wiki https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-explode(array)
explode
explode()
takes in an array (or a map) as an input and outputs the elements of the array (map) as separate rows. UDTFs can be used in the SELECT expression list and as a part of LATERAL VIEW.
As an example of using explode()
in the SELECT expression list, consider a table named myTable that has a single column (myCol) and two rows:
Array<int> myCol |
---|
[100,200,300] |
[400,500,600] |
Then running the query:
SELECT explode(myCol) AS myNewCol FROM myTable;
will produce:
(int) myNewCol |
---|
100 |
200 |
300 |
400 |
500 |
600 |
The usage with Maps is similar:
SELECT` `explode(myMap) ``AS` `(myMapKey, myMapValue) ``FROM` `myMapTable;
lateral view explod
SELECT * FROM `default`.`testlater`;
pageid | addlist |
---|---|
frontpage | [1,2,3] |
contactpage | [4,5,6] |
select pageid , t1 from testlater lateral view explode(split(regexp_replace(addlist,"\\[|\\]",""),","))a as t1
pageid | t1 |
---|---|
frontpage | 1 |
frontpage | 2 |
frontpage | 3 |
contactpage | 4 |
contactpage | 5 |
contactpage | 6 |
lateral view用于和split, explode等UDTF一起使用,它能够将一行数据拆成多行数据,在此基础上可以对拆分后的数据进行聚合。lateral view首先为原始表的每行调用UDTF,UDTF会把一行拆分成一或者多行,lateral view再把结果组合,产生一个支持别名表的虚拟表。
由此可见,lateral view与explode等udtf就是天生好搭档,explode将复杂结构一行拆成多行,然后再用lateral view做各种聚合。