Sorting arrays in Hive
Hive provides two main built-in functions for sorting the elements inside an array:
sort_array
sort_array(array(obj1, obj2,…)) - Sorts the input array in ascending order according to the natural ordering of the array elements.
Example:
SELECT sort_array(array('b', 'd', 'c', 'a')) FROM src LIMIT 1;
'a', 'b', 'c', 'd'
Function class: org.apache.hadoop.hive.ql.udf.generic.GenericUDFSortArray
Function type: BUILTIN
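As a quick check of the behaviour described above, here is a minimal HiveQL sketch. The first statement prints the built-in help for the function, which appears to be where the description quoted here comes from. The second sorts a literal array of integers. The sketch assumes a Hive version that accepts SELECT without a FROM clause (0.13 or later); on older versions, append FROM src LIMIT 1 as in the example above.

```sql
-- Print the built-in help text, including the function class and type.
DESCRIBE FUNCTION EXTENDED sort_array;

-- Sort a literal array of integers in ascending (natural) order.
SELECT sort_array(array(3, 1, 2));
-- returns [1,2,3]
```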
sort_array_by
sort_array_by(array(obj1, obj2,…),'f1','f2',…,['ASC','DESC']) - Sorts the input tuple array in the user-specified order (ASC or DESC) by the desired field name(s). If no sorting order is given, the default sorting order is ascending.
Example:
SELECT sort_array_by(array(struct('g',100),struct('b',200)),'col1','ASC') FROM src LIMIT 1;
array(struct('b',200),struct('g',100))
Function class: org.apache.hadoop.hive.ql.udf.generic.GenericUDFSortArrayByField
Function type: BUILTIN
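The example above sorts by the first struct field in ascending order. The sketch below tries the two other cases the description mentions: an explicit DESC order, and more than one sort field with the order left out. It relies on struct() naming its fields col1, col2, and so on by default, and assumes a Hive version that ships sort_array_by and accepts SELECT without a FROM clause. The sample values are made up for illustration.

```sql
-- Sort by the second field (the number), largest first.
SELECT sort_array_by(
  array(struct('g', 100), struct('b', 200)),
  'col2', 'DESC'
);
-- intended order: struct('b', 200), then struct('g', 100)

-- Two sort fields and no explicit order, so the default (ascending) applies.
SELECT sort_array_by(
  array(struct('b', 200), struct('a', 200), struct('a', 100)),
  'col1', 'col2'
);
```

If the array elements need more readable field names than col1 and col2, building them with named_struct instead of struct is one option to try.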
- https://www.iteblog.com/archives/2032.html#sort_array_by
- https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSortArrayByField.java