用Hive处理数据的时候经常会遇到行列互相转换的需求,总结并记录一下行列转行的常见场景和操作语法
其中所有的操作都可以直接复制语句去自己的hive执行,查看结果
前置知识
执行hive或beeline进入,执行
desc function explode;
查看函数说明;
explode(a) - separates the elements of array a into multiple rows, or the elements of a map into multiple rows and columns
可以将数组炸开成多行,或者将map炸开成多行多列,是Hive内置的UDTF
split(str, regex) - Splits str around occurances that match regex
按照正则规则去切割字符串
collect_list(x) - Returns a list of objects with duplicates
返回不去重的集合
collect_set(x) - Returns a set of objects with duplicate elements eliminated
返回一个去重的集合
concat_ws(separator, [string | array(string)]+) - returns the concatenation of the strings separated by the separator
返回一个特定分隔符的拼接字符串
max(expr) - Returns the maximum value of expr
返回表达式的最大值
准备数据
创建一个name,subject,score的期末考试成绩表,代表每个同学每个学科的成绩
导入数据也可以自定义文本然后load也行,sql语句一步到位,测试简单方便
- 用于行专列的数据,后称表1
create table school_final_test as
select 'jack' as name, 'english' as subject, 70 as score union all
select 'jack' as name, 'math' as subject, 80 as score union all
select 'jack' as name, 'chinese' as subject, 90 as score union all
select 'tim' as name, 'english' as subject, 10 as score union all
select 'tim' as name, 'math' as subject,