一、测试数据准备
1、创建表单
create table if not exists stu_score(
score_id int,
stu_name string,
stu_subject string,
score int
)
row format delimited fields terminated by ','; -- 指定字段之间的分隔符为逗号
2、装载数据
文件上传到虚拟机: 主机虚拟机互传文件:SCP命令和MobaXterm工具
linux目录:root/data/data.txt
1,张三,语文,80
2,张三,数学,90
3,张三,英语,70
4,张三,生物,85
5,李四,语文,80
6,李四,数学,92
7,李四,英语,76
8,李四,生物,88
9,王五,语文,60
10,王五,数学,82
11,王五,英语,96
12,王五,生物,78
load data local inpath '/root/data/data.txt'
overwrite into table stu_score;
3、检验数据是否装载好
select * from stu_score;
二、行转列
行转列预期结果
1、使用 case when… then…
SELECT
stu_name,
MAX(CASE WHEN stu_subject = '语文' THEN score END) AS chinese,
MAX(CASE WHEN stu_subject = '数学' THEN score END) AS math,
MAX(CASE WHEN stu_subject = '英语' THEN score END) AS english,
MAX(CASE WHEN stu_subject = '生物' THEN score END) AS biology
FROM stu_score
GROUP BY stu_name;
2、使用 if()
SELECT
stu_name,
SUM(IF(stu_subject='语文', score,0)) as chinese,
SUM(IF(stu_subject='数学', score,0)) as math,
SUM(IF(stu_subject='英语' ,score,0)) as english,
SUM(IF(stu_subject='生物', score,0)) as biology
FROM stu_score
GROUP BY stu_name;
3、函数
1、CONCAT(str1, str2)
CONCAT 函数用于将两个或多个字符串连接在一起。
2、CONCAT_WS(separator, str1, str2,…):
CONCAT_WS 函数用于将字符串连接在一起,并使用指定的分隔符将它们分开。
3、COLLECT_SET(col):
COLLECT_SET 函数通常用于处理复杂数据类型,用于收集每个分组内的唯一值,并返回一个集合
三、列转行:
数据准备:
以上面的结果创建临时表
CREATE TEMPORARY TABLE IF NOT EXISTS temp_stu_summary AS
SELECT
stu_name,
SUM(IF(stu_subject='语文', score, 0)) AS chinese,
SUM(IF(stu_subject='数学', score, 0)) AS math,
SUM(IF(stu_subject='英语', score, 0)) AS english,
SUM(IF(stu_subject='生物', score, 0)) AS biology
FROM stu_score
GROUP BY stu_name;
临时表temp_stu_summary
select * from temp_stu_summary
列转行预期结果:
1、使用 union all
SELECT stu_name, '语文' AS subject, chinese AS score FROM temp_stu_summary
UNION ALL
SELECT stu_name, '数学' AS subject, math AS score FROM temp_stu_summary
UNION ALL
SELECT stu_name, '英语' AS subject, english AS score FROM temp_stu_summary
UNION ALL
SELECT stu_name, '生物' AS subject, biology AS score FROM temp_stu_summary;
2、函数 EXPLODE
SELECT
stu_name,
A.subject,
A.score
FROM temp_stu_summary
LATERAL VIEW EXPLODE(map("语文", chinese,"数学",math,"英语", english,"生物", biology)) A AS subject, score;
EXPLODE(col):将hive一列中复杂的array或者map结构拆分成多行
LATERAL VIEW
是 Apache Hive 中用于处理复杂数据类型的查询语法。在 Hive 中,复杂数据类型通常包括数组(Array)、映(Map)和结构(Struct)。