Hive SQL经典面试题之行列转换

4 篇文章 0 订阅

哈喽,小伙伴们,欢迎来到小张的频道,今天给大家讲解一道面试中常见的SQL问题------行列转换问题,希望能帮助到小伙伴们。还需要给大家说明的是这是一个系列性文章,如果您想了解更多面试题型,希望大家多多关注小张哦~

1. 多行转多列

1.1 原始数据r2c1.txt

a,c,1
a,d,2
a,e,3
b,c,4
b,d,5
b,e,6

1.2 建表语句

create table test.row2col1(
col1 string,
col2 string,
col3 int
)
row format delimited fields terminated by ',';

1.3 加载数据

load data local inpath '/opt/datas/hive/r2c1.txt' into table row2col1;
+----------------+----------------+----------------+
| row2col1.col1  | row2col1.col2  | row2col1.col3  |
+----------------+----------------+----------------+
| a              | c              | 1              |
| a              | d              | 2              |
| a              | e              | 3              |
| b              | c              | 4              |
| b              | d              | 5              |
| b              | e              | 6              |
+----------------+----------------+----------------+

1.4 期望结果

+-------+----+----+----+--+
| col1  | c  | d  | e  |
+-------+----+----+----+--+
| a     | 1  | 2  | 3  |
| b     | 4  | 5  | 6  |
+-------+----+----+----+--+

1.5 答案

SELECT 
    col1,
    MAX(CASE WHEN col2  ='c' THEN col3 ELSE 0  END ) c,
    MAX(CASE WHEN col2  ='d' THEN col3 ELSE 0  END ) d,
    MAX(CASE WHEN col2  ='e' THEN col3 ELSE 0  END ) e
FROM row2col1 
    GROUP BY col1;

1.6 执行结果

+-------+----+----+----+
| col1  | c  | d  | e  |
+-------+----+----+----+
| a     | 1  | 2  | 3  |
| b     | 4  | 5  | 6  |
+-------+----+----+----+

2. 多行转单列

2.1 原始数据r2c2.txt

a,b,1
a,b,2
a,b,3
c,d,4
c,d,5
c,d,6

2.2 建表语句

create table test.row2col2(
col1 string,
col2 string,
col3 int
)row format delimited fields terminated by ',';

2.3 加载数据

load data local inpath '/opt/datas/hive/r2c2.txt' into table row2col2;
+----------------+----------------+----------------+
| row2col2.col1  | row2col2.col2  | row2col2.col3  |
+----------------+----------------+----------------+
| a              | b              | 1              |
| a              | b              | 2              |
| a              | b              | 3              |
| c              | d              | 4              |
| c              | d              | 5              |
| c              | d              | 6              |
+----------------+----------------+----------------+

2.4 期望结果

+-------+-------+--------+--+
| col1  | col2  |  col3  |
+-------+-------+--------+--+
| a     | b     | 1,2,3  |
| c     | d     | 4,5,6  |
+-------+-------+--------+--+

2.5 答案

SELECT
    col1,
    col2,
    CONCAT_WS(',',COLLECT_SET(CAST(col3 as string))) col3
FROM 
    row2col2
GROUP BY 
    col1,col2;

2.6 执行结果

+-------+-------+--------+
| col1  | col2  |  col3  |
+-------+-------+--------+
| a     | b     | 1,2,3  |
| c     | d     | 4,5,6  |
+-------+-------+--------+

3. 多列转多行

3.1 原始数据c2r1.txt

a,1,2,3
b,4,5,6

3.2 建表语句

create table test.col2row1(
col1 string,
col2 int,
col3 int,
col4 int
)row format delimited fields terminated by ',';

3.3 加载数据

load data local inpath '/opt/datas/hive/c2r1.txt' into table test.col2row1;
+----------------+----------------+----------------+----------------+
| col2row1.col1  | col2row1.col2  | col2row1.col3  | col2row1.col4  |
+----------------+----------------+----------------+----------------+
| a              | 1              | 2              | 3              |
| b              | 4              | 5              | 6              |
+----------------+----------------+----------------+----------------+

3.4 期望结果

+-----------+-----------+-----------+--+
| col1      |    col2   |    col3   |
+-----------+-----------+-----------+--+
| a         | c         | 1         |
| b         | c         | 4         |
| a         | d         | 2         |
| b         | d         | 5         |
| a         | e         | 3         |
| b         | e         | 6         |
+-----------+-----------+-----------+--+

3.5 答案

SELECT 
    col1,
    col2,
    col3
FROM (
      SELECT col1,'c' as col2,col2 as col3 FROM col2row1 UNION 
      SELECT col1,'d' as col2,col3 as col3 FROM col2row1 UNION 
      SELECT col1,'e' as col2,col4 as col3 FROM col2row1
      ) t1
ORDER BY 
    col2,col1;

3.6 执行结果

+-------+-------+-------+
| col1  | col2  | col3  |
+-------+-------+-------+
| a     | c     | 1     |
| b     | c     | 4     |
| a     | d     | 2     |
| b     | d     | 5     |
| a     | e     | 3     |
| b     | e     | 6     |
+-------+-------+-------+

4. 单列转多行

4.1 原始数据c2r2.txt

a b 1,2,3
c d 4,5,6

4.2 建表语句

create table test.col2row2(
col1 string,
col2 string,
col3 string
)row format delimited fields terminated by ' ';

4.3 加载数据

load data local inpath '/opt/datas/hive/c2r2.txt' into table test.col2row2;
+----------------+----------------+----------------+
| col2row2.col1  | col2row2.col2  | col2row2.col3  |
+----------------+----------------+----------------+
| a              | b              | 1,2,3          |
| c              | d              | 4,5,6          |
+----------------+----------------+----------------+

4.4 期望结果

+-------+-------+-------+--+
| col1  | col2  | col3  |
+-------+-------+-------+--+
| a     | b     | 1     |
| a     | b     | 2     |
| a     | b     | 3     |
| c     | d     | 4     |
| c     | d     | 5     |
| c     | d     | 6     |
+-------+-------+-------+--+

4.5 答案

SELECT
    a.col1 col1,
    a.col2 col2,
    b.col3 col3
FROM 
    col2row2 a 
lateral view 
    explode(split(col3,',')) b as col3;

4.6 执行结果

+-------+-------+-------+
| col1  | col2  | col3  |
+-------+-------+-------+
| a     | b     | 1     |
| a     | b     | 2     |
| a     | b     | 3     |
| c     | d     | 4     |
| c     | d     | 5     |
| c     | d     | 6     |
+-------+-------+-------+
  • 0
    点赞
  • 6
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
Hive SQL优化面试题通常涉及以下几个方面: 1. 查询优化:了解如何通过索引、分区等技术来优化Hive SQL查询性能。可以提到使用合适的索引、分区和分桶来减少数据扫描量,提高查询效率。 2. 性能调优:理解常见的性能瓶颈和优化策略,例如避免全表扫描、减少数据倾斜、合理设置并行度等。可以提到使用合适的数据类型、避免不必要的数据转换、使用合适的连接方式等来提高性能。 3. EXPLAIN关键字的使用:掌握使用EXPLAIN关键字来分析查询执行计划,了解查询的执行顺序和涉及的操作,从而找到潜在的性能问题并进行优化。 4. 解答优化相关的题目:在面试中可能会遇到一些关于查询优化和性能调优的具体问题,例如如何优化某个特定的查询语句,如何处理大数据量的查询等。在回答时可以结合自己的实际经验和知识来给出合理的解决方案。 综上所述,Hive SQL优化面试题主要涉及查询优化、性能调优、使用EXPLAIN关键字分析查询计划以及解答具体的优化问题。掌握这些知识和技巧可以帮助提升在Hive SQL领域的技能和竞争力。 #### 引用[.reference_title] - *1* *3* [Hive SQL大厂面试题必考大全](https://blog.csdn.net/m0_47256162/article/details/131687792)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v91^insertT0,239^v3^insert_chatgpt"}} ] [.reference_item] - *2* [Hive SQL面试题(附答案)](https://blog.csdn.net/a934079371/article/details/122227602)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v91^insertT0,239^v3^insert_chatgpt"}} ] [.reference_item] [ .reference_list ]

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值