Classic Hive examples: top N (rows to columns / columns to rows)

Column-to-row example. Suppose we have the following data file: exercise_topn.txt
 
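Requirement:
For each hobby, find the two oldest people (hobby, age, name). One question to think about: what should happen if several people in a hobby share the second-highest age?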
 
Note that each record lists several hobbies, separated by "-".
The columns are id, name, age, and hobbies:
id,name,age,favors

1,huangxiaoming,45,a-c-d-f
2,huangzitao,36,b-c-d-e
3,huanglei,41,c-d-e
4,liushishi,22,a-d-e
5,liudehua,39,e-f-d
6,liuyifei,35,a-d-e
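To run the queries in this article, the file first has to be loaded into a table. The following is a minimal sketch, not necessarily the author's setup: the column types are inferred from the sample rows, and the local path is a placeholder.

```sql
-- Assumed table definition, with column types guessed from the sample data.
-- The hobbies stay in one string column and are split at query time.
CREATE TABLE IF NOT EXISTS exercise_topn (
    id     INT,
    name   STRING,
    age    INT,
    favors STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';

-- Placeholder path: point it at wherever exercise_topn.txt is stored.
LOAD DATA LOCAL INPATH '/path/to/exercise_topn.txt' INTO TABLE exercise_topn;
```

If the file itself starts with the header line id,name,age,favors, that line would be loaded as a data row. Adding TBLPROPERTIES ('skip.header.line.count'='1') to the table definition is one way to skip it.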

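Approach:

1. First split the hobbies so that each hobby gets its own row. This column-to-row step can be done with explode() plus LATERAL VIEW.

2. Partition by hobby, order by age in descending order, and keep the rows whose row number is <= 2. This can be done with row_number() OVER (...).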
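As a small illustration of the first step in isolation, the query below applies split() and explode() to a literal string instead of a table column. The literal is just the hobby list of the first sample record.

```sql
-- split() turns the string into an array of strings.
-- explode() then emits one output row per array element.
SELECT explode(split('a-c-d-f', '-')) AS favor;
```

This returns four rows (a, c, d, f). When other columns of the source table are needed next to the exploded values, explode() has to be combined with LATERAL VIEW.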

Implementation

1. Split the hobbies into rows:
select a.id, a.name, a.age, favor_view.favor as favor from exercise_topn a
lateral view explode(split(a.favors, '-')) favor_view as favor;

Data after the transformation (rows for ids 1 to 3 shown):


| 1 | huangxiaoming | 45 | a |
| 1 | huangxiaoming | 45 | c |
| 1 | huangxiaoming | 45 | d |
| 1 | huangxiaoming | 45 | f |
| 2 | huangzitao | 36 | b |
| 2 | huangzitao | 36 | c |
| 2 | huangzitao | 36 | d |
| 2 | huangzitao | 36 | e |
| 3 | huanglei | 41 | c |
| 3 | huanglei | 41 | d |
| 3 | huanglei | 41 | e |

2. Number the rows within each hobby by age and keep the top two:
 
SELECT c.id, c.name, c.age, c.favor
FROM (
	SELECT b.id, b.name, b.age, b.favor
		, row_number() OVER (PARTITION BY b.favor ORDER BY b.age DESC) AS rank
	FROM (
		SELECT a.id AS id, a.name AS name, a.age AS age, favor_view.favor
		FROM exercise_topn a
			LATERAL VIEW explode(split(a.favors, '-')) favor_view AS favor
	) b
) c
WHERE c.rank <= 2;
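
row_number() numbers the rows in each partition 1, 2, 3, ... even when ages are equal, so the query above returns at most two rows per hobby and breaks a tie for second place arbitrarily. To keep tied rows, use one of the ranking functions instead:
RANK() ranks each row within its partition; equal values share a rank and leave a gap after it (for example 1, 2, 2, 4).
DENSE_RANK() ranks each row within its partition; equal values share a rank and leave no gap (for example 1, 2, 2, 3).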
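One possible answer to the tie question from the requirement is to swap row_number() for dense_rank(). The sketch below reuses the table and column names from the query above; the alias age_rank is chosen here for illustration.

```sql
-- Keep everyone whose age is among the two highest distinct ages of a hobby.
-- People who tie on age share the same dense_rank value and are all returned.
SELECT c.favor, c.age, c.name
FROM (
    SELECT b.id, b.name, b.age, b.favor
        , dense_rank() OVER (PARTITION BY b.favor ORDER BY b.age DESC) AS age_rank
    FROM (
        SELECT a.id, a.name, a.age, favor_view.favor
        FROM exercise_topn a
            LATERAL VIEW explode(split(a.favors, '-')) favor_view AS favor
    ) b
) c
WHERE c.age_rank <= 2;
```

With dense_rank() a hobby can return more than two rows whenever ages tie. Using rank() with the same filter behaves differently when the tie is for first place: two people sharing the highest age both get rank 1, the next person gets rank 3, and that third person is filtered out.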

2. Row-to-column example

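Find the student IDs (sid) of all students whose math (shuxue) score is higher than their Chinese (yuwen) score.
The data looks like this:
id,sid,course,score
1,1,yuwen,43
2,1,shuxue,55
3,2,yuwen,77
4,2,shuxue,88
5,3,yuwen,98
6,3,shuxue,65
7,3,yingyu,80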

Solution:
1. Pivot the rows into columns
SELECT sid
	, max(CASE course
		WHEN 'yuwen' THEN score
		ELSE 0
	END) AS yuwen
	, max(CASE course
		WHEN 'shuxue' THEN score
		ELSE 0
	END) AS shuxue
	, max(CASE course
		WHEN 'yingyu' THEN score
		ELSE 0
	END) AS yingyu
FROM exercise_course
GROUP BY sid;

The result is:

+------+--------+---------+---------+
| sid  | yuwen  | shuxue  | yingyu  |
+------+--------+---------+---------+
| 3    | 98     | 65      | 80      |
| 1    | 43     | 55      | 0       |
| 2    | 77     | 88      | 0       |
+------+--------+---------+---------+

2. Add the filter condition

SELECT aa.sid
FROM (
	SELECT sid
		, max(CASE course
			WHEN 'yuwen' THEN score
			ELSE 0
		END) AS yuwen
		, max(CASE course
			WHEN 'shuxue' THEN score
			ELSE 0
		END) AS shuxue
		, max(CASE course
			WHEN 'yingyu' THEN score
			ELSE 0
		END) AS yingyu
	FROM exercise_course
	GROUP BY sid
) aa
WHERE aa.shuxue > aa.yuwen;
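
The same filter can also be written without a subquery by moving the comparison into a HAVING clause. This is an alternative sketch, not the article's solution. It drops the ELSE 0 branch on purpose, so a missing course yields NULL instead of 0.

```sql
-- Compare the two conditional aggregates directly in HAVING.
-- Without ELSE 0, a student who lacks one of the two courses produces a NULL
-- comparison and is left out of the result.
SELECT sid
FROM exercise_course
GROUP BY sid
HAVING max(CASE WHEN course = 'shuxue' THEN score END)
     > max(CASE WHEN course = 'yuwen' THEN score END);
```

On the sample data both versions return sid 1 and 2. They differ for a student who has a math score but no Chinese score: the ELSE 0 version treats the missing score as 0 and includes the student, while this version excludes them.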

 

 
