经典行转列

经典行转列:group by +sum()用法
今天在力扣上面做了一道SQL的题,经典的行转列,题面如下:
*

部门表 Department:
±--------------±--------+ | Column Name | Type |
±--------------±--------+ | id | int | | revenue | int | | month | varchar |
±--------------±--------+ (id, month) 是表的联合主键。 这个表格有关于每个部门每月收入的信息。 月份(month)可以取下列值
[“Jan”,“Feb”,“Mar”,“Apr”,“May”,“Jun”,“Jul”,“Aug”,“Sep”,“Oct”,“Nov”,“Dec”]。
编写一个 SQL 查询来重新格式化表,使得新的表中有一个部门 id 列和一些对应 每个月 的收入(revenue)列。
查询结果格式如下面的示例所示: Department 表:
±-----±--------±------+ | id | revenue | month |
±-----±--------±------+ | 1 | 8000 | Jan | | 2 | 9000 | Jan | | 3 | 10000 | Feb | | 1 | 7000 | Feb | | 1
| 6000 | Mar |
±-----±--------±------+ 查询得到的结果表:
±-----±------------±------------±------------±----±------------+ | id | Jan_Revenue | Feb_Revenue | Mar_Revenue | … | Dec_Revenue |
±-----±------------±------------±------------±----±------------+ | 1 | 8000 | 7000 | 6000 | … | null |
| 2 | 9000 | null | null | … | null |
| 3 | null | 10000 | null | … | null |
±-----±------------±------------±------------±----±------------+ 注意,结果表有 13 列 (1个部门 id 列 + 12个月份的收入列)。

在做这道题的时候,感觉挺简单的,在日常工作中也经常遇到。所以,毫不犹豫的就开始select …了。用的是case 语句:
*

select id, (case month when ‘Jan’ then revenue else null end) as
Jan_Revenue, (case month when ‘Feb’ then revenue else null end) as
Feb_Revenue, (case month when ‘Mar’ then revenue else null end) as
Mar_Revenue, (case month when ‘Apr’ then revenue else null end) as
Apr_Revenue, (case month when ‘May’ then revenue else null end) as
May_Revenue, (case month when ‘Jun’ then revenue else null end) as
Jun_Revenue, (case month when ‘Jul’ then revenue else null end) as
Jul_Revenue, (case month when ‘Aug’ then revenue else null end) as
Aug_Revenue, (case month when ‘Sep’ then revenue else null end) as
Sep_Revenue, (case month when ‘Oct’ then revenue else null end) as
Oct_Revenue, (case month when ‘Nov’ then revenue else null end) as
Nov_Revenue, (case month when ‘Dec’ then revenue else null end) as
Dec_Revenue from Department group by id

但是运行的时候才发现,结果根本不对。查找了相关分享后才知道,原来使用GROUP BY ,sql引擎为GROUP BY子句中的每个元素创建一组结果行,没有聚合的话,只计算第一个条件,后面的都为空了。
所以,更正了SQL语句:
*

select id, sum(case month when ‘Jan’ then revenue else null end) as
Jan_Revenue, sum(case month when ‘Feb’ then revenue else null end) as
Feb_Revenue, sum(case month when ‘Mar’ then revenue else null end) as
Mar_Revenue, sum(case month when ‘Apr’ then revenue else null end) as
Apr_Revenue, sum(case month when ‘May’ then revenue else null end) as
May_Revenue, sum(case month when ‘Jun’ then revenue else null end) as
Jun_Revenue, sum(case month when ‘Jul’ then revenue else null end) as
Jul_Revenue, sum(case month when ‘Aug’ then revenue else null end) as
Aug_Revenue, sum(case month when ‘Sep’ then revenue else null end) as
Sep_Revenue, sum(case month when ‘Oct’ then revenue else null end) as
Oct_Revenue, sum(case month when ‘Nov’ then revenue else null end) as
Nov_Revenue, sum(case month when ‘Dec’ then revenue else null end) as
Dec_Revenue from Department group by id

在评论区中,也有看到其他类似做法,用if函数的:
*

select
id,
max(if(month = ‘Jan’, revenue, null)) as “Jan_Revenue”,
max(if(month = ‘Feb’, revenue, null)) as “Feb_Revenue”,
max(if(month = ‘Mar’, revenue, null)) as “Mar_Revenue”,
max(if(month = ‘Apr’, revenue, null)) as “Apr_Revenue”,
max(if(month = ‘May’, revenue, null)) as “May_Revenue”,
max(if(month = ‘Jun’, revenue, null)) as “Jun_Revenue”,
max(if(month = ‘Jul’, revenue, null)) as “Jul_Revenue”,
max(if(month = ‘Aug’, revenue, null)) as “Aug_Revenue”,
max(if(month = ‘Sep’, revenue, null)) as “Sep_Revenue”,
max(if(month = ‘Oct’, revenue, null)) as “Oct_Revenue”,
max(if(month = ‘Nov’, revenue, null)) as “Nov_Revenue”,
max(if(month = ‘Dec’, revenue, null)) as “Dec_Revenue” from
Department group by id

二者整体思路差不多,但是都有一个相同点,中间数据有聚合,否则计算出的数据只计算第一行:

讲讲case when的原理 当一个单元格中有多个数据时,case when只会提取当中的第一个数据。

以CASE WHEN month=‘Feb’ THEN revenue END
为例,当id=1时,它只会提取month对应单元格里的第一个数据,即Jan,它不等于Feb,所以找不到Feb对应的revenue,所以返回NULL。(可以试试把我上面答案里的sum()统统去掉,执行结果与预期不一样。错就错在当id=1时,Feb_Revenue和Mar_Revenue的值变成了NULL)

那该如何解决单元格内含多个数据的情况呢?答案就是使用聚合函数,聚合函数就用来输入多个数据,输出一个数据的。如SUM()或MAX(),而每个聚合函数的输入就是每一个多数据的单元格。

以SUM(CASE WHEN month=‘Feb’ THEN revenue END)
为例,当id=1时,它提取的Jan、Feb、Mar,从中找到了符合条件的Feb,并最终返回对应的revenue的值,即7000。

作者:xxiao053
链接:https://leetcode-cn.com/problems/reformat-department-table/solution/guan-yu-group-byyu-sumde-pei-he-by-xxiao053/
来源:力扣(LeetCode) 著作权归作者所有。商业转载请联系作者获得授权,非商业转载请注明出处。
当然,也有看到很多大神对行转列的精彩简洁做法,比如用函数“pivot”:

select
*

  • from (
    select id, revenue, month||’_Revenue’ as month_rev from department ) pivot(
    sum(revenue) for month_rev in(‘Jan_Revenue’, ‘Feb_Revenue’, ‘Mar_Revenue’, ‘Apr_Revenue’, ‘May_Revenue’, ‘Jun_Revenue’,
    ‘Jul_Revenue’, ‘Aug_Revenue’, ‘Sep_Revenue’, ‘Oct_Revenue’,
    ‘Nov_Revenue’, ‘Dec_Revenue’) )

SQL简单,但是想效率极高的输出,还是需要一定积累的呢。
继续学习中。。。。

  • 2
    点赞
  • 7
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值