SQL行列问题

最新推荐文章于 2024-06-14 10:14:22 发布

咦，笔记存档

最新推荐文章于 2024-06-14 10:14:22 发布

阅读量401

点赞数

分类专栏： MYSQL 数据分析基础文章标签： sql 数据库

本文链接：https://blog.csdn.net/weixin_41882890/article/details/112204619

版权

数据分析基础同时被 2 个专栏收录

14 篇文章 13 订阅

订阅专栏

MYSQL

4 篇文章 0 订阅

订阅专栏

日常工作中，为了让数据的可读性更强，经常会对数据格式进行转化操作。总结一下日常工作中遇到的关于行列操作问题。

单行拆分成多行

在这里插入图片描述

--创建测试数据
CREATE TABLE fwj.customer
(
id STRING,
name STRING,
mobiles STRING);

INSERT INTO fwj.customer
SELECT '1','jim','139,177,158' FROM system.dual;

-- 解法一，不建议选择
SELECT a.id,a.name,substr(a.mobiles,1,3) mobiles FROM fwj.customer a 
UNION ALL
SELECT a.id,a.name,substr(a.mobiles,5,3) mobiles FROM fwj.customer a 
UNION ALL
SELECT a.id,a.name,substr(a.mobiles,9,3) mobiles FROM fwj.customer a 

-- 解法二
SELECT b.id,a.mobiles,b.name FROM (
SELECT explode(split(t.mobiles,',')) mobiles FROM fwj.customer t )a ,  fwj.customer b;
-- 优化，但不行。explode 这类UDTF函数不支持和其他字段一块被select。
SELECT t.id,t.name,explode(split(t.mobiles,',')) mobiles FROM fwj.customer t 

-- 解法三，可以理解成在一次查询中 
-- 先生成了一个视图 mob 包含了行转列后的数据，之后从mob 中取出转换后的数据，
-- 其他字段仍旧从原表中取。
SELECT a.id,a.name,mob.mobile  
FROM fwj.customer a lateral view explode(split(a.mobiles,',')) mob AS mobile;

split(str,sep):
该函数的作用是拆分指定分隔符分割的字符串，返回一个列表。

SELECT split(a.mobiles,',') FROM  fwj.customer a;

result:
["139","177","158"]

explode(arr):
该函数是一个表生成函数。输入一个列表参数，将列表中的每个值都转换为一行。

SELECT explode(plit(a.mobiles,',')) mobiles FROM  fwj.customer a;

result:

mobiles
139
177
158

行列互换

测试集

/*创建数据库测试表*/
CREATE TABLE [Scores]
     (
       [ID] INT IDENTITY(1, 1)   PRIMARY KEY ,
       [Student] VARCHAR(20) ,
       [Subject] VARCHAR(30) ,
       [Score] FLOAT
     )

 /*插入数据库测试数据信息*/
  INSERT  INTO Scores
         ( Student, Subject, Score )
 VALUES  ( 'test001', '语文', '90' )
 INSERT  INTO Scores
         ( Student, Subject, Score )
 VALUES  ( 'test001', '英语', '85' )
 INSERT  INTO Scores
         ( Student, Subject, Score )
 VALUES  ( 'text002', '语文', '90' )
 INSERT  INTO Scores
         ( Student, Subject, Score )
 VALUES  ( 'text002', '英语', '80' )
 INSERT  INTO Scores
         ( Student, Subject, Score )
 VALUES  ( 'test003', '语文', '95' )
 INSERT  INTO Scores
         ( Student, Subject, Score )
 VALUES  ( 'test003', '英语', '85' )

1. case when …then else …end 用法，行列转换

SELECT  Student AS '姓名' ,
         MAX(CASE Subject
               WHEN '语文' THEN Score
               ELSE 0
             END) AS '语文' ,--如果这个行是“语文”，就选此行作为列
        MAX(CASE Subject
               WHEN '英语' THEN Score
               ELSE 0
             END) AS '英语'
FROM    Scores
 GROUP BY Student
 ORDER BY Student

2. pivot(聚合函数(要转成列值的列名) for 要转换的列 in（目标列名）

SELECT  Student AS '姓名' ,
        AVG(语文) AS '语文' ,
        AVG(英语) AS '英语'
FROM Scores PIVOT( AVG(Score) FOR Subject IN ( 语文, 英语 ) )as NewScores
GROUP BY Student
ORDER BY Student ASC

拓展题

将上表转化为下表的格式
在这里插入图片描述

-- 解法一，实用性不如解法二，但可能图二无法满足，推荐使用解法二
select t1.item,t1.[time]as startdate,min(t2.[time])as endtime
from t1,t1 as t2
where t1.type='start'
and t1.type<>t2.type
and t2.[time]>t1.[time]
group by t1.item,t1.[time]

-- 解法二
Select  * From T1;
With S1 As (Select SeqNo, Item,Type,IIf(Type='Start', Time,Null) sTime,
    IIf(Type='End', Time,Null) eTime, 
    Lag(Type,1,'0') Over (Partition By Item Order By Time) LastType  From T1 ),
 S2 As ( Select SeqNo,Item,Type,sTime,eTime,LastType,
   Sum(IIF(Type='Start',1,IIF(Type='End' And LastType='Start',0,1)))
    Over (Partition By Item Order By SeqNo) Gn  From S1 )
Select Item,Gn,Max(sTime) sTime,Max(eTime) eTime  From S2 Group By Item,Gn

解法二计算过程：
1.Time 的类型设置为 Time(0)
2.S2代码思路：累加分组编号 …，避免万一中间没有（漏掉了） Start 记录，却有 End 记录的情况。如果是 Start 直接加1 ，如果是End 看下前面是不是 Start ，是就不加了，如果是第1条记录，还是加 1。
在这里插入图片描述