行转列(PIVOT)原理与实现
行转列的概念
行转列是将数据表中的行数据转换为列数据的过程,通常用于将多行数据按照某个字段的值聚合后,作为新的列展示。
实现方法
方法一:使用CASE WHEN + 聚合函数
SELECT
student_id,
MAX(CASE WHEN subject = '数学' THEN score ELSE NULL END) AS '数学',
MAX(CASE WHEN subject = '语文' THEN score ELSE NULL END) AS '语文',
MAX(CASE WHEN subject = '英语' THEN score ELSE NULL END) AS '英语'
FROM
student_scores
GROUP BY
student_id;
方法二:使用IF + 聚合函数(MySQL特有)
SELECT
student_id,
MAX(IF(subject = '数学', score, NULL)) AS '数学',
MAX(IF(subject = '语文', score, NULL)) AS '语文',
MAX(IF(subject = '英语', score, NULL)) AS '英语'
FROM
student_scores
GROUP BY
student_id;
方法三:动态SQL实现(适用于列不固定的情况)
SET @sql = NULL;
SELECT
GROUP_CONCAT(DISTINCT
CONCAT('MAX(IF(subject = ''', subject, ''', score, NULL)) AS ''', subject, '''')
) INTO @sql
FROM student_scores;
SET @sql = CONCAT('SELECT student_id, ', @sql, ' FROM student_scores GROUP BY student_id');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
性能优化建议
为分组字段和条件字段添加索引
对于大数据集,考虑使用临时表分步处理
动态SQL虽然灵活,但要注意SQL注入风险
列转行(UNPIVOT)原理与实现
概念
列转行是将数据表中的列数据转换为行数据的过程,通常用于将多列数据转换为键值对的形式展示。
实现方法
方法一:使用UNION ALL
SELECT student_id, '数学' AS subject, 数学 AS score FROM student_scores_pivot WHERE 数学 IS NOT NULL
UNION ALL
SELECT student_id, '语文' AS subject, 语文 AS score FROM student_scores_pivot WHERE 语文 IS NOT NULL
UNION ALL
SELECT student_id, '英语' AS subject, 英语 AS score FROM student_scores_pivot WHERE 英语 IS NOT NULL
ORDER BY student_id, subject;
方法二:使用CROSS JOIN + JSON函数(MySQL 5.7+)
SELECT
t.student_id,
j.subject,
j.score
FROM
student_scores_pivot t
CROSS JOIN JSON_TABLE(
CONCAT('[',
CONCAT_WS(',',
CONCAT('{"subject":"数学","score":', IFNULL(数学, 'null'), '}'),
CONCAT('{"subject":"语文","score":', IFNULL(语文, 'null'), '}'),
CONCAT('{"subject":"英语","score":', IFNULL(英语, 'null'), '}')
),
']'),
'$[*]' COLUMNS(
subject VARCHAR(20) PATH '$.subject',
score DECIMAL(10,2) PATH '$.score'
)
) AS j
WHERE j.score IS NOT NULL;
方法三:使用存储过程动态生成(适用于列不固定的情况)
DELIMITER //
CREATE PROCEDURE unpivot_student_scores()
BEGIN
DECLARE done INT DEFAULT FALSE;
DECLARE col_name VARCHAR(255);
DECLARE cur CURSOR FOR
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'student_scores_pivot'
AND COLUMN_NAME != 'student_id';
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
SET @sql = '';
OPEN cur;
read_loop: LOOP
FETCH cur INTO col_name;
IF done THEN
LEAVE read_loop;
END IF;
IF @sql != '' THEN
SET @sql = CONCAT(@sql, ' UNION ALL ');
END IF;
SET @sql = CONCAT(@sql,
'SELECT student_id, ''', col_name, ''' AS subject, ',
col_name, ' AS score FROM student_scores_pivot WHERE ',
col_name, ' IS NOT NULL');
END LOOP;
CLOSE cur;
SET @sql = CONCAT(@sql, ' ORDER BY student_id, subject');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
END //
DELIMITER ;
CALL unpivot_student_scores();
实际应用场景分析
行转列的典型场景
学生成绩单展示(每个学生一行,各科成绩作为列)
销售报表(每个销售员一行,各月销售额作为列)
用户行为统计(每个用户一行,各种行为次数作为列)
列转行的典型场景
ETL过程中的数据规范化
数据准备用于图表绘制
将宽表转换为适合统计分析的窄表
注意事项
1.处理NULL值:使用COALESCE或IFNULL函数处理转换过程中的NULL值
2.性能优化:对大表操作时,添加适当的索引并考虑分批处理
3.动态列处理:对于列不固定的情况,建议使用存储过程封装动态SQL
4.数据类型一致性:确保转换后的数据类型适合后续计算和展示
5.安全考虑:动态SQL要防止SQL注入,对输入进行严格校验