pivot函数-行转列
pivot 函数是Spark-sql 、Oracle,Sql Server特有关键词,即旋转,将指列的字段值,旋转成为多个列。并且可以指定某些列成为旋转列的聚合值,将行数据转换为列数据。
案例一
将图1的数据格式转换为图2的格式
图1
图2
1 pivot函数实现
with order_info_pivot as (
select 'zhang3' as uname , 'tv' as product, 22 as age , 'bj' as city , 3000 as amount union all
select 'li4' as uname , 'notebook' as product, 41 as age, 'bj' as city , 8000 as amount union all
select 'wang5' as uname, 'phone' as product, 32 as age, 'sh' as city , 4000 as amount union all
select 'zhao6' as uname, 'notebook' as product, 22 as age, 'sz' as city, 3000 as amount union all
select 'zhang3' as uname , 'phone' as product, 22 as age, 'bj' as city, 3000 as amount union all
select 'li4' as uname, 'tv' as product, 41 as age, 'sz' as city, 4000 as amount
)
select *
from
(select uname ,age,product,amount
from order_info_pivot
)
pivot (sum(amount) as amount for product in('notebook','phone','tv'));
pivot原理说明
把整个表整理成3种列:维度列、旋转列、聚合列
1) pivot语法
SELECT *
FROM
(
SELECT <columns>
FROM <table>
)
PIVOT
(
<aggregate function>(<column being aggregated>)
FOR <column to pivot>
IN (<pivoted columns>)
);
2)旋转列
原来是某个列的行值,现在要从行值要转为列的字段(上例的product)
3) 聚合列
对于旋转后的新列,要聚合的值(上例的amount,除了旋转列和聚合列,默认都是维度列
2 case when实现
with order_info_pivot as (
select 'zhang3' as uname , 'tv' as product, 22 as age , 'bj' as city , 3000 as amount union all
select 'li4' as uname , 'notebook' as product, 41 as age, 'bj' as city , 8000 as amount union all
select 'wang5' as uname, 'phone' as product, 32 as age, 'sh' as city , 4000 as amount union all
select 'zhao6' as uname, 'notebook' as product, 22 as age, 'sz' as city, 3000 as amount union all
select 'zhang3' as uname , 'phone' as product, 22 as age, 'bj' as city, 3000 as amount union all
select 'li4' as uname, 'tv' as product, 41 as age, 'sz' as city, 4000 as amount
)
select uname,age,city,
max(case when product = 'notebook' then amount else null end) as notebook,
max(case when product = 'phone' then amount else null end) as phone,
max(case when product = 'tv' then amount else null end) as tv
from order_info_pivot
group by uname,age,city
案例二
将图1的数据格式转换为图3的格式
图3
1 pivot实现
with order_info_pivot as (
select 'zhang3' as uname , 'tv' as product, 22 as age , 'bj' as city , 3000 as amount union all
select 'li4' as uname , 'notebook' as product, 41 as age, 'bj' as city , 8000 as amount union all
select 'wang5' as uname, 'phone' as product, 32 as age, 'sh' as city , 4000 as amount union all
select 'zhao6' as uname, 'notebook' as product, 22 as age, 'sz' as city, 3000 as amount union all
select 'zhang3' as uname , 'phone' as product, 22 as age, 'bj' as city, 3000 as amount union all
select 'li4' as uname, 'tv' as product, 41 as age, 'sz' as city, 4000 as amount
)
select * from
(
select product,age,city,amount from order_info_pivot
) oi
pivot (
sum(amount) as amount ,avg(age) as age for city in ('bj','sh','sz' )
);
2 case when 实现
with order_info_pivot as (
select 'zhang3' as uname , 'tv' as product, 22 as age , 'bj' as city , 3000 as amount union all
select 'li4' as uname , 'notebook' as product, 41 as age, 'bj' as city , 8000 as amount union all
select 'wang5' as uname, 'phone' as product, 32 as age, 'sh' as city , 4000 as amount union all
select 'zhao6' as uname, 'notebook' as product, 22 as age, 'sz' as city, 3000 as amount union all
select 'zhang3' as uname , 'phone' as product, 22 as age, 'bj' as city, 3000 as amount union all
select 'li4' as uname, 'tv' as product, 41 as age, 'sz' as city, 4000 as amount
)
select product,
sum(case when city = 'bj' then amount else null end) as bj_amount,
avg(case when city = 'bj' then age else null end) as bj_age,
sum(case when city = 'sh' then amount else null end) as sh_amount,
avg(case when city = 'sh' then age else null end) as sh_age,
sum(case when city = 'sz' then amount else null end) as sz_amount,
avg(case when city = 'sz' then age else null end) as sz_age
from order_info_pivot
group by product
unpivot函数-列转行
UNPIVOT 是 SQL 中用于将列数据转换为行数据的操作,与 PIVOT 操作相反。它可以帮助你将列数据“解透视”,从而更容易处理和分析。
案例三
将图2的数据格式转换为图4的数据格式
图4
1 unpivot实现
with order_info_unpivot as (
select 'zhang3' as uname,22 as age,null as notebook,3000 as phone,3000 as tv union all
select 'zhao6' as uname,22 as age,3000 as notebook,null as phone,null as tv union all
select 'li4' as uname,41 as age,8000 as notebook,null as phone,4000 as tv union all
select 'wang5' as uname,32 as age,null as notebook,4000 as phone,null as tv
)
select uname,product,age,amount
from order_info_unpivot
unpivot
(
amount for product in(notebook,phone,tv)
)
使用说明
- amount 将原始表的值(各个产品的销量)赋值给amount
- for product:将列名称notebook,phone,tv转换为行数据,赋值给product列
- in(notebook,phone,tv):指定需要进行转换的列
2 union all实现
with order_info_unpivot as (
select 'zhang3' as uname,22 as age,null as notebook,3000 as phone,3000 as tv union all
select 'zhao6' as uname,22 as age,3000 as notebook,null as phone,null as tv union all
select 'li4' as uname,41 as age,8000 as notebook,null as phone,4000 as tv union all
select 'wang5' as uname,32 as age,null as notebook,4000 as phone,null as tv
)
select uname,
'notebook' as product,
age,
notebook as amount
from order_info_unpivot
where notebook is not null
union all
select uname,
'phone' as product,
age,
phone as amount
from order_info_unpivot
where phone is not null
union all
select uname,
'tv' as product,
age,
tv as amount
from order_info_unpivot
where tv is not null;
3 总结
通过以上案例可以看出无论是使用pivot函数将数据行转列,还是使用unpivot函数将数据列转行,都很简洁和易于理解。