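Reference article: https://blog.csdn.net/qq_25221835/article/details/82762416
Syntax: row_number() over(partition by <grouping column> order by <sort column> desc)
row_number() over() numbers the rows of a result set, optionally within groups, in the order given. The examples below were tested on the Impala engine.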
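When row_number() over() is used, the partitioning and ordering inside over() are evaluated after the query's WHERE and GROUP BY (and HAVING) clauses, but before the query's final ORDER BY. This is why the final ORDER BY can sort by the generated row number, while WHERE cannot filter on it in the same SELECT.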
Example 1:
Table data:
CREATE TABLE IF NOT EXISTS TEST_ROW_NUMBER_OVER (
  id INT,
  name STRING,
  age INT,
  salary INT
);
insert into TEST_ROW_NUMBER_OVER(id,name,age,salary) values(1,'a',10,8000);
insert into TEST_ROW_NUMBER_OVER(id,name,age,salary) values(1,'a2',11,6500);
insert into TEST_ROW_NUMBER_OVER(id,name,age,salary) values(2,'b',12,13000);
insert into TEST_ROW_NUMBER_OVER(id,name,age,salary) values(2,'b2',13,4500);
insert into TEST_ROW_NUMBER_OVER(id,name,age,salary) values(3,'c',14,3000);
insert into TEST_ROW_NUMBER_OVER(id,name,age,salary) values(3,'c2',15,20000);
insert into TEST_ROW_NUMBER_OVER(id,name,age,salary) values(4,'d',16,30000);
insert into TEST_ROW_NUMBER_OVER(id,name,age,salary) values(5,'d2',17,1800);
First pass: number the whole result set by salary (no grouping)
SELECT id, name, age, salary, row_number() OVER (ORDER BY salary DESC) rn
FROM TEST_ROW_NUMBER_OVER t;
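Result (worked out from the sample rows above; without an outer ORDER BY the row order is not guaranteed):
id  name  age  salary  rn
4   d     16   30000   1
3   c2    15   20000   2
2   b     12   13000   3
1   a     10   8000    4
1   a2    11   6500    5
2   b2    13   4500    6
3   c     14   3000    7
5   d2    17   1800    8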
Second pass: number the rows within each id group by salary
SELECT id, name, age, salary, row_number() OVER (PARTITION BY id ORDER BY salary DESC) rank
FROM TEST_ROW_NUMBER_OVER t;
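Result (worked out from the sample rows above; the numbering restarts at 1 for each id):
id  name  age  salary  rank
1   a     10   8000    1
1   a2    11   6500    2
2   b     12   13000   1
2   b2    13   4500    2
3   c2    15   20000   1
3   c     14   3000    2
4   d     16   30000   1
5   d2    17   1800    1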
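To compare the two numberings side by side without an Impala cluster, the following is a minimal sketch that loads the same sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite (3.25 or later, for window functions) is swapped in for Impala here as an assumption, and the helper names load_sample and numbering are hypothetical, not part of the original article.

```python
import sqlite3

# Same sample rows as the INSERT statements above.
ROWS = [
    (1, "a", 10, 8000), (1, "a2", 11, 6500),
    (2, "b", 12, 13000), (2, "b2", 13, 4500),
    (3, "c", 14, 3000), (3, "c2", 15, 20000),
    (4, "d", 16, 30000), (5, "d2", 17, 1800),
]


def load_sample():
    """Build an in-memory copy of TEST_ROW_NUMBER_OVER (SQLite stand-in for Impala)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TEST_ROW_NUMBER_OVER (id INT, name TEXT, age INT, salary INT)"
    )
    conn.executemany("INSERT INTO TEST_ROW_NUMBER_OVER VALUES (?, ?, ?, ?)", ROWS)
    return conn


def numbering(conn):
    """Return (name, rn_all, rn_in_id): global numbering next to per-id numbering."""
    sql = """
        SELECT name,
               row_number() OVER (ORDER BY salary DESC) AS rn_all,
               row_number() OVER (PARTITION BY id ORDER BY salary DESC) AS rn_in_id
        FROM TEST_ROW_NUMBER_OVER
        ORDER BY rn_all
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in numbering(load_sample()):
        print(row)
```

The outer ORDER BY rn_all is only there to make the printed order stable; it does not change the numbers that either window assigns.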
Third pass: keep only the row numbered 1 in each group
WITH a AS (
  SELECT id, name, age, salary, row_number() OVER (PARTITION BY id ORDER BY salary DESC) rank
  FROM TEST_ROW_NUMBER_OVER t
)
SELECT * FROM a WHERE a.rank < 2;
Or, with a subquery:
SELECT * FROM (
  SELECT id, name, age, salary, row_number() OVER (PARTITION BY id ORDER BY salary DESC) rank
  FROM TEST_ROW_NUMBER_OVER
) A WHERE A.rank < 2;
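Result (worked out from the sample rows above; the highest-paid row of each id):
id  name  age  salary  rank
1   a     10   8000    1
2   b     12   13000   1
3   c2    15   20000   1
4   d     16   30000   1
5   d2    17   1800    1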
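The "row numbered 1 per group" pattern extends naturally to "top N per group". Below is a rough sketch of that idea, again using an in-memory SQLite database (3.25 or later) as an assumed stand-in for Impala. The function names load_sample and top_n_per_id and the parameter n are hypothetical.

```python
import sqlite3

# Same sample rows as the INSERT statements above.
ROWS = [
    (1, "a", 10, 8000), (1, "a2", 11, 6500),
    (2, "b", 12, 13000), (2, "b2", 13, 4500),
    (3, "c", 14, 3000), (3, "c2", 15, 20000),
    (4, "d", 16, 30000), (5, "d2", 17, 1800),
]


def load_sample():
    """Build an in-memory copy of TEST_ROW_NUMBER_OVER (SQLite stand-in for Impala)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TEST_ROW_NUMBER_OVER (id INT, name TEXT, age INT, salary INT)"
    )
    conn.executemany("INSERT INTO TEST_ROW_NUMBER_OVER VALUES (?, ?, ?, ?)", ROWS)
    return conn


def top_n_per_id(conn, n=1):
    """Return the n highest-salary rows of each id, numbered within the id."""
    # The row number is computed in the inner query; only the outer query
    # may filter on it, because WHERE runs before window functions.
    sql = """
        SELECT id, name, age, salary, rn
        FROM (
            SELECT id, name, age, salary,
                   row_number() OVER (PARTITION BY id ORDER BY salary DESC) AS rn
            FROM TEST_ROW_NUMBER_OVER
        ) AS numbered
        WHERE rn <= ?
        ORDER BY id, rn
    """
    return conn.execute(sql, (n,)).fetchall()


if __name__ == "__main__":
    for row in top_n_per_id(load_sample(), n=1):
        print(row)
```

With n=1 this matches the condition rank < 2 used above; a larger n keeps more rows per id, up to the size of the group.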
Filter, then number: select the rows with age between 13 and 16 and number them by salary
SELECT id, name, age, salary, row_number() OVER (ORDER BY salary DESC) rank
FROM TEST_ROW_NUMBER_OVER t
WHERE age BETWEEN 13 AND 16;
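Result (worked out from the sample rows above; without an outer ORDER BY the row order is not guaranteed):
id  name  age  salary  rank
4   d     16   30000   1
3   c2    15   20000   2
2   b2    13   4500    3
3   c     14   3000    4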
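The rank values in the result show that over(order by salary desc) is evaluated after where age between 13 and 16: the four surviving rows are numbered 1 to 4 with no gaps. Had the numbering been applied to the full table first, the same rows would carry the numbers 1, 2, 6, and 7.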
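The difference between filtering before and after numbering can be checked with a small experiment. This sketch runs both variants against an in-memory SQLite database (3.25 or later), which is an assumed substitute for Impala. The helper names load_sample, filter_then_number, and number_then_filter are hypothetical.

```python
import sqlite3

# Same sample rows as the INSERT statements above.
ROWS = [
    (1, "a", 10, 8000), (1, "a2", 11, 6500),
    (2, "b", 12, 13000), (2, "b2", 13, 4500),
    (3, "c", 14, 3000), (3, "c2", 15, 20000),
    (4, "d", 16, 30000), (5, "d2", 17, 1800),
]


def load_sample():
    """Build an in-memory copy of TEST_ROW_NUMBER_OVER (SQLite stand-in for Impala)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TEST_ROW_NUMBER_OVER (id INT, name TEXT, age INT, salary INT)"
    )
    conn.executemany("INSERT INTO TEST_ROW_NUMBER_OVER VALUES (?, ?, ?, ?)", ROWS)
    return conn


def filter_then_number(conn):
    """WHERE and the window share one SELECT: only the filtered rows are numbered."""
    sql = """
        SELECT name, row_number() OVER (ORDER BY salary DESC) AS rn
        FROM TEST_ROW_NUMBER_OVER
        WHERE age BETWEEN 13 AND 16
        ORDER BY rn
    """
    return conn.execute(sql).fetchall()


def number_then_filter(conn):
    """Number all rows in a subquery first, then filter: the numbers keep their gaps."""
    sql = """
        SELECT name, rn
        FROM (
            SELECT name, age, row_number() OVER (ORDER BY salary DESC) AS rn
            FROM TEST_ROW_NUMBER_OVER
        ) AS numbered
        WHERE age BETWEEN 13 AND 16
        ORDER BY rn
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = load_sample()
    print("filter, then number:", filter_then_number(conn))
    print("number, then filter:", number_then_filter(conn))
```

If the position within the full table is what matters, the subquery form is the one to use; if the rows should be renumbered after filtering, the single SELECT is enough.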