需求:找出雇员(emp)表,各个部门工资前三个员工信息
来源:Oracle,db2,Sql server等数据库,这类型数据库被金融公司使用,做一些报表。
hive相关支出:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics
select ename, deptno,sal,
ROW_NUMBER() OVER(PARTITION BY deptno ORDER BY sal desc) row_num
from tb_emp
结果
ename deptno sal row_num
KING 10 5000.0 1
CLARK 10 2450.0 2
MILLER 10 1300.0 3
SCOTT 20 3000.0 1
FORD 20 3000.0 2
JONES 20 2975.0 3
ADAMS 20 1100.0 4
SMITH 20 800.0 5
BLAKE 30 2850.0 1
ALLEN 30 1600.0 2
TURNER 30 1500.0 3
MARTIN 30 1250.0 4
WARD 30 1250.0 5
JAMES 30 950.0 6
问题:20号部门两人薪资一样
使用rank
select ename, deptno,sal,
RANK() OVER(PARTITION BY deptno ORDER BY sal desc) row_num
from tb_emp
问题:直接从1跳到了3
解决:使用DENSE_RANK
select ename, deptno,sal,
DENSE_RANK() OVER(PARTITION BY deptno ORDER BY sal desc) row_num
from tb_emp
结果:
ename deptno sal row_num
KING 10 5000.0 1
CLARK 10 2450.0 2
MILLER 10 1300.0 3
SCOTT 20 3000.0 1
FORD 20 3000.0 1
JONES 20 2975.0 2
ADAMS 20 1100.0 3
SMITH 20 800.0 4
BLAKE 30 2850.0 1
ALLEN 30 1600.0 2
TURNER 30 1500.0 3
MARTIN 30 1250.0 4
WARD 30 1250.0 4
JAMES 30 950.0 5