数据信息
0: jdbc:hive2://Hbase:10000> select empno,ename,sal,deptno from t_employee;
+--------+---------+-------+---------+--+
| empno | ename | sal | deptno |
+--------+---------+-------+---------+--+
| 7369 | SMITH | 800 | 20 |
| 7499 | ALLEN | 1600 | 30 |
| 7521 | WARD | 1250 | 30 |
| 7566 | JONES | 2975 | 20 |
| 7654 | MARTIN | 1250 | 30 |
| 7698 | BLAKE | 2850 | 30 |
| 7782 | CLARK | 2450 | 10 |
| 7788 | SCOTT | 1500 | 20 |
| 7839 | KING | 5000 | 10 |
| 7844 | TURNER | 1500 | 30 |
| 7876 | ADAMS | 1100 | 20 |
| 7900 | JAMES | 950 | 30 |
| 7902 | FORD | 3000 | 20 |
| 7934 | MILLER | 1300 | 10 |
+--------+---------+-------+---------+--+
练习
1.查看每个部门员工的薪资排名:
语句:
select empno,ename,sal,rank() over(partition by deptno order by sal desc) from t_employee;
分析:rank() over(partition by deptno order by sal desc):查看每个部门内员工的薪资,并进行排序。partition by后的字段是哪个,就表示查看哪个信息,并对此字段的对应信息进行操作。
结果:
0: jdbc:hive2://Hbase:10000> select empno,ename,deptno,sal,rank() over(partition by deptno order by sal desc) from t_employee;
+--------+---------+---------+-------+----------------+--+
| empno | ename | deptno | sal | rank_window_0 |
+--------+---------+---------+-------+----------------+--+
| 7839 | KING | 10 | 5000 | 1 |
| 7782 | CLARK | 10 | 2450 | 2 |
| 7934 | MILLER | 10 | 1300 | 3 |
| 7902 | FORD | 20 | 3000 | 1 |
| 7566 | JONES | 20 | 2975 | 2 |
| 7788 | SCOTT | 20 | 1500 | 3 |
| 7876 | ADAMS | 20 | 1100 | 4 |
| 7369 | SMITH | 20 | 800 | 5 |
| 7698 | BLAKE | 30 | 2850 | 1 |
| 7499 | ALLEN | 30 | 1600 | 2 |
| 7844 | TURNER | 30 | 1500 | 3 |
| 7654 | MARTIN | 30 | 1250 | 4 |
| 7521 | WARD | 30 | 1250 | 4 |
| 7900 | JAMES | 30 | 950 | 6 |
+--------+---------+---------+-------+----------------+--+
2.查询员工信息,展示该员工所在部门薪资排名?
此题和第一题的结果差不多,只是第1题是有序的,并且薪资相同的会进行并列排名。
而第2题只是查询每个员工在该部门中薪资排第几。
语句:
select e1.empno,e1.ename,e1.sal,e1.deptno,
sum(1) over (partition by e1.deptno order by e1.sal desc rows between unbounded preceding and current row )
from t_employee e1 order by e1.deptno;
无限的 前面 无限的 后面
rows between unbounded preceding and unbounded following
无限的 当前行
rows between unbounded preceding and current row
当前行 无限的 后续的
rows between current row and unbounded following
前1个 后一个
rows between -1 preceding and 1 following
前1个 后一个
rows between -1 preceding and 1 following
分析:该语句的思想是,对当前员工所在部门进行薪资排序,并且查看这一行之前的数据为几条,也就是当前员工在该部门的薪资排行。 查询的结果如果有相同的薪资,也会进行排序,默认为字典排序。
结果:
+-----------+-----------+---------+------------+---------------+--+
| e1.empno | e1.ename | e1.sal | e1.deptno | sum_window_0 |
+-----------+-----------+---------+------------+---------------+--+
| 7839 | KING | 5000 | 10 | 1 |
| 7782 | CLARK | 2450 | 10 | 2 |
| 7934 | MILLER | 1300 | 10 | 3 |
| 7876 | ADAMS | 1100 | 20 | 4 |
| 7369 | SMITH | 800 | 20 | 5 |
| 7788 | SCOTT | 1500 | 20 | 3 |
| 7566 | JONES | 2975 | 20 | 2 |
| 7902 | FORD | 3000 | 20 | 1 |
| 7654 | MARTIN | 1250 | 30 | 4 |
| 7521 | WARD | 1250 | 30 | 5 |
| 7499 | ALLEN | 1600 | 30 | 2 |
| 7698 | BLAKE | 2850 | 30 | 1 |
| 7900 | JAMES | 950 | 30 | 6 |
| 7844 | TURNER | 1500 | 30 | 3 |
+-----------+-----------+---------+------------+---------------+--+
3.查询员工信息,展示该员工所在部门的平均薪资
语句:
select e.empno,e.ename,e.deptno,avg(e.sal) over(partition by e.deptno) from t_employee e;
分析:该语句是查询,每个员工的信息和他所在部门的平均薪资。
开窗函数:avg(e.sal) over(partition by e.deptno) 查看部门的平均薪资。
结果:
+----------+----------+-----------+---------------+--+
| e.empno | e.ename | e.deptno | avg_window_0 |
+----------+----------+-----------+---------------+--+
| 7934 | MILLER | 10 | 2916.666667 |
| 7839 | KING | 10 | 2916.666667 |
| 7782 | CLARK | 10 | 2916.666667 |
| 7876 | ADAMS | 20 | 1875 |
| 7788 | SCOTT | 20 | 1875 |
| 7369 | SMITH | 20 | 1875 |
| 7566 | JONES | 20 | 1875 |
| 7902 | FORD | 20 | 1875 |
| 7844 | TURNER | 30 | 1566.666667 |
| 7499 | ALLEN | 30 | 1566.666667 |
| 7698 | BLAKE | 30 | 1566.666667 |
| 7654 | MARTIN | 30 | 1566.666667 |
| 7521 | WARD | 30 | 1566.666667 |
| 7900 | JAMES | 30 | 1566.666667 |
+----------+----------+-----------+---------------+--+