mysql分组排序并取各分组中前几个数据
需求
今天碰到了一个需求,要求是取mysql数据表中每月完成任务数排名前十的员工信息。数据表的字段如下:
table:employee_work_cnt
id varchar 数据表id
employee_id varchar 员工id
work_num int 完成任务数
month varchar 月份信息
字符编码集:utf8mb4(重点,要考的)
思考实现
看到需求首先想到的就是一些关键字group by、order by,但是经过尝试很明显无法达到想要的结果。这个时候就开始百度了(感谢百度,感谢各位大佬)。果然在一位前辈那找到了思路和办法(原文),于是我有了以下sql。
set @row_num=0;
set @month_group='';
select
e.id,e.month,e.employee_id,e.work_num,w.row_number
from employee_work_cnt as e
inner join
(
select
id,@month_group:=month,
case when @month_group = month
then @row_num:=@row_num+1
else @row_num:=1
end as row_number
from employee_work_cnt
order by month,work_num desc
) as w
on e.id = w.id
where w.row_number < 11
基本思路就是首先将所有数据按照月份和完成任务数两个字段进行降序排序得到排序序号和id,这里的变量*@row_num用来标记分组组内序号,变量@month_group*记录分组month的变化。然后将排序结果根据id与原有数据内联,最后按照排序序号过滤即可得到结果。总结就是六个字先排序后过滤。
sql在mysql图形化界面运行通过,结果完美。那么写入java代码:
//获取jdbc连接对象connection
Connection jdbcConnection = MySQLConnectionUtil.getJDBCConnection();
//sql语句
String sql = "select " +
"e.id,e.month,e.employee_id,e.work_num,w.row_number " +
"from employee_work_cnt as e " +
"inner join " +
"( " +
"select " +
"id,@month_group:=month, " +
"case when @month_group = month " +
"then @row_num:=@row_num+1 " +
"else @row_num:=1 " +
"end as row_number " +
"from employee_work_cnt " +
"order by month,work_num desc " +
") as w " +
"on e.id = w.id " +
"where w.row_number < 11";
try {
Statement statement1 = jdbcConnection.createStatement();
//执行声明sql变量的sql语句
statement1.execute("set @row_num=0;");
statement1.execute("set @month_group='';");
//执行查询语句
PreparedStatement statement = jdbcConnection.prepareStatement(sql);
ResultSet resultSet = statement.executeQuery();
while (resultSet.next()) {
String id = resultSet.getString(1);
String month = resultSet.getString(2);
String employeeId = resultSet.getString(3);
int workNum = resultSet.getInt(4);
int rowNum = resultSet.getInt(5);
System.out.println(id + "," + month + "," + employeeId + "," +
workNum + "," + rowNum);
}
//关闭相关资源,这里应该写在finally语句块里,偷懒了
resultSet.close();
statement.close();
statement1.close();
jdbcConnection.close();
}catch (Exception e) {
e.printStackTrace();
}
执行,报错:java.sql.SQLException: Illegal mix of collations (utf8mb4_unicode_ci,IMPLICIT) and (utf8mb4_general_ci,IMPLICIT) for operation ‘=’ ,果然没有那么顺利,这里报错指的是两者字符编码集不一致造成了错误。考虑到sql语句已经在图像化界面运行正常并得到了想要的运行结果,那么sql语句是没有问题的。字符集让我想到了在数据连接url后面经常添加的设置 &characterEncoding=UTF-8 ,添加运行,果然恢复正常并得到了结果,需求解决,完美。
总结
实现分组排序并取前几个的数据,其实实现的思路就是先给表内数据分组排序并标记各排序的序号,然后依据此排序序号过滤数据即可得到想要结果,而在hive、Oracle等数据库中都有类似 row_number 的组内排序函数,但是mysql中并没有,所以实现起来多个得到组内排序序号的步骤。这里是使用sql变量去实现这样的逻辑,或许也可以考虑自定义函数实现 row_number 逻辑。