需求:获取分组后,组内排名第一或最后的数据行(完整信息)。
1.利用子查询
,desc 或者asc,然后GROUP BY 某个维度例如userId,
再select,会默认取分组后的排序的第一条数据
。
2.利用子查询,然后在外层查询进行GROUP BY排序,再使用DISTINCT 某关键字去重。
失效原因:
order by在子查询中被忽略优化了。
A “table” (and subquery in the FROM clause too) is - according to the SQL standard - an unordered set of rows. Rows in a table (or in a subquery in the FROM clause) do not come in any specific order. That’s why the optimizer can ignore the ORDER BY clause that you have specified. In fact, SQL standard does not even allow the ORDER BY clause to appear in this subquery (we allow it, because ORDER BY … LIMIT … changes the result, the set of rows, not only their order). You need to treat the subquery in the FROM clause, as a set of rows in some unspecified and undefined order, and put the ORDER BY on the top-level SELECT.
大致意思:SQL标准中,table的定义是一个未排序的数据集合,而一个SQL子查询是一个临时的table,根据这个定义,子查询中的order by会被忽略。同时,官方回复也给出了解决方案:将子查询的order by移动到最外层的select语句中。
解决方案
1.Order By从子查询中移至到外层查询
2.采用 row_number() over + partition by去排序分组,可以保留数据完整信息
3.使用limit 10000000 ,limit+大数 强行排序
网上都说Mysql5.7,在子查询的ORDER BY子句后面,必须加上LIMIT 10000000,实测不加LIMIT 结果确实会有问题
未加Limit,没有排序效果
加上后跟预期正常