mysql派生表合并

前几天笔者在CR一段sql时,发现一处逻辑问题,原sql如下:

 select user_id, max(create_time) as create_time,is_success
 from user_login_log
 group by user_id

这里是想取分组后最新的一条记录,但事实上is_success字段与最新时间并不是同一条记录
于是,改写如下:

select user_id,create_time,is_success  from (select * from user_login_log  order by user_id, create_time desc) b 
group by b.user_id;

然鹅经过验证后,发现数据并不准确。
通过explain查看此sql的执行计划,发现只有一行记录,咨询dba后,是由于派生表合并造成。
派生表合并是mysql在5.7版本作的优化,在5.6版本中查看上面语句的执行计划,会有两行记录,其中一行为Derived表,即派生表。
具体操作如下:

  • Merge the derived table into the outer query block。(意即将派生表合并到外部查询中)
  • Materialize the derived table to an internal temporary table。(将派生表按内部临时表实现)
    举个栗子
优化前:SELECT * FROM (SELECT * FROM t1) AS derived_t1;
优化后:SELECT * FROM t1;

或者

优化前:
  SELECT *
  FROM t1 JOIN (SELECT t2.f1 FROM t2) AS derived_t2 ON t1.f2=derived_t2.f1
  WHERE t1.f1 > 0;
优化后:
  SELECT t1.*, t2.f1
  FROM t1 JOIN t2 ON t1.f2=t2.f1
  WHERE t1.f1 > 0;

在合并后,优化器将在外部查询快中执行order by子句,但是需满足一定条件,否则将忽略order by子句。

The optimizer propagates an ORDER BY clause in a derived table or view reference to the outer query block if these conditions are all true:
 - The outer query is not grouped or aggregated.(外部查询未分组或聚合)
 - The outer query does not specify DISTINCT, HAVING, or ORDER BY.(外部查询未指定distinct,having,order by)
 - The outer query has this derived table or view reference as the only source in the FROM clause.(外部查询将此派生表或试图作为from子句的唯一来源)
Otherwise, the optimizer ignores the ORDER BY clause.

而上面改写的sql中,外部查询就用到了group by子句,因此导致order by 子句被忽略。

如何解决此问题呢?
可以关闭派生表合并状态

SET optimizer_switch = 'derived_merge=off';

但是这个影响范围太大了,而且存在即合理,mysql之所以加入派生表合并是想减少查询开销,派生类是个临时表,开辟一个临时表的同时还要维护排序或者分组等等,都会影响效率,所以尽量不要去修改此参数。

mysql推荐使用如下方法来规避派生表合并

It is possible to disable merging by using in the subquery any constructs that prevent merging, although these are not as explicit in their effect on materialization. Constructs that prevent merging are the same for derived tables and view references:

 - Aggregate functions (SUM(), MIN(), MAX(), COUNT(), and so forth)
 - DISTINCT
 - GROUP BY
 - HAVING
 - LIMIT
 - UNION or UNION ALL
 - Subqueries in the select list
 - Assignments to user variables
 - Refererences only to literal values (in this case, there is no underlying table)

最终sql修改如下:

SELECT user_id,create_time,is_success  FROM  (select * from user_login_log  order by user_id, create_time DESC limit 100000) b 
GROUP BY b.user_id;
或者
SELECT user_id,create_time,is_success  FROM  (select distinct user_id,create_time,is_success from user_login_log  order by user_id, create_time DESC limit 100000) b 
GROUP BY b.user_id;

当然,limit的效率较distinct会高一些,而且order by limit 子句会使用优先队列排序算法。
下图为优化前后的查询结果:
在这里插入图片描述
在这里插入图片描述
类似的,在oracle中的概念叫做子查询展开。

mysql英文手册:

https://dev.mysql.com/doc/refman/5.7/en/upgrading-from-previous-series.html
https://dev.mysql.com/doc/refman/5.7/en/derived-table-optimization.html
https://dev.mysql.com/doc/refman/5.7/en/rewriting-subqueries.html
https://dev.mysql.com/doc/refman/5.7/en/limit-optimization.html
https://dev.mysql.com/doc/refman/5.7/en/semijoins.html

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值