[待解决]MySQL 优化-连接查询改嵌套查询一例

最新推荐文章于 2020-09-22 17:57:18 发布

orangleliu

最新推荐文章于 2020-09-22 17:57:18 发布

阅读量848

点赞数

分类专栏： MySQL 文章标签： mysql 优化连接子查询原理

本文为orangleliu (http://blog.csdn.net/orangleliu/)原创文章，自由传播, 文章转载请声明, 多谢。

本文链接：https://blog.csdn.net/orangleliu/article/details/54644739

版权

MySQL 专栏收录该内容

28 篇文章 0 订阅

订阅专栏

线上抓了一条慢查询日志，经过改写之后，速度从7s提升到0.1s以下，但是没有真正明白其中的原理，先记录之，以后再来解析。

前置条件

环境：MySQL5.6.14，Centos6.6
表结构

mysql> SHOW CREATE TABLE count_statusbyhour \G
*************************** 1. row ***************************
       Table: count_statusbyhour
Create Table: CREATE TABLE `count_statusbyhour` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `username` varchar(30) NOT NULL,
  `acctstarttime` int(11) NOT NULL,
  `acctstoptime` int(11) NOT NULL,
  `acctinputoctets` bigint(20) NOT NULL,
  `acctoutputoctets` bigint(20) NOT NULL,
  `addtime` datetime NOT NULL,
  PRIMARY KEY (`id`),
  KEY `count_status_by_hours_adt` (`addtime`)
) ENGINE=InnoDB AUTO_INCREMENT=73424410 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

数据在30w行左右，用来保存用户2天内的流量数据。

业务SQL和优化

优化前

需要优化的SQL, 一般执行时间在7～8秒之间

mysql> EXPLAIN SELECT sum(s1.acctinputoctets - s2.acctinputoctets), sum(s1.acctoutputoctets-s2.acctoutputoctets) 
FROM count_statusbyhour s1  LEFT JOIN count_statusbyhour s2 ON s1.username = s2.username WHERE  s1.addtime > '2017-01-20 21:55:01'  
AND s1.addtime < '2017-01-20 22:05:01' AND s2.addtime BETWEEN '2017-01-20 20:55:01' AND '2017-01-20 21:25:01' \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: s2
         type: range
possible_keys: count_status_by_hours_adt
          key: count_status_by_hours_adt
      key_len: 5
          ref: NULL
         rows: 7841
        Extra: Using index condition
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: s1
         type: range
possible_keys: count_status_by_hours_adt
          key: count_status_by_hours_adt
      key_len: 5
          ref: NULL
         rows: 7862
        Extra: Using index condition; Using where; Using join buffer (Block Nested Loop)
2 rows in set (0.00 sec)

这个SQL的业务作用：统计前一段时间所有用户流量使用情况，也就是用户流量使用的总和。

优化后

修改之后的SQL, 执行时间0.1s以下左右

mysql> EXPLAIN
    -> SELECT sum(s3.acctinputoctets - s4.acctinputoctets), sum(s3.acctoutputoctets-s4.acctoutputoctets) FROM
    -> (SELECT acctinputoctets, acctoutputoctets, addtime, username FROM count_statusbyhour s1 WHERE s1.addtime > '2017-01-20 21:55:01' AND s1.addtime < '2017-01-20 22:05:01') s3 LEFT JOIN
    -> (SELECT acctinputoctets, acctoutputoctets, addtime, username FROM count_statusbyhour s2 WHERE s2.addtime BETWEEN '2017-01-20 20:55:01' AND '2017-01-20 21:25:01') s4 ON  s3.username=s4.username \G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived2>
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 7862
        Extra: NULL
*************************** 2. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived3>
         type: ref
possible_keys: <auto_key0>
          key: <auto_key0>
      key_len: 92
          ref: s3.username
         rows: 10
        Extra: NULL
*************************** 3. row ***************************
           id: 3
  select_type: DERIVED
        table: s2
         type: range
possible_keys: count_status_by_hours_adt
          key: count_status_by_hours_adt
      key_len: 5
          ref: NULL
         rows: 7841
        Extra: Using index condition
*************************** 4. row ***************************
           id: 2
  select_type: DERIVED
        table: s1
         type: range
possible_keys: count_status_by_hours_adt
          key: count_status_by_hours_adt
      key_len: 5
          ref: NULL
         rows: 7862
        Extra: Using index condition
4 rows in set (0.00 sec)

查询计划能看到优化，但是具体的IO估算，查询代价估算不明白。

profiling

针对优化前的SQL进行 profiling

mysql> set profiling=1;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> SELECT sum(s1.acctinputoctets - s2.acctinputoctets), sum(s1.acctoutputoctets-s2.acctoutputoctets) FROM count_statusbyhour s1 LEFT JOIN count_statusbyhour s2 ON s1.username = s2.username WHERE  s1.addtime > '2017-01-20 21:55:01' AND s1.addtime < '2017-01-20 22:05:01' AND s2.addtime BETWEEN '2017-01-20 20:55:01' AND '2017-01-20 21:25:01';
+----------------------------------------------+----------------------------------------------+
| sum(s1.acctinputoctets - s2.acctinputoctets) | sum(s1.acctoutputoctets-s2.acctoutputoctets) |
+----------------------------------------------+----------------------------------------------+
|                                   6773957325 |                                  24018528747 |
+----------------------------------------------+----------------------------------------------+
1 row in set (7.57 sec)

mysql> show profile cpu,block io  for query 1;
+----------------------+----------+----------+------------+--------------+---------------+
| Status               | Duration | CPU_user | CPU_system | Block_ops_in | Block_ops_out |
+----------------------+----------+----------+------------+--------------+---------------+
| starting             | 0.000112 | 0.000000 |   0.000000 |            0 |             0 |
| checking permissions | 0.000013 | 0.000000 |   0.000000 |            0 |             0 |
| checking permissions | 0.000010 | 0.000000 |   0.000000 |            0 |             0 |
| Opening tables       | 0.000026 | 0.000000 |   0.000000 |            0 |             0 |
| init                 | 0.000063 | 0.000000 |   0.000000 |            0 |             0 |
| System lock          | 0.000019 | 0.000000 |   0.000000 |            0 |             0 |
| optimizing           | 0.000031 | 0.000000 |   0.000000 |            0 |             0 |
| statistics           | 0.000138 | 0.000000 |   0.000000 |            0 |             0 |
| preparing            | 0.000055 | 0.000000 |   0.000000 |            0 |             0 |
| executing            | 0.000011 | 0.000000 |   0.000000 |            0 |             0 |
| Sending data         | 7.568715 | 7.559851 |   0.019997 |            0 |          3480 |
| end                  | 0.000017 | 0.000000 |   0.000000 |            0 |             0 |
| query end            | 0.000006 | 0.000000 |   0.000000 |            0 |             0 |
| closing tables       | 0.000008 | 0.000000 |   0.000000 |            0 |             0 |
| freeing items        | 0.000014 | 0.000000 |   0.000000 |            0 |             0 |
| logging slow query   | 0.000044 | 0.000000 |   0.000000 |            0 |             8 |
| cleaning up          | 0.000038 | 0.000000 |   0.000000 |            0 |             0 |
+----------------------+----------+----------+------------+--------------+---------------+
17 rows in set, 1 warning (0.00 sec)

从 profiling 的结果来看，几乎所有时间都花费在 Sending data 的 Block_ops_out 操作上，Sending data 的含义不是把数据从MySQL服务发送给客户端，而是说把数据从 引擎层 发送给 Server层 。应该是读取大量的数据导致 sending data 时间非常长。

引用下叶老师的PPT

长时间的Sending data
• 从引擎层读取数据返回给Server端的状态
• 长时间存在的原因
    – 没适当的索引，查询效率低 
    – 读取大量数据，读取缓慢 
    – 系统负载高，读取缓慢

长时间的Sending data
• 怎么办
   – 加上合适的索引
   – 或者改写SQL，提高效率
   – 增加LIMIT限制每次读取数据量 
   – 检查&升级I/O设备性能

在看看看更详细的

mysql> show  profile all for query 1 \G

*************************** 11. row ***************************
             Status: Sending data
           Duration: 7.568715
           CPU_user: 7.559851
         CPU_system: 0.019997
  Context_voluntary: 294
Context_involuntary: 796
       Block_ops_in: 0
      Block_ops_out: 3480
      Messages_sent: 0
  Messages_received: 0
  Page_faults_major: 0
  Page_faults_minor: 0
              Swaps: 0
    Source_function: exec
        Source_file: sql_executor.cc
        Source_line: 187

CONTEXT_VOLUNTARY and CONTEXT_INVOLUNTARY indicate how many voluntary and involuntary context switches occurred.
BLOCK_OPS_IN and BLOCK_OPS_OUT indicate the number of block input and output operations

读取的block太多，上下文切换的次数多，更深层次的理解还没明白，从执行计划上看用到了如下的特性。

MySQL新特性 Index Condition Pushdown