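A developer handed me a SQL statement with the following shape: delete from B where ID in (select NID from H where guid='xxx');
The inner query returns only a single row, yet the whole delete took close to a minute. If the result value is written directly inside the parentheses, or if in is changed to =, the statement finishes in milliseconds.
However, if the inner query can return more than one row, the = variant fails and hard-coding the values means changing the application. I then tried rewriting the statement as a join, and that was also extremely fast.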
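For reference, the join rewrite of the developer's statement might look like the sketch below. It uses MySQL's multi-table DELETE syntax and only the table and column names quoted above (B, H, ID, NID, guid); 'xxx' is the placeholder from the original statement, not a real value.

```sql
-- Original form: IN with a subquery
-- DELETE FROM B WHERE ID IN (SELECT NID FROM H WHERE guid = 'xxx');

-- Join rewrite: delete only from B, using H to find the matching IDs.
-- Still works when H returns more than one row for the guid.
DELETE B
FROM B
INNER JOIN H ON B.ID = H.NID
WHERE H.guid = 'xxx';
```

Only B is named between DELETE and FROM, so rows in H are left untouched.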
Test tables: t1.id is indexed, t2.id is not.
mysql> select * from t1;
+------+------+----------+
| id   | name | class_id |
+------+------+----------+
|    1 | aa   |     NULL |
|    2 | aa   |     NULL |
|    3 | dd   |     NULL |
|    6 | cc   |     NULL |
+------+------+----------+
4 rows in set (0.00 sec)

mysql> select * from t2;
+------+---------+
| id   | name    |
+------+---------+
|    2 | myname2 |
|    6 | myname5 |
+------+---------+
2 rows in set (0.01 sec)
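To reproduce the test, the two tables could be created roughly as follows. The column types are assumptions, since the original DDL is not shown; the index name idx_id comes from the execution plan further down, and its key_len of 5 is consistent with a nullable INT column.

```sql
-- Assumed definitions: types are guesses, only the names come from the post.
CREATE TABLE t1 (
  id       INT,
  name     VARCHAR(20),
  class_id INT,
  KEY idx_id (id)          -- t1.id is indexed
);

CREATE TABLE t2 (
  id   INT,                -- t2.id has no index
  name VARCHAR(20)
);

INSERT INTO t1 (id, name, class_id) VALUES
  (1, 'aa', NULL), (2, 'aa', NULL), (3, 'dd', NULL), (6, 'cc', NULL);

INSERT INTO t2 (id, name) VALUES
  (2, 'myname2'), (6, 'myname5');
```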
Execution plans for the subquery form and for the join rewrite:
mysql> explain delete from t1 where id in (select id from t2 where name='aa');
+----+--------------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | DELETE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | t2 | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 50.00 | Using where |
+----+--------------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
2 rows in set (0.00 sec)
mysql> explain delete t1.* from t1 inner join t2 where t1.id=t2.id and t2.name='aa';
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | t2 | NULL | ALL | NULL | NULL | NULL | NULL | 2 | 50.00 | Using where |
| 1 | DELETE | t1 | NULL | ref | idx_id | idx_id | 5 | const | 1 | 100.00 | Using where |
+----+-------------+-------+------------+------+---------------+--------+---------+-------+------+----------+-------------+
2 rows in set (0.01 sec)
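The plan for the subquery form shows that t1 is scanned in full first. For each t1 row, MySQL then runs select id from t2 where name='aa' and t1.id=t2.id, and if that returns a value it deletes the t1 row with that id.
For the join rewrite, the optimizer is smart enough to pick the small table as the driving table and then uses the index on t1 to delete from t1. For the subquery form, the official documentation describes the evaluation as running from the outside in.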
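The DEPENDENT SUBQUERY row in the plan suggests the IN predicate is being evaluated as a correlated check for every outer row. A rough equivalent, written out by hand, is sketched below. This is an illustration of the behaviour seen in the plan, not the optimizer's literal internal rewrite, and other MySQL versions may optimize the statement differently.

```sql
-- What was written:
-- DELETE FROM t1 WHERE id IN (SELECT id FROM t2 WHERE name = 'aa');

-- Roughly how it behaves: the inner query is re-run for each row of t1,
-- with the outer t1.id pushed into the subquery's WHERE clause.
DELETE FROM t1
WHERE EXISTS (
  SELECT 1
  FROM t2
  WHERE t2.name = 'aa'
    AND t2.id = t1.id
);
```

With a large outer table and an unindexed inner table, this means one inner scan per outer row, which would explain the slow delete in the original case.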
To see how the two forms execute more directly, I turned on session-level profiling.
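The exact commands used to enable profiling are not shown in the original; a typical sequence would be something like the following. Note that SHOW PROFILE is deprecated in recent MySQL versions in favour of the Performance Schema, so this is a sketch for versions where it is still available.

```sql
-- Enable profiling for the current session only
SET SESSION profiling = 1;

-- ... run the statements to compare ...

-- List the profiled statements with their Query_ID
SHOW PROFILES;

-- Show the per-stage breakdown for one statement (3 is the Query_ID)
SHOW PROFILE FOR QUERY 3;
```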
mysql> show profiles;
+----------+------------+------------------------------------------------------------------------------+
| Query_ID | Duration | Query |
+----------+------------+------------------------------------------------------------------------------+
| 3 | 0.00137075 | delete from t1 where id in (select id from t2 where name='aa') |
| 4 | 0.00211725 | explain delete t1.* from t1 inner join t2 where t1.id=t2.id and t2.name='aa' |
| 5 | 0.00132050 | delete t1.* from t1 inner join t2 where t1.id=t2.id and t2.name='aa' |
+----------+------------+------------------------------------------------------------------------------+
mysql> show profile for query 3;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000388 |
| checking permissions | 0.000026 |
| checking permissions | 0.000008 |
| Opening tables       | 0.000105 |
| init                 | 0.000152 |
| System lock          | 0.000083 |
| updating             | 0.000084 |
| optimizing           | 0.000031 |
| statistics           | 0.000083 |
| preparing            | 0.000052 |
| executing            | 0.000013 |
| Sending data         | 0.000114 |
| executing            | 0.000009 |
| Sending data         | 0.000017 |
| executing            | 0.000005 |
| Sending data         | 0.000019 |
| executing            | 0.000006 |
| Sending data         | 0.000018 |
| end                  | 0.000019 |
| query end            | 0.000020 |
| closing tables       | 0.000021 |
| freeing items        | 0.000054 |
| cleaning up          | 0.000046 |
+----------------------+----------+
23 rows in set, 1 warning (0.01 sec)

mysql> show profile for query 5;
+--------------------------------+----------+
| Status                         | Duration |
+--------------------------------+----------+
| starting                       | 0.000360 |
| checking permissions           | 0.000013 |
| checking permissions           | 0.000007 |
| checking permissions           | 0.000004 |
| init                           | 0.000005 |
| Opening tables                 | 0.000048 |
| init                           | 0.000048 |
| deleting from main table       | 0.000022 |
| System lock                    | 0.000028 |
| optimizing                     | 0.000043 |
| statistics                     | 0.000144 |
| preparing                      | 0.000144 |
| executing                      | 0.000009 |
| Sending data                   | 0.000246 |
| deleting from reference tables | 0.000073 |
| end                            | 0.000012 |
| end                            | 0.000010 |
| query end                      | 0.000016 |
| closing tables                 | 0.000015 |
| freeing items                  | 0.000037 |
| cleaning up                    | 0.000039 |
+--------------------------------+----------+
21 rows in set, 1 warning (0.00 sec)
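The first thing I looked at was how many times each statement enters the Sending data state. The subquery form shows it 4 times, while the join form shows it once. The subquery form first scans the outer table in full, which yields 4 rows, then loops over those rows and correlates each one with the inner query. The inner query therefore runs 4 times, and each time its result is checked for a value; only if there is one does the delete happen.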
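Instead of counting the states by eye, the same numbers can be pulled from INFORMATION_SCHEMA.PROFILING, which holds the profiling data for the current session. This is a small sketch that assumes profiling is still enabled in the session that ran the statements.

```sql
-- Count how often each profiled statement entered the 'Sending data' state
SELECT QUERY_ID,
       COUNT(*)      AS sending_data_steps,
       SUM(DURATION) AS sending_data_time
FROM INFORMATION_SCHEMA.PROFILING
WHERE STATE = 'Sending data'
GROUP BY QUERY_ID
ORDER BY QUERY_ID;
```

A count that grows with the number of rows in the outer table is a hint that the subquery is being re-executed per row.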
Just a small note for the record, as I keep crawling slowly along the road of exploring the optimizer.