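I had used MySQL for a long time without ever finding it particularly slow. Recently, though, one page in a newly developed system became extremely slow to open, sometimes failing to load even after a minute. After checking the system, I traced the problem to one slow SQL statement:
SELECT
    COUNT(1) AS value,
    document.sourceType AS lable
FROM
    document
WHERE
    document.id IN (
        SELECT
            document_id
        FROM
            subject_document
        WHERE
            subject_id = 345
    )
GROUP BY
    document.sourceType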
At a glance nothing looks seriously wrong, except that what should have been an inner join was written as IN. The execution plan is:
+----+--------------------+------------------+------+-------------------------------------------+---------------------------+---------+------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------------+------+-------------------------------------------+---------------------------+---------+------------+--------+----------------------------------------------+
| 1 | PRIMARY | document | ALL | NULL | NULL | NULL | NULL | 117287 | Using where; Using temporary; Using filesort |
| 2 | DEPENDENT SUBQUERY | subject_document | ref | uk_subject_id_document_id,idx_document_id | uk_subject_id_document_id | 10 | const,func | 1 | Using where; Using index |
+----+--------------------+------------------+------+-------------------------------------------+---------------------------+---------+------------+--------+----------------------------------------------+
2 rows in set (0.41 sec)
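The nested query uses an index (uk_subject_id_document_id), but the outer query does not: type ALL means a full scan of document, estimated at 117287 rows. Because the subquery is marked DEPENDENT SUBQUERY, it is evaluated once for each outer row, which is why the query slows down so badly.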
(
select * from tbl_account where exists(select * from tbl_character where fld_btLevel>40 and tbl_account.fld_loginid=tbl_character.fld_dwLogId);
select * from tbl_account where fld_loginid in(select fld_dwLogId from tbl_character where fld_btLevel>40);
select * from tbl_character where exists(select * from tbl_account where fld_btLevel>40 and tbl_account.fld_loginid=tbl_character.fld_dwLogId);
select * from tbl_character where fld_dwLogId in(select fld_loginid from tbl_account where fld_btLevel>40);
No wonder the first two statements were so much faster than the last two: at the time, fld_loginid on tbl_account was indexed, while fld_dwLogId on tbl_character was not.
)
For comparison, here is the execution plan after rewriting the query as an inner join:
+----+-------------+-------+--------+-------------------------------------------+---------------------------+---------+----------------------+------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-------------------------------------------+---------------------------+---------+----------------------+------+-----------------------------------------------------------+
| 1 | SIMPLE | sd | ref | uk_subject_id_document_id,idx_document_id | uk_subject_id_document_id | 5 | const | 455 | Using where; Using index; Using temporary; Using filesort |
| 1 | SIMPLE | d | eq_ref | PRIMARY | PRIMARY | 4 | pscms.sd.document_id | 1 | |
+----+-------------+-------+--------+-------------------------------------------+---------------------------+---------+----------------------+------+-----------------------------------------------------------+
2 rows in set (0.05 sec)
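Here the indexes are used: sd is read through uk_subject_id_document_id (an estimated 455 rows), and each matching row of d is fetched by primary key (eq_ref), so the query is very fast.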
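The rewritten statement itself is not shown above. A minimal sketch of what it could look like follows; the aliases sd and d are taken from the plan, and the rest is reconstructed from the original IN query, so treat it as an assumption rather than the exact statement that was run.

```sql
-- Sketch of the inner-join rewrite (reconstructed, not the author's exact text).
-- Assumption: (subject_id, document_id) is unique in subject_document, as the
-- index name uk_subject_id_document_id suggests. Without that uniqueness the
-- join could return a document more than once and inflate COUNT(1).
SELECT
    COUNT(1) AS value,
    d.sourceType AS lable
FROM
    subject_document sd
    INNER JOIN document d ON d.id = sd.document_id
WHERE
    sd.subject_id = 345
GROUP BY
    d.sourceType;
```

Prefixing the statement with EXPLAIN should produce a plan of the shape shown above, with sd read first and d joined by primary key.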
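Since this program was ported from code that ran on an Oracle database, I specifically checked how the same statement performs on Oracle. It ran very quickly there. I forgot to look at the execution plan, but it evidently used the indexes. Oracle's optimizer appears to be considerably more capable.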
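Finally, I looked through my older code and found a few more places that use IN. Even when the subquery in the nested query above returns an empty result, the statement still takes a long time. What if the condition is written directly as IN (1)? Then it is fast again; if even that were slow, IN would be good for nothing. The execution plan for this form: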
+----+-------------+----------+-------+---------------+---------+---------+------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+---------+---------+------+------+----------------------------------------------+
| 1 | SIMPLE | document | range | PRIMARY | PRIMARY | 4 | NULL | 2 | Using where; Using temporary; Using filesort |
+----+-------------+----------+-------+---------------+---------+---------+------+------+----------------------------------------------+
1 row in set (0.00 sec)
The index is used here as well (a range scan on PRIMARY), so the query is fast again.
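For reference, the literal-list form described above might be written as follows. This is a sketch built from the original query; the value 1 comes from the text, and the exact list that produced the plan shown is not given, so it is an assumption here.

```sql
-- Sketch: the same aggregate with a literal IN list instead of a subquery.
-- With constants in the list there is nothing to re-evaluate per row, so the
-- optimizer can look up document.id directly through the PRIMARY key.
EXPLAIN
SELECT
    COUNT(1) AS value,
    document.sourceType AS lable
FROM
    document
WHERE
    document.id IN (1)
GROUP BY
    document.sourceType;
```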
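In summary, query speed depends heavily on indexes. Using them correctly speeds up queries significantly, and it pays to analyze statements with the execution plan as a matter of habit.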