select count(*) from message_message where id in(select message_id from message_message2tag where tag_id=111187) and (category=9 or category=1)
这条SQL执行了6分钟,子查询里面的数据有6K条,但感觉不应该这么慢啊,explain看看什么原因:
mysql> explain select count(*) from message_message where id in(select message_id from message_message2tag where tag_id=111187) and (category=9 or category=1) \G;
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: message_message
type: range
possible_keys: idx_cat_add
key: idx_cat_add
key_len: 2
ref: NULL
rows: 2003342
Extra: Using where; Using index
*************************** 2. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: message_message2tag
type: index_subquery
possible_keys: message_id_ed6c39b8,tag_id_3747b463
key: message_id_ed6c39b8
key_len: 4
ref: func
rows: 1
Extra: Using where
2 rows in set (0.00 sec)
这里面重要的指标是type, Explain的type显示的是访问类型,结果值从好到坏依次是:
system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL
一般来说,得保证查询至少达到range级别,最好能达到ref,否则就可能会出现性能问题。可以看到这条sql的两个type分别是:range和index_subquery。
优化方法:用join代替子查询
mysql> explain select count(*) from message_message mm, message_message2tag mt where mm.id=mt.message_id and mt.tag_id=111187 and (category=9 or category=1) \G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: mt
type: ref
possible_keys: message_id_ed6c39b8,tag_id_3747b463
key: tag_id_3747b463
key_len: 4
ref: const
rows: 7270
Extra:
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: mm
type: eq_ref
possible_keys: PRIMARY,idx_cat_add
key: PRIMARY
key_len: 4
ref: zcwdb.mt.message_id
rows: 1
Extra: Using where
2 rows in set (0.00 sec)
type分别是: ref,eq_ref,再次执行花费时间:
mysql> select count(*) from message_message mm, message_message2tag mt where mm.id=mt.message_id and mt.tag_id=111187 and (category=9 or category=1);
+----------+
| count(*) |
+----------+
| 1728 |
+----------+
1 row in set (0.05 sec)
坑爹啊,性能相差1w倍啊!
Explain的Extra信息也相当重要,如果此信息显示Using filesort或者Using temporary的话,噩梦即将开始,不过也不尽然,比如说在一个WHERE ... ORDER BY ... 类型的查询里,很多时候我们无法创建一个兼顾WHERE和ORDER BY的索引,此时如果按照WHERE来确定索引,那么在ORDER BY时,就必然会引起Using filesort,文件排序是好是坏需要仔细判断,说白了就是看是先过滤再排序划算,还是先排序再过滤划算,正确答案取决与数据分布的情况,具体的情况可以参考Using index for ORDER BY vs restricting number of rows 。
参考:http://ldd600.iteye.com/blog/1462480
show variables like "max_userconnections"