问题描述
问题语句:
Select a.user_id, a.control_group_type From vtmp_log_node_1606 a , vm_taobao_shop_prop_61559109 vm_taobao_shop_prop_61559109,uni_customer_plat p where vm_taobao_shop_prop_61559109.grade >
1.0
and a.user_id = p.uni_id and vm_taobao_shop_prop_61559109.customerno=p.customerno;
|
现象:测试组发现这个语句从来就没有执行完过,节点放了一晚上也跑不完。
其中,vtmp_log_node_1606,vm_taobao_shop_prop_61559109均为视图。
解决过程
第一步:执行分析
与postgresql完全类似,只要看到慢的SQL,第一件事就是explain一下。
EXPLAIN
Select
a.user_id, a.control_group_type
From
vtmp_log_node_1606 a , vm_taobao_shop_prop_61559109 vm_taobao_shop_prop_61559109,uni_customer_plat p
where
vm_taobao_shop_prop_61559109.grade > 1.0
and
a.user_id = p.uni_id
and
vm_taobao_shop_prop_61559109.customerno=p.customerno;
|
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | p | ALL | NULL | NULL | NULL | NULL | 1089938 | Using temporary; Using filesort |
1 | SIMPLE | a | ALL | NULL | NULL | NULL | NULL | 1065570 | Using where; Using join buffer |
1 | SIMPLE | b | eq_ref | PRIMARY | PRIMARY | 244 | ccms_base.a.customerno,const | 1 | Using index |
1 | SIMPLE | uni_customer | eq_ref | PRIMARY,index_uni_id | PRIMARY | 194 | ccms_base.p.uni_id | 1 | Using index |
注意到Extra字段有Using temporary; Using filesort,直觉告诉我们看到file字样都要留意,觉得这里一定是一个问题点。怀疑是索引没有用到,无脑流的分析一下表:
analyze table uni_customer_plat;
|
但是结果并不理想。分析完后查询解析并没有变化。
看一下线程状态:
show full processlist;
|
发现执行这条语句的线程卡在Copying to tmp table。而有注意到之前的分析,表p(uni_customer_plat)有Using temporary。于是uni_customer_plat一定是有问题的,看一下索引:
show index in uni_customer_plat;
|
结果触目惊心:
Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
uni_customer_plat | 0 | PRIMARY | 1 | plat_code | A | 18 | NULL | NULL | BTREE | |||
uni_customer_plat | 0 | PRIMARY | 2 | customerno | A | 1070885 | NULL | NULL | BTREE |
uni_id列用于连接,但是没有建索引。果断加上索引并且无脑流分析一下:
create index uni_customer_plat_uni_id on uni_customer_plat ( uni_id );
create index uni_customer_plat_uni_id on uni_customer_plat ( uni_id );
|
再次对语句执行分析:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
1 | SIMPLE | p | index | uni_customer_plat_uni_id | uni_customer_plat_uni_id | 194 | NULL | 1055755 | Using index; Using temporary; Using filesort |
1 | SIMPLE | a | ALL | NULL | NULL | NULL | NULL | 1065570 | Using where; Using join buffer |
1 | SIMPLE | b | eq_ref | PRIMARY | PRIMARY | 244 | ccms_base.a.customerno,const | 1 | Using index |
1 | SIMPLE | uni_customer | eq_ref | PRIMARY,index_uni_id | PRIMARY | 194 | ccms_base.p.uni_id | 1 | Using index |
不错,果然用上了index。但是讨厌的file还在!看到filesort,到底是哪里sort了呢?起先怀疑是那个distinct,可以去掉后依然存在。看来在这个SQL里是找不到了。那么,只能去view里检查了(其实一般都要先去view里面看看)。
show create view vtmp_log_node_1606;
|
View | Create View | character_set_client | collation_connection |
vtmp_log_node_1606 | CREATE ALGORITHM=UNDEFINED DEFINER=`ccms`@`localhost` SQL SECURITY DEFINER VIEW `vtmp_log_node_1606` AS select `vw_node_uni_id_all`.`uni_id` AS `user_id`,`vw_node_uni_id_all`.`control_group_type` AS `control_group_type` from `vw_node_uni_id_all` | utf8 | utf8_general_ci |
额。。。发现居然还套了一个视图。vw_node_uni_id_all,继续往下跟
show create view vw_node_uni_id_all;
|
View | Create View | character_set_client | collation_connection |
vw_node_uni_id_all | CREATE ALGORITHM=UNDEFINED DEFINER=`ccms`@`%` SQL SECURITY DEFINER VIEW `vw_node_uni_id_all` AS select `uni_customer`.`uni_id` AS `uni_id`,(1) AS `control_group_type` from `uni_customer` order by `uni_customer`.`uni_id`,(1) | utf8 | utf8_general_ci |
嗯,终于发现了,这里有一个order by。可是事实上,从业务分析,这里是不需要order by的,并且这个视图做的事情也是没什么意义的,于是直接废掉,新建一个视图vtmp_log_node_1606_n来做测试:
create VIEW `vtmp_log_node_1606_n` AS select `uni_id` AS `user_id`, -
1
AS `control_group_type` from uni_customer;
|
5 | root | localhost | ccms_base | Query | 27 | Sending data | Select a.user_id, a.control_group_type From vtmp_log_node_1606_n a , vm_taobao_shop_prop_61559109 vm_taobao_shop_prop_61559109,uni_customer_plat p where vm_taobao_shop_prop_61559109.grade > 1.0 and a.user_id = p.uni_id and vm_taobao_shop_prop_61559109.customerno=p.customerno | 69 | 0 | 1 |
看状态,直接就开始Sending data了。。。那个讨厌的Copy to tmp table直接消失。不过这个Sending data依然持续了很长时间没有结束,等不急了,继续优化。还是看现在的查询解析:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
1 | SIMPLE | uni_customer | index | PRIMARY,index_uni_id | index_uni_id | 194 | NULL | 1076977 | Using index |
1 | SIMPLE | p | ref | uni_customer_plat_uni_id | uni_customer_plat_uni_id | 194 | ccms_base.uni_customer.uni_id | 1 | Using index |
1 | SIMPLE | a | ALL | NULL | NULL | NULL | NULL | 907681 | Using where; Using join buffer |
1 | SIMPLE | b | eq_ref | PRIMARY | PRIMARY | 244 | ccms_base.a.customerno,const | 1 | Using index |
发现a(vtmp_log_node_1606)有一个907681行的where条件扫描没有用到任何key。观察a引用到的表uni_customer,发现索引都是ok的。于是有怀疑到vm_taobao_shop_prop_61559109视图。看看这个视图:
vm_taobao_shop_prop_61559109 | CREATE ALGORITHM=UNDEFINED DEFINER=`ccms`@`%` SQL SECURITY DEFINER VIEW `vm_taobao_shop_prop_61559109` AS select `a`.`customerno` AS `customerno`,`a`.`grade` AS `grade`,`b`.`valide_date` AS `valide_date`,`b`.`expire_date` AS `expire_date`,`a`.`trade_amount` AS `trade_amount`,`a`.`trade_count` AS `trade_count`,(curdate() - cast(`a`.`last_trade_time` as date)) AS `no_buy_days` from (`plt_taobao_crm_member` `a` left join `twb_vip_customer` `b` on(((`a`.`dp_id` = '61559109') and (`a`.`status` = 'normal') and (`b`.`shop_id` = '61559109') and (`a`.`customerno` = `b`.`customerno`)))) | utf8 | utf8_general_ci |
引用到了两个表,其中plt_taobao_crm_member的customerno字段是要a做连接的。于是再看看这个表的情况,发现index是有的(联合主键)
Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
plt_taobao_crm_member | 0 | PRIMARY | 1 | dp_id | A | 18 | NULL | NULL | BTREE | |||
plt_taobao_crm_member | 0 | PRIMARY | 2 | customerno | A | 1212018 | NULL | NULL | BTREE |
奇怪。。。做做查询测试,
select * from plt_taobao_crm_member where customerno =
'.com小菲'
;
|
结果:1 row in set (0.51 sec)
一条记录居然要半秒!!!貌似是索引根本没起作用。分析一下表,没用。。。于是继续无脑流在customerno字段创建一个索引
create index plt_taobao_crm_member_customerno on plt_taobao_crm_member ( customerno );
|
再次执行索引查询,结果:1 row in set (0.00 sec)。感觉又回到了地球上。
现在再跑一下原来的大SQL的查询解析:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
1 | SIMPLE | p | index | uni_customer_plat_uni_id | uni_customer_plat_uni_id | 194 | NULL | 1080412 | Using index; Using temporary |
1 | SIMPLE | a | ref | plt_taobao_crm_member_customerno | plt_taobao_crm_member_customerno | 152 | ccms_base.p.customerno | 1 | Using where |
1 | SIMPLE | b | eq_ref | PRIMARY | PRIMARY | 244 | ccms_base.a.customerno,const | 1 | Using index |
1 | SIMPLE | uni_customer | eq_ref | PRIMARY,index_uni_id | PRIMARY | 194 | ccms_base.p.uni_id | 1 | Using index |
注意,这个是带了distinct的。实际执行一下,得到结果
71127 rows in set (13.30 sec)
至此,一个不可能完成的SQL在非常正常的时间执行完了。
总结
查询优化,就是根据explain的结果一步步抽丝剥茧,不断的尝试,直到得到期望的执行计划。
在这个过程里,最需要的就是耐心,仔细的观察每一个explain输出。同时也要对业务有比较好的理解。
当你优化的不耐烦的时候,就在心里不断对自己重复:我就是一个SQL优化机器人,我生下来就是为了优化SQL的。