How to avoid a full scan in MySQL: how can I avoid a full table scan for this MySQL query?

    explain
    select *
    from zipcode_distances z
    inner join venues v
        on z.zipcode_to = v.zipcode
    inner join events e
        on v.id = e.venue_id
    where z.zipcode_from = '92108'
      and z.distance <= 5

I'm trying to find all "events at venues within 5 miles of zipcode 92108", but I'm having a hard time optimizing this query.

Here is what the explain looks like:

id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | SIMPLE | e | ALL | idx_venue_id | | | | 60024 |
1 | SIMPLE | v | eq_ref | PRIMARY,idx_zipcode | PRIMARY | 4 | comedyworld.e.venue_id | 1 |
1 | SIMPLE | z | ref | idx_zip_from_distance,idx_zip_to_distance,idx_zip_from_to | idx_zip_from_to | 30 | const,comedyworld.v.zipcode | 1 | Using where; Using index

I'm getting a full table scan on the "e" table, and I can't figure out what index I need to create to get it to be fast.

Any advice would be appreciated

Thank you

Solution

Based on the EXPLAIN output in your question, you already have all the indexes the query should be using, namely:

    CREATE INDEX idx_zip_from_distance
        ON zipcode_distances (zipcode_from, distance, zipcode_to);

    CREATE INDEX idx_zipcode ON venues (zipcode, id);

    CREATE INDEX idx_venue_id ON events (venue_id);

(I'm not sure from your index names whether idx_zip_from_distance really includes the zipcode_to column. If not, you should add it to make it a covering index. Also, I've included the venues.id column in idx_zipcode for completeness, but, assuming it's the primary key for the table and that you're using InnoDB, it will be included automatically anyway.)
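One way to settle the doubt about idx_zip_from_distance is to list the index's columns and rebuild it only if zipcode_to is missing. This is a minimal sketch that assumes the table and index names shown in the question; the rebuild step is optional and should be skipped if the index already covers all three columns.

```sql
-- List the columns of the index; Seq_in_index gives their order.
SHOW INDEX FROM zipcode_distances
WHERE Key_name = 'idx_zip_from_distance';

-- Only if zipcode_to is not listed above: rebuild the index so that it
-- covers every zipcode_distances column the query reads.
ALTER TABLE zipcode_distances
    DROP INDEX idx_zip_from_distance,
    ADD INDEX idx_zip_from_distance (zipcode_from, distance, zipcode_to);
```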

However, it looks like MySQL is choosing a different, and possibly suboptimal, query plan, where it scans through all events, finds their venues and zip codes, and only then filters the results on distance. This could be the optimal query plan, if the cardinality of the events table was low enough, but from the fact that you're asking this question I assume it's not.

One reason for the suboptimal query plan could be that you have too many indexes, which may be confusing the planner. For instance, do you really need all three of those indexes on the zipcode_distances table, given that the data it stores is presumably symmetric? Personally, I'd suggest keeping only the index I described above, plus a unique index on (zipcode_to, zipcode_from), which can also serve as the primary key if you don't have an artificial one. That column order is preferable, so that any occasional queries on zipcode_to=? can make use of it.
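The trimmed-down index set suggested above might look like the following sketch. The index names being dropped come from the possible_keys column of the EXPLAIN output; the name uq_zip_to_from is hypothetical, and the unique index assumes each (zipcode_to, zipcode_from) pair occurs only once in the table.

```sql
-- Drop the two indexes that the suggestion above treats as redundant.
DROP INDEX idx_zip_to_distance ON zipcode_distances;
DROP INDEX idx_zip_from_to ON zipcode_distances;

-- Hypothetical name. This statement fails if the table contains
-- duplicate (zipcode_to, zipcode_from) pairs.
CREATE UNIQUE INDEX uq_zip_to_from
    ON zipcode_distances (zipcode_to, zipcode_from);
```

Other queries may rely on the dropped indexes, so it is worth trying this on a copy of the data first.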

However, based on some testing I did, I suspect the main reason MySQL is choosing the wrong query plan simply comes down to the relative cardinalities of your tables. Presumably, your actual zipcode_distances table is huge, and MySQL isn't smart enough to realize quite how much the conditions in the WHERE clause narrow it down.
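Since the planner's choice depends on its stored statistics, it may be worth refreshing and inspecting them before forcing anything. This step is not part of the original answer and may well leave the plan unchanged; it is only a quick check, assuming the three table names from the question.

```sql
-- Recompute the key distribution statistics the optimizer relies on.
ANALYZE TABLE zipcode_distances, venues, events;

-- Inspect the Cardinality column, which holds the estimated number of
-- distinct values per index prefix.
SHOW INDEX FROM zipcode_distances;
SHOW INDEX FROM events;
```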

If so, the best and simplest fix may be to simply force MySQL to use the indexes you want:

    select *
    from zipcode_distances z
        FORCE INDEX (idx_zip_from_distance)
    inner join venues v
        FORCE INDEX (idx_zipcode)
        on z.zipcode_to = v.zipcode
    inner join events e
        FORCE INDEX (idx_venue_id)
        on v.id = e.venue_id
    where z.zipcode_from = '92108'
      and z.distance <= 5

With that query, you should indeed get the desired query plan. (You do need FORCE INDEX here, since with just USE INDEX the query planner could still decide to use a table scan instead of the suggested index, defeating the purpose. I had this happen when I first tested this.)
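To see the difference between the two hints on your own data, you can run both variants under EXPLAIN and compare the plans. A sketch, assuming the index names used in this answer; the thing to look for is whether the row for table e still shows type ALL.

```sql
-- Variant 1: USE INDEX is only a suggestion, so the planner may still
-- decide that a table scan is cheaper.
EXPLAIN
SELECT *
FROM zipcode_distances z USE INDEX (idx_zip_from_distance)
INNER JOIN venues v USE INDEX (idx_zipcode)
    ON z.zipcode_to = v.zipcode
INNER JOIN events e USE INDEX (idx_venue_id)
    ON v.id = e.venue_id
WHERE z.zipcode_from = '92108'
  AND z.distance <= 5;

-- Variant 2: FORCE INDEX makes a table scan a last resort, used only
-- when the named index cannot be applied at all.
EXPLAIN
SELECT *
FROM zipcode_distances z FORCE INDEX (idx_zip_from_distance)
INNER JOIN venues v FORCE INDEX (idx_zipcode)
    ON z.zipcode_to = v.zipcode
INNER JOIN events e FORCE INDEX (idx_venue_id)
    ON v.id = e.venue_id
WHERE z.zipcode_from = '92108'
  AND z.distance <= 5;
```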

P.S. Here's a demo on SQLize, both with and without FORCE INDEX, demonstrating the issue.
