I have two questions:
As my title, which one is more efficient for running large query on large volume of data?
I've looked at MySQL doc which explains about workbench's performance in https://www.mysql.com/products/workbench/performance/ however I cannot seem to find any resource that specifically talks about efficiency difference between running query on cmd and query on workbench.
How to optimize this query
select
r.user_id,
k.id as kickscooter_id,
st_astext(k.location) as location,
k.created_at,
k.serial_number,
k_st.serial_number as states_serial_number,
st_astext(k_st.gps) as gps_location,
k_st.gps_updated_at,
r.start_time,
r.end_time
from kickscooters k
join rents r
on k.id= r.kickscooter_id
join kickscooter_states_190614 k_st
on k.serial_number = k_st.serial_number
order by r.rent_date
limit 999;
I've learned that creating an index allows mysql to sort things quickly therefore I've added index by
ALTER TABLE `tablename` ADD INDEX `indexname` (`columnname`);
As suggested in comments I've executed
analyze
since my server is MariaDB.
which gave me ERROR CODE 2013: LOST connection to server during query.
When I ran
explain
It worked and outputs:
id select_type table type possible_keys
1 SIMPLE k_st ALL kickscooter_states_190614_serial_number_date_index
1 SIMPLE k ref PRIMARY,kickscooters_serial_number_unique,kickscooters_serial_number_index
1 SIMPLE r ref rents_kickscooter_id_foreign
-table continued
/ key key_len ref rows extra
null null null 192818947 Using temporary; Using filesort
kickscooters_serial_number_unique 27 kickgoing_db.k_st.serial_number 1
rents_kickscooter_id_foreign 4 kickgoing_db.k.id 143
解决方案
Based on the Explain plan, optimizer is not able to use any index for ORDER BY rent. So try the following:
Ensure that an index exists on the rent_date column of the rents table. This index will be used to optimize the ORDER BY clause. It can be a single column index, or a multi-column one (used in other scenarios). But, in case of multi-column one, you need to ensure that the rent column is the first column in the index order.
Ensure that an index exists on the id column of the kickscooters table. Details about single-column / multi-column index remains the same as in the point #1.
Ensure that an index exists on the serial_number column of the kickscooter_states_190614 table. Details about single-column / multi-column index remains the same as in the point #1.
Now, after ensuring these indexes, try your original query. Most likely, optimizer should be able to optimize the Join Order. Besides, the above query, you can enforce join order by using STRAIGHT_JOIN optimizer hint. So, try the following query as well, and benchmark between the two of them:
select
r.user_id,
k.id as kickscooter_id,
st_astext(k.location) as location,
k.created_at,
k.serial_number,
k_st.serial_number as states_serial_number,
st_astext(k_st.gps) as gps_location,
k_st.gps_updated_at,
r.start_time,
r.end_time
from kickscooters k
straight_join rents r
on k.id= r.kickscooter_id
straight_join kickscooter_states_190614 k_st
on k.serial_number = k_st.serial_number
order by r.rent_date
limit 999;