After investigating with EXPLAIN, the bottleneck inside a very long SQL statement turned out to be:
SELECT scf.specode, scf.new_num AS ending_num, scf.new_price AS endingPrice, scf.new_fee AS endingFee
FROM (
    SELECT scf.specode, scf.new_num, scf.new_price, scf.new_fee
    FROM sku_cost_flow scf
    WHERE scf.specode IN (
        ......
    )
    AND scf.time <= UNIX_TIMESTAMP('2023-06-30 23:59:59')
    ORDER BY time DESC, id DESC
    LIMIT 100000000
) scf
GROUP BY scf.specode
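The subquery's LIMIT 100000000 is far too large, and I don't know why it was originally written this way. Presumably the huge LIMIT is there so that MySQL keeps the subquery's ORDER BY and the outer GROUP BY then picks the first (latest) row of each group, a behaviour MySQL does not guarantee. What the query is really after is the most recent row for each specode.
So we can query the most recent (i.e. MAX) time for each specode directly, and then join back to the table to fetch the rest of the row.
Because several rows can share the same time, the join also needs a subquery that selects the largest id among them.
The modified SQL: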
SELECT scf.specode, scf.new_num AS ending_num, scf.new_price AS endingPrice, scf.new_fee AS endingFee
FROM (
    -- latest time per specode, up to the cutoff
    SELECT specode, MAX(time) AS max_time
    FROM sku_cost_flow
    WHERE
        time <= UNIX_TIMESTAMP('2023-06-30 23:59:59')
        AND specode IN (
            ...
        )
    GROUP BY specode
) scf_max
JOIN sku_cost_flow scf
    ON scf.specode = scf_max.specode
    AND scf.time = scf_max.max_time
    -- tie-break: among rows sharing the latest time, keep the largest id
    AND scf.id = (
        SELECT MAX(id)
        FROM sku_cost_flow
        WHERE specode = scf_max.specode
        AND time = scf_max.max_time
    )
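Both the MAX(time) grouping and the MAX(id) lookup filter on specode and time, so the rewrite benefits most when an index covers those columns. The table's existing indexes are not shown here, so the following is only a sketch under that assumption; the index name idx_scf_specode_time_id is made up for illustration.

```sql
-- Check which indexes sku_cost_flow already has before adding anything.
SHOW INDEX FROM sku_cost_flow;

-- Hypothetical composite index (the name is an assumption, the columns come
-- from the query above). If id is the InnoDB primary key it is already
-- appended to every secondary index, so (specode, time) alone may be enough.
CREATE INDEX idx_scf_specode_time_id
    ON sku_cost_flow (specode, time, id);
```

With an index like this, MySQL can usually answer MAX(time) per specode and the MAX(id) tie-break from the index alone; running EXPLAIN on the rewritten query is the way to confirm that on the real data.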
Result: query time dropped from the initial 7 s to 0.4 s, and rows scanned dropped from about 7 million to about 30,000.
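One alternative that might be worth comparing, assuming MySQL 8.0 or later, is a window function: ROW_NUMBER() ranks the rows of each specode by time and id, and the outer query keeps only the first one. This is a different technique from the join above and has not been benchmarked on this table; the two values in the IN list are placeholders standing in for the real list.

```sql
-- Sketch only: requires MySQL 8.0+ for window functions.
SELECT specode,
       new_num   AS ending_num,
       new_price AS endingPrice,
       new_fee   AS endingFee
FROM (
    SELECT specode, new_num, new_price, new_fee,
           -- rn = 1 marks the latest row per specode; id breaks ties on time
           ROW_NUMBER() OVER (
               PARTITION BY specode
               ORDER BY time DESC, id DESC
           ) AS rn
    FROM sku_cost_flow
    WHERE time <= UNIX_TIMESTAMP('2023-06-30 23:59:59')
      AND specode IN ('SKU-A', 'SKU-B')  -- placeholder values, not the real list
) ranked
WHERE rn = 1;
```

This version reads every row that matches the WHERE clause before ranking, so it may well scan more rows than the MAX-and-join rewrite; it is easier to read, but only EXPLAIN and a timing run can say which is faster here.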
If you have a better approach, feel free to leave a comment.