目的:
现在表里有130w数据 ,取每个站点的最新时间的记录 ,同个站点,有多个记录
数据量
表索引
原sql
SELECT
t2.W_S_HEIGHT,
t2.W_N_HEIGHT,
t2.CREATE_TIME,
t2.STATION_ID
FROM
( SELECT max( t.CREATE_TIME ) CREATE_TIME, STATION_ID FROM t_water t GROUP BY t.STATION_ID ) t1
INNER JOIN t_water t2 ON t1.CREATE_TIME = t2.CREATE_TIME
AND t1.STATION_ID = t2.STATION_ID
第一次尝试
建立复合索引,防止回表,结果人家都没走索引,甚至跟没加索引时,查询速度差不多,索性直接删了索引
第二次尝试
优化t1这个子查询
优化前:
SELECT max( t.CREATE_TIME ) CREATE_TIME, STATION_ID FROM t_water t GROUP BY t.STATION_ID
优化后
SELECT
CREATE_TIME,
STATION_ID
FROM
( SELECT CREATE_TIME,STATION_ID FROM t_water ORDER BY create_time DESC) t
GROUP BY
STATION_ID
速度确实快了点 , 但是也发现了一个问题 , 这个问题我直接放连接 , 不做过多阐述
链接: https://blog.csdn.net/lglaljj123/article/details/79864188.
第三次尝试
准备将这个字段STATION_ID进行操作,决定使用mysql的crc32将这个UUID转化为数字
添加新的字段
ALTER TABLE t_water ADD COLUMN STATION_ID_HASH int unsigned NOT NULL DEFAULT 0;
进行更新操作
UPDATE t_water t1
LEFT JOIN ( SELECT crc32( STATION_ID ) AS hashvalue, STATION_ID FROM t_water WHERE STATION_ID IS NOT NULL GROUP BY STATION_ID ) c ON t1.STATION_ID = c.STATION_ID
SET t1.STATION_ID_HASH = c.hashvalue;
原sql t1子查询最终结果
SELECT max( t.CREATE_TIME ) CREATE_TIME,STATION_ID_HASH FROM t_water t WHERE STATION_ID IS NOT NULL GROUP BY t.STATION_ID_HASH
快了非常多,现在再对原sql进行修改下
SELECT
t2.W_S_HEIGHT,
t2.W_N_HEIGHT,
t2.CREATE_TIME,
t2.STATION_ID
FROM
( SELECT max( t.CREATE_TIME ) CREATE_TIME,STATION_ID_HASH FROM t_water t WHERE STATION_ID IS NOT NULL GROUP BY t.STATION_ID_HASH) t1,
t_water t2
WHERE
t1.STATION_ID_HASH = crc32( t2.STATION_ID )
AND t1.CREATE_TIME = t2.CREATE_TIME
还算在接受范围内,与之前的4.1s相比较差不多快了四倍