记一次MySQL批量更新SQL优化，性能提升几十个数量级

wumi0002

已于 2024-04-25 10:58:18 修改

阅读量5.7k

点赞数 2

分类专栏： MySQL 文章标签： mysql sql

于 2022-03-01 19:54:50 首次发布

本文链接：https://blog.csdn.net/wumi0002/article/details/123193492

版权

MySQL 专栏收录该内容

3 篇文章 0 订阅

订阅专栏

一、问题背景

原始需求是这样的，100个计数器硬件设备每隔5秒给服务器上报计数信息，服务端收到消息后先将上报的元数据存储起来，再每隔10s批量对之前上报的所有的元数据按时间段进行批量累加统计更新（题外话，为什么要每隔10s处理一次，而不是上报一次就累加一次？是为了不频繁操作数据库）

经过优化之后，批量更新SQL从58109ms耗时下降到了124ms。

二、实现方案历程

在看实现方案前，我们先看一下表的设计，其余的业务相关字段大家可以先忽略，主要就是要对相应时间段的finish_count做累加计算。当前表aps_produce_statistics中有73w条数据。

CREATE TABLE `aps_produce_statistics` (
`produce_id` bigint(20) NOT NULL COMMENT '生产任务id',
`line_id` bigint(20) NOT NULL COMMENT '产线id',
`shift` tinyint(2) NOT NULL COMMENT '班次 0-白班 1-夜班',
`time_segment` tinyint(2) NOT NULL COMMENT '所在时间段，1-10',
`plan_count` int(10) DEFAULT NULL COMMENT '时间段内的计划数',
`finish_count` int(10) NOT NULL COMMENT '完成量',
`plan_id` bigint(20) NOT NULL COMMENT '计划id',
`order_id` bigint(20) NOT NULL COMMENT '订单id',
`produce_time` date DEFAULT NULL COMMENT '生产时间'
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

项目用的是springboot+mybatis框架。

2.1 方案1

将多个update语句通过<foreach/> 连接起来，生成的SQL就是以；分割的多个update语句。这样的效率是最差的，每执行一个update就要进行全表扫描，如果循环200次，那么耗时就是58109ms。

<update id="updateProduceTimeCount">
	<foreach collection="list" item="item" separator=";">
		update
			aps_produce_statistics
		set
			finish_count = finish_count + #{item.finishCount}
		WHERE
			  produce_id = #{item.produceId} and time_segment = #{item.timeSegment}
	</foreach>
</update>

update
	aps_produce_statistics
set
	finish_count = finish_count + 2
WHERE
	  produce_id = 1 and time_segment = 1;
update
	aps_produce_statistics
set
	finish_count = finish_count + 3
WHERE
	  produce_id = 1 and time_segment = 2;

2.2 方案2

方案1每次都要进行全表扫描，且进行了200次循环，共扫描了200*73w=14600w的数据，速度是可想而知的，所以这次我们给表加上联合索引，联合索引的字段就是where后的两个条件子段produce_id + time_segment。加了索引之后的建表语句如下:

CREATE TABLE `aps_produce_statistics` (
`produce_id` bigint(20) NOT NULL COMMENT '生产任务id',
`line_id` bigint(20) NOT NULL COMMENT '产线id',
`shift` tinyint(2) NOT NULL COMMENT '班次 0-白班 1-夜班',
`time_segment` tinyint(2) NOT NULL COMMENT '所在时间段，1-10',
`plan_count` int(10) DEFAULT NULL COMMENT '时间段内的计划数',
`finish_count` int(10) NOT NULL COMMENT '完成量',
`plan_id` bigint(20) NOT NULL COMMENT '计划id',
`order_id` bigint(20) NOT NULL COMMENT '订单id',
`produce_time` date DEFAULT NULL COMMENT '生产时间',
KEY `proId_timeseg` (`produce_id`,`time_segment`) COMMENT '生产任务id加时间段的联合索引'
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

SQL还是像方案1一样，这次耗时下降到了 4819ms。

2.3 方案3

虽然加了索引之后，性能提升了不少，但还是需要将近5s，显然还是要继续优化的。思路就是先创建一张临时表，然后将要更新的数据批量插入到这张表里面，接着联合aps_produce_statistics和临时表进行批量update，只执行一次update指令，减少了扫描的行数。最终耗时下降到了124ms。

<update id="updateProduceTimeCount">
	create temporary table statistics_tmp(produce_id bigint(20), time_segment tinyint(2), count int(10), KEY `proId_timeseg` (`produce_id`,`time_segment`));
	insert into statistics_tmp values
	<foreach collection="list" item="item" separator="," close=";">
		(#{item.produceId}, #{item.timeSegment}, #{item.finishCount})
	</foreach>
	update aps_produce_statistics st, statistics_tmp temp set st.finish_count=st.finish_count + temp.count where st.produce_id=temp.produce_id and st.time_segment = temp. time_segment;
</update>