Optimizing MySQL batch commit performance

xhuiting

Last modified 2023-10-18 11:16:06

Categories: Tuning, Database. Tags: mysql, sql, java

First published 2023-02-21 16:47:50

Article link: https://blog.csdn.net/xhuiting/article/details/129144989

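This article discusses how batching actually executes when data is committed in batches from Java. Wireshark analysis shows that even with PreparedStatement's executeBatch() method, the SQL statements are not actually merged into a single request on the way to the database. For true batching, the MySQL JDBC connection parameter rewriteBatchedStatements must be set to true when the connection is established. After the optimization, the time to distribute 3000 rows dropped from 4000 ms to 400 ms, a significant performance gain. PostgreSQL has a similar parameter, reWriteBatchedInserts, which must be set to true to enable batching.

Summary generated by CSDN using AI.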

In Java code, batch writes are usually submitted with code like the following: first obtain the connection conn, then run the batch through a PreparedStatement.

@Service
public class ServerOfMysql
        extends ServerAbs implements ServerInter{
    @Override
    public void connect(String classname, String drivername, String serverIP, int port, String dbname, String userName, String password, String unicode)
            throws Exception {
        // Judging by how it is used below, index 0 holds the driver and index 1 the JDBC URL.
        String[] driveandurl = getUrl("Mysql", serverIP, port, dbname, userName, password, unicode);
        // Appending with '&' assumes the URL from getUrl already has a '?' query string.
        driveandurl[1] = driveandurl[1] + "&serverTimezone=UTC&useSSL=false&allowPublicKeyRetrieval=true";
        connect(classname,driveandurl[0], driveandurl[1], userName, password);
    }
}

ps.executeBatch();
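
For context, here is a minimal sketch of the call sequence that usually ends in executeBatch(). The class name BatchInsertSketch, the table table02, and its two string columns are assumptions made for illustration; this is not the author's code.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class BatchInsertSketch {
    // Single-row template; table02, col1 and col2 are hypothetical names.
    static final String INSERT_SQL = "insert into table02 (col1,col2) values (?,?)";

    /** Queues one INSERT per row with addBatch() and submits them with one executeBatch() call. */
    static int[] insertAll(Connection conn, List<String[]> rows) throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // commit once for the whole batch, not once per row
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.addBatch(); // only queues the row on the client side
            }
            int[] counts = ps.executeBatch(); // hands the queued rows to the driver
            conn.commit();
            return counts;
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

How the queued rows actually travel over the wire is decided by the driver, not by this code.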

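We tend to assume that executeBatch() takes care of everything for us: that it rewrites the batched group of SQL INSERT statements into a single batched statement, insert into table02 (col1,col2) values (value1,value2),(value1,value2),(value1,value2), and sends it to the database server in one request.

So what actually happens?

Analyzing the packets captured with Wireshark shows that the SQL statements are sent separately, each as its own independent request. (The original post shows a cropped screenshot of the capture here.)

This is not the batching we need. So how should it be optimized?

The answer is to set the rewriteBatchedStatements parameter to true when obtaining the connection conn.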

@Service
public class ServerOfMysql
        extends ServerAbs implements ServerInter{
    @Override
    public void connect(String classname, String drivername, String serverIP, int port, String dbname, String userName, String password, String unicode)
            throws Exception {
        String[] driveandurl = getUrl("Mysql", serverIP, port, dbname, userName, password, unicode);
        // rewriteBatchedStatements=true is the parameter that enables batch rewriting.
        driveandurl[1] = driveandurl[1] + "&serverTimezone=UTC&useSSL=false&allowPublicKeyRetrieval=true&rewriteBatchedStatements=true";
        connect(classname,driveandurl[0], driveandurl[1], userName, password);
    }
}
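
Appending parameters with a literal "&" only works if the URL already carries a query string. The following is a small sketch of a helper that picks the separator itself; the class JdbcUrlSketch and the method withParam are hypothetical names, not part of the MySQL driver or of the code above.

```java
final class JdbcUrlSketch {
    /**
     * Appends name=value to a JDBC URL, using '?' when the URL has no query
     * string yet and '&' when it does. No URL encoding is applied, so this is
     * only suitable for simple values such as "true".
     */
    static String withParam(String url, String name, String value) {
        char last = url.charAt(url.length() - 1);
        String separator;
        if (last == '?' || last == '&') {
            separator = ""; // URL already ends with a separator
        } else if (url.indexOf('?') >= 0) {
            separator = "&";
        } else {
            separator = "?";
        }
        return url + separator + name + "=" + value;
    }
}
```

For example, withParam("jdbc:mysql://localhost:3306/demo?useSSL=false", "rewriteBatchedStatements", "true") yields a URL ending in &rewriteBatchedStatements=true; the host and database name here are placeholders.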

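With this parameter set, the SQL on the wire is finally the batched statement we wanted.

Before the optimization, distributing 3000 rows took about 4000 ms.

After the optimization, distributing 3000 rows took about 400 ms.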
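
Timings like these can be reproduced with a simple wall-clock measurement around the batch call. The sketch below is one way to do that; the class ElapsedSketch and the method elapsedMillis are hypothetical names, and the post does not say how its numbers were measured.

```java
import java.util.concurrent.TimeUnit;

final class ElapsedSketch {
    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long elapsedMillis(Runnable task) {
        long start = System.nanoTime(); // monotonic clock, suited to measuring intervals
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }
}
```

Wrapping the same batch insert once with and once without rewriteBatchedStatements=true gives a like-for-like comparison, ideally averaged over several runs.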

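Key point:

MySQL provides its own JDBC connection parameter, rewriteBatchedStatements. When this parameter is set to true, the MySQL JDBC driver rewrites the SQL originally submitted by the user on the client side and then sends the rewritten SQL, that is, it will “send the batched statements in a single request”.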
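
To make the idea of client-side rewriting concrete, here is a simplified illustration that expands a single-row INSERT template into a multi-row one. It is not the driver's actual implementation; the class MultiRowInsertSketch and the method rewrite are hypothetical, and only the plain "insert into t (...) values (...)" shape is handled.

```java
import java.util.Locale;

final class MultiRowInsertSketch {
    /**
     * Repeats the parenthesized VALUES group rowCount times, separated by commas.
     * Anything after the closing parenthesis of the VALUES group is dropped.
     */
    static String rewrite(String singleRowSql, int rowCount) {
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be >= 1");
        }
        int valuesAt = singleRowSql.toLowerCase(Locale.ROOT).lastIndexOf("values");
        if (valuesAt < 0) {
            throw new IllegalArgumentException("no VALUES clause found");
        }
        int groupStart = singleRowSql.indexOf('(', valuesAt);
        int groupEnd = singleRowSql.lastIndexOf(')');
        if (groupStart < 0 || groupEnd < groupStart) {
            throw new IllegalArgumentException("no VALUES group found");
        }
        String head = singleRowSql.substring(0, groupStart);          // "insert ... values "
        String group = singleRowSql.substring(groupStart, groupEnd + 1); // "(?,?)"
        StringBuilder sql = new StringBuilder(head);
        for (int i = 0; i < rowCount; i++) {
            if (i > 0) {
                sql.append(',');
            }
            sql.append(group);
        }
        return sql.toString();
    }
}
```

Calling rewrite("insert into table02 (col1,col2) values (?,?)", 3) produces insert into table02 (col1,col2) values (?,?),(?,?),(?,?), which is the single-statement shape the batch is expected to take.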

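The PostgreSQL and Oracle JDBC drivers batch by default. However, versions of the PostgreSQL JDBC driver after 9.4.1208 also provide the parameter reWriteBatchedInserts, which defaults to false; to have batched inserts rewritten into multi-row statements, it must be set to true.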
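
On the PostgreSQL side, the parameter can be passed as a connection property rather than in the URL. A minimal sketch, assuming the class name PgBatchConfigSketch and the method batchProps, which are made up for illustration:

```java
import java.util.Properties;

final class PgBatchConfigSketch {
    /** Builds connection properties that turn on insert rewriting in the PostgreSQL JDBC driver. */
    static Properties batchProps(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // Defaults to false in the driver; must be enabled explicitly.
        props.setProperty("reWriteBatchedInserts", "true");
        return props;
    }
}
```

The result can then be handed to DriverManager.getConnection(url, props) together with a jdbc:postgresql:// URL; the same setting can also be written as ?reWriteBatchedInserts=true in the URL itself.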