Project scenario:
For business reasons we need to insert a large amount of data, with a single batch running to well over a hundred thousand rows.
Problem description:
Previously the INSERT statement was concatenated with a MyBatis foreach loop, which was very slow:
<foreach collection="xx" item="xxx" open="x" separator="x" close="x">
#{xxxx},#{xxxx},#{xxxx},#{xxxx},#{xxxx},#{xxxx}
</foreach>
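For reference, a complete multi-row foreach insert in a MyBatis mapper typically looks like this (a sketch; the statement id and the use of the userinfo columns from the test table below are my own choices, since the original mapper was anonymized):

```xml
<insert id="batchInsert" parameterType="java.util.List">
    INSERT INTO userinfo (uname, uphone, uaddress)
    VALUES
    <foreach collection="list" item="item" separator=",">
        (#{item.uname}, #{item.uphone}, #{item.uaddress})
    </foreach>
</insert>
```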
Cause analysis:
MyBatis spends quite a bit of time parsing and concatenating the data, and once the resulting statement gets very long, execution slows down as well; this is not true batching.
Even after splitting the concatenated data into chunks of ten thousand rows per statement it was still painfully slow, so I decided to try raw JDBC.
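The chunking mentioned above (one statement per ten thousand rows) amounts to partitioning the list; a minimal generic sketch (this helper is illustrative, not from the original project):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSplitter {
    // Split a list into consecutive chunks of at most chunkSize elements.
    public static <T> List<List<T>> partition(List<T> list, int chunkSize) {
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < list.size(); i += chunkSize) {
            chunks.add(new ArrayList<>(list.subList(i, Math.min(i + chunkSize, list.size()))));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 25000; i++) data.add(i);
        List<List<Integer>> chunks = partition(data, 10000);
        System.out.println(chunks.size());        // 10000 + 10000 + 5000 -> 3 chunks
        System.out.println(chunks.get(2).size()); // last chunk holds the remainder
    }
}
```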
Solution:
I tried three approaches in total:
1. Raw JDBC insert
2. Spring-managed JDBC insert
3. Obtaining the database connection via sqlSessionFactory and inserting
1. Raw JDBC
package com.company;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/**
 * @Author: xxxx
 * @Date: 2021/3/10 11:38
 */
public class Test {
    public static void test() {
        // rewriteBatchedStatements=true is what makes the driver send a real batch
        String url = "jdbc:mysql://x.x.x.x:3306/xxxx?rewriteBatchedStatements=true&characterEncoding=utf8";
        String user = "root";
        String password = "xxxx";
        Connection conn = null;
        PreparedStatement pstm = null;
        try {
            long startTime = System.currentTimeMillis();
            Class.forName("com.mysql.cj.jdbc.Driver");
            conn = DriverManager.getConnection(url, user, password);
            // disable auto-commit so the whole batch goes in one transaction
            conn.setAutoCommit(false);
            String sql = "INSERT INTO userinfo(uid,uname,uphone,uaddress,uname2,uname3,uname4,uname5,uname6,uname7,uname8,uname9,uname10,uname11,uname12,uname13,uname14,uname15,uname16) VALUES(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)";
            pstm = conn.prepareStatement(sql);
            int e = 100000;
            for (int i = 1; i <= e; i++) {
                pstm.setInt(1, i);
                pstm.setString(2, "name" + i);
                pstm.setString(3, "123456789");
                pstm.setString(4, "Milky Way-Sun-Earth No." + i);
                pstm.setString(5, "name2" + i);
                pstm.setString(6, "name3" + i);
                pstm.setString(7, "name4" + i);
                pstm.setString(8, "name5" + i);
                pstm.setString(9, "name6" + i);
                pstm.setString(10, "name7" + i);
                pstm.setString(11, "name8" + i);
                pstm.setString(12, "name9" + i);
                pstm.setString(13, "name10" + i);
                pstm.setString(14, "name11" + i);
                pstm.setString(15, "name12" + i);
                pstm.setString(16, "name13" + i);
                pstm.setString(17, "name14" + i);
                pstm.setString(18, "name15" + i);
                pstm.setString(19, "name16" + i);
                pstm.addBatch();
            }
            long startTime2 = System.currentTimeMillis();
            System.out.println("Batched " + e + " rows; total time: " + (startTime2 - startTime));
            pstm.executeBatch();
            conn.commit();
            long endTime = System.currentTimeMillis();
            System.out.println("Inserted " + e + " rows; total time: " + (endTime - startTime));
        } catch (Exception ex) {
            // roll back the half-done batch before rethrowing
            if (conn != null) {
                try {
                    conn.rollback();
                } catch (SQLException ignored) {
                }
            }
            throw new RuntimeException(ex);
        } finally {
            if (pstm != null) {
                try {
                    pstm.close();
                } catch (SQLException ex) {
                    ex.printStackTrace();
                }
            }
            if (conn != null) {
                try {
                    conn.close();
                } catch (SQLException ex) {
                    ex.printStackTrace();
                }
            }
        }
    }

    public static void main(String[] args) {
        test();
    }
}
Connected to the target VM, address: '127.0.0.1:53568', transport: 'socket'
Batched 100000 rows; total time: 1854
Inserted 100000 rows; total time: 7278
Disconnected from the target VM, address: '127.0.0.1:53568', transport: 'socket'
Process finished with exit code 0
Note:
String url = "jdbc:mysql://x.x.x.x:3306/xxxx?rewriteBatchedStatements=true&characterEncoding=utf8";
rewriteBatchedStatements=true must be set for batching to take effect.
Class.forName("com.mysql.cj.jdbc.Driver"); use the com.mysql.cj.jdbc.Driver driver class.
Do not use the deprecated com.mysql.jdbc.Driver; with older driver versions batching does not take effect.
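Conceptually, what rewriteBatchedStatements=true buys you is that the driver rewrites the queued single-row inserts into one multi-row INSERT before sending it, so N rows cost one round trip instead of N. An illustrative sketch of the rewritten SQL shape (not the driver's actual implementation):

```java
import java.util.StringJoiner;

public class RewriteSketch {
    // Conceptually, rewriteBatchedStatements=true turns N queued single-row
    // inserts into one multi-row INSERT statement.
    public static String rewrite(String prefix, String rowPlaceholders, int rows) {
        StringJoiner values = new StringJoiner(",");
        for (int i = 0; i < rows; i++) {
            values.add(rowPlaceholders);
        }
        return prefix + " VALUES" + values;
    }

    public static void main(String[] args) {
        System.out.println(rewrite("INSERT INTO userinfo(uid,uname,uphone)", "(?,?,?)", 3));
    }
}
```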
Notes:
If only a few columns are inserted, say 3 or 4, then 100,000 rows take just 1-2 seconds; but in real production tables usually have 10-20 columns.
So I extended someone else's few-column test data to 20 columns and tried again; the result matched expectations, with the time growing many times over.
The elapsed time is roughly proportional to both the number of columns and the number of rows.
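As a rough sanity check on that proportionality, the measurement above works out to about 72 microseconds per 19-column row (simple arithmetic on the numbers reported above, nothing more):

```java
public class ScalingEstimate {
    public static void main(String[] args) {
        int rows = 100_000;   // rows inserted in the test above
        int cols = 19;        // columns in the test INSERT
        long totalMs = 7278;  // measured end-to-end time above
        System.out.println(rows * cols);           // total bound parameters
        System.out.println(totalMs * 1000 / rows); // microseconds per row
    }
}
```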
Follow-up:
I then tested batching with Spring-managed JDBC and found it more than 10 times slower; the cause is still unclear and under investigation.
Raw JDBC takes only 2-3 seconds for 100,000 rows, while this approach needs 3-4 seconds for just 10,000.
2. Spring-managed JDBC
public void test() {
    try {
        // DefaultTransactionDefinition def = new DefaultTransactionDefinition();
        // def.setIsolationLevel(TransactionDefinition.ISOLATION_READ_COMMITTED);
        // def.setPropagationBehavior(TransactionDefinition.PROPAGATION_REQUIRED);
        // DataSourceTransactionManager transactionManager = new DataSourceTransactionManager(jdbctemplate.getDataSource());
        // TransactionStatus status = transactionManager.getTransaction(def);
        String sql = "INSERT INTO userinfo(uname,uphone,uaddress) VALUES(?,?,?)";
        long startTime = System.currentTimeMillis();
        int e = 10000;
        ArrayList<Integer> ist = new ArrayList<>();
        for (int i = 1; i <= e; i++) {
            ist.add(i);
        }
        jdbctemplate.batchUpdate(sql, new BatchPreparedStatementSetter() {
            @Override
            public void setValues(PreparedStatement ps, int i) throws SQLException {
                // parameter indexes must match the three columns in the SQL
                ps.setString(1, "name" + ist.get(i));
                ps.setString(2, "12345678901");
                ps.setString(3, "some address");
            }

            @Override
            public int getBatchSize() {
                return e;
            }
        });
        // persist all data
        // transactionManager.commit(status);
        long endTime = System.currentTimeMillis();
        System.out.println("Inserted " + e + " rows OK; time: " + (endTime - startTime));
    } catch (Exception ex) {
        throw new RuntimeException(ex);
    }
}
To use this approach, remember to inject the data source; DataSource here is the project's data source.
<!-- Get this object and call its execute methods to run SQL statements -->
<bean id="jdbcTemplate" class="org.springframework.jdbc.core.JdbcTemplate">
    <property name="dataSource" ref="DataSource"></property>
</bean>
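One thing worth ruling out when the Spring route is unexpectedly slow: JdbcTemplate inherits its connections from the pool, so rewriteBatchedStatements=true has to be set on the pool's JDBC URL as well, or batchUpdate still sends rows one at a time. A sketch of such a data source bean (the DBCP pool class here is an assumption; adapt it to whatever pool the project actually uses):

```xml
<bean id="DataSource" class="org.apache.commons.dbcp2.BasicDataSource" destroy-method="close">
    <property name="driverClassName" value="com.mysql.cj.jdbc.Driver"/>
    <property name="url" value="jdbc:mysql://x.x.x.x:3306/xxxx?rewriteBatchedStatements=true&amp;characterEncoding=utf8"/>
    <property name="username" value="root"/>
    <property name="password" value="xxxx"/>
</bean>
```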
Inserted 10000 rows OK; time: 3607
3. Obtaining the database connection via sqlSessionFactory
Since the Spring-managed JDBC above was 10 times slower than raw JDBC, I later switched to obtaining the connection through sqlSessionFactory; 100,000 rows take about 17 seconds. Still slower than raw JDBC, but acceptable.
private void handlerReceiveableBatches(List<xxxxx> list) {
    if (null != list && list.size() > 0) {
        // open a BATCH-mode session with auto-commit off
        SqlSession sqlSession = sqlSessionFactory.openSession(ExecutorType.BATCH, false);
        try {
            xxxxMapper mapper = sqlSession.getMapper(xxxxMapper.class);
            long startTime = System.currentTimeMillis();
            int pointLimit = 10000;
            for (int i = 0; i < list.size(); i++) {
                // flush and commit every pointLimit rows to keep the batch bounded
                if (i > 0 && i % pointLimit == 0) {
                    sqlSession.flushStatements();
                    sqlSession.commit();
                }
                mapper.insert(list.get(i));
            }
            sqlSession.flushStatements();
            sqlSession.commit();
            sqlSession.clearCache();
            logger.info("******** time: {} ms", System.currentTimeMillis() - startTime);
        } finally {
            sqlSession.close();
        }
    }
}
******** time: 17090 ms
Finally, the table structure used for testing:
CREATE TABLE `userinfo` (
`uid` int(11) NOT NULL AUTO_INCREMENT,
`uname` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`uphone` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`uaddress` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`uname2` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`uname3` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`uname4` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`uname5` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`uname6` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`uname7` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`uname8` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`uname9` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`uname10` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`uname11` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`uname12` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`uname13` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`uname14` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`uname15` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`uname16` varchar(255) COLLATE utf8_bin DEFAULT NULL,
PRIMARY KEY (`uid`)
) ENGINE=InnoDB AUTO_INCREMENT=1000001 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;