MyBatis Batch Insert

Original article, last modified 2022-01-29 17:08:22


License: CC 4.0 BY-SA

Tags: mysql, batch insert, foreach, BATCH, large data volumes

First published 2022-01-27 18:04:45


insert, update, and delete

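The previous article covered the select element and how to handle some complex queries. This article focuses on batch inserts in MyBatis. Before getting to that, we first need to look at the insert, update, and delete elements.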

<insert
  id="insertAuthor"
  parameterType="domain.blog.Author"
  flushCache="true"
  statementType="PREPARED"
  keyProperty=""
  keyColumn=""
  useGeneratedKeys=""
  timeout="20">

<update
  id="updateAuthor"
  parameterType="domain.blog.Author"
  flushCache="true"
  statementType="PREPARED"
  timeout="20">

<delete
  id="deleteAuthor"
  parameterType="domain.blog.Author"
  flushCache="true"
  statementType="PREPARED"
  timeout="20">

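Attribute	Description
id	A unique identifier within the namespace that can be used to reference this statement.
parameterType	The fully qualified class name or alias of the parameter passed into this statement. This attribute is optional because MyBatis can infer the actual parameter through the TypeHandler. Default: unset.
flushCache	When set to true, the local cache and the second-level cache are flushed whenever this statement is called. Default: true (for insert, update, and delete statements).
timeout	The number of seconds the driver waits for the database to return a result before throwing an exception. Default: unset (driver dependent).
statementType	One of STATEMENT, PREPARED, or CALLABLE, which makes MyBatis use Statement, PreparedStatement, or CallableStatement respectively. Default: PREPARED.
useGeneratedKeys	(insert and update only) Tells MyBatis to use the JDBC getGeneratedKeys method to retrieve keys generated internally by the database (for example, auto-increment fields in RDBMSs such as MySQL and SQL Server). Default: false.
keyProperty	(insert and update only) Names the property that uniquely identifies the object. MyBatis sets its value from the return value of getGeneratedKeys or from a selectKey child element of the insert statement. Default: unset. If more than one generated column is expected, this can be a comma-separated list of property names.
keyColumn	(insert and update only) Sets the name of the table column that holds the generated key. It is required in certain databases (such as PostgreSQL) when the key column is not the first column of the table. If more than one generated column is expected, this can be a comma-separated list of column names.
databaseId	If a databaseIdProvider is configured, MyBatis loads all statements that have no databaseId or whose databaseId matches the current one. If the same statement exists both with and without a databaseId, the one without is ignored.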

Ordinary updates and deletes are straightforward, so we won't go into them here. The focus is on inserts.

Create a test table, t_my_emp

[Image: structure of the t_my_emp table; the mappers in this article use its id, name, age, and sex columns]

Inserting a single row

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE mapper
        PUBLIC "-//mybatis.org//DTD Mapper 3.0//EN"
        "http://mybatis.org/dtd/mybatis-3-mapper.dtd">
<!-- The namespace maps to our MyEmpMapper interface -->
<mapper namespace="com.yyoo.mybatis.mapper.MyEmpMapper">

    <insert id="insert" parameterType="MyEmp">
        insert into t_my_emp(name,age,sex)
        values(#{name},#{age},#{sex})
    </insert>

</mapper>

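import java.io.IOException;
import java.io.InputStream;
import java.util.Random;

import org.apache.ibatis.io.Resources;
import org.apache.ibatis.session.SqlSession;
import org.apache.ibatis.session.SqlSessionFactory;
import org.apache.ibatis.session.SqlSessionFactoryBuilder;

import com.yyoo.mybatis.mapper.MyEmpMapper;
// MyEmp and AutoNameUtil are project classes; import them from wherever they live in your project.

public class Demo6 {

    public static void main(String[] args) throws IOException {

        String resource = "mybatis-config.xml";
        InputStream in = Resources.getResourceAsStream(resource);
        SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(in);

        SqlSession session = sqlSessionFactory.openSession();
        try {
            MyEmpMapper empMapper = session.getMapper(MyEmpMapper.class);
            Random random = new Random(); // one generator reused for every row
            long start = System.currentTimeMillis();
            int num = 100;
            for (int i = 0; i < num; i++) {
                MyEmp myEmp = new MyEmp();
                myEmp.setName(AutoNameUtil.autoSurAndName());
                myEmp.setAge(random.nextInt(50) + 15); // ages 15 to 64
                myEmp.setSex(random.nextInt(2));
                empMapper.insert(myEmp); // one insert statement per row
            }

            session.commit();
            long end = System.currentTimeMillis();
            System.out.println("Loop insert time (ms): " + (end - start));

        } catch (Exception e) {
            e.printStackTrace();
            session.rollback();
        } finally {
            session.close();
        }
    }
}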

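Because the table uses an auto-increment id (MySQL), the id property does not need to be supplied. But what if the id comes from a sequence, so that its next value has to be queried before every insert, or what if each new id is the current maximum id plus 1?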

Using the selectKey element

    <insert id="insert" parameterType="MyEmp">
        <!-- keyProperty names the property that receives the key; order="BEFORE" runs this query before the insert -->
        <selectKey keyProperty="id" resultType="int" order="BEFORE" statementType="PREPARED">
            select ifnull(max(id),0) + 1 from t_my_emp
        </selectKey>
        insert into t_my_emp(id,name,age,sex)
        values(#{id},#{name},#{age},#{sex})
    </insert>

    public static void main(String[] args) throws IOException {

        String resource = "mybatis-config.xml";
        InputStream in = Resources.getResourceAsStream(resource);
        SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(in);
        SqlSession session = sqlSessionFactory.openSession();
        try{
            MyEmpMapper empMapper = session.getMapper(MyEmpMapper.class);

            Random random = new Random();
            MyEmp myEmp = new MyEmp();
            myEmp.setName(AutoNameUtil.autoSurAndName());
            myEmp.setAge(random.nextInt(50) + 15); // ages 15 to 64
            myEmp.setSex(random.nextInt(2));
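            // No need to set the id; selectKey fills it in before the insert
            empMapper.insert(myEmp);
            session.commit(); // commit the transaction, otherwise the insert does not take effect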

        }catch (Exception e){
            e.printStackTrace();
            session.rollback();
        }finally {
            session.close();
        }

    }

With this approach, a batch insert means calling empMapper.insert in a loop from Java code.

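AutoNameUtil is a utility class that generates random names; see the article "Java生成随机常用汉字或姓名" (generating random common Chinese characters or names in Java).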

Batch insert with the foreach element

    <insert id="insertForeach">
        insert into t_my_emp(name,age,sex)
        values
        <foreach item="myEmp" collection="myEmpList" separator=",">
        (#{myEmp.name},#{myEmp.age},#{myEmp.sex})
        </foreach>
    </insert>

public interface MyEmpMapper {

    int insert(MyEmp myEmp);

    int insertForeach(@Param("myEmpList") List<MyEmp> myEmpList);
}
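To make the expansion concrete, here is a minimal plain-Java sketch (no MyBatis involved) of the statement text that a foreach with separator="," produces for a list of n elements. The names ForeachExpansion and expand are hypothetical and exist only for illustration; MyBatis builds the real statement internally, binds each #{...} as a JDBC parameter, and its whitespace will differ.

```java
import java.util.Collections;

// Illustrative only: mimics the SQL text that <foreach ... separator=","> yields.
final class ForeachExpansion {

    private static final String PREFIX = "insert into t_my_emp(name,age,sex) values ";
    private static final String ROW = "(?,?,?)";

    // Returns the insert statement with one "(?,?,?)" group per row.
    static String expand(int rowCount) {
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be at least 1");
        }
        return PREFIX + String.join(",", Collections.nCopies(rowCount, ROW));
    }

    public static void main(String[] args) {
        System.out.println(expand(3));
        // The statement text grows linearly with the number of rows.
        System.out.println(expand(100_000).length());
    }
}
```

The whole list travels to the database as one statement, which is why this approach saves round trips and also why the statement can eventually become too large.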

public class Demo8 {
    public static void main(String[] args) throws IOException {

        String resource = "mybatis-config.xml";
        InputStream in = Resources.getResourceAsStream(resource);
        SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(in);
        
        SqlSession session = sqlSessionFactory.openSession();
        try{
            MyEmpMapper empMapper = session.getMapper(MyEmpMapper.class);

            int num = 100;
            List<MyEmp> list = new ArrayList<>();
            for(int i = 0; i < num; i++) {
                Random random = new Random();
                MyEmp myEmp = new MyEmp();
                myEmp.setName(AutoNameUtil.autoSurAndName());
                myEmp.setAge(random.nextInt(50) + 15); // ages 15 to 64
                myEmp.setSex(random.nextInt(2));
                list.add(myEmp);
            }

            long startTime = System.currentTimeMillis();
            empMapper.insertForeach(list);
            session.commit();
            long endTime = System.currentTimeMillis();
            System.out.println("foreach batch insert time (ms): " + (endTime - startTime));

        }catch (Exception e){
            e.printStackTrace();
            session.rollback();
        }finally {
            session.close();
        }

    }
}

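This approach inserts 100,000 rows with ease, and quickly (raise num in the code above to try it). If you are curious, run the same insert by calling Mapper.insert in a loop and compare the times.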

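The inserts above rely on MySQL auto-increment ids. selectKey cannot be used here, because it runs once per statement rather than once per row. If you don't want MySQL auto-increment ids, use distributed ids for bulk inserts like this.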

Using ExecutorType.BATCH

public class Demo9 {
    public static void main(String[] args) throws IOException {

        String resource = "mybatis-config.xml";
        InputStream in = Resources.getResourceAsStream(resource);
        SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(in);
        
        SqlSession session = sqlSessionFactory.openSession(ExecutorType.BATCH);
        try{
            MyEmpMapper empMapper = session.getMapper(MyEmpMapper.class);
            long start = System.currentTimeMillis();
            int num = 1000000;
            for (int i = 0;i < num;i++) {
                Random random = new Random();
                MyEmp myEmp = new MyEmp();
                myEmp.setName(AutoNameUtil.autoSurAndName());
                myEmp.setAge(random.nextInt(50) + 15); // ages 15 to 64
                myEmp.setSex(random.nextInt(2));
                empMapper.insert(myEmp);
            }

            // Execute the queued statements now; this also clears the statements buffered in BATCH mode
            session.flushStatements();
            session.commit();
            long end = System.currentTimeMillis();
            System.out.println("BATCH insert time (ms): " + (end - start));

        }catch (Exception e){
            e.printStackTrace();
            session.rollback();
        }finally {
            session.close();
        }

    }
}
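The demo above queues all one million inserts before a single flushStatements call, which may hold a lot of buffered work in memory. One common variation is to flush every N rows. The sketch below shows only that control flow in plain Java so it can run without MyBatis; PeriodicFlush, insertAll, and the flushEvery parameter are hypothetical names, not part of the MyBatis API.

```java
import java.util.List;
import java.util.function.Consumer;

// Illustrative helper: insert every item, flushing after each full group of rows.
final class PeriodicFlush {

    // Calls insert for every item, calls flush after each flushEvery items,
    // and once more at the end if rows are still pending.
    // Returns the number of flushes performed.
    static <T> int insertAll(List<T> items, int flushEvery, Consumer<T> insert, Runnable flush) {
        if (flushEvery < 1) {
            throw new IllegalArgumentException("flushEvery must be at least 1");
        }
        int flushes = 0;
        int pending = 0;
        for (T item : items) {
            insert.accept(item);
            pending++;
            if (pending == flushEvery) {
                flush.run();
                flushes++;
                pending = 0;
            }
        }
        if (pending > 0) {
            flush.run();
            flushes++;
        }
        return flushes;
    }
}
```

With the session and mapper from the demo, a call could look like PeriodicFlush.insertAll(list, 1000, empMapper::insert, session::flushStatements), followed by session.commit(). The value 1000 is an arbitrary assumption; a suitable size depends on row width and the driver.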

Setting BATCH mode under Spring Boot

@Bean("sqlSessionTemplate")
public SqlSessionTemplate sqlSessionTemplate(SqlSessionFactory factory) {
    // Use the configured SqlSessionFactory and put the template in BATCH mode
    SqlSessionTemplate template = new SqlSessionTemplate(factory, ExecutorType.BATCH);
    return template;
}

Execution time comparison

Method	100 rows	1,000 rows	10,000 rows	100,000 rows	1,000,000 rows
Loop insert	475ms	1054ms	4342ms	29350ms	279170ms
foreach insert	413ms	499ms	976ms	2913ms	Error (SQL statement too large)
BATCH insert	385ms	583ms	2255ms	14492ms	149447ms

To keep the results accurate, I deleted the previously inserted rows before each run and inserted from scratch.
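To read the table as ratios rather than raw milliseconds, the small sketch below divides the loop-insert time by the time of each alternative, using the 100,000-row column. TimingRatios and speedup are hypothetical names; the figures are the single-run measurements from the table, so the ratios are indicative only.

```java
// Illustrative arithmetic on the measured times; not a benchmark.
final class TimingRatios {

    // How many times faster the candidate is than the baseline.
    static double speedup(long baselineMillis, long candidateMillis) {
        return (double) baselineMillis / candidateMillis;
    }

    public static void main(String[] args) {
        // Figures from the 100,000-row column of the comparison table.
        long loopMs = 29350;
        long foreachMs = 2913;
        long batchMs = 14492;
        System.out.println("foreach vs loop: " + speedup(loopMs, foreachMs));
        System.out.println("BATCH vs loop:   " + speedup(loopMs, batchMs));
    }
}
```

At 100,000 rows this works out to roughly 10x for foreach and roughly 2x for BATCH compared with the plain loop.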

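The results show that foreach and BATCH are both considerably faster than the plain loop for large inserts. However, foreach throws an exception once the generated SQL statement becomes too long, that is, once the statement text exceeds the size limit (with MySQL this is most likely the max_allowed_packet setting), whereas BATCH does not run into this limit. Note: our table structure is very simple. Real tables may have many more columns, in which case foreach may already fail at 100,000 rows.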
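One way to keep foreach while staying under the statement size limit is to split the list into fixed-size chunks and call the mapper once per chunk. This is a minimal sketch of the splitting step only, in plain Java; BatchChunks and partition are hypothetical names, and the right chunk size is an assumption that depends on the table and the database configuration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative helper: split a large list into smaller lists for foreach inserts.
final class BatchChunks {

    // Splits items into consecutive sublists of at most chunkSize elements.
    static <T> List<List<T>> partition(List<T> items, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < items.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, items.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(from, to)));
        }
        return chunks;
    }
}
```

With the mapper from this article, each chunk returned by BatchChunks.partition(list, 1000) could be passed to empMapper.insertForeach(chunk) inside a loop, with a single session.commit() afterwards. The value 1000 is only an example.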