Optimizing a MyBatis batch insert of 100,000 rows

Batch-inserting data is a business scenario we run into all the time, and as everyone knows, batch operations are slow. This post looks at how to optimize them.

Setting up the project and table
1. Set up a Spring Boot project yourself and add mybatis-plus
2. Create the table; for now it only has two columns
3. Also add hutool, which is used here to generate ids

<dependency>
    <groupId>cn.hutool</groupId>
    <artifactId>hutool-all</artifactId>
    <version>5.7.18</version>
</dependency>
<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
</dependency>

Table structure

CREATE TABLE `student` (
  `id` varchar(36) COLLATE utf8mb4_general_ci NOT NULL,
  `name` varchar(255) COLLATE utf8mb4_general_ci DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;

Entity class

@Data
@TableName
public class Student {
    
    private String id;
    
    private String name;
}

With the project and database in place, add the test code.
The controller:

@RestController
public class TestController {
    @Autowired
    private IStudentService service;

    @GetMapping("insert/{number}")
    public void insert(@PathVariable Integer number) {
        service.insert(number);
        System.out.println(number + " rows inserted");
    }

}

The service interface:

public interface IStudentService extends IService<Student> {
    void insert(Integer number);
}

1. Using MyBatis-Plus's built-in batch insert

@Service
@RequiredArgsConstructor
public class IStudentServiceImpl extends ServiceImpl<StudentMapper, Student> implements IStudentService {

    @Override
    @Transactional
    public void insert(Integer number) {
        List<Student> list = new ArrayList<>();
        for (int i = 0; i < number; i++) {
            Student student = new Student();
            student.setName("name" + i);
            student.setId(String.valueOf(IdUtil.getSnowflakeNextId()));
            list.add(student);
        }
        // use MyBatis-Plus
        mp(list);
    }

    private void mp(List<Student> list) {
        System.out.println("mybatis-plus saveBatch--start");
        StopWatch stopWatch = new StopWatch();
        stopWatch.start();
        super.saveBatch(list);
        stopWatch.stop();
        long total = stopWatch.getTotal(TimeUnit.SECONDS);
        System.out.println("mybatis-plus saveBatch took " + total + "s");
    }
}

Hit the controller endpoint to insert 100,000 rows.
Result: about 10 seconds. That's a long time.
Digging into the saveBatch source shows it ultimately loops and inserts the rows one by one... seriously?
So it takes the 100k rows and inserts them in batches, looping over 1,000 rows per batch?

Clearly this method won't do for large batch inserts: 100k round trips to the database, and the customer will show up at midnight with a brick.

2. Cutting round trips with a custom multi-row INSERT
A batch insert has to hit the database; that's unavoidable. But we can cut the number of round trips, i.e. fewer IOs and fewer sqlSession interactions.

So the mapper gets a hand-written SQL statement:

    @Insert("<script>" +
            "INSERT INTO `student`(`id`, `name`) " +
            "VALUES " +
            "<foreach collection='list' item='item' separator=','> " +
            "(#{item.id},#{item.name})" +
            "</foreach></script>")
    void batch(@Param("list") List<Student> list);

In other words, the 100k statements collapse into one that runs in a single round trip, eliminating the per-row loop.
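To see what that <foreach> expands to, here is a rough sketch (my own illustration, not MyBatis internals) of the multi-row statement it builds. Note that the statement text grows linearly with the row count, which matters later on.

```java
import java.util.StringJoiner;

public class MultiRowSqlSketch {
    // Build one INSERT with one "(?, ?)" tuple per row, roughly the shape
    // of the statement the <foreach> mapper above produces.
    static String buildInsert(int rows) {
        StringJoiner values = new StringJoiner(", ");
        for (int i = 0; i < rows; i++) {
            values.add("(?, ?)");
        }
        return "INSERT INTO `student`(`id`, `name`) VALUES " + values;
    }

    public static void main(String[] args) {
        System.out.println(buildInsert(3));
        // With 100,000 rows the placeholder list alone is ~800 KB of SQL text,
        // before any parameter values are substituted.
        System.out.println(buildInsert(100_000).length());
    }
}
```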
The code:

@Service
@RequiredArgsConstructor
public class IStudentServiceImpl extends ServiceImpl<StudentMapper, Student> implements IStudentService {

    @Override
    @Transactional
    public void insert(Integer number) {
        List<Student> list = new ArrayList<>();
        for (int i = 0; i < number; i++) {
            Student student = new Student();
            student.setName("name" + i);
            student.setId(String.valueOf(IdUtil.getSnowflakeNextId()));
            list.add(student);
        }
        // use MyBatis-Plus
//        mp(list);
        // use the custom batch SQL
        consumer(list);
    }

    private void mp(List<Student> list) {
        System.out.println("mybatis-plus saveBatch--start");
        StopWatch stopWatch = new StopWatch();
        stopWatch.start();
        super.saveBatch(list);
        stopWatch.stop();
        long total = stopWatch.getTotal(TimeUnit.SECONDS);
        System.out.println("mybatis-plus saveBatch took " + total + "s");
    }

    /**
     * batch insert via the custom SQL
     * @param list rows to insert
     */
    private void consumer(List<Student> list) {
        System.out.println("custom batch SQL--start");
        StopWatch stopWatch = new StopWatch();
        stopWatch.start();
        baseMapper.batch(list);
        stopWatch.stop();
        long total = stopWatch.getTotal(TimeUnit.SECONDS);
        System.out.println("custom batch SQL took " + total + "s");
    }
}

The result: much faster.
Later testing showed, however, that once the statement carries too much data, it breaks.
I changed the name value to reuse the id, which made the names longer and therefore the SQL statement longer too:

### Error updating database.  Cause: com.mysql.cj.jdbc.exceptions.PacketTooBigException: Packet for query is too large (5,300,043 > 4,194,304). You can change this value on the server by setting the 'max_allowed_packet' variable.
### The error may exist in org/example/student/mapper/StudentMapper.java (best guess)
### The error may involve org.example.student.mapper.StudentMapper.batch-Inline
### The error occurred while setting parameters

Obviously I can't just go changing the database server's configuration for this.

3. Fine, time for the big gun: multithreading
Split the batch SQL across multiple threads, roughly 30k rows per thread (here 30k + 30k + 40k). One caveat: `@Transactional` on insert() does not extend to work submitted to the thread pool, since Spring transactions are bound to the calling thread, so each chunk commits in its own transaction and a failed chunk won't roll the others back.


    private void thread(List<Student> list1) {
        System.out.println("multithreaded--start");
        StopWatch stopWatch = new StopWatch();
        stopWatch.start();
        int max = 3;
        final CountDownLatch countDownLatch = new CountDownLatch(max);
        ExecutorService fixedThreadPool = Executors.newFixedThreadPool(max);
        task(fixedThreadPool, list1.subList(0, 30000), countDownLatch);
        task(fixedThreadPool, list1.subList(30000, 60000), countDownLatch);
        task(fixedThreadPool, list1.subList(60000, 100000), countDownLatch);
        try {
            countDownLatch.await();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        fixedThreadPool.shutdown();
        stopWatch.stop();
        long total = stopWatch.getTotal(TimeUnit.SECONDS);
        System.out.println("multithreaded took " + total + "s");
    }

    private void task(ExecutorService fixedThreadPool, List<Student> list, CountDownLatch countDownLatch) {
        System.out.println("rows in this chunk = " + list.size());
        fixedThreadPool.execute(() -> {
            try {
                baseMapper.batch(list);
            } finally {
                // count down even if the insert throws, so await() cannot hang
                countDownLatch.countDown();
            }
        });
    }
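The subList bounds above are hard-coded for exactly 100,000 rows. A small generic splitter (my own helper, not from any library) makes the chunking reusable for any total:

```java
import java.util.ArrayList;
import java.util.List;

public class Partition {
    // Split a list into consecutive chunks of at most `size` elements;
    // the last chunk holds whatever remains.
    static <T> List<List<T>> partition(List<T> list, int size) {
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < list.size(); from += size) {
            chunks.add(list.subList(from, Math.min(from + size, list.size())));
        }
        return chunks;
    }
}
```

Each chunk then goes through task(...) as above; pick the chunk size so one chunk's statement stays safely under max_allowed_packet.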

The full code:

@Service
@RequiredArgsConstructor
public class IStudentServiceImpl extends ServiceImpl<StudentMapper, Student> implements IStudentService {

    @Override
    @Transactional
    public void insert(Integer number) {
        List<Student> list = new ArrayList<>();
        for (int i = 0; i < number; i++) {
            Student student = new Student();
//            student.setName("name" + i);
//            student.setId(String.valueOf(IdUtil.getSnowflakeNextId()));
//            list.add(student);
            // the code below makes the SQL statement longer
            long snowflakeNextId = IdUtil.getSnowflakeNextId();
            student.setName("name" + snowflakeNextId);
            student.setId(String.valueOf(snowflakeNextId));
            list.add(student);
        }
        // use MyBatis-Plus
//        mp(list);
        // use the custom batch SQL
//        consumer(list);
        // use multiple threads
        thread(list);
    }

    private void mp(List<Student> list) {
        System.out.println("mybatis-plus saveBatch--start");
        StopWatch stopWatch = new StopWatch();
        stopWatch.start();
        super.saveBatch(list);
        stopWatch.stop();
        long total = stopWatch.getTotal(TimeUnit.SECONDS);
        System.out.println("mybatis-plus saveBatch took " + total + "s");
    }

    /**
     * batch insert via the custom SQL
     * @param list rows to insert
     */
    private void consumer(List<Student> list) {
        System.out.println("custom batch SQL--start");
        StopWatch stopWatch = new StopWatch();
        stopWatch.start();
        baseMapper.batch(list);
        stopWatch.stop();
        long total = stopWatch.getTotal(TimeUnit.SECONDS);
        System.out.println("custom batch SQL took " + total + "s");
    }

    private void thread(List<Student> list1) {
        System.out.println("multithreaded--start");
        StopWatch stopWatch = new StopWatch();
        stopWatch.start();
        int max = 3;
        final CountDownLatch countDownLatch = new CountDownLatch(max);
        ExecutorService fixedThreadPool = Executors.newFixedThreadPool(max);
        task(fixedThreadPool, list1.subList(0, 30000), countDownLatch);
        task(fixedThreadPool, list1.subList(30000, 60000), countDownLatch);
        task(fixedThreadPool, list1.subList(60000, 100000), countDownLatch);
        try {
            countDownLatch.await();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        fixedThreadPool.shutdown();
        stopWatch.stop();
        long total = stopWatch.getTotal(TimeUnit.SECONDS);
        System.out.println("multithreaded took " + total + "s");
    }

    private void task(ExecutorService fixedThreadPool, List<Student> list, CountDownLatch countDownLatch) {
        System.out.println("rows in this chunk = " + list.size());
        fixedThreadPool.execute(() -> {
            try {
                baseMapper.batch(list);
            } finally {
                // count down even if the insert throws, so await() cannot hang
                countDownLatch.countDown();
            }
        });
    }

}

Of course there are other options, such as generating a file and letting MySQL load it directly, or using a stored procedure; I won't try those here.
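For the file route, the Java side would just dump the rows to a CSV and let MySQL bulk-load it. A sketch (helper names are my own; LOAD DATA LOCAL INFILE also requires local_infile to be enabled on both client and server):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class CsvDump {
    // Turn (id, name) pairs into CSV lines for MySQL's bulk loader.
    static List<String> toCsvLines(List<String[]> rows) {
        List<String> lines = new ArrayList<>(rows.size());
        for (String[] row : rows) {
            lines.add(row[0] + "," + row[1]);
        }
        return lines;
    }

    // Write the file; then, on the MySQL side:
    //   LOAD DATA LOCAL INFILE '/tmp/student.csv' INTO TABLE student
    //   FIELDS TERMINATED BY ',' (`id`, `name`);
    static void dump(List<String[]> rows, Path file) throws IOException {
        Files.write(file, toCsvLines(rows));
    }
}
```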
