Java导出excel数据优化

最新推荐文章于 2025-04-18 16:08:34 发布

ZH.KE

最新推荐文章于 2025-04-18 16:08:34 发布

阅读量739

点赞数 6

文章标签： spring boot

本文链接：https://blog.csdn.net/qq_45543606/article/details/136059225

版权

大数据excel导出解决方案

场景和问题

管理后台需要提供一个导出脑电记录生成excel文件功能，数据量大约在50万.针对此需求功能，面对以下几个问题：

1.查询数据库把全部相关脑电记录导出，如果一次性导出，内存会溢出，需要批量导出。

2.批量导出，如果采用分页查询，需要考察是否出现深分页带来的性能问题。

3.采用多线程处理

4.采用阿里巴巴开源的EasyExcel组件生成excel文件

源码地址： https://gitee.com/zhihui-ke/excel-load-demo.git（完整mock数据和测试例子）

测试结果

模拟50万条数据为例子

串行化执行大概37秒
双线程执行大概30秒
数据库查询采用覆盖索引大概15秒
数据库查询使用id分页大概5秒（前提id有序）

过程

串行执行

 public void generateExcel() {
        long startTime = System.currentTimeMillis();
        String filePath = "F://data//excel//" + System.currentTimeMillis() + ".xlsx";
        int batchNum = 1000;
        int total = (int) this.count();
        int current = 1;
        int index = 1;
        try (ExcelWriter excelWriter = EasyExcel.write(filePath, EegRecordMockExcel.class).build()) {
            // 这里注意 如果同一个sheet只要创建一次
            WriteSheet writeSheet = EasyExcel.writerSheet("模板").build();
            while (total > 0) {
                int offset = (current - 1) * batchNum;
                batchNum = total > batchNum ? batchNum : total;
                
                // select t.* from t_eeg_record_mock t limit #{offset}, #{size}
                List<EegRecordMock> data = eegRecordMockMapper.listForPage(offset, batchNum);
                
                excelWriter.write(data, writeSheet);
                
                total -= batchNum;
                current++;
            }

        }
        
    }

分析：由于easyExcel不支持并发同步写入，因此不能开启多线程同时写入。而且经过分析，主要耗时是数据库查询。第一步，开双线程，采用观察生产者模式，一个线程主要获取数据，另一个线程监听，数据准备时写入excel文件。主要采用SynchronousQueue队列实现线程之间的通讯。

双线程同步执行

 public void generateExcelAsync() throws InterruptedException {

        SynchronousQueue<List<EegRecordMock>> queue = new SynchronousQueue<>();
        String filePath = "F://data//excel//" + System.currentTimeMillis() + ".xlsx";
        ExecutorService executorService = Executors.newFixedThreadPool(2);
        int batchNum0 = 1000;
        int total0 = (int) this.count();
        int num = total0 / batchNum0;
        num = total0 % batchNum0 == 0 ? num : num + 1;
        CountDownLatch countDownLatch = new CountDownLatch(num);
        long startTime = System.currentTimeMillis();
		// 生产者线程
        executorService.submit(() -> {

            int batchNum = batchNum0;
            int total = total0;
            int current = 1;
            
            while (total > 0) {
                int offset = (current - 1) * batchNum;
                batchNum = total > batchNum ? batchNum : total;
                int size = batchNum;
				// select t.* from t_eeg_record_mock t limit #{offset}, #{size}
                List<EegRecordMock> data = eegRecordMockMapper.listForPage(offset, size);
                try {
                    queue.put(data); 
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }

                total -= batchNum;
                current++;
            }


        });
		// 消费者线程
        executorService.submit(() -> {

            try (ExcelWriter excelWriter = EasyExcel.write(filePath, EegRecordMockExcel.class).build()) {
                WriteSheet writeSheet = EasyExcel.writerSheet("模板").build();
                try {
                    while (countDownLatch.getCount() > 0) {
                        List<EegRecordMock> data = queue.take();
                        excelWriter.write(data, writeSheet);
                        countDownLatch.countDown();
                    }

                } catch (InterruptedException e) {
                    e.printStackTrace();
                } catch (Exception e) {
                    log.error("写入出错： {}", e.getMessage());
                }
            }
        });
     
        countDownLatch.await();
    }

优化效果不是特别明显，大约时间减少7-8秒左右。性能瓶颈是深分页查询比较慢。因此获取数据优化

覆盖索引深分页优化

# 原本
select t.* from t_eeg_record_mock t limit #{offset}, #{size}

# 优化后
select t.* from t_eeg_record_mock t, (select id from t_eeg_record_mock limit #{offset}, #{size}) t2 where t.id = t2.id order by t.id asc

结果：大概时间是15秒左右

在id有序情况下，采用id分页

# 生产者优化代码
executorService.submit(() -> {

            int batchNum = batchNum0;
            int total = total0;
            int current = 1;
            AtomicReference<Integer> idIndex = new AtomicReference<>(0);

            while (total > 0) {
                int offset = (current - 1) * batchNum;
                batchNum = total > batchNum ? batchNum : total;
                int size = batchNum;
                int maxId = idIndex.get();
                // select t.* from t_eeg_record_mock t where t.id > #{id} order by id asc limit #{size}
                List<EegRecordMock> data = eegRecordMockMapper.listBetweenId(maxId, size);
                //  获取上一次分页的最大id，用于下次过滤查询
                idIndex.set(data.get(data.size() - 1).getId());
                try {
                    queue.put(data);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }

                total -= batchNum;
                current++;
            }

        });

结果：时间大概是在5秒左右。