Spring Boot Large-Data Export: Automatically Splitting Excel Files with Efficient Asynchronous Processing

竹林幽深

Published 2024-07-01 08:49:35

Tags: spring boot, big data, excel

Original link: https://mp.weixin.qq.com/s/Q0SfOVYVjXEHWpOJtEaL3g

Original article by 新生代码农, 2024-05-28 07:04, Anhui

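When exporting large volumes of data, an oversized Excel file often becomes a performance bottleneck. This article shows how to export tens of millions of rows to Excel in a Spring Boot project by automatically splitting the output into multiple files and running the export asynchronously, so that it does not interfere with queries in the production environment. It also discusses whether the data should be read from a replica database to relieve pressure on the primary.

Preparation

Dependencies and Configuration

First, add the required dependencies to the project: Spring Boot, Spring Data JPA, and the Apache POI library for generating Excel files. Asynchronous execution with @Async is part of the core Spring Framework and needs no separate starter.

Add the following dependencies to pom.xml: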

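<dependencies>
    <!-- Spring Boot Starter Data JPA -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-jpa</artifactId>
    </dependency>
    <!-- Spring Boot Starter Web (@Async itself comes with the core framework) -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- JDBC driver for the jdbc:mysql URL; these coordinates assume a recent Spring Boot release -->
    <dependency>
        <groupId>com.mysql</groupId>
        <artifactId>mysql-connector-j</artifactId>
        <scope>runtime</scope>
    </dependency>
    <!-- Apache POI for Excel. Spring Boot does not manage the POI version, so set it explicitly; 5.2.5 is only an example -->
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi</artifactId>
        <version>5.2.5</version>
    </dependency>
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml</artifactId>
        <version>5.2.5</version>
    </dependency>
</dependencies>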

In application.properties, configure the database connection and the async task executor:

# Database configuration
spring.datasource.url=jdbc:mysql://localhost:3306/yourdatabase
spring.datasource.username=yourusername
spring.datasource.password=yourpassword

# Async configuration
spring.task.execution.pool.core-size=10
spring.task.execution.pool.max-size=20
spring.task.execution.pool.queue-capacity=500
spring.task.execution.thread-name-prefix=Async-XXXXX
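How the three pool settings interact is easy to misread: extra threads beyond the core size are only started once the queue is full, not as soon as the core threads are busy. The sketch below illustrates this with the plain JDK ThreadPoolExecutor, which Spring's task executor builds on. The class name PoolSizingDemo and the tiny values (core 1, max 2, queue 1) are illustrative assumptions, not part of the original article.

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: shows when a thread pool grows and when it rejects work.
final class PoolSizingDemo {

    /** Same three knobs as core-size / max-size / queue-capacity, with tiny values. */
    static ThreadPoolExecutor smallPool() {
        return new ThreadPoolExecutor(1, 2, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
    }

    /** Submits blocking tasks and records the pool size after each one; -1 means rejected. */
    static int[] poolSizesAfterEachSubmit(int tasks) {
        ThreadPoolExecutor pool = smallPool();
        CountDownLatch release = new CountDownLatch(1);
        int[] sizes = new int[tasks];
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // keep the worker busy until the demo ends
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                    sizes[i] = pool.getPoolSize();
                } catch (RejectedExecutionException e) {
                    sizes[i] = -1; // pool at max size and queue full
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Task 1 starts the core thread, task 2 waits in the queue,
        // task 3 forces a second thread, task 4 is rejected.
        System.out.println(Arrays.toString(poolSizesAfterEachSubmit(4))); // [1, 1, 2, -1]
    }
}
```

With the values from the properties file, this means the pool stays at 10 threads until 500 tasks are waiting, and only then grows toward 20.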

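Querying and Splitting the Data

Querying from a Replica

To relieve query pressure on the primary database, we recommend read/write splitting at the architecture level, with queries served by a replica. Write performance on the primary is then not affected by the export.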

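import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Sort;
import org.springframework.stereotype.Service;

@Service
public class DataService {
    @Autowired
    private DataRepository dataRepository;

    // PageRequest.of takes a zero-based page index, not a row offset.
    // A fixed sort order keeps the pages stable from one query to the next.
    public List<Data> fetchData(int pageNumber, int pageSize) {
        return dataRepository.findAll(PageRequest.of(pageNumber, pageSize, Sort.by("id"))).getContent();
    }
}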
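The service above reads from whichever DataSource the application is configured with, so the replica still has to be selected somewhere. A minimal sketch of the routing decision in plain Java follows; in a Spring application this kind of decision is commonly placed in AbstractRoutingDataSource.determineCurrentLookupKey(). The names ReadWriteRouting, Target and readOnly are hypothetical, and the ThreadLocal flag is one possible design, not the article's method.

```java
import java.util.function.Supplier;

// Hypothetical sketch: decides per thread whether work should hit the primary or the replica.
final class ReadWriteRouting {

    enum Target { PRIMARY, REPLICA }

    // Per-thread flag, true while the current thread is doing read-only work.
    private static final ThreadLocal<Boolean> READ_ONLY =
            ThreadLocal.withInitial(() -> Boolean.FALSE);

    /** Read-only work goes to the replica, everything else to the primary. */
    static Target currentTarget() {
        return READ_ONLY.get() ? Target.REPLICA : Target.PRIMARY;
    }

    /** Runs the work with the read-only flag set and restores the previous value afterwards. */
    static <T> T readOnly(Supplier<T> work) {
        boolean previous = READ_ONLY.get();
        READ_ONLY.set(Boolean.TRUE);
        try {
            return work.get();
        } finally {
            READ_ONLY.set(previous);
        }
    }

    public static void main(String[] args) {
        System.out.println(currentTarget());                           // PRIMARY
        System.out.println(readOnly(ReadWriteRouting::currentTarget)); // REPLICA
        System.out.println(currentTarget());                           // PRIMARY
    }
}
```

Because the flag is thread-local, it would have to be set inside the export task itself: an asynchronous export runs on a pool thread and does not inherit the flag from the thread that submitted it.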

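Batched Query Strategy

Loading a very large result set in a single query can exhaust memory, so we use paged queries and process one portion of the data at a time.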

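import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;

@Service
public class DataExportService {
    @Autowired
    private DataService dataService;

    @Async
    public void exportData() {
        int pageSize = 10000;
        int pageNumber = 0;
        List<Data> dataBatch;
        do {
            // Fetch one page at a time so only pageSize rows are held in memory
            dataBatch = dataService.fetchData(pageNumber, pageSize);
            if (!dataBatch.isEmpty()) {
                // Export this batch to its own Excel file
                exportToExcel(dataBatch, pageNumber);
            }
            pageNumber++;
        } while (!dataBatch.isEmpty());
    }
}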
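With tens of millions of rows, page-number paging tends to slow down on later pages, because the database typically has to walk past all the skipped rows first. A common alternative, which is not what the article's code does, is keyset pagination: remember the last id seen and ask for rows after it. The sketch below shows only the loop, in plain Java against an in-memory list. KeysetPaging, forEachBatch and inMemory are hypothetical names, and the fetch function stands in for a query of the form WHERE id > ? ORDER BY id LIMIT ?.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;
import java.util.stream.Collectors;

// Hypothetical sketch of a keyset-pagination loop; rows are represented by their ids only.
final class KeysetPaging {

    /**
     * Walks all rows in id order. fetchAfter(lastId, limit) must return up to `limit`
     * ids greater than lastId, sorted ascending. Returns the number of batches handled.
     */
    static int forEachBatch(BiFunction<Long, Integer, List<Long>> fetchAfter,
                            int batchSize,
                            Consumer<List<Long>> handler) {
        long lastId = 0L; // assumption: ids are positive
        int batches = 0;
        while (true) {
            List<Long> batch = fetchAfter.apply(lastId, batchSize);
            if (batch.isEmpty()) {
                break;
            }
            handler.accept(batch);
            batches++;
            lastId = batch.get(batch.size() - 1); // continue after the last id seen
            if (batch.size() < batchSize) {
                break; // short batch: nothing left to fetch
            }
        }
        return batches;
    }

    /** In-memory stand-in for the database query; sortedIds must be sorted ascending. */
    static BiFunction<Long, Integer, List<Long>> inMemory(List<Long> sortedIds) {
        return (lastId, limit) -> sortedIds.stream()
                .filter(id -> id > lastId)
                .limit(limit)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long id = 1; id <= 25; id++) {
            ids.add(id);
        }
        int batches = forEachBatch(inMemory(ids), 10,
                batch -> System.out.println(batch.size() + " rows"));
        System.out.println(batches + " batches"); // 3 batches: 10, 10 and 5 rows
    }
}
```

The batch counter doubles as the file index, so each batch can still be written to its own file exactly as in the loop above.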

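Implementing Asynchronous Export

Async Task Configuration

Enable asynchronous tasks with the @EnableAsync annotation and configure a task executor. Because this class implements AsyncConfigurer, @Async methods run on the executor returned here rather than on the pool configured through spring.task.execution.* in application.properties.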

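import java.util.concurrent.Executor;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.AsyncConfigurer;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
@EnableAsync
public class AsyncConfig implements AsyncConfigurer {
    @Override
    public Executor getAsyncExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(10);
        executor.setMaxPoolSize(20);
        executor.setQueueCapacity(500);
        executor.setThreadNamePrefix("Async-");
        executor.initialize();
        return executor;
    }
}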

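Implementing the Export Task

Mark the export task with @Async so that it runs asynchronously. In Spring's default proxy mode the call has to come from another bean (a controller, for example); calling exportData() from a method of the same class bypasses the proxy and runs synchronously.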

@Service
public class DataExportService {
    @Autowired
    private DataService dataService;

    @Async
    public void exportData() {
        // Data query and export logic
    }
}
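A void @Async method gives the caller no way to tell when the export has finished or whether it failed. One possible approach, sketched here in plain Java under stated assumptions, is to hand out a job id and track a status per job with CompletableFuture. ExportJobs, Status, submit, status and awaitJob are hypothetical names, and the in-memory map would not survive a restart.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: tracks the status of asynchronous export jobs in memory.
final class ExportJobs {

    enum Status { RUNNING, DONE, FAILED }

    private final Map<Long, Status> statuses = new ConcurrentHashMap<>();
    private final Map<Long, CompletableFuture<Void>> futures = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong(1);
    private final Executor executor;

    ExportJobs(Executor executor) {
        this.executor = executor;
    }

    /** Starts the export on the executor and returns a job id immediately. */
    long submit(Runnable export) {
        long id = nextId.getAndIncrement();
        statuses.put(id, Status.RUNNING);
        CompletableFuture<Void> done = CompletableFuture.runAsync(export, executor)
                .<Void>handle((ignored, error) -> {
                    // Record the outcome; handle() swallows the error so awaitJob never throws
                    statuses.put(id, error == null ? Status.DONE : Status.FAILED);
                    return null;
                });
        futures.put(id, done);
        return id;
    }

    /** Current status of a job, or null for an unknown id. */
    Status status(long id) {
        return statuses.get(id);
    }

    /** Blocks until the job has finished; the id must come from submit(). */
    void awaitJob(long id) {
        futures.get(id).join();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            ExportJobs jobs = new ExportJobs(pool);
            long id = jobs.submit(() -> System.out.println("exporting..."));
            jobs.awaitJob(id);
            System.out.println(jobs.status(id)); // DONE
        } finally {
            pool.shutdown();
        }
    }
}
```

In a Spring application the same effect can be approached by letting the @Async method return a CompletableFuture instead of void; a polling endpoint could then report the status for a given job id.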

Generating and Splitting Excel Files

Working with Excel Using Apache POI

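// Method of DataExportService; the imports belong at the top of that file.
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public void exportToExcel(List<Data> dataBatch, int batchNumber) {
    // try-with-resources closes both the workbook and the output stream
    try (Workbook workbook = new XSSFWorkbook();
         FileOutputStream fos = new FileOutputStream("data_batch_" + batchNumber + ".xlsx")) {
        Sheet sheet = workbook.createSheet("Data");
        int rowNum = 0;
        for (Data data : dataBatch) {
            Row row = sheet.createRow(rowNum++);
            row.createCell(0).setCellValue(data.getId());
            row.createCell(1).setCellValue(data.getName());
            // Other data columns
        }
        workbook.write(fos);
    } catch (IOException e) {
        // Surface the failure instead of only printing the stack trace
        throw new UncheckedIOException("Failed to write batch " + batchNumber, e);
    }
}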

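File Splitting Logic

Each queried batch is written to its own Excel file, so no single file grows too large. With a page size of 10,000, every file holds at most 10,000 rows, far below the 1,048,576-row limit of an .xlsx sheet.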
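The split itself is simple arithmetic, and it can help to compute it up front, for example to report how many files an export will produce. A minimal sketch, assuming the total row count is known in advance; SplitPlanner, Part and plan are hypothetical names, and the file names follow the data_batch_N.xlsx pattern used in the export method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: plans how a row count is split across Excel files.
final class SplitPlanner {

    /** Maximum number of rows in a single .xlsx sheet. */
    static final int XLSX_MAX_ROWS = 1_048_576;

    /** One output file with the half-open row range [fromRow, toRow) it contains. */
    static final class Part {
        final String fileName;
        final long fromRow;
        final long toRow;

        Part(String fileName, long fromRow, long toRow) {
            this.fileName = fileName;
            this.fromRow = fromRow;
            this.toRow = toRow;
        }

        long rowCount() {
            return toRow - fromRow;
        }
    }

    /** Splits totalRows into consecutive parts of at most rowsPerFile rows each. */
    static List<Part> plan(long totalRows, int rowsPerFile) {
        if (totalRows < 0) {
            throw new IllegalArgumentException("totalRows must not be negative");
        }
        if (rowsPerFile < 1 || rowsPerFile > XLSX_MAX_ROWS) {
            throw new IllegalArgumentException("rowsPerFile must be between 1 and " + XLSX_MAX_ROWS);
        }
        List<Part> parts = new ArrayList<>();
        int index = 0;
        for (long from = 0; from < totalRows; from += rowsPerFile) {
            long to = Math.min(from + rowsPerFile, totalRows); // last part may be shorter
            parts.add(new Part("data_batch_" + index + ".xlsx", from, to));
            index++;
        }
        return parts;
    }

    public static void main(String[] args) {
        for (Part part : plan(25_000, 10_000)) {
            System.out.println(part.fileName + ": " + part.rowCount() + " rows");
        }
    }
}
```

For example, plan(10_000_000L, 10_000) yields 1,000 parts, which is worth knowing before the export starts, since that many files usually need to be zipped or placed in a dedicated directory.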

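Summary

With Spring Boot and Apache POI, large data sets can be exported to Excel efficiently, while paged queries and asynchronous processing keep the impact on system performance low. To optimize further, consider running the export queries against a replica to relieve pressure on the primary database.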
