EasyExcel实现百万级数据的导入导出
一、前言
通过SpringBoot集成EasyExcel实现百万级数据的导入导出
二、准备工作
1. 数据库
create table employee
(
employee_id bigint not null
primary key,
employee_name varchar(100) not null,
age int not null,
gender varchar(10) not null,
create_time datetime not null
);
2. 导入数据(使用存储过程向MySQL中加入100w条数据)
DELIMITER //
drop procedure IF EXISTS InsertTestData;
CREATE PROCEDURE InsertTestData()
BEGIN
DECLARE counter INT DEFAULT 0;
DECLARE gender VARCHAR(10);
WHILE counter < 1000000 DO
IF counter % 2 = 0 THEN
SET gender = '男';
ELSE
SET gender = '女';
END IF;
INSERT INTO demo64.employee (employee_id, employee_name, age, gender, create_time) VALUES
(counter, CONCAT('Name_', counter), FLOOR(RAND() * 100), gender, now()) ; -- 使用随机名称和年龄
SET counter = counter + 1;
END WHILE;
END//
DELIMITER ;
-- 调用存储过程插入数据
CALL InsertTestData();
3. SpringBoot中配置EasyExcel
3.1 pom.xml
中引入依赖
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>easyexcel</artifactId>
<version>2.2.6</version>
</dependency>
3.2 实体EmployeeDO.java
@Data
@AllArgsConstructor
@NoArgsConstructor
@Builder
@TableName("employee")
public class EmployeeDO implements Serializable {
@TableId("employee_id")
private Long employeeId;
@TableField(value = "employee_name")
private String employeeName;
private Integer age;
private String gender;
@TableField(value = "create_time")
private LocalDateTime createTime;
}
二、导出
1. 查全部,写入一个sheet
ExportServiceImpl.java
public void exportExcel1(HttpServletResponse response) throws IOException {
setExportHeader(response);
// 获取所有数据
List<EmployeeDO> doList = employeeMapper.selectList(null);
// EasyExcel写入并导出
EasyExcel.write(response.getOutputStream(), EmployeeDO.class).sheet().doWrite(doList);
}
注:该方法只适合少量数据导出,大数据量很慢
2. 查全部,写入多个sheet
ExportServiceImpl.java
public void exportExcel2(HttpServletResponse response)throws IOException{
setExportHeader(response);
// 获取所有数据
List<EmployeeDO> doList=employeeMapper.selectList(null);
try(ExcelWriter excelWriter=EasyExcel.write(response.getOutputStream(),EmployeeDO.class).build()){
// 定义3个sheet页
WriteSheet writeSheet1=EasyExcel.writerSheet(1,"模板1").build();
WriteSheet writeSheet2=EasyExcel.writerSheet(2,"模板2").build();
WriteSheet writeSheet3=EasyExcel.writerSheet(3,"模板3").build();
// 将数据分成3份
List<EmployeeDO> data1=doList.subList(0,doList.size()/3);
List<EmployeeDO> data2=doList.subList(doList.size()/3,doList.size()*2/3);
List<EmployeeDO> data3=doList.subList(doList.size()*2/3,doList.size());
// 写入对应的sheet页
excelWriter.write(data1,writeSheet1);
excelWriter.write(data2,writeSheet2);
excelWriter.write(data3,writeSheet3);
}
}
注:这种方法虽然将数据写入多个sheet,但还是很慢
3. 分页查询,每页写入一个sheet
ExportServiceImpl.java
public void exportExcel3(HttpServletResponse response) throws IOException {
setExportHeader(response);
try (ExcelWriter excelWriter = EasyExcel.write(response.getOutputStream(), EmployeeDO.class).build()) {
// 获取总的数量
Long count = employeeMapper.selectCount(null);
Integer pages = 20;
Long size = count / pages;
for (int i = 0; i < pages; i++) {
WriteSheet writeSheet = EasyExcel.writerSheet(i, "模板" + i).build();
Page<EmployeeDO> page = new Page<>();
page.setCurrent(i + 1);
page.setSize(size);
Page<EmployeeDO> selectPage = employeeMapper.selectPage(page, null);
excelWriter.write(selectPage.getRecords(), writeSheet);
}
}
}
4. 多线程分页查询,每页写入一个sheet
ExportServiceImpl.java
public void exportExcel4(HttpServletResponse response)throws InterruptedException,IOException {
setExportHeader(response);
Long count = employeeMapper.selectCount(null);
Integer pages = 20;
Long size = count / pages;
ExecutorService executorService = Executors.newFixedThreadPool(pages);
CountDownLatch countDownLatch = new CountDownLatch(pages);
Map<Integer, Page<EmployeeDO>> pageMap = new HashMap<>();
for(int i = 0; i < pages; i++) {
int finalI = i;
executorService.submit(new Runnable() {
@Override
public void run() {
Page<EmployeeDO> page = new Page<>();
page.setCurrent(finalI+1);
page.setSize(size);
Page<EmployeeDO> selectPage = employeeMapper.selectPage(page,null);
pageMap.put(finalI,selectPage);
countDownLatch.countDown();
}
});
}
countDownLatch.await();
try(ExcelWriter excelWriter = EasyExcel.write(response.getOutputStream(), EmployeeDO.class).build()){
for(Map.Entry<Integer, Page<EmployeeDO>>entry : pageMap.entrySet()) {
Integer num = entry.getKey();
Page<EmployeeDO> doPage = entry.getValue();
WriteSheet writeSheet = EasyExcel.writerSheet(num,"模板" + num).build();
excelWriter.write(doPage.getRecords(), writeSheet);
}
}
}
setExportHeader()
private static void setExportHeader(HttpServletResponse response) {
response.setContentType(CONTENT_TYPE);
response.setCharacterEncoding(StandardCharsets.UTF_8.name());
response.setHeader("Content-disposition", "attachment;filename*=utf-8''" + "test.xlsx");
}
注:至于为什么没有用多线程写入sheet页,可以看下面官方的解释
https://github.com/alibaba/easyexcel/issues/1040
三、导入
3.1 准备
导入需要写一个Listener类去实现ReadListener,具体代码如下:
EmployeeListener.java
@Component
@Slf4j
public class EmployeeListener extends ServiceImpl<EmployeeMapper, EmployeeDO> implements ReadListener<EmployeeDO>, IService<EmployeeDO> {
// 线程池
private ExecutorService executorService = Executors.newFixedThreadPool(20);
private ThreadLocal<ArrayList<Salaries>> salariesList = ThreadLocal.withInitial(ArrayList::new);
private static AtomicInteger count = new AtomicInteger(1);
private static final int batchSize = 10000;
@Resource
private EmployeeListener listener;
@Override
@Transactional(rollbackFor = Exception.class)
public void invoke(EmployeeDO employeeDO, AnalysisContext analysisContext) {
// todo
}
@Override
@Transactional(rollbackFor = Exception.class)
public void doAfterAllAnalysed(AnalysisContext analysisContext) {
// todo
}
}
3.2 方法一: 单线程逐条解析,单线程单条插入
ImportExcelServiceImpl.java
public void importExcel(MultipartFile file) throws IOException {
EasyExcel.read(file.getInputStream(), EmployeeDO.class, employeeListener).doReadAll();
}
EmployeeListener.java
@Component
@Slf4j
public class EmployeeListener extends ServiceImpl<EmployeeMapper, EmployeeDO> implements ReadListener<EmployeeDO>, IService<EmployeeDO> {
private ExecutorService executorService = Executors.newFixedThreadPool(20);
private ThreadLocal<ArrayList<Salaries>> salariesList = ThreadLocal.withInitial(ArrayList::new);
private static AtomicInteger count = new AtomicInteger(1);
private static final int batchSize = 10000;
@Resource
private EmployeeListener listener;
@Override
@Transactional(rollbackFor = Exception.class)
public void invoke(EmployeeDO employeeDO, AnalysisContext analysisContext) {
saveOne(employeeDO);
}
public void saveOne(EmployeeDO data){
save(data);
log.info("第" + count.getAndAdd(1) + "次插入1条数据");
}
@Override
@Transactional(rollbackFor = Exception.class)
public void doAfterAllAnalysed(AnalysisContext analysisContext) {
}
}
注:该方法只适合少量数据导入,大数据量是插入非常慢
3.3 方法二: 单线程逐条解析,单线程批量插入
public void importExcel(MultipartFile file) throws IOException {
EasyExcel.read(file.getInputStream(), EmployeeDO.class, employeeListener).doReadAll();
}
EmployeeListener.java
@Component
@Slf4j
public class EmployeeListener extends ServiceImpl<EmployeeMapper, EmployeeDO> implements ReadListener<EmployeeDO>, IService<EmployeeDO> {
private ExecutorService executorService = Executors.newFixedThreadPool(20);
private ArrayList<EmployeeDO> employeeList = new ArrayList<>();
private static AtomicInteger count = new AtomicInteger(1);
private static final int batchSize = 10000;
@Resource
private EmployeeListener listener;
@Override
@Transactional(rollbackFor = Exception.class)
public void invoke(EmployeeDO employeeDO, AnalysisContext analysisContext) {
employeeList.add(employeeDO);
if (employeeList.size() >= batchSize) {
saveData();
}
}
public void saveData() {
if (!employeeList.isEmpty()) {
saveBatch(employeeList, employeeList.size());
log.info("第" + count.getAndAdd(1) + "次插入" + employeeList.size() + "条数据");
employeeList.clear();
}
}
@Override
@Transactional(rollbackFor = Exception.class)
public void doAfterAllAnalysed(AnalysisContext analysisContext) {
log.info("一个Sheet全部处理完");
if (employeeList.size() >= batchSize) {
saveData();
}
}
}
3.4 方法三: 多线程解析,单线程批量插入
开启20个线程分别处理20个sheet页
private ExecutorService executorService = Executors.newFixedThreadPool(20);
public void importExcelAsync(MultipartFile file) {
// 开20个线程分别处理20个sheet
List<Callable<Object>> tasks = new ArrayList<>();
for (int i = 0; i < 20; i++) {
int num = i;
tasks.add(() -> {
EasyExcel.read(file.getInputStream(), EmployeeDO.class, employeeListener)
.sheet(num).doRead();
return null;
});
}
try {
executorService.invokeAll(tasks);
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
}
EmployeeListener.java
@Component
@Slf4j
public class EmployeeListener extends ServiceImpl<EmployeeMapper, EmployeeDO> implements ReadListener<EmployeeDO>, IService<EmployeeDO> {
private ExecutorService executorService = Executors.newFixedThreadPool(20);
private ThreadLocal<ArrayList<EmployeeDO>> employeeList = ThreadLocal.withInitial(ArrayList::new);
private static AtomicInteger count = new AtomicInteger(1);
private static final int batchSize = 10000;
@Resource
private EmployeeListener listener;
@Override
@Transactional(rollbackFor = Exception.class)
public void invoke(EmployeeDO employeeDO, AnalysisContext analysisContext) {
employeeList.get().add(employeeDO);
if (employeeList.get().size() >= batchSize) {
saveData();
}
}
public void saveData() {
if (!employeeList.get().isEmpty()) {
saveBatch(employeeList.get(), employeeList.get().size());
log.info("第" + count.getAndAdd(1) + "次插入" + employeeList.get().size() + "条数据");
employeeList.get().clear();
}
}
@Override
@Transactional(rollbackFor = Exception.class)
public void doAfterAllAnalysed(AnalysisContext analysisContext) {
log.info("一个Sheet全部处理完");
if (employeeList.get().size() >= batchSize) {
saveData();
}
}
}
3.5 方法四: 多线程解析,多线程批量插入
private ExecutorService executorService = Executors.newFixedThreadPool(20);
public void importExcelAsync(MultipartFile file) {
// 开20个线程分别处理20个sheet
List<Callable<Object>> tasks = new ArrayList<>();
for (int i = 0; i < 20; i++) {
int num = i;
tasks.add(() -> {
EasyExcel.read(file.getInputStream(), EmployeeDO.class, employeeListener)
.sheet(num).doRead();
return null;
});
}
try {
executorService.invokeAll(tasks);
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
}
EmployeeListener.java
@Component
@Slf4j
public class EmployeeListener extends ServiceImpl<EmployeeMapper, EmployeeDO> implements ReadListener<EmployeeDO>, IService<EmployeeDO> {
private ExecutorService executorService = Executors.newFixedThreadPool(20);
// private ArrayList<EmployeeDO> employeeList = new ArrayList<>();
private ThreadLocal<ArrayList<EmployeeDO>> employeeList = ThreadLocal.withInitial(ArrayList::new);
private static AtomicInteger count = new AtomicInteger(1);
private static final int batchSize = 10000;
@Resource
private EmployeeListener listener;
@Override
@Transactional(rollbackFor = Exception.class)
public void invoke(EmployeeDO employeeDO, AnalysisContext analysisContext) {
employeeList.get().add(employeeDO);
if (employeeList.get().size() >= batchSize) {
asyncSaveData();
}
}
public void asyncSaveData() {
if (!employeeList.get().isEmpty()) {
ArrayList<EmployeeDO> employeeDOS = (ArrayList<EmployeeDO>) employeeList.get().clone();
executorService.execute(new SaveTask(employeeDOS, listener));
employeeList.get().clear();
}
}
static class SaveTask implements Runnable {
private List<EmployeeDO> employeeList;
private EmployeeListener listener;
public SaveTask(List<EmployeeDO> employeeList, EmployeeListener listener) {
this.employeeList = employeeList;
this.listener = listener;
}
@Override
public void run() {
listener.saveBatch(employeeList);
log.info("第" + count.getAndAdd(1) + "次插入" + employeeList.size() + "条数据");
}
}
@Override
@Transactional(rollbackFor = Exception.class)
public void doAfterAllAnalysed(AnalysisContext analysisContext) {
log.info("一个Sheet全部处理完");
if (employeeList.get().size() >= batchSize) {
asyncSaveData();
}
}
}
四、总结
- 少量数据可以用单线程逐条读取,单线程写入或者批量写入的方式导入, 大量数据建议使用多线程读取,分多个sheet页存储的方式进行导入
- 少量数据的导出可以直接使查询所有数据,并写入一个sheet页中,数据量大的情况下还是要用多线程分页读取,写入到多个sheet页中