poi和easyExcel 读取文件内存占用统计

介绍:

虽然POI是目前使用最多的用来做excel解析的框架,但这个框架并不那么完美。大部分使用POI都是使用他的userModel模式。userModel的好处是上手容易使用简单,随便拷贝个代码跑一下,剩下就是写业务转换了,虽然转换也要写上百行代码,相对比较好理解。然而userModel模式最大的问题是在于非常大的内存消耗,一个几兆的文件解析要用掉上百兆的内存。现在很多应用采用这种模式,之所以还正常在跑一定是并发不大,并发上来后一定会OOM或者频繁的full gc。

EasyExcel是一个基于Java的简单、省内存的读写Excel的开源项目。在尽可能节约内存的情况下支持读写百M的Excel。

前提:需要准备一个excel 文件 (历史问卷-混合.xlsx) 大小约为44M

我们直接来看结果对比

图1 是poi 方式的,图2是easyExcel方式的,明显easyExcel的内存占用小很多很多 ,内存运行也相对稳定,并且读取速度也比poi方式快很多倍。

图1:在这里插入图片描述
图2:
在这里插入图片描述

控制台打印内存情况(另一个文件, 约14M左右)

poi
解析Excel前内存:81940KB
解析Excel后内存:3827028KB
(3827028KB-81940KB)/1024/1024=3.5G
easyexcel
解析Excel前内存:81940KB
解析Excel后内存:804407KB
(804407KB-81940KB)/1024/1024=0.69G

实现:

1.引入poi和easyExcel的包

<dependencies>
    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>easyexcel</artifactId>
        <version>1.1.2-beta4</version>
    </dependency>
     
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi</artifactId>
        <version>4.0.0</version>
    </dependency>
     
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml</artifactId>
        <version>4.0.0</version>
    </dependency>
     
    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml-schemas</artifactId>
        <version>4.0.0</version>
    </dependency>
</dependencies>

2.编写测试内容–

poiExcelTest.java

package excel;
 
import com.alibaba.excel.EasyExcelFactory;
import com.alibaba.excel.ExcelReader;
import com.alibaba.excel.context.AnalysisContext;
import com.alibaba.excel.event.AnalysisEventListener;
import com.alibaba.excel.support.ExcelTypeEnum;
import org.apache.poi.ss.usermodel.Sheet;
 
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.List;
 
/**
 * @Author: wjx
 * @Date: 2021/5/7-16:38
 * @Version: 1.0
 */
public class easyExcelTest {
    public static void main(String[] args) throws InterruptedException {
        System.out.println("解析Excel前内存:"+(Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory())/(1024)+"KB");
        read();
        System.out.println("解析Excel后内存:"+(Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory())/(1024)+"KB");
        System.out.println();
        Thread.sleep(20000);
    }
 
    private static void read() {
        try(InputStream inputStream = new FileInputStream("E:\历史问卷-混合2.xlsx")) {
            ExcelReader excelReader = new ExcelReader(inputStream, ExcelTypeEnum.XLSX, null, new AnalysisEventListener<List<String>>() {
 
                @Override
                public void invoke(List<String> object, AnalysisContext context) {
                    /*System.out.println("当前sheet:" + context.getCurrentSheet().getSheetNo() + ",当前行:" +
                            context.getCurrentRowNum());*/
                }
 
                @Override
                public void doAfterAllAnalysed(AnalysisContext context) {
 
                }
            });
            excelReader.read();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

ExcelListener.java

package excel;
 
import com.alibaba.excel.context.AnalysisContext;
import com.alibaba.excel.event.AnalysisEventListener;
 
import java.util.ArrayList;
import java.util.List;
 
/**
 * @Author: wjx
 * @Date: 2021/5/7-16:40
 * @Version: 1.0
 */
public class ExcelListener extends AnalysisEventListener {
 
 
    private List<Object> data = new ArrayList<Object>();
 
    @Override
    public void invoke(Object object, AnalysisContext context) {
        data.add(object);
    }
 
    @Override
    public void doAfterAllAnalysed(AnalysisContext context) {
 
    }
}

easyExcelTest.java

package excel;
 
import org.apache.poi.hssf.usermodel.HSSFDataFormatter;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
 
import java.io.File;
import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
 
/**
 * @Author: wjx
 * @Date: 2021/5/7-16:28
 * @Version: 1.0
 */
public class poiExcelTest {
 
    private int totalRows = 0;// 总行数
    private int totalCells = 0;// 总列数
 
    public Map<String, List<List<String>>> read(String fileName) {
        Map<String, List<List<String>>> maps = new HashMap<String, List<List<String>>>();
        if (fileName == null || !fileName.matches("^.+\.(?i)((xls)|(xlsx))$"))
            return maps;
        File file = new File(fileName);
        if (file == null || !file.exists())
            return maps;
        try {
            Workbook wb = WorkbookFactory.create(new FileInputStream(file));
            maps = read(wb);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return maps;
    }
 
    public int getTotalRows() {
        return totalRows;
    }
 
    public int getTotalCells() {
        return totalCells;
    }
 
    private Map<String, List<List<String>>> read(Workbook wb) {
        Map<String, List<List<String>>> maps = new HashMap<String, List<List<String>>>();
        int number = wb.getNumberOfSheets();
        if (number > 0) {
            for (int i = 0; i < number; i++) { // 循环每个工作表
                List<List<String>> list = new ArrayList<List<String>>();
                int delnumber = 0;// 第一页去除行数
                Sheet sheet = wb.getSheetAt(i);
                this.totalRows = sheet.getPhysicalNumberOfRows() - delnumber; // 获取工作表中行数
                if (this.totalRows >= 1 && sheet.getRow(delnumber) != null) {
                    this.totalCells = sheet.getRow(0)
                            .getPhysicalNumberOfCells(); // 得到当前行的所有单元格
                    for (int j = 0; j < totalRows; j++) {
                        List<String> rowLst = new ArrayList<String>();
                        for (int f = 0; f < totalCells; f++) {
                            if (totalCells > 0) {
                                String value = getCell(sheet.getRow(j).getCell(f));
                                rowLst.add(value);
                            }
                        }
                        list.add(rowLst);
                    }
                }
                maps.put(sheet.getSheetName(), list);
            }
        }
        return maps;
    }
 
    public String getCell(Cell cell) {
        String cellValue = null;
        HSSFDataFormatter hSSFDataFormatter = new HSSFDataFormatter();
        cellValue = hSSFDataFormatter.formatCellValue(cell); // 使用EXCEL原来格式的方式取得值
        return cellValue;
    }
 
    public static void main(String[] args) {
        try {
            System.out.println("解析Excel前内存:"+(Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory())/(1024)+"KB");
 
            Map<String, List<List<String>>> map = new poiExcelTest()
                    .read("E:\历史问卷-混合2.xlsx");
            System.out.println("解析Excel后内存:"+(Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory())/(1024)+"KB");
 
            //System.out.println(map);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
 
 
}

实际使用动态拼接头部和使用头部样式

最主要:
ExcelWriter excelWriter = write(os)
.registerWriteHandler(new CustomHandler()).build();

private FileDescriptor getZipBytes(Map<DPaper, List<Map<Object, Object>>> paperResultMap, Map<DPaper, List<IndexObject>> paperIndexMap, String module) {
        // 创建excel表格
        Stopwatch stopwatch1 = Stopwatch.createStarted();
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream()) {

            ZipArchiveOutputStream zos = new ZipArchiveOutputStream(bos);
            zos.setEncoding(StandardCharsets.UTF_8.name());

            Map<Integer, List<DPaperQues>> paperQuesMap = dataManager.load(DPaperQues.class)
                    .query("select e from iifdsp$DPaperQues e  where e.paper in :paperIds order by e.quesSnum ")
                    .parameter("paperIds", paperResultMap.keySet())
                    .view("dPaperQues-view")
                    .list()
                    .stream()
                    .collect(Collectors.groupingBy(DPaperQues::getPaper));

            paperResultMap.forEach((k, v) -> {
                String fileName = k.getPaperName() + ".xlsx";
                try (ByteArrayOutputStream os = new ByteArrayOutputStream()){

                    ExcelWriter excelWriter = write(os)
                            .registerWriteHandler(new CustomHandler()).build();
                    // 生成excel文件
                    List<DPaperQues> paperQuesList=paperQuesMap.get(k.getId());
                    List<String> captionList = ExcelExportUtils.getCaptionList(v, paperIndexMap.get(k));
                    createSheetData(excelWriter,paperQuesList, paperIndexMap.get(k), captionList);

                    os.flush();
                    // 加入压缩文件
                    byte[] bytes = os.toByteArray();
                    ArchiveEntry singleDesignEntry = new ZipArchiveEntry(fileName);

                    zos.putArchiveEntry(singleDesignEntry);
                    zos.write(bytes);
                    zos.closeArchiveEntry();
                } catch (IOException io) {
                    log.warn("生成excel失败", io);
                }
            });
            zos.close();


            // 记录系统日志
            UserSession userSession = userSessionSource.getUserSession();
            systemLogOutputBean.outputSystemLog(userSession, module,"导出文件", SystemLogTypeEnum.USER_OPERATE);
            exceptionLogStatisticsService.exceptionLogStatisticsTask(ExceptionLogTypeEnum.FILE_DOWNLOAD_EXCEPTION, userSession.getUser().getLogin(), userSession.getAddress());

            FileDescriptor fileDescriptor = metadata.create(FileDescriptor.class);
            fileDescriptor.setExtension("zip");
            fileDescriptor.setCreateDate(new Date());
            fileDescriptor.setName(getFileDesString());

            fileDescriptor.setSize(Long.parseLong(bos.toByteArray().length + ""));
            fileStorageService.saveFile(fileDescriptor, bos.toByteArray());
            return fileDescriptor;
        } catch (Exception e) {
            log.error("导出失败", e);
        } finally {
            log.debug("Excel Generated done, using :{} ms", stopwatch1.stop().elapsed(TimeUnit.MILLISECONDS));
        }
        return null;
    }

    private String getFileDesString() {
        return "明细_"
                + simpleDateFormat2.format(new Date())
                + ".zip";
    }

    //新数据填充
    private void createSheetData(ExcelWriter writer,List<DPaperQues> paperQuesList,List<IndexObject> indexObjects, List<String> captionList) {
        WriteSheet index1sheet = new WriteSheet();
        WriteSheet index2sheet = new WriteSheet();
        WriteSheet index3sheet = new WriteSheet();

        index1sheet.setSheetName("一级指标得分");
        index2sheet.setSheetName("二级指标得分");
        index3sheet.setSheetName("三级指标得分");

        if (!indexObjects.isEmpty()) {

            IndexObject indexSample = indexObjects.get(0);
            List<String> captionList2 = Arrays.asList((USER_NAME + COMMA + COMPANY_NAME).split(COMMA));

            List<String> index2CaptionList = new ArrayList<>(captionList2);
            List<String> index3CaptionList = new ArrayList<>(captionList2);
            indexSample.getChildren()
                    .forEach(index1 ->
                            Optional.ofNullable(index1.getChildren())
                                    .ifPresent(index2List ->
                                            index2CaptionList.addAll(index2List
                                                    .stream()
                                                    .map(index2 -> {
                                                        Optional.ofNullable(index2.getChildren())
                                                                .ifPresent(index3List ->
                                                                        index3CaptionList.addAll(index3List
                                                                                .stream()
                                                                                .map(IndexObject::getIndexName)
                                                                                .collect(Collectors.toList())));
                                                        return index2.getIndexName();
                                                    })
                                                    .collect(Collectors.toList()))
                                    ));
            Stopwatch stopwatch = Stopwatch.createStarted();
            List<String> fourTitleList=new ArrayList<>();
            //创建第一个表
            createEasyExcelFirstSheet(index1sheet,captionList,indexSample,paperQuesList,fourTitleList);
            //创建第二、三个表
            createEasyExcelSecondSheet(index2CaptionList,index3CaptionList,index2sheet,index3sheet);
            //创建内容
            createEasyExcelData(indexObjects,captionList,writer,index2CaptionList,index3CaptionList,index1sheet,index2sheet,index3sheet,fourTitleList);
            log.debug("Create EasyExcel Data use {} ms",stopwatch.elapsed(TimeUnit.MICROSECONDS));
            log.debug("Create EasyExcel Data,memory: {} MB",(Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory())/(1024)/(1024));

            writer.finish();
        }
    }

    /**
     * 创建第一个sheet 页
     */
    public void  createEasyExcelFirstSheet(WriteSheet index1sheet,List<String> captionList,
                                           IndexObject indexSample,List<DPaperQues> paperQuesList,List<String> fourTitleList){

        //第一个sheet的表头特殊处理
        // 基本信息 + 一级指标 列数
        int baseInfoColumnNum = captionList.size() -
                Optional.ofNullable(indexSample.getAnswerResult())
                        .map(answerResult -> answerResult.keySet().size())
                        .orElse(0);

        // 动态添加 表头 headList --> 所有表头行集合
        List<List<String>> headList = new ArrayList<>();

        // 固定表头生成
        for (int i = 0; i < baseInfoColumnNum; i++) {
            List<String> headTitleI = new ArrayList<>();
            headTitleI.add(" ");
            headTitleI.add(" ");
            headTitleI.add(" ");
            headTitleI.add(captionList.get(i));
            headList.add(headTitleI);
        }

        createDynamicHead(baseInfoColumnNum,paperQuesList,captionList,headList,fourTitleList);

        index1sheet.setHead(headList);
        index1sheet.setSheetNo(1);
    }

    private  void createDynamicHead(int baseInfoColumnNum,List<DPaperQues> paperQuesList,List<String> captionList,
                                    List<List<String>> headList,List<String> fourTitleList){
        // 动态表头
        int fourIndex = baseInfoColumnNum;
        Map<String, String> lastQues = new HashMap<>();
        for (int i = 0; i < paperQuesList.size(); i++) {
            DPaperQues paperQues = paperQuesList.get(i);
            DQuestion ques = paperQues.getQues();

            if (ques.getQuesNo().equals(lastQues.get("no"))) {
                lastQues.put("no", ques.getQuesNo());
                lastQues.put("title", ques.getQuesTitle());
                continue;
            }

            List<String> heads = new ArrayList<>();
            // 第二行标题,如果有子问卷第二行标题需要做合并操作
            Integer quesSnum = paperQues.getQuesSnum();
            if (ques.getQuestionItems().isEmpty()) {
                String itemCode = String.format("%s:A:T", quesSnum);
                fourIndex=firstHeader(headList,heads,ques,lastQues,i,captionList,fourIndex,itemCode,fourTitleList);
            } else {
                final int[] itemIndex = {0};
                for (DQuesItem quesItem : ques.getQuestionItems()) {
                    String itemCode = String.format("%s:%s", quesSnum, (char) (65 + itemIndex[0]));
                    fourIndex=quesItemHeader(headList,ques,quesItem,i,captionList,fourIndex,lastQues,itemCode,fourTitleList);
                    itemIndex[0] ++;
                }
            }

            lastQues.put("no", ques.getQuesNo());
            lastQues.put("title", ques.getQuesTitle());
        }
    }

    private int firstHeader(List<List<String>> headList,List<String> heads,DQuestion ques,Map<String, String> lastQues,int i,
                            List<String> captionList,int fourIndex,String itemCode,List<String> fourTitleList){

        // 第一行标题 题目编号
        heads.add(ques.getQuesNo());
        // 空两行 最后一行给baseInfoColumnNum去生成
        heads.add(ques.getQuesTitle());
        String space = generateSpace(i);
        heads.add(space);
        lastQues.put("space", space);
        if (fourIndex < captionList.size()) {
            String colFour;
            if (next(ques, lastQues.get("no"))) {
                colFour = captionList.get(fourIndex);
                lastQues.put("four", colFour);
                heads.add(itemCode);
                fourIndex++;
            } else {
                colFour = lastQues.get("four");
                heads.add(colFour);
            }

            fourTitleList.add(itemCode);
        }
        headList.add(heads);
        return fourIndex;
    }

    private int quesItemHeader(List<List<String>> headList,DQuestion ques,DQuesItem quesItem,int i,
                               List<String> captionList,int fourIndex,Map<String, String> lastQues,
                               String itemCode,List<String> fourTitleList){
        List<String> heads=new ArrayList<>();
        // 第一行标题 题目编号
        heads.add(ques.getQuesNo());
        heads.add(StringUtils.isBlank(ques.getQuesTitle()) ? generateSpace(i) : ques.getQuesTitle());
        heads.add(StringUtils.isBlank(quesItem.getItemContent()) ? generateSpace(i + 1) : quesItem.getItemContent());

        if (fourIndex < captionList.size()) {
            heads.add(captionList.get(fourIndex));
            lastQues.put("four", itemCode);
            fourTitleList.add(itemCode);
            fourIndex++;
        }
        headList.add(heads);

        //判断 当前的选项是否是填空
        if( quesItem.getIsInput()==1 ){
            List<String> headInput = new ArrayList<>(heads);
            headInput.remove(3);
            headInput.add(3, heads.get(3).concat(":T"));
            fourTitleList.add(heads.get(3).concat(":T"));
            fourIndex++;
            headList.add(headInput);
        }
        return fourIndex;
    }

    /**
     * 创建第二、三个sheet 页
     */
    private void createEasyExcelSecondSheet(List<String> index2CaptionList,List<String> index3CaptionList
            ,WriteSheet index2sheet,WriteSheet index3sheet){

        List<List<String>> headers2=new ArrayList<>();
        for(String header:index2CaptionList){
            headers2.add(Lists.newArrayList(header));
        }
        index2sheet.setHead(headers2);
        List<List<String>> headers3=new ArrayList<>();
        for(String header:index3CaptionList){
            headers3.add(Lists.newArrayList(header));
        }
        index3sheet.setHead(headers3);
    }

 private boolean next(DQuestion ques, String no) {
        return !ques.getQuesNo().equals(no) || (ques.getQuesNo().equals(no) && !ques.getQuestionItems().isEmpty());
    }

    private String generateSpace(int qty) {
        StringBuilder builder = new StringBuilder(" ");
        for (int i = 0; i < qty; i++) builder.append(" ");
        return builder.toString();
    }
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值