Office软件一直是一个诲誉参半的软件,广大普通计算机用户用Office来满足日常办公需求,于是就产生了很多生产数据和文档,需要和企业单位的专用办公系统对接,而Office的解析工作一直是程序员非常头痛的问题,经常招致程序员的谩骂,也被誉为是微软最烂的发明之一。POI的诞生解决了Excel的解析难题(POI即“讨厌的电子表格”,确实很讨厌,我也很讨厌Excel),但如果用不好POI,也会导致程序出现一些BUG,例如内存溢出,假空行,公式等等问题。下面介绍一种解决POI读取Excel内存溢出的问题。
POI读取Excel有两种模式,一种是用户模式,一种是SAX模式,将xlsx格式的文档转换成CVS格式后再进行处理用户模式相信大家都很清楚,也是POI常用的方式,用户模式API接口丰富,我们可以很容易的使用POI的API读取Excel,但用户模式消耗的内存很大,当遇到很多sheet、大数据网格、假空行、公式等问题时,很容易导致内存溢出。POI官方推荐解决内存溢出的方式使用CVS格式解析,我们不可能手工将Excel文件转换成CVS格式再上传,这样做太麻烦了,好再POI给出了xlsx转换CVS的例子,基于这个例子我进行了一下改造,即可解决用户模式读取Excel内存溢出的问题。下面附上代码:
/* ====================================================================
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
==================================================================== */
import java.io.IOException;
import java.io.InputStream;
import java.io.PrintStream;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.apache.poi.hssf.usermodel.HSSFDateUtil;
import org.apache.poi.openxml4j.exceptions.OpenXML4JException;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.openxml4j.opc.PackageAccess;
import org.apache.poi.ss.usermodel.BuiltinFormats;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.xssf.eventusermodel.ReadOnlySharedStringsTable;
import org.apache.poi.xssf.eventusermodel.XSSFReader;
import org.apache.poi.xssf.model.StylesTable;
import org.apache.poi.xssf.usermodel.XSSFCellStyle;
import org.apache.poi.xssf.usermodel.XSSFRichText