Reading a .doc file with Apache POI fails with the following error:
java.lang.ArrayIndexOutOfBoundsException: Index 65946 out of bounds for length 9355
at org.apache.poi.util.LittleEndian.getUShort(LittleEndian.java:355)
at org.apache.poi.hwpf.model.FileInformationBlock.<init>(FileInformationBlock.java:118)
at org.apache.poi.hwpf.HWPFDocumentCore.<init>(HWPFDocumentCore.java:170)
at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:193)
at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:177)
at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:165)
Code:
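import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.poi.hwpf.HWPFDocument;
// path is the location of the .doc file; try-with-resources closes the stream
try (InputStream is = new FileInputStream(path)) {
    HWPFDocument doc = new HWPFDocument(is); // the exception above is thrown here
    StringBuilder buffer = doc.getText();
}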
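A search on Stack Overflow turned up someone with the same problem. It turns out to be a bug in Apache POI, and the answer links to the Apache bug tracker where it was reported. The bug has been open for years, so the maintainers probably have not had time to fix it.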
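
Until the bug is fixed, one possible way to live with it is to stop a single bad file from aborting the whole job. The failure surfaces as an unchecked ArrayIndexOutOfBoundsException, so catching only IOException is not enough. The sketch below is an assumption about how a caller might guard against that, not something from the Stack Overflow answer: SafeExtract and tryExtract are hypothetical names, and the lambda in main is a stand-in that fails the same way instead of calling POI, so the example runs with the JDK alone.

```java
import java.util.Optional;
import java.util.concurrent.Callable;

class SafeExtract {
    // Runs a text extractor and returns an empty Optional if it fails.
    // Catching Exception covers IOException as well as unchecked
    // exceptions such as ArrayIndexOutOfBoundsException.
    static Optional<String> tryExtract(Callable<String> extractor) {
        try {
            return Optional.ofNullable(extractor.call());
        } catch (Exception e) {
            System.err.println("Could not read document: " + e);
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a parser that reads past the end of its buffer,
        // as in the stack trace above.
        Optional<String> text = tryExtract(() -> {
            byte[] data = new byte[4];
            return String.valueOf(data[10]); // throws ArrayIndexOutOfBoundsException
        });
        System.out.println(text.isPresent() ? text.get() : "skipped unreadable file");
    }
}
```

With POI on the classpath, the lambda body would open the file and return new HWPFDocument(is).getText().toString() instead of the stand-in. This only skips the affected file; it does not recover its text.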