Incomplete InputStream read causes "Unexpected end of ZLIB input stream"

Error message

In the test environment, a file-reading feature failed with the following error log:

java.io.IOException: Failed to read zip entry source

 at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:103)
 at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:324)
 at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:185)
 at org.apache.poi.ss.usermodel.WorkbookFactory.create(WorkbookFactory.java:144)
 at com.wwwarehouse.xdw.commonindustry.manager.impl.importConvert.templateList.GoodsConvertImpl.cloneExcelTemplate(GoodsConvertImpl.java:406)
 at com.wwwarehouse.xdw.commonindustry.manager.impl.importConvert.templateList.GoodsConvertImpl.genErrorExcelBytes(GoodsConvertImpl.java:299)
 at com.wwwarehouse.xdw.commonindustry.service.dubbo.DubboServiceTest.importGoods(DubboServiceTest.java:84)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.springframework.test.context.junit4.statements.RunBeforeTestMethodCallbacks.evaluate(RunBeforeTestMethodCallbacks.java:75)
 at org.springframework.test.context.junit4.statements.RunAfterTestMethodCallbacks.evaluate(RunAfterTestMethodCallbacks.java:86)
 at org.springframework.test.context.junit4.statements.SpringRepeat.evaluate(SpringRepeat.java:84)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
 at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:252)
 at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:94)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
 at org.springframework.test.context.junit4.statements.RunBeforeTestClassCallbacks.evaluate(RunBeforeTestClassCallbacks.java:61)
 at org.springframework.test.context.junit4.statements.RunAfterTestClassCallbacks.evaluate(RunAfterTestClassCallbacks.java:70)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
 at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.run(SpringJUnit4ClassRunner.java:191)
 at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
 at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
 at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
 at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
 at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
Caused by: java.io.EOFException: Unexpected end of ZLIB input stream
 at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:240)
 at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
 at java.util.zip.ZipInputStream.read(ZipInputStream.java:194)
 at org.apache.poi.openxml4j.util.ZipSecureFile$ThresholdInputStream.read(ZipSecureFile.java:213)
 at java.io.FilterInputStream.read(FilterInputStream.java:107)
 at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource$FakeZipEntry.<init>(ZipInputStreamZipEntrySource.java:132)
 at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(ZipInputStreamZipEntrySource.java:56)
 at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:100)
 ... 34 more

The corresponding source code:

InputStream is = ...
int available = is.available();
byte[] fileBytes = new byte[available];
int read = is.read(fileBytes);

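Creating the file from fileBytes afterwards failed with the error above.

Troubleshooting

Everything worked when I verified it locally. A quick search online suggested that the stream had not been closed, so the read was incomplete.
I checked the code and the input stream was indeed never closed. As a veteran developer, I gave myself a silent scolding.
But after adding the close call, the problem remained.

I then added some log output: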

InputStream is = ...
int available = is.available();
byte[] fileBytes = new byte[available];
int read = is.read(fileBytes);
log.info("read:" + read + ", available:" + available);

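Run locally, everything was fine: read and available were equal.
In the test environment, the log printed: read:9608, available:54192
Why did the number of bytes read differ from available? I checked the comment on the read method of InputStream:
it returns the number of bytes actually read, which is at most the length of the byte array. It does not, as I had mistakenly assumed, always fill the whole array.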
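
The short read is easy to reproduce without a virtual machine or a slow disk. The sketch below is only an illustration, not the original code: ChunkedInputStream and ShortReadDemo are hypothetical names, and the stream simply hands out at most a fixed number of bytes per read call, which the InputStream contract permits. The chunk size and data length are taken from the log line above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical stream that returns at most maxChunk bytes per read call. */
final class ChunkedInputStream extends InputStream {
    private final InputStream delegate;
    private final int maxChunk;

    ChunkedInputStream(byte[] data, int maxChunk) {
        this.delegate = new ByteArrayInputStream(data);
        this.maxChunk = maxChunk;
    }

    @Override
    public int read() throws IOException {
        return delegate.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        // A short read: legal under the InputStream contract.
        return delegate.read(b, off, Math.min(len, maxChunk));
    }

    @Override
    public int available() throws IOException {
        // Reports everything that is left, like the stream in the log above.
        return delegate.available();
    }
}

final class ShortReadDemo {
    /** Repeats the buggy pattern and returns {read, available}. */
    static int[] readOnce(byte[] data, int maxChunk) {
        try (InputStream is = new ChunkedInputStream(data, maxChunk)) {
            int available = is.available();
            byte[] fileBytes = new byte[available];
            int read = is.read(fileBytes); // single call, may stop early
            return new int[] {read, available};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = readOnce(new byte[54192], 9608);
        // prints "read:9608, available:54192"
        System.out.println("read:" + result[0] + ", available:" + result[1]);
    }
}
```

Everything in the buffer past index read - 1 stays zero-filled, which is why a ZIP-based file built from it is truncated.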

I changed the reading code and the problem was solved:

import org.apache.commons.io.IOUtils;
...

byte[] fileBytes = IOUtils.toByteArray(is);

In principle, the utility class works like this:

byte[] buffer = new byte[1024];
int len = 0;
ByteArrayOutputStream bos = new ByteArrayOutputStream();
while((len = inputStream.read(buffer)) != -1) {
     bos.write(buffer, 0, len);
}
bos.close();
return bos.toByteArray();
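
If pulling in Commons IO is not an option, the JDK alone can do the job. The following is a minimal sketch, assuming the expected length is known from a reliable source (for example a file size) rather than from available(). It relies on DataInputStream.readFully, which keeps reading until the array is full and throws EOFException if the stream ends first. ReadFullyDemo and its chunked helper are hypothetical names used only to simulate short reads.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ReadFullyDemo {
    /** Reads exactly expectedLength bytes, looping over short reads. */
    static byte[] readExactly(InputStream is, int expectedLength) {
        byte[] bytes = new byte[expectedLength];
        try {
            // readFully calls read repeatedly until the array is full.
            new DataInputStream(is).readFully(bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes;
    }

    /** Illustrative stream that hands out at most maxChunk bytes per read call. */
    static InputStream chunked(byte[] data, final int maxChunk) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, maxChunk));
            }
        };
    }

    public static void main(String[] args) {
        byte[] data = new byte[54192];
        byte[] copy = readExactly(chunked(data, 9608), data.length);
        System.out.println("read:" + copy.length + ", expected:" + data.length);
    }
}
```

On Java 9 and later, InputStream.readAllBytes() is another option when the length is not known in advance.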

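Why did this problem occur

I briefly reviewed some operating system material. My understanding is as follows:
Operating systems manage memory with segmentation and paging. Memory is more expensive than cheap disk, so when the OS loads content from external storage, it does not need to load all of it into memory.
Otherwise, an application larger than physical memory could not run at all.
With paged memory management, for example, space is allocated by page and files are loaded page by page on access. On a hit, the content is returned directly; otherwise a page fault is raised and the data is read from disk again.
Our server is a Linux virtual machine. My guess is that its disk access goes through more layers of I/O mapping than direct access on my local physical machine, so the read was interrupted partway through.
What came back was only the first portion that had been read, not the complete content. Whatever the exact cause, the contract of read explicitly allows fewer bytes than requested, so correct code must keep reading until read returns -1.

This is a rough, surface-level understanding. Interested readers can dig into operating system memory management and I/O principles.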
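
Separately from the OS-level guess above, sizing the buffer from available() is fragile in its own right: the method is documented as an estimate of the bytes that can be read without blocking, and the base InputStream implementation simply returns 0. The sketch below illustrates this with a hypothetical three-byte stream that does not override available(); AvailableDemo and its helpers are illustrative names, not part of the original code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class AvailableDemo {
    /** Hypothetical stream holding 3 bytes that does not override available(). */
    static InputStream threeBytes() {
        return new InputStream() {
            private int remaining = 3;

            @Override
            public int read() {
                return remaining-- > 0 ? 42 : -1;
            }
        };
    }

    /** Returns what available() reports for the stream. */
    static int reportedAvailable(InputStream is) {
        try {
            return is.available();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Counts the bytes by reading until end of stream. */
    static int countBytes(InputStream is) {
        try {
            int count = 0;
            while (is.read() != -1) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints "available:0, actual:3"
        System.out.println("available:" + reportedAvailable(threeBytes())
                + ", actual:" + countBytes(threeBytes()));
    }
}
```

With such a stream, new byte[is.available()] would allocate an empty buffer even though data is present.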

Appendix: source comment of the InputStream.read method (JDK 1.8)

/**
 ... omitted, see the method comment below
 */
public int read(byte b[]) throws IOException {
    return read(b, 0, b.length);
}

/**
 * Reads up to <code>len</code> bytes of data from the input stream into
 * an array of bytes.  An attempt is made to read as many as
 * <code>len</code> bytes, but a smaller number may be read.
 * The number of bytes actually read is returned as an integer.
 *
 * <p> This method blocks until input data is available, end of file is
 * detected, or an exception is thrown.
 *
 * <p> If <code>len</code> is zero, then no bytes are read and
 * <code>0</code> is returned; otherwise, there is an attempt to read at
 * least one byte. If no byte is available because the stream is at end of
 * file, the value <code>-1</code> is returned; otherwise, at least one
 * byte is read and stored into <code>b</code>.
 *
 Note the following paragraph: let k be the number of bytes actually read; the data fills b[off] through b[off+k-1], while b[off+k] through b[off+len-1] are left unchanged
 * <p> The first byte read is stored into element <code>b[off]</code>, the
 * next one into <code>b[off+1]</code>, and so on. The number of bytes read
 * is, at most, equal to <code>len</code>. Let <i>k</i> be the number of
 * bytes actually read; these bytes will be stored in elements
 * <code>b[off]</code> through <code>b[off+</code><i>k</i><code>-1]</code>,
 * leaving elements <code>b[off+</code><i>k</i><code>]</code> through
 * <code>b[off+len-1]</code> unaffected.
 *
 * <p> In every case, elements <code>b[0]</code> through
 * <code>b[off]</code> and elements <code>b[off+len]</code> through
 * <code>b[b.length-1]</code> are unaffected.
 *
 * <p> The <code>read(b,</code> <code>off,</code> <code>len)</code> method
 * for class <code>InputStream</code> simply calls the method
 * <code>read()</code> repeatedly. If the first such call results in an
 * <code>IOException</code>, that exception is returned from the call to
 * the <code>read(b,</code> <code>off,</code> <code>len)</code> method.  If
 * any subsequent call to <code>read()</code> results in a
 * <code>IOException</code>, the exception is caught and treated as if it
 * were end of file; the bytes read up to that point are stored into
 * <code>b</code> and the number of bytes read before the exception
 * occurred is returned. The default implementation of this method blocks
 * until the requested amount of input data <code>len</code> has been read,
 * end of file is detected, or an exception is thrown. Subclasses are encouraged
 * to provide a more efficient implementation of this method.
 *
 * @param      b     the buffer into which the data is read.
 * @param      off   the start offset in array <code>b</code>
 *                   at which the data is written.
 * @param      len   the maximum number of bytes to read.    ---  reads at most len bytes
 * @return     the total number of bytes read into the buffer, or
 *             <code>-1</code> if there is no more data because the end of
 *             the stream has been reached.                  ---  the number of bytes actually read
 * @exception  IOException If the first byte cannot be read for any reason
 * other than end of file, or if the input stream has been closed, or if
 * some other I/O error occurs.
 * @exception  NullPointerException If <code>b</code> is <code>null</code>.
 * @exception  IndexOutOfBoundsException If <code>off</code> is negative,
 * <code>len</code> is negative, or <code>len</code> is greater than
 * <code>b.length - off</code>
 * @see        java.io.InputStream#read()
 */
public int read(byte b[], int off, int len) throws IOException {
...
}

References

https://blog.csdn.net/lilidejing/article/details/37913627
https://www.cnblogs.com/ddwarehouse/p/10127729.html
https://www.cnblogs.com/linuxprobe/p/5925397.html
