Extracting RAR/ZIP Files (RAR5-Compatible)

Zephyr丶Syn

Last modified: 2023-06-29 09:49:28

Category: RAR5 Extraction | Tags: java

First published: 2021-10-28 10:54:16

Article link: https://blog.csdn.net/weixin_44624841/article/details/121009682

Preface:

 A record of the problems encountered while developing an archive-extraction feature, and the solution we finally settled on.

Original requirement:

 The customer needs to upload documents in bulk. Each upload is an archive containing a set of files, in zip or rar format.

Previous implementation:

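zip: extract with the zip4j API (net.lingala.zip4j.core.ZipFile) via ZipFile.extractAll. The file-name charset defaults to "GBK"; if the names returned by FileHeader.getFileName look garbled, the charset is switched to "UTF-8" before extracting.
Problem: a single charset is applied bluntly to the whole archive, so if the archive contains file names in different encodings, some extracted files end up with garbled names.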
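
The per-name decision described above can be sketched with the JDK alone. This is a minimal illustration, not zip4j's own logic: it assumes the raw name bytes of an entry are available, and the class and method names (EntryNameDecoder, decode) are hypothetical. It decodes strictly as UTF-8 and falls back to GBK only when the bytes are not valid UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: picks a charset per entry name instead of once per archive.
final class EntryNameDecoder {

    // GBK ships with standard JDK builds (jdk.charsets module).
    private static final Charset GBK = Charset.forName("GBK");

    /** Decodes strictly as UTF-8; falls back to GBK when the bytes are not valid UTF-8. */
    static String decode(byte[] rawName) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)      // fail instead of inserting U+FFFD
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(rawName))
                    .toString();
        } catch (CharacterCodingException e) {
            // Not valid UTF-8: assume the legacy GBK encoding.
            return new String(rawName, GBK);
        }
    }
}
```

This is a heuristic: a few GBK byte sequences also happen to be valid UTF-8, so a name can still be misclassified occasionally.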

rar: extract with the de.innosystec.unrar.Archive API via Archive.extractFile.
Problem: Archive cannot parse archives in the RAR5 format.
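
To tell in advance whether an upload is a RAR5 archive, the leading signature bytes can be inspected. The sketch below relies only on the published signatures (ZIP local file header `50 4B 03 04`, RAR 1.5 to 4.x `52 61 72 21 1A 07 00`, RAR 5.0 `52 61 72 21 1A 07 01 00`); the class ArchiveSniffer and its Kind enum are hypothetical names, and `readNBytes(int)` assumes Java 11 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: classifies an archive by its leading magic bytes.
final class ArchiveSniffer {

    enum Kind { ZIP, RAR4, RAR5, UNKNOWN }

    /** Classifies the first bytes of a file. */
    static Kind detect(byte[] head) {
        if (startsWith(head, 0x50, 0x4B, 0x03, 0x04)) {
            return Kind.ZIP;
        }
        // Check the longer RAR5 signature before the RAR4 one.
        if (startsWith(head, 0x52, 0x61, 0x72, 0x21, 0x1A, 0x07, 0x01, 0x00)) {
            return Kind.RAR5;
        }
        if (startsWith(head, 0x52, 0x61, 0x72, 0x21, 0x1A, 0x07, 0x00)) {
            return Kind.RAR4;
        }
        return Kind.UNKNOWN;
    }

    /** Reads the first 8 bytes of the file and classifies them. */
    static Kind detect(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return detect(in.readNBytes(8));
        }
    }

    private static boolean startsWith(byte[] data, int... magic) {
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Empty or spanned zip files and self-extracting archives start with other bytes, so UNKNOWN here means "not recognized by this check", not "not an archive".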

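First improvement:

On top of the above, add extraction by invoking a command-line tool (unrar/unzip).
Problem: when a zip archive exceeds 80 MB or a rar archive exceeds 30 MB, only the files within the limit are extracted and the rest are skipped. (The cause is unclear: running the same command directly on the server extracts everything, but the limit appears when the program invokes it. We suspect it is related to the JVM.)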
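
The cause of this limit was never confirmed. One well-known pitfall when launching external tools from Java is that the child process blocks once its stdout/stderr pipe buffer fills up and nothing reads it. Whether that explains the behavior seen here is an assumption. The sketch below shows one way to start a tool while draining its output; the class name ExternalExtractor is hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper: runs an external command and consumes all of its output.
final class ExternalExtractor {

    /**
     * Runs the command, appends everything it prints to {@code log},
     * and returns the process exit code.
     */
    static int run(List<String> command, StringBuilder log)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout: one stream to drain
        Process process = builder.start();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            // Reading until end of stream keeps the child from blocking on a full pipe.
            while ((line = reader.readLine()) != null) {
                log.append(line).append('\n');
            }
        }
        return process.waitFor();
    }
}
```

A call might look like `ExternalExtractor.run(List.of("unzip", "-o", "docs.zip", "-d", "/tmp/out"), log)`, where the archive name and target directory are placeholders.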

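Second round of optimization:

zip:
	After some research, we found that each entry's FileHeader can be taken out of the archive and extracted individually. At that point the file names could still come out garbled. Reading the ZipFile source shows that the API itself determines whether a file name is UTF-8 encoded and writes the result back to fileHeader.isFileNameUTF8Encoded. So we default to GBK and, while iterating over the entries, check fileHeader.isFileNameUTF8Encoded. If the name is UTF-8, we change the ZipFile charset and also set FileHeader.FileName to the correct name. This prevents garbled names when an archive mixes file-name encodings.
	Problem: in production, archives sent back by customers in Taiwan contained names that were already garbled and could not be converted back to the original names, so extraction failed with: File header and local file header mismatch.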
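
Why an already garbled name cannot always be converted back can be illustrated with the JDK alone. Decoding bytes with the wrong charset is only reversible if every byte sequence maps to a character; anything replaced by U+FFFD is lost for good. ISO-8859-1 maps all 256 byte values, which is why a name read as ISO-8859-1 can be re-decoded later. The class NameRecovery and the choice of Big5 for the Taiwanese file names are assumptions made for this sketch.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical helper: checks whether a (possibly wrong) decoding can be undone.
final class NameRecovery {

    /** True if decoding with {@code assumed} and encoding again yields the same bytes. */
    static boolean isLossless(byte[] rawName, Charset assumed) {
        String decoded = new String(rawName, assumed);
        return Arrays.equals(rawName, decoded.getBytes(assumed));
    }

    /** Re-decodes text that was read with {@code assumed} but really was {@code actual}. */
    static String redecode(String garbled, Charset assumed, Charset actual) {
        return new String(garbled.getBytes(assumed), actual);
    }
}
```

For example, `redecode(name, StandardCharsets.ISO_8859_1, Charset.forName("Big5"))` restores a Big5 name that was read as ISO-8859-1, whereas a name for which `isLossless` returns false has lost bytes and cannot be restored by any re-decoding.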

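Final implementation:

rar/zip: extract with the net.sf.sevenzipjbinding API. This requires a custom MyExtractCallback class that implements IArchiveExtractCallback and overrides getStream. This approach handles RAR5 as well as earlier RAR versions.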

Implementation code:

Maven dependencies:

<dependency>
	<groupId>net.sf.sevenzipjbinding</groupId>
	<artifactId>sevenzipjbinding</artifactId>
	<version>16.02-2.01</version>
</dependency>
<dependency>
	<groupId>net.sf.sevenzipjbinding</groupId>
	<artifactId>sevenzipjbinding-all-platforms</artifactId>
	<version>16.02-2.01</version>
</dependency>

1. Define a custom ExtractCallback class that implements IArchiveExtractCallback and overrides getStream:

import net.sf.sevenzipjbinding.*;
import org.apache.commons.io.FileUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.*;
import java.nio.charset.StandardCharsets;

public class MyExtractCallback implements IArchiveExtractCallback {

    private static final Logger LOG = LoggerFactory.getLogger(MyExtractCallback.class);

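    // Charset name used to re-decode entry names that arrive as ISO-8859-1.
    // Assumed to be "GBK"; the original constant's definition is not shown in the article.
    private static final String CHARSET_GBK = "GBK";

    private int index;
    private IInArchive inArchive;
    private String outDir;
    private String errorInfo; // error message exposed to the caller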

    public MyExtractCallback(IInArchive inArchive, String outDir) {
        this.inArchive = inArchive;
        this.outDir = outDir;
    }

    @Override
    public void setCompleted(long arg0) throws SevenZipException {
    }

    @Override
    public void setTotal(long arg0) throws SevenZipException {
    }

    @Override
    public ISequentialOutStream getStream(int index, ExtractAskMode extractAskMode) throws SevenZipException {
        this.index = index;
        // Nothing to write unless 7-Zip actually asks for extraction
        if (extractAskMode != ExtractAskMode.EXTRACT) {
            return null;
        }
        String archivePath = (String) inArchive.getProperty(index, PropID.PATH);
        // The property may be null; treat null as "not a folder"
        final boolean isFolder = Boolean.TRUE.equals(inArchive.getProperty(index, PropID.IS_FOLDER));
        final File outFile;
        try {
            // If the name survives a round trip through ISO-8859-1 (every char fits in one byte),
            // treat those bytes as GBK and decode them again
            if (archivePath.equals(new String(archivePath.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.ISO_8859_1))) {
                archivePath = new String(archivePath.getBytes(StandardCharsets.ISO_8859_1), CHARSET_GBK);
            }
            outFile = new File(outDir + File.separator + archivePath);
            if (isFolder) {
                FileUtils.forceMkdir(outFile);
            } else {
                FileUtils.forceMkdir(outFile.getParentFile());
                // saveToFile appends chunk by chunk, so always start from an empty file
                if (outFile.isFile() && !outFile.delete()) {
                    throw new IOException("Cannot overwrite " + outFile);
                }
                outFile.createNewFile();
            }
        } catch (UnsupportedEncodingException e) {
            this.errorInfo = "The name of file [" + archivePath + "] is not encoded correctly; please use GBK or UTF-8.";
            throw new SevenZipException(e.getMessage(), e);
        } catch (IOException e) {
            this.errorInfo = "Failed to extract file [" + archivePath + "], please check: " + e.getMessage();
            throw new SevenZipException(e.getMessage(), e);
        }
        return data -> {
            // Surface write failures instead of silently reporting success
            if (!isFolder && !this.saveToFile(outFile, data)) {
                throw new SevenZipException("Failed to write " + outFile);
            }
            return data.length;
        };
    }

    @Override
    public void prepareOperation(ExtractAskMode arg0) throws SevenZipException {
    }

    @Override
    public void setOperationResult(ExtractOperationResult extractOperationResult) throws SevenZipException {

    }

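    private boolean saveToFile(File file, byte[] msg) {
        File parent = file.getParentFile();
        if (parent != null && !parent.exists() && !parent.mkdirs()) {
            return false;
        }
        // Append mode: 7-Zip may deliver one entry in several chunks
        try (OutputStream fos = new FileOutputStream(file, true)) {
            fos.write(msg);
            fos.flush();
            return true;
        } catch (IOException e) {
            // Pass the exception as the last argument so SLF4J logs the stack trace
            LOG.error("Failed to save file: {}", file, e);
            return false;
        }
    }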

    public String getErrorInfo() {
        return errorInfo;
    }
}
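
Because the output path above is built by joining outDir with a name taken from the archive, an entry such as `../x` could land outside the target directory. A possible guard, sketched here with hypothetical names (SafePaths, resolveInside) and not part of the original implementation, normalizes the joined path and rejects anything that leaves the output directory.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: resolves an archive entry path and keeps it inside outDir.
final class SafePaths {

    /** Returns the normalized target path, or throws if it escapes {@code outDir}. */
    static Path resolveInside(String outDir, String entryPath) throws IOException {
        Path base = Paths.get(outDir).toAbsolutePath().normalize();
        // resolve() returns entryPath itself when it is absolute, which the check below rejects
        Path target = base.resolve(entryPath).normalize();
        if (!target.startsWith(base)) {
            throw new IOException("Entry escapes the output directory: " + entryPath);
        }
        return target;
    }
}
```

In a callback like the one above, `SafePaths.resolveInside(outDir, archivePath).toFile()` could stand in for the plain string concatenation.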

2. Extract the archive:

    /**
     * Extracts an archive.
     *
     * @param targetPath    directory to extract into
     * @param sourceFile    path of the source archive
     * @param archiveFormat archive format: pass ArchiveFormat.ZIP for zip files; may be null
     *                      for rar and rar5, in which case the format is auto-detected
     * @throws Exception if the archive cannot be opened or extracted
     */
    public static void unRarOrZipFiles(String targetPath, String sourceFile, ArchiveFormat archiveFormat) throws Exception {
        MyExtractCallback extractCallback = null;
        try (
                // Read-only access is enough for extraction
                RandomAccessFile randomAccessFile = new RandomAccessFile(sourceFile, "r");
                IInArchive archive = SevenZip.openInArchive(archiveFormat,
                        new RandomAccessFileInStream(randomAccessFile))
        ) {
            // Select every item in the archive
            int[] in = new int[archive.getNumberOfItems()];
            for (int i = 0; i < in.length; i++) {
                in[i] = i;
            }
            extractCallback = new MyExtractCallback(archive, targetPath);
            archive.extract(in, false, extractCallback);
        } catch (IOException e) {
            // RAR_FILE_NAME_TOO_LONG_ERROR, BaseTextException and TrustBaseException are
            // project-specific and are not shown in this article
            String message = e.getMessage() == null ? "" : e.getMessage();
            if (message.contains(RAR_FILE_NAME_TOO_LONG_ERROR)) {
                throw new BaseTextException(TrustBaseException.ERROR_IO_EXCEPTION,
                        "The file name exceeds the operating system's length limit. Please check.");
            } else if (extractCallback != null && extractCallback.getErrorInfo() != null) {
                throw new BaseTextException(TrustBaseException.ERROR_IO_EXCEPTION, extractCallback.getErrorInfo());
            } else {
                throw e;
            }
        }
    }