Compressing PDFs with iText 7 in Java, and the pitfalls I hit

Preface

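Some background on why I needed this utility. When exchanging files with a third party, each file has to be converted to Base64. Base64-encoding a file stream makes the payload noticeably larger, by about 33%. The other side's server limits file size because of network constraints, so oversized files have to be compressed before sending, hence this feature. [Why Base64 encoding makes files larger](https://blog.csdn.net/JackieDYH/article/details/122558936)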
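
The figure of about 33% follows from how Base64 works: every group of 3 input bytes becomes 4 output characters, so n bytes encode to 4 * ceil(n / 3) characters once padding is included. Below is a minimal sketch that checks this against the JDK encoder. The class and method names are made up for illustration, and the 3 MB array is only a stand-in for a real PDF.

```java
import java.util.Base64;

class Base64Growth {

    /** Length of the padded Base64 encoding of n bytes: 4 characters per 3-byte group. */
    static long encodedLength(long n) {
        return 4 * ((n + 2) / 3);
    }

    public static void main(String[] args) {
        // Stand-in for a PDF of about 3 MB (assumption: content does not matter for the length).
        byte[] payload = new byte[3_000_000];
        String encoded = Base64.getEncoder().encodeToString(payload);

        System.out.println("raw bytes    = " + payload.length);
        System.out.println("base64 chars = " + encoded.length());
        System.out.println("predicted    = " + encodedLength(payload.length));
        System.out.println("growth ratio = " + (double) encoded.length() / payload.length);
    }
}
```

Encoders that wrap their output into lines add a little more on top of this because of the line separators.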

1. Maven dependency configuration

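I use Maven here and pull in the whole iText 7 bundle (itext7-core). Referencing the modules one by one means declaring many separate dependencies, so the bundle is more convenient.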

<!-- PDF operations: the full iText 7 bundle -->
  <dependency>
      <groupId>com.itextpdf</groupId>
      <artifactId>itext7-core</artifactId>
      <version>7.1.15</version>
      <type>pom</type>
  </dependency>

2. The PdfUtil utility

The implementation:

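import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfName;
import com.itextpdf.kernel.pdf.PdfNumber;
import com.itextpdf.kernel.pdf.PdfObject;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.kernel.pdf.PdfStream;
import com.itextpdf.kernel.pdf.PdfWriter;
import com.itextpdf.kernel.pdf.xobject.PdfImageXObject;
import lombok.extern.slf4j.Slf4j;
import sun.misc.BASE64Encoder;

import javax.imageio.ImageIO;
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;

@Slf4j
public class PdfUtil {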

    /**
     * Scale factor applied to each image; adjust it to control the size of the compressed images
     */
    public static float FACTOR = 0.7f;


    /**
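     * Compresses a PDF and returns the result as Base64
     * @param is  input stream of the source file
     * @return the compressed PDF, Base64-encoded
     * @throws Exception if reading, re-encoding or writing fails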
     */
    public static String compress(InputStream is) throws Exception {
        PdfName key = new PdfName("ITXT_SpecialId");
        PdfName value = new PdfName("123456789");
        // Read the PDF file
        PdfReader reader = new PdfReader(is);
        ByteArrayOutputStream swapStream = new ByteArrayOutputStream();
        PdfDocument pdfDocument = new PdfDocument(reader,new PdfWriter(swapStream));

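        // Number of indirect objects in the document (getLastXref() is a byte offset, not a count)
        int n = pdfDocument.getNumberOfPdfObjects();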
        PdfObject object;
        PdfStream stream;
        // Find the image streams and rewrite them
        for (int i = 0; i < n; i++) {

            object = pdfDocument.getPdfObject(i);
            if (object == null || !object.isStream())
                continue;
            stream = (PdfStream) object;
            PdfObject pdfsubtype = stream.get(PdfName.Subtype);
            if (pdfsubtype != null && pdfsubtype.toString().equals(PdfName.Image.toString())) {
                PdfImageXObject image = new PdfImageXObject(stream);
                BufferedImage bi = image.getBufferedImage();
                if (bi == null) continue;
                int width = (int) (bi.getWidth() * FACTOR);
                int height = (int) (bi.getHeight() * FACTOR);
                BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
                AffineTransform at = AffineTransform.getScaleInstance(FACTOR, FACTOR);
                Graphics2D g = img.createGraphics();
                g.drawRenderedImage(bi, at);
                ByteArrayOutputStream imgBytes = new ByteArrayOutputStream();
                // Note this spot: it is revisited later
                ImageIO.write(img, "JPG", imgBytes);
                stream.clear();
                stream.setData(imgBytes.toByteArray(), false);
                stream.put(PdfName.Type, PdfName.XObject);
                stream.put(PdfName.Subtype, PdfName.Image);
                stream.put(key, value);
                stream.put(PdfName.Filter, PdfName.DCTDecode);
                stream.put(PdfName.Width, new PdfNumber(width));
                stream.put(PdfName.Height, new PdfNumber(height));
                stream.put(PdfName.BitsPerComponent, new PdfNumber(8));
                stream.put(PdfName.ColorSpace, PdfName.DeviceRGB);
            }
        }
        // Closing the document writes the data to the output stream
        pdfDocument.close();
        reader.close();
        log.info("PDF compression finished, file size={}", swapStream.size());
        return new BASE64Encoder().encode(swapStream.toByteArray());
    }

}
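
The image-shrinking step inside the loop can be tried on its own with nothing but the JDK. The sketch below mirrors the scale-then-encode-as-JPEG logic above; `ImageShrinker`, `scale` and `toJpeg` are hypothetical names, and the `Math.max(1, ...)` guard and the `g.dispose()` call are my additions rather than part of the original utility.

```java
import javax.imageio.ImageIO;
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class ImageShrinker {

    /** Returns an RGB copy of the source image scaled by the given factor. */
    static BufferedImage scale(BufferedImage source, float factor) {
        // Assumption: clamp to 1 so that very small images do not end up with a zero dimension.
        int width = Math.max(1, (int) (source.getWidth() * factor));
        int height = Math.max(1, (int) (source.getHeight() * factor));
        BufferedImage target = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = target.createGraphics();
        try {
            g.drawRenderedImage(source, AffineTransform.getScaleInstance(factor, factor));
        } finally {
            g.dispose(); // release the graphics context once drawing is done
        }
        return target;
    }

    /** Encodes the image as JPEG, the format that the DCTDecode filter expects. */
    static byte[] toJpeg(BufferedImage image) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(image, "jpg", out)) {
            throw new IOException("No JPEG writer available");
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        BufferedImage original = new BufferedImage(1000, 800, BufferedImage.TYPE_INT_RGB);
        BufferedImage smaller = scale(original, 0.5f);
        System.out.println(smaller.getWidth() + " x " + smaller.getHeight());
        System.out.println("jpeg bytes = " + toJpeg(smaller).length);
    }
}
```

Without the clamp, a tiny image multiplied by a factor below 1 can truncate to a width or height of 0, which the `BufferedImage` constructor rejects.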

I ran into a bug while using this utility. After reading through the source code I found the cause and worked around it.

The problem

Exception in thread "main" com.itextpdf.io.IOException: The color depth 4 is not supported.
	at com.itextpdf.kernel.pdf.xobject.ImagePdfBytesInfo.decodeTiffAndPngBytes(ImagePdfBytesInfo.java:91)
	at com.itextpdf.kernel.pdf.xobject.PdfImageXObject.getImageBytes(PdfImageXObject.java:220)
	at com.itextpdf.kernel.pdf.xobject.PdfImageXObject.getImageBytes(PdfImageXObject.java:198)
	at com.itextpdf.kernel.pdf.xobject.PdfImageXObject.getBufferedImage(PdfImageXObject.java:188)
	at com.chintanneng.finance.util.PdfUtil.compress(PdfUtil.java:76)
	at com.chintanneng.finance.util.PdfUtil.main(PdfUtil.java:107)

The stack trace points to the exact location: the decodeTiffAndPngBytes() method of the ImagePdfBytesInfo class.
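When pngColorType < 0 and bpc != 8, the method throws. bpc is assigned in the constructor shown next, from the image's BitsPerComponent entry. In other words, this decoding path does not support images with BitsPerComponent=4, that is, images with a bit depth of 4, and the PDF I was processing contains such images.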

public ImagePdfBytesInfo(PdfImageXObject imageXObject) {
        pngColorType = -1;
        bpc = imageXObject.getPdfObject().getAsNumber(PdfName.BitsPerComponent).intValue();
        pngBitDepth = bpc;

        palette = null;
        icc = null;
        stride = 0;
        width = (int) imageXObject.getWidth();
        height = (int) imageXObject.getHeight();
        colorspace = imageXObject.getPdfObject().get(PdfName.ColorSpace);
        decode = imageXObject.getPdfObject().getAsArray(PdfName.Decode);
        findColorspace(colorspace, true);
    }
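
To make the value concrete: BitsPerComponent=4 means that every color sample occupies 4 bits, so two samples share one byte, high-order bits first. The sketch below only illustrates that packing with plain Java; it is not what iText does internally, and `FourBitSamples` and `unpackRow` are hypothetical names. The scaling by 17 assumes gray samples; for an Indexed color space the 4-bit value would be a palette index and should not be scaled.

```java
import java.util.Arrays;

class FourBitSamples {

    /** Expands packed 4-bit samples (high nibble first) into one 0..255 value per sample. */
    static int[] unpackRow(byte[] packed, int sampleCount) {
        int[] samples = new int[sampleCount];
        for (int i = 0; i < sampleCount; i++) {
            int b = packed[i / 2] & 0xFF;                           // byte holding sample i
            int nibble = (i % 2 == 0) ? (b >> 4) : (b & 0x0F);      // high nibble, then low nibble
            samples[i] = nibble * 17;                               // map 0..15 onto 0..255
        }
        return samples;
    }

    public static void main(String[] args) {
        byte[] row = {(byte) 0xF0, (byte) 0x3A};
        System.out.println(Arrays.toString(unpackRow(row, 4))); // prints [255, 0, 51, 170]
    }
}
```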

3. The fix, implemented as follows

public static String compress(InputStream is) throws Exception {
        PdfName key = new PdfName("ITXT_SpecialId");
        PdfName value = new PdfName("123456789");
        // Read the PDF file
        PdfReader reader = new PdfReader(is);
        ByteArrayOutputStream swapStream = new ByteArrayOutputStream();
        PdfDocument pdfDocument = new PdfDocument(reader,new PdfWriter(swapStream));

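        // Number of indirect objects in the document (getLastXref() is a byte offset, not a count)
        int n = pdfDocument.getNumberOfPdfObjects();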
        PdfObject object;
        PdfStream stream;
        // Find the image streams and rewrite them
        for (int i = 0; i < n; i++) {

            object = pdfDocument.getPdfObject(i);
            if (object == null || !object.isStream())
                continue;
            stream = (PdfStream) object;
            PdfObject pdfsubtype = stream.get(PdfName.Subtype);
            if (pdfsubtype != null && pdfsubtype.toString().equals(PdfName.Image.toString())) {
                PdfImageXObject image = new PdfImageXObject(stream);
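                PdfNumber bitsPerComponent = image.getPdfObject().getAsNumber(PdfName.BitsPerComponent);
                // Skip every image whose BitsPerComponent is missing or not 8. With BitsPerComponent=4,
                // getBufferedImage() fails with "The color depth 4 is not supported", because
                // ImagePdfBytesInfo.decodeTiffAndPngBytes() rejects this.bpc != 8.
                if (bitsPerComponent == null || bitsPerComponent.intValue() != 8) {
                    continue;
                }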
                BufferedImage bi = image.getBufferedImage();
                if (bi == null) continue;
                int width = (int) (bi.getWidth() * FACTOR);
                int height = (int) (bi.getHeight() * FACTOR);
                BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
                AffineTransform at = AffineTransform.getScaleInstance(FACTOR, FACTOR);
                Graphics2D g = img.createGraphics();
                g.drawRenderedImage(bi, at);
                ByteArrayOutputStream imgBytes = new ByteArrayOutputStream();
                // Note this spot: it is revisited later
                ImageIO.write(img, "JPG", imgBytes);
                stream.clear();
                stream.setData(imgBytes.toByteArray(), false);
                stream.put(PdfName.Type, PdfName.XObject);
                stream.put(PdfName.Subtype, PdfName.Image);
                stream.put(key, value);
                stream.put(PdfName.Filter, PdfName.DCTDecode);
                stream.put(PdfName.Width, new PdfNumber(width));
                stream.put(PdfName.Height, new PdfNumber(height));
                stream.put(PdfName.BitsPerComponent, new PdfNumber(8));
                stream.put(PdfName.ColorSpace, PdfName.DeviceRGB);
            }
        }
        // Closing the document writes the data to the output stream
        pdfDocument.close();
        reader.close();
        log.info("PDF compression finished, file size={}", swapStream.size());
        return new BASE64Encoder().encode(swapStream.toByteArray());
    }
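
A side note on the last line of the method: `sun.misc.BASE64Encoder` is an internal JDK class, and as far as I know it is no longer shipped from JDK 9 onward. If that affects your build, `java.util.Base64` (available since Java 8) is one possible replacement. The sketch below shows the idea; `PdfBase64` is a hypothetical helper name. Be aware that the basic encoder emits a single line, whereas the old class wraps its output into lines, so check which form the receiving side expects.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

class PdfBase64 {

    /** Encodes the bytes as one unwrapped Base64 string. */
    static String encode(byte[] pdfBytes) {
        return Base64.getEncoder().encodeToString(pdfBytes);
    }

    /** Decodes a string produced by encode(). */
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Stand-in bytes; in the utility above this would be swapStream.toByteArray().
        byte[] pdfBytes = "%PDF-1.7 example".getBytes(StandardCharsets.US_ASCII);
        String encoded = encode(pdfBytes);
        System.out.println(encoded);
        System.out.println("round trip ok = " + Arrays.equals(pdfBytes, decode(encoded)));
    }
}
```

If line-wrapped output is required, `Base64.getMimeEncoder()` produces lines of at most 76 characters.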

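I have not looked into this any further; the change only solves the problem I ran into. Feedback and discussion from more experienced readers are welcome.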
