Extracting images from an existing PDF and generating a new PDF file

luyao96

Category: java

Article link: https://blog.csdn.net/weixin_44201769/article/details/109115228


Reference tutorial I learned from:
https://blog.csdn.net/loongshawn/article/details/51542309

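Dependencies
The first dependency (itextpdf) reads and writes PDF files, the second (itext-asian) lets iText write Chinese text into a PDF, and the third (pdfbox) is used to extract images from a PDF. (This is my own understanding and may contain mistakes.)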

<!-- iText PDF integration -->
<dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>itextpdf</artifactId>
    <version>5.4.3</version>
</dependency>
<dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>itext-asian</artifactId>
    <version>5.2.0</version>
</dependency>
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.1</version>
</dependency>

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Image;
import com.itextpdf.text.PageSize;
import com.itextpdf.text.Paragraph;
import com.itextpdf.text.pdf.PdfWriter;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDResources;
import org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject;

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;

public static void test() {
    String path = "D:\\mms\\16008229200141630.pdf"; // source PDF
    String path2 = "image\\image";                  // file-name prefix, only needed when also saving images to disk
    String pdfPath = "D:\\mms\\测试.pdf";            // path of the output PDF
    Document document2 = new Document(PageSize.A4);
    // try-with-resources closes the input stream and the source document,
    // even when loading fails and one of them was never created
    try (InputStream fis = new FileInputStream(path);
         PDDocument document = PDDocument.load(fis)) {
        PdfWriter.getInstance(document2, new FileOutputStream(pdfPath));
        document2.open();
        int count = 0;
        for (PDPage page : document.getPages()) {
            PDResources resources = page.getResources();
            if (resources == null) {
                continue; // page has no resources, so no images
            }
            for (COSName key : resources.getXObjectNames()) {
                if (!resources.isImageXObject(key)) {
                    continue; // skip form XObjects and other non-image entries
                }
                PDImageXObject image = (PDImageXObject) resources.getXObject(key);
                BufferedImage bimage = image.getImage();
                // Optional: also save the image to a local folder
                // ImageIO.write(bimage, "png", new java.io.File(path2 + count + ".png"));
                // Encode the image into an in-memory stream. PNG is used instead of
                // JPEG because ImageIO's JPEG writer can reject images with an alpha channel.
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                ImageIO.write(bimage, "png", out);
                // Build an iText image from the encoded bytes
                Image image2 = Image.getInstance(out.toByteArray());
                float documentWidth = document2.getPageSize().getWidth()
                        - document2.leftMargin() - document2.rightMargin();
                // Fixed 600:320 aspect ratio; this assumes every source image has that shape
                float documentHeight = documentWidth / 600 * 320;
                image2.scaleAbsolute(documentWidth, documentHeight);
                // Add a line break to separate the images
                document2.add(new Paragraph("\n"));
                document2.add(image2);
                count++;
            }
        }
    } catch (DocumentException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
        System.out.println("An image could not be processed");
    } finally {
        // Note: iText 5 raises an error on close() if no content was added (no images found)
        if (document2.isOpen()) {
            document2.close();
        }
    }
}
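
The listing scales every image to a fixed 600:320 ratio, which distorts images with other proportions. Below is a minimal sketch of computing a size that fits the page while keeping each image's own aspect ratio. The class name `FitToBox` and the method `fit` are made up for illustration, and the 523 x 770 point box assumes an A4 page (595 x 842 points) with 36-point margins on every side. iText 5 also provides `Image.scaleToFit(float, float)`, which may do the job without a helper.

```java
class FitToBox {
    /**
     * Returns {width, height} of the image scaled to fit inside the box
     * maxW x maxH while preserving the image's aspect ratio.
     */
    static float[] fit(float imgW, float imgH, float maxW, float maxH) {
        if (imgW <= 0 || imgH <= 0) {
            throw new IllegalArgumentException("image size must be positive");
        }
        // The smaller of the two ratios is the largest scale that fits both dimensions
        float scale = Math.min(maxW / imgW, maxH / imgH);
        return new float[] { imgW * scale, imgH * scale };
    }

    public static void main(String[] args) {
        // Assumed usable area: A4 (595 x 842 pt) minus 36 pt margins on each side
        float[] wide = fit(600f, 320f, 523f, 770f);
        float[] tall = fit(100f, 1000f, 523f, 770f);
        System.out.println("wide image: " + wide[0] + " x " + wide[1]);
        System.out.println("tall image: " + tall[0] + " x " + tall[1]);
    }
}
```

The resulting width and height could then be passed to `image2.scaleAbsolute(...)` in place of the hard-coded ratio.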

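At first I did not know how to put the extracted images into the new PDF, so I saved each one to disk as an image file and then read it back from the local path into the new PDF. That turned out to be too cumbersome. I then found that iText's Image can be created from a byte array, so I now write the extracted BufferedImage into a ByteArrayOutputStream and build the Image from that stream's bytes. This way nothing is written to disk, and the images go straight from the in-memory stream into the new PDF.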
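
The in-memory step can be tried on its own with only the JDK, without iText or PDFBox. The sketch below encodes a `BufferedImage` to PNG bytes through a `ByteArrayOutputStream` and decodes it again to check that nothing was lost. The class name `InMemoryImageBytes` and its methods are hypothetical helpers, not part of either library. One assumption worth noting: by default ImageIO may buffer through a temporary cache file, so the sketch calls `ImageIO.setUseCache(false)` to keep the encoding in memory.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class InMemoryImageBytes {
    /** Encodes the image as PNG into a byte array instead of a file. */
    static byte[] toPngBytes(BufferedImage image) {
        // Ask ImageIO to use a memory cache rather than a temporary file
        ImageIO.setUseCache(false);
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            // write() returns false when no suitable writer accepts the image
            if (!ImageIO.write(image, "png", out)) {
                throw new IllegalStateException("no PNG writer accepted the image");
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Decodes image bytes back into a BufferedImage, again without a file. */
    static BufferedImage fromBytes(byte[] bytes) {
        try {
            return ImageIO.read(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        BufferedImage sample = new BufferedImage(600, 320, BufferedImage.TYPE_INT_RGB);
        byte[] bytes = toPngBytes(sample);
        BufferedImage decoded = fromBytes(bytes);
        System.out.println(bytes.length + " bytes, decoded size "
                + decoded.getWidth() + " x " + decoded.getHeight());
    }
}
```

In the PDF case, the byte array returned by a helper like `toPngBytes` is what gets handed to iText's `Image.getInstance(byte[])`.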
