Java: exporting rich text (with images) to a docx file, and merging docx files

Latest recommended article published on 2024-07-08 00:01:13

我不是码神（dn）

Tags: java, programming languages, docx

Article link: https://blog.csdn.net/weixin_45729937/article/details/139750031


Use case:

I was originally exporting docx files with easypoi, but found that easypoi does not support exporting rich text. That gap is what this post addresses.

Approach:

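1. Convert the rich-text content (including images) to a docx file.
2. Merge the file exported by easypoi with the docx file produced from the rich text.
3. Take the merged result as the final file.

The code is thoroughly commented; if anything is unclear, leave a comment. I don't check often, so replies may be slow.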
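
The three steps above can be sketched as a single orchestration method. This is only an illustration of the flow, not the post's actual wiring: `ExportPipeline`, `export`, and the two function parameters are hypothetical names, and the converter and merger are passed in as functions so the sketch runs without any docx library. In the post, their roles are played by `htmlStringToWord` and `mergeDocx` / `mergeDocxFile`.

```java
import java.util.function.BinaryOperator;
import java.util.function.UnaryOperator;

/** Sketch of the three-step flow; the converter and merger are supplied by the caller. */
class ExportPipeline {

    /**
     * @param templateDocx path of the docx already exported by easypoi
     * @param html         rich-text content to append
     * @param htmlToDocx   hypothetical converter: HTML string -> path of a temporary docx
     * @param merge        hypothetical merger: (base path, appended path) -> path of the merged docx
     * @return path of the final document
     */
    static String export(String templateDocx, String html,
                         UnaryOperator<String> htmlToDocx,
                         BinaryOperator<String> merge) {
        // Step 1: rich text -> temporary docx
        String richTextDocx = htmlToDocx.apply(html);
        // Step 2: merge the easypoi export with the rich-text docx
        String finalDocx = merge.apply(templateDocx, richTextDocx);
        // Step 3: hand back the final file
        return finalDocx;
    }

    public static void main(String[] args) {
        // Stand-in lambdas so the sketch runs on its own.
        String result = export("report.docx", "<p>hello</p>",
                html -> "richtext.docx",
                (base, extra) -> base + "+" + extra);
        System.out.println(result);
    }
}
```

Passing the two operations in as functions keeps the flow easy to test and makes it simple to swap one merge strategy for the other.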

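Tools used:
I used the e-iceblue dependency (spire.doc) for the docx operations: converting rich text to docx and merging documents. The downside is that it adds a watermark, which has to be removed from the document.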

Dependencies used (some are probably unnecessary):

    <dependencies>
        <dependency>
            <groupId>cn.afterturn</groupId>
            <artifactId>easypoi-base</artifactId>
            <version>4.1.0</version>
        </dependency>
        <dependency>
            <groupId>cn.afterturn</groupId>
            <artifactId>easypoi-web</artifactId>
            <version>4.1.0</version>
        </dependency>
        <dependency>
            <groupId>cn.afterturn</groupId>
            <artifactId>easypoi-annotation</artifactId>
            <version>4.1.0</version>
        </dependency>
        <dependency>
            <groupId>org.jsoup</groupId>
            <artifactId>jsoup</artifactId>
            <version>1.13.1</version>
        </dependency>
        <dependency>
            <groupId>org.freemarker</groupId>
            <artifactId>freemarker</artifactId>
            <version>2.3.31</version>
        </dependency>

        <dependency>
            <groupId>e-iceblue</groupId>
            <artifactId>spire.doc</artifactId>
            <version>12.4.1</version>
        </dependency>
<!--poi-->
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-examples</artifactId>
            <version>3.14</version>
        </dependency>
    </dependencies>

    <repositories>
        <repository>
            <id>com.e-iceblue</id>
            <name>e-iceblue</name>
            <url>http://repo.e-iceblue.cn/repository/maven-public/</url>
        </repository>
    </repositories>

Converting rich text to a docx document

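// Imports needed by the snippets in this post (place them at the top of the class file)
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import com.spire.doc.*;
import com.spire.doc.documents.*;
import com.spire.doc.fields.DocPicture;

    /**
     * Exports rich text to a docx document.
     * @param htmlContext rich-text (HTML) content
     * @param temporaryFileName name of the temporary file to export (without extension)
     * @param titleName heading for this block of rich text; adjust to your needs
     * @return path of the generated docx file
     * @throws IOException
     */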

    private String htmlStringToWord(String htmlContext, String temporaryFileName, String titleName) throws IOException {
        // Replace the proxy prefix in each img src with a local path.
        // `profile` is the directory where my images are stored; substitute your own.
        htmlContext = htmlContext.replace("/dev-api/profile", profile)
                .replace("/prod-api/profile", profile)
                .replace("http://localhost", "");
        // Path of the temporary file
        String filePath = profile + temporaryFilePath + temporaryFileName + ".docx";
        // Create a new Document object
        com.spire.doc.Document document = new com.spire.doc.Document();
        // Create a heading style in this docx file (optional)
        ParagraphStyle style1 = new ParagraphStyle(document);
        style1.setName("titleStyle");
        style1.getCharacterFormat().setBold(true);
        style1.getCharacterFormat().setFontSize(16f);
        style1.getCharacterFormat().setFontName("Times New Roman");
        document.getStyles().add(style1);

        // Add a section
        Section sec = document.addSection();

        // Add a heading paragraph first (optional)
        Paragraph para1 = sec.addParagraph();
        // titleName is passed in as a method argument (optional)
        para1.appendText(titleName);
        para1.applyStyle("titleStyle");

        // Regular expression for img tags; group 1 captures the double-quoted src value
        String imgPattern = "<img\\s+[^>]*src=\"([^\"]*)\"[^>]*>";
        Pattern pattern = Pattern.compile(imgPattern);
        Matcher matcher = pattern.matcher(htmlContext);

        // Create a table
        Table table = sec.addTable(true);
        // with a single cell
        table.resetCells(1,1);

        // Left page margin
        float leftMargin = sec.getPageSetup().getMargins().getLeft();
        // Right page margin
        float rightMargin = sec.getPageSetup().getMargins().getRight();
        // Usable width: page width minus the left and right margins
        double pageWidth = sec.getPageSetup().getPageSize().getWidth() - leftMargin - rightMargin;

        // Get cell (1,1)
        TableCell cell = table.getRows().get(0).getCells().get(0);

        int lastIndex = 0;
        while (matcher.find()) {
            // HTML fragment that precedes this img tag
            String textBeforeImg = htmlContext.substring(lastIndex, matcher.start());

            // Add the HTML fragment as a paragraph
            Paragraph textParagraph = cell.addParagraph();
            textParagraph.appendHTML(textBeforeImg);

            // Value of the src attribute
            String imgSrc = matcher.group(1);

            // Add the image in a new paragraph
            Paragraph imgParagraph = cell.addParagraph();
            DocPicture picture = new DocPicture(document);
            picture.loadImage(imgSrc);
            imgParagraph.getChildObjects().insert(0, picture);
            // Advance lastIndex past this img tag
            lastIndex = matcher.end();
        }
        // Append the HTML that remains after the last image as a final paragraph in the cell
        Paragraph remainingTextParagraph = cell.addParagraph();
        remainingTextParagraph.appendHTML(htmlContext.substring(lastIndex));
        // Set the cell (table) width
        table.getRows().get(0).getCells().get(0).setCellWidth((float)(pageWidth * 1.3), CellWidthType.Point);
        // Save the document as docx
        document.saveToFile(filePath, FileFormat.Docx);

        return filePath;
    }
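
The core of the method above is the loop that cuts the HTML into alternating text fragments and image sources. As a minimal sketch of just that step, using only the JDK so it can be run and checked without Spire.Doc: `HtmlImageSplitter` and `Part` are hypothetical names introduced here, and unlike the original loop this version skips empty fragments. Like the original pattern, it only recognizes `src` values in double quotes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Splits HTML into text fragments and image sources, in document order. */
class HtmlImageSplitter {

    // Same pattern as in the post: group 1 is the double-quoted src of an <img> tag.
    private static final Pattern IMG =
            Pattern.compile("<img\\s+[^>]*src=\"([^\"]*)\"[^>]*>");

    /** One piece of the input: either an HTML fragment or an image source. */
    static final class Part {
        final boolean image;
        final String value;

        Part(boolean image, String value) {
            this.image = image;
            this.value = value;
        }
    }

    static List<Part> split(String html) {
        List<Part> parts = new ArrayList<>();
        Matcher matcher = IMG.matcher(html);
        int lastIndex = 0;
        while (matcher.find()) {
            // HTML between the previous img tag (or the start) and this one
            String before = html.substring(lastIndex, matcher.start());
            if (!before.isEmpty()) {
                parts.add(new Part(false, before));
            }
            // The captured src value of this img tag
            parts.add(new Part(true, matcher.group(1)));
            lastIndex = matcher.end();
        }
        // Whatever follows the last img tag
        String rest = html.substring(lastIndex);
        if (!rest.isEmpty()) {
            parts.add(new Part(false, rest));
        }
        return parts;
    }

    public static void main(String[] args) {
        String html = "<p>before</p><img src=\"/profile/a.png\" alt=\"\"><p>after</p>";
        for (Part part : split(html)) {
            System.out.println((part.image ? "IMAGE " : "HTML  ") + part.value);
        }
    }
}
```

In the post's method, each HTML part would go to `appendHTML` and each image part to `loadImage`.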

Merging documents

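/**
     * Merges two documents; the inserted content starts on a new page.
     * @param filePath1  path of the base document (its content comes first)
     * @param filePath2 path of the document to merge in
     * @param finalFilePath path where the merged document is saved
     * @return path of the merged document
     */
    private String mergeDocx(String filePath1, String filePath2, String finalFilePath){

        // Create a Document object and load the Word document from disk
        com.spire.doc.Document document = new com.spire.doc.Document(filePath1);

        // Insert the other document into the current one
        document.insertTextFromFile(filePath2, FileFormat.Docx);

        // Save the resulting document
        document.saveToFile(finalFilePath, FileFormat.Docx);

        return finalFilePath;
    }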
    /**
     * Merges two documents by appending content, without starting a new page.
     * @param filePath1  path of the base document (its content comes first)
     * @param filePath2 path of the document to merge in
     * @param finalFilePath path where the merged document is saved
     * @return path of the merged document
     */
    private String mergeDocxFile(String filePath1, String filePath2, String finalFilePath){

        com.spire.doc.Document document1  = new com.spire.doc.Document(filePath1);
        com.spire.doc.Document document2  = new com.spire.doc.Document(filePath2);

        // Loop over all sections of the second document
        for (Object sectionObj : (Iterable) document2.getSections()) {
            Section sec=(Section)sectionObj;
            // Loop over all child objects of each section
            for (Object docObj :(Iterable ) sec.getBody().getChildObjects()) {
                DocumentObject obj=(DocumentObject)docObj;

                // Get the last section of the first document
                Section lastSection = document1.getLastSection();

                // Append a copy of the child object to the last section of the first document
                Body body = lastSection.getBody();
                body.getChildObjects().add(obj.deepClone());
            }
        }

        // Save the resulting document
        document1.saveToFile(finalFilePath, FileFormat.Docx);
        return finalFilePath;
    }
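
Once the merged file has been saved, the temporary docx produced from the rich text is no longer needed. The post does not show a cleanup step, so the following is only a sketch of one possible way to do it with the JDK's `java.nio.file` API; `TempDocxCleanup` and `deleteIntermediate` are hypothetical names. The guard against deleting the final file is an assumption added for safety, in case both paths point to the same location.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Removes an intermediate docx after the merged document has been written. */
class TempDocxCleanup {

    /**
     * Deletes the intermediate file unless it is the same path as the final document.
     * @return true if a file was actually removed
     */
    static boolean deleteIntermediate(String intermediatePath, String finalPath) throws IOException {
        Path intermediate = Paths.get(intermediatePath).toAbsolutePath().normalize();
        Path result = Paths.get(finalPath).toAbsolutePath().normalize();
        if (intermediate.equals(result)) {
            // Never delete the document we are about to return to the caller.
            return false;
        }
        // Returns false (instead of throwing) when the file is already gone.
        return Files.deleteIfExists(intermediate);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the temporary file written by the conversion step.
        Path temp = Files.createTempFile("richtext", ".docx");
        Path merged = temp.resolveSibling("merged.docx");
        boolean removed = deleteIntermediate(temp.toString(), merged.toString());
        System.out.println("removed=" + removed + ", stillExists=" + Files.exists(temp));
    }
}
```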
