Java: Converting Word (.docx) to HTML While Preserving Formatting and Images

Maven dependencies required for this feature

        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi</artifactId>
            <version>3.17</version>
        </dependency>

        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-scratchpad</artifactId>
            <version>3.17</version>
        </dependency>

        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml</artifactId>
            <version>3.17</version>
        </dependency>

        <dependency>
            <groupId>fr.opensagres.xdocreport</groupId>
            <artifactId>xdocreport</artifactId>
            <version>2.0.1</version>
        </dependency>

        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml-schemas</artifactId>
            <version>3.17</version>
        </dependency>

        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>ooxml-schemas</artifactId>
            <version>1.4</version>
        </dependency>
        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-lang3</artifactId>
            <version>3.7</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
            <version>1.8.0-beta2</version>
        </dependency>

Method for converting Word (.docx) to HTML

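    // Imports needed by the method below.
    // The converter package names are those used by xdocreport 2.x.
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.InputStream;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.nio.charset.StandardCharsets;
    import org.apache.poi.xwpf.usermodel.XWPFDocument;
    import fr.opensagres.poi.xwpf.converter.core.BasicURIResolver;
    import fr.opensagres.poi.xwpf.converter.core.FileImageExtractor;
    import fr.opensagres.poi.xwpf.converter.xhtml.XHTMLConverter;
    import fr.opensagres.poi.xwpf.converter.xhtml.XHTMLOptions;

    public static String docxToHtml4() throws Exception {
        // Destination directory for the extracted images
        String imagePath = "D:\\wordTohtml\\image";
        // The .docx file to convert
        String sourceFileName = "C:/Users/ADMIN/Desktop/Sentinel.docx";
        // The resulting HTML file (created if missing, overwritten if present)
        String targetFileName = "D:\\wordTohtml\\123.html";

        // try-with-resources closes the input stream, the document and the writer,
        // even if the conversion throws
        try (InputStream in = new FileInputStream(sourceFileName);
             XWPFDocument document = new XWPFDocument(in);
             Writer writer = new OutputStreamWriter(
                     new FileOutputStream(targetFileName), StandardCharsets.UTF_8)) {
            XHTMLOptions options = XHTMLOptions.create();
            // Keep style definitions even when nothing in the document uses them
            options.setIgnoreStylesIfUnused(false);
            // Emit an HTML fragment rather than a complete page
            options.setFragment(true);

            // Directory the images are extracted to
            options.setExtractor(new FileImageExtractor(new File(imagePath)));
            // Path prefix used for the images inside the generated HTML
            options.URIResolver(new BasicURIResolver("image"));

            // The cast exposes the convert(...) overload that accepts a Writer
            XHTMLConverter xhtmlConverter = (XHTMLConverter) XHTMLConverter.getInstance();
            xhtmlConverter.convert(document, writer, options);
        }
        return targetFileName;
    }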
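
Two practical details around the method above can be sketched with the standard library alone. `FileOutputStream` creates the target file but not any missing parent directories, so the `D:\wordTohtml` folder has to exist before the conversion runs. Also, since `setFragment(true)` requests a fragment, the output is assumed here to carry no `<head>` with a charset declaration, which some browsers need in order to render non-ASCII text correctly. The class and method names below (`HtmlOutputPrep`, `ensureParentDir`, `wrapFragment`) are hypothetical helpers for illustration and are not part of POI or xdocreport.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class HtmlOutputPrep {

    /** Creates the parent directory of the target file if it does not exist yet. */
    public static void ensureParentDir(Path targetFile) throws IOException {
        Path parent = targetFile.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent); // no-op when the directory already exists
        }
    }

    /**
     * Wraps an HTML fragment in a minimal page that declares UTF-8.
     * The title is inserted as-is; escape it first if it may contain markup.
     */
    public static String wrapFragment(String fragment, String title) {
        return "<!DOCTYPE html>\n"
                + "<html>\n<head>\n<meta charset=\"UTF-8\">\n"
                + "<title>" + title + "</title>\n</head>\n"
                + "<body>\n" + fragment + "\n</body>\n</html>\n";
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical temp location, used instead of the article's D:\ paths
        Path base = Files.createTempDirectory("wordTohtml");
        Path target = base.resolve("out").resolve("123.html");
        ensureParentDir(target);

        // Stand-in for the fragment that the converter would produce
        String page = wrapFragment("<p>hello</p>", "Sentinel");
        Files.write(target, page.getBytes(StandardCharsets.UTF_8));
        System.out.println("Wrote " + target);
    }
}
```

In the conversion method, `ensureParentDir` would be called on the target path before the `FileOutputStream` is opened. Wrapping is only needed when the HTML file is opened directly in a browser; a fragment that gets embedded in an existing page can be used as it is.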