[Java] Converting Word Documents to PDF with OpenOffice

安达鲁狗

Article link: https://blog.csdn.net/ljh981124/article/details/121334470

Class is in session:
First, pull in the required jars:

<dependency>
			<groupId>com.artofsolving</groupId>
			<artifactId>jodconverter</artifactId>
			<version>2.2.1</version>
</dependency>
<dependency>
			<groupId>org.openoffice</groupId>
			<artifactId>jurt</artifactId>
			<version>3.0.1</version>
</dependency>
<dependency>
			<groupId>org.openoffice</groupId>
			<artifactId>ridl</artifactId>
			<version>3.0.1</version>
</dependency>
<dependency>
			<groupId>org.openoffice</groupId>
			<artifactId>juh</artifactId>
			<version>3.0.1</version>
</dependency>
<dependency>
			<groupId>org.openoffice</groupId>
			<artifactId>unoil</artifactId>
			<version>3.0.1</version>
</dependency>

<!-- jodconverter 2.2.1 depends on slf4j-jdk14, and it has to be this exact version; otherwise the logging calls in its source code throw errors -->
<dependency>
		<groupId>org.slf4j</groupId>
		<artifactId>slf4j-jdk14</artifactId>
		<version>1.4.3</version>
</dependency>

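Those are all the jars you need.
Version 2.2.1 can convert docx to PDF too! I said it, and I'm sticking to it no matter who shows up! I spent ages on Baidu reading that 2.2.2 supports docx-to-PDF, but when I searched the Maven repository there was no 2.2.2 at all, and none of the copies I found through Baidu worked either. Infuriating!
Before using it, you must open cmd in the OpenOffice installation directory (mine is C:\Program Files (x86)\OpenOffice 4\program) and start the listening service: soffice -headless -accept="socket,host=127.0.0.1,port=8100;urp;" -nofirststartwizard. Just copy that command, making sure the quotes are plain straight double quotes.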
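If you would rather start the service from Java than from cmd, here is a minimal sketch under a few assumptions. OfficeService and its method names are made up for illustration and are not part of jodconverter; the flags are the same ones as in the command above. ProcessBuilder passes each argument separately, so the quotes around the -accept value are not needed here.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of jodconverter) for starting and probing the OpenOffice listener. */
final class OfficeService {

    /** Builds the same command line as the manual step; the executable path is supplied by the caller. */
    static List<String> buildCommand(String sofficePath, String host, int port) {
        return Arrays.asList(
                sofficePath,
                "-headless",
                "-accept=socket,host=" + host + ",port=" + port + ";urp;",
                "-nofirststartwizard");
    }

    /** Returns true if something accepts TCP connections on host:port within timeoutMillis. */
    static boolean isPortOpen(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Starts soffice and polls the port until it answers or the attempts run out. */
    static Process startAndWait(String sofficePath, String host, int port, int attempts)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(sofficePath, host, port)).start();
        for (int i = 0; i < attempts; i++) {
            if (isPortOpen(host, port, 500)) {
                return process;
            }
            Thread.sleep(500);
        }
        // Give up: stop the process we started and report the failure.
        process.destroy();
        throw new IOException("OpenOffice did not open port " + port);
    }
}
```

Polling the port before connecting avoids racing the office process while it is still starting up; the attempt count and the 500 ms interval are arbitrary choices.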
Now for the code:

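import java.io.File;

import com.artofsolving.jodconverter.openoffice.connection.OpenOfficeConnection;
import com.artofsolving.jodconverter.openoffice.connection.SocketOpenOfficeConnection;
import com.artofsolving.jodconverter.openoffice.converter.OpenOfficeDocumentConverter;

    public void convert(File file) {
        String filePath = file.getAbsolutePath();

        // Swap the extension of the input Word path for .pdf
        // (cut before the dot, otherwise the result ends in "..pdf").
        String pdfPath = filePath.substring(0, filePath.lastIndexOf(".")) + ".pdf";

        // Output file
        File outFile = new File(pdfPath);

        OpenOfficeConnection openOfficeConnection = null;

        Process p = null;

        try {
            if (!file.exists()) {
                logger.error("File does not exist");
                return;
            }
            // Put the path to your own OpenOffice executable here.
            // ProcessBuilder takes the path as one argument, so the spaces in it are safe.
            p = new ProcessBuilder("C:\\Program Files (x86)\\OpenOffice 4\\program\\soffice.exe").start();
            // Connect to the OpenOffice service
            openOfficeConnection = new SocketOpenOfficeConnection("127.0.0.1", 8100);
            openOfficeConnection.connect();

            // Convert Word to PDF
            OpenOfficeDocumentConverter openOfficeDocumentConverter = new OpenOfficeDocumentConverter(openOfficeConnection);
            // Works around garbled text in the generated PDF
            ReadUtils.saveAsUTF8(file.getAbsolutePath(), outFile.getAbsolutePath());

            openOfficeDocumentConverter.convert(file.getAbsoluteFile(), outFile);

            logger.info("Conversion finished");

        } catch (Exception e) {
            logger.error("Conversion failed", e);
        } finally {
            if (openOfficeConnection != null) {
                // Close the connection
                openOfficeConnection.disconnect();
            }
            if (p != null) {
                // Stop the process
                p.destroy();
            }
        }
    }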
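The output path in convert comes from swapping the file extension. Here is a standalone sketch of just that step, so it can be checked on its own. PdfPaths and toPdfPath are hypothetical names, and the handling of names without an extension is my assumption about what you would want, not something the original code does.

```java
/** Hypothetical helper: derives the PDF output path from an input path. */
final class PdfPaths {

    static String toPdfPath(String inputPath) {
        int dot = inputPath.lastIndexOf('.');
        int separator = Math.max(inputPath.lastIndexOf('/'), inputPath.lastIndexOf('\\'));
        // Treat the dot as an extension separator only when it sits inside the
        // file name and is not the first character of that name.
        String base = dot > separator + 1 ? inputPath.substring(0, dot) : inputPath;
        return base + ".pdf";
    }
}
```

For example, PdfPaths.toPdfPath("C:\\docs\\report.docx") gives C:\docs\report.pdf, while a path such as a.b/report, where the only dot is in a directory name, simply gets .pdf appended.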

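ReadUtils is there to fix the garbled text that showed up after conversion. Before I added the step that re-saves the file as UTF-8, the output came out perfectly garbled. Here is the code: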

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;

public class ReadUtils {
    // Reads the input file as GBK text and writes it back out as UTF-8.
    public static void saveAsUTF8(String inputFileUrl, String outputFileUrl) throws IOException {
        String lineSeparator = System.getProperty("line.separator");
        StringBuilder content = new StringBuilder();
        // try-with-resources closes the reader and its underlying stream even if reading fails
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(inputFileUrl), "GBK"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                content.append(line).append(lineSeparator);
            }
        }

        try (Writer writer = new OutputStreamWriter(new FileOutputStream(outputFileUrl), "UTF-8")) {
            writer.write(content.toString());
        }
    }
}
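The same GBK-to-UTF-8 re-save can also be sketched with java.nio. This is an alternative under my own assumptions, not the code I actually ran: Reencoder and gbkToUtf8 are hypothetical names, and it only makes sense for plain-text input. Unlike the readLine loop, it leaves the original line endings untouched.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: re-encodes a GBK text file as UTF-8. */
final class Reencoder {

    static void gbkToUtf8(Path input, Path output) throws IOException {
        // Decode the raw bytes as GBK, then encode the same text as UTF-8.
        byte[] raw = Files.readAllBytes(input);
        String text = new String(raw, Charset.forName("GBK"));
        Files.write(output, text.getBytes(StandardCharsets.UTF_8));
    }
}
```

It reads the whole file into memory, which is fine for ordinary documents but not for very large files.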

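Now the important part: out of the box, jodconverter 2.2.1 really does not support docx-to-PDF. But if you override the BasicDocumentFormatRegistry class so that file extensions are normalized in one place, then docx, Excel, and txt files all convert, and even xml works! (The only flaw is that the resulting PDF is reported as corrupted when you open it, and that happens only for xml-to-PDF.)
First create the package com.artofsolving.jodconverter in your own source tree, then create the BasicDocumentFormatRegistry class inside it (careful: the name must be exactly the same!). Because it has the same fully qualified name, your class will typically be loaded in place of the one in the jar.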
Here is the code for BasicDocumentFormatRegistry:

package com.artofsolving.jodconverter;

import java.util.ArrayList;
import java.util.List;

public class BasicDocumentFormatRegistry implements DocumentFormatRegistry {

    private List<DocumentFormat> documentFormats = new ArrayList<DocumentFormat>();

    public void addDocumentFormat(DocumentFormat documentFormat) {
        documentFormats.add(documentFormat);
    }

    protected List<DocumentFormat> getDocumentFormats() {
        return documentFormats;
    }

    @Override
    public DocumentFormat getFormatByFileExtension(String extension) {
        if (extension == null) {
            return null;
        }
        // Lowercase first so that "DOCX" is matched as well.
        String lowerExtension = extension.toLowerCase();
        // Normalize the extension: docx -> doc, pptx -> ppt, xlsx -> xls
        if (lowerExtension.contains("doc")) {
            lowerExtension = "doc";
        }
        if (lowerExtension.contains("ppt")) {
            lowerExtension = "ppt";
        }
        if (lowerExtension.contains("xls")) {
            lowerExtension = "xls";
        }
        for (DocumentFormat format : documentFormats) {
            if (format.getFileExtension().equals(lowerExtension)) {
                return format;
            }
        }
        return null;
    }

    @Override
    public DocumentFormat getFormatByMimeType(String mimeType) {
        for (DocumentFormat format : documentFormats) {
            if (format.getMimeType().equals(mimeType)) {
                return format;
            }
        }
        return null;
    }
}
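The heart of the trick is the extension normalization in getFormatByFileExtension. Here is that step pulled out as a standalone sketch, so it can be tried without jodconverter on the classpath. ExtensionNormalizer and normalize are hypothetical names; this version lowercases before matching so that DOCX also maps to doc.

```java
import java.util.Locale;

/** Hypothetical helper: maps newer Office extensions onto the ones the registry knows. */
final class ExtensionNormalizer {

    static String normalize(String extension) {
        if (extension == null) {
            return null;
        }
        String lower = extension.toLowerCase(Locale.ROOT);
        // docx -> doc, pptx -> ppt, xlsx -> xls; anything else passes through lowercased.
        if (lower.contains("doc")) {
            return "doc";
        }
        if (lower.contains("ppt")) {
            return "ppt";
        }
        if (lower.contains("xls")) {
            return "xls";
        }
        return lower;
    }
}
```

With this mapping a docx file is looked up under the doc format entry, and OpenOffice itself still works out how to read the actual file contents when it loads the document.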