Three approaches: documents4j, GroupDocs, and OpenOffice + JODConverter
I. documents4j
1. Add the dependencies
<dependency>
<groupId>com.documents4j</groupId>
<artifactId>documents4j-local</artifactId>
<version>1.1.5</version>
</dependency>
<dependency>
<groupId>com.documents4j</groupId>
<artifactId>documents4j-transformer-msoffice-word</artifactId>
<version>1.1.5</version>
</dependency>
2. Convert Word to PDF
public interface DocumentTypeConstant {
/**
* DOC
*/
String DOC = ".doc";
/**
* DOCX
*/
String DOCX = ".docx";
}
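// Required imports:
import com.documents4j.api.DocumentType;
import com.documents4j.api.IConverter;
import com.documents4j.job.LocalConverter;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;

/**
 * Converts a Word document to PDF.
 * @param inputStream input stream of the Word document
 * @param suffix file suffix (".doc" or ".docx")
 * @return the PDF as a byte array, or null if the conversion fails
 */
public static byte[] doc2Pdf(InputStream inputStream, String suffix) {
    // Resolve the source type first so that an unknown suffix fails fast
    // instead of silently producing an empty byte array.
    DocumentType sourceType;
    if (DocumentTypeConstant.DOC.equals(suffix)) {
        sourceType = DocumentType.DOC;
    } else if (DocumentTypeConstant.DOCX.equals(suffix)) {
        sourceType = DocumentType.DOCX;
    } else {
        throw new IllegalArgumentException("Unsupported suffix: " + suffix);
    }
    IConverter converter = getConverter();
    try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
        converter.convert(inputStream).as(sourceType).to(outputStream).as(DocumentType.PDF).execute();
        return outputStream.toByteArray();
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        // Always release the converter, even if the conversion failed
        converter.shutDown();
    }
    return null;
}
/**
 * Creates a converter.
 * @return a new local converter
 */
private static IConverter getConverter() {
    return LocalConverter.builder().build();
}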
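doc2Pdf compares the suffix against the exact strings ".doc" and ".docx", so the caller has to pass a lower-case suffix that includes the dot. A minimal sketch of how that suffix could be derived from a file name is shown here; WordSuffix is a hypothetical helper of my own, not part of documents4j.

```java
import java.util.Locale;

// Hypothetical helper (not part of documents4j): normalises a file name
// into the suffix format expected by doc2Pdf.
final class WordSuffix {
    private WordSuffix() {}

    /** Returns ".doc" or ".docx" for a Word file name, or null for anything else. */
    static String of(String fileName) {
        if (fileName == null) {
            return null;
        }
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return null; // no extension at all
        }
        String suffix = fileName.substring(dot).toLowerCase(Locale.ROOT);
        return (".doc".equals(suffix) || ".docx".equals(suffix)) ? suffix : null;
    }
}
```

A caller could then check WordSuffix.of(fileName) for null and reject the upload before opening the stream.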
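Note: In the local implementation, documents4j delegates the conversion of a given file to the corresponding application installed on the same machine. The machine therefore needs software that supports the conversion, such as Microsoft Word / Excel, installed and available in the background. documents4j provides a simple mechanism for registering custom converters, while the concrete implementation hooks into Microsoft Word / Excel. It cannot be used on Linux.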
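Since this approach depends on a local Microsoft Word installation, it can help to fail early with a clear message instead of a converter start-up error. The following is only a sketch under the assumption that checking the os.name system property is good enough for your deployment; LocalWordSupport is a hypothetical class name, and the check says nothing about whether Word is actually installed.

```java
import java.util.Locale;

// Hypothetical guard (not part of documents4j): refuses to use the local
// converter on operating systems where Microsoft Word cannot be present.
final class LocalWordSupport {
    private LocalWordSupport() {}

    /** True if the given os.name value denotes a Windows system. */
    static boolean isWindows(String osName) {
        return osName != null && osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /** Throws if the current JVM does not run on Windows. */
    static void requireWindows() {
        String osName = System.getProperty("os.name");
        if (!isWindows(osName)) {
            throw new IllegalStateException(
                "documents4j local conversion needs Microsoft Word; unsupported OS: " + osName);
        }
    }
}
```

Calling LocalWordSupport.requireWindows() before building the LocalConverter makes the Linux limitation visible at the call site.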
II. GroupDocs
1. Add the dependency
<dependency>
<groupId>com.groupdocs</groupId>
<artifactId>groupdocs-conversion</artifactId>
<version>22.12.1</version>
</dependency>
2. Convert Word to PDF
public static byte[] doc2Pdf(InputStream inputStream) {
    Converter converter = new Converter(() -> inputStream);
    // The callback hands GroupDocs the stream it should write the PDF into
    ByteArrayOutputStream output = new ByteArrayOutputStream();
    converter.convert((SaveDocumentStreamForFileType) t -> output, new PdfConvertOptions());
    return output.toByteArray();
}
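Note: The code runs locally, but after packaging with Docker it fails with "Caused by: Zip64 archives are not supported". The maven-shade-plugin is commonly suggested as a fix. With it, local tests passed, but the Docker build still reported the same error.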
III. OpenOffice + JODConverter
1. Add the dependencies
<dependency>
<groupId>com.artofsolving</groupId>
<artifactId>jodconverter</artifactId>
<version>2.2.1</version>
</dependency>
<dependency>
<groupId>org.jodconverter</groupId>
<artifactId>jodconverter-spring-boot-starter</artifactId>
<version>4.3.0</version>
</dependency>
2. Write the code
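// Required imports (JODConverter 2.2.1):
import com.artofsolving.jodconverter.DefaultDocumentFormatRegistry;
import com.artofsolving.jodconverter.DocumentFormat;
import com.artofsolving.jodconverter.openoffice.connection.SocketOpenOfficeConnection;
import com.artofsolving.jodconverter.openoffice.converter.StreamOpenOfficeDocumentConverter;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;

/**
 * Converts a Word document to PDF.
 * @param inputStream input stream of the Word document
 * @param host IP of the host running OpenOffice
 * @param port port of the OpenOffice service
 * @return the PDF as a byte array (empty if the conversion fails)
 */
public static byte[] wordToPdf(InputStream inputStream, String host, int port) {
    log.info("connect openoffice host: {}, port: {}", host, port);
    SocketOpenOfficeConnection connection = new SocketOpenOfficeConnection(host, port);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
        connection.connect();
        StreamOpenOfficeDocumentConverter converter = new StreamOpenOfficeDocumentConverter(connection);
        DefaultDocumentFormatRegistry formatReg = new DefaultDocumentFormatRegistry();
        // DEFAULT_SUFFIX and WORD_SUFFIX are constants defined elsewhere in the class (not shown here)
        DocumentFormat targetFormat = formatReg.getFormatByFileExtension(DEFAULT_SUFFIX);
        DocumentFormat sourceFormat = formatReg.getFormatByFileExtension(WORD_SUFFIX);
        converter.convert(inputStream, sourceFormat, out, targetFormat);
    } catch (Exception e) {
        // Also covers java.net.ConnectException when OpenOffice is unreachable
        log.error("word to pdf conversion failed", e);
    } finally {
        // Only disconnect if the connection was actually established
        if (connection.isConnected()) {
            connection.disconnect();
        }
    }
    return out.toByteArray();
}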
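Because wordToPdf returns whatever is in the buffer even after an exception was caught, the caller can receive an empty or incomplete array. A small sanity check on the result can catch that case. The sketch below only tests for the "%PDF-" marker that PDF files start with; PdfCheck is a hypothetical helper and it does not validate the rest of the document.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: checks that a conversion result starts with the PDF header.
final class PdfCheck {
    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    private PdfCheck() {}

    /** True if the data is non-null and begins with the bytes "%PDF-". */
    static boolean looksLikePdf(byte[] data) {
        if (data == null || data.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (data[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```

The same check works for the byte arrays returned by the documents4j and GroupDocs variants, since it only looks at the output.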
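3. Install OpenOffice
3.1 Download
Download the Linux RPM package from the Apache OpenOffice official download page (Apache OpenOffice - Official Download).
3.2 Install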
# Extract
tar -zxvf <archive>
# Enter the directory
cd zh-cn/RPMS
# Install
yum localinstall *.rpm
# Enter desktop-integration
cd desktop-integration
rpm -ivh openoffice4.1.13-redhat-menus-4.1.13-9810.noarch.rpm
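3.3 Start
# Enter the installation path (default: /opt/openoffice4/program)
cd /opt/openoffice4/program/
# Run the start command; host is the IP to accept connections on, port is the port to listen on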
./soffice "-accept=socket,host=localhost,port=8100;urp;StarOffice.ServiceManager" -nologo -headless -nofirststartwizard &
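If the service should be started from Java rather than from a shell, the same arguments can be assembled programmatically. This is only a sketch that rebuilds the command line shown above; SofficeCommand is a hypothetical helper, and the installation directory is passed in because it may differ from the default.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds the soffice command line used in this section.
final class SofficeCommand {
    private SofficeCommand() {}

    /** Builds the -accept argument for the given host and port. */
    static String acceptArg(String host, int port) {
        return "-accept=socket,host=" + host + ",port=" + port + ";urp;StarOffice.ServiceManager";
    }

    /** Builds the full argument list; programDir is the directory that contains soffice. */
    static List<String> build(String programDir, String host, int port) {
        return Arrays.asList(
            programDir + "/soffice",
            acceptArg(host, port),
            "-nologo",
            "-headless",
            "-nofirststartwizard");
    }
}
```

The list can be handed to new ProcessBuilder(SofficeCommand.build("/opt/openoffice4/program", "localhost", 8100)).start(). Passing the arguments as a list avoids the shell quoting that the semicolons in the -accept value otherwise require.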
4. Check the process
1. netstat -nltp | grep 8100
2. ps -ef | grep openoffice
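The same check can be made from the application side before attempting a conversion. The sketch below simply tries to open a TCP connection to the configured host and port; PortProbe is a hypothetical helper, and a successful connect only shows that something is listening, not that it is OpenOffice.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP port accepts connections.
final class PortProbe {
    private PortProbe() {}

    /** True if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, unknown host, ...
            return false;
        }
    }
}
```

For the setup in this article the call would be PortProbe.isListening("localhost", 8100, 1000).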
5. Start automatically at boot
vim /etc/rc.local
Add the startup command:
/opt/openoffice4/program/soffice "-accept=socket,host=localhost,port=8100;urp;StarOffice.ServiceManager" -nologo -headless -nofirststartwizard &
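6. Troubleshooting
6.1 /opt/openoffice4/program/soffice.bin: error while loading shared libraries: libfreetype.so.6: cannot open shared object file: No such file or directory, or /opt/openoffice4/program/soffice.bin: error while loading shared libraries: libXext.so.6: cannot open shared object file: No such file or directory
Solution:
# This error occurs because the named library (libXext.so.6 in this example) is missing from /opt/openoffice4/program
1. Look for the file in /usr/lib64 or /usr/lib; if it is there, copy it to /opt/openoffice4/program.
2. If it is not there, install it with yum install libXext.x86_64, then copy it from /usr/lib64 or /usr/lib to /opt/openoffice4/program.
The same steps apply to libfreetype.so.6, which on yum-based systems is provided by the freetype package.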
6.2 no suitable windowing system found, exiting
yum groupinstall "X Window System"
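6.3 Chinese text is rendered as garbled characters
1. Copy the fonts from a Windows machine to the server.
2. Copy the fonts into /usr/share/fonts.
3. Run fc-cache to apply the change.
4. Find the OpenOffice process and kill it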
ps -ef | grep openoffice
kill -9 pid
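5. Restart OpenOffice
# Enter the installation path (default: /opt/openoffice4/program)
cd /opt/openoffice4/program/
# Run the start command; host is the IP to accept connections on, port is the port to listen on
./soffice "-accept=socket,host=localhost,port=8100;urp;StarOffice.ServiceManager" -nologo -headless -nofirststartwizard &
Note: Creating a separate subdirectory under /usr/share/fonts to hold the fonts causes OpenOffice to fail to start.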
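Given the note about subdirectories, it can be useful to confirm that the copied fonts really sit directly in the fonts directory. The sketch below lists font files at the top level of a directory and ignores anything in subdirectories; FontFiles is a hypothetical helper, and the set of extensions (.ttf, .ttc, .otf) is an assumption about which font formats were copied.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: lists font files located directly in a directory.
final class FontFiles {
    private FontFiles() {}

    /** Returns the sorted names of .ttf/.ttc/.otf files directly inside dir (no recursion). */
    static List<String> topLevelFonts(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                .filter(Files::isRegularFile)
                .map(p -> p.getFileName().toString())
                .filter(name -> {
                    String lower = name.toLowerCase(Locale.ROOT);
                    return lower.endsWith(".ttf") || lower.endsWith(".ttc") || lower.endsWith(".otf");
                })
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```

Calling FontFiles.topLevelFonts(Paths.get("/usr/share/fonts")) should then include the fonts copied from Windows; if the list lacks them, they probably ended up in a subdirectory.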