1. LibreOffice
Similar to OpenOffice, but more stable.
Pros: stable, consistent output styling
Cons: relatively poor performance
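Usage:
Windows:
    // NOTE: the method name is missing in the original; docxToPdf is an assumed placeholder.
    // Requires: import java.io.File;
    public static String docxToPdf(String docPath) {
        // On Windows the "libreOffice" config value is the directory that contains soffice.
        String libreOfficePath = Global.getConfig("libreOffice");
        if (!libreOfficePath.endsWith(File.separator)) {
            libreOfficePath += File.separator;
        }
        // Command shape: soffice --convert-to pdf --outdir <output dir> <input file>
        // (--outdir is used here to match the Linux variant; paths containing spaces are not quoted)
        String command = libreOfficePath + "soffice --convert-to pdf --outdir "
                + new File(docPath).getParent() + " " + docPath;
        // Run the conversion
        String result = commandExecutor.executeCommand(command, EXECUTE_COMMAND_TIME_OUT).getExecuteOut();
        logger.info(result);
        // Assumes docPath ends in ".docx"; replace() would also alter ".docx" elsewhere in the path
        docPath = docPath.replace(".docx", ".pdf");
        return docPath;
    }
Linux: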
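    // NOTE: the method name is missing in the original; docxToPdf is an assumed placeholder.
    // Requires: import java.io.File;
    public static String docxToPdf(String docPath) {
        // On Linux the "libreOffice" config value is used as the soffice executable itself.
        String libreOfficePath = Global.getConfig("libreOffice");
        String command = libreOfficePath + " --invisible --convert-to pdf:writer_pdf_Export --outdir "
                + new File(docPath).getParent() + " " + docPath;
        // Run the conversion
        String result = commandExecutor.executeCommand(command, EXECUTE_COMMAND_TIME_OUT).getExecuteOut();
        logger.debug("Conversion result: {}", result);
        // Assumes docPath ends in ".docx"; replace() would also alter ".docx" elsewhere in the path
        docPath = docPath.replace(".docx", ".pdf");
        return docPath;
    }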
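Both variants above build the command as a single string, so a document path containing spaces gets split into several arguments. A minimal sketch of one way around that, assuming you can call the JDK's ProcessBuilder directly instead of the project's commandExecutor: pass each argument as its own list element. The class name SofficeCommand and the use of --headless are my own assumptions, not part of the original code.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the original project.
final class SofficeCommand {

    /** Builds the soffice argument list; every argument stays a separate element. */
    static List<String> build(String sofficeBinary, String docPath) {
        File doc = new File(docPath).getAbsoluteFile();
        List<String> command = new ArrayList<>();
        command.add(sofficeBinary);
        command.add("--headless");             // assumption: no GUI is wanted on a server
        command.add("--convert-to");
        command.add("pdf:writer_pdf_Export");  // same filter as the Linux variant above
        command.add("--outdir");
        command.add(doc.getParent());          // write the PDF next to the source document
        command.add(doc.getPath());
        return command;
    }

    /** Runs the conversion; returns true only if soffice exits with code 0 in time. */
    static boolean convert(String sofficeBinary, String docPath, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(build(sofficeBinary, docPath))
                .inheritIO()                   // forward soffice output to this JVM's console
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();         // give up on a conversion that hangs
            return false;
        }
        return process.exitValue() == 0;
    }
}
```

Calling SofficeCommand.convert("soffice", "/data/my docs/report.docx", 120) would then hand the path to soffice as one argument, whatever whitespace it contains.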
2. docx4j
Pros: slightly better performance than LibreOffice
Cons: still slow overall, and the PDF styling often differs from the Word document
    private static Mapper fontMapper = new IdentityPlusMapper();
    // Register font mappings: Chinese font names used in the document -> physical fonts.
    // PhysicalFonts.get(...) returns null when the font is not installed on the machine.
    static {
        fontMapper.put("隶书", PhysicalFonts.get("LiSu"));
        fontMapper.put("宋体", PhysicalFonts.get("SimSun"));
        fontMapper.put("微软雅黑", PhysicalFonts.get("Microsoft Yahei"));
        fontMapper.put("黑体", PhysicalFonts.get("SimHei"));
        fontMapper.put("楷体", PhysicalFonts.get("KaiTi"));
        fontMapper.put("新宋体", PhysicalFonts.get("NSimSun"));
        fontMapper.put("华文行楷", PhysicalFonts.get("STXingkai"));
        fontMapper.put("华文仿宋", PhysicalFonts.get("STFangsong"));
        fontMapper.put("宋体扩展", PhysicalFonts.get("simsun-extB"));
        fontMapper.put("仿宋", PhysicalFonts.get("FangSong"));
        fontMapper.put("仿宋_GB2312", PhysicalFonts.get("FangSong_GB2312"));
        fontMapper.put("幼圆", PhysicalFonts.get("YouYuan"));
        fontMapper.put("华文宋体", PhysicalFonts.get("STSong"));
        fontMapper.put("华文中宋", PhysicalFonts.get("STZhongsong"));
    }

    // Converts the .docx file to a PDF next to it; returns the PDF path, or null on failure.
    public static String docxToPdf(String docxPath) {
        OutputStream outputStream = null;
        String pdfPath = docxPath.replace(".docx", ".pdf");
        try {
            WordprocessingMLPackage mlPackage = WordprocessingMLPackage.load(new File(docxPath));
            mlPackage.setFontMapper(fontMapper);
            outputStream = new BufferedOutputStream(new FileOutputStream(pdfPath));
            FOSettings foSettings = Docx4J.createFOSettings();
            foSettings.setWmlPackage(mlPackage);
            Docx4J.toFO(foSettings, outputStream, Docx4J.FLAG_EXPORT_PREFER_XSL);
        } catch (Exception ex) {
            logger.error("Failed to convert docx to PDF!", ex);
            pdfPath = null;
        } finally {
            IOUtils.closeQuietly(outputStream);
        }
        return pdfPath;
    }
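The snippets in this post derive the output path with replace(".docx", ".pdf"), which replaces every occurrence of ".docx" in the path and misses upper-case extensions. A small sketch of a stricter alternative that only touches the trailing extension; the class and method names (PdfPaths, toPdfPath) are hypothetical, and appending ".pdf" to non-docx paths is just one possible fallback.

```java
// Hypothetical helper, not part of the original project.
final class PdfPaths {

    private static final String DOCX = ".docx";

    /** Replaces a trailing ".docx" (any letter case) with ".pdf". */
    static String toPdfPath(String docxPath) {
        int start = docxPath.length() - DOCX.length();
        // regionMatches with ignoreCase=true returns false when start is negative
        if (docxPath.regionMatches(true, start, DOCX, 0, DOCX.length())) {
            return docxPath.substring(0, start) + ".pdf";
        }
        // Not a .docx path: append ".pdf" rather than return the input unchanged,
        // so the caller never ends up overwriting the source file.
        return docxPath + ".pdf";
    }
}
```

With this helper a directory that happens to be named something like archive.docx is left alone and only the file's own extension changes.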
3. documents4j
Pros: stable styling, high performance
Cons: depends on locally installed Office software to do the conversion; on Linux it has to call a remote conversion service
Usage:
    public static String docxToPdf(String docxPath) {
        String pdfPath = docxPath.replace(".docx", ".pdf");
        boolean success = getConverter()
                .convert(new File(docxPath))
                .as(DocumentType.DOCX)
                .to(new File(pdfPath))
                .as(DocumentType.PDF).execute();
        logger.debug("Word to PDF conversion result: {}", success);
        return success ? pdfPath : null;
    }

    // Shared converter instance, created lazily (declaration not shown in the original).
    private static IConverter converter;

    // synchronized so that concurrent first calls do not build two converters
    public static synchronized IConverter getConverter() {
        if (converter == null) {
            String conversionServerUrl = Global.getConfig("conversionServer.url");
            // If a remote conversion server address is configured, create a remote converter
            if (StringUtils.isNotBlank(conversionServerUrl)) {
                if (!conversionServerUrl.startsWith("http")) {
                    conversionServerUrl = "http://" + conversionServerUrl;
                }
                converter = RemoteConverter.builder()
                        .baseFolder(new File(POfficeConstants.TEMP_SAVE_PATH))
                        .workerPool(20, 25, 2, TimeUnit.SECONDS)
                        .requestTimeout(120, TimeUnit.SECONDS)
                        .baseUri(conversionServerUrl)
                        .build();
            } else {
                // Otherwise create a local converter
                converter = LocalConverter.builder()
                        .baseFolder(new File(POfficeConstants.TEMP_SAVE_PATH))
                        .workerPool(20, 25, 2, TimeUnit.SECONDS)
                        .processTimeout(2L, TimeUnit.MINUTES)
                        .build();
            }
        }
        return converter;
    }
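The converter above already sets requestTimeout and processTimeout, but as the summary notes, documents4j conversions can still block. A hedged sketch of an extra caller-side guard using only the JDK: run the conversion on a worker thread and stop waiting after a deadline. TimedConversion is a hypothetical name, and whether cancelling actually frees the underlying Office process depends on the converter, which this sketch does not address.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of documents4j or the original project.
final class TimedConversion {

    // Daemon threads, so a stuck conversion cannot keep the JVM alive on shutdown.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "conversion-worker");
        t.setDaemon(true);
        return t;
    });

    /** Returns the task's result, or false if it fails, is interrupted, or times out. */
    static boolean run(Callable<Boolean> conversion, long timeout, TimeUnit unit) {
        Future<Boolean> future = POOL.submit(conversion);
        try {
            return Boolean.TRUE.equals(future.get(timeout, unit));
        } catch (TimeoutException e) {
            future.cancel(true);                 // interrupt the worker thread
            return false;
        } catch (ExecutionException e) {
            return false;                        // the conversion itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the caller's interrupt status
            future.cancel(true);
            return false;
        }
    }
}
```

The execute() call from docxToPdf could be wrapped as TimedConversion.run(() -> job.execute(), 3, TimeUnit.MINUTES), where job stands for the conversion job built by the fluent API above.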
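4. JACOB
Pros: stable styling, high performance
Cons: Windows only, and the server must have Office installed; it becomes a bottleneck under high concurrency
One option is to deploy a dedicated Windows server that provides a document conversion service.
A Spring Boot based conversion server example: converter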
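Since the bottleneck shows up under high concurrency, a dedicated conversion server probably wants to cap how many conversions run at once instead of letting every request start an Office instance. A minimal sketch of that idea with a JDK Semaphore; ConversionLimiter and its parameters are hypothetical and not taken from the converter example linked above.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of JACOB or the original project.
final class ConversionLimiter {

    private final Semaphore permits;

    ConversionLimiter(int maxConcurrent) {
        // fair = true: waiting requests are served in arrival order
        this.permits = new Semaphore(maxConcurrent, true);
    }

    /**
     * Runs the task if a permit becomes free within maxWait.
     * Returns null when the server is too busy, so the caller can reject or retry.
     */
    <T> T runLimited(Callable<T> task, long maxWait, TimeUnit unit) throws Exception {
        if (!permits.tryAcquire(maxWait, unit)) {
            return null;
        }
        try {
            return task.call();
        } finally {
            permits.release();   // always free the slot, even if the conversion throws
        }
    }
}
```

A server could create one new ConversionLimiter(4) at startup and route every conversion request through runLimited; the right limit depends on the machine and would need measuring.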
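5. PageOffice
Pros: good compatibility, high performance
Cons: commercial (paid); the client machine needs Office and the Zhuozheng browser control installed; compatibility problems still show up occasionally
How it works: the customer's browser opens the Word document through the Zhuozheng control and calls the JS API that Zhuozheng provides to save the document as a PDF and upload it to the server. The server then post-processes the uploaded PDF (for example, adds a watermark) and offers it to the customer for download.
Because the Word-to-PDF conversion happens on the customer's machine, the server carries almost no load, but the customer's machine must have Office and the Zhuozheng control installed.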
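Summary
In practice I have used all of the first four approaches and hit plenty of pitfalls: with LibreOffice you have to deal with old kernel versions in production; docx4j garbles the styling after conversion; documents4j is unstable and its processes can block; and JACOB only runs on Windows servers.
What we actually use at the moment is Zhuozheng's PageOffice.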