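For a recent project I needed a document preview feature. I looked at several approaches online, such as converting to HTML with POI or converting to PDF with OpenOffice for in-browser preview, and after some evaluation decided to go with OpenOffice.
First, download and install OpenOffice.
Installing on Windows
- Download from the official site: http://www.openoffice.org/
Then wait for the download to finish and run the installer. It is best to keep the default installation directory, because the program relies on that default path when it calls OpenOffice; if you change it, you have to change the program as well.
Starting the service on Windows
OpenOffice may already be started once installation completes. To start it yourself in headless service mode, use the command line:
First change to the installation directory: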
cd C:\Program Files (x86)\OpenOffice 4\program
Then enter the command (each option must be separated by a space):
soffice -headless -accept="socket,host=127.0.0.1,port=8100;urp;" -nofirststartwizard
That completes the OpenOffice setup on Windows.
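If the application is meant to start the service itself rather than rely on someone running the command by hand, the same command can be assembled in Java. The following is only a minimal sketch: `SofficeLauncher`, `buildCommand`, and `start` are hypothetical names that do not come from OpenOffice or JODConverter, and the flags are simply the ones used in the command above.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that builds and starts the soffice service command. */
final class SofficeLauncher {

    /**
     * Builds the argument list. Each option is its own list element,
     * so no shell quoting of the -accept value is needed.
     */
    static List<String> buildCommand(String programDir, String host, int port) {
        List<String> command = new ArrayList<>();
        command.add(new File(programDir, "soffice").getPath());
        command.add("-headless");
        command.add("-accept=socket,host=" + host + ",port=" + port + ";urp;");
        command.add("-nofirststartwizard");
        return command;
    }

    /** Starts the process and returns immediately; the caller decides when to destroy it. */
    static Process start(String programDir, String host, int port) throws IOException {
        return new ProcessBuilder(buildCommand(programDir, host, port))
                .inheritIO()
                .start();
    }
}
```

A call such as `SofficeLauncher.start("C:\\Program Files (x86)\\OpenOffice 4\\program", "127.0.0.1", 8100)` would use the default Windows directory from this article; this is also why changing the installation directory means changing the program.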
Installing on Linux
- Download from the official site: http://www.openoffice.org/
Then extract the archive in the download directory:
tar -zxvf Apache_OpenOffice_4.1.7_Linux_x86-64_install-rpm_zh-CN.tar.gz
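Starting the service on Linux
After extracting, you will see several directories. Then:
- Change into the RPMS directory and run rpm -ivh *rpm to install all the rpm files in it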
rpm -ivh *rpm
- When that finishes, change to the /opt/openoffice4/program directory
cd /opt/openoffice4/program
- Run
/opt/openoffice4/program/soffice "-accept=socket,host=127.0.0.1,port=8100;urp;" -headless -nofirststartwizard &
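This starts the service in the background.
If this reports an error, see the article "OpenOffice 在 Linux 下安装使用" (Installing and Using OpenOffice on Linux) for solutions.
If the start command fails with /opt/openoffice4/program/soffice.bin: error while loading shared libraries: libXext.so.6: cannot open shared object file: No such file or directory , you need to install the libXext dependency; choose the package that matches your Linux distribution.
Run yum install libXext.x86_64
Find libXext.so.6 in /usr/lib64 or /usr/lib and copy it to the /opt/openoffice4/program/ directory
Run chmod 777 libXext.so.6 on the copied file
If the error is "no suitable windowing system found, exiting.", you need to install the desktop packages: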
yum groupinstall "X Window System"
- Once that is resolved, run the start command again
/opt/openoffice4/program/soffice "-accept=socket,host=127.0.0.1,port=8100;urp;" -headless -nofirststartwizard &
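- Verify
ps -ef | grep openoffice should show the process
netstat -tunlp |grep 8100 should show the default OpenOffice port in the listening state
That completes the installation.
If something went wrong during installation, uninstall and install again.
First list the installed OpenOffice packages: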
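The same port check can be done from the application before it attempts a conversion. This is a minimal sketch using only the JDK; `PortCheck` and `isListening` are hypothetical names, and the host and port are the ones used in the start command above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper that tests whether something accepts TCP connections on a port. */
final class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("OpenOffice listening: " + isListening("127.0.0.1", 8100, 1000));
    }
}
```

A successful connection only shows that some process is listening on the port; it does not prove that the process is OpenOffice.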
rpm -qa | grep openoffice
Usually it is enough to remove all of them:
rpm -e `rpm -qa | grep openoffice`
Usage
Add the dependencies
<!-- OpenOffice libraries for document preview -->
<dependency>
<groupId>org.openoffice</groupId>
<artifactId>juh</artifactId>
<version>4.1.2</version>
</dependency>
<dependency>
<groupId>org.openoffice</groupId>
<artifactId>jurt</artifactId>
<version>4.1.2</version>
</dependency>
<dependency>
<groupId>org.openoffice</groupId>
<artifactId>ridl</artifactId>
<version>4.1.2</version>
</dependency>
<dependency>
<groupId>org.openoffice</groupId>
<artifactId>unoil</artifactId>
<version>4.1.2</version>
</dependency>
<dependency>
<groupId>com.thoughtworks.xstream</groupId>
<artifactId>xstream</artifactId>
<version>1.4.10</version>
</dependency>
<!-- Office file conversion (HTML/PDF) -->
<!-- https://mvnrepository.com/artifact/com.artofsolving/jodconverter-maven-plugin -->
<dependency>
<groupId>com.artofsolving</groupId>
<artifactId>jodconverter-maven-plugin</artifactId>
<version>2.2.1</version>
</dependency>
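Note that version 2.2.1 of jodconverter-maven-plugin does not support docx conversion, so you either download the 2.2.2 package and publish it to your private repository, or override the relevant method.
Overriding to fix the docx conversion problem
The override must use the same package path, com.artofsolving.jodconverter, so that your class is loaded in place of the one in the jar: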
package com.artofsolving.jodconverter;

import java.util.ArrayList;
import java.util.List;

public class BasicDocumentFormatRegistry implements DocumentFormatRegistry {

    private List<DocumentFormat> documentFormats = new ArrayList<DocumentFormat>();

    public BasicDocumentFormatRegistry() {
    }

    public void addDocumentFormat(DocumentFormat documentFormat) {
        this.documentFormats.add(documentFormat);
    }

    protected List<DocumentFormat> getDocumentFormats() {
        return this.documentFormats;
    }

    @Override
    public DocumentFormat getFormatByFileExtension(String extension) {
        if (extension == null) {
            return null;
        }
        // Lower-case first so that "DOCX" is treated the same as "docx".
        String lowerExtension = extension.toLowerCase();
        // Map docx/pptx/xlsx onto the legacy formats that 2.2.1 already registers.
        if (lowerExtension.contains("doc")) {
            lowerExtension = "doc";
        }
        if (lowerExtension.contains("ppt")) {
            lowerExtension = "ppt";
        }
        if (lowerExtension.contains("xls")) {
            lowerExtension = "xls";
        }
        for (DocumentFormat format : this.documentFormats) {
            if (format.getFileExtension().equals(lowerExtension)) {
                return format;
            }
        }
        return null;
    }

    @Override
    public DocumentFormat getFormatByMimeType(String mimeType) {
        for (DocumentFormat format : this.documentFormats) {
            if (format.getMimeType().equals(mimeType)) {
                return format;
            }
        }
        return null;
    }
}
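To see what the override changes without a running OpenOffice, the extension mapping can be looked at in isolation. This is a standalone sketch of the idea, not part of JODConverter: `ExtensionMapper` and `toLegacyExtension` are hypothetical names, and the sketch lower-cases the input before matching so that uppercase extensions are handled too.

```java
import java.util.Locale;

/** Hypothetical stand-alone version of the extension mapping used in the override. */
final class ExtensionMapper {

    /** Maps OOXML extensions (docx, pptx, xlsx) to their legacy counterparts. */
    static String toLegacyExtension(String extension) {
        if (extension == null) {
            return null;
        }
        String lower = extension.toLowerCase(Locale.ROOT);
        if (lower.contains("doc")) {
            return "doc";
        }
        if (lower.contains("ppt")) {
            return "ppt";
        }
        if (lower.contains("xls")) {
            return "xls";
        }
        // Anything else is passed through unchanged (apart from case).
        return lower;
    }

    public static void main(String[] args) {
        System.out.println(toLegacyExtension("docx")); // prints "doc"
        System.out.println(toLegacyExtension("XLSX")); // prints "xls"
        System.out.println(toLegacyExtension("pdf"));  // prints "pdf"
    }
}
```

Because the match uses `contains`, any extension that merely contains "doc", "ppt", or "xls" is mapped as well; a stricter variant could compare against an explicit list of extensions.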
Utility class
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ConnectException;
import java.text.SimpleDateFormat;
import java.util.Date;
import com.artofsolving.jodconverter.DocumentConverter;
import com.artofsolving.jodconverter.openoffice.connection.OpenOfficeConnection;
import com.artofsolving.jodconverter.openoffice.connection.SocketOpenOfficeConnection;
import com.artofsolving.jodconverter.openoffice.converter.OpenOfficeDocumentConverter;
import com.artofsolving.jodconverter.openoffice.converter.StreamOpenOfficeDocumentConverter;
public class Office2HtmlOrPdfUtil {

    // Connection settings of the OpenOffice service. Spring's @Value does not inject into
    // static fields, and this class is not created by Spring, so plain defaults are used.
    private static int port = 8100;
    private static String host = "127.0.0.1";
    private static Office2HtmlOrPdfUtil office2HtmlOrPdfUtil;

    /** Returns the shared Office2HtmlOrPdfUtil instance. */
    public static synchronized Office2HtmlOrPdfUtil getDoc2HtmlUtilInstance() {
        if (office2HtmlOrPdfUtil == null) {
            office2HtmlOrPdfUtil = new Office2HtmlOrPdfUtil();
        }
        return office2HtmlOrPdfUtil;
    }

    /**
     * Converts an office document to PDF.
     *
     * @param type file extension including the leading dot: ".doc", ".docx", ".xls" or ".ppt"
     * @return the generated PDF file name, or null if the type is not supported
     */
    public static String file2pdf(InputStream fromFileInputStream, String toFilePath, String type) throws IOException {
        if (!".doc".equals(type) && !".docx".equals(type) && !".xls".equals(type) && !".ppt".equals(type)) {
            return null;
        }
        String timesuffix = new SimpleDateFormat("yyyyMMddHHmmss").format(new Date());
        // For ".docx" this yields "docx_<timestamp>.docx" and "docx_<timestamp>.pdf".
        String prefix = type.substring(1) + "_" + timesuffix;
        String docFileName = prefix + type;
        String pdfFileName = prefix + ".pdf";
        File pdfOutputFile = new File(toFilePath + File.separatorChar + pdfFileName);
        File docInputFile = new File(toFilePath + File.separatorChar + docFileName);
        if (pdfOutputFile.exists()) {
            pdfOutputFile.delete();
        }
        pdfOutputFile.createNewFile();
        if (docInputFile.exists()) {
            docInputFile.delete();
        }
        docInputFile.createNewFile();
        // Copy the input stream into a temporary source file; both streams are closed automatically.
        try (InputStream in = fromFileInputStream;
             OutputStream os = new FileOutputStream(docInputFile)) {
            byte[] buffer = new byte[1024 * 8];
            int bytesRead;
            while ((bytesRead = in.read(buffer)) != -1) {
                os.write(buffer, 0, bytesRead);
            }
        }
        // Connect to the OpenOffice service and convert.
        OpenOfficeConnection connection = new SocketOpenOfficeConnection(host, port);
        try {
            connection.connect();
            DocumentConverter converter = new OpenOfficeDocumentConverter(connection);
            converter.convert(docInputFile, pdfOutputFile);
        } catch (ConnectException e) {
            System.err.println("Conversion failed; check that the OpenOffice service is running.");
            throw e;
        } finally {
            if (connection.isConnected()) {
                connection.disconnect();
            }
            // Delete the temporary source file once conversion is done.
            docInputFile.delete();
        }
        return pdfFileName;
    }

    /**
     * Converts a file to HTML.
     *
     * @param type file extension without the leading dot: "doc", "xls", "ppt", "txt" or "pdf"
     * @return the generated HTML file name, or null if the type is not supported
     */
    public static String file2Html(InputStream fromFileInputStream, String toFilePath, String type) throws Exception {
        if (!"doc".equals(type) && !"xls".equals(type) && !"ppt".equals(type)
                && !"txt".equals(type) && !"pdf".equals(type)) {
            return null;
        }
        String timesuffix = new SimpleDateFormat("yyyyMMddHHmmss").format(new Date());
        String docFileName = timesuffix.concat(".").concat(type);
        String htmFileName = timesuffix.concat(".html");
        File htmlOutputFile = new File(toFilePath + File.separatorChar + htmFileName);
        File docInputFile = new File(toFilePath + File.separatorChar + docFileName);
        if (htmlOutputFile.exists()) {
            htmlOutputFile.delete();
        }
        htmlOutputFile.createNewFile();
        docInputFile.createNewFile();
        // Copy the input stream into a temporary source file; both streams are closed automatically.
        try (InputStream in = fromFileInputStream;
             OutputStream os = new FileOutputStream(docInputFile)) {
            byte[] buffer = new byte[1024 * 8];
            int bytesRead;
            while ((bytesRead = in.read(buffer)) != -1) {
                os.write(buffer, 0, bytesRead);
            }
        }
        OpenOfficeConnection connection = new SocketOpenOfficeConnection(host, port);
        try {
            connection.connect();
            // Convert using the stream-based converter.
            DocumentConverter converter = new StreamOpenOfficeDocumentConverter(connection);
            converter.convert(docInputFile, htmlOutputFile);
        } finally {
            if (connection.isConnected()) {
                connection.disconnect();
            }
            // Delete the temporary source file once conversion is done.
            docInputFile.delete();
        }
        return htmFileName;
    }

    public static void main(String[] args) throws IOException {
        // Sample input; adjust the paths for your machine.
        File file = new File("D:\\logs\\休假申请单.docx");
        // file2pdf closes the stream when it has finished copying it.
        String fileName = file2pdf(new FileInputStream(file), "D:\\logs", ".docx");
        System.out.println(fileName);
    }
}
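One detail that is easy to trip over: file2pdf expects the type with a leading dot (".docx"), while file2Html expects it without one ("doc"). A caller that starts from a file name can derive both forms from a single helper. The sketch below is only an illustration; `FileTypes` and `extensionOf` are hypothetical names and not part of the utility class above.

```java
import java.util.Locale;

/** Hypothetical helper for deriving the type argument from a file name. */
final class FileTypes {

    /** Returns the lower-cased extension without the dot, or "" if there is none. */
    static String extensionOf(String fileName) {
        // Only look at the last path segment, for both / and \ separators.
        int separator = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        int dot = fileName.lastIndexOf('.');
        if (dot <= separator || dot == fileName.length() - 1) {
            return "";
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String ext = extensionOf("D:\\logs\\report.DOCX");
        System.out.println(ext);       // prints "docx", the form file2Html takes
        System.out.println("." + ext); // prints ".docx", the form file2pdf takes
    }
}
```

With this, a call could look like `Office2HtmlOrPdfUtil.file2pdf(in, dir, "." + FileTypes.extensionOf(name))`, which avoids passing the wrong form and silently getting null back.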
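Testing
Run the main method of the utility class:
It works!
PS: conversion on Windows is basically fine, but in PDFs converted on Linux some Chinese text was not rendered at all.
For the fix, see the article "OpenOffice4的使用" (Using OpenOffice 4).
Copy the Windows font library
from C:\Windows\Fonts to /usr/share/fonts
Then refresh the font cache:
fc-cache -fv
On Linux, after the font files have been copied into the font directory, fc-cache -fv scans the font directories and rebuilds the font information cache, so applications can use the newly installed fonts right away.
Then restart OpenOffice (it may also work without a restart; try that first).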