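Some ways to open a Word document from a JSP page, or to extract its content:
1. Stream the file to the browser. The user is either prompted to download it, or the browser opens it directly.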
<%@ page language="java" contentType="application/msword; charset=gb2312" pageEncoding="gb2312"%>
<%@page import="java.io.*"%>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<%@ include file="/common/page_include.jsp"%>
<meta http-equiv="Content-Type" content="application/msword; charset=gb2312" />
</head>
<body>
<%
// Locate the document under the web application root; "application" is the JSP implicit ServletContext.
File file = new File(application.getRealPath("/") + "/frame/帮助文档.docx");
FileInputStream is = null;
OutputStream os = null;
try {
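//response.reset();
// Send the registered MIME type for .docx files.
response.setContentType("application/vnd.openxmlformats-officedocument.wordprocessingml.document");
// Percent-encode the Chinese file name (RFC 6266 filename*) so that browsers do not garble it.
response.setHeader("Content-Disposition", "attachment; filename*=UTF-8''" + java.net.URLEncoder.encode("帮助文档.docx", "UTF-8"));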
is = new FileInputStream (file);
os = response.getOutputStream();
out.clear();
out = pageContext.pushBody();
byte buff[]=new byte[1024];
int len = 0; // number of bytes actually read on each pass
while ((len = is.read(buff)) != -1) {
os.write(buff, 0, len);
}
} catch (UnsupportedEncodingException e) {
e.printStackTrace();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
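} finally {
// Handle each stream separately and skip any that was never opened,
// otherwise a missing file causes a NullPointerException here.
try {
if (os != null) os.flush();
} catch (IOException e) {
e.printStackTrace();
}
try {
if (is != null) is.close();
// file.delete();
} catch (IOException e) {
e.printStackTrace();
}
}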
%>
</body>
</html>
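The Content-disposition header in the page above carries a Chinese file name, and browsers tend to garble non-ASCII names sent as raw bytes. A minimal sketch of one way to build the header value, using the `filename*` form from RFC 6266 plus an ASCII fallback, is shown here. The class name `ContentDisposition` and its `attachment` method are hypothetical helpers for illustration, not part of the original page or of the Servlet API.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper for illustration; not part of the original JSP. */
class ContentDisposition {

    /** Builds an "attachment" header value that can carry a non-ASCII file name. */
    static String attachment(String fileName) {
        // Fallback for old clients: replace non-ASCII characters, quotes and backslashes.
        String ascii = fileName
                .replaceAll("[^\\x20-\\x7E]", "_")
                .replaceAll("[\"\\\\]", "_");
        return "attachment; filename=\"" + ascii + "\"; filename*=UTF-8''" + percentEncode(fileName);
    }

    private static String percentEncode(String value) {
        try {
            // URLEncoder targets HTML forms: it writes a space as '+' and leaves '*' alone,
            // so both are adjusted for use inside a header parameter.
            return URLEncoder.encode(value, "UTF-8")
                    .replace("+", "%20")
                    .replace("*", "%2A");
        } catch (UnsupportedEncodingException e) {
            // Every Java runtime is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The escapes spell the Chinese file name used in the page above.
        System.out.println(attachment("\u5E2E\u52A9\u6587\u6863.docx"));
    }
}
```

In the JSP this would be used as `response.setHeader("Content-Disposition", ContentDisposition.attachment(file.getName()));`, assuming the helper class is on the web application's classpath.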
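2. Parse the content of the Word document with POI. I only got as far as extracting the text this way: extracting images ran into problems, and the text is no longer displayed with the formatting it had in the original Word document.
For reference: http://blog.csdn.net/hemingwang0902/article/details/4381598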
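To illustrate why plain-text extraction loses the formatting, here is a rough sketch that does not use POI at all. It relies only on the JDK and on the fact that a .docx file is a ZIP archive whose main text lives in `word/document.xml`. The class `DocxText` and its methods are hypothetical names chosen for this example, and `sampleDocx` builds a tiny in-memory document only so that the sketch can run without a real file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Hypothetical helper for illustration; it uses only the JDK, not POI. */
class DocxText {
    // Namespace of the w:p (paragraph) and w:t (text) elements.
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns the plain text of a .docx stream, one line per paragraph. */
    static String extract(InputStream docx) {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("word/document.xml".equals(entry.getName())) {
                    return textOf(readFully(zip));
                }
            }
            throw new IllegalArgumentException("word/document.xml not found");
        } catch (IOException | ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Could not read the .docx stream", e);
        }
    }

    // Reads the current ZIP entry to its end.
    private static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int len;
        while ((len = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, len);
        }
        return buffer.toByteArray();
    }

    private static String textOf(byte[] documentXml)
            throws IOException, ParserConfigurationException, SAXException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Refuse DOCTYPE declarations so an untrusted file cannot trigger XXE.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        Document xml = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(documentXml));

        StringBuilder text = new StringBuilder();
        NodeList paragraphs = xml.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            // Only the w:t text is kept; run properties such as bold are ignored.
            NodeList texts = ((Element) paragraphs.item(i)).getElementsByTagNameNS(W_NS, "t");
            for (int j = 0; j < texts.getLength(); j++) {
                text.append(texts.item(j).getTextContent());
            }
            text.append('\n');
        }
        return text.toString();
    }

    /** Builds a two-paragraph .docx in memory (sample data, not a real document). */
    static byte[] sampleDocx() {
        String xml = "<w:document xmlns:w=\"" + W_NS + "\"><w:body>"
                + "<w:p><w:r><w:t>Hello</w:t></w:r>"
                + "<w:r><w:rPr><w:b/></w:rPr><w:t> world</w:t></w:r></w:p>"
                + "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
                + "</w:body></w:document>";
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("word/document.xml"));
            zip.write(xml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.print(extract(new ByteArrayInputStream(sampleDocx())));
    }
}
```

The bold marker on the second run of the sample is dropped, which mirrors the formatting loss described above: a text extractor keeps the characters and discards the run properties. Images are stored as separate entries under `word/media/` in the same archive, so they have to be read and served separately from the text. This sketch only handles .docx files and does not treat text boxes or other nested content specially.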
3. Call PageOffice from Java.
http://wenku.baidu.com/link?url=8G7LTQaMVxQL1AdX-Pf9xghKE0wTq2psCa_xIsPShwpCLC6gnVULrpwxr9G3fquPhcpkPoYBDHSCNqnZdnZ5W9qpWIqZa3xEHiDOpBhI8AS