Converting Word to HTML or TXT with Java


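Recently a project of mine needed to preview Word files in a web page. Word documents can be opened directly in the page, but that has two drawbacks: first, the client must have Word installed; second, client environments and Office versions differ, which makes the preview unreliable. Searching online, I found that POI can convert Word to txt, but all formatting is lost and only bare text remains. I then looked into jacob, where the advice online was all over the place. In the end I downloaded jacob from SourceForge myself, read its docs, and got it working.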

1 Go to http://sourceforge.net/projects/jacob-project/ and download the latest jacob release.

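The downloaded zip includes, among other files, jacob.jar and the native libraries jacob-1.15-M3-x86.dll and jacob-1.15-M3-x64.dll used in the steps below.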

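2 Copy the DLL that matches your JVM into %JAVA_HOME%/jre/bin: jacob-1.15-M3-x86.dll for a 32-bit JVM, jacob-1.15-M3-x64.dll for a 64-bit JVM. The choice depends on the bitness of the JVM, not on whether the CPU is made by Intel or AMD. Make sure the jre directory is the JRE you are actually running, since many Eclipse versions bundle their own JRE; you can check this in Eclipse under Window -> Preferences -> Installed JREs.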

3 Copy jacob.jar into your project's lib directory and make sure it is on the classpath.
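Picking the wrong DLL or the wrong JRE is the easiest way to get step 2 wrong, so here is a minimal sketch that prints which JRE is running and which DLL matches it. The class name JacobDllPicker is hypothetical, and the sketch assumes a HotSpot-style JVM that defines the sun.arch.data.model property; when that property is missing it falls back to os.arch.

```java
// Hypothetical helper: decides which Jacob native library matches the running JVM.
final class JacobDllPicker {

    // File names as shipped in the jacob-1.15-M3 download used in this post.
    static final String DLL_32 = "jacob-1.15-M3-x86.dll";
    static final String DLL_64 = "jacob-1.15-M3-x64.dll";

    /**
     * @param dataModel value of "sun.arch.data.model" ("32" or "64"); may be null
     *                  on JVMs that do not define this property (assumption: HotSpot does)
     * @param osArch    value of "os.arch", used only as a fallback
     */
    static String dllNameFor(String dataModel, String osArch) {
        if ("64".equals(dataModel)) {
            return DLL_64;
        }
        if ("32".equals(dataModel)) {
            return DLL_32;
        }
        // Fallback: a 64-bit JVM typically reports "amd64" or "x86_64" here.
        return osArch != null && osArch.contains("64") ? DLL_64 : DLL_32;
    }

    public static void main(String[] args) {
        // java.home shows which JRE is really in use (for example one bundled with Eclipse).
        System.out.println("java.home   = " + System.getProperty("java.home"));
        System.out.println("DLL to copy = " + dllNameFor(
                System.getProperty("sun.arch.data.model"),
                System.getProperty("os.arch")));
    }
}
```

Run it from the same Eclipse launch configuration as the converter, so that it reports the JRE that will actually load the DLL.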

With the preparation done, it is time to write the program.

import com.jacob.activeX.ActiveXComponent;
import com.jacob.com.Dispatch;
import com.jacob.com.Variant;

public class Test {
    public static void main(String[] args) {
        // Start Word through COM.
        ActiveXComponent app = new ActiveXComponent("Word.Application");
        try {
            // Keep the Word window hidden.
            app.setProperty("Visible", new Variant(false));
            Dispatch documents = app.getProperty("Documents").toDispatch();
            // Open aaaa.doc (ConfirmConversions = false, ReadOnly = true)
            Dispatch doc = Dispatch.invoke(
                documents,
                "Open",
                Dispatch.Method,
                new Object[]{"e://aaaa.doc", new Variant(false), new Variant(true)},
                new int[1]
            ).toDispatch();
            // Save a copy as aaaa.html
            Dispatch.invoke(
                doc,
                "SaveAs",
                Dispatch.Method,
                new Object[]{
                    "c://aaaa.html",
                    new Variant(8) // WdSaveFormat: 7 = Unicode text (txt), 8 = HTML
                },
                new int[1]
            );
            // Close the document without saving changes to the original.
            Dispatch.call(doc, "Close", new Variant(false));
        } finally {
            // Quit Word so that no hidden WINWORD.EXE process is left running.
            app.invoke("Quit", new Variant[0]);
        }
    }
}
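The bare numbers passed to SaveAs in the program above are values of Word's WdSaveFormat enumeration. As a small sketch, they can be given names and paired with the file extension of the output; the class WordSaveFormats and the method targetPath are hypothetical names introduced only for illustration, and the list covers just a few common formats.

```java
// Hypothetical helper: names a few WdSaveFormat values and derives the output path.
final class WordSaveFormats {

    enum Format {
        TEXT(2, "txt"),           // wdFormatText
        UNICODE_TEXT(7, "txt"),   // wdFormatUnicodeText, the "7" used in this post
        HTML(8, "html"),          // wdFormatHTML, the "8" used in this post
        FILTERED_HTML(10, "html"); // wdFormatFilteredHTML

        final int code;          // number to wrap in new Variant(code) for SaveAs
        final String extension;  // extension of the generated file

        Format(int code, String extension) {
            this.code = code;
            this.extension = extension;
        }
    }

    /** Replaces the extension of sourcePath (if it has one) with the format's extension. */
    static String targetPath(String sourcePath, Format format) {
        int lastSeparator = Math.max(sourcePath.lastIndexOf('/'), sourcePath.lastIndexOf('\\'));
        int lastDot = sourcePath.lastIndexOf('.');
        // Only treat the dot as an extension separator if it is in the file name itself.
        String stem = lastDot > lastSeparator ? sourcePath.substring(0, lastDot) : sourcePath;
        return stem + "." + format.extension;
    }

    public static void main(String[] args) {
        Format format = Format.HTML;
        // prints "8 -> e:/aaaa.html"
        System.out.println(format.code + " -> " + targetPath("e:/aaaa.doc", format));
    }
}
```

With this in place the SaveAs arguments could be written as targetPath(source, format) and new Variant(format.code), which keeps the format number and the file extension from drifting apart.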

It is very simple to use.

Of course, jacob can do a lot more than Word-to-HTML conversion:

Jacob is a Java library that lets Java applications communicate with Microsoft Windows DLLs or COM libraries. It does this through the use of a custom DLL that the Jacob Java classes communicate with via JNI. The library and dll isolate the Java developer from the underlying windows libraries so that the Java developer does not have to write custom JNI code.

Further features are left to explore as the need arises.
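Since the Jacob classes load their native DLL themselves, copying it into the JRE's bin directory is not the only option. As far as I know, Jacob's LibraryLoader also honours a jacob.dll.path system property holding the full path of the DLL; treat that as an assumption to verify against the docs of the Jacob version you downloaded. A minimal sketch follows, in which JacobDllLocator and the lib directory are hypothetical.

```java
import java.io.File;

// Hypothetical helper: points Jacob at a DLL kept outside the JRE directory.
final class JacobDllLocator {

    // Assumption: this is the property name read by com.jacob.com.LibraryLoader.
    static final String JACOB_DLL_PATH_PROPERTY = "jacob.dll.path";

    /**
     * Sets the property to the absolute path of the given DLL and returns that path.
     * Must be called before the first Jacob class is used, because the native
     * library is loaded when those classes are initialized.
     */
    static String useDllFrom(File directory, String dllFileName) {
        String fullPath = new File(directory, dllFileName).getAbsolutePath();
        System.setProperty(JACOB_DLL_PATH_PROPERTY, fullPath);
        return fullPath;
    }

    public static void main(String[] args) {
        // "lib" and the DLL name are placeholders; use the DLL matching your JVM's bitness.
        String path = useDllFrom(new File("lib"), "jacob-1.15-M3-x64.dll");
        System.out.println("Jacob will be asked to load: " + path);
    }
}
```

The same effect can be had without code by passing -Djacob.dll.path=... on the java command line, which keeps the DLL next to the project instead of inside a particular JRE.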

 

This article comes from a CSDN blog; please credit the source when reposting: http://blog.csdn.net/sunxing007/archive/2010/05/19/5609404.aspx
