word或Excel程序是以一种COM组件的形式存在的。如果能从Java中调用word的COM组件,就能够使用它的方法获取Word文档中的文本信息,目前网上也有很多提供这种操作的工具。使用jacob前应确保本机安装有Word的应用程序,否则无法建立Java-COM桥,进而无法解析。
jacob的下载地址为http://sourceforge.net/project/showfiles.php?group_id=109543&package_id=118368
下载到本机后解压缩。将jacob.jar复制到项目目录,增加到Java Build Path,然后将jacob.dll文件放入系统盘的system32文件夹下。如果没有放的话,会出现"no jacob-1.14.3-x86 in java.library.path"的异常。
1
package
test;
2![](https://www.cnblogs.com/Images/OutliningIndicators/None.gif)
3
import
com.jacob.activeX.ActiveXComponent;
4
import
com.jacob.com.Dispatch;
5
import
com.jacob.com.Variant;
6![](https://www.cnblogs.com/Images/OutliningIndicators/None.gif)
7![ExpandedBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedBlockStart.gif)
public
class
WordReader
{
8![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
9![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
public static void extractDoc(String inputFile, String outputFile)
{
10
boolean flag = false;
11
//打开word应用程序,生成一个ActivexComponent对象
12
ActiveXComponent app = new ActiveXComponent("Word.Application");
13![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
try
{
14
//设置Word不可见
15
app.setProperty("Visible", new Variant(false));
16
//打开word文件
17
Dispatch doc1 = app.getProperty("Documents").toDispatch();
18![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
Dispatch doc2 = Dispatch.invoke(doc1, "Open", Dispatch.Method, new Object[]
{ inputFile, new Variant(false),new Variant(true)},new int[1]).toDispatch();
19
//作文txt格式保存到临时文件
20![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
Dispatch.invoke(doc2, "SaveAs", Dispatch.Method, new Object[]
{outputFile, new Variant(7)}, new int[1]);
21
//关闭word
22
Variant f = new Variant(false);
23
Dispatch.call(doc2, "Close", f);
24
flag = true;
25![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
}catch(Exception e)
{
26
e.printStackTrace();
27![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
}finally
{
28![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
app.invoke("Quit", new Variant[]
{});
29
}
30![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
if(flag == true)
{
31
System.out.println("Transformed Successfully!");
32![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
}else
{
33
System.out.println("Transform Failed");
34
}
35
36
}
37
38![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
/** *//**
39
* @param args
40
*/
41![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
public static void main(String[] args)
{
42
//对测试文本进行处理
43
WordReader.extractDoc("d:/index/网点地址.doc", "d:/index/网点地址.txt");
44
}
45
}
46
![](https://www.cnblogs.com/Images/OutliningIndicators/None.gif)
2
![](https://www.cnblogs.com/Images/OutliningIndicators/None.gif)
3
![](https://www.cnblogs.com/Images/OutliningIndicators/None.gif)
4
![](https://www.cnblogs.com/Images/OutliningIndicators/None.gif)
5
![](https://www.cnblogs.com/Images/OutliningIndicators/None.gif)
6
![](https://www.cnblogs.com/Images/OutliningIndicators/None.gif)
7
![ExpandedBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedBlockStart.gif)
![ContractedBlock.gif](https://www.cnblogs.com/Images/OutliningIndicators/ContractedBlock.gif)
![](https://www.cnblogs.com/Images/dot.gif)
8
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
9
![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
![ContractedSubBlock.gif](https://www.cnblogs.com/Images/OutliningIndicators/ContractedSubBlock.gif)
![](https://www.cnblogs.com/Images/dot.gif)
10
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
11
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
12
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
13
![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
![ContractedSubBlock.gif](https://www.cnblogs.com/Images/OutliningIndicators/ContractedSubBlock.gif)
![](https://www.cnblogs.com/Images/dot.gif)
14
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
15
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
16
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
17
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
18
![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
![ContractedSubBlock.gif](https://www.cnblogs.com/Images/OutliningIndicators/ContractedSubBlock.gif)
![](https://www.cnblogs.com/Images/dot.gif)
19
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
20
![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
![ContractedSubBlock.gif](https://www.cnblogs.com/Images/OutliningIndicators/ContractedSubBlock.gif)
![](https://www.cnblogs.com/Images/dot.gif)
21
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
22
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
23
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
24
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
25
![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
![ContractedSubBlock.gif](https://www.cnblogs.com/Images/OutliningIndicators/ContractedSubBlock.gif)
![](https://www.cnblogs.com/Images/dot.gif)
26
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
27
![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
![ContractedSubBlock.gif](https://www.cnblogs.com/Images/OutliningIndicators/ContractedSubBlock.gif)
![](https://www.cnblogs.com/Images/dot.gif)
28
![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
![ContractedSubBlock.gif](https://www.cnblogs.com/Images/OutliningIndicators/ContractedSubBlock.gif)
![](https://www.cnblogs.com/Images/dot.gif)
29
![](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockEnd.gif)
30
![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
![ContractedSubBlock.gif](https://www.cnblogs.com/Images/OutliningIndicators/ContractedSubBlock.gif)
![](https://www.cnblogs.com/Images/dot.gif)
31
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
32
![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
![ContractedSubBlock.gif](https://www.cnblogs.com/Images/OutliningIndicators/ContractedSubBlock.gif)
![](https://www.cnblogs.com/Images/dot.gif)
33
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
34
![](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockEnd.gif)
35
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
36
![](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockEnd.gif)
37
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
38
![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
![ContractedSubBlock.gif](https://www.cnblogs.com/Images/OutliningIndicators/ContractedSubBlock.gif)
39
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
40
![](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockEnd.gif)
41
![ExpandedSubBlockStart.gif](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockStart.gif)
![ContractedSubBlock.gif](https://www.cnblogs.com/Images/OutliningIndicators/ContractedSubBlock.gif)
![](https://www.cnblogs.com/Images/dot.gif)
42
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
43
![](https://www.cnblogs.com/Images/OutliningIndicators/InBlock.gif)
44
![](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedSubBlockEnd.gif)
45
![](https://www.cnblogs.com/Images/OutliningIndicators/ExpandedBlockEnd.gif)
46
![](https://www.cnblogs.com/Images/OutliningIndicators/None.gif)