[Excerpt] Generating Word Documents with Java

随风_csdn

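Every so often during development I have to deal with Word, and it is always a nuisance. This time I found quite a few good examples, so I have simply collected them here: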

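1. POI is an Apache project, but even with POI you may find the work tedious. No matter: there is a simpler interface that wraps it.
Download the wrapped POI package:

The package is tm-extractors-0.4.jar

Once downloaded, put it on your classpath. Here is an example of how to use it: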
import java.io.*;
import org.textmining.text.extraction.WordExtractor;

/**
 * Title: pdf extraction
 * Description: email:chris@matrix.org.cn
 * Copyright: Matrix Copyright (c) 2003
 * Company: Matrix.org.cn
 *
 * @author chris
 * @version 1.0,who use this example pls remain the declare
 */
public class PdfExtractor {
    public PdfExtractor() {
    }

    // Despite the class name, this example extracts the text of a Word file.
    public static void main(String[] args) throws Exception {
        FileInputStream in = new FileInputStream("c://a.doc");
        try {
            WordExtractor extractor = new WordExtractor();
            String str = extractor.extractText(in);
            System.out.println("the result length is " + str.length());
            System.out.println("the result is " + str);
        } finally {
            in.close(); // always release the file handle
        }
    }
}

If you do not have this package, use the following code instead:

Reading a Word file:

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;

import org.apache.poi.poifs.filesystem.DocumentEntry;
import org.apache.poi.poifs.filesystem.DocumentInputStream;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.apache.poi.util.LittleEndian;

public class WordExtractor {
    public WordExtractor() {
    }

    public String extractText(InputStream in) throws IOException {
        ArrayList text = new ArrayList();
        POIFSFileSystem fsys = new POIFSFileSystem(in);
        DocumentEntry headerProps = (DocumentEntry) fsys.getRoot().getEntry("WordDocument");
        DocumentInputStream din = fsys.createDocumentInputStream("WordDocument");
        byte[] header = new byte[headerProps.getSize()];
        din.read(header);
        din.close();
        // Get information from the document header
        int info = LittleEndian.getShort(header, 0xa);
        boolean useTable1 = (info & 0x200) != 0;
        // Get information from the piece table
        int complexOffset = LittleEndian.getInt(header, 0x1a2);
        String tableName = null;
        if (useTable1) {
            tableName = "1Table";
        } else {
            tableName = "0Table";
        }
        DocumentEntry table = (DocumentEntry) fsys.getRoot().getEntry(tableName);
        byte[] tableStream = new byte[table.getSize()];
        din = fsys.createDocumentInputStream(tableName);
        din.read(tableStream);
        din.close();
        din = null;
        fsys = null;
        table = null;
        headerProps = null;
        int multiple = findText(tableStream, complexOffset, text);
        StringBuffer sb = new StringBuffer();
        int size = text.size();
        tableStream = null;
        // Decode each text piece from the WordDocument stream
        for (int x = 0; x < size; x++) {
            WordTextPiece nextPiece = (WordTextPiece) text.get(x);
            int start = nextPiece.getStart();
            int length = nextPiece.getLength();
            boolean unicode = nextPiece.usesUnicode();
            String toStr = null;
            if (unicode) {
                toStr = new String(header, start, length * multiple, "UTF-16LE");
            } else {
                toStr = new String(header, start, length, "ISO-8859-1");
            }
            sb.append(toStr).append(" ");
        }
        return sb.toString();
    }
    private static int findText(byte[] tableStream, int complexOffset, ArrayList text)
            throws IOException {
        // actual text
        int pos = complexOffset;
        int multiple = 2;
        // skips through the prms before we reach the piece table. These contain data
        // for actual fast saved files
        while (tableStream[pos] == 1) {
            pos++;
            int skip = LittleEndian.getShort(tableStream, pos);
            pos += 2 + skip;
        }
        if (tableStream[pos] != 2) {
            throw new IOException("corrupted Word file");
        } else {
            // parse out the text pieces
            int pieceTableSize = LittleEndian.getInt(tableStream, ++pos);
            pos += 4;
            int pieces = (pieceTableSize - 4) / 12;
            for (int x = 0; x < pieces; x++) {
                // each piece descriptor is 8 bytes; the file position starts 2 bytes in
                int filePos =
                        LittleEndian.getInt(tableStream, pos + ((pieces + 1) * 4) + (x * 8) + 2);
                boolean unicode = false;
                if ((filePos & 0x40000000) == 0) {
                    unicode = true;
                } else {
                    unicode = false;
                    multiple = 1;
                    filePos &= ~(0x40000000); // gives me FC in doc stream
                    filePos /= 2;
                }
                int totLength =
                        LittleEndian.getInt(tableStream, pos + (x + 1) * 4)
                        - LittleEndian.getInt(tableStream, pos + (x * 4));
                WordTextPiece piece = new WordTextPiece(filePos, totLength, unicode);
                text.add(piece);
            }
        }
        return multiple;
    }
    public static void main(String[] args) {
        WordExtractor w = new WordExtractor();
        try {
            File file = new File("C://test.doc");
            InputStream in = new FileInputStream(file);
            try {
                String s = w.extractText(in);
                System.out.println(s);
            } finally {
                in.close();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
class WordTextPiece {
    private int _fcStart;
    private boolean _usesUnicode;
    private int _length;

    public WordTextPiece(int start, int length, boolean unicode) {
        _usesUnicode = unicode;
        _length = length;
        _fcStart = start;
    }

    public boolean usesUnicode() {
        return _usesUnicode;
    }

    public int getStart() {
        return _fcStart;
    }

    public int getLength() {
        return _length;
    }
}
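
The piece-table parsing in findText turns on a single bit of each stored file position. As a minimal sketch using only the JDK, the decoding can be isolated as below. The names PieceFcSketch, isUnicode, byteOffset and byteLength are made up for illustration and are not part of POI. The sketch also chooses the character width per piece, which is an assumption about the intent; the listing above keeps one multiple value for the whole document.

```java
final class PieceFcSketch {
    /** Bit 30 of the stored file position marks a piece stored as 8-bit text. */
    static final int COMPRESSED_FLAG = 0x40000000;

    /** True when the piece is stored as 16-bit (UTF-16LE) characters. */
    static boolean isUnicode(int rawFc) {
        return (rawFc & COMPRESSED_FLAG) == 0;
    }

    /** Byte offset of the piece inside the WordDocument stream. */
    static int byteOffset(int rawFc) {
        if (isUnicode(rawFc)) {
            return rawFc;
        }
        // For 8-bit pieces, clear the flag and halve the value.
        return (rawFc & ~COMPRESSED_FLAG) / 2;
    }

    /** Number of bytes occupied by a piece of charCount characters. */
    static int byteLength(int rawFc, int charCount) {
        return isUnicode(rawFc) ? charCount * 2 : charCount;
    }

    public static void main(String[] args) {
        int raw = COMPRESSED_FLAG | 0x1000;          // hypothetical 8-bit piece
        System.out.println(isUnicode(raw));          // false
        System.out.println(byteOffset(raw));         // 2048
        System.out.println(byteLength(raw, 10));     // 10
    }
}
```

With helpers like these, the decoding loop could compute the byte count of each piece from its own flag instead of a shared multiplier.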

Writing a Word file:

public boolean writeWordFile(String path, String content) {
    boolean w = false;
    try {
        // byte b[] = content.getBytes("ISO-8859-1");
        byte[] b = content.getBytes();
        ByteArrayInputStream bais = new ByteArrayInputStream(b);
        POIFSFileSystem fs = new POIFSFileSystem();
        DirectoryEntry directory = fs.getRoot();
        // Store the raw text bytes in a stream named "WordDocument"
        DocumentEntry de = directory.createDocument("WordDocument", bais);
        FileOutputStream ostream = new FileOutputStream(path);
        try {
            fs.writeFilesystem(ostream);
        } finally {
            ostream.close();
        }
        bais.close();
        w = true; // report success to the caller
    } catch (IOException e) {
        e.printStackTrace();
    }
    return w;
}

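The write code still has a problem: when the file is opened, Word prompts you to choose a character encoding. Improvements are welcome!

Writing the file this way is flawed because the result is not in Word format: the stream named WordDocument contains only the raw text bytes, not the binary structure Word expects.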

Of course, these jars are required:
poi-2.5.1-final-20040804.jar
poi-contrib-2.5.1-final-20040804.jar
poi-scratchpad-2.5.1-final-20040804.jar

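If you want to use Jakarta POI HWPF directly, no precompiled download is provided, only the source code, so you have to build it yourself.

HWPF, from the Jakarta POI open-source project (in the scratchpad directory of the download), works with Word documents. Here is a simple example.
Download: http://www.apache.org/dist/jakarta/Poi/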



HWPFDocument doc = new HWPFDocument(new FileInputStream("g://a.doc"));
Range r = doc.getRange(); // the range covering the Word document
StyleSheet styleSheet = doc.getStyleSheet();
int sectionLevel = 0;
int lenParagraph = r.numParagraphs(); // number of paragraphs
int c = r.numCharacterRuns();
int b = r.numSections();
String s = r.text();
boolean inCode = false;
// Paragraph p;
for (int x = 0; x < lenParagraph; x++) {
    Paragraph p = r.getParagraph(x);
    String text = p.text();
    if (text.trim().length() == 0) {
        continue; // skip empty paragraphs
    }
}
// doc.write(new FileOutputStream("g://b.doc"));

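2. To work with Word from Java, you can try java2word

java2word is a component (class library) for driving MS Office Word documents from a Java program. It provides a set of simple interfaces through which a Java program can call its services to manipulate Word documents.
These services include:
opening documents, creating documents, finding text, replacing text,
inserting text, inserting images, inserting tables,
inserting text, images, and tables at bookmarks,
filling data into tables and reading table data.
Enhancements in version 1.1:
@ Specify text styles and table styles, so a Word document can be laid out dynamically.
@ When filling table data, specify the row and column to start from. Depending on the size of the input data, you can modify any part of a table, even the content of a single cell.
@ Merge cells.
For more features, see the detailed documentation: http://www.heavenlake.com/java2word/doc
Free download: http://dev.heavenlake.com:81/developer/listthreads?forum=8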

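3. Generating Word documents with Java

Author: javasky @ 2006-06-10 14:22:04

Over the past few days a project at my company needed to generate Word documents from Java. I searched the web and could not find a good library for this. Apache POI looked usable, but none of its release builds included the Word classes; the only option was to download the source from svn and compile it.
Then I found that Java supports documents in RTF format, and so does Word, so I used RTF to produce the Word documents.
Java's own RTF support is not very powerful, but open-source libraries can help; iText, for example, supports RTF well. The iText site has many examples; have a look if you are interested, I will not reproduce them here.
However, iText is fairly large at 1.4 MB, which I did not like. After searching SourceForge I found a smaller library. It is not very powerful, but the basic features are all there: srw (Simple RTF Writer; the current version is 0.6 and it has not been maintained for a long time).
srw ships with many examples. For instance, to write a simple RTF file, this is all we need: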
import java.io.File;
import java.io.IOException;

import com.itseasy.rtf.RTFDocument;
import com.itseasy.rtf.text.PageSize;
// Paragraph, Font and TextPart also come from srw; their imports were not shown in the original.

public class TestSimpleRtf {

    private static final String FILE_NAME = "out_testsimplertf.rtf";

    public static void main(String[] args) {
        try {
            // Create the RTF document (in landscape format)
            RTFDocument doc = new RTFDocument(PageSize.DIN_A4_QUER);
            // Define the display zoom and the view
            doc.setViewscale(RTFDocument.VIEWSCALE_FULLPAGE);    // set the display zoom to "full page"
            doc.setViewkind(RTFDocument.VIEWKIND_PAGELAYOUT);    // set the view mode to page layout

            Paragraph absatz = new Paragraph(18, 0, 16, Font.ARIAL, new TextPart("Simple RTF Writer Testdokument"));
            absatz.setAlignment(Paragraph.ALIGN_CENTER);
            doc.addParagraph(absatz);
            File savefile = new File(FILE_NAME);
            doc.save(savefile);
            System.out.println("New RTF document created: " + savefile.getAbsolutePath());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
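It is simple to use but has few features; for example, there is no table support and no way to set the print orientation. Still, it is basically enough.
Later our project required landscape printing, which was a real problem. With no alternative, I looked up Word's RTF format specification and extended the library with landscape printing. It is now done: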
import com.itseasy.rtf.RTFDocument;
import com.itseasy.rtf.text.PageSize;

public class MyRTFDocument extends RTFDocument {
    public static final int ORIENTATION_PORTRAIT = 0;
    public static final int ORIENTATION_LANDSCAPE = 1;
    private int orientation;

    /**
     *
     */
    public MyRTFDocument() {
        super();
    }
    /**
     * @param arg0
     */
    public MyRTFDocument(PageSize arg0) {
        super(arg0);
    }
    /* (non-Javadoc)
     * @see com.itseasy.rtf.RTFDocument#getDocumentAsString()
     */
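    protected String getDocumentAsString() {
        StringBuffer sb = new StringBuffer(super.getDocumentAsString());
        int pos = -1;
        if (ORIENTATION_LANDSCAPE == orientation) {
            // Insert the section-landscape control word before the paper width
            pos = sb.indexOf("\\paperw");
            if (pos > 0) {
                sb.insert(pos, "\\lndscpsxn");
            }
        }
        pos = 0;
        // After each paragraph reset, add the double-byte font control words
        while ((pos = sb.indexOf("\\pard\\plain", pos)) > 0) {
            pos = sb.indexOf("{", pos);
            if (pos < 0) {
                break; // no following group, nothing more to patch
            }
            sb.insert(pos, "\\dbch\\af2");
        }
        return sb.toString();
    }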
    /**
     * @return Returns the orientation.
     */
    public int getOrientation() {
        return orientation;
    }
    /**
     * @param orientation The orientation to set.
     */
    public void setOrientation(int orientation) {
        this.orientation = orientation;
    }

}
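
To make the RTF control words involved more concrete, here is a minimal sketch that writes a landscape A4 RTF file with the JDK alone, without srw. LandscapeRtfSketch, escape and buildDocument are hypothetical names. The page dimensions assume A4 measured in twips, and emitting the document-level \landscape next to the section-level \lndscpsxn is an assumption about what a word processor expects, not something taken from the srw source.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class LandscapeRtfSketch {

    /** Escapes RTF special characters; non-ASCII characters become numeric escapes. */
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (char ch : text.toCharArray()) {
            if (ch == '\\' || ch == '{' || ch == '}') {
                sb.append('\\').append(ch);
            } else if (ch < 128) {
                sb.append(ch);
            } else {
                // RTF wants a signed 16-bit decimal followed by a fallback character
                sb.append("\\u").append((int) (short) ch).append('?');
            }
        }
        return sb.toString();
    }

    /** Builds a one-paragraph RTF document with a centered title. */
    static String buildDocument(String title, boolean landscape) {
        // A4 in twips (1/1440 inch): 11906 x 16838
        int width = landscape ? 16838 : 11906;
        int height = landscape ? 11906 : 16838;
        StringBuilder sb = new StringBuilder();
        sb.append("{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Arial;}}");
        sb.append("\\paperw").append(width).append("\\paperh").append(height);
        if (landscape) {
            sb.append("\\landscape\\lndscpsxn");
        }
        sb.append("\\pard\\qc\\f0\\fs32 ").append(escape(title)).append("\\par}");
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Path out = Paths.get("out_landscape.rtf");
        String rtf = buildDocument("Simple RTF test document", true);
        // After escaping, the document contains only ASCII characters
        Files.write(out, rtf.getBytes(StandardCharsets.US_ASCII));
        System.out.println("Wrote " + out.toAbsolutePath());
    }
}
```

Building the whole string by hand only works for very simple documents; for anything with tables or styles a library such as srw or iText remains the more practical route.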

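4. Generating Word and HTML files with Java and saving the content to a database
Posted on 2005-12-15 17:19 by Kela
A recent project needed to save a piece of text as Word and as HTML, and also to store the Word content in a database. The result was the utility class below; I hope it helps anyone with the same requirement.
I copied it here in a hurry without trimming it, so there is some surplus code that interested readers can skip. My comments are fairly clear, so you can combine the methods to fit your own needs.
Word is driven through jacob (jacob_1.9), which is easy to find online. Put jacob.dll on the system Path (placing it directly in system32 also works) and put jacob.jar on the classpath.

The code is as follows (WordBridge.java):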

/**
* WordBridge.java
*/
package com.kela.util;

import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

import com.jacob.activeX.ActiveXComponent;
import com.jacob.com.Dispatch;
import com.jacob.com.Variant;
import com.kela.db.PoolingDataSource;

/**
 * Description: operations on Word
 *
 * @author kela.kf@gmail.com
 */
public class WordBridge {

    Log log = LogFactory.getLog(WordBridge.class);

    private ActiveXComponent MsWordApp = null;
    private Dispatch document = null;

    /**
     * Opens Word.
     * @param makeVisible true to show the Word window, false to hide it
     */
    public void openWord(boolean makeVisible) {
        if (MsWordApp == null) {
            MsWordApp = new ActiveXComponent("Word.Application");
        }

        Dispatch.put(MsWordApp, "Visible", new Variant(makeVisible));
    }

    /**
     * Creates a new document.
     */
    public void createNewDocument() {
        Dispatch documents = Dispatch.get(MsWordApp, "Documents").toDispatch();
        document = Dispatch.call(documents, "Add").toDispatch();
    }

    /**
     * Closes the document without saving changes.
     */
    public void closeDocument() {
        // 0 = wdDoNotSaveChanges
        // -1 = wdSaveChanges
        // -2 = wdPromptToSaveChanges
        if (document != null) {
            Dispatch.call(document, "Close", new Variant(0));
            document = null;
        }
    }

    /**
     * Quits Word.
     */
    public void closeWord() {
        if (MsWordApp != null) {
            Dispatch.call(MsWordApp, "Quit");
            MsWordApp = null;
        }
        document = null;
    }

    /**
     * Inserts text.
     * @param textToInsert the text to insert
     */
    public void insertText(String textToInsert) {
        Dispatch selection = Dispatch.get(MsWordApp, "Selection").toDispatch();
        Dispatch.put(selection, "Text", textToInsert);
    }

    /**
     * Saves the document.
     * @param filename path of the file to save to
     */
    public void saveFileAs(String filename) {
        Dispatch.call(document, "SaveAs", filename);
    }

    /**
     * Converts the Word document to HTML.
     * @param htmlFilePath path of the HTML file
     */
    public void wordToHtml(String htmlFilePath) {
        // 8 = wdFormatHTML
        Dispatch.invoke(document, "SaveAs", Dispatch.Method,
                new Object[] {htmlFilePath, new Variant(8)}, new int[1]);
    }

    /**
     * Saves the text as a Word document and, at the same time, as an HTML file.
     * @param text the content to save
     * @param wordFilePath path of the Word file
     * @param htmlFilePath path of the HTML file
     * @throws LTOAException
     */
    public void wordAsDbOrToHtml(String text, String wordFilePath, String htmlFilePath) throws LTOAException {

        try {
            openWord(false);
            createNewDocument();
            insertText(text);
            saveFileAs(wordFilePath);
            wordToHtml(htmlFilePath);
        } catch (Exception ex) {
            log.error("Error - a Word operation failed");
            log.error("Cause - " + ex.getMessage());
            throw new LTOAException(LTOAException.ERR_UNKNOWN, "A Word operation failed ("
                    + this.getClass().getName() + ".wordAsDbOrToHtml())", ex);
        } finally {
            closeDocument();
            closeWord();
        }

    }

    /**
     * Saves the Word file to the database.
     * @param wordFilePath path of the Word file
     * @param RecordID ID of the record to update
     * @throws LTOAException
     */
    public void wordAsDatabase(String wordFilePath, String RecordID) throws LTOAException {

        Connection conn = null;
        PreparedStatement pstmt = null;
        PoolingDataSource pool = null;

        File file = null;

        String sql = "";
        try {
            sql = " UPDATE Document_File SET FileBody = ? WHERE RecordID = ? ";

            pool = new PoolingDataSource();
            conn = pool.getConnection();

            file = new File(wordFilePath);
            InputStream is = new FileInputStream(file);
            byte[] blobByte = new byte[(int) file.length()];
            try {
                // A single read() may return early, so loop until the buffer is full
                int off = 0;
                while (off < blobByte.length) {
                    int n = is.read(blobByte, off, blobByte.length - off);
                    if (n < 0) {
                        break;
                    }
                    off += n;
                }
            } finally {
                is.close();
            }

            pstmt = conn.prepareStatement(sql);
            pstmt.setBinaryStream(1, (new ByteArrayInputStream(blobByte)), blobByte.length);
            pstmt.setString(2, RecordID);
            pstmt.executeUpdate();

        } catch (Exception ex) {
            log.error("Error - unexpected failure while updating table Document_File");
            log.error("Cause - " + ex.getMessage());
            throw new LTOAException(LTOAException.ERR_UNKNOWN,
                    "Unexpected failure while updating table Document_File ("
                    + this.getClass().getName() + ".wordAsDatabase())", ex);
        } finally {
            if (pool != null) {
                pool.closePrepStmt(pstmt);
                pool.closeConnection(conn);
            }
        }
    }

    /**
     * Returns an ID derived from the current time in milliseconds
     * (unique only as long as two calls do not fall in the same millisecond).
     * @return the ID
     */
    public String getRecordID() {
        return String.valueOf(System.currentTimeMillis());
    }

    /**
     * Returns the path under which the Word or HTML file is saved.
     * @param systemType module type: govInfo, sw, fw
     * @param fileType file type: doc, htm
     * @param recID file ID
     * @return the path
     */
    public String getWordFilePath(String systemType, String fileType, String recID) {

        String filePath = "";

        File file = new File(this.getClass().getResource("/").getPath());

        // Strip the last 15 characters of the classes directory path
        // (presumably "WEB-INF/classes") to reach the web application root
        filePath = file.getPath().substring(0, file.getPath().length() - 15);

        if (systemType.equalsIgnoreCase("govInfo")) {
            if (fileType.equalsIgnoreCase("doc"))
                filePath = filePath + "/uploadFiles/govInfo/document/" + recID + ".doc";
            else if (fileType.equalsIgnoreCase("htm"))
                filePath = filePath + "/HTML/govInfo/" + recID + ".htm";
        } else if (systemType.equalsIgnoreCase("sw")) {
            if (fileType.equalsIgnoreCase("doc"))
                filePath = filePath + "/uploadFiles/sw/document/" + recID + ".doc";
            else if (fileType.equalsIgnoreCase("htm"))
                filePath = filePath + "/HTML/sw/" + recID + ".htm";
        } else if (systemType.equalsIgnoreCase("fw")) {
            if (fileType.equalsIgnoreCase("doc"))
                filePath = filePath + "/uploadFiles/fw/document/" + recID + ".doc";
            else if (fileType.equalsIgnoreCase("htm"))
                filePath = filePath + "/HTML/fw/" + recID + ".htm";
        }

        return filePath;
    }
}
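
When a whole file has to be loaded into a byte array for a BLOB column, as wordAsDatabase does, a single read call is not guaranteed to fill the buffer, and available() is only an estimate. A minimal sketch of a read-until-end loop using only the JDK follows; ReadFullySketch and readFully are illustrative names, not part of the original utility class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReadFullySketch {

    /** Reads the stream until end-of-stream and returns everything that was read. */
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        // read() may return fewer bytes than requested; -1 signals the end
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a FileInputStream on the saved .doc file
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4});
        try {
            byte[] blob = readFully(in);
            System.out.println("read " + blob.length + " bytes");
        } finally {
            in.close();
        }
    }
}
```

The resulting array can then be handed to setBinaryStream through a ByteArrayInputStream together with its length, as in the listing.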

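5. Another example (using the jacob package):

The package contains two files: jacob.jar and jacob.dll.

jacob.jar is the library we use on the Java side. In the project:

import com.jacob.com.*;

import com.jacob.activeX.*;
You also need to point the classpath environment variable at jacob.jar.
jacob.dll is the part that talks to COM. Put it in windows/system32 and make sure its location is on the path.
Now we can use it in a project.

Here is an example:
Class ReplaceWord.java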
import com.jacob.com.*;
import com.jacob.activeX.*;

public class ReplaceWord {
    public static void main(String[] args) {

        ActiveXComponent app = new ActiveXComponent("Word.Application"); // start Word
        String inFile = "C://test.doc"; // the Word file in which to replace text
        boolean flag = false;
        try {
            app.setProperty("Visible", new Variant(false)); // keep Word invisible
            Dispatch docs = app.getProperty("Documents").toDispatch();
            // Open the Word file. The third argument says whether to open it read-only;
            // it must be false because the changes are saved back to the original file.
            Dispatch doc = Dispatch.invoke(docs, "Open", Dispatch.Method,
                    new Object[] {inFile, new Variant(false), new Variant(false)},
                    new int[1]).toDispatch();
            Dispatch content = Dispatch.get(doc, "Content").toDispatch(); // the document's content object
            Dispatch finder = Dispatch.get(content, "Find").toDispatch(); // the Find object used for find and replace
            Variant f = new Variant(false);

            boolean rt = true;
            while (rt) {
                // In Find.Execute, argument 1 is FindText and argument 10 is ReplaceWith,
                // so this call finds "New" and replaces it with "Old";
                // swap the two strings for the opposite direction.
                rt = Dispatch.invoke(finder, "Execute", Dispatch.Method,
                        new Object[] {"New", f, f, f, f, f, f, f, f, "Old", new Variant(true)},
                        new int[1]).toBoolean();
            }

            Dispatch.call(doc, "Save"); // save
            Dispatch.call(doc, "Close", f);
            flag = true;
            System.out.println("is over");
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            app.invoke("Quit", new Variant[] {});
        }
    }
}
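
If Word fails to start from Java, one of the first things worth checking is whether jacob.dll is actually visible to the JVM. The following is a rough sketch using only the JDK. It assumes the DLL is named jacob.dll as in this article and that it is looked up through the java.library.path system property; LibraryPathCheck and findOnPath are hypothetical names.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class LibraryPathCheck {

    /** Returns the directories listed in searchPath that contain fileName. */
    static List<File> findOnPath(String searchPath, String fileName) {
        List<File> hits = new ArrayList<File>();
        if (searchPath == null || searchPath.isEmpty()) {
            return hits;
        }
        // Entries are separated by ';' on Windows and ':' elsewhere
        for (String dir : searchPath.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            if (new File(dir, fileName).isFile()) {
                hits.add(new File(dir));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String path = System.getProperty("java.library.path");
        List<File> hits = findOnPath(path, "jacob.dll");
        if (hits.isEmpty()) {
            System.out.println("jacob.dll not found on java.library.path: " + path);
        } else {
            System.out.println("jacob.dll found in: " + hits);
        }
    }
}
```

The same helper can be pointed at the PATH environment variable by passing System.getenv("PATH") instead of the system property.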
