Reading doc files in Java: how do I read a Doc or Docx file in Java?

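This article shows how to read Word files in Java with the Apache POI library. The sample code reads a document's paragraphs, header, footer, and document summary. It uses POI's HWPF classes, which handle the legacy .doc format; .docx files are handled by POI's separate XWPF classes.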

I want to read a Word file in Java:

import org.apache.poi.poifs.filesystem.*;

import org.apache.poi.hpsf.DocumentSummaryInformation;

import org.apache.poi.hwpf.*;

import org.apache.poi.hwpf.extractor.*;

import org.apache.poi.hwpf.usermodel.HeaderStories;

import java.io.*;

public class ReadDocFileFromJava {

public static void main(String[] args) {

/**This is the document that you want to read using Java.**/

String fileName = "C:\\Path to file\\Test.doc";

/** Method call to read the document (demonstrates some usage of POI) **/

readMyDocument(fileName);

}

public static void readMyDocument(String fileName){

/** try-with-resources closes the input stream even if reading fails **/

try (FileInputStream in = new FileInputStream(fileName)) {

POIFSFileSystem fs = new POIFSFileSystem(in);

HWPFDocument doc = new HWPFDocument(fs);

/** Read the content **/

readParagraphs(doc);

int pageNumber=1;

/** We will try reading the header for page 1**/

readHeader(doc, pageNumber);

/** Let's try reading the footer for page 1**/

readFooter(doc, pageNumber);

/** Read the document summary**/

readDocumentSummary(doc);

} catch (Exception e) {

e.printStackTrace();

}

}

public static void readParagraphs(HWPFDocument doc) throws Exception{

WordExtractor we = new WordExtractor(doc);

/** Get the text of every paragraph **/

String[] paragraphs = we.getParagraphText();

System.out.println("Total Paragraphs: "+paragraphs.length);

for (int i = 0; i < paragraphs.length; i++) {

System.out.println("Length of paragraph "+(i +1)+": "+ paragraphs[i].length());

System.out.println(paragraphs[i]);

}

}

public static void readHeader(HWPFDocument doc, int pageNumber){

HeaderStories headerStore = new HeaderStories( doc);

String header = headerStore.getHeader(pageNumber);

System.out.println("Header Is: "+header);

}

public static void readFooter(HWPFDocument doc, int pageNumber){

HeaderStories headerStore = new HeaderStories( doc);

String footer = headerStore.getFooter(pageNumber);

System.out.println("Footer Is: "+footer);

}

public static void readDocumentSummary(HWPFDocument doc) {

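DocumentSummaryInformation summaryInfo=doc.getDocumentSummaryInformation();

/** The summary stream is optional, so it may be missing **/

if (summaryInfo == null) {

System.out.println("No document summary information found.");

return;

}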

String category = summaryInfo.getCategory();

String company = summaryInfo.getCompany();

int lineCount=summaryInfo.getLineCount();

int sectionCount=summaryInfo.getSectionCount();

int slideCount=summaryInfo.getSlideCount();

System.out.println("---------------------------");

System.out.println("Category: "+category);

System.out.println("Company: "+company);

System.out.println("Line Count: "+lineCount);

System.out.println("Section Count: "+sectionCount);

System.out.println("Slide Count: "+slideCount);

}

}

I want to read doc or docx files using Java.
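Because the code above relies on HWPFDocument, it only covers the legacy .doc format. One possible first step toward handling both formats is to look at the file's leading bytes before choosing a reader: .doc files are OLE2 compound documents, while .docx files are ZIP packages. The sketch below uses only the JDK; the class WordFormatSniffer and its method names are hypothetical helpers made up for illustration, not part of Apache POI. Note that the signature only identifies the container, so an .xls file or an arbitrary ZIP archive would match as well.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

/** Hypothetical helper (not part of Apache POI): guesses the container format from the first bytes. */
class WordFormatSniffer {

    /** OLE2 suggests a legacy .doc, ZIP suggests a .docx; neither signature proves it is a Word file. */
    enum Format { OLE2, ZIP, UNKNOWN }

    // Signature of an OLE2 compound document (used by .doc, but also .xls and .ppt).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header signature "PK\3\4" (used by .docx, but also any other ZIP).
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    /** Classifies a buffer holding the first bytes of a file. */
    static Format detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Format.OLE2;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Format.ZIP;
        }
        return Format.UNKNOWN;
    }

    /** Reads up to 8 bytes from the named file and classifies them. */
    static Format detectFile(String fileName) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        int filled = 0;
        try (InputStream in = new FileInputStream(fileName)) {
            int n;
            // A single read() may return fewer bytes than requested, so loop until full or EOF.
            while (filled < header.length
                    && (n = in.read(header, filled, header.length - filled)) > 0) {
                filled += n;
            }
        }
        return detect(Arrays.copyOf(header, filled));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java WordFormatSniffer <file>");
            return;
        }
        System.out.println(detectFile(args[0]));
    }
}
```

With a result of OLE2, the file could be passed to the HWPFDocument code above; with ZIP, it would go to POI's XWPF classes (XWPFDocument) instead. Checking the content rather than the file extension also copes with files that were renamed.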
