Converting Word to HTML in Java (code)

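This post shows how to convert Word documents (docx and doc) to HTML in Java. A docx file is loaded into an XWPFDocument and converted with XHTMLConverter; a doc file is loaded into an HWPFDocument and converted with WordToHtmlConverter. In both cases the result is saved as an HTML file. The required dependencies include fr.opensagres.xdocreport and Apache POI.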

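Code:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.apache.poi.xwpf.converter.core.FileImageExtractor;
import org.apache.poi.xwpf.converter.core.FileURIResolver;
import org.apache.poi.xwpf.converter.xhtml.XHTMLConverter;
import org.apache.poi.xwpf.converter.xhtml.XHTMLOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.w3c.dom.Document;

public class CaseHtm {

    public static void main(String[] args) throws Exception {
        String filePath = "C:/Users/Administrator/Desktop/92个诊疗方案及临床路径/";
        File[] files = new File(filePath).listFiles();
        if (files == null) {
            throw new IllegalArgumentException("Not a readable directory: " + filePath);
        }
        for (File file2 : files) {
            String fileName = file2.getName();
            String lower = fileName.toLowerCase();
            // Skip sub-directories and anything that is not a Word document
            // (for example the .htm files and images written by earlier runs)
            if (!file2.isFile() || !(lower.endsWith(".docx") || lower.endsWith(".doc"))) {
                continue;
            }
            System.out.println(fileName);
            String name = fileName.substring(0, fileName.lastIndexOf('.'));
            if (lower.endsWith(".docx")) {
                CaseHtm.docx(filePath, fileName, name + ".htm");
            } else {
                CaseHtm.dox(filePath, fileName, name + ".htm");
            }
        }
    }

    /**
     * Converts a docx file to HTML.
     *
     * @param filePath directory containing the source file, with a trailing separator
     * @param fileName name of the docx file
     * @param htmlName name of the HTML file to write
     * @throws Exception if reading, converting, or writing fails
     */
    public static void docx(String filePath, String fileName, String htmlName) throws Exception {
        // 2) Configure the XHTML options (the URIResolver and extractor set the directory where images are stored)
        File imageFolderFile = new File(filePath);
        XHTMLOptions options = XHTMLOptions.create().URIResolver(new FileURIResolver(imageFolderFile));
        options.setExtractor(new FileImageExtractor(imageFolderFile));
        options.setIgnoreStylesIfUnused(false);
        options.setFragment(true);
        // try-with-resources closes both streams even if the conversion fails
        try (InputStream in = new FileInputStream(new File(filePath + fileName));
             OutputStream out = new FileOutputStream(new File(filePath + htmlName))) {
            // 1) Load the Word document into an XWPFDocument
            XWPFDocument document = new XWPFDocument(in);
            // 3) Convert the XWPFDocument to XHTML
            XHTMLConverter.getInstance().convert(document, out, options);
        }
    }

    /**
     * Converts a doc file to HTML.
     *
     * @param filePath directory containing the source file, with a trailing separator
     * @param fileName name of the doc file
     * @param htmlName name of the HTML file to write
     * @throws Exception if reading, converting, or writing fails
     */
    public static void dox(String filePath, String fileName, String htmlName) throws Exception {
        try (InputStream input = new FileInputStream(new File(filePath + fileName));
             OutputStream outStream = new FileOutputStream(new File(filePath + htmlName))) {
            HWPFDocument wordDocument = new HWPFDocument(input);
            WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(
                    DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
            // Parse the Word document into a DOM tree
            wordToHtmlConverter.processDocument(wordDocument);
            Document htmlDocument = wordToHtmlConverter.getDocument();
            // Serialize the DOM tree as indented UTF-8 HTML
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
            serializer.setOutputProperty(OutputKeys.INDENT, "yes");
            serializer.setOutputProperty(OutputKeys.METHOD, "html");
            serializer.transform(new DOMSource(htmlDocument), new StreamResult(outStream));
        }
    }
}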
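The last step of the doc conversion (writing the DOM tree out as HTML) uses only the JDK's javax.xml.transform API, so it can be tried without any Word file or POI on the classpath. The following is a minimal sketch under that assumption: the class name DomToHtmlSketch and the helpers toHtml and sampleDocument are hypothetical names introduced for illustration, and the hand-built DOM tree merely stands in for what WordToHtmlConverter.getDocument() would return.

```java
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; not part of the original post.
class DomToHtmlSketch {

    /** Serializes a DOM tree to a String using the same output properties as dox(). */
    static String toHtml(Document doc) throws Exception {
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
        serializer.setOutputProperty(OutputKeys.INDENT, "yes");
        serializer.setOutputProperty(OutputKeys.METHOD, "html");
        StringWriter writer = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    /** Builds a tiny hand-made document standing in for the converter's output. */
    static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element html = doc.createElement("html");
        Element body = doc.createElement("body");
        Element p = doc.createElement("p");
        p.setTextContent("Hello from a DOM tree");
        body.appendChild(p);
        html.appendChild(body);
        doc.appendChild(html);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        // Print the serialized HTML so the effect of the output properties is visible
        System.out.println(toHtml(sampleDocument()));
    }
}
```

Because the output method is "html", the serializer writes no XML declaration at the top of the result. Swapping the StringWriter for a FileOutputStream gives the file-writing behavior used in dox().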

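pom.xml configuration:

<dependency>
    <groupId>fr.opensagres.xdocreport</groupId>
    <artifactId>fr.opensagres.xdocreport.document</artifactId>
    <version>1.0.5</version>
</dependency>
<dependency>
    <groupId>fr.opensagres.xdocreport</groupId>
    <artifactId>org.apache.poi.xwpf.converter.xhtml</artifactId>
    <version>1.0.5</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.12</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <version>3.12</version>
</dependency>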
