Detecting file type with Java POI: how to tell whether a file is doc or docx in POI

The title may be a little confusing. The simplest approach is to check the file extension, like this:

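    // `is` is the InputStream of the file, `filePath` is its path,
    // and `text` receives the extracted text
    if (filePath.endsWith(".doc")) {
        // binary Word format (OLE2), handled by HWPF
        WordExtractor ex = new WordExtractor(is);
        text = ex.getText();
        ex.close();
    } else if (filePath.endsWith(".docx")) {
        // OOXML Word format (ZIP of XML parts), handled by XWPF
        XWPFDocument doc = new XWPFDocument(is);
        XWPFWordExtractor extractor = new XWPFWordExtractor(doc);
        text = extractor.getText();
        extractor.close();
    }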

This works in most cases. But I have found that certain files with a doc extension are essentially docx files: if you open one with WinRAR, you will find XML files inside. As is well known, a docx file is a ZIP archive consisting of XML files.

I believe this problem cannot be rare, but I have not found any information about this.
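One way to handle the mismatch described above (a doc extension on a file that is really a ZIP archive) is to ignore the extension and look at the first bytes of the stream instead. The following is a minimal sketch using only the JDK, not POI's own mechanism: the class name `WordFormatSniffer`, the enum `Kind`, and the method `sniff` are hypothetical names introduced for illustration. It relies on two well-known signatures: OLE2 compound documents (binary doc) start with `D0 CF 11 E0 A1 B1 1A E1`, and ZIP archives (docx) start with `PK\003\004`.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not part of POI): guesses the container format
// of a stream from its leading bytes, then rewinds the stream.
final class WordFormatSniffer {

    /** Container format guessed from the leading bytes. */
    enum Kind { OLE2, ZIP, UNKNOWN }

    // OLE2 compound document signature (binary .doc, but also .xls, .ppt)
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header signature "PK\003\004" (.docx and any other ZIP)
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    /**
     * Peeks at up to 8 bytes and resets the stream to where it started.
     * The stream must support mark/reset (e.g. a BufferedInputStream).
     */
    static Kind sniff(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException(
                "stream must support mark/reset; wrap it in a BufferedInputStream");
        }
        byte[] head = new byte[OLE2_MAGIC.length];
        in.mark(head.length);
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) {
                break; // stream ended before 8 bytes were read
            }
            n += r;
        }
        in.reset();

        if (startsWith(head, n, OLE2_MAGIC)) {
            return Kind.OLE2;
        }
        if (startsWith(head, n, ZIP_MAGIC)) {
            return Kind.ZIP;
        }
        return Kind.UNKNOWN;
    }

    // True if the first `len` valid bytes of `data` begin with `prefix`.
    private static boolean startsWith(byte[] data, int len, byte[] prefix) {
        if (len < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With this in place, the branch in the snippet above could switch on `WordFormatSniffer.sniff(is)` instead of `filePath.endsWith(...)`: `Kind.OLE2` goes to `WordExtractor` and `Kind.ZIP` goes to `XWPFDocument`. Note that the signature only identifies the container, so a ZIP result may also be an xlsx file or an ordinary archive, and an OLE2 result may be an xls file. Recent POI versions also appear to ship a comparable check in `org.apache.poi.poifs.filesystem.FileMagic`; if your version has it, that is probably preferable to a hand-rolled check.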
