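The docx4j version used here is 2.8.1, which is not the latest release. In my tests the latest version still produced garbled Chinese text, while 2.8.1 has shown no problems so far. I tested Word documents with the .docx extension only, not the older .doc format; if you need to convert a .doc file, manually save it as .docx first.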
JDK version: 1.8
Maven dependency configuration:
<dependency>
    <groupId>org.docx4j</groupId>
    <artifactId>docx4j</artifactId>
    <version>2.8.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.xmlgraphics/batik-bridge -->
<dependency>
    <groupId>org.apache.xmlgraphics</groupId>
    <artifactId>batik-bridge</artifactId>
    <version>1.7</version>
</dependency>
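I am not sure what batik-bridge is actually needed for (it is part of Apache Batik, the SVG toolkit). Without it, startup reports that a class cannot be found, although the conversion itself did not seem to be affected.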
The code:
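// Imports for the snippet below, following the docx4j 2.8.1 package layout
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.util.HashMap;
import org.apache.log4j.Level;
import org.docx4j.Docx4jProperties;
import org.docx4j.XmlUtils;
import org.docx4j.convert.out.pdf.viaXSLFO.PdfSettings;
import org.docx4j.fonts.IdentityPlusMapper;
import org.docx4j.fonts.Mapper;
import org.docx4j.fonts.PhysicalFonts;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.openpackaging.parts.WordprocessingML.MainDocumentPart;
import org.docx4j.utils.Log4jConfigurator;

File file = new File("C:\\Users\\luzh\\Desktop\\运输合同.docx");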
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(file);
MainDocumentPart documentPart = wordMLPackage.getMainDocumentPart();
// Replace the template variables
String xml = XmlUtils.marshaltoString(documentPart.getJaxbElement(), true);
HashMap<String, String> mappings = new HashMap<String, String>();
mappings.put("1", "green");
mappings.put("2", "chocolate");
mappings.put("3", "美国");
mappings.put("4", "中国");
Object obj = XmlUtils.unmarshallFromTemplate(xml, mappings);
documentPart.setJaxbElement((org.docx4j.wml.Document) obj);
Mapper fontMapper = new IdentityPlusMapper();
// Fix garbled Chinese text by mapping the document's font names to physical fonts
fontMapper.getFontMappings().put("隶书", PhysicalFonts.getPhysicalFonts().get("LiSu"));
fontMapper.getFontMappings().put("宋体", PhysicalFonts.getPhysicalFonts().get("SimSun"));
fontMapper.getFontMappings().put("微软雅黑", PhysicalFonts.getPhysicalFonts().get("Microsoft Yahei"));
fontMapper.getFontMappings().put("黑体", PhysicalFonts.getPhysicalFonts().get("SimHei"));
fontMapper.getFontMappings().put("楷体", PhysicalFonts.getPhysicalFonts().get("KaiTi"));
fontMapper.getFontMappings().put("新宋体", PhysicalFonts.getPhysicalFonts().get("NSimSun"));
fontMapper.getFontMappings().put("华文行楷", PhysicalFonts.getPhysicalFonts().get("STXingkai"));
fontMapper.getFontMappings().put("华文仿宋", PhysicalFonts.getPhysicalFonts().get("STFangsong"));
fontMapper.getFontMappings().put("宋体扩展", PhysicalFonts.getPhysicalFonts().get("simsun-extB"));
fontMapper.getFontMappings().put("仿宋", PhysicalFonts.getPhysicalFonts().get("FangSong"));
fontMapper.getFontMappings().put("仿宋_GB2312", PhysicalFonts.getPhysicalFonts().get("FangSong_GB2312"));
fontMapper.getFontMappings().put("幼圆", PhysicalFonts.getPhysicalFonts().get("YouYuan"));
fontMapper.getFontMappings().put("华文宋体", PhysicalFonts.getPhysicalFonts().get("STSong"));
fontMapper.getFontMappings().put("华文中宋", PhysicalFonts.getPhysicalFonts().get("STZhongsong"));
wordMLPackage.setFontMapper(fontMapper);
OutputStream os = new FileOutputStream(new File("e:/000.pdf"));
org.docx4j.convert.out.pdf.PdfConversion c = new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);
// Remove the extra debug information printed on the PDF pages
Docx4jProperties.getProperties().setProperty("docx4j.Log4j.Configurator.disabled", "true");
Log4jConfigurator.configure();
org.docx4j.convert.out.pdf.viaXSLFO.Conversion.log.setLevel(Level.OFF);
c.output(os, new PdfSettings());
os.close(); // release the file handle once the PDF has been written
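One caveat with the font mappings above: PhysicalFonts.getPhysicalFonts().get(...) is a plain map lookup, so it returns null when no font is registered under that name, for example on a Linux server without the Windows fonts installed. Below is a minimal sketch of a guard that skips missing fonts and reports them instead of storing null. It is written against plain Maps so that it runs without docx4j; FontMappingGuard and mapIfInstalled are hypothetical names for illustration, not part of the docx4j API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FontMappingGuard {

    /**
     * Looks up physicalName in the installed fonts and, only if it is found,
     * stores it in target under documentName. Returns true if a mapping was added.
     */
    static <F> boolean mapIfInstalled(Map<String, F> target, String documentName,
                                      Map<String, F> installed, String physicalName) {
        F font = installed.get(physicalName);
        if (font == null) {
            return false; // font not installed: skip rather than store null
        }
        target.put(documentName, font);
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for PhysicalFonts.getPhysicalFonts(); the values are plain strings here.
        Map<String, String> installed = new HashMap<>();
        installed.put("SimSun", "simsun.ttc");

        Map<String, String> mappings = new HashMap<>();
        List<String> missing = new ArrayList<>();

        String[][] wanted = { {"宋体", "SimSun"}, {"黑体", "SimHei"} };
        for (String[] pair : wanted) {
            if (!mapIfInstalled(mappings, pair[0], installed, pair[1])) {
                missing.add(pair[1]);
            }
        }
        System.out.println("mapped=" + mappings.keySet() + " missing=" + missing);
    }
}
```

With docx4j, the same check would presumably wrap each fontMapper.getFontMappings().put(...) call, and the list of missing names gives a quick hint about which fonts to install on the machine doing the conversion.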
The result:
Word document template
Converted PDF document:
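A note on the template: for the variable replacement in the code to take effect, the Word template is expected to contain placeholders of the form ${1}, ${2} and so on, matching the keys put into mappings. XmlUtils.unmarshallFromTemplate substitutes them in the marshalled XML before unmarshalling it again. The sketch below illustrates that kind of substitution on a plain string. It is a simplified stand-in, not docx4j's actual implementation; PlaceholderDemo and fill are hypothetical names, and leaving unknown keys untouched is a choice made for this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderDemo {

    // Matches ${key}, capturing the key without the surrounding braces
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces each ${key} with its mapped value; unknown keys are left untouched. */
    static String fill(String template, Map<String, String> mappings) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer(); // StringBuffer keeps this compatible with JDK 1.8
        while (m.find()) {
            String value = mappings.get(m.group(1));
            String replacement = (value != null) ? value : m.group(0);
            // quoteReplacement stops '$' and '\' in the value from being read as group references
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> mappings = new HashMap<>();
        mappings.put("1", "green");
        mappings.put("2", "chocolate");

        String xml = "<w:t>${1} and ${2}, but ${9} stays</w:t>";
        System.out.println(fill(xml, mappings));
    }
}
```

One thing to watch for: Word can split a placeholder across several runs while you type or edit it, in which case the ${...} text is no longer contiguous in the XML and will not be replaced. Typing the placeholder in one go, or pasting it as plain text, usually avoids this.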