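Exporting LaTeX Formula Expressions to a Word Document

To export a LaTeX formula, first convert the LaTeX expression to MathML (Mathematical Markup Language), then convert the MathML to OMML (Office Math Markup Language, the Word formula format), and finally write the result with Apache POI.

The steps are as follows:

1. Import dependencies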
<!-- https://mvnrepository.com/artifact/de.rototor.snuggletex/snuggletex-core -->
<dependency>
    <groupId>de.rototor.snuggletex</groupId>
    <artifactId>snuggletex-core</artifactId>
    <version>1.3.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.poi/poi -->
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>4.1.2</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.poi/ooxml-schemas -->
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>ooxml-schemas</artifactId>
    <version>1.4</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.poi/poi-ooxml -->
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>4.1.2</version>
</dependency>
<dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>2.11.0</version>
</dependency>
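2. Convert the LaTeX formula to MathML (Mathematical Markup Language)
import org.apache.poi.xwpf.usermodel.ParagraphAlignment;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.openxmlformats.schemas.officeDocument.x2006.math.CTOMath;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;
import uk.ac.ed.ph.snuggletex.SnuggleEngine;
import uk.ac.ed.ph.snuggletex.SnuggleInput;
import uk.ac.ed.ph.snuggletex.SnuggleSession;

public void addLatex(String latex, XWPFParagraph paragraph) throws Exception {
    paragraph.setAlignment(ParagraphAlignment.LEFT);
    paragraph.setFontAlignment(ParagraphAlignment.LEFT.getValue());
    // Parse the LaTeX input with SnuggleTeX and serialize it as MathML
    SnuggleEngine engine = new SnuggleEngine();
    SnuggleSession session = engine.createSession();
    SnuggleInput input = new SnuggleInput(latex);
    session.parseInput(input);
    String mathML = session.buildXMLString();
    // Convert the MathML to OMML (step 3) and copy it into a new oMath element of the paragraph
    CTOMath omml = getOMML(mathML);
    CTP ctp = paragraph.getCTP();
    CTOMath paragraphMath = ctp.addNewOMath();
    paragraphMath.set(omml);
}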
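
Before the MathML string is handed to the XSLT step, it can help to confirm that it is a single well-formed `math` element. The following is only a sketch using the JDK's XML parser; the class and method names (`MathMlCheckSketch`, `isSingleMathElement`) are hypothetical and not part of the original code, and the check assumes the MathML is in the standard MathML namespace.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class MathMlCheckSketch {
    // The standard MathML namespace URI
    static final String MATHML_NS = "http://www.w3.org/1998/Math/MathML";

    // Returns true only if xml parses as one document whose root is <math> in the MathML namespace.
    static boolean isSingleMathElement(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return MATHML_NS.equals(root.getNamespaceURI())
                    && "math".equals(root.getLocalName());
        } catch (Exception e) {
            // Malformed XML, several top-level elements, or a parser problem
            return false;
        }
    }
}
```

A caller could skip or log the formula when the check returns false instead of letting the transformation fail later.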
3. Convert MathML to OMML (the Word formula format)

Locate the MML2OMML.XSL file in the Office installation directory on Windows.

Copy the file into the project's resources directory.

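import java.io.InputStream;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.apache.xmlbeans.XmlCursor;
import org.openxmlformats.schemas.officeDocument.x2006.math.CTOMath;
import org.openxmlformats.schemas.officeDocument.x2006.math.CTOMathPara;
import org.openxmlformats.schemas.officeDocument.x2006.math.CTR;

private CTOMath getOMML(String mathML) throws Exception {
    String ooML;
    // Load the stylesheet from the classpath (the project's resources directory)
    try (InputStream in = this.getClass().getClassLoader().getResourceAsStream("MML2OMML.XSL")) {
        if (in == null) {
            throw new IllegalStateException("MML2OMML.XSL not found on the classpath");
        }
        TransformerFactory tFactory = TransformerFactory.newInstance();
        Transformer transformer = tFactory.newTransformer(new StreamSource(in));
        // Apply the XSLT transformation: MathML in, OMML out
        StringWriter stringWriter = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(mathML)), new StreamResult(stringWriter));
        ooML = stringWriter.toString();
    }
    CTOMathPara ctOMathPara = CTOMathPara.Factory.parse(ooML);
    CTOMath ctOMath = ctOMathPara.getOMathArray(0);
    // To make this work with Word 2007 as well, special font settings are necessary
    XmlCursor xmlcursor = ctOMath.newCursor();
    while (xmlcursor.hasNextToken()) {
        XmlCursor.TokenType tokentype = xmlcursor.toNextToken();
        if (tokentype.isStart()) {
            if (xmlcursor.getObject() instanceof CTR) {
                CTR cTR = (CTR) xmlcursor.getObject();
                cTR.addNewRPr2().addNewRFonts().setAscii("Cambria Math");
                cTR.getRPr2().getRFonts().setHAnsi("Cambria Math"); // up to apache poi 4.1.2
                //cTR.getRPr2().getRFontsArray(0).setHAnsi("Cambria Math"); // since apache poi 5.0.0
            }
        }
    }
    xmlcursor.dispose(); // release the cursor (XmlBeans 3.x, as used by POI 4.1.2)
    return ctOMath;
}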
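
The transformation step itself uses only the JDK's `javax.xml.transform` API, so it can be tried in isolation. The sketch below is an illustration under stated assumptions: `TOY_XSL` is a hypothetical toy stylesheet standing in for MML2OMML.XSL (which is not reproduced here), the input is namespace-free XML rather than real MathML, and the class name `XsltSketch` is made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltSketch {
    // Hypothetical toy stylesheet: copies the text of each <mi> child into a <var> element.
    static final String TOY_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/math'><out><xsl:apply-templates select='mi'/></out></xsl:template>"
      + "<xsl:template match='mi'><var><xsl:value-of select='.'/></var></xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xsl, String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(TOY_XSL, "<math><mi>S</mi><mi>R</mi></math>"));
    }
}
```

With the real MML2OMML.XSL the call pattern is the same; only the stylesheet source and the namespaced MathML input differ.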
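4. Handle unrecognized symbols
Some symbols are not recognized during conversion, for example the circled numbers ①②③④⑤, which are written in LaTeX as \textcircled. They are therefore handled separately: the expression is filtered before conversion and each \textcircled command is replaced with the corresponding Unicode character.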
public class LatexUtils {
    public static String latexFilter(String latex){
        if(!latex.contains("textcircled")){
            return latex;
        }
        return TextCircledEnum.replaceTextCircled(latex);
    }

    private enum TextCircledEnum{
        Zero("\\\\textcircled\\{0\\}","⓪"),
        One("\\\\textcircled\\{1\\}","①"),
        Two("\\\\textcircled\\{2\\}","②"),
        Three("\\\\textcircled\\{3\\}","③"),
        Four("\\\\textcircled\\{4\\}","④"),
        Five("\\\\textcircled\\{5\\}","⑤"),
        Six("\\\\textcircled\\{6\\}","⑥"),
        Seven("\\\\textcircled\\{7\\}","⑦"),
        Eight("\\\\textcircled\\{8\\}","⑧"),
        Nine("\\\\textcircled\\{9\\}","⑨"),
        Ten("\\\\textcircled\\{10\\}","⑩");

        TextCircledEnum(String code, String v) {
            this.code = code;
            this.v = v;
        }

        public final String code;
        public final String v;

        public static String replaceTextCircled(String latex){
            for (TextCircledEnum c : TextCircledEnum.values()) {
                latex = latex.replaceAll(c.code,c.v);
            }
            return latex;
        }
    }
}
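
As an alternative to listing every number in an enum, the replacement can be computed from the Unicode code points: ① to ⑳ occupy the contiguous range U+2460 to U+2473, and ⓪ is U+24EA. The following is a minimal sketch under that assumption; the class name `TextCircledSketch` is hypothetical, and numbers above 20 are deliberately left unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TextCircledSketch {
    // Matches \textcircled{N} where N has one or two digits
    private static final Pattern CIRCLED =
            Pattern.compile("\\\\textcircled\\{(\\d{1,2})\\}");

    static String replaceTextCircled(String latex) {
        Matcher m = CIRCLED.matcher(latex);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            int n = Integer.parseInt(m.group(1));
            String replacement;
            if (n == 0) {
                replacement = "\u24EA";                                 // circled digit zero
            } else if (n <= 20) {
                replacement = String.valueOf((char) (0x2460 + n - 1)); // circled 1 .. 20
            } else {
                replacement = m.group();                                // no single character: keep as-is
            }
            // quoteReplacement keeps backslashes in the untouched case literal
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```

This variant could be called from `latexFilter` in place of `TextCircledEnum.replaceTextCircled`; whether Word renders the characters above ⑩ depends on the font in use.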
5. Usage

The LaTeX expression must be wrapped in $ delimiters, for example: $S=4\pi R^{2}$

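XWPFDocument document = new XWPFDocument();
XWPFParagraph paragraph = document.createParagraph();
paragraph.setAlignment(ParagraphAlignment.LEFT);
// In a Java string literal the backslash must be escaped: \\pi
String latex = "$S=4\\pi R^{2}$";
// Reuse the paragraph created above instead of creating a second, empty one
addLatex(LatexUtils.latexFilter(latex), paragraph);
// Write the document to disk with java.io.FileOutputStream ("formula.docx" is only an example name)
try (FileOutputStream out = new FileOutputStream("formula.docx")) {
    document.write(out);
}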
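
Because the expression has to arrive wrapped in $ delimiters, a small guard can add them when the caller passes a bare formula. This is only a sketch; the class and method names (`LatexDelimiterSketch`, `ensureInlineMath`) are hypothetical and not part of the original code.

```java
final class LatexDelimiterSketch {
    // Returns the expression wrapped in $...$ unless it is already delimited that way.
    static String ensureInlineMath(String latex) {
        String s = latex.trim();
        if (s.length() >= 2 && s.startsWith("$") && s.endsWith("$")) {
            return s; // already wrapped
        }
        return "$" + s + "$";
    }

    public static void main(String[] args) {
        // Note the doubled backslash required inside a Java string literal
        System.out.println(ensureInlineMath("S=4\\pi R^{2}"));
    }
}
```

The guard would be applied before `LatexUtils.latexFilter`, so that both bare and already-delimited input reach `addLatex` in the same form.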
