Double Eleven: treating myself to yet another paper, 2022-11-11


Pre-training tasks:

MathBERT is jointly trained on formulas and their context.
Two pre-training tasks are employed to learn formula representations: Masked Language Modeling (MLM) and Context Correspondence Prediction (CCP). Furthermore, mathematical formulas contain rich structural information, which is important for semantic understanding and formula retrieval. Thus, we take Operator Trees (OPTs) as the input and design a novel pre-training task named Masked Substructure Prediction (MSP) to capture semantic-level structural information of formulas.
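A minimal sketch of the BERT-style random token masking behind the MLM objective, applied to a formula-plus-context token sequence. The tokenization, mask rate, and function names below are illustrative assumptions, not the paper's implementation:

```python
import random

MASK_TOKEN = "[MASK]"

def mask_tokens(tokens, mask_rate=0.15, seed=None):
    """Randomly replace a fraction of tokens with [MASK], BERT-style.

    Returns the corrupted sequence and the (position, original token) pairs
    that the model would be trained to recover.
    """
    rng = random.Random(seed)
    corrupted = list(tokens)
    targets = []
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            targets.append((i, tok))
            corrupted[i] = MASK_TOKEN
    return corrupted, targets

# Toy formula + context sequence (whitespace tokenization, for illustration only).
sequence = "the quadratic formula x = ( - b + sqrt ( b ^ 2 - 4 a c ) ) / ( 2 a )".split()
corrupted, targets = mask_tokens(sequence, seed=0)
print(corrupted)
print(targets)
```

CCP and MSP follow the same corrupt-and-predict pattern, but over the formula-context pairing and over OPT substructures rather than over individual tokens.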

Downstream tasks:

mathematical information retrieval, formula topic classification, and formula headline generation
What is difficult?
Processing mathematical information is still a challenging task due to the diversity of mathematical formula representations, the complexity of formula structure and the ambiguity of implicit semantics.
What is wrong with previous works?
Customized models are built upon either the structural features of formulas or the topical correspondence between formula and context,
and do not consider jointly training structural and semantic information.
Contributions:
Why OPT rather than SLT?
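For intuition on the OPT input: an Operator Tree puts operators at internal nodes and operands at leaves, so operator semantics are explicit in the tree structure, whereas a Symbol Layout Tree (SLT) encodes the spatial layout of symbols. Below is a toy sketch of an OPT for b^2 - 4ac and of masking one of its substructures, in the spirit of MSP; all class and function names are illustrative assumptions, not the paper's code.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class OptNode:
    """A node of an Operator Tree: operators are internal nodes, operands are leaves."""
    label: str
    children: List["OptNode"] = field(default_factory=list)

def mask_substructure(root: OptNode, target_label: str) -> OptNode:
    """Return a copy of the tree with the subtree rooted at `target_label` replaced by [MASK]."""
    if root.label == target_label:
        return OptNode("[MASK]")
    return OptNode(root.label, [mask_substructure(c, target_label) for c in root.children])

# OPT for b^2 - 4*a*c:  '-' ( '^'(b, 2), '*'(4, a, c) )
opt = OptNode("-", [
    OptNode("^", [OptNode("b"), OptNode("2")]),
    OptNode("*", [OptNode("4"), OptNode("a"), OptNode("c")]),
])

# In an MSP-style task the model would be asked to predict the masked subtree.
masked = mask_substructure(opt, "^")
```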
Model Architecture