Today's arXiv Picks | 4 New EMNLP 2021 Papers

About #Today's arXiv Picks

This is a column from "AI Academic Frontier". Each day, the editors select high-quality papers from arXiv and deliver them to readers.

Learning from Multiple Noisy Augmented Data Sets for Better Cross-Lingual Spoken Language Understanding

Comment: Long paper at EMNLP 2021

Link: http://arxiv.org/abs/2109.01583

Abstract

Lack of training data presents a grand challenge to scaling out spoken language understanding (SLU) to low-resource languages. Although various data augmentation approaches have been proposed to synthesize training data in low-resource target languages, the augmented data sets are often noisy and thus impede the performance of SLU models. In this paper we focus on mitigating noise in augmented data. We develop a denoising training approach: multiple models are trained with data produced by various augmentation methods, and these models provide supervision signals to each other. The experimental results show that our method outperforms the existing state of the art by 3.05 and 4.24 percentage points on two benchmark datasets, respectively. The code will be open-sourced on GitHub.
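The abstract only outlines the approach. As a rough illustration of how "models provide supervision signals to each other" might look, the sketch below mixes each model's noisy hard labels with the averaged predictions of its peer models; the function name, mixing weight, and mixing scheme are illustrative assumptions, not the authors' actual method:

```python
import numpy as np

def denoise_targets(hard_labels, peer_probs, alpha=0.5):
    """Mix noisy hard labels with the average prediction of peer models.

    hard_labels: (N,) int array of (possibly noisy) augmented-data labels
    peer_probs:  list of (N, C) probability arrays from the other models
    alpha:       weight kept on the original hard labels (assumed value)
    Returns an (N, C) array of soft training targets.
    """
    n, c = peer_probs[0].shape
    one_hot = np.eye(c)[hard_labels]        # hard targets as one-hot rows
    peer_avg = np.mean(peer_probs, axis=0)  # consensus of the peer models
    return alpha * one_hot + (1 - alpha) * peer_avg

# Usage: one model's noisy labels, softened by two peers' predictions.
labels = np.array([0, 1])
p1 = np.array([[0.8, 0.2], [0.3, 0.7]])
p2 = np.array([[0.6, 0.4], [0.4, 0.6]])
soft_targets = denoise_targets(labels, [p1, p2])
```

Each model would then be trained against `soft_targets` instead of the raw augmented labels, so that a label contradicted by the peer consensus is down-weighted.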

Contrastive Representation Learning for Exemplar-Guided Paraphrase Generation

Comment: Findings of EMNLP 2021

Link: http://arxiv.org/abs/2109.01484

Abstract

Exemplar-Guided Paraphrase Generation (EGPG) aims to generate a target sentence which conforms to the style of a given exemplar while encapsulating the content information of the source sentence. In this paper, we propose a new method with the goal of learning a better representation of the style and the content. This method is mainly motivated by the recent success of contrastive learning, which has demonstrated its power in unsupervised feature extraction tasks. The idea is to design two contrastive losses with respect to the content and the style by considering two problem characteristics during training: the target sentence shares the same content with the source sentence, and the target sentence shares the same style with the exemplar. These two contrastive losses are incorporated into the general encoder-decoder paradigm. Experiments on two datasets, namely QQP-Pos and ParaNMT, demonstrate the effectiveness of our proposed contrastive losses.
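The paper's exact loss formulation is not given in the abstract, but both contrastive losses can plausibly be instantiated as an InfoNCE-style objective over batch embeddings: one loss pairs the target with its source (shared content), the other pairs the target with its exemplar (shared style). The sketch below is a generic InfoNCE implementation under that assumption, not the authors' code:

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss: each anchor's positive is the same-index row of
    `positives`; all other rows in the batch serve as negatives.

    anchors, positives: (B, D) L2-normalized embedding matrices.
    """
    sims = anchors @ positives.T / temperature       # (B, B) similarities
    sims -= sims.max(axis=1, keepdims=True)          # numerical stability
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # -log p(correct pair)

# Assumed usage for EGPG, with hypothetical encoder outputs:
#   content_loss = info_nce(z_target_content, z_source_content)
#   style_loss   = info_nce(z_target_style,   z_exemplar_style)
#   total_loss   = generation_loss + content_loss + style_loss
```

The two losses pull the target's content embedding toward the source's and its style embedding toward the exemplar's, while pushing away mismatched pairs within the batch.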

Language Modeling, Lexical Translation, Reordering: The Training Process of NMT through the Lens of Classical SMT

Comment: EMNLP 2021

Link: http://arxiv.org/abs/2109.01396

Abstract

Unlike traditional statistical MT, which decomposes the translation task into distinct, separately learned components, neural machine translation uses a single neural network to model the entire translation process. Despite neural machine translation being the de facto standard, it is still not clear how NMT models acquire different competences over the course of training, and how this mirrors the different models in traditional SMT. In this work, we look at the competences related to three core SMT components and find that during training, NMT first focuses on learning target-side language modeling, then improves translation quality, approaching word-by-word translation, and finally learns more complicated reordering patterns. We show that this behavior holds for several models and language pairs. Additionally, we explain how such an understanding of the training process can be useful in practice and, as an example, show how it can be used to improve vanilla non-autoregressive neural machine translation by guiding teacher model selection.

Detecting Speaker Personas from Conversational Texts

Comment: Accepted by EMNLP 2021

Link: http://arxiv.org/abs/2109.01330

Abstract

Personas are useful for dialogue response prediction. However, the personas used in current studies are pre-defined and hard to obtain before a conversation. To tackle this issue, we study a new task, named Speaker Persona Detection (SPD), which aims to detect speaker personas from plain conversational text. In this task, a best-matched persona is searched out from candidates given the conversational text. This is a many-to-many semantic matching task, because both contexts and personas in SPD are composed of multiple sentences. The long-term dependency and the dynamic redundancy among these sentences increase the difficulty of this task. We build a dataset for SPD, dubbed Persona Match on Persona-Chat (PMPC). Furthermore, we evaluate several baseline models and propose utterance-to-profile (U2P) matching networks for this task. The U2P models operate at a fine granularity, treating both contexts and personas as sets of multiple sequences. Each sequence pair is then scored, and an interpretable overall score is obtained for a context-persona pair through aggregation. Evaluation results show that the U2P models significantly outperform their baseline counterparts.
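The abstract describes scoring every sequence pair and aggregating into one context-persona score, but not the aggregation itself. One simple way to realize that idea is shown below: a pairwise similarity matrix between utterance and profile-sentence embeddings, reduced by max-then-mean. The function name and the aggregation choice are illustrative assumptions, not necessarily the U2P design:

```python
import numpy as np

def u2p_score(context_embs, persona_embs):
    """Fine-grained context-persona matching.

    context_embs: (U, D) embeddings of the U context utterances
    persona_embs: (P, D) embeddings of the P persona profile sentences
    Returns a single interpretable match score.
    """
    pair_scores = context_embs @ persona_embs.T  # (U, P) pairwise matrix
    # For each utterance, keep its best-matching profile sentence,
    # then average over utterances (one plausible aggregation choice).
    return pair_scores.max(axis=1).mean()
```

Because every entry of the intermediate `(U, P)` matrix is a concrete utterance-to-sentence score, the final number remains interpretable: one can inspect which profile sentence supported each utterance.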
