QAConv: Question Answering on Informative Conversations

This paper introduces QAConv, a new question answering (QA) dataset that uses conversations as a knowledge source. We focus on informative conversations, including business emails, panel discussions, and work channels. Unlike open-domain and task-oriented dialogues, these conversations are usually long, complex, asynchronous, and involve strong domain knowledge. In total, we collect 34,204 QA pairs, including span-based, free-form, and unanswerable questions, from 10,259 selected conversations, with both human-written and machine-generated questions. We segment long conversations into chunks, and use a question generator and a dialogue summarizer as auxiliary tools to collect multi-hop questions. The dataset has two testing scenarios, chunk mode and full mode, depending on whether the grounded chunk is provided or must be retrieved from a large conversational pool. Experimental results show that state-of-the-art QA systems trained on existing QA datasets have limited zero-shot ability and tend to predict our questions as unanswerable. Fine-tuning such systems on our corpus achieves significant improvements of up to 23.6% and 13.6% in chunk mode and full mode, respectively.
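The two testing scenarios described above can be illustrated with a minimal sketch. The helper names (`segment_conversation`, `retrieve_chunk`) and the token-budget segmentation and word-overlap retriever below are hypothetical simplifications, not the paper's actual tools: in chunk mode the grounded chunk is given directly, while in full mode a retriever must first locate it in the conversational pool.

```python
def segment_conversation(utterances, max_tokens=100):
    """Split a conversation (a list of utterance strings) into chunks
    whose whitespace-token counts stay within max_tokens each.
    A crude stand-in for the paper's chunk segmentation."""
    chunks, current, count = [], [], 0
    for utt in utterances:
        n = len(utt.split())
        # Start a new chunk when adding this utterance would exceed the budget.
        if current and count + n > max_tokens:
            chunks.append(current)
            current, count = [], 0
        current.append(utt)
        count += n
    if current:
        chunks.append(current)
    return chunks


def retrieve_chunk(question, chunks):
    """Full-mode retrieval sketch: score each chunk by lowercase word
    overlap with the question and return the best-scoring chunk."""
    q_words = set(question.lower().split())

    def score(chunk):
        chunk_words = set(" ".join(chunk).lower().split())
        return len(q_words & chunk_words)

    return max(chunks, key=score)


conversation = [
    "Alice: the deadline for the report is Friday",
    "Bob: noted, I will send the draft tomorrow",
    "Alice: also remember the budget meeting on Monday",
]
chunks = segment_conversation(conversation, max_tokens=10)
best = retrieve_chunk("budget meeting on Monday", chunks)
```

In chunk mode a QA model would receive `best` (the gold chunk) together with the question; in full mode it would only see the question and the whole pool, making retrieval quality an additional bottleneck, which is consistent with the lower fine-tuning gains reported for full mode.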


Having conversations is one of the most common ways to share knowledge and exchange information. Recently, with the increasing volume of remote work, many communication tools and platforms have been heavily used, and effectively retrieving information and answering questions based on past conversations has become more and more important. In this paper, we focus on conversations such as business emails (e.g., Gmail), panel discussions (e.g., Zoom), and work channels (e.g., Slack). Different from daily chit-chat [Li et al., 2017] and task-oriented dialogues [Budzianowski et al., 2018], these conversations are usually long, complex, asynchronous, multi-party, and involve strong domain knowledge. We refer to them as informative conversations, and an example is shown in Figure 1.


However, QA research mainly focuses on document understanding (e.g., Wikipedia) rather than dialogue understanding, and dialogues differ significantly from documents in data format and wording style [Wolf et al., 2019b, Wu et al., 2020]. Existing work related to QA and conversational AI focuses on conversational QA [Reddy et al., 2019, Choi et al., 2018] instead of QA on conversations. Specifically, conversational QA has sequential, dialogue-like QA pairs that are grounded on a short document paragraph, whereas we are more interested in QA pairs grounded on conversations, treating past dialogues as a knowledge source. The work most related to ours is the FriendsQA dataset [Yang and Choi, 2019], but it is built on short chit-chat transcripts of TV shows with only one thousand dialogues.

