“Everything Can Be Seq2Seq” | A Faithful Translation of the T5 Paper

This article presents T5, which treats every text processing problem as a text-to-text problem; through pre-training and fine-tuning, the model achieves state-of-the-art performance on many NLP tasks. The study compares pre-training objectives, datasets, and other factors, highlighting the advantages of unsupervised pre-training and large-scale data.

“Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer”

Abstract

     Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus” (C4), we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.

     Keywords: transfer learning, natural language processing, multi-task learning, attention-based models, deep learning
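
To make the unified text-to-text format concrete, the short Python sketch below shows how several different tasks reduce to plain string-to-string pairs. It is illustrative only, not the paper's released code; the task prefixes and examples mirror Figure 1 of the paper.

```python
# Minimal sketch of the text-to-text format: every task, whether
# translation, classification, regression, or summarization, becomes
# "input string -> target string", distinguished only by a task prefix.

training_pairs = [
    # Machine translation: the target is the translated sentence.
    ("translate English to German: That is good.",
     "Das ist gut."),
    # Classification (CoLA acceptability): the target is the label
    # spelled out as literal text, not a class index.
    ("cola sentence: The course is jumping well.",
     "not acceptable"),
    # Regression (STS-B similarity): even the numeric score is emitted
    # as a string.
    ("stsb sentence1: The rhino grazed on the grass. "
     "sentence2: A rhino is grazing in a field.",
     "3.8"),
    # Abstractive summarization: the target is the summary text.
    ("summarize: state authorities dispatched emergency crews tuesday "
     "to survey the damage after an onslaught of severe weather in "
     "mississippi ...",
     "six people hospitalized after a storm in attala county."),
]

# Because inputs and targets are all plain text, a single encoder-decoder
# model with a single maximum-likelihood objective can be trained on every
# one of these tasks at once.
for source, target in training_pairs:
    print(f"{source!r}  ->  {target!r}")
```

Since every task shares this string-in, string-out interface, the same model, loss function, and decoding procedure apply across tasks; the only task-specific component is the textual prefix.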

Section 1: Introduction

     Training a machine learning model to perform natural language processing (NLP) tasks often requires that the model can process text in a way that is amenable to downstream learning. This can be loosely viewed as developing general-purpose knowledge that allows the model to “understand” text. This knowledge can range from low-level (e.g. the spelling or meaning of words) to high-level (e.g. that a tuba is too large to fit in most backpacks). In modern machine learning practice, providing this knowledge is rarely done explicitly; instead, it is often learned as part of an auxiliary task.
