Text Generation Models
Introduction
The data-to-text generation capability of NLG models is something that I have been exploring since the inception of sequence-to-sequence models in the field of NLP. The earlier attempts to tackle this problem did not show any promising results. Non-ML, rule-based approaches like SimpleNLG did not seem to scale well, as they require well-formatted input and can only perform tasks such as changing the tense of a sentence. But in the age of language models, where new variants of the transformer are released every two weeks, a task like this is no longer a far-fetched dream.
In this blog, I will discuss how I approached the data-to-text generation problem with advanced deep learning models.
The OpenAI GPT-2 seemed like a good option, as it has compelling text generation capabilities. But training it on the WebNLG 2017 data didn't get me anywhere: the model didn't converge at all. The conditional as well as the unconditional text generation capabilities of GPT-2 are reasonably good, but you would be hard-pressed to find a business use case that can be addressed by those tasks alone.
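For context, fine-tuning a decoder-only model like GPT-2 on WebNLG-style data typically means linearizing each set of (subject, predicate, object) triples into a single string that precedes the reference sentence. The sketch below shows one plausible linearization scheme; the special tokens, function names, and the example triples are my own illustrative choices, not the exact format used in my experiments.

```python
def linearize_triples(triples):
    """Flatten (subject, predicate, object) triples into one string that a
    language model can consume as a conditioning prefix."""
    parts = [
        f"<subject> {subj} <predicate> {pred} <object> {obj}"
        for subj, pred, obj in triples
    ]
    return " ".join(parts)


def make_training_example(triples, reference_text):
    # GPT-2 is a pure decoder, so data-to-text fine-tuning is usually framed
    # as a single sequence: "linearized triples <sep> target sentence".
    return linearize_triples(triples) + " <sep> " + reference_text


# Hypothetical WebNLG-style record for illustration.
triples = [
    ("John_Doe", "birthPlace", "London"),
    ("John_Doe", "occupation", "Engineer"),
]
example = make_training_example(
    triples, "John Doe is an engineer who was born in London."
)
```

At inference time, the model would be prompted with only the linearized triples plus the separator, and asked to generate the continuation.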
Furthermore, fine-tuning them on domain-specific data at times resulted in the generation of sentences that were out of context.
With OpenAI (not so open) n