Generating Text with GPT-2
Note: This blog was originally posted at the following link.
We have all heard that modern Natural Language Processing (NLP) has progressed by leaps and bounds in the past couple of years, following the development of attention networks and transformers. This has paved the way for a plethora of new algorithms achieving State-Of-The-Art (SOTA) results on the different tasks of NLP.
OpenAI has been one of the leaders in providing its own language model (now the released GPT-3), which is trained on a huge corpus of internet data. Since GPT-3 is a recent phenomenon, available only in English at the moment, and accessible only through an API provided by OpenAI, we shift our focus to its earlier version, GPT-2. To learn about the internal nuts and bolts of GPT-2, I suggest you go through this link. For a deeper dive into Attention and Transformers, here are some excellent links:
The Illustrated Transformer by Jay Alammar
The Annotated Transformer by Harvard NLP
GPT-2, too, was released only for English, which makes it difficult for anyone trying to generate text in a different language.
So why not train your own GPT-2 model in your favorite language for text generation? That is exactly what we are going to do. So, without further ado, let us jump in.
For the demo, I have considered a non-Latin alphabet script (Bengali here), because why not? I have used Hugging Face's implementation of the model.
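Before we get to training our own model, here is what plain text generation with Hugging Face's implementation looks like. This is only a minimal sketch, assuming the transformers library (with PyTorch) is installed; the pretrained English "gpt2" checkpoint, the prompt, and the sampling parameters are illustrative choices, not something prescribed by this post.

```python
# Minimal sketch: sampling text from the pretrained English GPT-2 checkpoint
# with Hugging Face's transformers library (assumes transformers + PyTorch).
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Encode an illustrative English prompt into input token ids.
input_ids = tokenizer.encode("The history of natural language processing", return_tensors="pt")

# Sample a continuation; max_length, top_k and top_p here are arbitrary examples.
output_ids = model.generate(
    input_ids,
    max_length=50,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Training GPT-2 for a language like Bengali essentially means replacing this pretrained tokenizer and checkpoint with ones built on your own corpus, which is what the rest of the post walks through.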