[PyTorch Official NLP Tutorial Explained 02] N-Gram Language Model


iSikai

Category: NLP

Article link: https://blog.csdn.net/oksupersonic/article/details/103755025

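This tutorial introduces language models and also brings in word embeddings. In NLP the former is usually called a downstream task and the latter a preprocessing (pre-training) step; this two-stage pipeline of preprocessing followed by a downstream task is a common framework for NLP experiments today.
This post clears up some points of confusion in https://pytorch.org/tutorials/beginner/nlp/word_embeddings_tutorial.html#sphx-glr-beginner-nlp-word-embeddings-tutorial-py.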

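Language model

In NLP, a language model generally refers to the probability that a sentence occurs in natural language. The goal of this experiment is to train a language model that predicts a word from the words preceding it, i.e. to estimate $P(W_i \mid W_{i-N}, \ldots, W_{i-1})$.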

test_sentence = """When forty winters shall besiege thy brow,
And dig deep trenches in thy beauty's field,
Thy youth's proud livery so gazed on now,
Will be a totter'd weed of small worth held:
Then being asked, where all thy beauty lies,
Where all the treasure of thy lusty days;
To say, within thine own deep sunken eyes,
Were an all-eating shame, and thriftless praise.
How much more praise deserv'd thy beauty's use,
If thou couldst answer 'This fair child of mine
Shall sum my count, and make my old excuse,'
Proving his beauty by succession thine!
This were to be new made when thou art old,
And see thy blood warm when thou feel'st it cold.""".split()

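Concretely, the goal is to train a predictor on the text above: given the previous 2 words, it outputs the probability that the next word is $W_i$, i.e. a vector of scores (log-probabilities) with one entry per word in the vocabulary.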
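The training code later in this post iterates over (context, target) pairs and looks words up in a `word_to_ix` mapping, neither of which is defined in the post. Below is a minimal sketch of how they could be built, assuming the same construction as the official tutorial. The helper names `build_trigrams` and `build_vocab` are hypothetical, and a short sample stands in for the full `test_sentence`.

```python
def build_trigrams(tokens):
    """Return ([w_{i-2}, w_{i-1}], w_i) pairs for every word that has two predecessors."""
    return [([tokens[i], tokens[i + 1]], tokens[i + 2])
            for i in range(len(tokens) - 2)]


def build_vocab(tokens):
    """Map each distinct token to an integer index (sorted so the mapping is reproducible)."""
    return {word: ix for ix, word in enumerate(sorted(set(tokens)))}


# Short sample standing in for the full sonnet
sample = "When forty winters shall besiege thy brow,".split()
trigrams = build_trigrams(sample)
word_to_ix = build_vocab(sample)

print(trigrams[0])      # (['When', 'forty'], 'winters')
print(len(word_to_ix))  # 7 distinct tokens in the sample
```

Sorting the vocabulary is a choice made here for reproducibility; iterating over a plain `set` also works, but the indices may then differ between runs.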

Network model


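import torch
import torch.nn as nn
import torch.nn.functional as F

class NGram(nn.Module):
    def __init__(self, vocab_size, embedding_dim, context_size):
        super(NGram, self).__init__()
        # Lookup table: one embedding_dim-sized vector per vocabulary word
        self.embeddings = nn.Embedding(vocab_size, embedding_dim)
        self.linear1 = nn.Linear(context_size * embedding_dim, 128)
        self.linear2 = nn.Linear(128, vocab_size)

    def forward(self, context):
        # context: LongTensor of word indices, shape (context_size,)
        # Flatten the looked-up vectors into one row: (1, context_size * embedding_dim)
        context_embedding = self.embeddings(context).view((1, -1))
        out = F.relu(self.linear1(context_embedding))
        # Log-probabilities over the vocabulary, shape (1, vocab_size)
        out = F.log_softmax(self.linear2(out), dim=1)
        return out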

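Points to note:
1. In __init__, the embedding layer is configured with (vocab_size, embedding_dim), but at run time the input is a vector of word indices, not one-hot vectors.
2. In forward, the embedding output must be flattened before the linear layer. In torch, view((1,-1)) reshapes it into a single row of shape (1, context_size*embedding_dim), i.e. a 2-D tensor with one row rather than a strictly one-dimensional tensor.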
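Both notes can be illustrated without PyTorch. The sketch below uses plain Python lists and a made-up 4x3 table to mimic what an embedding lookup does conceptually: picking rows by index gives the same result as multiplying one-hot vectors by the table, and flattening concatenates the context vectors into one row. The names `weight`, `lookup`, `one_hot_matmul` and `flatten` are hypothetical and are not part of any library.

```python
# Toy embedding table: vocab_size = 4 rows, embedding_dim = 3 columns (values are made up)
weight = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
    [1.0, 1.1, 1.2],
]


def lookup(indices):
    """Index-based lookup: pick one row of the table per word index."""
    return [weight[i] for i in indices]


def one_hot_matmul(indices):
    """Same result computed the expensive way: one-hot vectors times the table."""
    vocab_size, dim = len(weight), len(weight[0])
    rows = []
    for i in indices:
        one_hot = [1.0 if j == i else 0.0 for j in range(vocab_size)]
        rows.append([sum(one_hot[j] * weight[j][d] for j in range(vocab_size))
                     for d in range(dim)])
    return rows


def flatten(rows):
    """Concatenate the context vectors into a single row, analogous to view((1, -1))."""
    return [[x for row in rows for x in row]]


context = [2, 0]                # indices of the two context words
embedded = lookup(context)      # shape (context_size, embedding_dim) = (2, 3)
flat = flatten(embedded)        # shape (1, context_size * embedding_dim) = (1, 6)

print(embedded == one_hot_matmul(context))  # True
print(flat)
```

The lookup avoids building vocab_size-long vectors altogether, which is why passing indices is both simpler and cheaper than passing one-hot vectors.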

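Training

# Assumes model, loss_function (nn.NLLLoss() in the official tutorial) and optimizer are already defined
for epoch in range(15):
    total_loss = 0.
    for context, target in trigrams:
        # nn.Embedding is a table lookup, not a matrix product, so pass word indices rather than one-hot vectors
        context_idx = torch.tensor([word_to_ix[w] for w in context], dtype=torch.long)

        # Clear the gradients accumulated in the previous step
        model.zero_grad()

        # Log-probabilities for the next word, shape (1, vocab_size)
        ans = model(context_idx)

        loss = loss_function(ans, torch.tensor([word_to_ix[target]], dtype=torch.long))
        loss.backward()

        optimizer.step()

        # Accumulate the loss so that each epoch's total can be inspected
        total_loss += loss.item()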
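The post does not show how `loss_function` is defined. Assuming it is a negative log-likelihood loss applied to the model's log-softmax output, as in the official tutorial, the per-example loss is simply the negative log-probability assigned to the target word. The sketch below works this out by hand in plain Python; the scores and the helpers `log_softmax` and `nll_loss` are made up for illustration and only mirror the PyTorch functions conceptually.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]


def nll_loss(log_probs, target_ix):
    """Negative log-likelihood of the target class."""
    return -log_probs[target_ix]


# Made-up raw scores for a 3-word vocabulary; index 0 is the true next word
scores = [2.0, 0.5, -1.0]
log_probs = log_softmax(scores)
loss = nll_loss(log_probs, target_ix=0)

print(sum(math.exp(lp) for lp in log_probs))  # probabilities sum to 1 (up to rounding)
print(loss)                                   # small when the target gets a high score
```

Minimizing this loss pushes the score of the observed next word up relative to all other words, which is how the training loop above fits both the embeddings and the linear layers.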
