PyTorch Exercise: Computing Word Embeddings: Continuous Bag-of-Words

PyTorch Tutorial

The exercise on training word embeddings from the PyTorch tutorial is described as follows:

The Continuous Bag-of-Words model (CBOW) is frequently used in NLP deep learning. It is a model that tries to predict words given the context of a few words before and a few words after the target word. This is distinct from language modeling, since CBOW is not sequential and does not have to be probabilistic. Typically, CBOW is used to quickly train word embeddings, and these embeddings are used to initialize the embeddings of some more complicated model. Usually, this is referred to as pretraining embeddings. It almost always helps performance by a couple of percent.

The CBOW model is as follows. Given a target word $w_i$ and an $N$-word context window on each side, $w_{i-1}, \ldots, w_{i-N}$ and $w_{i+1}, \ldots, w_{i+N}$, referring to all context words collectively as $C$, CBOW tries to minimize

$$-\log p(w_i \mid C) = -\log \mathrm{Softmax}\left(A\left(\sum_{w \in C} q_w\right) + b\right)$$

where $q_w$ is the embedding for word $w$.
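To make the notation concrete, here is a minimal sketch (with illustrative sizes that are not part of the exercise) of how the sum of context embeddings, the affine map $A$, $b$, and the log-softmax fit together:

import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, embedding_dim = 49, 20                 # illustrative sizes only
embeddings = nn.Embedding(vocab_size, embedding_dim)  # rows are the q_w vectors
linear = nn.Linear(embedding_dim, vocab_size)          # holds A and b

context_idxs = torch.LongTensor([3, 7, 11, 2])     # 2N = 4 made-up context word indices
q = embeddings(context_idxs)                       # shape (4, embedding_dim)
summed = torch.sum(q, dim=0).view(1, -1)           # shape (1, embedding_dim)
log_probs = F.log_softmax(linear(summed), dim=1)   # shape (1, vocab_size)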

Implement this model in PyTorch by filling in the class below. Some tips:

  • Think about which parameters you need to define.
  • Make sure you know what shape each operation expects. Use .view() if you need to reshape.
The code is as follows:
import torch
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(1)

CONTEXT_SIZE = 2  # 2 words to the left, 2 to the right
raw_text = """We are about to study the idea of a computational process.
Computational processes are abstract beings that inhabit computers.
As they evolve, processes manipulate other abstract things called data.
The evolution of a process is directed by a pattern of rules
called a program. People create programs to direct processes. In effect,
we conjure the spirits of the computer with our spells.""".split()

# By deriving a set from `raw_text`, we deduplicate the array
vocab = set(raw_text)
vocab_size = len(vocab)

word_to_ix = {word: i for i, word in enumerate(vocab)}
data = []
for i in range(2, len(raw_text) - 2):
    context = [raw_text[i - 2], raw_text[i - 1],
               raw_text[i + 1], raw_text[i + 2]]
    target = raw_text[i]
    data.append((context, target))
print(data[:5])


class CBOW(nn.Module):
    def __init__(self, vocab_size, embedding_dim):
        super(CBOW, self).__init__()
        self.embeddings = nn.Embedding(vocab_size, embedding_dim)  # trainable embedding table (the q_w vectors)
        self.linear1 = nn.Linear(embedding_dim, vocab_size)  # trainable affine map (A and b in the formula)


    def forward(self, inputs):
        embeds = self.embeddings(inputs)
        add_embeds = torch.sum(embeds, dim=0).view(1, -1)  # sum the context embeddings, reshape to (1, embedding_dim)
        out = self.linear1(add_embeds)
        log_probs = F.log_softmax(out, dim=1)
        return log_probs

# create your model and train.  here are some functions to help you make
# the data ready for use by your module


def make_context_vector(context, word_to_ix):
    idxs = [word_to_ix[w] for w in context]
    tensor = torch.LongTensor(idxs)
    return Variable(tensor)


make_context_vector(data[0][0], word_to_ix)  # example

# declare the loss function, model, and optimizer
losses = []
loss_function = nn.NLLLoss()  # expects log-probabilities, which the model's log_softmax provides
model = CBOW(vocab_size, embedding_dim=20)  # __init__ takes only vocab_size and embedding_dim
optimizer = optim.SGD(model.parameters(), lr=0.001)

# train for 10 epochs
for epoch in range(10):
    total_loss = torch.FloatTensor([0])
    for context, target in data:
        context_idxs = [word_to_ix[w] for w in context]
        target_idx = word_to_ix[target]
        context_var = Variable(torch.LongTensor(context_idxs))
        target_var = Variable(torch.LongTensor([target_idx]))
        model.zero_grad()
        log_probs = model(context_var)

        loss = loss_function(log_probs, target_var)
        loss.backward()
        optimizer.step()

        total_loss += loss.data
    losses.append(total_loss)
print(losses)

Output:

[(['We', 'are', 'to', 'study'], 'about'), (['are', 'about', 'study', 'the'], 'to'), (['about', 'to', 'the', 'idea'], 'study'), (['to', 'study', 'idea', 'of'], 'the'), (['study', 'the', 'of', 'a'], 'idea')]
[260.2805, 255.0300, 249.8967, 244.8781, 239.9720, 235.1766, 230.4900, 225.9105, 221.4367, 217.0672]
(the ten per-epoch total losses; each was originally printed as a FloatTensor of size 1)
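With the model trained, one way to sanity-check it (not part of the original exercise; the ix_to_word helper and the example context are assumptions added here) is to predict the target word for a context and to look up a word's learned embedding. After only 10 epochs at this learning rate the prediction may still be wrong:

# sanity-check sketch: predict a target and inspect an embedding
ix_to_word = {i: w for w, i in word_to_ix.items()}  # helper built here for illustration

context = ['People', 'create', 'to', 'direct']       # surrounds the target 'programs' in raw_text
context_var = make_context_vector(context, word_to_ix)
log_probs = model(context_var)
predicted_ix = int(torch.argmax(log_probs, dim=1))
print(ix_to_word[predicted_ix])                      # may or may not be 'programs' after 10 epochs

word_vec = model.embeddings.weight[word_to_ix['process']]  # learned vector for one word
print(word_vec.shape)                                # torch.Size([20])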

