PyTorch Official Tutorial Study Notes 08: SEQUENCE-TO-SEQUENCE MODELING WITH NN.TRANSFORMER AND TORCHTEXT

cc 提升ing 变优秀ing

Categories: nlp, pytorch. Tags: pytorch

Article link: https://blog.csdn.net/weixin_42721412/article/details/109856721

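This article explains how to use PyTorch's TransformerEncoder for sequence-to-sequence modeling, with TorchText handling the data. It covers the self-attention mechanism, PositionalEncoding, data preprocessing, batching, and the basic steps of model training.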

1. Diagram of the seq2seq Transformer model:

[Figure: seq2seq Transformer model diagram]

2. Task for this section:

The language modeling task is to assign a probability for the likelihood of a given word (or a sequence of words) to follow a sequence of words.
That is, given the words seen so far, the model estimates how probable each candidate next word (or continuation) is.
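
To make "assigning a probability to the next word" concrete, here is a minimal sketch. The four-word vocabulary and the scores are made up for illustration; in the real model the scores would come from the Transformer's output layer.

```python
import torch

# Hypothetical tiny vocabulary and made-up scores (logits) for the next word.
vocab = ["the", "cat", "sat", "mat"]
logits = torch.tensor([0.5, 2.0, 0.1, -1.0])

# Softmax turns the raw scores into a probability distribution over the vocabulary.
probs = torch.softmax(logits, dim=0)

for word, p in zip(vocab, probs.tolist()):
    print(f"P({word!r} | context) = {p:.3f}")

# The most likely next word is the one with the highest probability.
best = vocab[int(probs.argmax())]
print("most likely next word:", best)
```

The probabilities sum to 1, so the model's output can be read as "how likely is each word to follow the context".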

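3. TransformerEncoderLayer:

See the official documentation for nn.TransformerEncoderLayer.
TransformerEncoderLayer is made up of self-attention and a feedforward network.
Parameters:
[Figure: screenshot of the TransformerEncoderLayer parameter list]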

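import torch
import torch.nn as nn

# One encoder layer: self-attention followed by a feedforward network.
encoder_layer = nn.TransformerEncoderLayer(d_model=512, nhead=8)
# Random input shaped (sequence length, batch size, d_model).
src = torch.rand(10, 32, 512)
out = encoder_layer(src)
print(out.shape)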

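Result:
torch.Size([10, 32, 512])
Key points:
1. The output of this layer has the same shape as its input.
2. Text input is generally three-dimensional. With the layer's default layout, the first dimension is the sequence length (the number of words or characters in each sentence), the second is the batch size (the number of sentences), and the third is the depth of the word embedding.
3. The embedding depth must be divisible by nhead. Here 512 / 8 = 64, so it works.
4. The official documentation describes d_model as the number of expected features in the input. This is another way of describing the word embedding: the embedding depth is the number of features.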
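
The divisibility requirement in point 3 can be checked directly. This is a small sketch, not part of the original tutorial; the value 510 is chosen only because it is not divisible by 8, and the exact exception type may vary across PyTorch versions.

```python
import torch.nn as nn

d_model, nhead = 512, 8
# Each attention head works on d_model // nhead features.
head_dim = d_model // nhead
print("features per head:", head_dim)

# A d_model that is not divisible by nhead is rejected when the layer is built.
rejected = False
try:
    nn.TransformerEncoderLayer(d_model=510, nhead=8)
except (AssertionError, ValueError) as err:
    rejected = True
    print("rejected:", err)
```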

The dimensions are unchanged after self-attention:

[Figure: illustration of self-attention preserving the input dimensions]

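4. TransformerEncoder

See the official documentation for nn.TransformerEncoder.
TransformerEncoder is a stack of N encoder layers.
torch.nn.TransformerEncoder(encoder_layer, num_layers, norm=None)
This stacks num_layers copies of encoder_layer.
Here encoder_layer is the network defined with TransformerEncoderLayer in section 3.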
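
A minimal sketch of the stacking described above. The small sizes (d_model=64, nhead=4, two layers) are arbitrary choices to keep the example fast, not values from the tutorial.

```python
import torch
import torch.nn as nn

# A single encoder layer, as in section 3 (smaller sizes for a quick run).
encoder_layer = nn.TransformerEncoderLayer(d_model=64, nhead=4)
# Stack two copies of that layer.
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

# Input shaped (sequence length, batch size, d_model).
src = torch.rand(10, 32, 64)
out = encoder(src)

print(out.shape)            # same shape as src
print(len(encoder.layers))  # number of stacked layers
```

Because every layer preserves the input shape, any number of layers can be stacked without changing the output dimensions.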

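5. PositionalEncoding:

This is essentially a fixed mathematical computation: for each position it produces a vector with the same depth as the embedding, and then adds that vector to the original embedding.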

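import math

import torch
import torch.nn as nn


class PositionalEncoding(nn.Module):

    def __init__(self, d_model, dropout=0.1, max_len=5000):
        super(PositionalEncoding, self).__init__()
        self.dropout = nn.Dropout(p=dropout)
        # Precompute the encodings for every position up to max_len.
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)  # even columns: sine
        pe[:, 1::2] = torch.cos(position * div_term)  # odd columns: cosine
        # Reshape to (max_len, 1, d_model) so it broadcasts over the batch dimension.
        pe = pe.unsqueeze(0).transpose(0, 1)
        # A buffer is stored with the module but is not a trainable parameter.
        self.register_buffer('pe', pe)

    def forward(self, x):
        # x: (sequence length, batch size, d_model)
        x = x + self.pe[:x.size(0), :]
        return self.dropout(x)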
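
To see what the table of encodings contains, the sketch below rebuilds it outside the module with the same formulas. The helper name sinusoid_table and the small sizes are illustrative assumptions.

```python
import math
import torch

def sinusoid_table(max_len, d_model):
    """Build the (max_len, d_model) table using the same formulas as PositionalEncoding."""
    position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe

pe = sinusoid_table(max_len=50, d_model=8)
print(pe.shape)
# Position 0: sin(0) = 0 in the even columns, cos(0) = 1 in the odd columns.
print(pe[0])
# Position 1, column 0 uses frequency 1, so it equals sin(1).
print(pe[1, 0].item(), math.sin(1.0))
```

Each row is the vector added to the embedding of the token at that position, which is why its width must match the embedding depth.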

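6. Processing data with torchtext:

Field:
See the official torchtext documentation. Field belongs to the older torchtext API, which later releases moved to torchtext.legacy and then removed, so the code here needs a torchtext version that still provides it.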

import torchtext
from torchtext.data.utils import get_tokenizer

TEXT = torchtext.data.Field(tokenize=get_tokenizer("basic_english"),
                            init_token='<sos>',
                            eos_token='<eos>',
                            lower=True)

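Parameters:
tokenize: selects the "basic_english" tokenizer; different languages call for different tokenizers.
init_token: A token that will be prepended to every example using this field, or None for no initial token. In other words, this token is added to the start of every sentence.
eos_token: A token that will be appended to every example using this field, or None for no end-of-sentence token. Default: None. In other words, this token is added to the end of every sentence.
lower: Whether to lowercase the text in this field. Here all text is converted to lowercase.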
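
The effect of these options can be imitated in plain Python. The function preprocess below is a hypothetical stand-in, not torchtext's implementation: it splits on whitespace only, whereas the real "basic_english" tokenizer also handles punctuation.

```python
def preprocess(sentence, init_token="<sos>", eos_token="<eos>", lower=True):
    """Hypothetical stand-in that mimics the Field options above."""
    if lower:
        sentence = sentence.lower()
    tokens = sentence.split()  # simplification: whitespace split only
    return [init_token] + tokens + [eos_token]

tokens = preprocess("The cat sat on the mat")
print(tokens)
# ['<sos>', 'the', 'cat', 'sat', 'on', 'the', 'mat', '<eos>']
```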

train_txt, val_txt, test_txt = torchtext.datasets.WikiText2.splits(TEXT)

WikiText-2 is one of the language modeling datasets available in torchtext.
The Field defined above is applied to each split of the dataset.

TEXT.build_vocab(train_txt)

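Calling build_vocab() on the Field object TEXT builds its internal Vocab object. The Vocab provides many built-in helpers for mapping between words, indices, and word vectors.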
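
Conceptually, building a vocabulary means counting tokens and assigning each one an integer index. The sketch below is a simplified pure-Python illustration; build_vocab here is a hypothetical helper, and the names stoi and itos are borrowed from the string-to-index and index-to-string mappings that the Vocab object exposes.

```python
from collections import Counter

def build_vocab(token_lists, specials=("<unk>", "<pad>")):
    """Hypothetical helper: special tokens first, then tokens by descending frequency."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    itos = list(specials) + [tok for tok, _ in counts.most_common() if tok not in specials]
    stoi = {tok: idx for idx, tok in enumerate(itos)}
    return stoi, itos

corpus = [
    ["<sos>", "the", "cat", "sat", "<eos>"],
    ["<sos>", "the", "dog", "sat", "<eos>"],
]
stoi, itos = build_vocab(corpus)

# Convert tokens to indices; unknown words fall back to the <unk> index.
ids = [stoi.get(tok, stoi["<unk>"]) for tok in ["the", "bird", "sat"]]
print(ids)
print([itos[i] for i in ids])
```

Turning tokens into indices this way is the same kind of conversion that numericalize performs in the batching function of the next section.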

7. Defining a batching function:

# Define a function that splits the data into batches.
def batchify(data, bsz):
    # Convert the tokens of the whole corpus into a column of vocabulary indices.
    data = TEXT.numericalize([data.examples[0].text])
    # Divide the dataset into bsz parts.
    nbatch = data.size(0) // bsz
    # Trim off any extra elements that wouldn't cleanly fit (remainders).
    data = data.narrow(0, 0, nbatch * bsz)
    # Evenly divide the data across the bsz batches.
    data = data.view(bsz, -1).t().contiguous()
    return data
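
The reshaping steps are easier to follow on a small tensor. The sketch below applies the same narrow / view / transpose sequence to the integers 0 to 25 instead of real token indices; batchify_tensor is a hypothetical variant that skips the numericalize step.

```python
import torch

def batchify_tensor(data, bsz):
    """Same reshaping as batchify, applied to an already numeric 1-D tensor."""
    nbatch = data.size(0) // bsz
    # Drop the remainder so the length is a multiple of bsz.
    data = data.narrow(0, 0, nbatch * bsz)
    # Each column becomes one contiguous slice of the original sequence.
    return data.view(bsz, -1).t().contiguous()

tokens = torch.arange(26)
batches = batchify_tensor(tokens, bsz=4)

print(batches.shape)  # torch.Size([6, 4])
print(batches[:, 0])  # tensor([0, 1, 2, 3, 4, 5])
print(batches[0])     # tensor([ 0,  6, 12, 18])
```

The last two elements (24 and 25) are trimmed, and each of the four columns holds a consecutive run of six tokens, so the columns are processed as independent sequences.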
