1. nn.Embedding
torch.nn.Embedding(num_embeddings,
embedding_dim,
padding_idx=None,
max_norm=None,
norm_type=2,
scale_grad_by_freq=False,
sparse=False)
Parameters
num_embeddings (int) - size of the dictionary of embeddings (the vocabulary size)
embedding_dim (int) - size of each embedding vector
padding_idx (int, optional) - if given, the output is the zero vector whenever this index is encountered (illustrated in the sketch after this list)
max_norm (float, optional) - if given, each embedding vector is renormalized so that its norm does not exceed this value (also shown in the sketch below)
norm_type (float, optional) - the p of the p-norm used when computing norms for the max_norm option
scale_grad_by_freq (boolean, optional) - if given, gradients are scaled by the inverse frequency of the words in the mini-batch
sparse (boolean, optional) - if True, the gradient w.r.t. the weight matrix will be a sparse tensor
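To make padding_idx and max_norm concrete, here is a minimal sketch (the sizes 5 and 3 are arbitrary): the row at padding_idx stays at the zero vector, and max_norm renormalizes looked-up vectors during the forward pass.
import torch
import torch.nn as nn

# padding_idx=0: row 0 of the weight starts as the zero vector and receives
# no gradient, so padded positions always map to zeros
emb = nn.Embedding(5, 3, padding_idx=0)
print(emb(torch.LongTensor([0, 2])))  # first row is all zeros

# max_norm=1.0: any looked-up vector whose 2-norm exceeds 1.0 is
# renormalized (in place) to norm 1.0 during the forward pass
emb2 = nn.Embedding(5, 3, max_norm=1.0)
print(emb2(torch.LongTensor([1, 2])).norm(dim=1))  # all norms <= 1.0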
Attributes
weight (Tensor) - of shape (num_embeddings, embedding_dim); the (randomly initialized) word vectors of the module
Input / Output
Input: LongTensor of shape (N, W), where N = mini-batch size and W = number of indices to extract per mini-batch
Output: (N, W, embedding_dim)
This module performs a word-embedding lookup, similar to tf.nn.embedding_lookup in TensorFlow.
Example:
import torch
import torch.nn as nn

# Randomly initialize the word vectors
embedding = nn.Embedding(4, 5)
# Print the randomly initialized weight matrix
print(embedding.weight)
x = torch.tensor([[1, 2, 3], [1, 2, 1]])
# Look up the embedding for each index
tensor = embedding(x)
print(tensor)
Output:
Parameter containing:
tensor([[ 0.4416, -0.6984, -0.7924, 0.1451, -2.0954],
[ 0.6118, -1.1465, 0.0256, -0.1333, -0.2335],
[-0.2569, 0.7854, 0.3151, -0.8586, 1.5699],
[-1.0538, -0.7021, -0.2123, 1.1679, 0.2272]], requires_grad=True)
tensor([[[ 0.6118, -1.1465, 0.0256, -0.1333, -0.2335],
[-0.2569, 0.7854, 0.3151, -0.8586, 1.5699],
[-1.0538, -0.7021, -0.2123, 1.1679, 0.2272]],
[[ 0.6118, -1.1465, 0.0256, -0.1333, -0.2335],
[-0.2569, 0.7854, 0.3151, -0.8586, 1.5699],
[ 0.6118, -1.1465, 0.0256, -0.1333, -0.2335]]],
grad_fn=<EmbeddingBackward>)
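As a quick sanity check (a sketch reusing the same sizes as the example above), the output shape follows the documented (N, W, embedding_dim) rule, and every index must lie in [0, num_embeddings):
import torch
import torch.nn as nn

embedding = nn.Embedding(4, 5)
x = torch.tensor([[1, 2, 3], [1, 2, 1]])  # shape (N=2, W=3)
out = embedding(x)
assert out.shape == (2, 3, 5)  # (N, W, embedding_dim)
# Indices must be smaller than num_embeddings; e.g.
# embedding(torch.tensor([4])) would raise an IndexError here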
Note:
Whether the word vectors are updated during training can be controlled by setting the requires_grad attribute of embedding.weight, as follows:
import torch
import torch.nn as nn

# Randomly initialized embedding (a stand-in for pretrained word vectors)
embedding = nn.Embedding(4, 5)
# Freeze the word vectors so they are not updated during training
# (requires_grad defaults to True)
embedding.weight.requires_grad = False
inputs = torch.LongTensor([[1, 2], [1, 3]])
print("embedding weight:", embedding.weight)
print("embedding:", embedding(inputs))
2. nn.Embedding.from_pretrained
Method 1:
nn.Embedding.from_pretrained(embeddings,
freeze=True,
padding_idx=None,
max_norm=None,
norm_type=2.,
scale_grad_by_freq=False,
sparse=False)
Key parameters
embeddings: the pretrained word vectors (a FloatTensor of shape (num_embeddings, embedding_dim))
freeze: if True, the word vectors are not updated during training; if False, they keep being updated. Defaults to True
import torch
import torch.nn as nn

# Pretend these are pretrained word vectors
word_vec = torch.randn(4, 5)
# Load the pretrained word vectors
embedding = nn.Embedding.from_pretrained(word_vec)
inputs = torch.LongTensor([[1, 2], [1, 3]])
print("word_vec:", word_vec)
print("embedding weight:", embedding.weight)
print("embedding:", embedding(inputs))
word_vec: tensor([[-2.6406, -1.7227, 0.7873, -1.6107, -0.9043],
[ 0.5398, 0.1855, -0.0469, -0.8214, 1.1612],
[-0.4362, -0.5964, -0.1713, 0.5957, 1.1884],
[ 0.0313, 0.2418, 0.6130, 0.3484, 1.2539]])
embedding weight: Parameter containing:
tensor([[-2.6406, -1.7227, 0.7873, -1.6107, -0.9043],
[ 0.5398, 0.1855, -0.0469, -0.8214, 1.1612],
[-0.4362, -0.5964, -0.1713, 0.5957, 1.1884],
[ 0.0313, 0.2418, 0.6130, 0.3484, 1.2539]])
embedding: tensor([[[ 0.5398, 0.1855, -0.0469, -0.8214, 1.1612],
[-0.4362, -0.5964, -0.1713, 0.5957, 1.1884]],
[[ 0.5398, 0.1855, -0.0469, -0.8214, 1.1612],
[ 0.0313, 0.2418, 0.6130, 0.3484, 1.2539]]])
As in section 1, the requires_grad attribute of embedding.weight can also be used to control whether the word vectors are updated.
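For completeness, a minimal sketch of the two equivalent ways to keep pretrained vectors trainable (word_vec is again a random stand-in for real pretrained vectors):
import torch
import torch.nn as nn

word_vec = torch.randn(4, 5)  # stand-in for real pretrained vectors

# Way 1: ask from_pretrained not to freeze the weights
emb_a = nn.Embedding.from_pretrained(word_vec, freeze=False)

# Way 2: load frozen (the default), then re-enable gradients by hand
emb_b = nn.Embedding.from_pretrained(word_vec)
emb_b.weight.requires_grad = True

print(emb_a.weight.requires_grad, emb_b.weight.requires_grad)  # True True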