nn.Embedding(num_embeddings, embedding_dim, padding_idx=None, max_norm=None, norm_type=2.0, scale_grad_by_freq=False, sparse=False, _weight=None)

A simple lookup table that stores embeddings of a fixed dictionary and size. This module is often used to store word embeddings and retrieve them using indices. The input to the module is a list of indices, and the output is the corresponding word embeddings.
num_embeddings (int): size of the dictionary of embeddings
embedding_dim (int): the size of each embedding vector
padding_idx (int, optional): If given, pads the output with the embedding vector at :attr:`padding_idx` (initialized to zeros) whenever it encounters that index.
max_norm (float, optional): If given, each embedding vector with norm larger than :attr:`max_norm` is renormalized to have norm :attr:`max_norm`.
norm_type (float, optional): The p of the p-norm to compute for the :attr:`max_norm` option. Default ``2``.
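The renormalization behaviour of :attr:`max_norm` and :attr:`norm_type` can be sketched as follows (a minimal illustration; the seed and sizes are arbitrary):

```python
import torch
from torch import nn

torch.manual_seed(0)
emb = nn.Embedding(5, 4, max_norm=1.0, norm_type=2.0)
out = emb(torch.tensor([0, 1, 2]))
# every looked-up row has been renormalized to 2-norm <= 1.0
print(out.norm(p=2, dim=1))
```

Note that the renormalization modifies the rows of the weight matrix in place during the forward pass, not just the returned tensor.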
scale_grad_by_freq (boolean, optional): If given, this will scale gradients by the inverse of frequency of the words in the mini-batch. Default ``False``.
sparse (bool, optional): If ``True``, gradient w.r.t. :attr:`weight` matrix will be a sparse tensor. See Notes for more details regarding sparse gradients.
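What ``sparse=True`` changes can be observed directly from the type of the gradient (a minimal sketch; the sizes are arbitrary):

```python
import torch
from torch import nn

emb = nn.Embedding(10, 3, sparse=True)
out = emb(torch.tensor([1, 2, 4]))
out.sum().backward()
# the gradient touches only the looked-up rows and is stored as a sparse tensor
print(emb.weight.grad.is_sparse)  # True
```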
Attributes:
weight (Tensor): the learnable weights of the module of shape (num_embeddings, embedding_dim) initialized from :math:`\mathcal{N}(0, 1)`
Shape: - Input: :math:`(*)`, LongTensor of arbitrary shape containing the indices to extract
- Output: :math:`(*, H)`, where `*` is the input shape and :math:`H=\text{embedding\_dim}`
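The shape contract above can be verified with a quick check (sizes here are arbitrary):

```python
import torch
from torch import nn

emb = nn.Embedding(10, 3)           # H = embedding_dim = 3
idx = torch.randint(0, 10, (2, 4))  # input of shape (*) = (2, 4)
out = emb(idx)
print(out.shape)  # torch.Size([2, 4, 3]), i.e. (*, H)
```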
.. note:: Keep in mind that only a limited number of optimizers support sparse gradients: currently it's :class:`optim.SGD` (`CUDA` and `CPU`), :class:`optim.SparseAdam` (`CUDA` and `CPU`) and :class:`optim.Adagrad` (`CPU`).
.. note:: With :attr:`padding_idx` set, the embedding vector at :attr:`padding_idx` is initialized to all zeros. However, this vector can be modified afterwards, e.g., using a customized initialization method, thus changing the vector used to pad the output. The gradient for this vector from :class:`~torch.nn.Embedding` is always zero.
import torch
from torch import nn

embedding = nn.Embedding(10, 3)  # an Embedding module holding 10 vectors of size 3
input = torch.LongTensor([[1, 2, 4, 5], [4, 3, 2, 9]])  # a batch of 2 samples of 4 indices each
embedding(input)  # returns a tensor of shape (2, 4, 3)
# example with padding_idx: index 0 maps to the all-zero padding vector
embedding = nn.Embedding(10, 3, padding_idx=0)
input = torch.LongTensor([[0, 2, 0, 5]])
embedding(input)  # rows for index 0 come out as all zeros
nn.Embedding(10, 3, padding_idx=0).weight  # row 0 of the weight is initialized to zeros
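A small sketch tying the two behaviours of :attr:`padding_idx` together: the padding row can be re-initialized after construction, but its gradient from the embedding lookup is always zero.

```python
import torch
from torch import nn

emb = nn.Embedding(10, 3, padding_idx=0)
with torch.no_grad():
    emb.weight[0] = torch.ones(3)   # replace the zero padding vector
out = emb(torch.LongTensor([[0, 2, 0, 5]]))
out.sum().backward()
# the padding row received no gradient
print(emb.weight.grad[0])  # tensor([0., 0., 0.])
```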