nn.Embedding Parameter Explanation

CLASS torch.nn.Embedding(num_embeddings: int, embedding_dim: int, padding_idx: Optional[int] = None, max_norm: Optional[float] = None, norm_type: float = 2.0, scale_grad_by_freq: bool = False, sparse: bool = False, _weight: Optional[torch.Tensor] = None)

A simple lookup table that stores embeddings of a fixed dictionary and size.

This module is often used to store word embeddings and retrieve them using indices. The input to the module is a list of indices, and the output is the corresponding word embeddings.

Parameters

  • num_embeddings (int) – size of the dictionary of embeddings

  • embedding_dim (int) – the size of each embedding vector

  • padding_idx (int, optional) – If given, pads the output with the embedding vector at padding_idx (initialized to zeros) whenever it encounters the index.

  • max_norm (float, optional) – If given, each embedding vector with norm larger than max_norm is renormalized to have norm max_norm (see the sketch after this list).

  • norm_type (float, optional) – The p of the p-norm to compute for the max_norm option. Default 2.

  • scale_grad_by_freq (bool, optional) – If given, this will scale gradients by the inverse of frequency of the words in the mini-batch. Default False.

  • sparse (bool, optional) – If True, gradient w.r.t. the weight matrix will be a sparse tensor. See Notes for more details regarding sparse gradients.
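For illustration, a minimal sketch of the max_norm behavior described above (the weight values written into row 0 are made up for the example; looking the row up renormalizes it in place):

>>> import torch
>>> import torch.nn as nn
>>> embedding = nn.Embedding(5, 3, max_norm=1.0)
>>> with torch.no_grad():
...     embedding.weight[0] = torch.tensor([3.0, 4.0, 0.0])  # norm 5.0 > max_norm
...
>>> out = embedding(torch.LongTensor([0]))
>>> # row 0 has been renormalized in place to [0.6, 0.8, 0.0], i.e. norm 1.0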

Variables

~Embedding.weight (Tensor) – the learnable weights of the module of shape (num_embeddings, embedding_dim), initialized from \mathcal{N}(0, 1)
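For illustration, a minimal sketch of inspecting and overwriting weight (the pretrained tensor here is a random stand-in for real pretrained vectors):

>>> import torch
>>> import torch.nn as nn
>>> embedding = nn.Embedding(10, 3)
>>> embedding.weight.shape
torch.Size([10, 3])
>>> pretrained = torch.randn(10, 3)  # stand-in for real pretrained vectors
>>> with torch.no_grad():
...     embedding.weight.copy_(pretrained)
...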

Shape:

  • Input: (*), LongTensor of arbitrary shape containing the indices to extract

  • Output: (*, H), where * is the input shape and H = embedding_dim

NOTE

Keep in mind that only a limited number of optimizers support sparse gradients: currently optim.SGD (CUDA and CPU), optim.SparseAdam (CUDA and CPU), and optim.Adagrad (CPU).
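For illustration, a minimal sketch of training with sparse=True, using optim.SparseAdam as one of the supported optimizers (the learning rate here is arbitrary):

>>> import torch
>>> import torch.nn as nn
>>> import torch.optim as optim
>>> embedding = nn.Embedding(10, 3, sparse=True)
>>> optimizer = optim.SparseAdam(embedding.parameters(), lr=0.01)
>>> loss = embedding(torch.LongTensor([1, 2, 4, 5])).sum()
>>> loss.backward()
>>> embedding.weight.grad.is_sparse  # only the looked-up rows carry gradient
True
>>> optimizer.step()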

NOTE

With padding_idx set, the embedding vector at padding_idx is initialized to all zeros. However, note that this vector can be modified afterwards, e.g., using a customized initialization method, and thus changing the vector used to pad the output. The gradient for this vector from Embedding is always zero.
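For illustration, a minimal sketch of both points in this note: the padding row can be overwritten by hand, yet its gradient from Embedding stays zero:

>>> import torch
>>> import torch.nn as nn
>>> embedding = nn.Embedding(10, 3, padding_idx=0)
>>> with torch.no_grad():
...     embedding.weight[0] = torch.ones(3)  # customized padding vector
...
>>> embedding(torch.LongTensor([0, 2])).sum().backward()
>>> embedding.weight.grad[0]  # the padding row never receives gradient
tensor([0., 0., 0.])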

Examples:

>>> # an Embedding module containing 10 tensors of size 3
>>> embedding = nn.Embedding(10, 3)
>>> # a batch of 2 samples of 4 indices each
>>> input = torch.LongTensor([[1,2,4,5],[4,3,2,9]])
>>> embedding(input)
tensor([[[-0.0251, -1.6902,  0.7172],
         [-0.6431,  0.0748,  0.6969],
         [ 1.4970,  1.3448, -0.9685],
         [-0.3677, -2.7265, -0.1685]],

        [[ 1.4970,  1.3448, -0.9685],
         [ 0.4362, -0.4004,  0.9400],
         [-0.6431,  0.0748,  0.6969],
         [ 0.9124, -2.3616,  1.1151]]])


>>> # example with padding_idx
>>> embedding = nn.Embedding(10, 3, padding_idx=0)
>>> input = torch.LongTensor([[0,2,0,5]])
>>> embedding(input)
tensor([[[ 0.0000,  0.0000,  0.0000],
         [ 0.1535, -2.0309,  0.9315],
         [ 0.0000,  0.0000,  0.0000],
         [-0.1655,  0.9897,  0.0635]]])

 

Explanation:

embedding = nn.Embedding(10, 3)

The first argument, 10, is the size of the dictionary (the total number of words in the vocabulary).

The second argument, 3, is the embedding dimensionality: each index is embedded into a 3-dimensional vector.

 

input = torch.LongTensor([[1,2,4,5],[4,3,2,9]])
embs = embedding(input)

The shape of input here is not restricted, as shown in the sketch below.
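For illustration, a minimal sketch with a 3-D index tensor; the output simply appends embedding_dim as a trailing dimension:

>>> import torch
>>> import torch.nn as nn
>>> embedding = nn.Embedding(10, 3)
>>> input = torch.LongTensor([[[1, 2], [4, 5]], [[4, 3], [2, 9]]])  # shape (2, 2, 2)
>>> embedding(input).shape
torch.Size([2, 2, 2, 3])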

 
