torch notes: understanding torch.nn.Embedding

DLst_liu

Published 2024-09-02 18:49:11


Article link: https://blog.csdn.net/qq_45022754/article/details/141826271


nn.Embedding(num_embeddings, embedding_dim)

num_embeddings: the size of the vocabulary (how many distinct indices the table holds)

embedding_dim: the dimension of each embedding vector

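For an input x with x.shape = (batch_size, seq_len), applying nn.Embedding(num_embeddings, embedding_dim) gives an output y with y.shape = (batch_size, seq_len, embedding_dim). The input must hold integer indices in the range [0, num_embeddings). The weights of nn.Embedding are learnable parameters that get updated during training. The whole pipeline can be viewed as token ---> index ---> vector.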
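To make the "learnable" part concrete, here is a minimal sketch. The sizes (200, 20, 4, 10) are borrowed from the example in this post, and the sum used as a loss is a placeholder chosen only to produce gradients, not something a real model would use. It checks that the weight table is an nn.Parameter and that only the rows that were actually looked up receive a nonzero gradient.

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(num_embeddings=200, embedding_dim=20)

# The lookup table is an nn.Parameter, so an optimizer can update it.
print(isinstance(embedding.weight, nn.Parameter))
print(embedding.weight.requires_grad)

# Token indices: 4 sentences, 10 tokens each, values in [0, 200).
x = torch.randint(0, 200, (4, 10))
y = embedding(x)  # shape (4, 10, 20)

# Placeholder loss (an assumption for illustration), just to get gradients.
loss = y.sum()
loss.backward()

# Rows of the table that received a nonzero gradient.
rows_with_grad = embedding.weight.grad.abs().sum(dim=1) > 0

# Rows whose index appeared somewhere in x.
used_rows = torch.zeros(200, dtype=torch.bool)
used_rows[x.unique()] = True

# The two sets of rows coincide: unused words are left untouched.
print(torch.equal(rows_with_grad, used_rows))
```

In an actual training loop the gradient would come from the model's real loss, but the same rule holds: a word that never appears in a batch does not have its vector changed by that batch's gradient.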

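For example, take an input x with x.shape = (batch_size=4, seq_len=10)

and a layer embedding = nn.Embedding(num_embeddings=200, embedding_dim=20), whose weight has shape (200, 20).

=====> embedding(x) =====> the output y has y.shape = (4, 10, 20)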

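In other words, the input x holds 4 sentences of 10 words each. nn.Embedding turns every word into a vector by looking it up in its weight table: each word has an index, and that index selects one row of the table. The table here stores vectors for 200 words, each of length 20, so the final output has y.shape = (batch_size=4, seq_len=10, embedding_dim=20).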
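The lookup described above can be sketched directly. In this small illustration (variable names such as manual and idx are made up for the example), the output of the layer is compared with plain indexing into embedding.weight, which shows that the forward pass is just row selection.

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(200, 20)

# 4 sentences of 10 word indices each, values in [0, 200).
x = torch.randint(0, 200, (4, 10))

# Forward pass through the layer.
out = embedding(x)

# The same result via plain tensor indexing into the weight table.
manual = embedding.weight[x]

print(out.shape)
print(torch.equal(out, manual))

# A single word: the first word of the first sentence.
idx = x[0, 0].item()
print(torch.equal(out[0, 0], embedding.weight[idx]))
```

Both comparisons should print True, since each output vector is simply the row of the weight table at the word's index.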

The PyTorch documentation describes it this way:

This module is often used to store word embeddings and retrieve them using indices. The input to the module is a list of indices, and the output is the corresponding word embeddings.


Understanding it through code:

Example 1:

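import torch
import torch.nn as nn

# Random integer word indices in [0, 200): 4 sentences, 10 words each
x = torch.randint(0, 200, (4, 10))

# Lookup table holding 200 vectors of length 20
embedding = nn.Embedding(200, 20)

out = embedding(x)

# Shape of the input
print("x.shape:", x.shape)
# Shape of the output
print("out.shape:", out.shape)
# Shape of the embedding weight table
print("embedding.weight.shape:", embedding.weight.shape)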

Output:

x.shape: torch.Size([4, 10])
out.shape: torch.Size([4, 10, 20])
embedding.weight.shape: torch.Size([200, 20])

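Example 2:

Given the input tensor input, the embedding layer looks up the vector for every index and returns a tensor of shape (2, 4, 3), where 2 is the number of samples, 4 is the number of indices in each sample, and 3 is the dimension of the vector for each index.

For instance, in Example 2 the index 1 in the input maps to the vector [-0.0251, -1.6902, 0.7172], the index 2 maps to [-0.6431, 0.0748, 0.6969], and so on, so the first sample [1, 2, 4, 5] comes out of the embedding layer as: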

[[-0.0251, -1.6902,  0.7172],
[-0.6431,  0.0748,  0.6969],
[ 1.4970,  1.3448, -0.9685],
[-0.3677, -2.7265, -0.1685]]

The second sample [4, 3, 2, 9] is handled in the same way.
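The code for Example 2 is not shown in the text, so the following is a sketch reconstructed from the description. The table size nn.Embedding(10, 3) is an assumption: the vector length 3 comes from the text, and 10 is the smallest vocabulary that admits the index 9. The name input_ids stands in for the tensor called input above. Because the weights are randomly initialized, the printed numbers will differ from the ones quoted above.

```python
import torch
import torch.nn as nn

# Assumed table size: 10 vectors of length 3 (indices 0..9 are valid).
embedding = nn.Embedding(10, 3)

# Two samples with four indices each, as described in the text.
input_ids = torch.tensor([[1, 2, 4, 5],
                          [4, 3, 2, 9]], dtype=torch.long)

out = embedding(input_ids)
print(out.shape)
print(out)

# Index 1 in the first sample selects row 1 of the weight table.
print(torch.equal(out[0, 0], embedding.weight[1]))

# Indices 2 and 4 occur in both samples, so both samples get the same vectors.
print(torch.equal(out[0, 1], out[1, 2]))  # index 2
print(torch.equal(out[0, 2], out[1, 0]))  # index 4
```

The last two checks show that the vector depends only on the index, not on where the index sits in the batch.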
