Word Embedding

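What is word embedding
  • Word embedding maps each word in the vocabulary to a vector in a space of a chosen dimension. Adding an embedding layer to a neural network means that an embedding matrix is learned while the whole network is trained. The rows of this matrix are the vectors of all the words in the vocabulary.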
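
The idea can be sketched with PyTorch's nn.Embedding: the layer holds one trainable weight matrix of shape (vocabulary size, embedding dimension), and a forward pass just selects rows of it by index. The sizes 5 and 8 and the variable names below are illustrative assumptions, not values from the post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions): 5 words in the vocabulary, 8-dimensional vectors.
layer = nn.Embedding(num_embeddings=5, embedding_dim=8)

# The embedding matrix is a trainable parameter with one row per word index.
weight_shape = tuple(layer.weight.shape)
is_trainable = layer.weight.requires_grad

# Looking up indices returns the corresponding rows of that matrix.
indices = torch.tensor([0, 3, 3])
vectors = layer(indices)
same_as_rows = torch.equal(vectors, layer.weight[indices])

print(weight_shape, is_trainable, same_as_rows)
```

Because the lookup is plain row selection, a repeated index always yields the same vector, and gradients from training flow back only into the rows that were selected.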

Code implementation

import torch
import torch.nn as nn
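# Define an embedding class that implements the word embedding layer
class embedding(nn.Module):
	def __init__(self,vocab,dim_size):
		"""Initializer. vocab: size of the vocabulary; dim_size: dimension of each word vector."""
		super().__init__()
		self.embedded=nn.Embedding(vocab,dim_size)
	def forward(self,input):
		"""input: a tensor of word indices, i.e. the text after each word has been mapped to its vocabulary index."""
		embedded_output=self.embedded(input)
		return embedded_output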
if __name__=='__main__':
	vocab=5
	dim_size=512
	embedded=embedding(vocab,dim_size)
	# Rows of different lengths cannot be stacked into one tensor:
	#x_input=torch.tensor([[0,1,2,3,4,0,4,3,2,1],[3,4]])
	# ValueError: expected sequence of length 10 at dim 1 (got 2)
	#x_input=torch.tensor([[0,1],[3,4]])
	# Every index must be smaller than vocab; the index 6 below is out of range:
	#x_input=torch.tensor([[0,1,2,3,4,0,4,3,2,1],[3,4,1,1,1,1,1,1,1,6]])
	# RuntimeError: index out of range: Tried to access index 6 out of table with 4 rows. at /pytorch/aten/src/TH/generic/THTen
	x_input=torch.tensor([[0,1],[3,4],[2,3],[2,3],[0,0],[1,1]])
	embedded_output=embedded(x_input)
	print(embedded_output)
	print(embedded_output.size())

Output
tensor([[[ 1.5128,  0.8534, -0.6737,  ..., -0.4364,  0.5552, -1.0452],
         [ 0.9564, -0.5121,  0.0598,  ..., -1.0514,  1.6454, -0.0367]],

        [[ 0.3165, -0.1220,  1.2152,  ..., -0.8849, -0.6818,  1.1256],
         [ 0.2173,  0.2434,  0.7883,  ..., -0.1016,  0.6637, -0.6564]],

        [[-1.4701, -0.2199, -1.1331,  ...,  2.8803, -1.2399, -1.7725],
         [ 0.3165, -0.1220,  1.2152,  ..., -0.8849, -0.6818,  1.1256]],

        [[-1.4701, -0.2199, -1.1331,  ...,  2.8803, -1.2399, -1.7725],
         [ 0.3165, -0.1220,  1.2152,  ..., -0.8849, -0.6818,  1.1256]],

        [[ 1.5128,  0.8534, -0.6737,  ..., -0.4364,  0.5552, -1.0452],
         [ 1.5128,  0.8534, -0.6737,  ..., -0.4364,  0.5552, -1.0452]],

        [[ 0.9564, -0.5121,  0.0598,  ..., -1.0514,  1.6454, -0.0367],
         [ 0.9564, -0.5121,  0.0598,  ..., -1.0514,  1.6454, -0.0367]]],
       grad_fn=<EmbeddingBackward>)
torch.Size([6, 2, 512])
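
The ValueError in the commented-out input comes from rows of different lengths. One common way around it, sketched here as a suggestion rather than something the original code does, is to pad the shorter sequences with torch.nn.utils.rnn.pad_sequence and tell the embedding layer which index is padding. Reserving index 0 for padding (so real words use 1 to 4) is an assumption made for this sketch; the code above uses 0 as an ordinary word index.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence

# Two sequences of different lengths, similar to the commented-out example.
sequences = [
    torch.tensor([1, 2, 3, 4, 1, 4, 3, 2, 1, 2]),
    torch.tensor([3, 4]),
]

# Assumption: index 0 is reserved for padding, so real words use 1..4.
PAD = 0
batch = pad_sequence(sequences, batch_first=True, padding_value=PAD)

# padding_idx keeps the padding row at zero and excludes it from gradient updates.
layer = nn.Embedding(num_embeddings=5, embedding_dim=512, padding_idx=PAD)
out = layer(batch)

print(batch.shape)  # shape of the padded index tensor
print(out.shape)    # shape after the embedding lookup
```

The padded positions map to all-zero vectors, so later layers can mask them out or ignore them.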
