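RNNs are mainly used for inputs whose elements are connected in sequence,
for example weather, stock prices, and natural language (e.g. 我爱北京天安门, "I love Beijing Tiananmen").
Generating text from an image: CNN + RNN.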
Contents
3. Code example: sequence to sequence (seq -> seq), "hello" -> "ohlol"
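1. Why RNNs were introduced
Using plain linear layers alone on time-series problems runs into the following issues:
(1) Too many parameters. Fix: weight sharing.
(2) No context information.
This is why the RNN was introduced.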
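A rough parameter count makes point (1) concrete. This is only a sketch under one assumption: the "plain linear layer" flattens the whole sequence into a single vector and maps it to one hidden vector per time step. The function names and sizes are made up for illustration. The RNN cell count uses the same layout as torch.nn.RNNCell (two weight matrices and two bias vectors) and does not depend on seq_len, because the same weights are reused at every time step.

```python
def dense_params(seq_len, input_size, hidden_size):
    """Weights + biases of one linear layer over the flattened sequence (assumed setup)."""
    in_features = seq_len * input_size
    out_features = seq_len * hidden_size
    return in_features * out_features + out_features


def rnn_cell_params(input_size, hidden_size):
    """W_ih, W_hh, b_ih, b_hh of a single RNN cell, shared across all time steps."""
    return input_size * hidden_size + hidden_size * hidden_size + 2 * hidden_size


# Hypothetical sizes, chosen only for illustration
for seq_len in (10, 100):
    print(seq_len, dense_params(seq_len, 10, 20), rnn_cell_params(10, 20))
# 10 20200 640
# 100 2002000 640
```

The dense layer grows with the square of the sequence length, while the shared RNN cell stays at 640 parameters.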
RNN diagram
The left diagram unrolls into the right diagram.
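In formula form, the unrolled RNN applies the same update at every time step. The version below assumes the tanh activation and the two-bias layout that torch.nn.RNN uses by default. The symbol names follow PyTorch's parameter names and are not taken from the original diagram.

```latex
h_t = \tanh\left(W_{ih}\, x_t + b_{ih} + W_{hh}\, h_{t-1} + b_{hh}\right), \qquad t = 1, \dots, \text{seq\_len}
```

Here x_t is the input at step t, h_{t-1} is the hidden state carried over from the previous step (this is where the context comes from), and W_{ih}, W_{hh}, b_{ih}, b_{hh} are shared by all steps.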
2. Code approach 1: RNNCell
Main parts of the code
Initialize input_size, batch_size, seq_len, hidden_size
input.shape=(batch_size,input_size)
hidden.shape=(batch_size,hidden_size)
dataset.shape=(seq_len,batch_size,input_size)
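A minimal sketch of how these shapes fit together with torch.nn.RNNCell. The sizes are illustrative values, not the ones used in the example further down. Because RNNCell processes a single time step, the loop over the sequence is written by hand.

```python
import torch

# Illustrative sizes (assumptions, chosen only for this sketch)
batch_size = 1
seq_len = 3
input_size = 4
hidden_size = 2

cell = torch.nn.RNNCell(input_size=input_size, hidden_size=hidden_size)

# dataset.shape = (seq_len, batch_size, input_size)
dataset = torch.randn(seq_len, batch_size, input_size)
# hidden.shape = (batch_size, hidden_size); start from an all-zero state
hidden = torch.zeros(batch_size, hidden_size)

# Iterating over the first dimension yields one time step at a time
for idx, inputs in enumerate(dataset):
    # inputs.shape = (batch_size, input_size)
    hidden = cell(inputs, hidden)
    print(idx, tuple(inputs.shape), tuple(hidden.shape))
```

The hidden state returned at one step is fed back in at the next step, which is exactly the unrolling shown in the diagram.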
Code approach 2: RNN
1. Main parts of the code
Initialize input_size, batch_size, seq_len, hidden_size, num_layers
input.shape=(seq_len,batch_size,input_size)
output.shape=(seq_len,batch_size,hidden_size)
hidden.shape=(num_layers,batch_size,hidden_size)
2. num_layers: the number of stacked RNN layers
num_layers=1
num_layers=3
Cells of different colors are different linear layers, so with num_layers=3 there are three of them.
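A short sketch of torch.nn.RNN with num_layers=3, to check the shapes listed above and to see that each layer has its own weights. The sizes are again illustrative assumptions. Unlike RNNCell, torch.nn.RNN consumes the whole sequence in one call.

```python
import torch

# Illustrative sizes (assumptions, chosen only for this sketch)
batch_size = 1
seq_len = 3
input_size = 4
hidden_size = 2
num_layers = 3

rnn = torch.nn.RNN(input_size=input_size,
                   hidden_size=hidden_size,
                   num_layers=num_layers)

inputs = torch.randn(seq_len, batch_size, input_size)
hidden = torch.zeros(num_layers, batch_size, hidden_size)

out, hidden = rnn(inputs, hidden)
print(tuple(out.shape))     # (seq_len, batch_size, hidden_size)
print(tuple(hidden.shape))  # (num_layers, batch_size, hidden_size)

# Each layer owns its parameters: weight_ih_l0, weight_hh_l0, ..., bias_hh_l2
for name, param in rnn.named_parameters():
    print(name, tuple(param.shape))
```

out collects the top layer's hidden state at every time step, while hidden holds the final state of every layer, so out[-1] matches hidden[-1]. Only layer 0 sees input_size features; the layers above take the hidden_size outputs of the layer below.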
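3. Code example: sequence to sequence (seq -> seq), "hello" -> "ohlol"
Character vectorization: build a dictionary from the characters, assign each character an index, use the dictionary to turn the string into a sequence of indices, then turn each index into a one-hot vector.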
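The vectorization steps can be sketched in plain Python. The helper names build_vocab and one_hot are made up for this illustration. Sorting the characters is an assumption that happens to give the same dictionary order ('e', 'h', 'l', 'o') as the script that follows.

```python
def build_vocab(text):
    """Assign each distinct character an index (sorted for a stable order)."""
    idx2char = sorted(set(text))
    char2idx = {ch: i for i, ch in enumerate(idx2char)}
    return idx2char, char2idx


def one_hot(index, size):
    """Return a list of length `size` with a 1 at position `index`."""
    return [1 if i == index else 0 for i in range(size)]


idx2char, char2idx = build_vocab("hello")
x_data = [char2idx[ch] for ch in "hello"]   # input sequence as indices
y_data = [char2idx[ch] for ch in "ohlol"]   # target sequence as indices
x_one_hot = [one_hot(i, len(idx2char)) for i in x_data]

print(idx2char)      # ['e', 'h', 'l', 'o']
print(x_data)        # [1, 0, 2, 2, 3]
print(y_data)        # [3, 1, 2, 3, 2]
print(x_one_hot[0])  # [0, 1, 0, 0]
```

Only the inputs are one-hot encoded. The targets stay as class indices, which is the form torch.nn.CrossEntropyLoss expects.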
RNN code
import torch
import matplotlib.pyplot as plt

# 1. Prepare the data
input_size = 4   # 4 distinct characters
hidden_size = 4
num_layers = 1
batch_size = 1
seq_len = 5      # 5 characters in the sequence
idx2char = ['e', 'h', 'l', 'o']
x_data = [1, 0, 2, 2, 3]  # "hello"
y_data = [3, 1, 2, 3, 2]  # "ohlol"
one_hot_lookup = [[1, 0, 0, 0],
                  [0, 1, 0, 0],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]]
x_one_hot = [one_hot_lookup[x] for x in x_data]
# x_one_hot: [seq_len, input_size]
inputs = torch.Tensor(x_one_hot).view(seq_len, batch_size, input_size)
labels = torch.LongTensor(y_data)

# 2. Define the model
class Model(torch.nn.Module):
    def __init__(self, input_size, hidden_size, batch_size, num_layers=1):
        super(Model, self).__init__()
        self.num_layers = num_layers
        self.batch_size = batch_size
        self.hidden_size = hidden_size
        self.input_size = input_size
        self.rnn = torch.nn.RNN(input_size=self.input_size,
                                hidden_size=self.hidden_size,
                                num_layers=self.num_layers)

    def forward(self, input):
        # Initial hidden state h0: (num_layers, batch_size, hidden_size)
        hidden = torch.zeros(self.num_layers, self.batch_size, self.hidden_size)
        out, _ = self.rnn(input, hidden)
        # out.shape: (seq_len, batch_size, hidden_size), here torch.Size([5, 1, 4])
        # Flatten to (seq_len * batch_size, hidden_size) for CrossEntropyLoss.
        # Wrong: out.view(self.seq_len * self.batch_size, self.hidden_size),
        #        because self.seq_len is never set on the model.
        # Right: out.view(-1, self.hidden_size), which infers the first dimension.
        out = out.view(-1, self.hidden_size)
        # out.shape: torch.Size([5, 4])
        return out

net = Model(input_size, hidden_size, batch_size, num_layers)

# 3. Loss and optimizer
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.05)

# 4. Training loop
num_epochs = 10
loss_list = []
epoch_list = []
for epoch in range(num_epochs):
    optimizer.zero_grad()
    outputs = net(inputs)  # (seq_len * batch_size, hidden_size)
    loss = criterion(outputs, labels)
    epoch_list.append(epoch)
    loss_list.append(loss.item())
    loss.backward()
    optimizer.step()

    # Predicted class per time step = index of the largest score
    _, idx = outputs.max(dim=1)
    # e.g. idx = tensor([0, 0, 2, 3, 3]), idx.shape = torch.Size([5])
    idx = idx.numpy()
    # e.g. idx = [0 0 2 3 3], idx.shape = (5,)
    print('Predicted:', ''.join([idx2char[x] for x in idx]), end='')
    print(', Epoch[%d/%d] loss=%.3f' % (epoch + 1, num_epochs, loss.item()))

# 5. Plot the loss curve
plt.plot(epoch_list, loss_list)
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.show()
Result