Building an RNN Directly with NumPy, and Using Keras's SimpleRNN for the IMDB Movie Review Project

Dream_Bri

Last modified 2022-10-30 10:23:45

Tags: keras, deep learning, python

First published 2022-10-26 11:11:19

Article link: https://blog.csdn.net/ximu__l/article/details/127487329

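Summary: this article first gives a brief introduction to recurrent neural networks, then covers three topics:
1. Implementing a simple RNN with Python's NumPy library;
2. Implementing an RNN with SimpleRNN in Keras;
3. Applying Keras's SimpleRNN to the IMDB movie review project.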

Overview

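Many neural networks, such as densely connected networks and convolutional neural networks, share one property: they have no memory. Each input is processed independently, and no state is kept between inputs. To process a sequence of data points or a time series with such a network, you have to show it the entire sequence at once, that is, turn the whole sequence into a single data point and process it in one go.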

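When people actually read, however, they move forward while keeping what they have already read in mind. The question is how to give a model the same ability: to build its understanding from past information and keep updating it as new information comes in.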

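A recurrent neural network (RNN) adopts the same principle, in a greatly simplified form: it processes a sequence by iterating over its elements and maintaining a state that holds information about what it has seen so far. In effect, an RNN is a type of neural network with an internal loop. The state is reset between two different, independent sequences, so a sequence can still be treated as a single data point, i.e., a single input to the network. What changes is that this data point is no longer processed in a single step; instead, the network loops internally over the sequence elements.
Figure: A recurrent network: a network with a loop
We implement the forward pass of a simple RNN with NumPy. The input to the RNN is a sequence of vectors, which we encode as a 2D tensor of shape (timesteps, input_features).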

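The network loops over the time steps. At each step it combines the current input with the current state to produce an output, and the state for the next step is then set to that output. For the first time step there is no previous output, so the network needs an initial state.

The RNN forward pass can be written as the following simple pseudocode: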

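# Pseudocode: activation and dot stand for an activation function and a dot product
state_t = 0  # initial state
for input_t in input_sequence:
    output_t = activation(dot(W, input_t) + dot(U, state_t) + b)
    state_t = output_t  # the output becomes the state for the next step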

Implementing a simple RNN with NumPy

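import numpy as np

# Number of time steps in the input sequence
timesteps = 100
# Dimensionality of the input feature space
input_features = 32
# Dimensionality of the output feature space
output_features = 64
# Input data: random noise, for the sake of the example
inputs = np.random.random((timesteps, input_features))
# Initial state: an all-zero vector
state_t = np.zeros((output_features,))
# Create random weight matrices and a random bias vector
W = np.random.random((output_features, input_features))
U = np.random.random((output_features, output_features))
b = np.random.random((output_features,))
successive_outputs = []
for input_t in inputs:
    # Combine the current input with the current state to get the output
    output_t = np.tanh(np.dot(W, input_t) + np.dot(U, state_t) + b)
    # Store the output in a list
    successive_outputs.append(output_t)
    # Update the network state for the next time step
    state_t = output_t
# The final output is a 2D tensor of shape (timesteps, output_features)
final_output_sequence = np.stack(successive_outputs, axis=0)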

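In short, an RNN is a for loop that reuses the quantities computed during the previous iteration of the loop, nothing more.
Figure: A simple RNN, unrolled over time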

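Implementing an RNN with the SimpleRNN recurrent layer in Keras

The simple NumPy implementation in the previous section corresponds to an actual Keras layer, the SimpleRNN layer.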

from keras.layers import SimpleRNN

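There is one small difference between the two: like every other Keras layer, SimpleRNN processes batches of sequences, not a single sequence as in the NumPy example. It therefore takes a 3D tensor of shape (batch_size, timesteps, input_features) as input, not (timesteps, input_features).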
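
To make the batching difference concrete, here is a minimal NumPy sketch that extends the earlier loop to a batch of sequences. The function name batched_rnn_forward and the small weight scale are assumptions chosen for illustration; this is not how Keras implements SimpleRNN internally.

```python
import numpy as np

def batched_rnn_forward(inputs, W, U, b):
    """Run a simple tanh RNN over a batch of sequences (illustrative helper).

    inputs: (batch_size, timesteps, input_features)
    W: (output_features, input_features)
    U: (output_features, output_features)
    b: (output_features,)
    Returns an array of shape (batch_size, timesteps, output_features).
    """
    batch_size, timesteps, _ = inputs.shape
    output_features = b.shape[0]
    # Every sequence in the batch starts from its own all-zero state
    state = np.zeros((batch_size, output_features))
    outputs = []
    for t in range(timesteps):
        x_t = inputs[:, t, :]  # (batch_size, input_features)
        # Same update as the single-sequence loop, applied to each row
        state = np.tanh(x_t @ W.T + state @ U.T + b)
        outputs.append(state)
    # Stack the per-step outputs along a new time axis
    return np.stack(outputs, axis=1)

rng = np.random.default_rng(0)
batch = rng.random((4, 100, 32))           # 4 sequences, 100 steps, 32 features
W = rng.normal(scale=0.1, size=(64, 32))   # assumed small random weights
U = rng.normal(scale=0.1, size=(64, 64))
b = np.zeros(64)
print(batched_rnn_forward(batch, W, U, b).shape)  # (4, 100, 64)
```

Each row of the batch carries its own state, so the sequences do not influence one another; the only change from the single-sequence loop is the extra leading batch axis.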

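Like all recurrent layers in Keras, SimpleRNN can run in two different modes: it can return either the full sequence of successive outputs for each time step, a 3D tensor of shape (batch_size, timesteps, output_features), or only the final output for each input sequence, a 2D tensor of shape (batch_size, output_features). The mode is controlled by the return_sequences constructor argument.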
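
The two modes can be checked by calling the layer on a small random batch. This is a minimal sketch: the batch size, sequence length, feature count, and the 16 units are arbitrary values picked for the example, and it assumes a Keras installation in which layers can be called directly on NumPy arrays.

```python
import numpy as np
from keras.layers import SimpleRNN

# A toy batch: 4 sequences, 10 time steps, 8 features per step
x = np.random.random((4, 10, 8)).astype("float32")

# Default mode: only the output of the last time step is returned
last_only = SimpleRNN(16)(x)
# return_sequences=True: the output of every time step is returned
full_sequence = SimpleRNN(16, return_sequences=True)(x)

print(tuple(last_only.shape))      # (4, 16)
print(tuple(full_sequence.shape))  # (4, 10, 16)
```

When several recurrent layers are stacked, the intermediate layers generally need return_sequences=True so that the next layer receives a full sequence as input.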

Applying the Keras model to IMDB movie review classification

Preparing the IMDB data

from keras.datasets import imdb
from keras.preprocessing import sequence

# Number of words to consider as features (vocabulary size)
max_features = 10000
# Keep at most maxlen words per review; longer reviews are truncated
maxlen = 500
batch_size = 32
print('Loading data...')
(input_train, y_train), (input_test, y_test) = imdb.load_data(
    num_words=max_features)
print(len(input_train), 'train sequences')
print(len(input_test), 'test sequences')
print('Pad sequences (samples x time)')
input_train = sequence.pad_sequences(input_train, maxlen=maxlen)
input_test = sequence.pad_sequences(input_test, maxlen=maxlen)
print('input_train shape:', input_train.shape)
print('input_test shape:', input_test.shape)
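
To show what the padding step does to sequences of different lengths, here is a small NumPy sketch. The helper pad_pre is hypothetical and only mimics what I understand to be the default behavior of pad_sequences (zeros added at the front of short sequences, items dropped from the front of long ones); it is not the Keras implementation.

```python
import numpy as np

def pad_pre(seqs, maxlen, value=0):
    """Hypothetical helper: left-pad short sequences, keep the last maxlen items of long ones."""
    out = np.full((len(seqs), maxlen), value, dtype="int32")
    for i, seq in enumerate(seqs):
        if not seq:
            continue  # an empty sequence stays all padding
        trimmed = seq[-maxlen:]            # drop items from the front if too long
        out[i, -len(trimmed):] = trimmed   # right-align, so padding stays on the left
    return out

print(pad_pre([[1, 2, 3], [4, 5, 6, 7, 8, 9]], maxlen=4))
# [[0 1 2 3]
#  [6 7 8 9]]
```

After this step every review has the same length, so the whole dataset fits in a single 2D integer array of shape (samples, maxlen).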

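Training a model with an Embedding layer and a SimpleRNN layer

We train a simple recurrent network made of an Embedding layer and a SimpleRNN layer.
For an introduction to the Embedding layer and how to use it, see the article "Text vectorization and Embedding in text processing".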

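from keras.models import Sequential
from keras.layers import Dense, Embedding, SimpleRNN

# max_features, input_train and y_train come from the data preparation step
model = Sequential()
# Map each word index to a 32-dimensional vector
model.add(Embedding(max_features, 32))
# Return only the final output for each sequence
model.add(SimpleRNN(32))
# A single sigmoid unit for binary classification
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
history = model.fit(input_train, y_train,
                    epochs=10,
                    batch_size=128,
                    validation_split=0.2)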
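
As a rough sanity check on the size of this model, the trainable parameters can be counted by hand. The sketch below assumes the usual parameterization: one vector per word index in the Embedding layer, an input matrix, a recurrent matrix, and a bias in the simple RNN, and weights plus one bias in the Dense layer. The function name simple_rnn_params is made up for the example.

```python
def simple_rnn_params(input_dim, units):
    # W: units x input_dim, U: units x units, b: units
    return units * input_dim + units * units + units

max_features, embedding_dim, units = 10000, 32, 32
embedding_params = max_features * embedding_dim   # one vector per word index
rnn_params = simple_rnn_params(embedding_dim, units)
dense_params = units * 1 + 1                      # weights plus one bias

print(embedding_params, rnn_params, dense_params)    # 320000 2080 33
print(embedding_params + rnn_params + dense_params)  # 322113
```

Almost all of the parameters sit in the Embedding layer; the recurrent layer itself is small.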

Plotting the results

import matplotlib.pyplot as plt
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(acc) + 1)
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()