从NN到CNN到RNN

最新推荐文章于 2024-03-04 14:05:35 发布

麻雀2025

最新推荐文章于 2024-03-04 14:05:35 发布

阅读量215

点赞数

文章标签： tensorflow 深度学习

本文链接：https://blog.csdn.net/qq_43208303/article/details/124409524

版权

本文详细介绍了RNN、LSTM在深度学习中的工作原理和应用场景，包括RNN的时间步概念、LSTM的门控机制以及如何在实际问题中使用动态RNN和多层LSTM。通过实例展示了如何构建和运行多层LSTM模型，并讨论了不同RNN结构的输出和状态管理。

摘要由CSDN通过智能技术生成

title: Group Normalization 论文笔记
categories:

论文笔记
tags: 实习
date: 2018-11-28 21:31:55

输入数据层面

以minist数据为例，输入[55000, 784]，第一维为batchsize，中间层-1代表可自动调整

对NN来说，不看batch维，输入为[-1, 784]，假设1隐层625节点，用W[784, 625]来调节，即W后一个数代表隐层节点数，传入下一层的维度由7841变为6251减少了。
对CNN来说，同样是784维数据，由于后续要用不同的Kernal，我们把数据延展成3维度，[batch, row, col, kernal]，即平面卷积操作相当于一次少部分的全连接
```
trX, trY, teX, teY = mnist.train.images, mnist.train.labels, mnist.test.images, mnist.test.labels
trX = trX.reshape(-1, 28, 28, 1)  # 28x28x1 input img
```
对RNN来说，独有的时间步，也是对一维向量输入做二维延展，[batch, timestep_size, input_size] ，相当于每行一个时间步输入，每个时间步节点看做是NN的节点，包括输入x和输出h，处理过后的h即下一层的输入，h的维度即隐层节点数，也是通过W改变的，多层的RNN对比多层的NN，就是输入不仅接收当前的输入，还要接受隐层神经元前一时刻的输入，注意input_size是按行抽象后的输入信息。

LSTM实现MINIST例子：https://blog.csdn.net/Jerr__y/article/details/61195257, 参数解析

RNN中的call函数：https://www.twblogs.net/a/5ca59985bd9eee5b1a072277

我們通常是將一個batch送入模型計算，設輸入數據的形狀爲 [batch_size, input_size]，那麼計算時得到的隱層狀態就是[batch_size, state_size]，虽然我们在输入时进行了扩维度(抽出时间步维度），但输出时仍组合为类似输入时的一维(一般取时间序列上的最后一个状态为输出，例如mnist相当于读了28行像素整个图片后，由于RNN可以横向累计之前算的结果，我们取最后一个state再softmax得分类结果)，輸出就是[batch_size, output_size]。

def call(self, inputs, state):
    """Most basic RNN: output = new_state = act(W * input + U * state + B)."""
    output = self._activation(_linear([inputs, state], self._num_units, True))
    return output, output

這句“return output, output”說明在BasicRNNCell中，output其實和隱狀態的值是一樣的。因此，我們還需要額外對輸出定義新的變換，才能得到圖中真正的輸出y。由於output和隱狀態是一回事，所以在BasicRNNCell中，state_size永遠等於output_size，如果输出y可再进行一次线性变换。

'''code: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py'''

  def __call__(self, inputs, state, scope=None):
      """Long short-term memory cell (LSTM)."""
      with vs.variable_scope(scope or "basic_lstm_cell"):
          # Parameters of gates are concatenated into one multiply for efficiency.
          if self._state_is_tuple:
              c, h = state
          else:
              c, h = array_ops.split(value=state, num_or_size_splits=2, axis=1)
          concat = _linear([inputs, h], 4 * self._num_units, True, scope=scope)

          # ** 下面四个 tensor，分别是四个 gate 对应的权重矩阵
          # i = input_gate, j = new_input, f = forget_gate, o = output_gate
          i, j, f, o = array_ops.split(value=concat, num_or_size_splits=4, axis=1)

          # ** 更新 cell 的状态： 
          # ** c * sigmoid(f + self._forget_bias) 是保留上一个 timestep 的部分旧信息
          # ** sigmoid(i) * self._activation(j)  是有当前 timestep 带来的新信息
          new_c = (c * sigmoid(f + self._forget_bias) + sigmoid(i) *
               self._activation(j))

          # ** 新的输出
          new_h = self._activation(new_c) * sigmoid(o)

          if self._state_is_tuple:
              new_state = LSTMStateTuple(new_c, new_h)
          else:
              new_state = array_ops.concat([new_c, new_h], 1)
          # ** 在(一般都是) state_is_tuple=True 情况下， new_h=new_state[1]
          # ** 在上面博文中，就有 cell_output = state[1]
          return new_h, new_state

我們只需要關注self._state_is_tuple == True的情況，因爲self._state_is_tuple == False的情況將在未來被棄用。返回的隱狀態是new_c和new_h的組合，而output就是單獨的new_h。如果我們處理的是分類問題，那麼我們還需要對new_h添加單獨的Softmax層才能得到最後的分類概率輸出。

即最后还是返回output和state，但由于LSTM中有cell加上h相当于两个状态，因此state = tuple(cell, hidden)。

一次執行多步：tf.nn.dynamic_rnn

基礎的RNNCell，我們使用它的call函數進行運算時，只是在序列時間上前進了一步。比如使用x1、h0得到h1，通過x2、h1得到h2等。這樣的h話，如果我們的序列長度爲10，就要調用10次call函數，比較麻煩，每次使用上次的结果，就不用保存中间step的结果。

TensorFlow提供了一個tf.nn.dynamic_rnn函數，使用該函數就相當於調用了n次call函數。即通過{h0,x1, x2, …., xn}直接得{h1,h2…,hn}，且这里每次的中间步hi都会保存，即输出维度从[batch, output_size]增加成[batch, time_step, output_size]。

設我們輸入數據的格式爲 [ batch_size, time_steps, input_size ]，其中time_steps表示序列本身的長度。最後的input_size就表示輸入數據單個序列單個時間維度上固有的長度（我们也可以人为切分time_step)。另外我們已經定義好了一個RNNCell，調用該RNNCell的call函數time_steps次:

# inputs: shape = (batch_size, time_steps, input_size) 
# cell: RNNCell
# initial_state: shape = (batch_size, cell.state_size)。初始狀態。一般可以取零矩陣
#outputs, state = tf.nn.dynamic_rnn(cell, inputs, initial_state=initial_state)

import tensorflow as tf
import numpy as np
from tensorflow.python.ops import variable_scope as vs

output_size = 5   #  输出的神经元
batch_size = 4   #  输入的 x 的长度
time_step = 3
dim = 3   #  输入x 的宽度
cell = tf.nn.rnn_cell.LSTMCell(num_units=output_size)
inputs = tf.placeholder(dtype=tf.float32, shape=[time_step, batch_size, dim])
h0 = cell.zero_state(batch_size=batch_size, dtype=tf.float32) #此处调用RNNcell.里的函数初始状态仅提供batch即可，num_units自动计算
X = np.array([[[1, 2, 1], [2, 0, 0], [2, 1, 0], [1, 1, 0]],  # x1
              [[1, 2, 1], [2, 0, 0], [2, 1, 0], [1, 1, 0]],  # x2
              [[1, 2, 1], [2, 0, 0], [2, 1, 0], [1, 1, 0]]])  # x3
outputs, final_state = tf.nn.dynamic_rnn(cell, inputs, initial_state=h0, time_major=True)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
rnn_output, rnn_final_state = sess.run([outputs, final_state], feed_dict={inputs:X})
print(rnn_output)#[time_step, batch, output_dim],对比输入为[time_step, batch, input_dim]
print(rnn_final_state)#outputs里包含了(outputs1,outputs2,outputs3)3*4*5 而final_stat就只是h3 ，并且outputs3,h3 是相等的。

此時，得到的outputs就是time_steps步裏所有的輸出，注意time_major=True，则它的形狀爲 [ time_steps, batch_size, cell.output_size ]。state是最後一步的隱狀態，它的形狀爲 [batch_size, cell.state_size]。

堆疊RNNCell：MultiRNNCell

outputs’ is a tensor of shape [batch_size, max_time, cell.output_size]
‘state’ is a N-tuple where N is the number of LSTMCells containing a tf.contrib.rnn.LSTMStateTuple for each cell
state是最後一步的隱狀態，有多少层把他们组合成一个tuple([batch, last_state]*n)

import tensorflow as tf
import numpy as np

batch_size = 10
depth = 128
num_units = [100, 400, 300]

cells = [tf.nn.rnn_cell.BasicLSTMCell(num_unit) for num_unit in num_units]
print(cells)
mul_cells = tf.nn.rnn_cell.MultiRNNCell(cells)

previous_state0 = (tf.random_normal([batch_size, 100]), tf.random_normal([batch_size, 100]))
previous_state1 = (tf.random_normal([batch_size, 400]), tf.random_normal([batch_size, 400]))
previous_state2 = (tf.random_normal([batch_size, 300]), tf.random_normal([batch_size, 300]))

inputs = tf.Variable(tf.random_normal([batch_size, depth]))#depth=dims*time_step

outputs, states = mul_cells.call(inputs=inputs, state=(previous_state0, previous_state1, previous_state2))
    
print(outputs)#只输出最后一层的结果，加了dynamic后增加time_step维度
print(states)#只输出最后一个状态，加了muti后输出每层最后一个状态，以tuple形式多了layer_num维度((c,h),(c,h),(c,h))

将一次执行多步和多层结合构件深层LSTM

注意***是基于上述代码增加的部分

import tensorflow as tf
import numpy as np
from tensorflow.python.ops import variable_scope as vs

output_size = 5   #  输出的神经元
batch_size = 4   #  输入的 x 的长度
time_step = 3
dim = 3   #  输入x 的宽度
cell = tf.nn.rnn_cell.LSTMCell(num_units=output_size)

### ***添加单元，并用MuliRNNCell“组合”起来
cell2 = tf.nn.rnn_cell.LSTMCell(num_units=output_size)
TwoCell = tf.nn.rnn_cell.MultiRNNCell([cell, cell2])

inputs = tf.placeholder(dtype=tf.float32, shape=[time_step, batch_size, dim])
h0 = cell.zero_state(batch_size=batch_size, dtype=tf.float32)

### ***添加每层初状态，不用改变输入，因为中间层的输入是前层的输出，同理加层数时输出矩阵不变，状态矩阵增加层维度
h1 = cell.zero_state(batch_size=batch_size, dtype=tf.float32)
Two_h = (h0, h1)#不能用tuple(h0, h1) 这个函数接收一个参数(数组或dict转tuple)

X = np.array([[[1, 2, 1], [2, 0, 0], [2, 1, 0], [1, 1, 0]],  # x1
              [[1, 2, 1], [2, 0, 0], [2, 1, 0], [1, 1, 0]],  # x2
              [[1, 2, 1], [2, 0, 0], [2, 1, 0], [1, 1, 0]]])  # x3
outputs, final_state = tf.nn.dynamic_rnn(TwoCell, inputs, initial_state=(h0, h1), time_major=True)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
rnn_output, rnn_final_state = sess.run([outputs, final_state], feed_dict={inputs:X})
print(rnn_output)#[time_step, batch, output_dim],对比输入为[time_step, batch, input_dim]
print(rnn_final_state)#outputs里包含了(outputs1,outputs2,outputs3)3*4*5 而final_stat就只是h3 ，并且outputs3,h3 是相等的。

麻雀2025

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
从NN到CNN到RNN

title: Group Normalization 论文笔记categories:论文笔记tags: 实习date: 2018-11-28 21:31:55输入数据层面以minist数据为例，输入[55000, 784]，第一维为batchsize，中间层-1代表可自动调整对NN来说，不看batch维，输入为[-1, 784]，假设1隐层625节点，用W[784, 625]来调节，即W后一个数代表隐层节点数，传入下一层的维度由7841变为6251减少了。对CNN来说，同样是.
复制链接

扫一扫