This week's exercises were quite heavy for me, mainly because I am not yet comfortable with Python and ran short on time; I only started watching the videos after the reminder that they were about to expire, so the optional part is not finished.
Building your Recurrent Neural Network - Step by Step
Welcome to Course 5's first assignment! In this assignment, you will implement your first Recurrent Neural Network in numpy.
Recurrent Neural Networks (RNN) are very effective for Natural Language Processing and other sequence tasks because they have "memory". They can read inputs $x^{\langle t \rangle}$ (such as words) one at a time, and remember some information/context through the hidden layer activations that get passed from one time-step to the next. This allows a uni-directional RNN to take information from the past to process later inputs. A bidirectional RNN can take context from both the past and the future.
Notation:
Superscript $[l]$ denotes an object associated with the $l^{th}$ layer.
- Example: $a^{[4]}$ is the $4^{th}$ layer activation. $W^{[5]}$ and $b^{[5]}$ are the $5^{th}$ layer parameters.
Superscript $(i)$ denotes an object associated with the $i^{th}$ example.
- Example: $x^{(i)}$ is the $i^{th}$ training example input.
Superscript $\langle t \rangle$ denotes an object at the $t^{th}$ time-step.
- Example: $x^{\langle t \rangle}$ is the input x at the $t^{th}$ time-step. $x^{(i)\langle t \rangle}$ is the input at the $t^{th}$ time-step of example $i$.
Subscript $i$ denotes the $i^{th}$ entry of a vector.
- Example: $a^{[l]}_i$ denotes the $i^{th}$ entry of the activations in layer $l$.
We assume that you are already familiar with numpy and/or have completed the previous courses of the specialization. Let's get started!
Let's first import all the packages that you will need during this assignment.
import numpy as np
from rnn_utils import *
Here's how you can implement an RNN:
Steps:
- Implement the calculations needed for one time-step of the RNN.
- Implement a loop over $T_x$ time-steps in order to process all the inputs, one at a time.
Let's go!
1.1 - RNN cell
A Recurrent neural network can be seen as the repetition of a single cell. You are first going to implement the computations for a single time-step. The following figure describes the operations for a single time-step of an RNN cell.
Exercise: Implement the RNN-cell described in Figure (2).
Instructions:
- Compute the hidden state with tanh activation: $a^{\langle t \rangle} = \tanh(W_{aa} a^{\langle t-1 \rangle} + W_{ax} x^{\langle t \rangle} + b_a)$.
- Using your new hidden state $a^{\langle t \rangle}$, compute the prediction $\hat{y}^{\langle t \rangle} = \mathrm{softmax}(W_{ya} a^{\langle t \rangle} + b_y)$. We provided you the function `softmax` (a rough sketch of such a helper appears after these instructions).
- Store $(a^{\langle t \rangle}, a^{\langle t-1 \rangle}, x^{\langle t \rangle}, parameters)$ in `cache`.
- Return $a^{\langle t \rangle}$, $\hat{y}^{\langle t \rangle}$ and `cache`.

We will vectorize over $m$ examples. Thus, $x^{\langle t \rangle}$ will have dimension $(n_x, m)$, and $a^{\langle t \rangle}$ will have dimension $(n_a, m)$.
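The `softmax` function itself lives in `rnn_utils` and is pulled in by the starred import above, so its code does not appear in this notebook. As a rough sketch of what such a column-wise softmax helper might look like (the name `softmax_sketch` and its exact details are assumptions for illustration, not the graded utility):

import numpy as np

def softmax_sketch(x):
    """Column-wise softmax: maps each column of x (one column per example)
    to a probability distribution over the rows. Subtracting the
    per-column max keeps np.exp numerically stable."""
    e_x = np.exp(x - np.max(x, axis=0, keepdims=True))
    return e_x / e_x.sum(axis=0, keepdims=True)

# Example: scores for n_y = 2 classes and m = 3 examples
scores = np.array([[1.0, 2.0, 0.5],
                   [0.0, 1.0, 3.0]])
print(softmax_sketch(scores).sum(axis=0))  # each column sums to 1

Each column is normalized independently, which matches the $(n_y, m)$ layout that $\hat{y}^{\langle t \rangle}$ will have once we vectorize over the $m$ examples.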
# GRADED FUNCTION: rnn_cell_forward
def rnn_cell_forward(xt, a_prev, parameters):
    """
    Implements a single forward step of the RNN-cell as described in Figure (2)

    Arguments:
    xt -- your input data at timestep "t", numpy array of shape (n_x, m).
    a_prev -- Hidden state at timestep "t-1", numpy array of shape (n_a, m)
    parameters -- python dictionary containing:
                        Wax -- Weight matrix multiplying the input, numpy array of shape (n_a, n_x)
                        Waa -- Weight matrix multiplying the hidden state, numpy array of shape (n_a, n_a)
                        Wya -- Weight matrix relating the hidden-state to the output, numpy array of shape (n_y, n_a)
                        ba -- Bias, numpy array of shape (n_a, 1)
                        by -- Bias relating the hidden-state to the output, numpy array of shape (n_y, 1)
    Returns:
    a_next -- next hidden state, of shape (n_a, m)
    yt_pred -- prediction at timestep "t", numpy array of shape (n_y, m)
    cache -- tuple of values needed for the backward pass, contains (a_next, a_prev, xt, parameters)
    """

    # Retrieve parameters from "parameters"
    Wax = parameters["Wax"]
    Waa = parameters["Waa"]
    Wya = parameters["Wya"]
    ba = parameters["ba"]
    by = parameters["by"]

    ### START CODE HERE ### (≈2 lines)
    # compute next activation state using the formula given above
    a_next = np.tanh(np.dot(Waa, a_prev) + np.dot(Wax, xt) + ba)
    # compute output of the current cell using the formula given above
    yt_pred = softmax(np.dot(Wya, a_next) + by)
    ### END CODE HERE ###

    # store values you need for backward propagation in cache
    cache = (a_next, a_prev, xt, parameters)

    return a_next, yt_pred, cache
np.random.seed(1)
xt = np.random.randn(3,10)
a_prev = np.random.randn(5,10)
Waa = np.random.randn(5,5)
Wax = np.random.randn(5,3)
Wya = np.random.randn(2,5)
ba = np.random.randn(5,1)
by = np.random.randn(2,1)
parameters = {
"Waa": Waa, "Wax": Wax, "Wya": Wya, "ba": ba, "by": by}
a_next, yt_pred, cache = rnn_cell_forward(xt, a_prev, parameters)
print("a_next[4] = ", a_next[4])
print("a_next.shape = ", a_next.shape)
print("yt_pred[1] =", yt_pred[1])
print("yt_pred.shape = ", yt_pred.shape)
Expected Output:

| Variable | Value |
|---|---|
| a_next[4] | [ 0.59584544 0.18141802 0.61311866 0.99808218 0.85016201 0.99980978 -0.18887155 0.99815551 0.6531151 0.82872037] |
| a_next.shape | (5, 10) |
| yt_pred[1] | [ 0.9888161 0.01682021 0.21140899 0.36817467 0.98988387 0.88945212 0.36920224 0.9966312 0.9982559 0.17746526] |
| yt_pred.shape | (2, 10) |
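As a quick sanity check (not part of the graded notebook), every column of `yt_pred` is a softmax output over the $n_y = 2$ classes, so after running the test cell above the column sums should all be 1:

# Assumes the test cell above has been run, so yt_pred is defined.
# Each column of yt_pred is a softmax distribution over n_y classes,
# so every column should sum (numerically) to 1.
print(np.allclose(yt_pred.sum(axis=0), 1.0))  # expected: True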
1.2 - RNN forward pass
You can see an RNN as the repetition of the cell you've just built. If your input sequence of data is carried over 10 time steps, then you will copy the RNN cell 10 times. Each cell takes as input the hidden state from the previous cell ($a^{\langle t-1 \rangle}$) and the current time-step's input data ($x^{\langle t \rangle}$). It outputs a hidden state ($a^{\langle t \rangle}$) and a prediction ($\hat{y}^{\langle t \rangle}$) for this time-step.
Exercise: Code the forward propagation of the RNN described in Figure (3).
Instructions:
- Create a vector of zeros ($a$) that will store all the hidden states computed by the RNN.
- Initialize the "next" hidden state as $a_0$ (the initial hidden state).
- Start looping over each time step; your incremental index is $t$:
    - Update the "next" hidden state and the cache by running `rnn_cell_forward`.
    - Store the "next" hidden state in $a$ (at the $t^{th}$ position).
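As a hedged illustration of this loop (not the graded solution), the sketch below assumes the full input `x` is stored as a 3-D array of shape $(n_x, m, T_x)$ and the initial hidden state `a0` has shape $(n_a, m)$; the function name `rnn_forward_sketch` is made up for this example and these shapes are assumptions for the sketch:

def rnn_forward_sketch(x, a0, parameters):
    """Illustrative forward pass over T_x time steps.
    Assumes x has shape (n_x, m, T_x) and a0 has shape (n_a, m);
    these shapes are assumptions for this sketch."""
    n_x, m, T_x = x.shape
    n_a = a0.shape[0]
    n_y = parameters["Wya"].shape[0]

    # vectors of zeros that will store all hidden states and predictions
    a = np.zeros((n_a, m, T_x))
    y_pred = np.zeros((n_y, m, T_x))
    caches = []

    # initialize the "next" hidden state as a0
    a_next = a0
    for t in range(T_x):
        # run one step of the RNN cell built above
        a_next, yt_pred, cache = rnn_cell_forward(x[:, :, t], a_next, parameters)
        # store the hidden state and prediction for time step t
        a[:, :, t] = a_next
        y_pred[:, :, t] = yt_pred
        caches.append(cache)

    return a, y_pred, caches

With the shapes from the test cell above, `x = np.random.randn(3, 10, 4)` and `a0 = np.random.randn(5, 10)` would give `a.shape == (5, 10, 4)` and `y_pred.shape == (2, 10, 4)`.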