deeplearning.ai-lecture1-building deep neural network steps

Latest recommended article published on 2022-06-30 02:04:29

winnie91


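This lab implements a set of helper functions in preparation for building a two-layer neural network and an L-layer neural network. The steps for implementing a two-layer or deep network are as follows: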

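Step 1: Initialize the parameters for a two-layer neural network and for an L-layer neural network

Step 2: Implement forward propagation:

1. Implement the linear part of a layer's forward propagation, i.e. compute Z[l] = W[l]A[l-1] + b[l]

2. Implement the relu and sigmoid activation functions

3. Combine the previous two steps into a single [linear->activation] forward function for one layer

4. Implement the full forward pass: [linear->relu] for the first L-1 layers, followed by [linear->sigmoid] for the last layer

Step 3: Compute the cost function

Step 4: Implement backward propagation:

1. Compute the backward pass for the linear part of a layer

2. Compute the gradients of the relu and sigmoid functions (relu_backward/sigmoid_backward)

3. Combine the previous two steps into a new [linear->activation] backward function

4. Put it together: implement the backward function for the last layer's [linear->sigmoid] and for the [linear->relu] of the first L-1 layers

Step 5: Update the parameters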
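
As a rough illustration of items 2 of Step 2, Step 3 and Step 5 in the outline, the sketch below shows the two activation functions, a cost, and a gradient-descent update. It assumes a binary cross-entropy cost (a common choice for a sigmoid output layer). The names `sigmoid`, `relu`, `compute_cost` and `update_parameters`, and the convention that gradients are stored under keys such as "dW1" and "db1", are assumptions made for this example and are not taken from the post.

```python
import numpy as np

def sigmoid(Z):
    """Element-wise sigmoid: 1 / (1 + exp(-Z))."""
    return 1.0 / (1.0 + np.exp(-Z))

def relu(Z):
    """Element-wise ReLU: max(0, Z)."""
    return np.maximum(0, Z)

def compute_cost(AL, Y):
    """Binary cross-entropy, averaged over the m examples (columns)."""
    m = Y.shape[1]
    cost = -np.sum(Y * np.log(AL) + (1 - Y) * np.log(1 - AL)) / m
    return float(cost)

def update_parameters(parameters, grads, learning_rate):
    """One gradient-descent step.

    Assumes (hypothetically) that grads stores the gradient of each
    parameter under the key "d" + name, e.g. "dW1" for "W1".
    """
    updated = {}
    for key, value in parameters.items():
        updated[key] = value - learning_rate * grads["d" + key]
    return updated
```

Each function works on whole matrices at once, so the same code handles any number of examples stored as columns.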

Now let's implement these functions.

Step 1:

1. Parameter initialization for a 2-layer neural network

import numpy as np

def initialize_parameters(n_x, n_h, n_y):
    """
    Argument:
    n_x -- size of the input layer
    n_h -- size of the hidden layer
    n_y -- size of the output layer
    
    Returns:
    parameters -- python dictionary containing your parameters:
                    W1 -- weight matrix of shape (n_h, n_x)
                    b1 -- bias vector of shape (n_h, 1)
                    W2 -- weight matrix of shape (n_y, n_h)
                    b2 -- bias vector of shape (n_y, 1)
    """
    
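    np.random.seed(1)  # fixed seed so the results are reproducible
    
    # Small random weights break the symmetry between units; biases can start at zero
    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))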
    
    assert(W1.shape == (n_h, n_x))
    assert(b1.shape == (n_h, 1))
    assert(W2.shape == (n_y, n_h))
    assert(b2.shape == (n_y, 1))
    
    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}
    
    return parameters
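
A quick sanity check can make the shapes concrete. The sketch below repeats a condensed copy of the function so that it runs on its own; the sizes n_x=3, n_h=2 and n_y=1 are arbitrary values picked for illustration, not values from the lab.

```python
import numpy as np

def initialize_parameters(n_x, n_h, n_y):
    # Condensed copy of the function above, so this snippet runs standalone.
    np.random.seed(1)
    return {"W1": np.random.randn(n_h, n_x) * 0.01,
            "b1": np.zeros((n_h, 1)),
            "W2": np.random.randn(n_y, n_h) * 0.01,
            "b2": np.zeros((n_y, 1))}

# Hypothetical sizes: 3 input features, 2 hidden units, 1 output unit
params = initialize_parameters(n_x=3, n_h=2, n_y=1)
shapes = {name: value.shape for name, value in params.items()}
print(shapes)
# {'W1': (2, 3), 'b1': (2, 1), 'W2': (1, 2), 'b2': (1, 1)}
```

Each weight matrix has one row per unit in its own layer and one column per unit in the previous layer, which is what lets np.dot(W, A) line up later in the forward pass.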

2. Parameter initialization for an L-layer neural network

def initialize_parameters_deep(layer_dims):
    """
    Arguments:
    layer_dims -- python array (list) containing the dimensions of each layer in our network

    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                    Wl -- weight matrix of shape (layer_dims[l], layer_dims[l-1])
                    bl -- bias vector of shape (layer_dims[l], 1)
    """
    np.random.seed(3)
    parameters = {}
    L = len(layer_dims)  # number of entries in layer_dims, including the input layer

    # Index 0 is the input layer, so parameters are only needed for layers 1 .. L-1
    for l in range(1, L):
        parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l-1]) * 0.01
        parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))
        assert parameters['W' + str(l)].shape == (layer_dims[l], layer_dims[l-1])
        assert parameters['b' + str(l)].shape == (layer_dims[l], 1)
    return parameters
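
To see how layer_dims maps to parameter shapes, here is a small usage sketch. It repeats a condensed copy of the function so that it runs on its own, and the list [5, 4, 3] is a made-up example (5 inputs, 4 hidden units, 3 outputs).

```python
import numpy as np

def initialize_parameters_deep(layer_dims):
    # Condensed copy of the function above, so this snippet runs standalone.
    np.random.seed(3)
    parameters = {}
    for l in range(1, len(layer_dims)):
        parameters["W" + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
        parameters["b" + str(l)] = np.zeros((layer_dims[l], 1))
    return parameters

layer_dims = [5, 4, 3]  # hypothetical sizes chosen for illustration
parameters = initialize_parameters_deep(layer_dims)
for name, value in parameters.items():
    print(name, value.shape)
# W1 (4, 5)
# b1 (4, 1)
# W2 (3, 4)
# b2 (3, 1)
```

A list with n entries yields n-1 pairs of W and b, because the first entry only describes the input and has no parameters of its own.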

Step 2:

1. The linear part of the network's forward propagation

def linear_forward(A, W, b):
    """
    Implement the linear part of a layer's forward propagation.

    Arguments:
    A -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)

    Returns:
    Z -- the input of the activation function, also called pre-activation parameter
    cache -- a python tuple containing "A", "W" and "b"; stored for computing the backward pass efficiently
    """
    # b has shape (size of current layer, 1) and is broadcast across all examples (columns)
    Z = np.dot(W, A) + b
    assert Z.shape == (W.shape[0], A.shape[1])
    cache = (A, W, b)
    return Z, cache
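
A small worked example may help check the arithmetic by hand. The sketch below repeats a condensed copy of the function so that it runs on its own; the values of A, W and b are made up for illustration.

```python
import numpy as np

def linear_forward(A, W, b):
    # Condensed copy of the function above, so this snippet runs standalone.
    Z = np.dot(W, A) + b
    return Z, (A, W, b)

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])       # 3 features (rows), 2 examples (columns)
W = np.array([[1.0, 2.0, 3.0]])  # 1 unit in this layer, 3 inputs
b = np.array([[0.5]])            # shape (1, 1), broadcast over both examples

Z, cache = linear_forward(A, W, b)
# Example 1: 1*1 + 2*0 + 3*1 + 0.5 = 4.5
# Example 2: 1*0 + 2*1 + 3*1 + 0.5 = 5.5
print(Z)  # [[4.5 5.5]]
```

Z has one row per unit in the current layer and one column per example, matching the shape assertion in the function.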