This week's assignment has two parts: the first is building a reusable template for a deep neural network, and the second is a hands-on exercise that classifies cat images.
I'll start with a summary of what I learned, then paste the lab code.
The steps for building a neural network model (a rough sketch of how they fit together follows this list):
1. Preprocess the data, e.g. flatten each image into a column vector.
2. Initialize the parameters (note that the weights W must be initialized randomly to break symmetry; the biases can start at zero).
3. Build forward propagation with the vectorized formulas, and keep the caches so the backward pass can reuse them.
4. Compute the cost, so we can watch it fall during training and tune the hyperparameters accordingly.
5. Build backward propagation, using the caches saved during the forward pass.
6. Update the parameters with gradient descent.
7. Report the accuracy on the training set and the test set.
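Before the graded functions, here is a rough sketch of how those seven steps fit into one training loop. This is only my own summary, not graded code: the helper names (initialize_parameters_deep, L_model_forward, compute_cost, L_model_backward, update_parameters) are the functions built in this assignment, and the hyperparameter values are just illustrative.

def train_model(X, Y, layer_dims, learning_rate=0.0075, num_iterations=2500):
    # Step 1 (flattening / normalizing the images) is assumed to have happened
    # before this call, so X has shape (num_features, num_examples).
    # Step 2: initialize parameters (W random, b zero)
    parameters = initialize_parameters_deep(layer_dims)
    for i in range(num_iterations):
        # Step 3: forward propagation, keeping the caches for the backward pass
        AL, caches = L_model_forward(X, parameters)
        # Step 4: compute the cost to monitor training
        cost = compute_cost(AL, Y)
        # Step 5: backward propagation using the stored caches
        grads = L_model_backward(AL, Y, caches)
        # Step 6: gradient descent update
        parameters = update_parameters(parameters, grads, learning_rate)
        if i % 100 == 0:
            print("cost after iteration %i: %f" % (i, cost))
    # Step 7 (accuracy on train/test sets) is done separately with a predict helper.
    return parameters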
Now for the code!
Part 1: the deep neural network template
Parameter initialization for a simple two-layer network, using np.random.randn:
import numpy as np

# GRADED FUNCTION: initialize_parameters
def initialize_parameters(n_x, n_h, n_y):
    """
    Argument:
    n_x -- size of the input layer
    n_h -- size of the hidden layer
    n_y -- size of the output layer
    Returns:
    parameters -- python dictionary containing your parameters:
                    W1 -- weight matrix of shape (n_h, n_x)
                    b1 -- bias vector of shape (n_h, 1)
                    W2 -- weight matrix of shape (n_y, n_h)
                    b2 -- bias vector of shape (n_y, 1)
    """
    np.random.seed(1)
    ### START CODE HERE ### (≈ 4 lines of code)
    W1 = np.random.randn(n_h, n_x) * 0.01   # small random values break symmetry
    b1 = np.zeros((n_h, 1))                 # biases can safely start at zero
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))
    ### END CODE HERE ###
    assert(W1.shape == (n_h, n_x))
    assert(b1.shape == (n_h, 1))
    assert(W2.shape == (n_y, n_h))
    assert(b2.shape == (n_y, 1))
    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}
    return parameters
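A quick sanity check I like to run (my own snippet, not part of the assignment; the layer sizes are arbitrary):

params = initialize_parameters(3, 2, 1)
print(params["W1"].shape, params["b1"].shape)   # (2, 3) (2, 1)
print(params["W2"].shape, params["b2"].shape)   # (1, 2) (1, 1)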
For an L-layer network, initialize the parameters with a for loop:
# GRADED FUNCTION: initialize_parameters_deep
def initialize_parameters_deep(layer_dims):
    """
    Arguments:
    layer_dims -- python array (list) containing the dimensions of each layer in our network
    Returns:
    parameters -- python dictionary containing your parameters "W1", "b1", ..., "WL", "bL":
                    Wl -- weight matrix of shape (layer_dims[l], layer_dims[l-1])
                    bl -- bias vector of shape (layer_dims[l], 1)
    """
    np.random.seed(3)
    parameters = {}
    L = len(layer_dims)  # number of layers in the network, including the input layer
    for l in range(1, L):
        ### START CODE HERE ### (≈ 2 lines of code)
        parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l-1]) * 0.01
        parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))
        ### END CODE HERE ###
        assert(parameters['W' + str(l)].shape == (layer_dims[l], layer_dims[l-1]))
        assert(parameters['b' + str(l)].shape == (layer_dims[l], 1))
    return parameters
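The same kind of check for the deep version (again my own snippet, with arbitrary layer sizes). One caveat: the fixed *0.01 scaling is fine for this small example, but if I remember correctly the part 2 notebook scales by 1/np.sqrt(layer_dims[l-1]) instead, because with a deeper net such tiny weights make the activations shrink layer by layer and learning stalls.

params = initialize_parameters_deep([5, 4, 3])
for name in sorted(params):
    print(name, params[name].shape)
# Expected shapes: W1 (4, 5), W2 (3, 4), b1 (4, 1), b2 (3, 1)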
Compute Z (the linear part of one layer's forward pass) and store the cache:
# GRADED FUNCTION: linear_forward
def linear_forward(A, W, b):
    """
    Implement the linear part of a layer's forward propagation.
    Arguments:
    A -- activations from previous layer (or input data): (size of previous layer, number of examples)
    W -- weights matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector, numpy array of shape (size of the current layer, 1)
    Returns:
    Z -- the input of the activation function, also called pre-activation parameter
    cache -- a python tuple containing "A", "W" and "b"; stored for computing the backward pass efficiently
    """
    ### START CODE HERE ### (≈ 1 line of code)
    Z = W.dot(A) + b    # vectorized: Z = WA + b, broadcasting b across the examples
    ### END CODE HERE ###
    assert(Z.shape == (W.shape[0], A.shape[1]))
    cache = (A, W, b)
    return Z, cache
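To make the shapes concrete, a throwaway check of my own (sizes chosen arbitrarily):

A_prev = np.random.randn(3, 10)   # previous layer has 3 units, 10 examples
W = np.random.randn(4, 3)         # current layer has 4 units
b = np.zeros((4, 1))
Z, cache = linear_forward(A_prev, W, b)
print(Z.shape)   # (4, 10): one pre-activation per unit per example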
Compute the activation A for the layer and record the cache:
# GRADED FUNCTION: linear_activation_forward
def linear_activation_forward(A_prev, W, b, activation):
    """
    Impleme