With the groundwork laid in parts two and three, this post walks through the internal structure and implementation of deep neural networks. Conceptually there is little new to repeat, but on the programming side a multi-layer network requires building quite a few more pieces to carry out the computation.
Why deep representations?
Why use a deep neural network rather than simply a shallow one? Consider the figure below: suppose we want to compute the XOR of $n$ inputs. A shallow network (with a single hidden layer) would need $O(2^n)$ hidden units, whereas a deep network needs only $O(\log n)$ layers and $O(n)$ units in total. The network structure is more complex, but the amount of computation drops dramatically. A minimal sketch of this depth argument follows.
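To make the argument concrete, here is a small illustrative sketch (not from the original post): computing an n-input XOR as a balanced tree of pairwise XORs uses n - 1 two-input gates arranged in about log2(n) layers, while a single-hidden-layer lookup of the same function needs on the order of 2^n units to enumerate the input patterns.

import numpy as np

def xor_tree(bits):
    """Compute the XOR of n bits with a balanced tree of pairwise XORs.
    Depth is ceil(log2(n)) layers; total gates are n - 1, i.e. O(n)."""
    layer = list(bits)
    depth = 0
    while len(layer) > 1:
        # Combine neighbours pairwise; an odd leftover element passes through.
        nxt = [layer[i] ^ layer[i + 1] for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2:
            nxt.append(layer[-1])
        layer = nxt
        depth += 1
    return layer[0], depth

bits = np.random.randint(0, 2, size=16)
value, depth = xor_tree(bits)
print(value, depth)  # depth == 4 == log2(16), vs 2**16 minterms for one hidden layer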
Deep networks are also better at discovering and composing features. In face recognition, for example, the first layer detects edges in the image, the second layer assembles those edges into different parts of a face, and the third layer composes those parts into whole-face templates.
Getting matrix dimensions right
First, make sure every parameter has the correct dimensions. Unlike a shallow network or plain logistic regression, a deep network involves many interlocking equations, so bugs creep in easily; verifying the dimensions up front keeps the program running correctly. The figure below summarizes the dimensions used in the code: for layer $l$ with $n^{[l]}$ units and $m$ training examples, $W^{[l]}$ has shape $(n^{[l]}, n^{[l-1]})$, $b^{[l]}$ has shape $(n^{[l]}, 1)$, and $Z^{[l]}$ and $A^{[l]}$ have shape $(n^{[l]}, m)$.
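These shape rules are exactly what initialize_parameters_deep (called later in L_layer_model) has to respect. A minimal sketch of what such an initializer might look like, assuming layers_dims lists the input size followed by each layer's size; the 0.01 weight scaling is one common choice, and other scalings (e.g. 1/sqrt(n of the previous layer)) are also used:

import numpy as np

def initialize_parameters_deep(layer_dims):
    """Initialize W1..WL and b1..bL with the shapes listed above.
    layer_dims -- e.g. [n_x, n_h1, ..., n_y], input size first."""
    parameters = {}
    L = len(layer_dims)  # number of layers, counting the input
    for l in range(1, L):
        # Small random weights break symmetry; biases start at zero.
        parameters["W" + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
        parameters["b" + str(l)] = np.zeros((layer_dims[l], 1))
        # Catch dimension bugs immediately rather than deep in backprop.
        assert parameters["W" + str(l)].shape == (layer_dims[l], layer_dims[l - 1])
        assert parameters["b" + str(l)].shape == (layer_dims[l], 1)
    return parameters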
Building blocks of deep neural networks
Forward propagation is straightforward, so we won't repeat it; backward propagation is summarized in the figure below. Sketches of both building blocks follow.
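To make the building blocks concrete, here is an illustrative sketch of the linear part of one layer's forward pass and its matching backward pass. The shapes follow the conventions above; L_model_forward and L_model_backward used later are built from blocks like these, though the bodies here are standard textbook versions, not necessarily the post's exact helper code.

import numpy as np

def linear_forward(A_prev, W, b):
    """One layer's linear step: Z = W A_prev + b.
    Caches the inputs, since backprop needs them again."""
    Z = W.dot(A_prev) + b
    cache = (A_prev, W, b)
    return Z, cache

def linear_backward(dZ, cache):
    """Given dL/dZ for this layer, recover the gradients of W and b
    and of the previous layer's activations."""
    A_prev, W, b = cache
    m = A_prev.shape[1]
    dW = dZ.dot(A_prev.T) / m                    # dL/dW, averaged over m examples
    db = np.sum(dZ, axis=1, keepdims=True) / m   # dL/db
    dA_prev = W.T.dot(dZ)                        # dL/dA_prev, passed to the layer below
    return dA_prev, dW, db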
Practice
- Initialize parameters / Define hyperparameters
- Loop for num_iterations:
    a. Forward propagation
    b. Compute cost function
    c. Backward propagation
    d. Update parameters
- Use trained parameters to predict
import numpy as np
import matplotlib.pyplot as plt

def L_layer_model(X, Y, layers_dims, learning_rate=0.0075, num_iterations=3000, print_cost=False):
    """
    Implements an L-layer neural network: [LINEAR->RELU]*(L-1)->LINEAR->SIGMOID.

    Arguments:
    X -- data, numpy array of shape (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if cat, 1 if non-cat), of shape (1, number of examples)
    layers_dims -- list containing the input size and each layer size, of length (number of layers + 1)
    learning_rate -- learning rate of the gradient descent update rule
    num_iterations -- number of iterations of the optimization loop
    print_cost -- if True, print the cost every 100 iterations

    Returns:
    parameters -- parameters learnt by the model. They can then be used to predict.
    """
    np.random.seed(1)
    costs = []  # keep track of cost

    # Parameters initialization.
    parameters = initialize_parameters_deep(layers_dims)

    # Loop (gradient descent)
    for i in range(num_iterations):
        # Forward propagation: [LINEAR -> RELU]*(L-1) -> LINEAR -> SIGMOID.
        AL, caches = L_model_forward(X, parameters)
        # Compute cost.
        cost = compute_cost(AL, Y)
        # Backward propagation.
        grads = L_model_backward(AL, Y, caches)
        # Update parameters.
        parameters = update_parameters(parameters, grads, learning_rate)
        # Print and record the cost every 100 iterations.
        if print_cost and i % 100 == 0:
            print("Cost after iteration %i: %f" % (i, cost))
            costs.append(cost)

    # Plot the cost curve (one point per 100 iterations).
    plt.plot(np.squeeze(costs))
    plt.ylabel('cost')
    plt.xlabel('iterations (per hundreds)')
    plt.title("Learning rate = " + str(learning_rate))
    plt.show()

    return parameters
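A hypothetical call, assuming train_x and train_y have already been flattened and normalized as in the course's cat/non-cat dataset; the specific layer sizes here are just an example:

layers_dims = [12288, 20, 7, 5, 1]  # 12288 = 64 * 64 * 3 input features, 4-layer model
parameters = L_layer_model(train_x, train_y, layers_dims, num_iterations=2500, print_cost=True)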