Implementing a Deep Neural Network from Scratch

This article shows how to implement a deep neural network from scratch using only numpy. It covers initialization, forward propagation, backward propagation, and parameter updates, using the sigmoid and ReLU activation functions. The inner workings of the network are explained step by step, from the linear forward step and the linear-activation forward step up to the full L-layer model.

Introduction

We implement a deep neural network from the ground up using only the numpy library. For the underlying mathematics, see Andrew Ng's Deep Learning course.
A few suggestions:
To make the overall structure easy to see, the inputs and outputs of the main helper functions are listed below, so you can quickly grasp how the functions interact.
With this big picture in mind, you can then dive into the detailed implementation of each function.

parameters = initialize_parameters_deep(layer_dims)

# forward propagation
Z, cache = linear_forward(A, W, b)
A, cache = linear_activation_forward(A_prev, W, b, activation)
AL, caches = L_model_forward(X, parameters)

# cost function
cost = compute_cost(AL, Y)

# backward propagation
dA_prev, dW, db = linear_activation_backward(dA, cache, activation)
grads = L_model_backward(AL, Y, caches)
parameters = update_parameters(parameters, grads, learning_rate)

# compute the sigmoid and ReLU activations, and the corresponding dZ
A, cache = sigmoid(Z)
A, cache = relu(Z)
dZ = relu_backward(dA, cache)
dZ = sigmoid_backward(dA, cache)

1 - Packages

import numpy as np

2 - Outline of the Assignment

[Figure: outline of the implementation flow]

3 - Initialization

3.2 - L-layer Neural Network

$n^{[l]}$ denotes the number of units in layer $l$.
For example, if the input $X$ has shape (12288, 209) ($m = 209$ examples), then:
[Figure: shapes of the parameters and activations for each layer]
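In general (following the convention of `linear_forward` below, where W has shape (size of current layer, size of previous layer)), the shapes are: $W^{[l]}: (n^{[l]}, n^{[l-1]})$, $b^{[l]}: (n^{[l]}, 1)$, and $Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}: (n^{[l]}, 209)$, with $b^{[l]}$ broadcast across the 209 examples.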
Initialization of an L-layer Neural Network

def initialize_parameters_deep(layers_dims):
    """
    input:
    layers_dims -- python list of layer sizes.
                   e.g. layers_dims=[2,3,2]: the input layer has 2 units, one hidden
                   layer has 3 units, and the output layer has 2 units
    output/return:
    parameters -- python dictionary containing the initialized parameters:
                  Wl : parameters['W' + str(l)], shape (layers_dims[l], layers_dims[l-1])
                  bl : parameters['b' + str(l)], shape (layers_dims[l], 1)
    """
    np.random.seed(3)
    parameters = {}          # declare the dict first, then add keys inside the for loop
    L = len(layers_dims)     # number of layer sizes, including the input layer

    for l in range(1, L):
        # W[l] has shape (units in layer l, units in layer l-1); small random values break symmetry
        parameters["W" + str(l)] = np.random.randn(layers_dims[l], layers_dims[l-1]) * 0.01
        parameters["b" + str(l)] = np.zeros((layers_dims[l], 1))

        # check the shapes of the parameters
        assert(parameters["W" + str(l)].shape == (layers_dims[l], layers_dims[l-1]))
        assert(parameters["b" + str(l)].shape == (layers_dims[l], 1))

    return parameters
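As a quick sanity check of the initialization (the layer sizes below are just an illustrative example):

layers_dims = [12288, 20, 7, 1]    # 12288 input features, two hidden layers, 1 output unit
parameters = initialize_parameters_deep(layers_dims)

for l in range(1, len(layers_dims)):
    print("W" + str(l), parameters["W" + str(l)].shape,
          "b" + str(l), parameters["b" + str(l)].shape)
# expected: W1 (20, 12288), b1 (20, 1), W2 (7, 20), b2 (7, 1), W3 (1, 7), b3 (1, 1)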
    

4 - Forward propagation module

4.1 - Linear Forward

The linear forward function (vectorized over all the examples) computes the following equation:
$Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$
where $A^{[0]} = X$.

def linear_forward(A, W, b):
    """
    input:
    A -- activations from the previous layer (or the input data X): (size of previous layer, number of examples)
    W -- weight matrix: numpy array of shape (size of current layer, size of previous layer)
    b -- bias vector: numpy array of shape (size of current layer, 1)
    
    output/return:
    Z -- the input of the activation function (pre-activation parameter)
    cache -- python tuple containing A, W, b; stored for the backward propagation pass
    """
    Z = np.dot(W, A) + b      # b is broadcast across the columns (examples)
    
    assert(Z.shape == (W.shape[0], A.shape[1]))
    cache = (A, W, b)
    
    return Z, cache
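A minimal usage sketch for `linear_forward` (the sizes here are made up for illustration):

np.random.seed(1)
A_prev = np.random.randn(3, 2)    # 3 units in the previous layer, 2 examples
W = np.random.randn(4, 3)         # 4 units in the current layer
b = np.random.randn(4, 1)

Z, cache = linear_forward(A_prev, W, b)
print(Z.shape)                    # (4, 2): one pre-activation per unit per example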

4.2 - Linear Activation Forward

Two activation functions are used throughout the network:

  • Sigmoid: $\sigma(Z) = \sigma(WA + b) = \frac{1}{1 + e^{-(WA + b)}}$. The predefined sigmoid function returns two values: the activation value "A" and a "cache" that stores the variable "Z" (used as input to the corresponding backward function). Use it as follows:
A, activation_cache = sigmoid(Z)
  • ReLU: $A = \mathrm{ReLU}(Z) = \max(0, Z)$. The predefined relu function likewise returns two values: the activation value "A" and a "cache" that stores the variable "Z" (used as input to the corresponding backward function). Use it as follows:
A, activation_cache = relu(Z)
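For reference, here is a minimal sketch of these activation helpers and of linear_activation_forward, matching the signatures listed in the overview (A, cache = sigmoid(Z), A, cache = relu(Z), dZ = sigmoid_backward(dA, cache), dZ = relu_backward(dA, cache)); it is an illustrative implementation based on the standard formulas, not the original author's code:

def sigmoid(Z):
    # A = 1 / (1 + e^(-Z)); cache stores Z for the backward pass
    A = 1 / (1 + np.exp(-Z))
    cache = Z
    return A, cache

def relu(Z):
    # A = max(0, Z), element-wise; cache stores Z for the backward pass
    A = np.maximum(0, Z)
    cache = Z
    return A, cache

def sigmoid_backward(dA, cache):
    # dZ = dA * sigma(Z) * (1 - sigma(Z))
    Z = cache
    s = 1 / (1 + np.exp(-Z))
    dZ = dA * s * (1 - s)
    return dZ

def relu_backward(dA, cache):
    # dZ = dA where Z > 0, and 0 where Z <= 0
    Z = cache
    dZ = np.array(dA, copy=True)
    dZ[Z <= 0] = 0
    return dZ

def linear_activation_forward(A_prev, W, b, activation):
    # combine the linear step with the chosen activation;
    # cache = (linear_cache, activation_cache) is kept for backward propagation
    Z, linear_cache = linear_forward(A_prev, W, b)
    if activation == "sigmoid":
        A, activation_cache = sigmoid(Z)
    elif activation == "relu":
        A, activation_cache = relu(Z)
    cache = (linear_cache, activation_cache)
    return A, cache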