Deep NN (Deep Neural Networks) --- deeplearning.ai --- Notes and Python Source Code (15)

A deep neural network is a neural network with multiple hidden layers.

This note uses a network with 2 inputs, 3 hidden layers, and 1 output layer to illustrate forward and backward propagation in a deep neural network, together with the formula derivations and the corresponding code.

I. Notation

See Note 13 and Note 14.

II. Basic Model

The network used in this note has an input layer with 2 units, three hidden layers with 4, 3, and 2 units respectively, and an output layer with a single unit; every layer uses the sigmoid activation.

III. Model Computation

(1) Forward propagation, for a single training example:

$$\begin{array}{l}{a^{[0]}} = {\left( {{x_1},{x_2}} \right)^T}\\{z^{[1]}} = {w^{[1]}}{a^{[0]}} + {b^{[1]}}\\{a^{[1]}} = {g^{[1]}}({z^{[1]}})\\{z^{[2]}} = {w^{[2]}}{a^{[1]}} + {b^{[2]}}\\{a^{[2]}} = {g^{[2]}}({z^{[2]}})\\{z^{[3]}} = {w^{[3]}}{a^{[2]}} + {b^{[3]}}\\{a^{[3]}} = {g^{[3]}}({z^{[3]}})\\{z^{[4]}} = {w^{[4]}}{a^{[3]}} + {b^{[4]}}\\{a^{[4]}} = {g^{[4]}}({z^{[4]}}) = \widehat y\end{array}$$

This gives, for a general layer:

$$\begin{array}{l}{z^{[L]}} = {w^{[L]}}{a^{[L - 1]}} + {b^{[L]}}\\{a^{[L]}} = {g^{[L]}}({z^{[L]}})\end{array}$$
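As a minimal sketch (not from the original note), the recurrence above can be coded as a single-example forward pass; the helper forward and the list params of (W, b) pairs are my own illustrative names, with each W of shape (n_l, n_{l-1}) as in the equations:

import numpy as np

def sigmoid(z):
    return 1/(1+np.exp(-z))

def forward(a0, params):
    # params: list of (W, b) pairs, one per layer; W[l] has shape (n_l, n_{l-1})
    a = a0
    for W, b in params:
        z = np.dot(W, a) + b   # z[l] = W[l] a[l-1] + b[l]
        a = sigmoid(z)         # a[l] = g[l](z[l]); sigmoid assumed for every layer
    return a                   # the last a is y_hat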

(2) Backward propagation: the derivatives are computed with the chain rule.

$$\begin{array}{l}d{a^{[4]}} =  - \frac{y}{{{a^{[4]}}}} + \frac{{1 - y}}{{1 - {a^{[4]}}}}\\d{z^{[4]}} = {a^{[4]}} - y\\d{w^{[4]}} = d{z^{[4]}}{a^{[3]T}}\\d{b^{[4]}} = d{z^{[4]}}\\d{a^{[3]}} = {w^{[4]T}}d{z^{[4]}}\\d{z^{[3]}} = d{a^{[3]}}{a^{[3]}}(1 - {a^{[3]}})\\d{w^{[3]}} = d{z^{[3]}}{a^{[2]T}}\\d{b^{[3]}} = d{z^{[3]}}\\ \ldots \end{array}$$

Note that $d{z^{[4]}} = {a^{[4]}} - y$ follows from multiplying $d{a^{[4]}}$ by the sigmoid derivative ${a^{[4]}}(1 - {a^{[4]}})$. For a general layer (all layers here use the sigmoid activation), the derivatives of the weights and biases can be written as:

$$\begin{array}{l}d{z^{[L]}} = d{a^{[L]}}{a^{[L]}}(1 - {a^{[L]}})\\d{w^{[L]}} = d{z^{[L]}}{a^{[L - 1]T}}\\d{b^{[L]}} = d{z^{[L]}}\\d{a^{[L - 1]}} = {w^{[L]T}}d{z^{[L]}}\\ \ldots \end{array}$$
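A minimal sketch of this generic backward step for one sigmoid layer (single example, column vectors); backward_step and its argument names are my own illustrations, assumed to follow the shapes in the equations above:

import numpy as np

def backward_step(dA, A, A_prev, W):
    # dz[L] = da[L] * a[L](1 - a[L])   (sigmoid derivative)
    dZ = dA * A * (1 - A)
    # dw[L] = dz[L] a[L-1]^T,  db[L] = dz[L]
    dW = np.dot(dZ, A_prev.T)
    db = dZ
    # da[L-1] = w[L]^T dz[L], passed on to the previous layer
    dA_prev = np.dot(W.T, dZ)
    return dW, db, dA_prev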

IV. Python Implementation

The input and output samples use the same format as in Note 14, and gradient descent is used to train the network described above. To show every hidden-layer weight and bias explicitly, the code stores them as individual variables rather than arrays, and writes out each step of the derivation directly instead of looping over the layers (a loop-based sketch is given after the listing).


# -*- coding: utf-8 -*-
"""
Created on Wed Apr 25 16:06:58 2018

@author: HGaviN
"""

import numpy as np  
m = 10  
n_0 = 2  
n_1 = 4 
n_2 = 3
n_3 = 2
n_4 = 1  # layer sizes; these could also be stored in an array
X = np.array([[1,1],[1.5,1.5],[2,2],[2.5,2.5],[2.75,2.75],[3.15,3.15],[3.5,3.5],[3.75,3.75],[4,4],[4.5,4.5]])#create some examples  
X = X.T # transposition  
Y = np.array([[0],[0],[0],[0],[0],[1],[1],[1],[1],[1]])  
# initialization
alpha = 0.01  # learning rate
# weights: small random values; biases: zeros
W1 = 0.01*np.random.rand(n_0,n_1)  
W2 = 0.01*np.random.rand(n_1,n_2)
W3 = 0.01*np.random.rand(n_2,n_3)
W4 = 0.01*np.random.rand(n_3,n_4)

b1 = np.zeros([n_1,1])
b2 = np.zeros([n_2,1])
b3 = np.zeros([n_3,1])
b4 = np.zeros([n_4,1])  
# gradient buffers (overwritten in each iteration)
dW1 = np.zeros([n_0,n_1])  
dZ1 = np.zeros([m,n_1])  
db1 = np.zeros([n_1,1])
  
dW2 = np.zeros([n_1,n_2])  
dZ2 = np.zeros([m,n_2])  
db2 = np.zeros([n_2,1])  

dW3 = np.zeros([n_2,n_3])  
dZ3 = np.zeros([m,n_3])  
db3 = np.zeros([n_3,1])

dW4 = np.zeros([n_3,n_4])  
dZ4 = np.zeros([m,n_4])  
db4 = np.zeros([n_4,1])  
  
j = 0  
for iter in range(50):  
    #Forward Propagation
    Z1 = np.dot(W1.T,X)+b1 # n_1 X m  
    A1 = 1/(1+np.exp(-Z1)) # n_1 X m
    Z2 = np.dot(W2.T,A1)+b2 # n_2 X m  
    A2 = 1/(1+np.exp(-Z2)) # n_2 X m
    Z3 = np.dot(W3.T,A2)+b3# n_3 X m  
    A3 = 1/(1+np.exp(-Z3))#  n_3 X m
    Z4 = np.dot(W4.T,A3)+b4# n_4 X m  
    A4 = 1/(1+np.exp(-Z4))#  n_4 X m
    #backward propagation
    dZ4 = A4.T - Y# m X n_4  
    dW4 = 1/m*np.dot(A3,dZ4)#n_3 X n_4
    db4 = 1/m*np.sum(dZ4,axis=0,keepdims=True).T # n_4 X 1
    
    dA3 = np.dot(dZ4,W4.T) # m X n_3
    dZ3 = np.multiply(dA3,np.multiply(A3,1-A3).T) #m X n_3 
    dW3 = 1/m*np.dot(A2,dZ3)#n_2 X n_3
    db3 = 1/m*np.sum(dZ3,axis=0,keepdims=True).T #n_3 X 1
    
    dA2 = np.dot(dZ3,W3.T) # m X n_2
    dZ2 = np.multiply(dA2,np.multiply(A2,1-A2).T) #m X n_2 
    dW2 = 1/m*np.dot(A1,dZ2)#n_1 X n_2
    db2 = 1/m*np.sum(dZ2,axis=0,keepdims=True).T #n_2 X 1
    
    dA1 = np.dot(dZ2,W2.T) # m X n_1
    dZ1 = np.multiply(dA1,np.multiply(A1,1-A1).T) #m X n_1 
    dW1 = 1/m*np.dot(X,dZ1)#n_0 X n_1
    db1 = 1/m*np.sum(dZ1,axis=0,keepdims=True).T #n_1 X 1
  
    W4 = W4 - alpha*dW4
    W3 = W3 - alpha*dW3
    W2 = W2 - alpha*dW2
    W1 = W1 - alpha*dW1
    
    b4 = b4 - alpha*db4 
    b3 = b3 - alpha*db3
    b2 = b2 - alpha*db2
    b1 = b1 - alpha*db1
    #np.multiply is like .* in matlab  
    j = -1/m*np.sum(np.multiply(Y.T,np.log(A4))+np.multiply((1-Y).T,np.log(1-A4)))# cross-entropy cost on the output layer A4
print (j)  
print ('\n')      
xp = np.array([[4],[3.5]])  
z1p = np.dot(W1.T,xp)+b1  
a1p = 1/(1+np.exp(-z1p))  
z2p = np.dot(W2.T,a1p)+b2  
a2p = 1/(1+np.exp(-z2p))
z3p = np.dot(W3.T,a2p)+b3  
a3p = 1/(1+np.exp(-z3p))
z4p=np.dot(W4.T,a3p)+b4  
a4p = 1/(1+np.exp(-z4p))   
print (a4p)  
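For reference, here is a hedged loop-based sketch of the same training procedure with the weights and biases kept in lists. This is my own rewrite (names such as Ws, bs, layer_sizes are illustrative), and it uses the column-vector convention Z = W A + b from Section III rather than the W.T X form of the listing above:

import numpy as np

def sigmoid(z):
    return 1/(1+np.exp(-z))

layer_sizes = [2, 4, 3, 2, 1]                     # n_0 ... n_4
m = 10
alpha = 0.01
X = np.array([[1,1],[1.5,1.5],[2,2],[2.5,2.5],[2.75,2.75],
              [3.15,3.15],[3.5,3.5],[3.75,3.75],[4,4],[4.5,4.5]]).T
Y = np.array([[0,0,0,0,0,1,1,1,1,1]])             # 1 x m

# W[l] has shape (n_l, n_{l-1}) so that Z = np.dot(W, A_prev) + b
Ws = [0.01*np.random.randn(layer_sizes[l+1], layer_sizes[l]) for l in range(4)]
bs = [np.zeros((layer_sizes[l+1], 1)) for l in range(4)]

for _ in range(50):
    # forward pass, keeping every activation for the backward pass
    As = [X]
    for W, b in zip(Ws, bs):
        As.append(sigmoid(np.dot(W, As[-1]) + b))
    # backward pass: sigmoid output + cross-entropy gives dZ = A - Y
    dZ = As[-1] - Y
    for l in reversed(range(4)):
        dW = np.dot(dZ, As[l].T)/m
        db = np.sum(dZ, axis=1, keepdims=True)/m
        dA_prev = np.dot(Ws[l].T, dZ)             # computed before updating Ws[l]
        Ws[l] = Ws[l] - alpha*dW
        bs[l] = bs[l] - alpha*db
        dZ = dA_prev*As[l]*(1 - As[l])            # dZ of the next-lower layer

# prediction for the same test point as above
a = np.array([[4],[3.5]])
for W, b in zip(Ws, bs):
    a = sigmoid(np.dot(W, a) + b)
print(a)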

V. Parameters and Hyperparameters

Generally speaking, the parameters of a neural network are its weights and biases, while the hyperparameters are the settings that determine those parameters: the number of hidden layers, the number of units in each hidden layer, the type of activation function, the learning rate, the number of iterations, and so on.
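As an illustrative sketch (the dictionary and its keys are my own, not from the note), the hyperparameters of the network above could be collected in one place, separate from the learned parameters:

hyperparams = {
    "hidden_layer_sizes": [4, 3, 2],   # number of hidden layers and units per layer
    "activation": "sigmoid",           # activation function for every layer
    "learning_rate": 0.01,             # alpha in the listing above
    "num_iterations": 50,              # number of gradient-descent steps
}
# the parameters themselves (W1..W4, b1..b4) are the values gradient descent learns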





