[Deep Learning] Stage 1, Lesson 3


Note: these are my summary notes from Andrew Ng's Deep Learning course; they will be updated as my study progresses.

Shallow Neural Networks

What's a Neural Network?

(figure: a neural network)

1. Representing the Computation of a Multi-Layer Neural Network

A superscript in square brackets, e.g. $W^{[1]}$, marks which layer a quantity belongs to; a superscript in parentheses, e.g. $x^{(i)}$, marks the training example.

    # Computation of a two-layer network: the four equations below are
    # evaluated once for each of the m training examples (pseudocode loop)
    for i = 1 to m:

$$z^{[1](i)} = W^{[1]} x^{(i)} + b^{[1]}$$

$$a^{[1](i)} = \sigma(z^{[1](i)})$$

$$z^{[2](i)} = W^{[2]} a^{[1](i)} + b^{[2]}$$

$$a^{[2](i)} = \sigma(z^{[2](i)})$$


$$X = \begin{bmatrix} | & | & & | \\ x^{(1)} & x^{(2)} & \cdots & x^{(m)} \\ | & | & & | \end{bmatrix} \tag{1}$$

$$A^{[1]} = \begin{bmatrix} | & | & & | \\ a^{[1](1)} & a^{[1](2)} & \cdots & a^{[1](m)} \\ | & | & & | \end{bmatrix} \tag{2}$$

The matrices above make X and A explicit: each training example $x^{(i)}$, and its first-layer activation $a^{[1](i)}$, is stacked as one column, so both matrices have m columns.
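
To make the vectorization concrete, here is a minimal numpy sketch (the layer sizes and random data are made-up values for illustration) showing that the per-example loop and the column-stacked matrix form produce the same first-layer activations:

    import numpy as np

    np.random.seed(0)
    n_x, n_1, m = 3, 4, 5                  # made-up sizes: input features, hidden units, examples
    X  = np.random.randn(n_x, m)           # each column of X is one training example x^(i)
    W1 = np.random.randn(n_1, n_x) * 0.01
    b1 = np.zeros((n_1, 1))

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Loop version: compute a^[1](i) one column at a time
    A1_loop = np.zeros((n_1, m))
    for i in range(m):
        x_i = X[:, i:i+1]                  # keep x^(i) as a column vector
        A1_loop[:, i:i+1] = sigmoid(W1 @ x_i + b1)

    # Vectorized version: all m columns at once, exactly equation (2)
    A1_vec = sigmoid(W1 @ X + b1)

    print(np.allclose(A1_loop, A1_vec))    # True
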
2. Activation Functions

  (1). Sigmoid Activation Function

(figure: sigmoid curve)

$$g(z) = \frac{1}{1+e^{-z}}$$

$$g'(z) = g(z) \times [1-g(z)]$$

  (2). Tanh Activation Function
(figure: tanh curve)

$$g(z) = \frac{e^{z}-e^{-z}}{e^{z}+e^{-z}}$$

$$g'(z) = 1 - g(z)^{2}$$

  (3). ReLU Activation Function

(figure: ReLU curve)

$$g(z) = \max(0, z)$$

$$g'(z) = \begin{cases} 0 & \text{if } z < 0 \\ 1 & \text{if } z \ge 0 \end{cases} \tag{ReLU and its derivative}$$

$$g(z) = \max(0.01z,\ z)$$

$$g'(z) = \begin{cases} 0.01 & \text{if } z < 0 \\ 1 & \text{if } z \ge 0 \end{cases} \tag{Leaky ReLU and its derivative}$$
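
As a minimal sketch (the function names are my own), these activations and their derivatives can be written directly in numpy:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def sigmoid_prime(z):
        s = sigmoid(z)
        return s * (1.0 - s)               # g'(z) = g(z)[1 - g(z)]

    def tanh_prime(z):
        return 1.0 - np.tanh(z) ** 2       # g'(z) = 1 - g(z)^2; np.tanh is g itself

    def relu(z):
        return np.maximum(0.0, z)

    def relu_prime(z):
        return (z >= 0).astype(float)      # 1 where z >= 0, 0 elsewhere

    def leaky_relu(z):
        return np.maximum(0.01 * z, z)

    def leaky_relu_prime(z):
        return np.where(z < 0, 0.01, 1.0)
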

3. Gradient Descent for a Neural Network
  • forward propagation (a vectorized numpy sketch of the full step follows the back-propagation formulas below)

$$Z^{[1]} = W^{[1]} X + b^{[1]}$$

$$A^{[1]} = g^{[1]}(Z^{[1]})$$

$$Z^{[2]} = W^{[2]} A^{[1]} + b^{[2]}$$

$$A^{[2]} = g^{[2]}(Z^{[2]}) = \sigma(Z^{[2]}) \tag{A, Z, X are all vectorized}$$

  • back propagation

$$dZ^{[2]} = A^{[2]} - Y$$

$$dW^{[2]} = \frac{1}{m}\, dZ^{[2]} A^{[1]T}$$

$$db^{[2]} = \frac{1}{m}\, \text{np.sum}(dZ^{[2]},\ \text{axis}=1,\ \text{keepdims}=\text{True})$$

$$dZ^{[1]} = W^{[2]T} dZ^{[2]} * g'^{[1]}(Z^{[1]}) \qquad (* \text{ denotes the element-wise product})$$

$$dW^{[1]} = \frac{1}{m}\, dZ^{[1]} X^{T}$$

$$db^{[1]} = \frac{1}{m}\, \text{np.sum}(dZ^{[1]},\ \text{axis}=1,\ \text{keepdims}=\text{True})$$
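
Putting the forward- and back-propagation formulas together, here is a minimal sketch of one full gradient-descent step for this two-layer network (tanh in the hidden layer, sigmoid at the output; the layer sizes, random data, and learning rate are made-up values for illustration):

    import numpy as np

    np.random.seed(1)
    n_x, n_1, m = 2, 4, 200                         # made-up sizes: inputs, hidden units, examples
    X = np.random.randn(n_x, m)
    Y = (np.random.rand(1, m) > 0.5).astype(float)  # toy binary labels

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    W1 = np.random.randn(n_1, n_x) * 0.01
    b1 = np.zeros((n_1, 1))
    W2 = np.random.randn(1, n_1) * 0.01
    b2 = np.zeros((1, 1))
    alpha = 0.05                                    # learning rate (illustrative value)

    # forward propagation (vectorized over all m examples)
    Z1 = W1 @ X + b1
    A1 = np.tanh(Z1)                                # g^[1] = tanh
    Z2 = W2 @ A1 + b2
    A2 = sigmoid(Z2)                                # g^[2] = sigmoid

    # back propagation
    dZ2 = A2 - Y
    dW2 = (1 / m) * dZ2 @ A1.T
    db2 = (1 / m) * np.sum(dZ2, axis=1, keepdims=True)
    dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)              # element-wise product with g'^[1](Z1)
    dW1 = (1 / m) * dZ1 @ X.T
    db1 = (1 / m) * np.sum(dZ1, axis=1, keepdims=True)

    # gradient-descent update
    W1 -= alpha * dW1;  b1 -= alpha * db1
    W2 -= alpha * dW2;  b2 -= alpha * db2
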

4. Random Initialization of Parameters
    import numpy as np

    # Example: a network with 2 input features, 2 hidden units and 1 output unit
    W1 = np.random.randn(2, 2) * 0.01   # small random values; the 0.01 scale keeps z small so
                                        # tanh/sigmoid are not saturated and gradient descent works well
    b1 = np.zeros((2, 1))               # biases can safely be initialized to zero
    W2 = np.random.randn(1, 2) * 0.01   # weights must be random (not zero) to break symmetry
    b2 = 0                              # a plain scalar also works, thanks to numpy broadcasting

The programming assignment for Lesson 3 is attached (not written by me): planar data classification with one hidden layer.