Class1-Week4-Deep Neural Network

Compute Process

Forward Propagation

Layer-l:

  • Input: $A^{[l-1]}$
  • Compute Process:
    $Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$
    $A^{[l]} = g(Z^{[l]})$
  • Output: $A^{[l]}$
  • Cache: $Z^{[l]}, W^{[l]}, b^{[l]}$
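
A minimal NumPy sketch of this step (the function names and the ReLU choice of $g$ are illustrative assumptions, not fixed by the notes). $A^{[l-1]}$ is cached as well, since $dW^{[l]}$ needs it during backward propagation:

```python
import numpy as np

def relu(Z):
    """Example activation g; sigmoid, tanh, etc. fit the same pattern."""
    return np.maximum(0, Z)

def forward_step(A_prev, W, b, g=relu):
    """Forward propagation for layer l: Z = W A_prev + b, A = g(Z)."""
    Z = W @ A_prev + b         # (n_l, m)
    A = g(Z)                   # (n_l, m)
    cache = (Z, W, b, A_prev)  # A_prev cached too: dW needs it in backprop
    return A, cache
```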

Backward Propagation

Layer-l:

  • Input: $dA^{[l]}$
  • Compute Process:
    $dZ^{[l]} = dA^{[l]} * g'(Z^{[l]})$
    $dW^{[l]} = \frac{1}{m}\, dZ^{[l]} A^{[l-1]T}$
    $db^{[l]} = \frac{1}{m}\, \text{np.sum}(dZ^{[l]},\ \text{axis=1},\ \text{keepdims=True})$
    $dA^{[l-1]} = W^{[l]T} dZ^{[l]}$
  • Output: $dA^{[l-1]}$
  • Update:
    $W^{[l]} = W^{[l]} - \alpha\, dW^{[l]}$
    $b^{[l]} = b^{[l]} - \alpha\, db^{[l]}$
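
A matching NumPy sketch, assuming ReLU as $g$ (so $g'(Z)$ is $1$ where $Z > 0$ and $0$ elsewhere) and the cache layout from the forward sketch above:

```python
import numpy as np

def backward_step(dA, cache, alpha):
    """Backward propagation for layer l plus the gradient-descent update."""
    Z, W, b, A_prev = cache
    m = A_prev.shape[1]
    dZ = dA * (Z > 0)                                 # dA * g'(Z) for ReLU
    dW = (1 / m) * dZ @ A_prev.T                      # (n_l, n_{l-1})
    db = (1 / m) * np.sum(dZ, axis=1, keepdims=True)  # (n_l, 1)
    dA_prev = W.T @ dZ                                # (n_{l-1}, m)
    W = W - alpha * dW                                # W := W - alpha dW
    b = b - alpha * db                                # b := b - alpha db
    return dA_prev, W, b
```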

Matrix Dimensions

Layer-l:

Each gradient has the same shape as the quantity it differentiates:

$$
\begin{aligned}
dW^{[l]} = W^{[l]} &: (n^{[l]}, n^{[l-1]}) \\
db^{[l]} = b^{[l]} &: (n^{[l]}, 1) \\
dZ^{[l]} = Z^{[l]} &: (n^{[l]}, m) \\
dA^{[l]} = A^{[l]} &: (n^{[l]}, m)
\end{aligned}
$$
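
These shapes can be verified with a quick check (the sizes $n^{[l-1]}=4$, $n^{[l]}=3$, $m=5$ below are arbitrary):

```python
import numpy as np

n_prev, n_l, m = 4, 3, 5             # arbitrary: n^{[l-1]}, n^{[l]}, batch size m
A_prev = np.random.randn(n_prev, m)  # (n^{[l-1]}, m)
W = np.random.randn(n_l, n_prev)     # (n^{[l]}, n^{[l-1]})
b = np.zeros((n_l, 1))               # (n^{[l]}, 1), broadcast across the m columns
Z = W @ A_prev + b
assert Z.shape == (n_l, m)           # (n^{[l]}, m); A = g(Z) keeps the same shape
```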


Parameters vs Hyperparameters

Definition

In machine learning, a hyperparameter is a parameter whose value is set before the learning process begins. By contrast, the values of other parameters are derived via training.
1. Parameters: $W$, $b$
2. Hyperparameters:
- Learning rate $\alpha$ – a suitable learning rate can be chosen by plotting cost against iterations for several candidate rates and comparing the curves (see the sketch under "Tuning Hyperparameters" below)
- Number of iterations
- Network architecture
- Activation functions
- …

Tuning Hyperparameters
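
A minimal sketch of the cost-vs-iterations comparison mentioned above, using a toy logistic-regression problem (the synthetic data and the candidate rates 0.01/0.1/1.0 are arbitrary choices for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy logistic-regression problem (synthetic data) just to produce cost curves.
rng = np.random.default_rng(0)
X = rng.standard_normal((2, 200))                   # 2 features, m = 200 examples
Y = (X[0] + X[1] > 0).astype(float).reshape(1, -1)  # labels in {0, 1}

def train(alpha, iterations=500):
    """Gradient descent with learning rate alpha; returns the cost per iteration."""
    W, b = np.zeros((1, 2)), 0.0
    costs = []
    for _ in range(iterations):
        A = 1 / (1 + np.exp(-(W @ X + b)))          # sigmoid forward pass
        costs.append(float(np.mean(-(Y * np.log(A + 1e-8)
                                     + (1 - Y) * np.log(1 - A + 1e-8)))))
        dZ = A - Y                                  # gradient of cost w.r.t. Z
        W = W - alpha * dZ @ X.T / X.shape[1]
        b = b - alpha * float(np.mean(dZ))
    return costs

# Plot the curves; pick the rate whose cost decreases quickly and smoothly.
for alpha in (0.01, 0.1, 1.0):
    plt.plot(train(alpha), label=f"alpha = {alpha}")
plt.xlabel("iterations")
plt.ylabel("cost")
plt.legend()
plt.show()
```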
