Note: These are my study notes from Andrew Ng's Deep Learning course; I will update them as my learning progresses.
Shallow Neural Networks
What's a Neural Network?
1. Representing the Computation of a Multi-Layer Neural Network
A superscript in square brackets, [l], denotes the layer a parameter belongs to; a superscript in parentheses, (i), indexes the training example.
# Computation of a two-layer network, looping over the m training examples
for i = 1 to m:
$$z^{[1](i)} = W^{[1]} x^{(i)} + b^{[1]}$$
$$a^{[1](i)} = \sigma(z^{[1](i)})$$
$$z^{[2](i)} = W^{[2]} a^{[1](i)} + b^{[2]}$$
$$a^{[2](i)} = \sigma(z^{[2](i)})$$
Note that the input to layer 2 is the activation $a^{[1](i)}$, not $x^{(i)}$.
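To make the loop concrete, here is a minimal numpy sketch of the unvectorized computation. The layer sizes (2 inputs, 3 hidden units, 1 output) and the `sigmoid` helper are assumptions chosen for illustration, not values fixed by the course:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Assumed toy sizes: n_x = 2 inputs, n_h = 3 hidden units, m = 5 examples
n_x, n_h, m = 2, 3, 5
rng = np.random.default_rng(0)

W1 = rng.standard_normal((n_h, n_x)) * 0.01   # W^[1]: (n_h, n_x)
b1 = np.zeros((n_h, 1))                       # b^[1]: (n_h, 1)
W2 = rng.standard_normal((1, n_h)) * 0.01     # W^[2]: (1, n_h)
b2 = np.zeros((1, 1))                         # b^[2]: (1, 1)

X = rng.standard_normal((n_x, m))             # m training examples as columns

# Unvectorized: process one example x^(i) at a time
for i in range(m):
    x_i = X[:, i:i+1]          # slice keeps the column shape (n_x, 1)
    z1 = W1 @ x_i + b1         # z^[1](i)
    a1 = sigmoid(z1)           # a^[1](i)
    z2 = W2 @ a1 + b2          # z^[2](i): input is a^[1](i), not x^(i)
    a2 = sigmoid(z2)           # a^[2](i), the prediction for example i
```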
$$X = \left[ \begin{matrix} | & | & & | \\ x^{(1)} & x^{(2)} & \cdots & x^{(m)} \\ | & | & & | \end{matrix} \right] \tag{1}$$
$$A^{[1]} = \left[ \begin{matrix} | & | & & | \\ a^{[1](1)} & a^{[1](2)} & \cdots & a^{[1](m)} \\ | & | & & | \end{matrix} \right] \tag{2}$$
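A small sketch of what equations (1) and (2) mean in numpy terms: each training example is a column, and stacking the columns side by side gives the matrix (the toy values here are assumptions):

```python
import numpy as np

# Three toy examples, each a column vector x^(i) with two features
x1 = np.array([[1.0], [2.0]])
x2 = np.array([[3.0], [4.0]])
x3 = np.array([[5.0], [6.0]])

# Stacking columns side by side gives X with shape (n_x, m) = (2, 3);
# column i of X is exactly x^(i), matching equation (1)
X = np.hstack([x1, x2, x3])
print(X.shape)   # (2, 3)
```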
2. Activation Functions
(1). Sigmoid Activation Function
$$g(z) = \frac{1}{1+e^{-z}}$$
$$g'(z) = g(z) \times [1-g(z)]$$
(2). Tanh Activation Function
$$g(z) = \frac{e^{z}-e^{-z}}{e^{z}+e^{-z}}$$
$$g'(z) = 1-g(z)^{2}$$
(3). ReLU Activation Function
$$g(z) = \max(0, z)$$
$$g'(z) = \begin{cases} 0 & \text{if } z < 0 \\ 1 & \text{if } z \geq 0 \end{cases} \tag{1. ReLU and its derivative}$$
$$g(z) = \max(0.01 \cdot z,\ z)$$
$$g'(z) = \begin{cases} 0.01 & \text{if } z < 0 \\ 1 & \text{if } z \geq 0 \end{cases} \tag{2. Leaky ReLU and its derivative}$$
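These formulas translate directly into numpy. The following is a sketch; the 0.01 Leaky ReLU slope comes from the formula above, exposed as an `alpha` parameter of my own naming:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)             # g'(z) = g(z)[1 - g(z)]

def tanh_prime(z):
    return 1.0 - np.tanh(z) ** 2     # g'(z) = 1 - g(z)^2

def relu(z):
    return np.maximum(0.0, z)

def relu_prime(z):
    return (z >= 0).astype(float)    # 0 for z < 0, 1 for z >= 0

def leaky_relu(z, alpha=0.01):
    return np.maximum(alpha * z, z)

def leaky_relu_prime(z, alpha=0.01):
    return np.where(z < 0, alpha, 1.0)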
3. Gradient Descent for a Neural Network
- Forward propagation
$$Z^{[1]} = W^{[1]} X + b^{[1]}$$
$$A^{[1]} = g^{[1]}(Z^{[1]})$$
$$Z^{[2]} = W^{[2]} A^{[1]} + b^{[2]}$$
$$A^{[2]} = g^{[2]}(Z^{[2]}) = \sigma(Z^{[2]}) \tag{A, Z, X are all vectorized}$$
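Vectorized over all m examples at once, the forward pass is a few lines of numpy. The toy shapes, the `sigmoid` helper, and the choice of tanh for g^[1] are assumptions for this sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Assumed toy shapes: 2 features, 3 hidden units, 5 examples
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((3, 2)) * 0.01, np.zeros((3, 1))
W2, b2 = rng.standard_normal((1, 3)) * 0.01, np.zeros((1, 1))
X = rng.standard_normal((2, 5))

# Vectorized forward propagation over all m examples at once
Z1 = W1 @ X + b1      # (3, 5); b1 broadcasts across the m columns
A1 = np.tanh(Z1)      # g^[1] taken to be tanh (an assumed choice)
Z2 = W2 @ A1 + b2     # (1, 5)
A2 = sigmoid(Z2)      # g^[2] = sigma, as in the equations above
```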
- Back propagation
$$dZ^{[2]} = A^{[2]} - Y$$
$$dW^{[2]} = \frac{1}{m}\, dZ^{[2]} A^{[1]T}$$
$$db^{[2]} = \frac{1}{m}\, \text{np.sum}(dZ^{[2]},\ \text{axis}=1,\ \text{keepdims=True})$$
$$dZ^{[1]} = W^{[2]T} dZ^{[2]} \ast g'^{[1]}(Z^{[1]}) \quad (\ast \text{ is the element-wise product})$$
$$dW^{[1]} = \frac{1}{m}\, dZ^{[1]} X^{T}$$
$$db^{[1]} = \frac{1}{m}\, \text{np.sum}(dZ^{[1]},\ \text{axis}=1,\ \text{keepdims=True})$$
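Putting both passes together, here is a hedged sketch of one complete gradient descent step. The toy data, the tanh hidden layer (so g'^[1](Z^[1]) = 1 - A^[1]^2), and the learning rate are all assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_x, n_h, m = 2, 3, 5                        # assumed toy sizes
W1, b1 = rng.standard_normal((n_h, n_x)) * 0.01, np.zeros((n_h, 1))
W2, b2 = rng.standard_normal((1, n_h)) * 0.01, np.zeros((1, 1))
X = rng.standard_normal((n_x, m))
Y = rng.integers(0, 2, size=(1, m)).astype(float)   # toy binary labels
alpha = 0.1                                  # assumed learning rate

# Forward propagation (g^[1] = tanh, g^[2] = sigmoid)
Z1 = W1 @ X + b1
A1 = np.tanh(Z1)
Z2 = W2 @ A1 + b2
A2 = sigmoid(Z2)

# Back propagation, mirroring the six equations above
dZ2 = A2 - Y
dW2 = (1.0 / m) * dZ2 @ A1.T
db2 = (1.0 / m) * np.sum(dZ2, axis=1, keepdims=True)
dZ1 = (W2.T @ dZ2) * (1.0 - A1 ** 2)         # element-wise product with g'^[1](Z1)
dW1 = (1.0 / m) * dZ1 @ X.T
db1 = (1.0 / m) * np.sum(dZ1, axis=1, keepdims=True)

# Gradient descent parameter update
W1 -= alpha * dW1; b1 -= alpha * db1
W2 -= alpha * dW2; b2 -= alpha * db2
```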
4. Random Initialization of Parameters
import numpy as np

# Shapes assume 2 inputs, 2 hidden units, and 1 output unit
W1 = np.random.randn(2, 2) * 0.01  # 0.01 is a scaling factor (not a learning rate):
                                   # small initial weights keep z where the sigmoid/tanh
                                   # gradients are large enough for gradient descent to work
b1 = np.zeros((2, 1))              # biases may start at zero; the random W breaks symmetry
W2 = np.random.randn(1, 2) * 0.01  # second-layer weights (the original repeated W1 here)
b2 = 0                             # a plain scalar works: numpy broadcasting expands it
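Why random values rather than zeros: if W^[1] starts at all zeros, every hidden unit computes the same output and receives the same gradient update, so the units never differentiate. A tiny sketch of that symmetry (toy input assumed):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Zero-initialized first layer: both hidden units are identical
W1, b1 = np.zeros((2, 2)), np.zeros((2, 1))
x = np.array([[0.5], [-1.2]])
a1 = sigmoid(W1 @ x + b1)
print(a1.ravel())   # both entries equal: the units compute the same function
```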