DeepLearning.ai Code Notes 1: Neural Networks and Deep Learning


Dod_Jdi


Article link: https://blog.csdn.net/dod_jdi/article/details/79796669


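A note up front: this series collects excerpts, translations, and explanations of the parts of the programming assignments that I consider most important. The main goal is to record the core idea or workflow of each model, along with common coding mistakes, so that I can come back to it later and fill in gaps.

Assignment link: https://github.com/Wasim37/deeplearning-assignment. Thanks to everyone who contributed on GitHub.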

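1. Random number generation

The difference between np.random.randn() and np.random.rand(): the n in the former stands for "normal", so it samples from the standard normal distribution, while the latter samples uniformly from [0, 1). When I started the assignments, my random numbers kept differing from the expected ones because I had left out the n.

np.random.seed(): setting a seed fixes the pseudo-random sequence. It behaves as if a fixed list of numbers had been generated, with each call returning the next entry in order.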

import numpy as np
# np.random.seed(1)     # uncomment and compare the output to see what seed does
print(np.random.random())
for i in range(5):
    print(np.random.random())

Comment kept	Comment removed
0.22199317108973948	0.22199317108973948
0.8707323061773764	0.8707323061773764
0.20671915533942642	0.20671915533942642
0.9186109079379216	0.9186109079379216
0.48841118879482914	0.48841118879482914
0.6117438629026457
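
The two points above can be checked with a short sketch. It assumes only the legacy `np.random` functions already mentioned; the sample size and variable names are made up for illustration.

```python
import numpy as np

# Same seed, same sequence: reseeding replays the draws from the start.
np.random.seed(1)
first_run = [np.random.random() for _ in range(3)]
np.random.seed(1)
second_run = [np.random.random() for _ in range(3)]
print(first_run == second_run)

# rand: uniform on [0, 1); randn: standard normal (mean 0, variance 1).
uniform = np.random.rand(100000)
normal = np.random.randn(100000)
print(uniform.min(), uniform.max())   # both inside [0, 1)
print(normal.min(), normal.max())     # negative values occur, unlike rand
```

If the seed call is left out, the two runs will almost certainly differ, which is the mismatch with the assignment's expected output described above.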

2. Basic steps for building a neural network

1. Define the model structure (such as number of input features)
2. Initialize the model's parameters
3. Loop:
     Calculate current loss (forward propagation)
     Calculate current gradient (backward propagation)
     Update parameters (gradient descent)

You often build 1-3 separately and integrate them into one function we call model().


def initialize_parameters_deep(layer_dims):
    ...
    return parameters 
def L_model_forward(X, parameters):
    ...
    return AL, caches  # AL: activation of the last layer; caches: values stored per layer for backpropagation
def compute_cost(AL, Y):
    ...
    return cost
def L_model_backward(AL, Y, caches):
    ...
    return grads
def update_parameters(parameters, grads, learning_rate):
    ...
    return parameters
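
To make the three steps concrete, here is a minimal sketch for the simplest case, a single sigmoid unit (logistic regression), rather than the assignment's L-layer network. The `model` signature, the hyperparameters, and the toy data are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(X, Y, num_iterations=1000, learning_rate=0.1):
    """Train a single sigmoid unit. X: (n_x, m), Y: (1, m)."""
    n_x, m = X.shape                      # 1. model structure
    w, b = np.zeros((n_x, 1)), 0.0        # 2. initialize parameters
    costs = []
    for _ in range(num_iterations):       # 3. loop
        A = sigmoid(w.T @ X + b)          # forward propagation
        cost = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))
        dZ = A - Y                        # backward propagation
        dw = X @ dZ.T / m
        db = np.sum(dZ) / m
        w -= learning_rate * dw           # gradient descent update
        b -= learning_rate * db
        costs.append(cost)
    return w, b, costs

# Toy data (made up for illustration): the label is 1 when x > 0.
X = np.array([[-2.0, -1.0, 1.0, 2.0]])
Y = np.array([[0, 0, 1, 1]])
w, b, costs = model(X, Y)
predictions = (sigmoid(w.T @ X + b) > 0.5).astype(int)
print(costs[0], costs[-1])   # the cost should go down during training
print(predictions)
```

The deep version follows the same loop, with the forward, cost, backward, and update steps each delegated to one of the functions listed above.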

Main formulas for forward propagation:

$$z^{(i)} = w^T x^{(i)} + b \tag{1}$$

$$\hat{y}^{(i)} = a^{(i)} = \mathrm{sigmoid}(z^{(i)}) \tag{2}$$

$$\mathcal{L}(a^{(i)}, y^{(i)}) = - y^{(i)} \log(a^{(i)}) - (1-y^{(i)}) \log(1-a^{(i)}) \tag{3}$$

The cost is then computed by summing over all training examples:

$$J = \frac{1}{m} \sum_{i=1}^m \mathcal{L}(a^{(i)}, y^{(i)}) \tag{4}$$
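
As a quick sanity check of equations (1) to (4): with w = 0 and b = 0 every activation is 0.5, so the cost should equal ln 2 (about 0.693) whatever the labels are. The sketch below evaluates the four formulas for all examples at once; the data values are made up, and the examples are assumed to be stored as columns of X.

```python
import numpy as np

X = np.array([[1.0, 2.0, -1.0],
              [3.0, 0.5, -2.0]])   # shape (n_x, m) = (2, 3), made-up values
Y = np.array([[1, 0, 1]])           # shape (1, m)
w = np.zeros((2, 1))
b = 0.0

Z = w.T @ X + b                     # equation (1), all examples at once
A = 1.0 / (1.0 + np.exp(-Z))        # equation (2)
losses = -Y * np.log(A) - (1 - Y) * np.log(1 - A)   # equation (3)
J = losses.sum() / X.shape[1]       # equation (4)
print(J, np.log(2))                 # the two values should agree
```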

Main formulas for backward propagation:

For layer $l$, the linear part is: $Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$ (followed by an activation).
Suppose you have already calculated the derivative $dZ^{[l]} = \frac{\partial \mathcal{L}}{\partial Z^{[l]}}$. You want to get $(dW^{[l]}, db^{[l]}, dA^{[l-1]})$.
The three outputs $(dW^{[l]}, db^{[l]}, dA^{[l-1]})$ are computed using the input $dZ^{[l]}$. Here are the formulas you need:

$$dW^{[l]} = \frac{\partial \mathcal{L}}{\partial W^{[l]}} = \frac{1}{m} dZ^{[l]} A^{[l-1] T} \tag{1}$$

$$db^{[l]} = \frac{\partial \mathcal{L}}{\partial b^{[l]}} = \frac{1}{m} \sum_{i = 1}^{m} dZ^{[l](i)} \tag{2}$$

$$dA^{[l-1]} = \frac{\partial \mathcal{L}}{\partial A^{[l-1]}} = W^{[l] T} dZ^{[l]} \tag{3}$$

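import numpy as np

def linear_backward(dZ, cache):
    """
    Backward pass through the linear part of one layer.
    :param dZ: gradient of the loss with respect to this layer's linear output Z; for the output layer L this is usually A - Y
    :param cache: tuple (A_pre, W, b) stored during forward propagation
    :return: dA_pre, dW, db
    """
    A_pre, W, b = cache
    m = A_pre.shape[1]  # number of examples

    dW = np.dot(dZ, A_pre.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    # Chain rule: dL/dA_pre = dL/dZ * dZ/dA_pre = W.T . dZ; the names dA_pre and dZ drop the "dL/" prefix for brevity
    dA_pre = np.dot(W.T, dZ)  # note: dA and dZ are per-example, so no division by m
    return dA_pre, dW, db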
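
One way to gain confidence in these formulas is to check shapes and compare one entry of dW against a finite-difference estimate. The sketch below repeats the function so that it runs on its own. The layer sizes, the random inputs, and the `surrogate_cost` helper are assumptions for illustration: the helper is linear in Z, so its gradient with respect to Z is exactly dZ / m, which matches the convention used by the formulas.

```python
import numpy as np

def linear_backward(dZ, cache):
    A_pre, W, b = cache
    m = A_pre.shape[1]
    dW = np.dot(dZ, A_pre.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_pre = np.dot(W.T, dZ)
    return dA_pre, dW, db

rng = np.random.default_rng(0)
A_pre = rng.standard_normal((3, 5))   # 3 units in layer l-1, m = 5 examples
W = rng.standard_normal((2, 3))       # 2 units in layer l
b = rng.standard_normal((2, 1))
dZ = rng.standard_normal((2, 5))      # stand-in for the upstream gradient

dA_pre, dW, db = linear_backward(dZ, (A_pre, W, b))
print(dA_pre.shape, dW.shape, db.shape)   # should match A_pre, W, b

def surrogate_cost(W_):
    # Hypothetical cost, linear in Z = W_ @ A_pre + b, averaged over examples.
    return np.sum(dZ * (W_ @ A_pre + b)) / A_pre.shape[1]

# Central finite difference on a single entry of W.
eps = 1e-6
W_plus, W_minus = W.copy(), W.copy()
W_plus[0, 0] += eps
W_minus[0, 0] -= eps
numeric = (surrogate_cost(W_plus) - surrogate_cost(W_minus)) / (2 * eps)
print(numeric, dW[0, 0])   # the two values should agree closely
```

Each gradient has the same shape as the quantity it differentiates, which is a cheap check worth running whenever a backward function is written by hand.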
