Deep Learning Notes (Andrew Ng)

Framework for each chapter

COURSE 1

WEEK TWO

  1. Preprocessing the dataset is important.
  2. You implemented each function separately: initialize(), propagate(), optimize(). Then you built a model(). (A sketch of propagate() follows this list.)
  3. Tuning the learning rate (which is an example of a “hyperparameter”) can make a big difference to the algorithm.
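
For concreteness, here is a minimal numpy sketch of what propagate() computes for logistic regression (the exact code in the notebook may differ; sigmoid() is a helper defined inline here):

    import numpy as np

    def sigmoid(z):
        return 1 / (1 + np.exp(-z))

    def propagate(w, b, X, Y):
        """One forward/backward pass for logistic regression.
        w: (n, 1) weights, b: scalar bias, X: (n, m) inputs, Y: (1, m) labels."""
        m = X.shape[1]
        A = sigmoid(np.dot(w.T, X) + b)  # predictions, shape (1, m)
        cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
        dw = np.dot(X, (A - Y).T) / m    # gradient of the cost w.r.t. w
        db = np.sum(A - Y) / m           # gradient of the cost w.r.t. b
        return {"dw": dw, "db": db}, cost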

WEEK THREE

  1. Define the neural network structure (# of input units, # of hidden units, etc.).
  2. Initialize the model’s parameters
  3. Loop (see the code sketch after this list):
    • Implement forward propagation
    • Compute loss
    • Implement backward propagation to get the gradients
    • Update parameters (gradient descent)
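
A minimal numpy sketch of this whole loop for a 1-hidden-layer binary classifier (tanh hidden layer, sigmoid output); the function name train() and its signature are my own, not the notebook's:

    import numpy as np

    def train(X, Y, n_h, num_iterations=10000, lr=1.2):
        """X: (n_x, m) inputs, Y: (1, m) binary labels, n_h: hidden units."""
        n_x, m = X.shape
        # 1. Initialize parameters (small random weights break symmetry)
        W1 = np.random.randn(n_h, n_x) * 0.01; b1 = np.zeros((n_h, 1))
        W2 = np.random.randn(1, n_h) * 0.01;   b2 = np.zeros((1, 1))
        for i in range(num_iterations):
            # 2. Forward propagation (tanh hidden layer, sigmoid output)
            Z1 = W1 @ X + b1;  A1 = np.tanh(Z1)
            Z2 = W2 @ A1 + b2; A2 = 1 / (1 + np.exp(-Z2))
            # 3. Compute loss (binary cross-entropy)
            cost = -np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))
            # 4. Backward propagation
            dZ2 = A2 - Y
            dW2 = dZ2 @ A1.T / m;  db2 = dZ2.mean(axis=1, keepdims=True)
            dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)  # tanh'(Z1) = 1 - A1^2
            dW1 = dZ1 @ X.T / m;   db1 = dZ1.mean(axis=1, keepdims=True)
            # 5. Update parameters (gradient descent)
            W1 -= lr * dW1; b1 -= lr * db1
            W2 -= lr * dW2; b2 -= lr * db2
        return W1, b1, W2, b2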

WEEK FOUR
As usual, you will follow the Deep Learning methodology to build the model:

  1. Initialize parameters / Define hyperparameters
  2. Loop for num_iterations:
    a. Forward propagation
    b. Compute cost function
    c. Backward propagation
    d. Update parameters (using the parameters and grads from backprop)
  3. Use trained parameters to predict labels
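
Step 3 in code, reusing the (hypothetical) parameters returned by the train() sketch above: prediction is one forward pass followed by thresholding the output probability at 0.5.

    import numpy as np

    def predict(X, W1, b1, W2, b2):
        """Run one forward pass with the trained parameters, then
        threshold the output probabilities at 0.5 to get hard labels."""
        A1 = np.tanh(W1 @ X + b1)
        A2 = 1 / (1 + np.exp(-(W2 @ A1 + b2)))
        return (A2 > 0.5).astype(int)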

COURSE 2

WEEK FIVE
1. Initialization

  • 3-layer NN with zeros initialization: fails to break symmetry
  • 3-layer NN with large random initialization: weights are too large
  • 3-layer NN with He initialization: recommended method (see the sketch below)
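
A minimal sketch of He initialization for an L-layer network, assuming the usual layer_dims convention from the course; the sqrt(2 / fan_in) scaling is the He et al. recommendation for ReLU layers:

    import numpy as np

    def initialize_parameters_he(layer_dims):
        """He initialization: scale each weight matrix by sqrt(2 / fan_in).
        layer_dims = [n_x, n_h1, ..., n_y]; biases start at zero."""
        params = {}
        for l in range(1, len(layer_dims)):
            params["W" + str(l)] = (np.random.randn(layer_dims[l], layer_dims[l - 1])
                                    * np.sqrt(2.0 / layer_dims[l - 1]))
            params["b" + str(l)] = np.zeros((layer_dims[l], 1))
        return params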

2. Regularization
(1) L2-regularization
The value of λ is a hyperparameter that you can tune using a dev set.
L2 regularization makes your decision boundary smoother. If λ is too large, it is also possible to “oversmooth”, resulting in a model with high bias.
What is L2-regularization actually doing?
L2-regularization relies on the assumption that a model with small weights is simpler than a model with large weights. Thus, by penalizing the square values of the weights in the cost function you drive all the weights to smaller values. It becomes too costly for the cost to have large weights! This leads to a smoother model in which the output changes more slowly as the input changes.
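
As a minimal sketch (function name and signature are mine, not the notebook's), the regularized cost adds (λ / 2m) times the sum of squared weights to the usual cross-entropy; the matching change in backprop is one extra (λ / m) · W term per weight matrix:

    import numpy as np

    def compute_cost_with_L2(A_out, Y, weight_matrices, lambd):
        """Cross-entropy cost plus the L2 penalty (lambd / 2m) * sum of squared weights.
        A_out: (1, m) output probabilities, Y: (1, m) labels,
        weight_matrices: list of the W matrices of every layer."""
        m = Y.shape[1]
        cross_entropy = -np.sum(Y * np.log(A_out) + (1 - Y) * np.log(1 - A_out)) / m
        l2_penalty = (lambd / (2 * m)) * sum(np.sum(np.square(W)) for W in weight_matrices)
        return cross_entropy + l2_penalty
    # In backprop, each dW_l then gains the matching extra term (lambd / m) * W_l.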
What you should remember:

  • The implications of L2-regularization on:
    • The cost computation: a regularization term is added to the cost.
    • The backpropagation function: there are extra terms in the gradients with respect to the weight matrices.
    • Weights end up smaller (“weight decay”): weights are pushed to smaller values.
(2) Dropout
A common mistake when using dropout is to use it both in training and testing. You should use dropout (randomly eliminate nodes) only in training.
Deep learning frameworks like TensorFlow, PaddlePaddle, Keras, or Caffe come with a dropout layer implementation. Don’t stress - you will soon learn some of these frameworks.
What you should remember about dropout:
  • Dropout is a regularization technique.
  • You only use dropout during training. Don’t use dropout (randomly eliminate nodes) during test time.
  • Apply dropout both during forward and backward propagation.
  • During training time, divide each dropout layer by keep_prob to keep the same expected value for the activations. For example, if keep_prob is 0.5, then we will on average shut down half the nodes, so the output will be scaled by 0.5 since only the remaining half are contributing to the solution. Dividing by 0.5 is equivalent to multiplying by 2, so the output keeps the same expected value. You can check that this works even when keep_prob takes values other than 0.5 (see the sketch below).
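
A minimal sketch of inverted dropout tying these points together (function names are mine, not the notebook's); the same mask D must be reused in the backward pass, and neither function is called at test time:

    import numpy as np

    def dropout_forward(A, keep_prob):
        """Inverted dropout on an activation matrix A (training time only).
        Dividing by keep_prob keeps the expected value of the activations unchanged."""
        D = np.random.rand(*A.shape) < keep_prob  # keep each unit with prob keep_prob
        A = A * D                                 # shut down the dropped units
        A = A / keep_prob                         # scale up the survivors
        return A, D                               # reuse the same mask D in backprop

    def dropout_backward(dA, D, keep_prob):
        """Apply the same mask and scaling to the gradients during backprop."""
        return (dA * D) / keep_prob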

What we want you to remember from this notebook:

  • Regularization will help you reduce overfitting.
  • Regularization will drive your weights to lower values.
  • L2 regularization and Dropout are two very effective regularization techniques.

3. Gradient Checking
What you should remember from this notebook:

  • Gradient checking verifies closeness between the gradients from backpropagation and the numerical approximation of the gradient (computed using forward propagation).
  • Gradient checking is slow, so we don’t run it in every iteration of training. You would usually run it only to make sure your code is correct, then turn it off and use backprop for the actual learning process.
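
A sketch of the check for a cost function J over a flattened parameter vector theta (the assignment similarly flattens all parameters into one vector; names here are mine). A relative difference around 1e-7 or below suggests the backprop gradient is correct:

    import numpy as np

    def gradient_check(J, grad, theta, epsilon=1e-7):
        """Compare an analytic gradient `grad` of a cost J(theta) against a
        two-sided numerical approximation, component by component."""
        grad_approx = np.zeros_like(theta)
        for i in range(theta.size):
            theta_plus = theta.copy();  theta_plus[i] += epsilon
            theta_minus = theta.copy(); theta_minus[i] -= epsilon
            grad_approx[i] = (J(theta_plus) - J(theta_minus)) / (2 * epsilon)
        num = np.linalg.norm(grad - grad_approx)
        denom = np.linalg.norm(grad) + np.linalg.norm(grad_approx)
        return num / denom  # relative difference between the two gradients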

Some keywords and key points

Softmax function
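
As a recap, softmax maps a score vector z to a probability distribution: softmax(z)_i = exp(z_i) / Σ_j exp(z_j). A numerically stable numpy sketch:

    import numpy as np

    def softmax(z):
        """Numerically stable softmax: subtract the column max before exponentiating.
        z: (n_classes, m) scores; returns probabilities summing to 1 per column."""
        e = np.exp(z - np.max(z, axis=0, keepdims=True))
        return e / np.sum(e, axis=0, keepdims=True)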

Cross entropy
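
Cross entropy measures how well predicted probabilities match the true labels; for one-hot labels Y and predictions A, L = -(1/m) Σ_examples Σ_i y_i log(a_i). A minimal numpy sketch:

    import numpy as np

    def cross_entropy(Y, A):
        """Multi-class cross-entropy loss, averaged over m examples.
        Y: (n_classes, m) one-hot labels, A: (n_classes, m) predicted probabilities."""
        m = Y.shape[1]
        return -np.sum(Y * np.log(A + 1e-12)) / m  # small constant guards against log(0)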
