Practical Aspects of Deep Learning: Quiz 3

1. If searching among a large number of hyperparameters, you should try values in a grid rather than random values, so that you can carry out the search more systematically and not rely on chance. True or False?

True

False
Explanation:
In classical machine learning, where there are only a few hyperparameters, we used to tune them by searching over a grid of points.
In deep learning, where there are many hyperparameters, the recommendation is to sample points at random rather than on a regular grid. Since we cannot know in advance which hyperparameter matters most, random sampling probes many more distinct values of each hyperparameter for the same budget, making it more likely to find good values for the important ones. So the answer is False.
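The grid-versus-random contrast can be sketched in NumPy; the hyperparameter names and ranges below (learning rate, hidden units) are illustrative assumptions, not part of the quiz:

```python
import numpy as np

rng = np.random.default_rng(0)

# Grid search: 25 trials, but only 5 distinct values per hyperparameter.
grid_lr = np.logspace(-4, 0, 5)           # learning-rate candidates
grid_units = np.linspace(50, 250, 5)      # hidden-unit candidates
grid_trials = [(lr, u) for lr in grid_lr for u in grid_units]

# Random search: 25 trials, and 25 distinct values per hyperparameter,
# so the more important hyperparameter is probed far more finely.
rand_trials = [(10 ** rng.uniform(-4, 0), rng.uniform(50, 250))
               for _ in range(25)]

assert len(set(lr for lr, _ in grid_trials)) == 5
assert len(set(lr for lr, _ in rand_trials)) == 25
```

With the same budget of 25 trials, random search has tried 25 distinct learning rates versus the grid's 5, which is exactly why it is preferred when we do not know which hyperparameter matters most.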

2. Every hyperparameter, if set poorly, can have a huge negative impact on training, and so all hyperparameters are about equally important to tune well. True or False?

True

False
Explanation: hyperparameters are not all equally important, so the answer is False.

3. During hyperparameter search, whether you try to babysit one model (“Panda” strategy) or train a lot of models in parallel (“Caviar”) is largely determined by:

Whether you use batch or mini-batch optimization

The presence of local minima (and saddle points) in your neural network

The amount of computational power you can access

The number of hyperparameters you have to tune
Explanation:
With limited computational resources, use the first strategy: babysit a single model and keep improving it day by day.
With ample computational resources, use the second strategy: train many models in parallel and pick the best one.

4. If you think β (hyperparameter for momentum) is between 0.9 and 0.99, which of the following is the recommended way to sample a value for beta?

r = np.random.rand()
beta = r*0.09 + 0.9

r = np.random.rand()
beta = 1 - 10**(-r - 1)

r = np.random.rand()
beta = 1 - 10**(-r + 1)

r = np.random.rand()
beta = r*0.9 + 0.09

Explanation: when sampling, take uniform random values within each order-of-magnitude range, e.g. within 0.0001~0.001, 0.001~0.01, 0.01~0.1, and 0.1~1.
In general, to sample proportionally (on a log scale) over the range 10^a to 10^b, draw r uniformly from [a, b] and set alpha = 10^r.
Likewise, when using exponentially weighted averages, the hyperparameter beta should be sampled this way, on a log scale over 1-beta.
Options 1 and 4 sample uniformly on a linear scale, and option 3 gives the wrong range; option 2 is correct.
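Option 2's log-scale sampling can be checked directly; this is a minimal sketch of the recommended recipe:

```python
import numpy as np

rng = np.random.default_rng(1)

# Sample beta on a log scale over 1-beta in [0.01, 0.1],
# i.e. beta in [0.9, 0.99] (option 2 in the question).
r = rng.uniform(0.0, 1.0)     # r uniform in [0, 1]
beta = 1 - 10 ** (-r - 1)     # 1-beta = 10^(-r-1), in [0.01, 0.1]

assert 0.9 <= beta <= 0.99
```

At the endpoints, r = 0 gives beta = 0.9 and r = 1 gives beta = 0.99; equal intervals of r cover equal ratios of 1-beta, so values near 0.99 are sampled as densely as values near 0.9.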

5. Finding good hyperparameter values is very time-consuming. So typically you should do it once at the start of the project, and try to find very good hyperparameters so that you don’t ever have to revisit tuning them again. True or false?

True

False
Explanation: according to the course, you should re-check your hyperparameters every three to six months, so the answer is False.

6. In batch normalization as presented in the videos, if you apply it on the lth layer of your neural network, what are you normalizing?

z[l]

a[l]

W[l]

b[l]
Explanation: Batch Norm normalizes z[l].

7. In the normalization formula z_norm^(i) = (z^(i) − μ) / √(σ² + ε), why do we use epsilon?

In case μ is too small

To have a more accurate normalization

To speed up convergence

To avoid division by zero

8. Which of the following statements about γ and β in Batch Norm are true?

They set the mean and variance of the linear variable z[l] of a given layer.

There is one global value of γ∈R and one global value of β∈R for each layer, which applies to all the hidden units in that layer.

The optimal values are γ = √(σ² + ε), and β = μ.

They can be learned using Adam, Gradient descent with momentum, or RMSprop, not just with gradient descent.

β and γ are hyperparameters of the algorithm, which we tune via random sampling.
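The roles of γ and β can be illustrated with a minimal NumPy forward pass; the layer shape and the per-unit γ, β values below are assumptions for illustration:

```python
import numpy as np

def batch_norm_forward(z, gamma, beta, eps=1e-8):
    """Normalize z over the mini-batch, then rescale with learned gamma, beta."""
    mu = z.mean(axis=1, keepdims=True)        # per-unit mean over the batch
    var = z.var(axis=1, keepdims=True)        # per-unit variance over the batch
    z_norm = (z - mu) / np.sqrt(var + eps)    # eps avoids division by zero
    return gamma * z_norm + beta              # gamma, beta set std and mean

z = np.random.default_rng(2).normal(size=(4, 256))  # 4 units, batch of 256
gamma = np.ones((4, 1)) * 2.0   # one learned value per hidden unit
beta = np.ones((4, 1)) * 0.5
z_tilde = batch_norm_forward(z, gamma, beta)

# Per unit, the output mean equals beta and the output std equals gamma.
assert np.allclose(z_tilde.mean(axis=1), 0.5, atol=1e-6)
assert np.allclose(z_tilde.std(axis=1), 2.0, atol=1e-3)
```

Note that γ and β are learned per hidden unit (one value per row here), not one global scalar per layer, and they set the standard deviation and mean of the linear variable z[l], which is why the first statement in the question is true.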

9. After training a neural network with Batch Norm, at test time, to evaluate the neural network on a new example you should:

If you implemented Batch Norm on mini-batches of (say) 256 examples, then to evaluate on one test example, duplicate that example 256 times so that you’re working with a mini-batch the same size as during training.

Use the most recent mini-batch’s value of μ and σ² to perform the needed normalizations.

Skip the step where you normalize using μ and σ² since a single test example cannot be normalized.

Perform the needed normalizations, use μ and σ² estimated using an exponentially weighted average across mini-batches seen during training.
Explanation:
The usual approach is to maintain, during training, an exponentially weighted average of the mini-batch means and variances. When training ends, these averaged values of μ and σ² are plugged directly into the Batch Norm formula to normalize and predict on test examples.
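This test-time procedure can be sketched as follows; the momentum value and the simulated data distribution are assumptions for illustration:

```python
import numpy as np

def update_running_stats(running_mu, running_var, batch_mu, batch_var,
                         momentum=0.9):
    """Exponentially weighted average of mini-batch statistics (training)."""
    running_mu = momentum * running_mu + (1 - momentum) * batch_mu
    running_var = momentum * running_var + (1 - momentum) * batch_var
    return running_mu, running_var

rng = np.random.default_rng(3)
running_mu, running_var = 0.0, 1.0
for _ in range(500):                                  # simulate 500 mini-batches
    batch = rng.normal(loc=5.0, scale=2.0, size=256)  # true mu=5, var=4
    running_mu, running_var = update_running_stats(
        running_mu, running_var, batch.mean(), batch.var())

# At test time, a single example is normalized with the running estimates,
# not with statistics of its own (size-one) batch.
x = 6.0
x_norm = (x - running_mu) / np.sqrt(running_var + 1e-8)

assert abs(running_mu - 5.0) < 0.5
assert abs(running_var - 4.0) < 1.0
assert 0.4 < x_norm < 0.6
```

Because the running averages converge to the population statistics seen during training, a lone test example gets a stable, meaningful normalization.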

10. Which of these statements about deep learning programming frameworks are true? (Check all that apply)

Deep learning programming frameworks require cloud-based machines to run.

A programming framework allows you to code up deep learning algorithms with typically fewer lines of code than a lower-level language such as Python.

Even if a project is currently open source, good governance of the project helps ensure that it remains open even in the long term, rather than becoming closed or modified to benefit only one company.
