Andrew Ng Deep Learning Course: Collection of Missed Questions

1. Logistic regression’s weights w should be initialized randomly rather than to all zeros, because if you initialize to all zeros, then logistic regression will fail to learn a useful decision boundary because it will fail to “break symmetry”, True/False?

False. Logistic regression has no hidden layer. If you initialize the weights to zeros, the first example x fed into logistic regression gives z = 0 (so the predicted probability is 0.5), but the derivatives depend on the input x (because there is no hidden layer), which is not zero. So by the second iteration the weight values follow x's distribution and differ from each other unless x is a constant vector.
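The argument above can be checked numerically. The following is a minimal sketch in plain Python, assuming batch gradient descent with cross-entropy loss; the toy data, the learning rate, and the helper name `gradient_step` are made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient_step(w, b, X, y, lr=0.1):
    """One batch gradient-descent step for logistic regression."""
    m = len(X)
    dw = [0.0] * len(w)
    db = 0.0
    for x_i, y_i in zip(X, y):
        a = sigmoid(sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b)
        err = a - y_i                      # dL/dz for cross-entropy loss
        for j, x_j in enumerate(x_i):
            dw[j] += err * x_j / m         # gradient depends directly on x
        db += err / m
    w = [w_j - lr * dw_j for w_j, dw_j in zip(w, dw)]
    return w, b - lr * db

# Hypothetical toy data: two features, four examples.
X = [[1.0, 2.0], [2.0, 0.5], [-1.0, -1.5], [-2.0, -0.5]]
y = [1, 1, 0, 0]

w, b = [0.0, 0.0], 0.0                     # all-zero initialization
w, b = gradient_step(w, b, X, y)
print(w)                                   # the two weights are already different
```

After a single step the two weights are non-zero and unequal, so there is no symmetry left to break.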

2. If you have 10,000,000 examples, how would you split the train/dev/test set?

98% train, 1% dev, 1% test
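In absolute numbers this split still leaves 100,000 examples each for dev and test. A small sketch of the arithmetic (the function name `split_sizes` is hypothetical):

```python
def split_sizes(n_examples, train_pct=98, dev_pct=1):
    """Return (train, dev, test) sizes; test gets whatever is left over."""
    n_train = n_examples * train_pct // 100
    n_dev = n_examples * dev_pct // 100
    n_test = n_examples - n_train - n_dev
    return n_train, n_dev, n_test

sizes = split_sizes(10_000_000)
print(sizes)
```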

3. In an exponentially weighted average plot, decreasing beta shifts the curve to the left.

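Personal explanation: the higher beta is, the more each point's value gets averaged into the points that follow, so the curve lags behind the data. Decreasing beta lets the later values "catch up" more quickly, so visually the curve shifts left.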
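The lag can be made concrete with a small experiment. This sketch assumes the usual recurrence v_t = beta * v_{t-1} + (1 - beta) * y_t with v_0 = 0 and no bias correction, applied to a made-up, steadily rising signal. On such a ramp the average trails the data by about beta / (1 - beta) steps.

```python
def ewa(values, beta):
    """Exponentially weighted average: v_t = beta * v_{t-1} + (1 - beta) * y_t."""
    v, out = 0.0, []
    for y in values:
        v = beta * v + (1.0 - beta) * y
        out.append(v)
    return out

ramp = [float(t) for t in range(1, 201)]   # hypothetical steadily rising signal
lag_high = ramp[-1] - ewa(ramp, 0.9)[-1]   # how far the average trails the data
lag_low = ramp[-1] - ewa(ramp, 0.5)[-1]
print(lag_high, lag_low)
```

With beta = 0.9 the curve trails by roughly 9 steps, with beta = 0.5 by roughly 1, which is the "shift to the left" seen when beta is decreased.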

4.

After setting up your train/dev/test sets, the City Council comes across another 1,000,000 images, called the “citizens’ data”. Apparently the citizens of Peacetopia are so scared of birds that they volunteered to take pictures of the sky and label them, thus contributing these additional 1,000,000 images. These images are different from the distribution of images the City Council had originally given you, but you think it could help your algorithm. You should not add the citizens’ data to the training set, because this will cause the training and dev/test set distributions to become different, thus hurting dev and test set performance. True/False?

False

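5. A member of the City Council knows little about machine learning and thinks you should add the 1,000,000 citizens’ data images to the test set. You object because: (B, C)

A. A bigger test set will slow down the speed of iterating because of the computational expense of evaluating models on the test set.

B. This would cause the dev and test set distributions to become different. This is a bad idea because you’re not aiming where you want to hit.

C. The test set no longer reflects the distribution of data (security cameras) you most care about.

D. The 1,000,000 citizens’ data images do not have a consistent x --> y mapping as the rest of the data (similar to the New York City/Detroit housing prices example from lecture).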


6. Because pooling layers do not have parameters, they do not affect the backpropagation (derivatives) calculation.

False
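One way to see why this is False: a pooling layer has nothing to learn, but gradients still have to flow through it to reach earlier layers. A minimal sketch for 2x2 max pooling with stride 2 on a single channel, assuming the usual rule that each upstream gradient is routed to the input position that held the maximum (function names and the example input are made up):

```python
def maxpool2x2_forward(x):
    """2x2 max pooling with stride 2 on a 2D list; also records argmax positions."""
    out, argmax = [], []
    for i in range(0, len(x), 2):
        out_row, arg_row = [], []
        for j in range(0, len(x[0]), 2):
            window = [(x[i + di][j + dj], (i + di, j + dj))
                      for di in range(2) for dj in range(2)]
            value, pos = max(window)
            out_row.append(value)
            arg_row.append(pos)
        out.append(out_row)
        argmax.append(arg_row)
    return out, argmax

def maxpool2x2_backward(d_out, argmax, shape):
    """Route each upstream gradient to the input position that held the max."""
    rows, cols = shape
    d_x = [[0.0] * cols for _ in range(rows)]
    for i, row in enumerate(argmax):
        for j, (r, c) in enumerate(row):
            d_x[r][c] += d_out[i][j]
    return d_x

x = [[1.0, 3.0, 2.0, 0.0],
     [4.0, 2.0, 1.0, 5.0],
     [0.0, 1.0, 7.0, 2.0],
     [2.0, 6.0, 3.0, 1.0]]
out, argmax = maxpool2x2_forward(x)
d_x = maxpool2x2_backward([[1.0, 1.0], [1.0, 1.0]], argmax, (4, 4))
```

Only the four positions that won their window receive a gradient; every other entry of `d_x` is zero. The layer therefore shapes the derivatives even though it has no parameters.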

7. You have an input volume that is 32x32x16, and apply max pooling with a stride of 2 and a filter size of 2. What is the output volume?

16x16x16

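Pooling is always two-dimensional: it acts on height and width only, independently for each channel, so the number of channels stays at 16.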
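The output size follows from the standard formula floor((n - f) / s) + 1 applied to height and width, with no padding assumed. A short sketch (the helper name `pool_output_shape` is hypothetical):

```python
def pool_output_shape(height, width, channels, f, stride):
    """Output shape of a pooling layer without padding; channels pass through."""
    out_h = (height - f) // stride + 1
    out_w = (width - f) // stride + 1
    return out_h, out_w, channels

shape = pool_output_shape(32, 32, 16, f=2, stride=2)
print(shape)
```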

8.

Which ones of the following statements on Residual Networks are true? (Check all that apply.)

Using a skip-connection helps the gradient to backpropagate and thus helps you to train deeper networks

A ResNet with L layers would have on the order of L^2 skip connections in total.

The skip-connections compute a complex non-linear function of the input to pass to a deeper layer in the network.

The skip-connection makes it easy for the network to learn an identity mapping between the input and the output within the ResNet block.
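The identity-mapping statement can be illustrated with a toy residual block. This is a sketch under stated assumptions: a two-layer block with ReLU activations, as in the lectures, where the skip connection simply adds the block input to the pre-activation of the second layer. Helper names and values are made up. If both weight matrices and biases are zero, the block returns its (non-negative) input unchanged.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def residual_block(a_prev, W1, b1, W2, b2):
    """a_out = relu(W2 @ relu(W1 @ a_prev + b1) + b2 + a_prev)."""
    a1 = relu([z + b for z, b in zip(matvec(W1, a_prev), b1)])
    z2 = [z + b for z, b in zip(matvec(W2, a1), b2)]
    return relu([z + skip for z, skip in zip(z2, a_prev)])   # skip connection

a_prev = [0.5, 0.0, 2.0]                       # non-negative, as after a ReLU
zeros = [[0.0] * 3 for _ in range(3)]
a_out = residual_block(a_prev, zeros, [0.0] * 3, zeros, [0.0] * 3)
print(a_out)
```

The skip path itself is a plain addition, not a complex non-linear function, and there is one such connection per block, so the count grows linearly with depth.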

9.

You are working on a factory automation task. Your system will see a can of soft-drink coming down a conveyor belt, and you want it to take a picture and decide whether (i) there is a soft-drink can in the image, and if so (ii) its bounding box. Since the soft-drink can is round, the bounding box is always square, and the soft drink can always appears as the same size in the image. There is at most one soft drink can in each image. Here’re some typical images in your training set: 

What is the most appropriate set of output units for your neural network?

Logistic unit, bx, by

10. 

Alice proposes to simplify the GRU by always removing the Γu, i.e., setting Γu = 1. Betty proposes to simplify the GRU by removing the Γr, i.e., setting Γr = 1 always. Which of these models is more likely to work without vanishing gradient problems even when trained on very long input sequences?

Betty’s model (removing Γr), because if Γu ≈ 0 for a timestep, the gradient can propagate back through that timestep without much decay.

Yes. For the signal to backpropagate without vanishing, we need c<t> to be highly dependent on c<t-1>.
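The dependence of c<t> on c<t-1> can be read off the GRU memory update c<t> = Γu * c~<t> + (1 - Γu) * c<t-1>. The sketch below uses scalars and, as a simplifying assumption, a constant Γu; it looks only at the direct path through the (1 - Γu) term and ignores the path through the candidate c~<t>. Function names are made up.

```python
def gru_cell_state(c_prev, c_candidate, gamma_u):
    """GRU memory update: c<t> = Γu * c~<t> + (1 - Γu) * c<t-1>."""
    return gamma_u * c_candidate + (1.0 - gamma_u) * c_prev

def retained_fraction(gamma_u, steps):
    """Direct-path derivative d c<T> / d c<0> for a constant Γu."""
    return (1.0 - gamma_u) ** steps

betty = retained_fraction(gamma_u=0.001, steps=1000)  # update gate can stay near 0
alice = retained_fraction(gamma_u=1.0, steps=1000)    # Γu fixed at 1
print(betty, alice)
```

With Γu close to 0 a noticeable fraction of the signal survives 1000 steps, whereas with Γu fixed at 1 the old memory is overwritten at every step and nothing flows along this path.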

11. 

Suppose you have a 10,000 word vocabulary, and are learning 500-dimensional word embeddings. The GloVe model minimizes this objective:

$$\min \sum_{i=1}^{10,000} \sum_{j=1}^{10,000} f(X_{ij}) \left( \theta_i^T e_j + b_i + b'_j - \log X_{ij} \right)^2$$

Which of these statements are correct? Check all that apply.


θ_i and e_j should be initialized randomly at the beginning of training.
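A quick way to see why random initialization matters: the gradient of the objective with respect to θ_i is proportional to e_j, and the gradient with respect to e_j is proportional to θ_i, so if both start at zero neither ever moves. The sketch below evaluates the loss and the θ gradient on a tiny made-up co-occurrence matrix. The question does not specify f, so the weighting function here is an assumption: the form from the GloVe paper with x_max = 100 and exponent 0.75.

```python
import math

def weight(x, x_max=100.0, alpha=0.75):
    """Weighting f(X_ij); zero when X_ij = 0 so that log(0) is never evaluated."""
    if x == 0:
        return 0.0
    return min(1.0, (x / x_max) ** alpha)

def glove_loss_and_theta_grad(X, theta, e, b, b_prime):
    """Return the GloVe objective and its gradient with respect to theta."""
    loss = 0.0
    d_theta = [[0.0] * len(theta[0]) for _ in theta]
    for i, row in enumerate(X):
        for j, x_ij in enumerate(row):
            f = weight(x_ij)
            if f == 0.0:
                continue
            dot = sum(t * v for t, v in zip(theta[i], e[j]))
            diff = dot + b[i] + b_prime[j] - math.log(x_ij)
            loss += f * diff ** 2
            for k in range(len(theta[i])):
                d_theta[i][k] += 2.0 * f * diff * e[j][k]   # proportional to e_j
    return loss, d_theta

X = [[0, 4, 1], [4, 0, 2], [1, 2, 0]]       # hypothetical co-occurrence counts
zeros = [[0.0, 0.0] for _ in range(3)]
loss, d_theta = glove_loss_and_theta_grad(X, zeros, zeros, [0.0] * 3, [0.0] * 3)
print(loss, d_theta)
```

With all-zero embeddings the loss is positive but every entry of the θ gradient is zero, so gradient descent would update only the bias terms.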

