Andrew Ng Deep Learning, Week 4 Quiz (Multiple Choice)

1.What is the “cache” used for in our implementation of forward propagation and backward propagation?
A.We use it to pass variables computed during backward propagation to the corresponding forward propagation step. It contains useful values for forward propagation to compute activations.

B.We use it to pass variables computed during forward propagation to the corresponding backward propagation step. It contains useful values for backward propagation to compute derivatives.

C.It is used to cache the intermediate values of the cost function during training.

D.It is used to keep track of the hyperparameters that we are searching over, to speed up computation.
Answer: B
The “cache” records values from the forward propagation units and sends them to the backward propagation units, because they are needed to compute the chain-rule derivatives.
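As a minimal sketch of this idea (the helper names below are chosen for illustration, not necessarily the assignment's exact functions), the forward step of a layer returns a cache that the matching backward step consumes:

    import numpy as np

    def linear_forward(A_prev, W, b):
        # Forward step: compute the pre-activation Z and cache the inputs
        Z = np.dot(W, A_prev) + b
        cache = (A_prev, W, b)   # saved for the matching backward step
        return Z, cache

    def linear_backward(dZ, cache):
        # Backward step: reuse the cached forward values to apply the chain rule
        A_prev, W, b = cache
        m = A_prev.shape[1]
        dW = np.dot(dZ, A_prev.T) / m
        db = np.sum(dZ, axis=1, keepdims=True) / m
        dA_prev = np.dot(W.T, dZ)
        return dA_prev, dW, db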

2.Among the following, which ones are “hyperparameters”? (Check all that apply.)
A.activation values $a^{[l]}$

B.number of layers $L$ in the neural network

C.number of iterations

D.size of the hidden layers $n^{[l]}$

E.bias vectors $b^{[l]}$

F.weight matrices $W^{[l]}$

G.learning rate $\alpha$
Answer: B, C, D, G

3.Which of the following statements is true?
A.The deeper layers of a neural network are typically computing more complex features of the input than the earlier layers.

B.The earlier layers of a neural network are typically computing more complex features of the input than the deeper layers.
Answer: A

4.Vectorization allows you to compute forward propagation in an $L$-layer neural network without an explicit for-loop (or any other explicit iterative loop) over the layers $l = 1, 2, \dots, L$. True/False?
Answer: False
Forward propagation propagates the input through the layers. For a shallow network we could write out each layer's computation line by line, but in a deeper network we cannot avoid a for-loop iterating over the layers, as sketched below.
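A minimal sketch of that loop, assuming ReLU hidden layers and a sigmoid output layer (the function and variable names are illustrative, not the course's exact API):

    import numpy as np

    def L_model_forward(X, parameters, L):
        # Each layer's computation is vectorized over the examples,
        # but we still loop explicitly over the L layers.
        A = X
        for l in range(1, L + 1):
            W = parameters['W' + str(l)]
            b = parameters['b' + str(l)]
            Z = np.dot(W, A) + b
            # ReLU for hidden layers, sigmoid for the output layer (assumption for this sketch)
            A = np.maximum(0, Z) if l < L else 1.0 / (1.0 + np.exp(-Z))
        return A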

5.Assume we store the values for $n^{[l]}$ in an array called layer_dims, as follows: layer_dims = [n_x, 4, 3, 2, 1]. So layer 1 has 4 hidden units, layer 2 has 3 hidden units, and so on. Which of the following for-loops will allow you to initialize the parameters for the model?
A.
for i in range(1, len(layer_dims)/2):
    parameter['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i-1]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layer_dims[i], 1) * 0.01

B.
for i in range(1, len(layer_dims)):
    parameter['W' + str(i)] = np.random.randn(layer_dims[i-1], layer_dims[i]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layer_dims[i], 1) * 0.01

C.
for i in range(1, len(layer_dims)):
    parameter['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i-1]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layer_dims[i], 1) * 0.01

D.
for i in range(1, len(layer_dims)/2):
    parameter['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i-1]) * 0.01
    parameter['b' + str(i)] = np.random.randn(layer_dims[i-1], 1) * 0.01
Answer: C
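Option C written out as a self-contained sketch; the quiz options initialize the biases with np.random.randn, so that is kept here (initializing b with zeros is also common), and the n_x value below is an arbitrary choice for the example:

    import numpy as np

    def initialize_parameters(layer_dims):
        # W[l] has shape (layer_dims[l], layer_dims[l-1]); b[l] has shape (layer_dims[l], 1)
        parameters = {}
        for i in range(1, len(layer_dims)):
            parameters['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i-1]) * 0.01
            parameters['b' + str(i)] = np.random.randn(layer_dims[i], 1) * 0.01
        return parameters

    params = initialize_parameters([5, 4, 3, 2, 1])   # n_x = 5 is an arbitrary example value
    print(params['W1'].shape)   # (4, 5)
    print(params['b4'].shape)   # (1, 1)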

6.Consider the following neural network.
[Figure: neural network diagram]
How many layers does this network have?
A.The number of layers $L$ is 5. The number of hidden layers is 4.

B.The number of layers $L$ is 4. The number of hidden layers is 4.

C.The number of layers $L$ is 3. The number of hidden layers is 3.

D.The number of layers $L$ is 4. The number of hidden layers is 3.
Answer: D
As seen in lecture, the number of layers is counted as the number of hidden layers + 1. The input and output layers are not counted as hidden layers.

7.During forward propagation, in the forward function for a layer $l$ you need to know what the activation function in that layer is (sigmoid, tanh, ReLU, etc.). During backpropagation, the corresponding backward function also needs to know what the activation function for layer $l$ is, since the gradient depends on it. True/False?
Answer: True
During backpropagation you need to know which activation was used in the forward propagation to be able to compute the correct derivative.
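A minimal sketch of that dependence (the function name and signature are illustrative, not the course's exact helper): the backward step has to dispatch on whichever activation the forward step used.

    import numpy as np

    def activation_backward(dA, Z, activation):
        # The derivative applied to dA depends on which activation produced A from Z
        if activation == 'relu':
            dZ = dA * (Z > 0)                      # ReLU'(Z) = 1 where Z > 0, else 0
        elif activation == 'sigmoid':
            s = 1.0 / (1.0 + np.exp(-Z))
            dZ = dA * s * (1 - s)                  # sigmoid'(Z) = s * (1 - s)
        elif activation == 'tanh':
            dZ = dA * (1 - np.tanh(Z) ** 2)        # tanh'(Z) = 1 - tanh(Z)^2
        else:
            raise ValueError('unknown activation: ' + activation)
        return dZ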

8.There are certain functions with the following properties:
(i) to compute the function using a shallow network circuit, you will need a large network (where we measure size by the number of logic gates in the network),
but (ii) to compute it using a deep network circuit, you need only an exponentially smaller network. True/False?
Answer: True

9.Consider the following 2 hidden layer neural network:
[Figure: two-hidden-layer neural network diagram]
Which of the following statements are True? (Check all that apply)
A.$W^{[3]}$ will have shape (1, 3)

B.$W^{[3]}$ will have shape (3, 1)

C.$W^{[2]}$ will have shape (3, 4)

D.$b^{[1]}$ will have shape (3, 1)

E.$W^{[1]}$ will have shape (4, 4)

F.$W^{[1]}$ will have shape (3, 4)

G.$b^{[3]}$ will have shape (3, 1)

H.$b^{[2]}$ will have shape (1, 1)

I.$W^{[2]}$ will have shape (3, 1)

J.$b^{[3]}$ will have shape (1, 1)

K.$b^{[2]}$ will have shape (3, 1)

L.$b^{[1]}$ will have shape (4, 1)

Answer: A, C, E, J, K, L
The shape of $W^{[l]}$ is ($n^{[l]}$, $n^{[l-1]}$).
The shape of $b^{[l]}$ is ($n^{[l]}$, 1).
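A quick check of these rules in code; the layer sizes below are reconstructed from the correct options, i.e. an assumption about the figure rather than something given in the text:

    # n^[0] = 4 inputs, then layer sizes 4, 3, 1
    layer_dims = [4, 4, 3, 1]
    for l in range(1, len(layer_dims)):
        print(f"W{l}: ({layer_dims[l]}, {layer_dims[l-1]})   b{l}: ({layer_dims[l]}, 1)")
    # W1: (4, 4)   b1: (4, 1)
    # W2: (3, 4)   b2: (3, 1)
    # W3: (1, 3)   b3: (1, 1)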

10.Whereas the previous question used a specific network, in the general case what is the dimension of $W^{[l]}$, the weight matrix associated with layer $l$?
A.$W^{[l]}$ has shape ($n^{[l]}$, $n^{[l-1]}$)

B.$W^{[l]}$ has shape ($n^{[l-1]}$, $n^{[l]}$)

C.$W^{[l]}$ has shape ($n^{[l+1]}$, $n^{[l]}$)

D.$W^{[l]}$ has shape ($n^{[l]}$, $n^{[l+1]}$)
Answer: A
