Neural Networks and Deep Learning (Quiz 4)

1. What is the “cache” used for in our implementation of forward propagation and backward propagation?

It is used to keep track of the hyperparameters that we are searching over, to speed up computation.

We use it to pass variables computed during backward propagation to the corresponding forward propagation step. It contains useful values for forward propagation to compute activations.

It is used to cache the intermediate values of the cost function during training.

We use it to pass variables computed during forward propagation to the corresponding backward propagation step. It contains useful values for backward propagation to compute derivatives.
Explanation: the correct choice is the last one. The cache stores values computed during forward propagation, such as Z[l] together with A[l−1], W[l], and b[l], so that the corresponding backward step can compute the derivatives dW[l], db[l], dA[l−1].
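For intuition, here is a minimal sketch (the helper names linear_forward/linear_backward are illustrative, not the exact course code) of how the cache carries forward-pass values into the backward pass:

import numpy as np

def linear_forward(A_prev, W, b):
    # Forward step for one layer; the cache saves the values that
    # the backward pass will need to compute the derivatives.
    Z = np.dot(W, A_prev) + b
    cache = (A_prev, W, b)
    return Z, cache

def linear_backward(dZ, cache):
    # Backward step for one layer, reusing the cached forward values.
    A_prev, W, b = cache
    m = A_prev.shape[1]
    dW = np.dot(dZ, A_prev.T) / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = np.dot(W.T, dZ)
    return dA_prev, dW, db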

2. Among the following, which ones are “hyperparameters”? (Check all that apply.)

activation values a[l]

size of the hidden layers n[l]

number of iterations

weight matrices W[l]

bias vectors b[l]

learning rate α

number of layers L in the neural network
Explanation:
Parameters are the values we want the model to learn during training: W[l], b[l].
Hyperparameters are the settings that control how those parameters are learned; changing a hyperparameter changes the W[l], b[l] that training eventually produces.
Examples:
learning rate α
number of iterations N
number of hidden layers L
number of units in each layer n[1], n[2], ⋯
choice of activation function g(z)
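As a small illustrative sketch (the names and numbers below are made up, not from the quiz): hyperparameters are fixed before training, while parameters are what training updates.

# Hyperparameters: chosen by hand before training; they control how learning runs.
hyperparameters = {
    "learning_rate": 0.01,            # alpha
    "num_iterations": 2500,           # N
    "layer_dims": [12288, 20, 7, 1],  # n[0], n[1], ..., n[L]
}

# Parameters: what the network learns, updated on every iteration, e.g.
# parameters = {"W1": ..., "b1": ..., "W2": ..., "b2": ...}
# Activations a[l] are neither: they are intermediate values recomputed per example.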
3. Which of the following statements is true?

The deeper layers of a neural network are typically computing more complex features of the input than the earlier layers.

The earlier layers of a neural network are typically computing more complex features of the input than the deeper layers.
Explanation: as the network gets deeper, later layers compose the simpler features computed by earlier layers into more complex features, making the model more powerful.
4. Vectorization allows you to compute forward propagation in an L-layer neural network without an explicit for-loop (or any other explicit iterative loop) over the layers l = 1, 2, …, L. True/False?

True

False
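For intuition, a minimal sketch (illustrative helper names; ReLU assumed for every layer) of L-layer forward propagation: the computation over the m training examples is vectorized inside each layer, but the layers themselves are still visited with an explicit loop over l = 1, …, L.

import numpy as np

def relu(Z):
    return np.maximum(0, Z)

def L_layer_forward(X, parameters, L):
    # X has shape (n[0], m): every layer processes all m examples at once,
    # but the layers themselves still require an explicit loop.
    A = X
    for l in range(1, L + 1):                  # explicit loop over layers
        W = parameters["W" + str(l)]
        b = parameters["b" + str(l)]
        Z = np.dot(W, A) + b                   # vectorized over the m examples
        A = relu(Z)                            # a real net would use sigmoid at layer L
    return A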
5. Assume we store the values for n[l] in an array called layers, as follows: layer_dims = [n_x, 4, 3, 2, 1]. So layer 1 has 4 hidden units, layer 2 has 3 hidden units, and so on. Which of the following for-loops will allow you to initialize the parameters for the model?

for(i in range(1, len(layer_dims)/2)):
  parameter['W' + str(i)] = np.random.randn(layers[i], layers[i-1]) * 0.01
  parameter['b' + str(i)] = np.random.randn(layers[i], 1) * 0.01

for(i in range(1, len(layer_dims)/2)):
  parameter['W' + str(i)] = np.random.randn(layers[i], layers[i-1]) * 0.01
  parameter['b' + str(i)] = np.random.randn(layers[i-1], 1) * 0.01

for(i in range(1, len(layer_dims))):
  parameter['W' + str(i)] = np.random.randn(layers[i-1], layers[i]) * 0.01
  parameter['b' + str(i)] = np.random.randn(layers[i], 1) * 0.01

for(i in range(1, len(layer_dims))):
  parameter['W' + str(i)] = np.random.randn(layers[i], layers[i-1]) * 0.01
  parameter['b' + str(i)] = np.random.randn(layers[i], 1) * 0.01

Explanation: the fourth loop is correct. For layer l of the network, with a single training example, the dimensions are:
W[l]: (n[l], n[l−1])
b[l]: (n[l], 1)
dW[l]: (n[l], n[l−1])
db[l]: (n[l], 1)
Z[l]: (n[l], 1)
A[l] has the same shape as Z[l]: (n[l], 1)
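A runnable version of the correct loop, as a minimal sketch with an illustrative n_x = 5; b is zero-initialized here, which is the usual practice, whereas the quiz option uses small random values for b as well:

import numpy as np

layer_dims = [5, 4, 3, 2, 1]   # [n_x, n[1], n[2], n[3], n[4]]

parameters = {}
for i in range(1, len(layer_dims)):
    # W[i]: (n[i], n[i-1]);  b[i]: (n[i], 1)
    parameters['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i - 1]) * 0.01
    parameters['b' + str(i)] = np.zeros((layer_dims[i], 1))

print(parameters['W1'].shape)  # (4, 5)
print(parameters['b3'].shape)  # (2, 1)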
6. Consider the following neural network.
[Figure: neural network diagram]
How many layers does this network have?

The number of layers L is 4. The number of hidden layers is 3.

The number of layers L is 3. The number of hidden layers is 3.

The number of layers L is 4. The number of hidden layers is 4.

The number of layers L is 5. The number of hidden layers is 4.
Explanation: by convention the input layer is not counted, so the number of layers L is 4 and the number of hidden layers is 3.
7. During forward propagation, in the forward function for a layer l you need to know what the activation function in that layer is (Sigmoid, tanh, ReLU, etc.). During backpropagation, the corresponding backward function also needs to know what the activation function for layer l is, since the gradient depends on it. True/False?

True

False
Explanation: the activation function g is needed in the forward pass to compute A[l] = g(Z[l]), and its derivative is needed in the backward pass to compute dZ[l].
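A minimal sketch (illustrative helper, not the exact course code) of why the backward function must know the activation: dZ[l] = dA[l] * g'(Z[l]), and g' is different for each activation.

import numpy as np

def activation_backward(dA, Z, activation):
    # dZ = dA * g'(Z); the derivative g' depends on which activation layer l used.
    if activation == "relu":
        dZ = dA * (Z > 0)                    # ReLU'(z) = 1 if z > 0 else 0
    elif activation == "sigmoid":
        s = 1 / (1 + np.exp(-Z))
        dZ = dA * s * (1 - s)                # sigmoid'(z) = s(z) * (1 - s(z))
    elif activation == "tanh":
        dZ = dA * (1 - np.tanh(Z) ** 2)      # tanh'(z) = 1 - tanh(z)^2
    else:
        raise ValueError("unknown activation: " + activation)
    return dZ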
8. There are certain functions with the following properties:

(i) To compute the function using a shallow network circuit, you will need a large network (where we measure size by the number of logic gates in the network), but (ii) To compute it using a deep network circuit, you need only an exponentially smaller network. True/False?

True

False
Explanation: see the circuit-theory example in the lecture video: computing the XOR/parity of n inputs takes O(log n) layers with a deep network, but a single-hidden-layer network needs exponentially many units.

9. Consider the following 2-hidden-layer neural network:

[Figure: 2-hidden-layer network diagram]
Which of the following statements are True? (Check all that apply).

W[1] will have shape (4, 4)

b[1] will have shape (4, 1)

W[1] will have shape (3, 4)

b[1] will have shape (3, 1)

W[2] will have shape (3, 4)

b[2] will have shape (1, 1)

W[2] will have shape (3, 1)

b[2] will have shape (3, 1)

W[3] will have shape (3, 1)

b[3] will have shape (1, 1)

W[3] will have shape (1, 3)

b[3] will have shape (3, 1)
Explanation: apply the dimension rules from question 5: W[l] has shape (n[l], n[l−1]) and b[l] has shape (n[l], 1), with the layer sizes read off the figure.
10. Whereas the previous question used a specific network, in the general case what is the dimension of W[l], the weight matrix associated with layer l?

W[l] has shape (n[l],n[l+1])

W[l] has shape (n[l+1],n[l])

W[l] has shape (n[l-1],n[l])

W[l] has shape (n[l],n[l-1])
Explanation: same rule as in question 5: W[l] has shape (n[l], n[l−1]), since it maps the n[l−1] activations of layer l−1 to the n[l] units of layer l.
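A quick sanity check of the rule, as a minimal sketch with made-up layer sizes:

import numpy as np

layer_dims = [3, 5, 4, 1]      # n[0], n[1], n[2], n[3] (illustrative)

for l in range(1, len(layer_dims)):
    W = np.random.randn(layer_dims[l], layer_dims[l - 1])   # (n[l], n[l-1])
    b = np.zeros((layer_dims[l], 1))                         # (n[l], 1)
    A_prev = np.random.randn(layer_dims[l - 1], 1)           # one example from layer l-1
    Z = np.dot(W, A_prev) + b
    assert Z.shape == (layer_dims[l], 1)                     # dimensions line up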
