Neural Networks and Deep Learning Week 4: Key Concepts on Deep Neural Networks

Reading other people's answers violates the Coursera Honor Code; understanding a question and getting it right yourself are two different things.

What is the "cache" used for in our implementation of forward propagation and backward propagation?

  • We use it to pass variables computed during forward propagation to the corresponding backward propagation step. It contains useful values for backward propagation to compute derivatives.
  • It is used to cache the intermediate values of the cost function during training.
  • We use it to pass variables computed during backward propagation to the corresponding forward propagation step. It contains useful values for forward propagation to compute activations.
  • It is used to keep track of the hyperparameters that we are searching over, to speed up computation.

Why the cache matters: values saved during forward propagation are used directly in the backward pass to compute the derivatives, which then update the parameters via W = W - learning_rate * dW.
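A minimal sketch of this idea (the names linear_forward / linear_backward mirror the style of the course programming assignment, but the exact code here is an assumption, not the assignment's implementation):

import numpy as np

def linear_forward(A_prev, W, b):
    Z = np.dot(W, A_prev) + b
    cache = (A_prev, W, b)          # values saved during forward propagation
    return Z, cache

def linear_backward(dZ, cache):
    A_prev, W, b = cache            # reuse the cached forward-pass values
    m = A_prev.shape[1]
    dW = np.dot(dZ, A_prev.T) / m   # these gradients feed W = W - learning_rate * dW
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = np.dot(W.T, dZ)
    return dA_prev, dW, db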

Among the following, which ones are "hyperparameters"? (Check all that apply.)

  • weight matrices W^{[l]}
  • size of the hidden layers n^{[l]}
  • number of layers L in the neural network
  • activation values a^{[l]}
  • number of iterations
  • learning rate α
  • bias vectors b^{[l]}

Hyperparameters are not computed by training (W, b and the activations are computed); they describe the network and the training procedure, such as its depth, the sizes of its layers, the number of iterations, and the learning rate.
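As a rough illustration (the dictionary below is just a sketch with made-up values, not the course's exact code), hyperparameters are chosen before training, while W and b come out of gradient descent:

# Hyperparameters: set by hand (or by search) before training starts
hyperparameters = {
    "layer_dims": [12288, 20, 7, 5, 1],   # sizes of the layers, which also fixes L
    "learning_rate": 0.0075,              # alpha
    "num_iterations": 2500,
}
# W^{[l]}, b^{[l]} and the activations a^{[l]} are computed during training,
# so they are parameters (or intermediate values), not hyperparameters.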

Which of the following statements is true?

  • The deeper layers of a neural network are typically computing more complex features of the input than the earlier layers.
  • The earlier layers of a neural network are typically computing more complex features of the input than the deeper layers.

It helps to remember the circuit example: deep = more layers = far fewer gates = saves "space"; shallow = fewer layers = exponentially more gates. The correct statement is the first one: deeper layers build on the outputs of the earlier layers, so they compute more complex features of the input.

 

Vectorization allows you to compute forward propagation in an L-layer neural network without an explicit for-loop (or any other explicit iterative loop) over the layers l = 1, 2, …, L. True/False?

False

False. Vectorization removes the loop over the m training examples (and over the units within a layer), but each layer's activations depend on the previous layer's output (with A^{[0]} = X), so you still need an explicit for-loop over the layers l = 1, …, L; see the sketch below.
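A sketch of L-layer forward propagation (function and variable names here are illustrative): each np.dot handles all m examples at once, but the loop over the layers remains, because A^{[l]} depends on A^{[l-1]}:

import numpy as np

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

def L_model_forward(X, parameters):
    L = len(parameters) // 2          # parameters holds W1..WL and b1..bL
    A = X                             # A^{[0]} = X
    for l in range(1, L + 1):         # this explicit loop over layers cannot be removed
        Z = np.dot(parameters['W' + str(l)], A) + parameters['b' + str(l)]
        A = sigmoid(Z)                # sigmoid everywhere just to keep the sketch short
    return A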

Assume we store the values for n^{[l]} in an array called layer_dims, as follows: layer_dims = [n_x, 4, 3, 2, 1]. So layer 1 has 4 hidden units, layer 2 has 3 hidden units, and so on. Which of the following for-loops will allow you to initialize the parameters for the model?

import numpy as np

parameter = {}                          # layer_dims = [n_x, 4, 3, 2, 1] from the question
for i in range(1, len(layer_dims)):
    parameter['W' + str(i)] = np.random.randn(layer_dims[i], layer_dims[i-1]) * 0.01  # (n^[i], n^[i-1])
    parameter['b' + str(i)] = np.random.randn(layer_dims[i], 1) * 0.01                # (n^[i], 1)

Random initialization matters because it breaks symmetry (this is needed because the network is nonlinear; a purely linear model, whose output depends only on the input, would not need it), so all the weight matrices must be initialized randomly. For the shapes, draw the picture if unsure: Z = WX + b, where X is built from column vectors x; each unit of W is a row vector w_i (computing w_i · x), and stacking the layer's units gives a matrix with number of rows = n^{[l]} and number of columns = n^{[l-1]}.
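A quick check of the resulting shapes, assuming an illustrative n_x = 5:

layer_dims = [5, 4, 3, 2, 1]   # n_x = 5 chosen only for this example
# After running the loop above:
# parameter['W1'].shape == (4, 5), parameter['b1'].shape == (4, 1)
# parameter['W2'].shape == (3, 4), parameter['b2'].shape == (3, 1)
# parameter['W3'].shape == (2, 3), parameter['b3'].shape == (2, 1)
# parameter['W4'].shape == (1, 2), parameter['b4'].shape == (1, 1)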

Consider the following neural network. How many layers does this network have?

  • The number of layers L is 4. The number of hidden layers is 3.
  • The number of layers L is 3. The number of hidden layers is 3.
  • The number of layers L is 4. The number of hidden layers is 4.
  • The number of layers L is 5. The number of hidden layers is 4.

The input layer x is not counted as part of the network; in code and diagrams it is usually written as A0. y is the output we are trying to predict, so the output layer is counted. That gives 4 layers in total: the y layer is the output layer, x is the input layer, and the layers in between are the hidden layers.

During forward propagation, in the forward function for a layer l you need to know what is the activation function in a layer (Sigmoid, tanh, ReLU, etc.). During backpropagation, the corresponding backward function also needs to know what is the activation function for layer l, since the gradient depends on it. True/False?

True

If you don't know which nonlinear activation was used, you can't know how to differentiate through it. Don't assume it never needs computing just because we have been using sigmoid all along; sigmoid is used because its derivative is easy to write down, which makes it convenient for beginners while writing the code and understanding what the network does.
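A minimal sketch of why the backward step must know the activation (assuming the forward cache stored Z; the function name is illustrative, not the assignment's exact API):

import numpy as np

def activation_backward(dA, Z, activation):
    # dZ = dA * g'(Z), and the derivative g' depends on which activation g was used
    if activation == "sigmoid":
        s = 1 / (1 + np.exp(-Z))
        return dA * s * (1 - s)
    elif activation == "tanh":
        return dA * (1 - np.tanh(Z) ** 2)
    elif activation == "relu":
        return dA * (Z > 0)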

There are certain functions with the following properties:

(i) To compute the function using a shallow network circuit, you will need a large network (where we measure size by the number of logic gates in the network), but (ii) To compute it using a deep network circuit, you need only an exponentially smaller network. True/False?

True

Remember the circuit diagram: it is a trade-off between space and time (network size versus depth).

Question 9

Consider the following 2 hidden layer neural network:

Which of the following statements are True? (Check all that apply).

  • W^{[1]} will have shape (4, 4)
  • b^{[1]} will have shape (4, 1)
  • W^{[1]} will have shape (3, 4)
  • b^{[1]} will have shape (3, 1)
  • W^{[2]} will have shape (3, 4)
  • b^{[2]} will have shape (1, 1)
  • W^{[2]} will have shape (3, 1)
  • b^{[2]} will have shape (3, 1)
  • W^{[3]} will have shape (3, 1)
  • b^{[3]} will have shape (1, 1)
  • W^{[3]} will have shape (1, 3)
  • b^{[3]} will have shape (3, 1)

This is material from two or three weeks back; check the superscripts carefully and keep the relation z^{[l]} = W^{[l]} a^{[l-1]} + b^{[l]} in mind when matching the shapes.

Whereas the previous question used a specific network, in the general case what is the dimension of W^{[l]}, the weight matrix associated with layer l?

  • W^{[l]} has shape (n^{[l]}, n^{[l+1]})
  • W^{[l]} has shape (n^{[l]}, n^{[l-1]})
  • W^{[l]} has shape (n^{[l+1]}, n^{[l]})
  • W^{[l]} has shape (n^{[l-1]}, n^{[l]})

Same as the previous question. Square brackets [l] index the layer, parentheses (i) index the training example, and the subscript indexes the unit (parameter) within a layer.
