1. Which of the following are true? (Check all that apply.)
X is a matrix in which each column is one training example.
a^[2](12) denotes the activation vector of the 12th layer on the 2nd training example.
X is a matrix in which each row is one training example.
a^[2] denotes the activation vector of the 2nd layer.
a^[2](12) denotes the activation vector of the 2nd layer for the 12th training example.
a^[2]_4 is the activation output of the 2nd layer for the 4th training example.
a^[2]_4 is the activation output by the 4th neuron of the 2nd layer.
Explanation:
X is an n×m matrix, where n is the number of features and m is the number of training examples; each column is one example.
a^[n](k) denotes the activation of layer n for the k-th training example.
A subscript indicates which neuron in the layer.
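The indexing conventions above can be checked with a small NumPy sketch. All shapes and values here are hypothetical, chosen only to illustrate which axis corresponds to features, examples, and neurons:

```python
import numpy as np

np.random.seed(0)

# Assumed toy sizes: n_x features, m training examples, n_h neurons in layer 2.
n_x, m, n_h = 3, 12, 4

X = np.random.randn(n_x, m)   # each COLUMN of X is one training example

# A stand-in for a^[2]: a (neurons x examples) activation matrix,
# as a forward pass through a hypothetical weight matrix would produce.
W = np.random.randn(n_h, n_x)
A2 = np.tanh(W @ X)

a2_ex12  = A2[:, 11]  # a^[2](12): layer-2 activation vector for the 12th example
a2_unit4 = A2[3, :]   # a^[2]_4: output of the 4th neuron of layer 2, over all examples

print(X.shape)        # (3, 12)
print(A2.shape)       # (4, 12)
print(a2_ex12.shape)  # (4,)  -- one value per neuron
print(a2_unit4.shape) # (12,) -- one value per example
```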
2. The tanh activation usually works better than sigmoid activation function for hidden units because the mean of its output is closer to zero, and so it centers the data better for the next layer. True/False?
True
False
Explanation:
Comparing the sigmoid and tanh functions:
Hidden layers: tanh usually performs better than sigmoid because its range is [−1, +1], so its outputs are distributed around 0 with mean close to 0. This effectively normalizes (zero-centers) the data passed from the hidden layer to the next layer.
Output layer: for binary classification the labels take values in {0, 1}, so sigmoid is generally chosen there.
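A quick numerical sketch of the centering claim, using hypothetical zero-mean random pre-activations: tanh outputs average near 0, while sigmoid outputs average near 0.5.

```python
import numpy as np

np.random.seed(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical zero-mean pre-activations, as hidden units often receive.
z = np.random.randn(100_000)

tanh_mean = np.tanh(z).mean()
sigm_mean = sigmoid(z).mean()

print("tanh output mean:   ", tanh_mean)  # close to 0 -> data stays centered
print("sigmoid output mean:", sigm_mean)  # close to 0.5 -> shifts data upward
```

This is why tanh is preferred in hidden layers, while sigmoid remains the natural choice for a binary-classification output.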
3. Whi