Summary of notation
If layer l is a convolution layer:
f[l] = filter size (a filter is also called a kernel)
p[l] = padding
s[l] = stride
nc[l] = number of filters (also the number of channels in the output image)
Each filter is: f[l] × f[l] × nc[l-1] (height × width × number of input channels)
Activation: a[l] -> nH[l] × nW[l] × nc[l] (the activation is the value obtained after applying the activation function)
Weights: f[l] × f[l] × nc[l-1] × nc[l]
Bias: usually stored as a vector of shape 1 × 1 × 1 × nc[l]
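
The shapes listed above can be checked with a small sketch. This is an illustration only, not part of the original notes: the function name conv_layer_shapes and its argument names are hypothetical, and the output size is assumed to follow the standard convolution formula nH[l] = floor((nH[l-1] + 2p[l] - f[l]) / s[l]) + 1 (and likewise for nW[l]), which the notes do not state explicitly.

```python
def conv_layer_shapes(n_h_prev, n_w_prev, n_c_prev, f, p, s, n_c):
    """Return the tensor shapes implied by the notation for one conv layer.

    n_h_prev, n_w_prev, n_c_prev: nH[l-1], nW[l-1], nc[l-1] (input size)
    f, p, s, n_c: f[l], p[l], s[l], nc[l]
    All names here are hypothetical and chosen for this illustration.
    """
    # Assumed standard output-size formula: floor((n + 2p - f) / s) + 1
    n_h = (n_h_prev + 2 * p - f) // s + 1
    n_w = (n_w_prev + 2 * p - f) // s + 1
    return {
        "filter": (f, f, n_c_prev),          # each filter
        "activation": (n_h, n_w, n_c),       # a[l]
        "weights": (f, f, n_c_prev, n_c),    # all filters stacked
        "bias": (1, 1, 1, n_c),              # one bias per filter
    }


# Example: a 39×39×3 input, ten 3×3 filters, no padding, stride 1
shapes = conv_layer_shapes(n_h_prev=39, n_w_prev=39, n_c_prev=3,
                           f=3, p=0, s=1, n_c=10)
for name, shape in shapes.items():
    print(name, shape)
```

With these example values the activation comes out as 37 × 37 × 10, since (39 + 0 - 3) / 1 + 1 = 37. The number of output channels depends only on the number of filters nc[l], not on the input size.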