1.什么是残差?
可见,残差就是说后面的层是由前面的层加上一组值得到的,这组值就是所谓的残差。
2. 残差存在的意义?
像vgg那样直接堆叠层进行训练,层越深错误率越高。传说是由于深层网络产生白噪声不断回传导致。
加上resnet后,深层网络可以得到训练,同时也说明了越深的网络效果越好,如下图:
3. 深层网络的基本单元?
对应的caffe网络协议部分:
layer {
bottom: "res2a_branch1"
bottom: "res2a_branch2c"
top: "res2a"
name: "res2a"
type: "Eltwise"
eltwise_param {
operation: SUM
}
}
layer {
bottom: "res2a"
top: "res2a"
name: "res2a_relu"
type: "ReLU"
}
layer {
bottom: "res2a"
top: "res2b_branch2a"
name: "res2b_branch2a"
type: "Convolution"
convolution_param {
num_output: 64
kernel_size: 1
pad: 0
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res2b_branch2a"
top: "res2b_branch2a"
name: "bn2b_branch2a"
type: "BatchNorm"
batch_norm_param {
use_global_stats: false
}
}
layer {
bottom: "res2b_branch2a"
top: "res2b_branch2a"
name: "scale2b_branch2a"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res2b_branch2a"
top: "res2b_branch2a"
name: "res2b_branch2a_relu"
type: "ReLU"
}
layer {
bottom: "res2b_branch2a"
top: "res2b_branch2b"
name: "res2b_branch2b"
type: "Convolution"
convolution_param {
num_output: 64
kernel_size: 3
pad: 1
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res2b_branch2b"
top: "res2b_branch2b"
name: "bn2b_branch2b"
type: "BatchNorm"
batch_norm_param {
use_global_stats: false
}
}
layer {
bottom: "res2b_branch2b"
top: "res2b_branch2b"
name: "scale2b_branch2b"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res2b_branch2b"
top: "res2b_branch2b"
name: "res2b_branch2b_relu"
type: "ReLU"
}
layer {
bottom: "res2b_branch2b"
top: "res2b_branch2c"
name: "res2b_branch2c"
type: "Convolution"
convolution_param {
num_output: 256
kernel_size: 1
pad: 0
stride: 1
weight_filler {
type: "msra"
}
bias_term: false
}
}
layer {
bottom: "res2b_branch2c"
top: "res2b_branch2c"
name: "bn2b_branch2c"
type: "BatchNorm"
batch_norm_param {
use_global_stats: false
}
}
layer {
bottom: "res2b_branch2c"
top: "res2b_branch2c"
name: "scale2b_branch2c"
type: "Scale"
scale_param {
bias_term: true
}
}
layer {
bottom: "res2a"
bottom: "res2b_branch2c"
top: "res2b"
name: "res2b"
type: "Eltwise"
eltwise_param {
operation: SUM
}
}
layer {
bottom: "res2b"
top: "res2b"
name: "res2b_relu"
type: "ReLU"
}
4. 其中对于caffe特殊的地方是Eltwise层,那么什么是Eltwise层?
Eltwise
- element-wise operations such as product or sum between two blobs.
完成两个层之间的对应元素相加或相乘。
通过proto传入必要参数: