upsample
1. src/caffe/proto/caffe.proto
Add the upsample parameter description:
message UpsampleParameter {
  optional int32 scale = 1 [default = 1];
}
message LayerParameter {
  ...
  optional UpsampleParameter upsample_param = 149;  // any field number that does not conflict with an existing one works
}
2. upsample reference implementation
https://github.com/SeanQ88/caffe_upsample
Implementation analysis
Assume a 2x2 input and scale = 2; the output is then 4x4.
1. Original 2x2 matrix
1 | 2 |
3 | 4 |
2. After upsample with scale = 2
1 | 1 | 2 | 2 |
1 | 1 | 2 | 2 |
3 | 3 | 4 | 4 |
3 | 3 | 4 | 4 |
3. Core algorithm
Each destination (output) coordinate is mapped back to its source (input) coordinate.
- input NCHW = (1,1,2,2)
- output NCHW = (1,1,4,4)
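int N = 1, C = 1, H = 4, W = 4;  // output (top) shape
for (int n = 0; n < N; n++) {
  for (int c = 0; c < C; c++) {
    for (int h = 0; h < H; h++) {
      for (int w = 0; w < W; w++) {
        // Map the destination coordinate back to the source coordinate.
        int nw = w / scale_;
        int nh = h / scale_;
        // Flat NCHW index: n * C * H * W + c * H * W + h * W + w,
        // which simplifies to the expression below.
        int output_idx = (((n * C + c) * H) + h) * W + w;
        // input_idx uses the same formula, with H and W scaled back down
        // to the source height and width.
        int input_idx = (((n * C + c) * (H / scale_)) + nh) * (W / scale_) + nw;
        // Copy step; top_data and bottom_data are assumed names that
        // follow the usual Caffe convention.
        top_data[output_idx] = bottom_data[input_idx];
      }
    }
  }
}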
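The forward mapping above can be tried outside Caffe. The following is a minimal standalone sketch, not the code from the referenced repository: the function name upsample_forward and the use of std::vector<float> as a flat NCHW buffer are assumptions made for illustration. Note that here N, C, H, W describe the input shape, and the output shape is derived from scale.

```cpp
#include <cassert>
#include <vector>

// Nearest-neighbor upsampling of a flat NCHW buffer (hypothetical helper).
// N, C, H, W are the INPUT dimensions; the output is (N, C, H*scale, W*scale).
std::vector<float> upsample_forward(const std::vector<float>& bottom,
                                    int N, int C, int H, int W, int scale) {
  assert(scale >= 1);
  assert(static_cast<int>(bottom.size()) == N * C * H * W);
  const int out_h = H * scale;
  const int out_w = W * scale;
  std::vector<float> top(N * C * out_h * out_w);
  for (int n = 0; n < N; ++n) {
    for (int c = 0; c < C; ++c) {
      for (int h = 0; h < out_h; ++h) {
        for (int w = 0; w < out_w; ++w) {
          // Map the output coordinate back to the input coordinate.
          const int in_h = h / scale;
          const int in_w = w / scale;
          const int top_idx = ((n * C + c) * out_h + h) * out_w + w;
          const int bottom_idx = ((n * C + c) * H + in_h) * W + in_w;
          top[top_idx] = bottom[bottom_idx];
        }
      }
    }
  }
  return top;
}
```

Calling upsample_forward({1, 2, 3, 4}, 1, 1, 2, 2, 2) reproduces the 4x4 matrix from step 2.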
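4. Backward pass
bottom_diff has shape NCHW = (1,1,2,2) and top_diff has shape (1,1,4,4).
scale is a constant and takes no part in the gradient, so each copied output can be treated as f = x, with derivative f' = 1.
Each input element is copied to scale x scale output positions, so its gradient is the sum of the top_diff entries at those positions. The loop below accumulates that sum and passes it on to the next layer. The layer copies values rather than multiplying them, so it should not be read as f = scale * x, and top_diff does not need to be multiplied by scale. Corrections are welcome if I have misunderstood anything.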
// Here N, C, H, W are the bottom (input) shape, e.g. (1,1,2,2).
// bottom_diff must be zero-initialized before accumulating.
for (int n = 0; n < N; n++) {
  for (int c = 0; c < C; c++) {
    for (int h = 0; h < H; h++) {
      for (int w = 0; w < W; w++) {
        // Index into bottom_diff.
        int out_idx = (((n * C + c) * H) + h) * W + w;
        for (int i = 0; i < scale_; i++) {
          for (int j = 0; j < scale_; j++) {
            int nw = w * scale_ + i;
            int nh = h * scale_ + j;
            // Index into top_diff.
            int in_idx = (((n * C + c) * (H * scale_)) + nh) * (W * scale_) + nw;
            bottom_diff[out_idx] += top_diff[in_idx];
          }
        }
      }
    }
  }
}
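The accumulation above can also be checked in isolation. This is a minimal standalone sketch under the same assumptions as the loop (nearest-neighbor copy, flat NCHW layout); the function name upsample_backward and the std::vector<float> buffers are hypothetical and not taken from the referenced repository.

```cpp
#include <cassert>
#include <vector>

// Backward pass for nearest-neighbor upsampling (hypothetical helper).
// N, C, H, W are the bottom (input) dimensions; top_diff has shape
// (N, C, H*scale, W*scale). Each bottom gradient is the sum of the
// scale x scale block of top gradients that were copied from it.
std::vector<float> upsample_backward(const std::vector<float>& top_diff,
                                     int N, int C, int H, int W, int scale) {
  assert(scale >= 1);
  assert(static_cast<int>(top_diff.size()) == N * C * H * scale * W * scale);
  const int out_h = H * scale;
  const int out_w = W * scale;
  std::vector<float> bottom_diff(N * C * H * W, 0.0f);  // starts at zero
  for (int n = 0; n < N; ++n) {
    for (int c = 0; c < C; ++c) {
      for (int h = 0; h < H; ++h) {
        for (int w = 0; w < W; ++w) {
          const int bottom_idx = ((n * C + c) * H + h) * W + w;
          for (int dh = 0; dh < scale; ++dh) {
            for (int dw = 0; dw < scale; ++dw) {
              const int top_h = h * scale + dh;
              const int top_w = w * scale + dw;
              const int top_idx = ((n * C + c) * out_h + top_h) * out_w + top_w;
              bottom_diff[bottom_idx] += top_diff[top_idx];
            }
          }
        }
      }
    }
  }
  return bottom_diff;
}
```

With a 4x4 top_diff holding the values 0 through 15 in row-major order and scale = 2, the result is {10, 18, 42, 50}: each entry is the sum of one 2x2 block.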