caffe 关于Deconvolution的初始化注意事项

对于fcn,经常要使用到Deconvolution进行上采样。对于caffe使用者,使用Deconvolution上采样,其参数往往直接给定,不需要通过学习获得。

给定参数的方式很有意思,可以通过两种方式实现,但是这两种方式并非完全等价,各有各的价值。

第一种方式: 通过net_surgery给定,

这种方式最开始出现在FCN中。https://github.com/shelhamer/fcn.berkeleyvision.org/blob/master/voc-fcn32s/solve.py
代码如下:

import caffe
import surgery, score

import numpy as np
import os
import sys

try:
    import setproctitle
    setproctitle.setproctitle(os.path.basename(os.getcwd()))
except:
    pass

weights = '../ilsvrc-nets/vgg16-fcn.caffemodel'

# init
caffe.set_device(int(sys.argv[1]))
caffe.set_mode_gpu()

solver = caffe.SGDSolver('solver.prototxt')
solver.net.copy_from(weights)

# surgeries  (这里就是对于反卷积层的参数进行初始化)
interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
surgery.interp(solver.net, interp_layers)

# scoring
val = np.loadtxt('../data/segvalid11.txt', dtype=str)

for _ in range(25):
    solver.step(4000)
    score.seg_tests(solver, False, val, layer='score')

上采样的函数:

 #  make a bilinear interpolation kernel
    def upsample_filt(self,size):
        factor = (size + 1) // 2
        if size % 2 == 1:
            center = factor - 1
        else:
            center = factor - 0.5
        og = np.ogrid[:size, :size]
        return (1 - abs(og[0] - center) / factor) * \
               (1 - abs(og[1] - center) / factor)

    # set parameters s.t. deconvolutional layers compute bilinear interpolation
    # N.B. this is for deconvolution without groups
    def interp_surgery(self,net, layers):
        for l in layers:
            print l
            m, k, h, w = net.params[l][0].data.shape   #仅仅修改w,不需要修改bias,其为0
            print("deconv shape:\n")
            print m, k, h, w 
            if m != k and k != 1:
                print 'input + output channels need to be the same or |output| == 1'
                raise
            if h != w:
                print 'filters need to be square'
                raise
            filt = self.upsample_filt(h)
            print(filt)
            net.params[l][0].data[range(m), range(k), :, :] = filt

第二种方式:直接在Deconvolution中给定参数weight_filler,即:

代码如下:

layer {
  name: "fc8_upsample"
  type: "Deconvolution"
  bottom: "fc8"
  top: "fc8_upsample"
  param {
    lr_mult: 0
    decay_mult: 0
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  convolution_param {
    num_output: 1
    kernel_size: 16
    stride: 8
    pad: 3
    weight_filler {   # 这里相当于上面的直接赋值
      type: "bilinear"
    }
  }
}

weight_filler初始化成双线性就等价于直接按照上面的方式赋值。

看起来好像以上两种方法一样,但是实际上有不同。主要区别在对于num_output>1的情形。

比如对于一个输入是2个通道的map,希望对其进行上采样,自然我们希望分别对于map放大即可。如果使用Deconvolution,则shape大小为2,2,16,16(设其大小为16*16).不考虑bias项。

假设按照上面的方式初始化,则对于第一种方法,得到结果:
[0,0,:,:]:

[[ 0.00390625 0.01171875 0.01953125 0.02734375 0.03515625 0.04296875
0.05078125 0.05859375 0.05859375 0.05078125 0.04296875 0.03515625
0.02734375 0.01953125 0.01171875 0.00390625]
[ 0.01171875 0.03515625 0.05859375 0.08203125 0.10546875 0.12890625
0.15234375 0.17578125 0.17578125 0.15234375 0.12890625 0.10546875
0.08203125 0.05859375 0.03515625 0.01171875]
[ 0.01953125 0.05859375 0.09765625 0.13671875 0.17578125 0.21484375
0.25390625 0.29296875 0.29296875 0.25390625 0.21484375 0.17578125
0.13671875 0.09765625 0.05859375 0.01953125]
[ 0.02734375 0.08203125 0.13671875 0.19140625 0.24609375 0.30078125
0.35546875 0.41015625 0.41015625 0.35546875 0.30078125 0.24609375
0.19140625 0.13671875 0.08203125 0.02734375]
[ 0.03515625 0.10546875 0.17578125 0.24609375 0.31640625 0.38671875
0.45703125 0.52734375 0.52734375 0.45703125 0.38671875 0.31640625
0.24609375 0.17578125 0.10546875 0.03515625]
[ 0.04296875 0.12890625 0.21484375 0.30078125 0.38671875 0.47265625
0.55859375 0.64453125 0.64453125 0.55859375 0.47265625 0.38671875
0.30078125 0.21484375 0.12890625 0.04296875]
[ 0.05078125 0.15234375 0.25390625 0.35546875 0.45703125 0.55859375
0.66015625 0.76171875 0.76171875 0.66015625 0.55859375 0.45703125
0.35546875 0.25390625 0.15234375 0.05078125]
[ 0.05859375 0.17578125 0.29296875 0.41015625 0.52734375 0.64453125
0.76171875 0.87890625 0.87890625 0.76171875 0.64453125 0.52734375
0.41015625 0.29296875 0.17578125 0.05859375]
[ 0.05859375 0.17578125 0.29296875 0.41015625 0.52734375 0.64453125
0.76171875 0.87890625 0.87890625 0.76171875 0.64453125 0.52734375
0.41015625 0.29296875 0.17578125 0.05859375]
[ 0.05078125 0.15234375 0.25390625 0.35546875 0.45703125 0.55859375
0.66015625 0.76171875 0.76171875 0.66015625 0.55859375 0.45703125
0.35546875 0.25390625 0.15234375 0.05078125]
[ 0.04296875 0.12890625 0.21484375 0.30078125 0.38671875 0.47265625
0.55859375 0.64453125 0.64453125 0.55859375 0.47265625 0.38671875
0.30078125 0.21484375 0.12890625 0.04296875]
[ 0.03515625 0.10546875 0.17578125 0.24609375 0.31640625 0.38671875
0.45703125 0.52734375 0.52734375 0.45703125 0.38671875 0.31640625
0.24609375 0.17578125 0.10546875 0.03515625]
[ 0.02734375 0.08203125 0.13671875 0.19140625 0.24609375 0.30078125
0.35546875 0.41015625 0.41015625 0.35546875 0.30078125 0.24609375
0.19140625 0.13671875 0.08203125 0.02734375]
[ 0.01953125 0.05859375 0.09765625 0.13671875 0.17578125 0.21484375
0.25390625 0.29296875 0.29296875 0.25390625 0.21484375 0.17578125
0.13671875 0.09765625 0.05859375 0.01953125]
[ 0.01171875 0.03515625 0.05859375 0.08203125 0.10546875 0.12890625
0.15234375 0.17578125 0.17578125 0.15234375 0.12890625 0.10546875
0.08203125 0.05859375 0.03515625 0.01171875]
[ 0.00390625 0.01171875 0.01953125 0.02734375 0.03515625 0.04296875
0.05078125 0.05859375 0.05859375 0.05078125 0.04296875 0.03515625
0.02734375 0.01953125 0.01171875 0.00390625]]
[0,1,:,:]:

[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
[1,0,:,:]:

[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
[1,1,:,:]:

[[ 0.00390625 0.01171875 0.01953125 0.02734375 0.03515625 0.04296875
0.05078125 0.05859375 0.05859375 0.05078125 0.04296875 0.03515625
0.02734375 0.01953125 0.01171875 0.00390625]
[ 0.01171875 0.03515625 0.05859375 0.08203125 0.10546875 0.12890625
0.15234375 0.17578125 0.17578125 0.15234375 0.12890625 0.10546875
0.08203125 0.05859375 0.03515625 0.01171875]
[ 0.01953125 0.05859375 0.09765625 0.13671875 0.17578125 0.21484375
0.25390625 0.29296875 0.29296875 0.25390625 0.21484375 0.17578125
0.13671875 0.09765625 0.05859375 0.01953125]
[ 0.02734375 0.08203125 0.13671875 0.19140625 0.24609375 0.30078125
0.35546875 0.41015625 0.41015625 0.35546875 0.30078125 0.24609375
0.19140625 0.13671875 0.08203125 0.02734375]
[ 0.03515625 0.10546875 0.17578125 0.24609375 0.31640625 0.38671875
0.45703125 0.52734375 0.52734375 0.45703125 0.38671875 0.31640625
0.24609375 0.17578125 0.10546875 0.03515625]
[ 0.04296875 0.12890625 0.21484375 0.30078125 0.38671875 0.47265625
0.55859375 0.64453125 0.64453125 0.55859375 0.47265625 0.38671875
0.30078125 0.21484375 0.12890625 0.04296875]
[ 0.05078125 0.15234375 0.25390625 0.35546875 0.45703125 0.55859375
0.66015625 0.76171875 0.76171875 0.66015625 0.55859375 0.45703125
0.35546875 0.25390625 0.15234375 0.05078125]
[ 0.05859375 0.17578125 0.29296875 0.41015625 0.52734375 0.64453125
0.76171875 0.87890625 0.87890625 0.76171875 0.64453125 0.52734375
0.41015625 0.29296875 0.17578125 0.05859375]
[ 0.05859375 0.17578125 0.29296875 0.41015625 0.52734375 0.64453125
0.76171875 0.87890625 0.87890625 0.76171875 0.64453125 0.52734375
0.41015625 0.29296875 0.17578125 0.05859375]
[ 0.05078125 0.15234375 0.25390625 0.35546875 0.45703125 0.55859375
0.66015625 0.76171875 0.76171875 0.66015625 0.55859375 0.45703125
0.35546875 0.25390625 0.15234375 0.05078125]
[ 0.04296875 0.12890625 0.21484375 0.30078125 0.38671875 0.47265625
0.55859375 0.64453125 0.64453125 0.55859375 0.47265625 0.38671875
0.30078125 0.21484375 0.12890625 0.04296875]
[ 0.03515625 0.10546875 0.17578125 0.24609375 0.31640625 0.38671875
0.45703125 0.52734375 0.52734375 0.45703125 0.38671875 0.31640625
0.24609375 0.17578125 0.10546875 0.03515625]
[ 0.02734375 0.08203125 0.13671875 0.19140625 0.24609375 0.30078125
0.35546875 0.41015625 0.41015625 0.35546875 0.30078125 0.24609375
0.19140625 0.13671875 0.08203125 0.02734375]
[ 0.01953125 0.05859375 0.09765625 0.13671875 0.17578125 0.21484375
0.25390625 0.29296875 0.29296875 0.25390625 0.21484375 0.17578125
0.13671875 0.09765625 0.05859375 0.01953125]
[ 0.01171875 0.03515625 0.05859375 0.08203125 0.10546875 0.12890625
0.15234375 0.17578125 0.17578125 0.15234375 0.12890625 0.10546875
0.08203125 0.05859375 0.03515625 0.01171875]
[ 0.00390625 0.01171875 0.01953125 0.02734375 0.03515625 0.04296875
0.05078125 0.05859375 0.05859375 0.05078125 0.04296875 0.03515625
0.02734375 0.01953125 0.01171875 0.00390625]]
而第二种方式全部都是[0,0,:,:]这样的矩阵。

以上两种方法应该是第一种对的。因为Deconvolution 其实与卷积类似,按照第一种结果才能分别单独地对map上采样,而采用第二种则将会得到两个相同的map。(因为综合了两个输入map的信息)

因此结论: 对于多个输入输出的Deconvolution,采用方法1,对于单个输入的,方法1,2通用。

附上Deconvolution的官方编码:

这里写图片描述

说明:
以上的称述有点瑕疵,其实caffe已经解决了上述的问题,我之前没有好好留意。 关键就在group这个选项。
如果num_output>1,则填上group: c 再加上weight_filler: { type: “bilinear” },即可完成初始化。

阅读更多

扫码向博主提问

xiamentingtao

随便问吧,无论学术还是人生
  • 擅长领域:
  • cv
  • math
  • 考博
  • 算法
  • AI
去开通我的Chat快问
版权声明:本文为博主原创文章,转载请注明原地址。 https://blog.csdn.net/xiamentingtao/article/details/79397350
上一篇sigmoid函数的数值稳定性
下一篇安装oracle-java,并覆盖原先的OpenJDK
想对作者说点什么? 我来说一句

没有更多推荐了,返回首页

关闭
关闭
关闭