Learning Deconvolution Network for Semantic Segmentation
The paper first reviews FCN-based semantic segmentation, as in the following papers:
- Fully Convolutional Networks for Semantic Segmentation
- Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
FCN is explained as follows:
Fully connected layers in the standard CNNs are interpreted as convolutions with large receptive fields, and segmentation is achieved using coarse class score maps obtained by feedforwarding an input image.
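To make the "fully connected layer as convolution" reading concrete, here is a minimal pure-Python sketch. The function names `fc_forward` and `conv_forward` are hypothetical, and it assumes a single input channel, stride 1, and no padding. It is an illustration of the idea, not the FCN implementation.

```python
def fc_forward(patch, weights, bias):
    """Fully connected layer on a flattened k x k patch: one score per class."""
    flat = [v for row in patch for v in row]
    return [sum(w * v for w, v in zip(w_row, flat)) + b
            for w_row, b in zip(weights, bias)]


def conv_forward(image, weights, bias, k):
    """Reuse the same weights as k x k kernels slid over the image.

    Assumes stride 1 and no padding. Returns one score map per class.
    """
    h, w = len(image), len(image[0])
    maps = []
    for w_row, b in zip(weights, bias):
        score_map = []
        for i in range(h - k + 1):
            row = []
            for j in range(w - k + 1):
                acc = b
                for di in range(k):
                    for dj in range(k):
                        acc += w_row[di * k + dj] * image[i + di][j + dj]
                row.append(acc)
            score_map.append(row)
        maps.append(score_map)
    return maps


# Two classes, each with a weight vector over a flattened 2 x 2 patch.
weights = [[1, 0, -1, 2], [0, 1, 1, 0]]
bias = [1, -1]

# On an input the size of the kernel, both layers give the same scores.
patch = [[1, 2], [3, 4]]
fc_scores = fc_forward(patch, weights, bias)
conv_scores = conv_forward(patch, weights, bias, 2)

# On a larger input, the convolution yields a coarse score map per class.
image = [[1, 2, 0], [3, 4, 1], [0, 1, 2]]
score_maps = conv_forward(image, weights, bias, 2)
```

On the 2 x 2 patch the two layers agree, while on the 3 x 3 image the convolutional form produces a 2 x 2 score map per class. That spatial map is what makes dense prediction possible.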
An interesting idea in this work is that a simple interpolation filter is employed for deconvolution and only the CNN part of the network is fine-tuned to learn deconvolution indirectly.
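The fixed interpolation filter can be sketched as a transposed convolution whose kernel is set by hand instead of learned. The sketch below assumes the filter is bilinear, which the quoted passage does not state explicitly. The helper names `bilinear_kernel` and `transposed_conv2d` are hypothetical, and the output is left uncropped for simplicity.

```python
def bilinear_kernel(factor):
    """2-D bilinear interpolation kernel for upsampling by `factor`."""
    size = 2 * factor - factor % 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    one_d = [1 - abs(i - center) / factor for i in range(size)]
    # The 2-D kernel is the outer product of the 1-D triangle filter.
    return [[a * b for b in one_d] for a in one_d]


def transposed_conv2d(score_map, kernel, stride):
    """Scatter each input value, weighted by the kernel, into the output."""
    h, w = len(score_map), len(score_map[0])
    k = len(kernel)
    out_h, out_w = (h - 1) * stride + k, (w - 1) * stride + k
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(h):
        for j in range(w):
            for di in range(k):
                for dj in range(k):
                    out[i * stride + di][j * stride + dj] += (
                        score_map[i][j] * kernel[di][dj]
                    )
    return out


kernel = bilinear_kernel(2)
coarse = [[1.0, 1.0], [1.0, 1.0]]
upsampled = transposed_conv2d(coarse, kernel, stride=2)
```

Because the kernel is fixed, nothing in this step is trained. Only the convolutional layers that produce the coarse score map would receive gradient updates, which is the sense in which deconvolution is learned indirectly.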
FCN has the following problems:
As shown in the figure:
First, the network can handle only a single scale of semantics within an image due to the fixed-size receptive field.
Second, the detailed structures of an object are often lost or smoothed because the label map that is input to the deconvolutional layer is too coarse and the deconvolution procedure is overly simple.
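To see why the receptive field is fixed, it can be computed directly from the layer configuration using the standard recurrence over kernel sizes and strides. The sketch below is a rough illustration: `receptive_field` is a hypothetical helper, and the layer stack is made up for the example rather than taken from the paper.

```python
def receptive_field(layers):
    """Receptive field of one output unit, in input pixels.

    `layers` is a list of (kernel_size, stride) pairs, input side first.
    """
    field, jump = 1, 1  # jump = spacing of adjacent units in input pixels
    for kernel_size, stride in layers:
        field += (kernel_size - 1) * jump
        jump *= stride
    return field


# Hypothetical stack: two 3x3 convs, a 2x2 stride-2 pool, one 3x3 conv.
toy_stack = [(3, 1), (3, 1), (2, 2), (3, 1)]
toy_field = receptive_field(toy_stack)
```

The result depends only on the architecture, not on the image, so objects much larger or much smaller than this window are seen at a mismatched scale.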
Introducing the deconvolution network:
Deconvolution network is introduced in [25] to reconstruct input images. As the reconstruction of an input
image is non-trivial due to max pooling layers, it prop