The Difference Between Deconvolution, Upsampling, and Unpooling (with Translation)

AC_Lee

Link to this article: https://blog.csdn.net/xiaoli_nu/article/details/79028528

The English version is available at http://blog.csdn.net/u012949263/article/details/54379996

Question:

Deconvolution networks use deconvolution layers to infer sparse feature maps and filters (this is the same as convolutional sparse coding).

In the context of Fully Convolutional Networks that perform pixel-wise segmentation, does deconvolution just mean upsampling (e.g. bilinear interpolation)? Is this different from the deconvolutional layers in Deconv Nets?

As far as I understand, unpooling just means using switches to place values at the recorded switch positions, with the rest of the values in between set to zero?

It would be great if someone could clarify the differences between these terms.

2 Answers:

Christian Baumgartner, 5 years of experience in using machine learning for medical image analysis
Written Nov 10
Upsampling refers to any technique that, well, upsamples your image to a higher resolution.

The easiest way is to use resampling and interpolation. This means taking an input image, rescaling it to the desired size, and then calculating the pixel value at each point using an interpolation method such as bilinear interpolation.
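To make the resampling step concrete, here is a minimal sketch of bilinear upsampling in plain Python. The function name `bilinear_upsample` and the "corners stay aligned" coordinate mapping are assumptions made for illustration; real libraries offer several coordinate conventions and may give slightly different values at the borders.

```python
def bilinear_upsample(image, out_h, out_w):
    """Resize a 2D grid (list of rows) to out_h x out_w with bilinear interpolation.

    Assumption: the corner pixels of input and output are aligned.
    """
    in_h, in_w = len(image), len(image[0])
    out = []
    for i in range(out_h):
        # Map the output row back to a (possibly fractional) input row.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = (1 - wx) * image[y0][x0] + wx * image[y0][x1]
            bottom = (1 - wx) * image[y1][x0] + wx * image[y1][x1]
            row.append((1 - wy) * top + wy * bottom)
        out.append(row)
    return out


small = [[0.0, 2.0],
         [4.0, 6.0]]
big = bilinear_upsample(small, 3, 3)
print(big)  # [[0.0, 1.0, 2.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]
```

Nothing here is learned: the output depends only on the input pixels and the fixed interpolation rule.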

Unpooling is commonly used in the context of convolutional neural networks to denote reverse max pooling. Citing from this paper: "Unpooling: In the convnet, the max pooling operation is non-invertible, however we can obtain an approximate inverse by recording the locations of the maxima within each pooling region in a set of switch variables. In the deconvnet, the unpooling operation uses these switches to place the reconstructions from the layer above into appropriate locations, preserving the structure of the stimulus."
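The switch mechanism described in the quote can be sketched in a few lines of plain Python. The helper names `max_pool_with_switches` and `unpool` are hypothetical, and the sketch assumes non-overlapping windows whose size divides the input height and width.

```python
def max_pool_with_switches(x, size=2):
    """Non-overlapping max pooling on a 2D list; also records where each maximum was."""
    h, w = len(x), len(x[0])
    pooled, switches = [], []
    for i in range(0, h, size):
        pooled_row, switch_row = [], []
        for j in range(0, w, size):
            # Scan the current pooling window for its maximum and its position.
            best_val, best_pos = x[i][j], (i, j)
            for di in range(size):
                for dj in range(size):
                    if x[i + di][j + dj] > best_val:
                        best_val, best_pos = x[i + di][j + dj], (i + di, j + dj)
            pooled_row.append(best_val)
            switch_row.append(best_pos)
        pooled.append(pooled_row)
        switches.append(switch_row)
    return pooled, switches


def unpool(pooled, switches, out_h, out_w):
    """Place each pooled value back at its recorded position; everything else stays zero."""
    out = [[0] * out_w for _ in range(out_h)]
    for pooled_row, switch_row in zip(pooled, switches):
        for value, (r, c) in zip(pooled_row, switch_row):
            out[r][c] = value
    return out


x = [[1, 3, 2, 0],
     [4, 2, 1, 5],
     [0, 1, 7, 2],
     [6, 2, 3, 1]]
pooled, switches = max_pool_with_switches(x)
restored = unpool(pooled, switches, 4, 4)
print(pooled)    # [[4, 5], [6, 7]]
print(restored)  # [[0, 0, 0, 0], [4, 0, 0, 5], [0, 0, 7, 0], [6, 0, 0, 0]]
```

The result is only an approximate inverse: the maxima return to their original positions, but the non-maximal values are lost and come back as zeros.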

Deconvolution in the context of convolutional neural networks is often used to denote a sort of reverse convolution, which importantly and confusingly is not actually a proper mathematical deconvolution. In contrast to unpooling, the upsampling performed by a 'deconvolution' layer can be learned. It is often used for upsampling the output of a convnet to the original image resolution. I wrote another answer on this topic here. This kind of deconvolution is more appropriately referred to as convolution with fractional strides, or transposed convolution.
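A one-dimensional sketch may help show why both names fit. The functions below are hypothetical illustrations in plain Python, not any framework's API: the first computes a transposed convolution directly, and the second computes the same thing as a "fractionally strided" convolution by inserting zeros between the inputs and then running an ordinary stride-1 convolution.

```python
def conv_transpose_1d(x, kernel, stride=2):
    """Transposed 1D convolution: each input value adds a scaled copy of the kernel to the output."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, value in enumerate(x):
        for k, weight in enumerate(kernel):
            out[i * stride + k] += value * weight
    return out


def zero_insert_then_convolve(x, kernel, stride=2):
    """Same result, seen as a convolution with fractional stride 1/stride."""
    # Insert (stride - 1) zeros between neighbouring input values.
    stuffed = []
    for i, value in enumerate(x):
        stuffed.append(value)
        if i < len(x) - 1:
            stuffed.extend([0.0] * (stride - 1))
    # Pad both ends, then slide the flipped kernel with stride 1.
    pad = [0.0] * (len(kernel) - 1)
    padded = pad + stuffed + pad
    flipped = kernel[::-1]
    return [
        sum(padded[n + k] * flipped[k] for k in range(len(kernel)))
        for n in range(len(padded) - len(kernel) + 1)
    ]


x = [1.0, 2.0, 3.0]
nearest = conv_transpose_1d(x, [1.0, 1.0])      # kernel that repeats each value
linear = conv_transpose_1d(x, [0.5, 1.0, 0.5])  # kernel that interpolates linearly
print(nearest)  # [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
print(linear)   # [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 1.5]
```

The kernel values are the parameters: with fixed weights the layer reproduces simple upsampling schemes, and in a network those weights would be trained like any other filter, which is the sense in which this upsampling can be learned.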

Then there is proper deconvolution which reverses the effect of a convolution (Deconvolution - Wikipedia). I don’t think people actually use this in the context of convolutional neural networks.
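For contrast, here is a minimal sketch of deconvolution in the mathematical sense, again in plain Python with hypothetical function names. It assumes a noise-free signal, a known kernel, and a nonzero first kernel coefficient, and it recovers the input by polynomial long division; practical deconvolution methods have to deal with noise and are considerably more involved.

```python
def convolve(signal, kernel):
    """Full 1D convolution of two lists."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for k, w in enumerate(kernel):
            out[i + k] += s * w
    return out


def deconvolve(observed, kernel):
    """Recover the signal from observed = convolve(signal, kernel).

    Assumptions: the kernel is known, kernel[0] != 0, and there is no noise.
    """
    n = len(observed) - len(kernel) + 1
    remainder = list(observed)
    signal = []
    for i in range(n):
        # The leading remaining coefficient determines the next signal value.
        s = remainder[i] / kernel[0]
        signal.append(s)
        # Subtract that value's contribution from the remainder.
        for k, w in enumerate(kernel):
            remainder[i + k] -= s * w
    return signal


signal = [1.0, 2.0, 3.0]
kernel = [1.0, 0.5]
observed = convolve(signal, kernel)
recovered = deconvolve(observed, kernel)
print(observed)   # [1.0, 2.5, 4.0, 1.5]
print(recovered)  # [1.0, 2.0, 3.0]
```

Here the original signal comes back exactly, which a transposed convolution does not do in general.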

I don't know much about convolutional sparse coding, but it appears to me from glancing at a few papers that those approaches use the former kind of 'deconvolution', i.e. transposed convolution, to allow you to go from a sparse image representation obtained using convnets back to the original image resolution. (Happy to be corrected on this.)

My rough translation of the answer:

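Upsampling refers to any technique that brings an image up to a higher resolution.
The simplest approach is resampling with interpolation: take the input image, rescale it to the desired size, and compute the pixel value at each point with an interpolation method such as bilinear interpolation.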

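In the context of CNNs, unpooling usually refers to the reverse of max pooling. Max pooling in a CNN is not invertible, but an approximate inverse can be obtained by recording the location of the maximum within each pooling region in a set of switch variables. In a deconvnet, the unpooling operation uses these switches to place the reconstructions from the layer above into the appropriate locations in the current layer, which preserves the original structure to some extent.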

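In the context of CNNs, deconvolution usually denotes a kind of reverse convolution rather than deconvolution in the true mathematical sense, which is both important and confusing. Unlike unpooling, upsampling with deconvolution can be learned. Deconvolution is often used to upsample the output of a CNN to the original image resolution. I wrote another answer on this topic here.

Deconvolution is more appropriately called fractionally strided convolution or transposed convolution.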

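Then there is proper deconvolution, which reverses the effect of a convolution (Deconvolution - Wikipedia). I don't think people actually use this definition in the context of CNNs.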

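I know little about convolutional sparse coding, but from glancing at a few papers it appears that these approaches use the former kind of "deconvolution", i.e. transposed convolution, to go from the sparse image representation produced by a convnet back to the original image resolution. (Happy to be corrected on this.)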

My translation skills are limited; corrections are welcome.

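Since upsampling refers to any technique that brings an image up to a higher resolution, we can say that deconvolution is one way of performing upsampling.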