Computer vision - low-level vision
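Convolution in image processing
- Basic concept
H(k,l) is the mask (also called a template or kernel): a matrix of weights, much like a small patch of image.
The value of each output pixel (i,j) is a weighted average of the values of its neighbouring pixels, with the weights given by the mask.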
You can imagine sliding the mask across the input image, filling in the values of the output (filtered) image as you go.
Alternatively, you can imagine the mask replicated at every pixel location in the output image, with all the results generated in parallel (like the Difference of Gaussian filters in the retina).
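Written out, the weighted average takes the usual form of a discrete 2-D convolution. The symbols I (input image) and O (output image) are names assumed here for illustration; only H(k,l) and the pixel index (i,j) come from the notes. The sums run over the mask offsets k and l, with (0,0) at the mask centre.

```latex
O(i,j) = \sum_{k}\sum_{l} H(k,l)\, I(i-k,\; j-l)
```

The minus signs in the indices of I are what make this a convolution rather than a correlation: they are equivalent to rotating the mask by 180° before laying it over the image.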
- Method
For each image pixel in turn:
- Centre the rotated mask over that pixel.
- Multiply each mask element by the corresponding image pixel value.
- Sum these products and write the result to the corresponding pixel location in the output image.
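Rotating the mask: rotate it by 180° about its centre. This can be done in two steps: flip it top-to-bottom about the horizontal centre line, then flip it left-to-right about the vertical centre line.
Multiplying corresponding elements and summing: here the boundary is padded with zeros, i.e. pixels that fall outside the image edge are taken to have the value 0.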
Proceeding in the same way for every pixel gives the final filtered image.
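The method can be sketched in a few lines of plain Python. This is a minimal illustration, not an optimised implementation: the function names `rotate180` and `convolve_zero_pad` and the sample values are made up for this example, images are assumed to be lists of lists, and the mask is assumed to have odd height and width so that it has a centre element.

```python
def rotate180(mask):
    """Rotate a 2-D mask by 180 degrees: flip top-to-bottom, then left-to-right."""
    return [row[::-1] for row in mask[::-1]]


def convolve_zero_pad(image, mask):
    """Convolve `image` with an odd-sized `mask`; pixels outside the image count as 0."""
    rows, cols = len(image), len(image[0])
    m_rows, m_cols = len(mask), len(mask[0])
    r_off, c_off = m_rows // 2, m_cols // 2  # distance from the mask centre to its edge
    rotated = rotate180(mask)
    output = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0
            for k in range(m_rows):
                for l in range(m_cols):
                    # Image pixel lying under mask element (k, l) when the
                    # rotated mask is centred on (i, j).
                    y, x = i + k - r_off, j + l - c_off
                    if 0 <= y < rows and 0 <= x < cols:  # outside the image: zero
                        total += rotated[k][l] * image[y][x]
            output[i][j] = total
    return output


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
mask = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
result = convolve_zero_pad(image, mask)
print(result[1][1])  # centre pixel: 165
print(result[0][0])  # top-left corner, where part of the mask falls on padding: 26
```

Because the mask is rotated first, the centre value is 165 rather than the 285 that multiplying the unrotated mask element by element would give.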
How to handle the mask falling off the image at the image boundaries?
- The two most common approaches:
- Pad the input image with zeros (as in the example above).
- Make the output image smaller than the input image (the red area in the example below), i.e. only apply the mask at locations on the input image where it does not fall off the edge.
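The two boundary strategies can be compared with a small sketch. The helper names `convolve_valid` and `zero_pad` are hypothetical, and the 4x4 image of ones and the 3x3 mask of ones are arbitrary test values. Here zero padding is done explicitly, by surrounding the image with a border of zeros and then applying the mask only where it fits.

```python
def rotate180(mask):
    """Rotate a 2-D mask by 180 degrees."""
    return [row[::-1] for row in mask[::-1]]


def convolve_valid(image, mask):
    """Apply the mask only where it lies entirely inside the image.

    The output is smaller than the input by (mask size - 1) in each dimension.
    """
    rows, cols = len(image), len(image[0])
    m_rows, m_cols = len(mask), len(mask[0])
    rotated = rotate180(mask)
    out_rows, out_cols = rows - m_rows + 1, cols - m_cols + 1
    output = [[0] * out_cols for _ in range(out_rows)]
    for i in range(out_rows):
        for j in range(out_cols):
            output[i][j] = sum(
                rotated[k][l] * image[i + k][j + l]
                for k in range(m_rows)
                for l in range(m_cols)
            )
    return output


def zero_pad(image, pad_rows, pad_cols):
    """Surround the image with a border of zeros."""
    width = len(image[0]) + 2 * pad_cols
    top = [[0] * width for _ in range(pad_rows)]
    middle = [[0] * pad_cols + list(row) + [0] * pad_cols for row in image]
    bottom = [[0] * width for _ in range(pad_rows)]
    return top + middle + bottom


image = [[1] * 4 for _ in range(4)]  # 4x4 image of ones
mask = [[1] * 3 for _ in range(3)]   # 3x3 mask of ones

smaller = convolve_valid(image, mask)                     # option 2: 2x2 output
same_size = convolve_valid(zero_pad(image, 1, 1), mask)   # option 1: 4x4 output

print(smaller)
print(same_size)
```

With zero padding the output keeps the 4x4 size of the input, but values near the border are lower (4 at the corners, 6 along the edges, 9 in the interior) because part of the mask lands on padding. The smaller output contains only the unaffected interior values.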
- Masks as point-spread functions:
If a mask is convolved with a simple image that has just an isolated white dot on a black background, the output is the mask itself, shifted to the row and column of the isolated pixel in the input.
An ordinary image can be thought of as a combination of such points, one for each pixel. So the result of a convolution can be thought of as a superposition of masks, each one weighted by the grey level of an image pixel.
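The point-spread view is easy to check numerically. The sketch below uses a hypothetical helper `convolve_zero_pad` (zero-padded convolution on lists of lists) and an arbitrary 3x3 mask. It convolves a 5x5 image containing a single dot and reads back the 3x3 neighbourhood around that dot.

```python
def rotate180(mask):
    """Rotate a 2-D mask by 180 degrees."""
    return [row[::-1] for row in mask[::-1]]


def convolve_zero_pad(image, mask):
    """Zero-padded convolution of `image` with an odd-sized `mask`."""
    rows, cols = len(image), len(image[0])
    m_rows, m_cols = len(mask), len(mask[0])
    r_off, c_off = m_rows // 2, m_cols // 2
    rotated = rotate180(mask)
    output = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(m_rows):
                for l in range(m_cols):
                    y, x = i + k - r_off, j + l - c_off
                    if 0 <= y < rows and 0 <= x < cols:
                        output[i][j] += rotated[k][l] * image[y][x]
    return output


def dot_image(size, row, col, grey_level):
    """Black (all-zero) square image with one pixel set to `grey_level`."""
    img = [[0] * size for _ in range(size)]
    img[row][col] = grey_level
    return img


mask = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]

# A dot of grey level 1 at (2, 2): the output is a copy of the mask centred there.
response = convolve_zero_pad(dot_image(5, 2, 2, 1), mask)
patch = [row[1:4] for row in response[1:4]]
print(patch == mask)  # True

# A dot of grey level 3: the copy of the mask is weighted by the grey level.
scaled = convolve_zero_pad(dot_image(5, 2, 2, 3), mask)
scaled_patch = [row[1:4] for row in scaled[1:4]]
print(scaled_patch)
```

The copy that appears in the output is the mask in its original orientation, not the rotated one: the rotation in the method is exactly what makes this come out the right way round.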
- Masks as templates:
The convolution output is at a maximum when large values in the input are multiplied by large values in the mask. This means that convolution masks respond most strongly to image features that resemble the rotated mask.
The rotated mask is like a template that is scanned across the image to find image features matching that template.
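A small experiment illustrates the template view. The 3x3 `pattern`, the 6x6 image and the helper names below are invented for this sketch. The pattern is pasted into an otherwise black image, the mask is chosen as the 180° rotation of the pattern (so that the rotated mask equals the pattern), and the strongest response is located.

```python
def rotate180(mask):
    """Rotate a 2-D mask by 180 degrees."""
    return [row[::-1] for row in mask[::-1]]


def convolve_zero_pad(image, mask):
    """Zero-padded convolution of `image` with an odd-sized `mask`."""
    rows, cols = len(image), len(image[0])
    m_rows, m_cols = len(mask), len(mask[0])
    r_off, c_off = m_rows // 2, m_cols // 2
    rotated = rotate180(mask)
    output = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(m_rows):
                for l in range(m_cols):
                    y, x = i + k - r_off, j + l - c_off
                    if 0 <= y < rows and 0 <= x < cols:
                        output[i][j] += rotated[k][l] * image[y][x]
    return output


# Feature to look for (an arbitrary corner-like shape).
pattern = [[1, 1, 1],
           [1, 0, 0],
           [1, 0, 0]]

# Black 6x6 image with the pattern pasted in, centred on pixel (3, 2).
image = [[0] * 6 for _ in range(6)]
for k in range(3):
    for l in range(3):
        image[2 + k][1 + l] = pattern[k][l]

# Choose the mask so that its rotated version is the pattern itself.
mask = rotate180(pattern)
response = convolve_zero_pad(image, mask)

# Strongest response and where it occurs, as (value, (row, col)).
best = max((response[i][j], (i, j)) for i in range(6) for j in range(6))
print(best)  # (5, (3, 2))
```

The peak sits exactly where the pattern was pasted. Note that this simple test image contains nothing else: in a real image a uniformly bright region can also produce a large output, since the response depends on the magnitude of the pixel values as well as on their shape.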
- Mask examples
- Silly
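In the figure above, convolving the image with the upper mask returns the image itself, unchanged; convolving it with the lower mask shifts every pixel one unit to the right (remember that the mask must be rotated before it is applied).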
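Both behaviours can be reproduced with two 3x3 masks. The figure is not reproduced in these notes, so the exact masks are an assumption: the identity mask is taken to have a single 1 at its centre, and the shifting mask a single 1 immediately to the right of the centre, which is the position that gives a rightward shift under true (rotated-mask) convolution. The helper name and the sample image are likewise illustrative.

```python
def rotate180(mask):
    """Rotate a 2-D mask by 180 degrees."""
    return [row[::-1] for row in mask[::-1]]


def convolve_zero_pad(image, mask):
    """Zero-padded convolution of `image` with an odd-sized `mask`."""
    rows, cols = len(image), len(image[0])
    m_rows, m_cols = len(mask), len(mask[0])
    r_off, c_off = m_rows // 2, m_cols // 2
    rotated = rotate180(mask)
    output = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(m_rows):
                for l in range(m_cols):
                    y, x = i + k - r_off, j + l - c_off
                    if 0 <= y < rows and 0 <= x < cols:
                        output[i][j] += rotated[k][l] * image[y][x]
    return output


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

identity_mask = [[0, 0, 0],
                 [0, 1, 0],
                 [0, 0, 0]]

# Assumed layout: the 1 sits one place to the right of the centre.
shift_mask = [[0, 0, 0],
              [0, 0, 1],
              [0, 0, 0]]

unchanged = convolve_zero_pad(image, identity_mask)
shifted = convolve_zero_pad(image, shift_mask)

print(unchanged)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(shifted)    # [[0, 1, 2], [0, 4, 5], [0, 7, 8]]
```

After rotation the 1 in `shift_mask` lies to the left of the centre, so each output pixel copies its left-hand neighbour and the image content moves one column to the right. The zeros entering at the left edge come from the zero padding.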
- Smoothing