Image Convolutions

Latest recommended article published on 2024-08-30 10:21:41

GarfieldEr007

Category: Computer Vision (CV). Tags: image convolution, Image Convolutions

1. Convolutions

Convolution is a general-purpose signal processing technique. People who study electrical or electronics engineering will tell you about the near-infinite sleepless nights convolutions have given them. Entire books have been written on the topic, and the questions and theorems that need to be proved can seem insurmountable. For computer vision, though, we'll deal with just a few simple things.

The Kernel

A convolution lets you do a wide variety of things: calculate derivatives, detect edges, apply blurs, and more. All of this is done with a "convolution kernel".

The convolution kernel is a small matrix. This matrix has numbers in each cell and has an anchor point:

The convolution kernel

This kernel slides over an image and does its thing. The "anchor" point is used to determine the position of the kernel with respect to the image.

The transformation

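The anchor point starts at the top-left corner of the image and moves over each pixel in turn. At each position, the kernel overlaps a few pixels of the image. Each overlapping pair of numbers is multiplied, and the products are added together. Finally, the value at the current position is set to this sum. (Strictly speaking, multiplying and adding without flipping the kernel is called correlation; a true convolution flips the kernel first. For symmetric kernels the two give the same result.)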

Here's an example:

An example of the transformation

The matrix on the left is the image and the one on the right is the kernel. Suppose the kernel is at the highlighted position. The '9' of the kernel overlaps the '4' of the image, so you calculate their product: 36. Next, the '3' of the kernel overlaps the '3' of the image, so you multiply them to get 9 and add it to 36, giving 36 + 9 = 45. Do the same for the remaining 7 overlapping pairs to get the total sum. This sum is stored at the position of the '2' in the output image.
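The multiply-and-add procedure can be sketched in plain Python. This is only a minimal illustration: the function name `apply_kernel`, the zero treatment of pixels outside the image, and the sample numbers are assumptions made here, not the values from the figure.

```python
def apply_kernel(image, kernel, anchor=(1, 1)):
    """Slide `kernel` over `image` (lists of lists) and return the sums.

    Pixels that fall outside the image are treated as 0 (an assumption;
    other border strategies exist). The kernel is not flipped, matching
    the multiply-and-add procedure described in the text.
    """
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    a_row, a_col = anchor
    # Write into a separate output so already-processed pixels
    # do not influence later sums.
    output = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    rr = r + i - a_row  # image row under kernel cell (i, j)
                    cc = c + j - a_col  # image column under kernel cell (i, j)
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += image[rr][cc] * kernel[i][j]
            output[r][c] = total
    return output


# Hypothetical values, not the ones from the figure.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
result = apply_kernel(image, kernel)
print(result[1][1])  # center pixel: 5 + 2 + 4 + 6 + 8 = 25
```

Writing into a separate output list matters: if the sums were stored back into the input, later positions would read values that had already been modified.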

Speed optimizations

The most direct way to compute a convolution is to use nested for loops. But that involves a lot of repeated calculation: every output pixel needs one multiplication per kernel cell, so the time to compute the convolution grows quickly as the image and the kernel get larger.

Techniques have been developed to calculate convolutions rapidly. One such technique uses the Discrete Fourier Transform (DFT), which turns the entire convolution into a simple element-wise multiplication in the frequency domain. Fortunately, you don't need to know the math to do this in OpenCV: it decides automatically whether to work in the frequency domain (after the DFT) or not.
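To see why the DFT route works, here is a small NumPy sketch (not OpenCV's implementation) that computes the same "full" convolution two ways: by direct shift-and-add, and by multiplying zero-padded spectra. The function names and the random test data are assumptions made for illustration.

```python
import numpy as np


def convolve_direct(image, kernel):
    """Full 2-D convolution by shifting and adding scaled copies of the image."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih + kh - 1, iw + kw - 1))
    for i in range(kh):
        for j in range(kw):
            out[i:i + ih, j:j + iw] += kernel[i, j] * image
    return out


def convolve_fft(image, kernel):
    """Same result via the DFT: transform, multiply element-wise, invert."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    shape = (ih + kh - 1, iw + kw - 1)  # zero-pad so the result does not wrap around
    spectrum = np.fft.rfft2(image, s=shape) * np.fft.rfft2(kernel, s=shape)
    return np.fft.irfft2(spectrum, s=shape)


rng = np.random.default_rng(0)
image = rng.random((8, 8))
kernel = rng.random((3, 3))

direct = convolve_direct(image, kernel)
via_fft = convolve_fft(image, kernel)
print(np.allclose(direct, via_fft))
```

For a small 3x3 kernel the direct method is usually faster in practice; the frequency-domain route tends to pay off only as kernels get large.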

Problematic corners and edges

The kernel is two-dimensional, so you run into problems when it is near the edges or corners of the image. Here's an example: if the kernel (in the above example) is at the top-right position, the '0' of the kernel will be over the '3' in the image, but the '1' will be outside the image, so we have no idea what to do with it. Two things are possible:

Ignore those positions, or
Do something about the edges

Usually people choose to do something about it. They create extra pixels near the edges. There are a few ways to create extra pixels:

Set a constant value for these pixels
Duplicate edge pixels
Reflect edges (like a mirror effect)
Wrap the image around (copy pixels from the other end)

This usually fixes the problems that might arise.
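The four strategies map naturally onto the padding modes of NumPy's `np.pad`, shown here on a single row of pixels. This is a sketch for illustration; the mapping of "reflect" to `mode="reflect"` (which mirrors without repeating the edge pixel) rather than `mode="symmetric"` is a choice made here.

```python
import numpy as np

row = np.array([1, 2, 3, 4])
pad = 2  # number of extra pixels created on each side

padded = {
    # Set a constant value for the extra pixels
    "constant": np.pad(row, pad, mode="constant", constant_values=0),
    # Duplicate edge pixels
    "edge": np.pad(row, pad, mode="edge"),
    # Reflect edges (mirror, without repeating the edge pixel itself)
    "reflect": np.pad(row, pad, mode="reflect"),
    # Wrap the image around (copy pixels from the other end)
    "wrap": np.pad(row, pad, mode="wrap"),
}

for name, values in padded.items():
    print(f"{name:>8}: {values.tolist()}")
# constant: [0, 0, 1, 2, 3, 4, 0, 0]
#     edge: [1, 1, 1, 2, 3, 4, 4, 4]
#  reflect: [3, 2, 1, 2, 3, 4, 3, 2]
#     wrap: [3, 4, 1, 2, 3, 4, 1, 2]
```

The same call works on a 2-D image, in which case every side of the image gets `pad` extra rows or columns.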

Summary

You learned a powerful technique that can be used for a lot of different purposes. We'll see a few of those next.

2. Image convolution examples

A convolution is very useful for signal processing in general. There is a lot of complex mathematical theory available for convolutions. For digital image processing, you don't have to understand all of that. You can use a simple matrix as an image convolution kernel and do some interesting things!

Simple box blur

Here's the first and simplest example. This convolution kernel has an averaging effect, so you end up with a slight blur. The image convolution kernel is:

The convolution kernel for a simple blur

Note that the sum of all elements of this matrix is 1.0. This is important. If the sum is not exactly one, the resultant image will be brighter or darker.
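A quick way to check the brightness claim is to apply a box kernel to a uniform gray patch. The sketch below assumes a 3x3 kernel with every cell equal to 1/9 (a common form of the box blur, used here since the figure is not reproduced) and a hypothetical helper `filter_center`.

```python
import numpy as np

box = np.full((3, 3), 1.0 / 9.0)  # assumed 3x3 averaging kernel, sums to 1
flat = np.full((5, 5), 100.0)     # uniform gray test image


def filter_center(image, kernel):
    """Weighted sum of the 3x3 neighborhood around the center pixel."""
    r, c = image.shape[0] // 2, image.shape[1] // 2
    patch = image[r - 1:r + 2, c - 1:c + 2]
    return float(np.sum(patch * kernel))


same = filter_center(flat, box)            # kernel sums to 1.0
brighter = filter_center(flat, box * 1.5)  # kernel sums to 1.5
darker = filter_center(flat, box * 0.5)    # kernel sums to 0.5
print(round(same), round(brighter), round(darker))
```

On a uniform region the output is simply the pixel value times the kernel sum, which is why a sum of 1 leaves overall brightness unchanged.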

Here's a blur that I got on an image:

A simple blur done with convolutions

Gaussian blur

Gaussian blur has certain mathematical properties that make it important for computer vision, and you can approximate it with an image convolution. The image convolution kernel for a Gaussian blur is:

Here's a result that I got:

Result of Gaussian blur with a convolution
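One commonly used small approximation of a Gaussian kernel is built from the binomial weights 1, 2, 1. The sketch below is an assumption for illustration and is not necessarily the kernel shown in the original figure.

```python
import numpy as np

# Binomial weights approximate a 1-D Gaussian; the outer product
# turns them into a 2-D kernel.
weights = np.array([1.0, 2.0, 1.0])
gaussian_3x3 = np.outer(weights, weights) / 16.0  # divide by 16 so the sum is 1

print(gaussian_3x3 * 16)  # integer weights before normalization
print(gaussian_3x3.sum())
```

Unlike the box blur, the center pixel gets the largest weight and the corners the smallest, so nearby pixels influence the result more than distant ones.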

Line detection with image convolutions

With image convolutions, you can easily detect lines. Here are four convolution kernels that detect horizontal lines, vertical lines, and lines at 45 degrees (in both diagonal directions):

Convolution kernels for line detection

I looked for horizontal lines on the house image. The result I got for this image convolution was:

Detecting horizontal lines with a convolution
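A commonly used set of 3x3 line-detection kernels puts 2s along the direction of interest and -1s elsewhere. The values below are that standard textbook set, assumed here to match the figure; the helper `response_at_center` and the tiny test images are hypothetical.

```python
import numpy as np

horizontal = np.array([[-1, -1, -1],
                       [ 2,  2,  2],
                       [-1, -1, -1]], dtype=float)
vertical = horizontal.T
diag_plus_45 = np.array([[-1, -1,  2],
                         [-1,  2, -1],
                         [ 2, -1, -1]], dtype=float)
diag_minus_45 = np.fliplr(diag_plus_45)


def response_at_center(image, kernel):
    """Multiply-and-add the kernel over the 3x3 patch at the image center."""
    r, c = image.shape[0] // 2, image.shape[1] // 2
    return float(np.sum(image[r - 1:r + 2, c - 1:c + 2] * kernel))


h_line = np.zeros((5, 5))
h_line[2, :] = 1.0   # one-pixel-wide horizontal line
v_line = h_line.T    # the same line, vertical

print(response_at_center(h_line, horizontal))  # strong response: 6.0
print(response_at_center(v_line, horizontal))  # no response: 0.0
```

Each kernel sums to zero, so flat regions give no response, and only a line in the matching direction gives the maximum response.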

Edge detection

The kernels above are, in a way, edge detectors. The only catch is that they have separate components for horizontal and vertical lines. One way to "combine" the results is to merge the convolution kernels. The new image convolution kernel looks like this:

The edge detection convolution kernel

Here's the result I got with edge detection:

Edge detection with convolutions
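A widely used direction-independent edge kernel has 8 at the center and -1 everywhere else. The sketch below assumes that form (the figure is not reproduced here) and checks its behavior on a hypothetical image with a single vertical step edge.

```python
import numpy as np

# Assumed form of the combined edge kernel: all directions at once.
edge_kernel = np.array([[-1, -1, -1],
                        [-1,  8, -1],
                        [-1, -1, -1]], dtype=float)

# Dark left half, bright right half: a vertical step edge at column 3.
image = np.zeros((5, 6))
image[:, 3:] = 10.0


def response_at(image, kernel, r, c):
    """Multiply-and-add the kernel over the 3x3 patch centered at (r, c)."""
    return float(np.sum(image[r - 1:r + 2, c - 1:c + 2] * kernel))


flat_response = response_at(image, edge_kernel, 2, 1)  # inside the dark region
edge_response = response_at(image, edge_kernel, 2, 3)  # on the bright side of the step
print(flat_response, edge_response)
```

Because the kernel sums to zero, uniform regions map to zero and only places where the intensity changes produce a response.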

The Sobel Edge Operator

The operators above are very prone to noise. The Sobel edge operators have a smoothing effect, so they're less affected by noise. Again, there's a horizontal component and a vertical component.

The Sobel operator's convolution kernel

On applying this image convolution, the result was:

Result of the horizontal Sobel operator
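The smoothing effect is visible in how the standard 3x3 Sobel kernels factor into a smoothing part and a difference part. The sketch below builds them that way; the variable names are chosen here for illustration, and sign conventions vary between references.

```python
import numpy as np

smooth = np.array([1.0, 2.0, 1.0])  # averaging across the edge direction
diff = np.array([-1.0, 0.0, 1.0])   # central difference along the gradient

# Responds to intensity changes along x (i.e. vertical edges).
sobel_x = np.outer(smooth, diff)
# Responds to intensity changes along y (i.e. horizontal edges).
sobel_y = np.outer(diff, smooth)

print(sobel_x)
print(sobel_y)
```

Each kernel takes a derivative in one direction while averaging three neighboring rows or columns in the other, which is what makes it less sensitive to noise than a bare difference.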

The Laplacian operator

The Laplacian is the sum of the second derivatives of the image. It is extremely sensitive to noise, so it isn't used as much as other operators, unless you have specific requirements.

The kernel for the Laplacian operator

Here's the result with the convolution kernel without diagonals:

The result of convolution with the Laplacian operator
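The second-derivative behavior can be checked numerically. The sketch below uses the common 3x3 Laplacian kernels with and without diagonals (the sign convention is an assumption; some references negate them) on two synthetic images: a linear ramp and a quadratic surface.

```python
import numpy as np

laplacian = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype=float)        # without diagonals
laplacian_diag = np.array([[1,  1, 1],
                           [1, -8, 1],
                           [1,  1, 1]], dtype=float)   # with diagonals


def response_at_center(image, kernel):
    """Multiply-and-add the kernel over the 3x3 patch at the image center."""
    r, c = image.shape[0] // 2, image.shape[1] // 2
    return float(np.sum(image[r - 1:r + 2, c - 1:c + 2] * kernel))


ys, xs = np.mgrid[0:5, 0:5].astype(float)
ramp = 3.0 * xs + 2.0 * ys  # constant slope: second derivatives are zero
bowl = xs ** 2              # second derivative in x is 2 everywhere

print(response_at_center(ramp, laplacian))  # 0.0
print(response_at_center(bowl, laplacian))  # 2.0
```

A steady gradient gives no response at all, while any change in slope does, which is also why isolated noisy pixels produce strong responses.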

The Laplacian of Gaussian

The Laplacian alone has the disadvantage of being extremely sensitive to noise. So, smoothing the image before applying a Laplacian improves the results we get. This is done with a 5x5 image convolution kernel.

The kernel for the Laplacian of Gaussian operation

The result on applying this image convolution was:

The result of applying the Laplacian of Gaussian operator
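One way to see why a single 5x5 kernel is enough: convolving a 3x3 smoothing kernel with a 3x3 Laplacian kernel yields a 5x5 kernel that does both steps at once. The sketch below assumes a binomial Gaussian approximation and a Laplacian without diagonals, so the resulting values are illustrative and will differ from the kernel in the original figure.

```python
import numpy as np


def convolve_full(a, b):
    """Full 2-D convolution by shifting and adding scaled copies of `a`."""
    ah, aw = a.shape
    bh, bw = b.shape
    out = np.zeros((ah + bh - 1, aw + bw - 1))
    for i in range(bh):
        for j in range(bw):
            out[i:i + ah, j:j + aw] += b[i, j] * a
    return out


gaussian = np.outer([1.0, 2.0, 1.0], [1.0, 2.0, 1.0]) / 16.0  # assumed smoothing kernel
laplacian = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype=float)                # assumed Laplacian kernel

log_kernel = convolve_full(gaussian, laplacian)  # combined 5x5 kernel

# Smoothing then Laplacian gives the same result as one pass with the 5x5 kernel.
rng = np.random.default_rng(0)
image = rng.random((6, 6))
two_steps = convolve_full(convolve_full(image, gaussian), laplacian)
one_step = convolve_full(image, log_kernel)
print(log_kernel.shape, np.allclose(two_steps, one_step))
```

Merging the two kernels ahead of time means the image only needs to be filtered once.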

Summary

You got to know about some important operations that can be approximated using an image convolution. You learned the exact convolution kernels used and also saw an example of how each operator modifies an image. I hope this helped!

from: http://aishack.in/tutorials/convolutions/