opencv - Image Processing in OpenCV [1]

Latest recommended article published on 2022-11-12 21:04:07

风吴痕


References:

1. http://docs.opencv.org/3.3.0/ official API documentation

2. http://docs.opencv.org/3.3.0/d6/d00/tutorial_py_root.html official English tutorial

3. https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_tutorials.html

4. https://github.com/makelove/OpenCV-Python-Tutorial# advanced tutorial

5. https://docs.opencv.org/3.3.0/index.html official English tutorial

6. https://github.com/abidrahmank/OpenCV2-Python-Tutorials

7. https://www.learnopencv.com/

8. http://answers.opencv.org/questions/ OpenCV forum

Note: the installed version is opencv_python-3.3.0-cp36-cp36m-win_amd64.whl

Reference: https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_tutorials.html

Changing Colorspaces

Goal

In this tutorial, you will learn how to convert images from one color-space to another, like BGR $\leftrightarrow$ Gray, BGR $\leftrightarrow$ HSV etc.
In addition to that, we will create an application which extracts a colored object in a video.
You will learn the following functions: cv2.cvtColor(), cv2.inRange() etc.

Changing Color-space

For BGR $\rightarrow$ Gray conversion we use the flag cv2.COLOR_BGR2GRAY. Similarly for BGR $\rightarrow$ HSV, we use the flag cv2.COLOR_BGR2HSV. To list the other available conversion flags, run the following commands in a Python terminal:

>>> import cv2
>>> flags = [i for i in dir(cv2) if i.startswith('COLOR_')]
>>> print(flags)

For HSV, the Hue range is [0,179], the Saturation range is [0,255] and the Value range is [0,255]. Different software packages use different scales, so if you compare OpenCV values with theirs, you need to normalize these ranges.
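As a rough illustration of that normalization, here is a minimal sketch in plain Python. It assumes the other tool reports hue in degrees (0-360) and saturation and value as percentages (0-100). Both function names are hypothetical helpers and are not part of OpenCV.

```python
def opencv_hsv_to_standard(h, s, v):
    """Map OpenCV 8-bit HSV (H 0-179, S 0-255, V 0-255) to (degrees, percent, percent)."""
    return (h * 2.0, s / 255.0 * 100.0, v / 255.0 * 100.0)


def standard_hsv_to_opencv(h_deg, s_pct, v_pct):
    """Map (degrees, percent, percent) back to OpenCV's 8-bit HSV scale."""
    h = int(round(h_deg / 2.0)) % 180          # 360 degrees wraps back to 0
    s = int(round(s_pct / 100.0 * 255.0))
    v = int(round(v_pct / 100.0 * 255.0))
    return (h, s, v)


# Pure green is (60, 255, 255) on OpenCV's scale, i.e. 120 degrees at full saturation/value.
print(opencv_hsv_to_standard(60, 255, 255))
# Pure blue at 240 degrees becomes hue 120 on OpenCV's scale.
print(standard_hsv_to_opencv(240, 100, 100))
```

The factor of 2 on hue comes from OpenCV storing half-degrees so that the value fits in 8 bits.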

Object Tracking

In HSV, it is easier to represent a color than in the RGB color-space. In our application, we will try to extract a blue colored object. So here is the method:

Take each frame of the video
Convert from BGR to HSV color-space
Threshold the HSV image for a range of blue color
Now extract the blue object alone; we can then do whatever we want with that image.

import cv2
import numpy as np

# cap = cv2.VideoCapture(0)  # use the default camera instead of a file
cap = cv2.VideoCapture('TownCentreXVID.mp4')

while True:

    # Take each frame; stop when the video ends or a frame cannot be read
    ret, frame = cap.read()
    if not ret:
        break

    # Convert BGR to HSV
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    # define range of blue color in HSV
    lower_blue = np.array([110,50,50])
    upper_blue = np.array([130,255,255])

    # Threshold the HSV image to get only blue colors
    mask = cv2.inRange(hsv, lower_blue, upper_blue)

    # Bitwise-AND mask and original image
    res = cv2.bitwise_and(frame,frame, mask= mask)

    cv2.imshow('frame',frame)
    cv2.imshow('mask',mask)
    cv2.imshow('res',res)
    k = cv2.waitKey(5) & 0xFF
    if k == 27:  # Esc key
        break

cap.release()
cv2.destroyAllWindows()

How to find HSV values to track?

For example, to find the HSV value of green, try the following commands in a Python terminal:

>>> green = np.uint8([[[0,255,0 ]]])
>>> hsv_green = cv2.cvtColor(green,cv2.COLOR_BGR2HSV)
>>> print(hsv_green)
[[[ 60 255 255]]]

Now you take [H-10, 100,100] and [H+10, 255, 255] as lower bound and upper bound respectively.
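The rule above can be written as a small helper. This is only a sketch: the function name hsv_bounds is hypothetical, and clamping the hue to OpenCV's [0,179] range is an assumption added here, not something the tutorial specifies.

```python
def hsv_bounds(h, delta=10, s_min=100, v_min=100):
    """Return (lower, upper) HSV bounds around hue h, following [H-10,100,100] / [H+10,255,255]."""
    lower = [max(h - delta, 0), s_min, v_min]      # assumption: clamp hue at 0
    upper = [min(h + delta, 179), 255, 255]        # assumption: clamp hue at 179
    return lower, upper


# Green has H = 60 in OpenCV, so the tracked range is hue 50 to 70.
lower_green, upper_green = hsv_bounds(60)
print(lower_green, upper_green)
```

The two lists can be wrapped in np.array() and passed to cv2.inRange() in the same way as lower_blue and upper_blue in the tracking example. Hues close to 0 (red) wrap around the hue circle, so simple clamping loses part of the range there.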

Image Thresholding

Goal

In this tutorial, you will learn Simple thresholding, Adaptive thresholding, Otsu’s thresholding etc.
You will learn these functions : cv2.threshold, cv2.adaptiveThreshold etc.

Simple Thresholding

cv2.THRESH_BINARY
cv2.THRESH_BINARY_INV
cv2.THRESH_TRUNC
cv2.THRESH_TOZERO
cv2.THRESH_TOZERO_INV
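To make the difference between these five flags concrete, here is a per-pixel sketch in plain Python. It follows the documented behaviour of each mode for a single value. The function threshold_pixel and its string mode names are illustrative only and are not an OpenCV API.

```python
def threshold_pixel(value, thresh, maxval, mode):
    """Apply one of the five simple threshold modes to a single pixel value."""
    above = value > thresh
    if mode == "BINARY":
        return maxval if above else 0
    if mode == "BINARY_INV":
        return 0 if above else maxval
    if mode == "TRUNC":
        return thresh if above else value
    if mode == "TOZERO":
        return value if above else 0
    if mode == "TOZERO_INV":
        return 0 if above else value
    raise ValueError("unknown mode: " + mode)


pixels = [50, 127, 128, 200]
for mode in ("BINARY", "BINARY_INV", "TRUNC", "TOZERO", "TOZERO_INV"):
    print(mode, [threshold_pixel(p, 127, 255, mode) for p in pixels])
```

Note that the comparison is strict, so a pixel equal to the threshold (127 here) counts as "not above".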

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('lenna.png',0)
ret,thresh1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)
ret,thresh2 = cv2.threshold(img,127,255,cv2.THRESH_BINARY_INV)
ret,thresh3 = cv2.threshold(img,127,255,cv2.THRESH_TRUNC)
ret,thresh4 = cv2.threshold(img,127,255,cv2.THRESH_TOZERO)
ret,thresh5 = cv2.threshold(img,127,255,cv2.THRESH_TOZERO_INV)

titles = ['Original Image','BINARY','BINARY_INV','TRUNC','TOZERO','TOZERO_INV']
images = [img, thresh1, thresh2, thresh3, thresh4, thresh5]

for i in range(6):
    plt.subplot(2,3,i+1),plt.imshow(images[i],'gray')
    plt.title(titles[i])
    plt.xticks([]),plt.yticks([])

plt.show()

Adaptive Thresholding

Adaptive Method - It decides how thresholding value is calculated.

cv2.ADAPTIVE_THRESH_MEAN_C : threshold value is the mean of neighbourhood area.
cv2.ADAPTIVE_THRESH_GAUSSIAN_C : threshold value is the weighted sum of neighbourhood values where weights are a gaussian window.

Block Size - It decides the size of neighbourhood area.

C - It is just a constant which is subtracted from the mean or weighted mean calculated.
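The three parameters can be seen working together in the following plain-Python sketch of the mean variant with binary output. It is an illustration only, not OpenCV's implementation: the function name is hypothetical, and replicating edge pixels at the image border is an assumption.

```python
def adaptive_mean_threshold(img, block_size, c, maxval=255):
    """Threshold each pixel against (mean of its block_size x block_size neighbourhood) - c."""
    rows, cols = len(img), len(img[0])
    r = block_size // 2
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            total = 0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    # assumption: replicate the nearest edge pixel outside the image
                    yy = min(max(y + dy, 0), rows - 1)
                    xx = min(max(x + dx, 0), cols - 1)
                    total += img[yy][xx]
            local_thresh = total / (block_size * block_size) - c
            out[y][x] = maxval if img[y][x] > local_thresh else 0
    return out


img = [[10, 10, 10],
       [10, 200, 10],
       [10, 10, 10]]
print(adaptive_mean_threshold(img, block_size=3, c=2))
```

Because the threshold is computed per pixel, a bright spot stands out against its own neighbourhood even if the image has uneven lighting overall.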

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('lenna.png',0)
img = cv2.medianBlur(img,5)

ret,th1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)
th2 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_MEAN_C,\
            cv2.THRESH_BINARY,11,2)
th3 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,\
            cv2.THRESH_BINARY,11,2)

titles = ['Original Image', 'Global Thresholding (v = 127)',
            'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']
images = [img, th1, th2, th3]

for i in range(4):
    plt.subplot(2,2,i+1),plt.imshow(images[i],'gray')
    plt.title(titles[i])
    plt.xticks([]),plt.yticks([])
plt.show()

Otsu’s Binarization

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('lenna.png',0)

# global thresholding
ret1,th1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)

# Otsu's thresholding
ret2,th2 = cv2.threshold(img,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

# Otsu's thresholding after Gaussian filtering
blur = cv2.GaussianBlur(img,(5,5),0)
ret3,th3 = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

# plot all the images and their histograms
images = [img, 0, th1,
          img, 0, th2,
          blur, 0, th3]
titles = ['Original Noisy Image','Histogram','Global Thresholding (v=127)',
          'Original Noisy Image','Histogram',"Otsu's Thresholding",
          'Gaussian filtered Image','Histogram',"Otsu's Thresholding"]

for i in range(3):
    plt.subplot(3,3,i*3+1),plt.imshow(images[i*3],'gray')
    plt.title(titles[i*3]), plt.xticks([]), plt.yticks([])
    plt.subplot(3,3,i*3+2),plt.hist(images[i*3].ravel(),256)
    plt.title(titles[i*3+1]), plt.xticks([]), plt.yticks([])
    plt.subplot(3,3,i*3+3),plt.imshow(images[i*3+2],'gray')
    plt.title(titles[i*3+2]), plt.xticks([]), plt.yticks([])
plt.show()

How does Otsu’s Binarization work?

import cv2
import numpy as np

img = cv2.imread('noisy2.png',0)
blur = cv2.GaussianBlur(img,(5,5),0)

# find normalized_histogram, and its cumulative distribution function
hist = cv2.calcHist([blur],[0],None,[256],[0,256])
hist_norm = hist.ravel()/hist.sum()  # divide by the sum so the bins are probabilities
Q = hist_norm.cumsum()

bins = np.arange(256)

fn_min = np.inf
thresh = -1

for i in range(1,256):
    p1,p2 = np.hsplit(hist_norm,[i]) # probabilities
    q1,q2 = Q[i],Q[255]-Q[i] # cum sum of classes
    if q1 < 1.e-6 or q2 < 1.e-6:
        continue  # skip empty classes to avoid division by zero
    b1,b2 = np.hsplit(bins,[i]) # weights

    # finding means and variances
    m1,m2 = np.sum(p1*b1)/q1, np.sum(p2*b2)/q2
    v1,v2 = np.sum(((b1-m1)**2)*p1)/q1,np.sum(((b2-m2)**2)*p2)/q2

    # calculates the minimization function
    fn = v1*q1 + v2*q2
    if fn < fn_min:
        fn_min = fn
        thresh = i

# find otsu's threshold value with OpenCV function
ret, otsu = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
print(thresh, ret)

Geometric Transformations of Images

Goal

Learn to apply different geometric transformations to images, like translation, rotation, affine transformation etc.
You will see these functions: cv2.getPerspectiveTransform

Transformations

OpenCV provides two transformation functions, cv2.warpAffine and cv2.warpPerspective, with which you can have all kinds of transformations. cv2.warpAffine takes a 2x3 transformation matrix while cv2.warpPerspective takes a 3x3 transformation matrix as input.

Scaling

Scaling is just resizing of the image. OpenCV provides the function cv2.resize() for this purpose.

Preferable interpolation methods are cv2.INTER_AREA for shrinking and cv2.INTER_CUBIC (slow) & cv2.INTER_LINEAR for zooming. By default, the interpolation method used is cv2.INTER_LINEAR for all resizing purposes.

import cv2
import numpy as np

img = cv2.imread('messi5.jpg')

res = cv2.resize(img,None,fx=2, fy=2, interpolation = cv2.INTER_CUBIC)

#OR

height, width = img.shape[:2]
res = cv2.resize(img,(2*width, 2*height), interpolation = cv2.INTER_CUBIC)

Translation

import cv2
import numpy as np

img = cv2.imread('messi5.jpg',0)
rows,cols = img.shape

M = np.float32([[1,0,100],[0,1,50]])
dst = cv2.warpAffine(img,M,(cols,rows))

cv2.imshow('img',dst)
cv2.waitKey(0)
cv2.destroyAllWindows()

Third argument of the cv2.warpAffine() function is the size of the output image, which should be in the form of (width, height) . Remember width = number of columns, and height = number of rows.
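To see what the 2x3 matrix M in the translation example does to an individual pixel coordinate, here is a minimal sketch in plain Python. The helper apply_affine is hypothetical; it only evaluates the affine mapping for one point and does not warp an image.

```python
def apply_affine(m, x, y):
    """Map the point (x, y) through a 2x3 affine matrix m given as nested lists."""
    new_x = m[0][0] * x + m[0][1] * y + m[0][2]
    new_y = m[1][0] * x + m[1][1] * y + m[1][2]
    return (new_x, new_y)


# Same matrix as in the translation example: shift by 100 in x and 50 in y.
M = [[1, 0, 100],
     [0, 1, 50]]
print(apply_affine(M, 10, 20))
```

The last column of the matrix holds the shift, and the left 2x2 part (the identity here) holds any rotation, scaling or shear.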

Rotation

import cv2

img = cv2.imread('messi5.jpg',0)
rows,cols = img.shape

M = cv2.getRotationMatrix2D((cols/2,rows/2),90,1) # rotate 90 degrees counter-clockwise
dst = cv2.warpAffine(img,M,(cols,rows))

cv2.imshow('dst',dst)
cv2.waitKey(0)
cv2.destroyAllWindows()

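Affine Transformation

To find the transformation matrix, we need three points from the input image and their corresponding locations in the output image. Then cv2.getAffineTransform will create a 2x3 matrix which is to be passed to cv2.warpAffine.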

import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('lenna.png')
rows,cols,ch = img.shape

pts1 = np.float32([[50,50],[200,50],[50,200]])
pts2 = np.float32([[10,100],[200,50],[100,250]])

M = cv2.getAffineTransform(pts1,pts2)

dst = cv2.warpAffine(img,M,(cols,rows))

plt.subplot(121),plt.imshow(img),plt.title('Input')
plt.subplot(122),plt.imshow(dst),plt.title('Output')
plt.show()

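Perspective Transformation

To find the 3x3 transformation matrix, we need four points on the input image and their corresponding points on the output image. The transformation matrix can then be found with the function cv2.getPerspectiveTransform. Then apply cv2.warpPerspective with this 3x3 transformation matrix.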

import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('lenna.png')
rows,cols,ch = img.shape

pts1 = np.float32([[56,65],[368,52],[28,387],[389,390]])
pts2 = np.float32([[0,0],[300,0],[0,300],[300,300]])

M = cv2.getPerspectiveTransform(pts1,pts2)

dst = cv2.warpPerspective(img,M,(300,300))

plt.subplot(121),plt.imshow(img),plt.title('Input')
plt.subplot(122),plt.imshow(dst),plt.title('Output')
plt.show()
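The difference between a 3x3 perspective matrix and a 2x3 affine matrix is the extra division by the third (homogeneous) coordinate. The following plain-Python sketch shows that step for a single point. The helper name apply_perspective and the example matrices are illustrative assumptions, not values from the tutorial.

```python
def apply_perspective(m, x, y):
    """Map the point (x, y) through a 3x3 perspective matrix m given as nested lists."""
    w = m[2][0] * x + m[2][1] * y + m[2][2]          # homogeneous scale factor
    new_x = (m[0][0] * x + m[0][1] * y + m[0][2]) / w
    new_y = (m[1][0] * x + m[1][1] * y + m[1][2]) / w
    return (new_x, new_y)


# With a bottom row of [0, 0, 1] the mapping is purely affine (w is always 1).
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(apply_perspective(identity, 4, 6))

# A non-zero entry in the bottom row makes the scale depend on position.
tilt = [[1, 0, 0], [0, 1, 0], [0.5, 0, 1]]
print(apply_perspective(tilt, 2, 5))
```

This position-dependent scaling is why a perspective transform keeps straight lines straight but does not keep parallel lines parallel.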

Smoothing Images

Goal

Learn to:

Blur images with various low pass filters
Apply custom-made filters to images (2D convolution)

2D Convolution (Image Filtering)

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('opencv_logo.png')

kernel = np.ones((5,5),np.float32)/25
dst = cv2.filter2D(img,-1,kernel)

plt.subplot(121),plt.imshow(img),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(dst),plt.title('Averaging')
plt.xticks([]), plt.yticks([])
plt.show()

Image Blurring (Image Smoothing)

1. Averaging

This is done by the function cv2.blur() or cv2.boxFilter().

If you don’t want to use a normalized box filter, use cv2.boxFilter() and pass the argument normalize=False to the function.

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('opencv_logo.png')

blur = cv2.blur(img,(5,5))

plt.subplot(121),plt.imshow(img),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(blur),plt.title('Blurred')
plt.xticks([]), plt.yticks([])
plt.show()

2. Gaussian Filtering

If you want, you can create a Gaussian kernel with the function, cv2.getGaussianKernel().

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('opencv_logo.png')

# blur = cv2.blur(img,(5,5))
blur = cv2.GaussianBlur(img,(5,5),0)

plt.subplot(121),plt.imshow(img),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(blur),plt.title('Blurred')
plt.xticks([]), plt.yticks([])
plt.show()

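3. Median Filtering

The function cv2.medianBlur() computes the median of all the pixels under the kernel window and replaces the central pixel with this median value.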

median = cv2.medianBlur(img,5)
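Why a median is effective against isolated outliers can be seen on a single row of pixels. This is a one-dimensional plain-Python sketch, not OpenCV's 2D implementation. The function name is hypothetical, and leaving the border samples unchanged is a simplifying assumption.

```python
from statistics import median


def median_filter_1d(values, ksize):
    """Replace each interior sample by the median of its ksize-wide window."""
    r = ksize // 2
    out = list(values)                      # assumption: border samples are left as they are
    for i in range(r, len(values) - r):
        out[i] = median(values[i - r:i + r + 1])
    return out


row = [10, 12, 255, 11, 10, 0, 12]          # 255 and 0 act as salt-and-pepper noise
print(median_filter_1d(row, 3))
```

Unlike an averaging filter, the output values are always taken from the input samples, so a single extreme pixel is discarded rather than smeared over its neighbours.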

4. Bilateral Filtering

Bilateral filtering is done with the function cv2.bilateralFilter().

blur = cv2.bilateralFilter(img,9,75,75)

Additional Resources

Details about the bilateral filtering can be found at

Morphological Transformations

Goal

In this chapter,

We will learn different morphological operations like Erosion, Dilation, Opening, Closing etc.
We will see different functions like : cv2.erode(), cv2.dilate(), cv2.morphologyEx() etc.

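Theory

Morphological transformations are normally performed on binary images. They need two inputs: one is the original image, the other is the structuring element or kernel. The two basic morphological operators are erosion and dilation.

1. Erosion

Erosion erodes away the boundaries of the foreground object (always try to keep the foreground in white). The kernel slides through the image (as in 2D convolution). A pixel in the original image (either 1 or 0) is considered 1 only if all the pixels under the kernel are 1; otherwise it is eroded (made zero).

So, depending on the size of the kernel, all the pixels near the boundary are discarded. The thickness or size of the foreground object therefore decreases, or simply the white region in the image shrinks. This is useful for removing small white noise (as we have seen in the colorspace chapter), detaching two connected objects, etc.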
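The "all pixels under the kernel must be 1" rule can be sketched in plain Python on a small binary grid. This is an illustration rather than OpenCV's implementation: the function name erode_binary is hypothetical, it uses a square kernel of ones, and it assumes that positions outside the image are simply ignored.

```python
def erode_binary(img, ksize):
    """Keep a pixel only if every pixel under the ksize x ksize window is 1."""
    rows, cols = len(img), len(img[0])
    r = ksize // 2
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # assumption: the window is clipped at the image border
            window = [img[yy][xx]
                      for yy in range(max(y - r, 0), min(y + r + 1, rows))
                      for xx in range(max(x - r, 0), min(x + r + 1, cols))]
            out[y][x] = 1 if all(v == 1 for v in window) else 0
    return out


# A 3x3 white square inside a 5x5 black image shrinks to its centre pixel.
square = [[0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 0, 0, 0, 0]]
for row in erode_binary(square, 3):
    print(row)
```

A white region narrower than the kernel disappears completely, which is why erosion removes small white noise.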

Here, as an example, I will use a 5x5 kernel full of ones. Let’s see how it works:

import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('j.png',0)
kernel = np.ones((5,5),np.uint8)
erosion = cv2.erode(img,kernel,iterations = 1)

plt.subplot(121);plt.imshow(img);plt.title('origin');plt.axis('off')
plt.subplot(122);plt.imshow(erosion);plt.title('erosion');plt.axis('off')
plt.show()

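2. Dilation

Dilation is just the opposite of erosion. Here, a pixel element is '1' if at least one pixel under the kernel is '1'. So it increases the white region in the image, or the size of the foreground object increases. Normally, in cases like noise removal, erosion is followed by dilation: erosion removes white noise, but it also shrinks our object, so we dilate it. Since the noise is gone, it won't come back, but our object area increases. It is also useful for joining broken parts of an object.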

import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('j.png',0)
kernel = np.ones((5,5),np.uint8)  # kernel
erosion = cv2.erode(img,kernel,iterations = 1)  # erode to remove noise
dilation = cv2.dilate(erosion,kernel,iterations = 1) # dilate to restore the original shape

plt.subplot(131);plt.imshow(img);plt.title('origin');plt.axis('off')
plt.subplot(132);plt.imshow(erosion);plt.title('erosion');plt.axis('off')
plt.subplot(133);plt.imshow(dilation);plt.title('dilation');plt.axis('off')
plt.show()

3. Opening

Opening is just another name for erosion followed by dilation. It is useful for removing noise, as explained above. Here we use the function cv2.morphologyEx().
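The following plain-Python sketch shows opening as erosion followed by dilation on a small binary grid with one noise pixel. It is an illustration under assumptions, not OpenCV's implementation: the helper names are hypothetical, the kernel is a square of ones, and positions outside the image are ignored.

```python
def _morph(img, ksize, combine):
    """Slide a ksize x ksize window and combine the 'is 1' tests with all() or any()."""
    rows, cols = len(img), len(img[0])
    r = ksize // 2
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            window = [img[yy][xx]
                      for yy in range(max(y - r, 0), min(y + r + 1, rows))
                      for xx in range(max(x - r, 0), min(x + r + 1, cols))]
            out[y][x] = 1 if combine(v == 1 for v in window) else 0
    return out


def erode(img, ksize):
    return _morph(img, ksize, all)      # 1 only if every pixel under the window is 1


def dilate(img, ksize):
    return _morph(img, ksize, any)      # 1 if at least one pixel under the window is 1


def opening(img, ksize):
    return dilate(erode(img, ksize), ksize)


# 7x7 image: a 3x3 object plus a single isolated noise pixel at (5, 5).
noisy = [[0] * 7 for _ in range(7)]
for y in range(1, 4):
    for x in range(1, 4):
        noisy[y][x] = 1
noisy[5][5] = 1

cleaned = opening(noisy, 3)
for row in cleaned:
    print(row)
```

The noise pixel does not survive the erosion step, so the dilation that follows only restores the 3x3 object.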

import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('j.png',0)
kernel = np.ones((5,5),np.uint8)  # kernel
erosion = cv2.erode(img,kernel,iterations = 1)  # erode to remove noise
dilation = cv2.dilate(erosion,kernel,iterations = 1) # dilate to restore the original shape

opening = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)
dilation2 = cv2.dilate(opening,kernel,iterations = 1) # dilate the opened image once more


plt.subplot(231);plt.imshow(img);plt.title('origin');plt.axis('off')
plt.subplot(232);plt.imshow(erosion);plt.title('erosion');plt.axis('off')
plt.subplot(233);plt.imshow(dilation);plt.title('dilation');plt.axis('off')

plt.subplot(234);plt.imshow(opening);plt.title('opening');plt.axis('off')
plt.subplot(235);plt.imshow(dilation2);plt.title('dilation2');plt.axis('off')
plt.show()

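4. Closing

Closing is the reverse of opening: dilation followed by erosion. It is useful for closing small holes inside the foreground objects, or small black points on the object.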

closing = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)

5. Morphological Gradient

It is the difference between the dilation and the erosion of an image.

gradient = cv2.morphologyEx(img, cv2.MORPH_GRADIENT, kernel)

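6. Top Hat

It is the difference between the input image and the opening of the image.

tophat = cv2.morphologyEx(img, cv2.MORPH_TOPHAT, kernel)

7. Black Hat

It is the difference between the closing of the input image and the input image.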

blackhat = cv2.morphologyEx(img, cv2.MORPH_BLACKHAT, kernel)

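Structuring Element

In the previous examples we created a structuring element manually with the help of NumPy. It is rectangular. But in some cases you may need elliptical or circular kernels. For this purpose, OpenCV has a function, cv2.getStructuringElement(). You just pass the shape and size of the kernel, and you get the desired kernel.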

# Rectangular Kernel
>>> cv2.getStructuringElement(cv2.MORPH_RECT,(5,5))
array([[1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1]], dtype=uint8)

# Elliptical Kernel
>>> cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(5,5))
array([[0, 0, 1, 0, 0],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [0, 0, 1, 0, 0]], dtype=uint8)

# Cross-shaped Kernel
>>> cv2.getStructuringElement(cv2.MORPH_CROSS,(5,5))
array([[0, 0, 1, 0, 0],
       [0, 0, 1, 0, 0],
       [1, 1, 1, 1, 1],
       [0, 0, 1, 0, 0],
       [0, 0, 1, 0, 0]], dtype=uint8)

Additional Resources

Morphological Operations at HIPR2

Image Gradients

Goal

In this chapter, we will learn to:

Find Image gradients, edges etc
We will see following functions : cv2.Sobel(), cv2.Scharr(), cv2.Laplacian() etc

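Theory

OpenCV provides three types of gradient filters or high-pass filters: Sobel, Scharr and Laplacian. We will see each of them.

1. Sobel and Scharr Derivatives

The Sobel operator is a joint Gaussian smoothing plus differentiation operation, so it is more resistant to noise. You can specify the direction of the derivative to be taken, vertical or horizontal (by the arguments yorder and xorder respectively). You can also specify the size of the kernel by the argument ksize. If ksize = -1, a 3x3 Scharr filter is used, which gives better results than a 3x3 Sobel filter. Please see the docs for the kernels used.

2. Laplacian Derivatives

It calculates the Laplacian of the image given by the relation $\Delta src = \frac{\partial ^2{src}}{\partial x^2} + \frac{\partial ^2{src}}{\partial y^2}$, where each derivative is found using Sobel derivatives. If ksize = 1, the following kernel is used for filtering: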

$kernel = \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}$
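To get a feel for this kernel, the plain-Python sketch below applies it to two tiny images. It is illustrative only: the function name is hypothetical, and only interior pixels are computed (border pixels are left at zero), which is a simplification compared with OpenCV's border handling.

```python
LAPLACIAN_KERNEL = [[0, 1, 0],
                    [1, -4, 1],
                    [0, 1, 0]]


def laplacian_interior(img):
    """Apply the 3x3 Laplacian kernel to every interior pixel of a 2D list."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            out[y][x] = sum(LAPLACIAN_KERNEL[dy + 1][dx + 1] * img[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
    return out


flat = [[7] * 5 for _ in range(5)]      # constant image: no intensity change anywhere
spot = [[0] * 5 for _ in range(5)]
spot[2][2] = 1                          # a single bright pixel

print(laplacian_interior(flat))
for row in laplacian_interior(spot):
    print(row)
```

The kernel entries sum to zero, so flat regions give a zero response and only intensity changes produce non-zero output. The response around the bright pixel also contains negative values, which matters when choosing the output data type.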

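Code

The code below shows all the operators in a single diagram. The Sobel kernels are 5x5, while cv2.Laplacian is called with its default kernel size. The output depth is set to cv2.CV_64F here; passing -1 instead would return the result with the same depth as the input, i.e. np.uint8 for this image.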

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('dave.jpg',0)

laplacian = cv2.Laplacian(img,cv2.CV_64F)
sobelx = cv2.Sobel(img,cv2.CV_64F,1,0,ksize=5)
sobely = cv2.Sobel(img,cv2.CV_64F,0,1,ksize=5)

plt.subplot(2,2,1),plt.imshow(img,cmap = 'gray')
plt.title('Original'), plt.xticks([]), plt.yticks([])
plt.subplot(2,2,2),plt.imshow(laplacian,cmap = 'gray')
plt.title('Laplacian'), plt.xticks([]), plt.yticks([])
plt.subplot(2,2,3),plt.imshow(sobelx,cmap = 'gray')
plt.title('Sobel X'), plt.xticks([]), plt.yticks([])
plt.subplot(2,2,4),plt.imshow(sobely,cmap = 'gray')
plt.title('Sobel Y'), plt.xticks([]), plt.yticks([])

plt.show()

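One Important Matter!

If the output datatype is cv2.CV_8U or np.uint8, there is a slight problem. A black-to-white transition is taken as a positive slope (it has a positive value), while a white-to-black transition is taken as a negative slope (it has a negative value). So when the data is converted to np.uint8, all negative slopes are made zero. In simple words, you miss that edge.

If you want to detect both edges, a better option is to keep the output datatype in some higher form, such as cv2.CV_16S or cv2.CV_64F, take its absolute value and then convert back to cv2.CV_8U. The code below demonstrates this procedure for a horizontal Sobel filter and the difference in results.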

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('box.png',0)

# Output dtype = cv2.CV_8U
sobelx8u = cv2.Sobel(img,cv2.CV_8U,1,0,ksize=5)

# Output dtype = cv2.CV_64F. Then take its absolute and convert to cv2.CV_8U
sobelx64f = cv2.Sobel(img,cv2.CV_64F,1,0,ksize=5)
abs_sobel64f = np.absolute(sobelx64f)
sobel_8u = np.uint8(abs_sobel64f)

plt.subplot(1,3,1),plt.imshow(img,cmap = 'gray')
plt.title('Original'), plt.xticks([]), plt.yticks([])
plt.subplot(1,3,2),plt.imshow(sobelx8u,cmap = 'gray')
plt.title('Sobel CV_8U'), plt.xticks([]), plt.yticks([])
plt.subplot(1,3,3),plt.imshow(sobel_8u,cmap = 'gray')
plt.title('Sobel abs(CV_64F)'), plt.xticks([]), plt.yticks([])

plt.show()
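The same effect can be reproduced on a single row of pixels without OpenCV. The sketch below is a simplification: it uses a plain central difference (kernel [-1, 0, 1]) instead of a Sobel kernel, and the helper names are hypothetical.

```python
def horizontal_derivative(row):
    """Central difference row[i+1] - row[i-1] for every interior sample."""
    return [row[i + 1] - row[i - 1] for i in range(1, len(row) - 1)]


def saturate_uint8(values):
    """Clip values to the 0..255 range, as happens when the output is stored as 8-bit unsigned."""
    return [min(max(int(v), 0), 255) for v in values]


row = [0, 0, 255, 255, 0, 0]                  # black -> white -> black

signed = horizontal_derivative(row)           # keeps the sign of each slope
direct_uint8 = saturate_uint8(signed)         # negative slopes are clipped to 0
abs_then_uint8 = saturate_uint8([abs(v) for v in signed])

print(signed)
print(direct_uint8)
print(abs_then_uint8)
```

Only the version that takes the absolute value before converting keeps a response at the white-to-black edge.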

Canny Edge Detection

Goal

In this chapter, we will learn about

Concept of Canny edge detection
OpenCV functions for that : cv2.Canny()

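Theory

Canny Edge Detection is a popular edge detection algorithm. It was developed by John F. Canny in 1986. It is a multi-stage algorithm, and we will go through each stage.

Noise Reduction

Since edge detection is susceptible to noise in the image, the first step is to remove the noise with a 5x5 Gaussian filter. We have already seen this in previous chapters.

Finding Intensity Gradient of the Image

The smoothed image is then filtered with a Sobel kernel in both the horizontal and vertical direction to get the first derivative in the horizontal direction ($G_x$) and the vertical direction ($G_y$). From these two images, we can find the edge gradient and direction for each pixel as follows:

$Edge\_Gradient \; (G) = \sqrt{G_x^2 + G_y^2}$

$Angle \; (\theta) = \tan^{-1} \bigg(\frac{G_y}{G_x}\bigg)$

The gradient direction is always perpendicular to edges. It is rounded to one of four angles representing the vertical, horizontal and two diagonal directions.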
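The two formulas and the rounding step can be sketched for a single pixel in plain Python. This is an illustration under assumptions: the function name is hypothetical, math.atan2 is used instead of a bare arctangent of G_y/G_x so that G_x = 0 does not cause a division by zero, and the four directions are represented as 0, 45, 90 and 135 degrees.

```python
import math


def edge_gradient(gx, gy):
    """Return (magnitude, direction) where direction is rounded to 0, 45, 90 or 135 degrees."""
    magnitude = math.hypot(gx, gy)                       # sqrt(gx^2 + gy^2)
    angle = math.degrees(math.atan2(gy, gx)) % 180.0     # opposite directions are equivalent
    direction = (int((angle + 22.5) // 45) * 45) % 180   # nearest multiple of 45 degrees
    return magnitude, direction


print(edge_gradient(3, 4))     # gradient pointing diagonally
print(edge_gradient(1, 0))     # purely horizontal gradient, i.e. a vertical edge
print(edge_gradient(0, 2))     # purely vertical gradient, i.e. a horizontal edge
```

The L1 alternative used when L2gradient is False would simply be abs(gx) + abs(gy), which for the first example gives 7 instead of 5.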

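Non-maximum Suppression

After getting the gradient magnitude and direction, a full scan of the image is done to remove any unwanted pixels which may not constitute the edge. For this, every pixel is checked to see whether it is a local maximum in its neighborhood in the direction of the gradient. Check the image below:

[Figure: Non-Maximum Suppression]

Point A is on the edge (in the vertical direction). The gradient direction is normal to the edge. Points B and C are in the gradient direction. So point A is checked against points B and C to see if it forms a local maximum. If so, it is considered for the next stage; otherwise it is suppressed (set to zero).

In short, the result you get is a binary image with "thin edges".

Hysteresis Thresholding

This stage decides which edges are really edges and which are not. For this, we need two threshold values, minVal and maxVal. Any edges with an intensity gradient greater than maxVal are sure to be edges, and those below minVal are sure to be non-edges, so they are discarded. Those that lie between these two thresholds are classified as edges or non-edges based on their connectivity. If they are connected to "sure-edge" pixels, they are considered part of the edges. Otherwise, they are also discarded. See the image below:

[Figure: Hysteresis Thresholding]

Edge A is above maxVal, so it is considered a "sure edge". Although edge C is below maxVal, it is connected to edge A, so it is also considered a valid edge and we get that full curve. But edge B, although it is above minVal and in the same region as edge C, is not connected to any "sure edge", so it is discarded. So it is very important to select minVal and maxVal accordingly to get the correct result.

This stage also removes small pixel noise, on the assumption that edges are long lines.

So what we finally get is the strong edges in the image.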
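The connectivity rule of hysteresis thresholding can be sketched in plain Python as a flood fill that starts from the sure-edge pixels. This is only an illustration of the idea, not the code inside cv2.Canny(): the function name is hypothetical and 8-connectivity between neighbouring pixels is an assumption.

```python
from collections import deque


def hysteresis(mag, min_val, max_val):
    """Return a boolean grid: True for sure edges and for weak pixels connected to them."""
    rows, cols = len(mag), len(mag[0])
    edges = [[False] * cols for _ in range(rows)]
    queue = deque()

    # Pixels above max_val are sure edges and act as seeds.
    for y in range(rows):
        for x in range(cols):
            if mag[y][x] > max_val:
                edges[y][x] = True
                queue.append((y, x))

    # Grow from the seeds into neighbouring pixels that are at least min_val.
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < rows and 0 <= nx < cols
                        and not edges[ny][nx] and mag[ny][nx] >= min_val):
                    edges[ny][nx] = True
                    queue.append((ny, nx))
    return edges


# 250 plays the role of edge A, the adjacent 150 of edge C, the isolated 150 of edge B.
magnitudes = [[250, 150, 0, 150, 0],
              [0,   0,   0, 0,   0]]
print(hysteresis(magnitudes, min_val=100, max_val=200))
```

The isolated weak pixel has the same gradient value as the connected one, yet only the connected one is kept.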

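Canny Edge Detection in OpenCV

OpenCV puts all of the above in a single function, cv2.Canny(). We will see how to use it. The first argument is our input image. The second and third arguments are our minVal and maxVal respectively. The fourth argument is aperture_size, the size of the Sobel kernel used to find image gradients. By default it is 3. The last argument is L2gradient, which specifies the equation for finding the gradient magnitude. If it is True, it uses the equation mentioned above, which is more accurate; otherwise it uses this function: $Edge\_Gradient \; (G) = |G_x| + |G_y|$. By default, it is False.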

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('messi5.jpg',0)
edges = cv2.Canny(img,100,200)

plt.subplot(121),plt.imshow(img,cmap = 'gray')
plt.title('Original Image'), plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(edges,cmap = 'gray')
plt.title('Edge Image'), plt.xticks([]), plt.yticks([])

plt.show()


Additional Resources

Canny edge detector at Wikipedia
Canny Edge Detection Tutorial by Bill Green, 2002.

Image Pyramids

Goal

In this chapter,

We will learn about Image Pyramids
We will use Image pyramids to create a new fruit, “Orapple”
We will see these functions: cv2.pyrUp(), cv2.pyrDown()

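Theory

Normally, we work with an image of constant size. But on some occasions we need to work with images of different resolutions of the same image. For example, while searching for something in an image, we are not sure at what size the object will be present in the image. In that case, we need to create a set of images with different resolutions and search for the object in all of them. These sets of images with different resolutions are called Image Pyramids (because when they are kept in a stack with the biggest image at the bottom and the smallest image at the top, it looks like a pyramid).

There are two kinds of image pyramids: 1) Gaussian Pyramid and 2) Laplacian Pyramid.

A higher level (lower resolution) in a Gaussian pyramid is formed by removing consecutive rows and columns in the lower level (higher resolution) image. Then each pixel in the higher level is formed by the contribution from 5 pixels in the underlying level with Gaussian weights. By doing so, an $M \times N$ image becomes an $M/2 \times N/2$ image. So the area reduces to one-fourth of the original area. This is called an octave. The same pattern continues as we go higher in the pyramid (i.e., the resolution decreases). Similarly, while expanding, the area becomes 4 times at each level. We can find Gaussian pyramids using the cv2.pyrDown() and cv2.pyrUp() functions.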
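The halving of the image size at each level can be tabulated with a few lines of plain Python. This sketch only computes sizes and does not filter any pixels. The function name is hypothetical, and rounding odd sizes up with (n + 1) // 2 is an assumption based on the default output size documented for cv2.pyrDown().

```python
def pyramid_sizes(rows, cols, levels):
    """Return the (rows, cols) of the original image and of each higher pyramid level."""
    sizes = [(rows, cols)]
    for _ in range(levels):
        # assumption: odd dimensions are rounded up when halved
        rows, cols = (rows + 1) // 2, (cols + 1) // 2
        sizes.append((rows, cols))
    return sizes


print(pyramid_sizes(512, 512, 3))
print(pyramid_sizes(342, 548, 2))
```

When a dimension is odd at some level, expanding that level again with cv2.pyrUp() doubles it to a size that differs by one pixel from the level below, so images whose sides are divisible by a power of two are the easiest to work with.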

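higher_reso = cv2.imread('messi5.jpg')
lower_reso = cv2.pyrDown(higher_reso)

Now you can go down the image pyramid with the cv2.pyrUp() function.

higher_reso2 = cv2.pyrUp(lower_reso)

Remember, higher_reso2 is not equal to higher_reso, because once you decrease the resolution, you lose the information. In the previous case, the image is 3 levels down the pyramid created from the smallest image. Compare it with the original image:

Laplacian pyramids are formed from the Gaussian pyramids. There is no exclusive function for that. Laplacian pyramid images are like edge images only. Most of their elements are zeros. They are used in image compression. A level in the Laplacian pyramid is formed by the difference between that level in the Gaussian pyramid and the expanded version of its upper level in the Gaussian pyramid. The three levels of a Laplacian pyramid will look like below (contrast is adjusted to enhance the contents):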

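Image Blending using Pyramids

One application of pyramids is image blending. For example, in image stitching, you will need to stack two images together, but it may not look good due to discontinuities between the images. In that case, image blending with pyramids gives you seamless blending without leaving much data in the images. One classical example of this is the blending of two fruits, orange and apple. See the result now to understand what I am saying:

[Figure: Pyramid Blending]

Please check the additional resources for full details on image blending, Laplacian pyramids, etc. Simply, it is done as follows:

Load the two images of apple and orange

Find the Gaussian pyramids for apple and orange (in this particular example, the number of levels is 6)

From the Gaussian pyramids, find their Laplacian pyramids

Now join the left half of the apple and the right half of the orange in each level of the Laplacian pyramids

Finally, from this joint image pyramid, reconstruct the original image.

Below is the full code. (For the sake of simplicity, each step is done separately, which may take more memory. You can optimize it if you want.)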

import cv2
import numpy as np,sys

A = cv2.imread('apple.jpg')
B = cv2.imread('orange.jpg')

# generate Gaussian pyramid for A
G = A.copy()
gpA = [G]
for i in range(6):
    G = cv2.pyrDown(G)
    gpA.append(G)

# generate Gaussian pyramid for B
G = B.copy()
gpB = [G]
for i in range(6):
    G = cv2.pyrDown(G)
    gpB.append(G)

# generate Laplacian Pyramid for A
lpA = [gpA[5]]
for i in range(5,0,-1):
    GE = cv2.pyrUp(gpA[i])
    L = cv2.subtract(gpA[i-1],GE)
    lpA.append(L)

# generate Laplacian Pyramid for B
lpB = [gpB[5]]
for i in range(5,0,-1):
    GE = cv2.pyrUp(gpB[i])
    L = cv2.subtract(gpB[i-1],GE)
    lpB.append(L)

# Now add left and right halves of images in each level
LS = []
for la,lb in zip(lpA,lpB):
    rows,cols,dpt = la.shape
    ls = np.hstack((la[:,0:cols//2], lb[:,cols//2:]))
    LS.append(ls)

# now reconstruct
ls_ = LS[0]
for i in range(1,6):
    ls_ = cv2.pyrUp(ls_)
    ls_ = cv2.add(ls_, LS[i])

# image with direct connecting each half
real = np.hstack((A[:,:cols//2],B[:,cols//2:]))

cv2.imwrite('Pyramid_blending2.jpg',ls_)
cv2.imwrite('Direct_blending.jpg',real)

cv2.imshow('Pyramid_blending2',ls_)
cv2.imshow('Direct_blending',real)
cv2.waitKey(0)
cv2.destroyAllWindows()