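Preface
The "phone scanner" feature found on most phones today scans ID cards, bank cards, documents, and similar items. Its biggest difference from an ordinary camera is that it automatically corrects geometric distortion, and it can also enhance the text so that it displays more clearly. Take scanning a transit card as an example: because of the photographer's technique and other practical factors, the captured image usually contains some geometric distortion along with background clutter.
In the figure below, a transit card is photographed against a sheet of striped paper. Geometric distortion and background interference of this size are undesirable because they affect later processing steps such as OCR.
Problems:
- Geometric distortion
- Background interference
Approach:
Distortion caused by the shooting angle can generally be corrected with a perspective transform.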
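How the perspective transform works
Definition
A perspective transform projects an image onto a new viewing plane; it is also called projective mapping.
Conversion formula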
$$[x', y', w'] = [u, v, w] \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$
Here $[u, v, w]$ are the coordinates of a point in the original image, and the corresponding coordinates $(x, y)$ in the transformed image are obtained as $x = x'/w'$, $y = y'/w'$.
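As a quick numeric check of this formula, the sketch below applies a 3×3 matrix to one point in the row-vector convention used here and then divides by $w'$. The function name `project_point` and the sample matrix are made up for illustration only.

```python
import numpy as np

def project_point(u, v, A, w=1.0):
    """Map (u, v) with [x', y', w'] = [u, v, w] @ A, then divide by w'."""
    row = np.array([u, v, w], dtype=np.float64)
    x_p, y_p, w_p = row @ np.asarray(A, dtype=np.float64)
    return x_p / w_p, y_p / w_p

# Illustrative matrix: identity plus a small perspective term a13.
A = np.array([[1.0, 0.0, 0.001],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

x, y = project_point(100.0, 50.0, A)
print(x, y)
```

With the perspective term set to zero the point maps to itself; the nonzero $a_{13}$ makes $w'$ depend on $u$, which is what shrinks or stretches the image with position.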
The transformation matrix $\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$ can be split into four parts. $\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ is the linear part, covering operations such as scaling, shearing, and rotation. $[a_{31}\ a_{32}]$ handles translation. $[a_{13}\ a_{23}]^T$ produces the perspective effect. The remaining element $a_{33}$ acts as an overall scale factor. Written out, the transform becomes:
$$x = \frac{x'}{w'} = \frac{a_{11}u + a_{21}v + a_{31}}{a_{13}u + a_{23}v + a_{33}}$$
$$y = \frac{y'}{w'} = \frac{a_{12}u + a_{22}v + a_{32}}{a_{13}u + a_{23}v + a_{33}}$$
These expressions take $w = 1$, the usual choice for a point in a 2D image. So, given a few pairs of corresponding points, the transform can be solved for: each pair yields two equations, and the matrix has eight free parameters once its overall scale is fixed, so four pairs are enough. Conversely, a given transform produces the new, transformed image.
In this way a distorted quadrilateral can be corrected into a rectangle.
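To make "solving the transform from corresponding points" concrete, here is a minimal NumPy sketch. It keeps the row-vector convention of the formulas above and assumes $a_{33} = 1$, which turns the two fraction equations into a linear system with eight unknowns. The function name `solve_perspective` and the sample coordinates are invented for illustration.

```python
import numpy as np

def solve_perspective(src, dst):
    """Solve for A (row-vector convention, a33 fixed to 1) from four point pairs."""
    rows, rhs = [], []
    for (u, v), (x, y) in zip(src, dst):
        # Unknown order: a11, a21, a31, a12, a22, a32, a13, a23
        rows.append([u, v, 1, 0, 0, 0, -u * x, -v * x])  # from the x equation
        rows.append([0, 0, 0, u, v, 1, -u * y, -v * y])  # from the y equation
        rhs.extend([x, y])
    a11, a21, a31, a12, a22, a32, a13, a23 = np.linalg.solve(
        np.array(rows, dtype=np.float64), np.array(rhs, dtype=np.float64))
    return np.array([[a11, a12, a13],
                     [a21, a22, a23],
                     [a31, a32, 1.0]])

src = [(30, 40), (20, 210), (330, 230), (310, 25)]  # made-up corners of a skewed quadrilateral
dst = [(0, 0), (0, 180), (320, 180), (320, 0)]      # corners of a 320x180 rectangle
A = solve_perspective(src, dst)

# Check: every source corner should land on its target corner.
for (u, v), target in zip(src, dst):
    xp, yp, wp = np.array([u, v, 1.0]) @ A
    print((round(xp / wp, 6), round(yp / wp, 6)), target)
```

The system is solvable as long as no three of the four points are collinear. A library routine is the better choice in practice; this only shows where the eight equations come from.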
References:
https://www.cnblogs.com/jsxyhelu/p/4219564.html
https://blog.csdn.net/wong_judy/article/details/6283019
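At this point the task is clear: find four pairs of corresponding points. How? Each of the four numbered corner points in the figure above is the intersection of two adjacent edges. The corners can therefore be located by:
- Hough line detection followed by computing the intersections of the detected lines, which gives the four corners of the photographed card,
- finding the four vertices of the quadrilateral contour, or
- entering the four corners manually.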
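For the first option, each detected line is typically described by a pair (rho, theta), meaning x·cos(theta) + y·sin(theta) = rho, which is the form returned by `cv2.HoughLines`. A corner is then the solution of a 2×2 linear system. The helper below is a sketch with a hypothetical name, `intersect_hough_lines`, and hand-picked sample lines.

```python
import numpy as np

def intersect_hough_lines(line1, line2, eps=1e-9):
    """Intersect two lines given as (rho, theta): x*cos(theta) + y*sin(theta) = rho."""
    (rho1, theta1), (rho2, theta2) = line1, line2
    A = np.array([[np.cos(theta1), np.sin(theta1)],
                  [np.cos(theta2), np.sin(theta2)]])
    if abs(np.linalg.det(A)) < eps:
        return None  # (nearly) parallel lines have no usable intersection
    x, y = np.linalg.solve(A, np.array([rho1, rho2]))
    return float(x), float(y)

# Vertical line x = 20 (theta = 0) and horizontal line y = 30 (theta = pi/2).
corner = intersect_hough_lines((20.0, 0.0), (30.0, np.pi / 2))
print(corner)
```

In a real image one would still have to decide which line pairs are adjacent edges of the card, for example by keeping only intersections that fall inside the image.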
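How are the corresponding points obtained?
When scanning a document, the user normally picks a document type, or equivalently a set of parameters, in advance. In effect this sets the coordinates of the corresponding points, that is, the size of the quadrilateral after the transform.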
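One way to picture this: a preset is just a target width and height, and the four target corners follow from it. The preset table and the helper `destination_corners` below are hypothetical; only the 320×180 size comes from the code later in this post.

```python
import numpy as np

def destination_corners(width, height):
    """Corners of a width x height rectangle, ordered TL, BL, BR, TR."""
    return np.float32([[0, 0], [0, height], [width, height], [width, 0]])

# Hypothetical preset table; 320x180 is the 16:9 size used for the transit card.
PRESETS = {"card_16_9": (320, 180)}

pts2 = destination_corners(*PRESETS["card_16_9"])
print(pts2)
```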
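Code implementation
Approach:
- Read the image and preprocess it
- Find the four corners of the photographed card
- Set the four corresponding corners from the preset size and compute the perspective transform matrix
- Apply the perspective transform to the distorted image
Reading and preprocessing the image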
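from imutils.perspective import four_point_transform
import imutils
import cv2
import numpy as np
from matplotlib import pyplot as plt
import math

'''
Image preprocessing: convert to grayscale, blur lightly (to suppress high-frequency noise), then detect edges
parameter:
    input_dir---path to the image
Return:
    image--original image
    gray---grayscale image
    edged--edge map
'''
def Get_outline(input_dir):
    image=cv2.imread(input_dir)
    # cv2.imread returns None instead of raising when the file cannot be read
    if image is None:
        raise FileNotFoundError(input_dir)
    # convert the BGR image to a single-channel grayscale image
    gray=cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
    blurred=cv2.GaussianBlur(gray,(5,5),0)
    edged=cv2.Canny(blurred,75,200)
    return image,gray,edged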
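Note:
- cv2.imread(): reads an image. A color image comes back in BGR channel order (not RGB), with values in the range 0~255.
cv2.imread(filepath,flags) # read an image
filepath: full path of the image to read
flags: flag controlling how the image is read
cv2.IMREAD_COLOR: the default; reads a color image and ignores the alpha channel
cv2.IMREAD_GRAYSCALE: reads the image as grayscale
cv2.IMREAD_UNCHANGED: as the name suggests, reads the image unchanged, including the alpha channel
- cv2.cvtColor(p1,p2): color space conversion. p1 is the image to convert and p2 is the conversion code. cv2.COLOR_BGR2RGB converts BGR to RGB;
cv2.COLOR_BGR2GRAY converts BGR to grayscale. A grayscale image here is not a "black and white" picture in the everyday sense; it is enough to check that the array is single-channel with unsigned 8-bit integer (uint8) data.
cv2.cvtColor() # image color space conversion
img2 = cv2.cvtColor(img,cv2.COLOR_RGB2GRAY) # to grayscale: color -> gray (assumes img is in RGB order; use cv2.COLOR_BGR2GRAY for images from cv2.imread)
img3 = cv2.cvtColor(img,cv2.COLOR_GRAY2RGB) # to color: gray -> color
cv2.COLOR_X2Y, where X,Y = RGB, BGR, GRAY, HSV, YCrCb, XYZ, Lab, Luv, HLS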
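Why the channel order matters can be seen from the weighted sum that OpenCV documents for the conversion to gray, Y = 0.299·R + 0.587·G + 0.114·B. The NumPy sketch below applies that formula to an array in BGR order. `bgr_to_gray` is a hypothetical helper, and its rounding may differ by one gray level from what `cv2.cvtColor` produces.

```python
import numpy as np

def bgr_to_gray(bgr):
    """Weighted sum Y = 0.299 R + 0.587 G + 0.114 B on an (H, W, 3) BGR uint8 array."""
    b = bgr[..., 0].astype(np.float64)
    g = bgr[..., 1].astype(np.float64)
    r = bgr[..., 2].astype(np.float64)
    gray = 0.299 * r + 0.587 * g + 0.114 * b
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

pure_blue = np.array([[[255, 0, 0]]], dtype=np.uint8)  # BGR order
pure_red = np.array([[[0, 0, 255]]], dtype=np.uint8)   # BGR order
print(bgr_to_gray(pure_blue), bgr_to_gray(pure_red))
```

Blue and red get very different gray levels, so treating a BGR image as RGB changes the result.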
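- GaussianBlur(): Gaussian filtering. It smooths the pixels within each neighborhood, giving pixels at different positions different weights, which helps preserve the overall gray-level distribution of the image. When implemented as a discrete sliding-window convolution it uses a Gaussian kernel whose size is odd, so that the result is written at the center of the region the kernel covers. A commonly used 3×3 template, for example, is $\frac{1}{16}\begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}$.
The Gaussian template is computed from the Gaussian function:
$$G(x, y) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2 + y^2}{2\sigma^2}}$$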
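As an illustration of how a template can be derived from the Gaussian function, the sketch below samples the function on an odd-sized grid centered at zero and normalizes the result so the weights sum to 1 (the constant factor in front of the exponential cancels in that normalization). `gaussian_kernel` is a hypothetical helper, not an OpenCV function.

```python
import numpy as np

def gaussian_kernel(ksize, sigma):
    """ksize x ksize kernel sampled from the 2D Gaussian and normalized to sum to 1."""
    if ksize <= 0 or ksize % 2 == 0:
        raise ValueError("ksize must be a positive odd number")
    half = ksize // 2
    ax = np.arange(-half, half + 1, dtype=np.float64)
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

k = gaussian_kernel(5, 1.0)
print(k.round(4))
```

The center weight is the largest and the weights fall off symmetrically, which is the "different weights for different positions" behavior described above.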
def GaussianBlur(src, ksize, sigmaX, dst=None, sigmaY=None, borderType=None): # real signature unknown; restored from __doc__
    """
    . @brief Blurs an image using a Gaussian filter.
    .
    . The function convolves the source image with the specified Gaussian kernel. In-place filtering is
    . supported.
    .
    . @param src input image; the image can have any number of channels, which are processed
    . independently, but the depth should be CV_8U, CV_16U, CV_16S, CV_32F or CV_64F.
    . @param dst output image of the same size and type as src.
    . @param ksize Gaussian kernel size. ksize.width and ksize.height can differ but they both must be
    . positive and odd. Or, they can be zero's and then they are computed from sigma.
    . @param sigmaX Gaussian kernel standard deviation in X direction.
    . @param sigmaY Gaussian kernel standard deviation in Y direction; if sigmaY is zero, it is set to be
    . equal to sigmaX, if both sigmas are zeros, they are computed from ksize.width and ksize.height,
    . respectively (see #getGaussianKernel for details); to fully control the result regardless of
    . possible future modifications of all this semantics, it is recommended to specify all of ksize,
    . sigmaX, and sigmaY.
    . @param borderType pixel extrapolation method, see #BorderTypes. #BORDER_WRAP is not supported.
    .
    """
- Canny(): the Canny edge detector, a multi-stage algorithm that looks for the positions where the gray-level intensity of the image changes most sharply. Edge detection has to satisfy two requirements: 1) suppress noise effectively; 2) locate the edges as precisely as possible. That is why the Gaussian filter is applied first.
def Canny(image, threshold1, threshold2, edges=None, apertureSize=None, L2gradient=None): # real signature unknown; restored from __doc__
    """
    . @brief Finds edges in an image using the Canny algorithm @cite Canny86 .
    .
    . The function finds edges in the input image and marks them in the output map edges using the
    . Canny algorithm. The smallest value between threshold1 and threshold2 is used for edge linking. The
    . largest value is used to find initial segments of strong edges.
    .
    . @param image 8-bit input image.
    . @param edges output edge map; single channels 8-bit image, which has the same size as image .
    . @param threshold1 first threshold for the hysteresis procedure.
    . @param threshold2 second threshold for the hysteresis procedure.
    . @param apertureSize aperture size for the Sobel operator.
    """
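The two thresholds drive the hysteresis step: values above the larger threshold seed strong edges, and values between the two thresholds are kept only if they connect to a strong edge. The sketch below shows that idea on a small gradient-magnitude array. It is a simplified illustration with a hypothetical name, `hysteresis_threshold`, and is not how OpenCV implements Canny internally.

```python
import numpy as np

def hysteresis_threshold(magnitude, low, high):
    """Keep strong pixels (>= high) plus weak pixels (>= low) 8-connected to them."""
    magnitude = np.asarray(magnitude)
    strong = magnitude >= high
    weak = (magnitude >= low) & ~strong
    edges = strong.copy()
    h, w = edges.shape
    while True:
        padded = np.pad(edges, 1)            # border padded with False
        near_edge = np.zeros_like(edges)
        for dy in range(3):                  # OR together the 3x3 neighborhood
            for dx in range(3):
                near_edge |= padded[dy:dy + h, dx:dx + w]
        grown = edges | (weak & near_edge)   # weak pixels touching an edge join it
        if np.array_equal(grown, edges):
            return grown
        edges = grown

mag = np.array([[0,   0,   0,   0, 0],
                [250, 100, 100, 0, 100],
                [0,   0,   0,   0, 0]])
result = hysteresis_threshold(mag, low=75, high=200)
print(result.astype(int))
```

In the sample, the two weak pixels next to the strong one are kept, while the isolated weak pixel on the right is dropped.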
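Finding the four corners of the photographed card
'''
Get the contour of the transit card
parameter:
    edged--edge map
return:
    docCnt--four-vertex contour matching the transit card, or None if none is found
'''
def Get_cnt(edged):
    # Extract contours from the edge map, then initialize the contour for the card
    cnts=cv2.findContours(edged.copy(),cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
    # Version differences: OpenCV 3 returns (image, contours, hierarchy), while OpenCV 2 and 4
    # return (contours, hierarchy), so cnts=cnts[0] if imutils.is_cv2() else cnts[1] cannot be used
    cnts=cnts[1] if imutils.is_cv3() else cnts[0]
    docCnt=None
    if len(cnts)>0:
        cnts=sorted(cnts,key=cv2.contourArea,reverse=True) # sort contours by area, largest first
        for c in cnts:
            # approximate the contour
            peri=cv2.arcLength(c,True)
            approx=cv2.approxPolyDP(c,0.02*peri,True)
            # if the approximated contour has four vertices, assume it is the card
            if len(approx)==4:
                docCnt=approx
                break
    return docCnt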
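Get_cnt relies on `cv2.approxPolyDP` with a tolerance of 2% of the perimeter to reduce each contour to a handful of vertices. OpenCV documents that function as using the Douglas-Peucker algorithm. As a rough picture of the idea only, here is a pure-Python sketch for an open polyline; the names are hypothetical and this is not OpenCV's implementation, which also handles closed curves.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, epsilon):
    """Simplify an open polyline, dropping points closer than epsilon to the chord."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    dists = [point_segment_distance(p, a, b) for p in points[1:-1]]
    idx = max(range(len(dists)), key=dists.__getitem__) + 1  # farthest interior point
    if dists[idx - 1] <= epsilon:
        return [a, b]
    left = douglas_peucker(points[:idx + 1], epsilon)
    right = douglas_peucker(points[idx:], epsilon)
    return left[:-1] + right  # the split point appears in both halves

polyline = [(0, 0), (1, 0.05), (2, 0), (3, 2), (4, 0)]
simplified = douglas_peucker(polyline, 0.1)
print(simplified)
```

A larger epsilon removes more vertices, which is why the 2% tolerance above tends to collapse a card outline to its four corners.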
Note
- findContours(): finds the contours of objects in an image
def findContours(image, mode, method, contours=None, hierarchy=None, offset=None): # real signature unknown; restored from __doc__
    """
    findContours(image, mode, method[, contours[, hierarchy[, offset]]]) -> contours, hierarchy
    . @brief Finds contours in a binary image.
    .
    . The function retrieves contours from the binary image using the algorithm @cite Suzuki85 . The contours
    . are a useful tool for shape analysis and object detection and recognition. See squares.cpp in the
    . OpenCV sample directory.
    .
    . @param image Source, an 8-bit single-channel image. Non-zero pixels are treated as 1's. Zero
    . pixels remain 0's, so the image is treated as binary . You can use #compare, #inRange, #threshold ,
    . #adaptiveThreshold, #Canny, and others to create a binary image out of a grayscale or color one.
    . If mode equals to #RETR_CCOMP or #RETR_FLOODFILL, the input can also be a 32-bit integer image of labels (CV_32SC1).
    . @param contours Detected contours. Each contour is stored as a vector of points (e.g.
    . std::vector<std::vector<cv::Point> >).
    . @param hierarchy Optional output vector (e.g. std::vector<cv::Vec4i>), containing information about the image topology. It has
    . as many elements as the number of contours. For each i-th contour contours[i], the elements
    . hierarchy[i][0] , hierarchy[i][1] , hierarchy[i][2] , and hierarchy[i][3] are set to 0-based indices
    . in contours of the next and previous contours at the same hierarchical level, the first child
    . contour and the parent contour, respectively. If for the contour i there are no next, previous,
    . parent, or nested contours, the corresponding elements of hierarchy[i] will be negative.
    . @param mode Contour retrieval mode, see #RetrievalModes
    . @param method Contour approximation method, see #ContourApproximationModes
    . @param offset Optional offset by which every contour point is shifted. This is useful if the
    . contours are extracted from the image ROI and then they should be analyzed in the whole image
    . context.
    """
    pass
- arcLength(curve, closed): computes the perimeter of a closed contour or the length of a curve.
def arcLength(curve, closed): # real signature unknown; restored from __doc__
    """
    arcLength(curve, closed) -> retval
    . @brief Calculates a contour perimeter or a curve length.
    . @param curve Input vector of 2D points, stored in std::vector or Mat.
    . @param closed Flag indicating whether the curve is closed or not.
    """
- approxPolyDP(): polygon approximation
def approxPolyDP(curve, epsilon, closed, approxCurve=None): # real signature unknown; restored from __doc__
    """
    approxPolyDP(curve, epsilon, closed[, approxCurve]) -> approxCurve
    . @brief Approximates a polygonal curve(s) with the specified precision.
    .
    . The function cv::approxPolyDP approximates a curve or a polygon with another curve/polygon with less
    . vertices so that the distance between them is less or equal to the specified precision. It uses the
    . Douglas-Peucker algorithm
    . @param curve Input vector of a 2D point stored in std::vector or Mat
    . @param approxCurve Result of the approximation. The type should match the type of the input curve.
    . @param epsilon Parameter specifying the approximation accuracy. This is the maximum distance
    . between the original curve and its approximation.
    . @param closed If true, the approximated curve is closed (its first and last vertices are
    . connected). Otherwise, it is not closed.
    """
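Setting the four corresponding corners from the preset size and computing the perspective transform matrix
The transit card has a 16:9 aspect ratio, so the output size is assumed here to be (320, 180). Note: because the card is not square, we first have to decide which detected side is the long one. This relies on the assumption that the distortion from shooting is not severe enough to change which side is longer. The vertices are taken to be ordered top-left, bottom-left, bottom-right, top-right; the code computes and compares the distance from top-left to bottom-left with the distance from top-left to top-right to decide which dimension is the width and which is the height.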
# distance between two points
def calculate_distance(point1,point2):
    d_x=point1[0]-point2[0]
    d_y=point1[1]-point2[1]
    distance=math.sqrt(d_x**2+d_y**2)
    return distance

if __name__=='__main__':
    input_dir="gongjiaoka.png"
    image,gray,edged=Get_outline(input_dir)
    docCnt=Get_cnt(edged)
    # Get_cnt returns None when no four-vertex contour is found
    if docCnt is None:
        raise SystemExit("no four-vertex contour found")
    print(docCnt.reshape(4,2))
    # source corners; the card has a 16:9 aspect ratio
    pts1=np.float32(docCnt.reshape(4,2))
    # pick the target size depending on which detected side is longer
    p=docCnt.reshape(4,2)
    if calculate_distance(p[0],p[1])<calculate_distance(p[0],p[3]):
        # left edge shorter than top edge: landscape output, 320x180
        pts2=np.float32([[0,0],[0,180],[320,180],[320,0]])
        M=cv2.getPerspectiveTransform(pts1,pts2)
        dst=cv2.warpPerspective(image,M,(320,180))
    else:
        # otherwise: portrait output, 180x320
        pts2=np.float32([[0,0],[0,320],[180,320],[180,0]])
        M=cv2.getPerspectiveTransform(pts1,pts2)
        dst=cv2.warpPerspective(image,M,(180,320))
    cv2.imwrite('0.png',dst)
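The main block above assumes the four vertices arrive in the order top-left, bottom-left, bottom-right, top-right. Contour approximation does not promise a particular starting vertex, so one possible safeguard is to sort the corners before building `pts1`. The sketch below uses a common sum/difference heuristic. `order_corners` is a hypothetical helper that is not part of the original code, and the heuristic can fail when the card is rotated by roughly 45 degrees.

```python
import numpy as np

def order_corners(pts):
    """Return four (x, y) points ordered TL, BL, BR, TR (image y axis points down)."""
    pts = np.asarray(pts, dtype=np.float32).reshape(4, 2)
    s = pts.sum(axis=1)          # x + y: smallest at top-left, largest at bottom-right
    d = pts[:, 1] - pts[:, 0]    # y - x: largest at bottom-left, smallest at top-right
    tl = pts[np.argmin(s)]
    br = pts[np.argmax(s)]
    bl = pts[np.argmax(d)]
    tr = pts[np.argmin(d)]
    return np.array([tl, bl, br, tr], dtype=np.float32)

# Made-up corners in arbitrary order.
ordered = order_corners([[300, 40], [20, 30], [310, 200], [10, 190]])
print(ordered)
```

The `imutils.perspective` module imported at the top of the script also provides corner ordering as part of `four_point_transform`, which may be a simpler route if its output size convention suits the task.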
Note
- getPerspectiveTransform(): computes the perspective transform. Given four pairs of corresponding points in the original and the transformed image, it returns the transform matrix. Note that the docstring below uses column vectors, so its matrix is the transpose of the row-vector form used in the formulas earlier.
def getPerspectiveTransform(src, dst, solveMethod=None): # real signature unknown; restored from __doc__
    """
    getPerspectiveTransform(src, dst[, solveMethod]) -> retval
    . @brief Calculates a perspective transform from four pairs of the corresponding points.
    .
    . The function calculates the \f$3 \times 3\f$ matrix of a perspective transform so that:
    .
    . \f[\begin{bmatrix} t_i x'_i \\ t_i y'_i \\ t_i \end{bmatrix} = \texttt{map_matrix} \cdot \begin{bmatrix} x_i \\ y_i \\ 1 \end{bmatrix}\f]
    .
    . where
    .
    . \f[dst(i)=(x'_i,y'_i), src(i)=(x_i, y_i), i=0,1,2,3\f]
    .
    . @param src Coordinates of quadrangle vertices in the source image.
    . @param dst Coordinates of the corresponding quadrangle vertices in the destination image.
    . @param solveMethod method passed to cv::solve (#DecompTypes)
    .
    """
- warpPerspective(): applies a perspective transform to an image
def warpPerspective(src, M, dsize, dst=None, flags=None, borderMode=None, borderValue=None): # real signature unknown; restored from __doc__
    """
    warpPerspective(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]]) -> dst
    . @brief Applies a perspective transformation to an image.
    .
    . The function warpPerspective transforms the source image using the specified matrix:
    """
    pass
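To show what applying the transform to a whole image involves, here is a NumPy sketch that fills each output pixel by mapping its coordinates back into the source with the inverse matrix and taking the nearest source pixel. It uses the column-vector convention of the OpenCV docstrings. `warp_perspective_nearest` is a hypothetical name, and the sketch ignores interpolation and border modes, so it is not a substitute for `cv2.warpPerspective`.

```python
import numpy as np

def warp_perspective_nearest(src, M, dsize):
    """Warp src with a 3x3 matrix M (column-vector convention) into a (width, height) output."""
    width, height = dsize
    M_inv = np.linalg.inv(np.asarray(M, dtype=np.float64))
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    # Homogeneous coordinates of every output pixel, shape 3 x N (row-major pixel order).
    dst_pts = np.stack([xs.ravel(), ys.ravel(), np.ones(width * height)])
    sx_h, sy_h, sw_h = M_inv @ dst_pts
    sx = np.rint(sx_h / sw_h).astype(int)   # nearest source column
    sy = np.rint(sy_h / sw_h).astype(int)   # nearest source row
    valid = (sx >= 0) & (sx < src.shape[1]) & (sy >= 0) & (sy < src.shape[0])
    out = np.zeros((height, width) + src.shape[2:], dtype=src.dtype)
    # reshape returns a view here, so the assignment writes into out
    out.reshape((height * width,) + src.shape[2:])[valid] = src[sy[valid], sx[valid]]
    return out

src_img = np.arange(9, dtype=np.uint8).reshape(3, 3)
shift_right = np.array([[1.0, 0.0, 1.0],
                        [0.0, 1.0, 0.0],
                        [0.0, 0.0, 1.0]])  # translate by one pixel in x
warped = warp_perspective_nearest(src_img, shift_right, (3, 3))
print(warped)
```

Mapping backwards from the output avoids holes: every output pixel gets exactly one value, and pixels that map outside the source stay zero.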
- circle(): draws a circle; used below to mark the detected corner points
def circle(img, center, radius, color, thickness=None, lineType=None, shift=None): # real signature unknown; restored from __doc__
    """
    circle(img, center, radius, color[, thickness[, lineType[, shift]]]) -> img
    . @brief Draws a circle.
    .
    . The function cv::circle draws a simple or filled circle with a given center and radius.
    . @param img Image where the circle is drawn.
    . @param center Center of the circle.
    . @param radius Radius of the circle.
    . @param color Circle color.
    . @param thickness Thickness of the circle outline, if positive. Negative values, like #FILLED,
    . mean that a filled circle is to be drawn.
    . @param lineType Type of the circle boundary. See #LineTypes
    . @param shift Number of fractional bits in the coordinates of the center and in the radius value.
    """
    pass
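Results
point_size=2
point_color=(0,255,0)
thickness=2
for point in docCnt.reshape(4,2):
    # convert to plain Python ints; some OpenCV versions reject NumPy integer types here
    cv2.circle(image,tuple(int(v) for v in point),point_size,point_color,thickness)
cv2.imshow('original',image)
cv2.imshow('gray',gray)
cv2.imshow('edged',edged)
cv2.imshow('result_img',dst)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result images:
Original image:
Contour image:
Image with the corner points marked:
Corrected image: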