Learning objectives
- Learn about image pyramids
- Use image pyramids to create a new fruit, "Orapple"
- Learn the functions cv2.pyrUp() and cv2.pyrDown().
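Theory
Normally we work with an image at a fixed size. In some cases, though, we need the same image at several resolutions. For example, when searching an image for an object such as a face, we do not know in advance how large the face will appear. In that case we create a set of images at different resolutions and search for the object in each of them. Such a set of images at different resolutions is called an image pyramid, as illustrated in the figure below:
There are two common kinds of image pyramid: (1) the Gaussian pyramid and (2) the Laplacian pyramid.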
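A higher level (lower resolution) of a Gaussian pyramid is formed by blurring the lower level (higher resolution) image and then removing its even-numbered rows and columns. Each pixel in the higher level is therefore a Gaussian-weighted combination of the pixels in a 5 x 5 neighborhood of the level below (5 pixels along each axis). An M x N image thereby becomes an M/2 x N/2 image, so its area shrinks to one quarter of the original. Each such step is called an octave.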
Gaussian pyramid function
low_resolution = cv2.pyrDown(img)
def pyrDown(src, dst=None, dstsize=None, borderType=None): # real signature unknown; restored from __doc__
    """
    @brief Blurs an image and downsamples it.
    By default, size of the output image is computed as `Size((src.cols+1)/2,
    (src.rows+1)/2)`, but in any case, the following conditions should be satisfied
    (see formula (1) below).
    The function performs the downsampling step of the Gaussian pyramid construction.
    First, it convolves the source image with the kernel (see formula (2) below):
    \f[\frac{1}{256} \begin{bmatrix} 1 & 4 & 6 & 4 & 1 \\ 4 & 16 & 24 & 16 & 4 \\ 6
    & 24 & 36 & 24 & 6 \\ 4 & 16 &
    24 & 16 & 4 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix}\f]
    Then, it downsamples the image by rejecting even rows and columns.
    @param src input image.
    @param dst output image; it has the specified size and the same type as src.
    @param dstsize size of the output image.
    @param borderType Pixel extrapolation method, see #BorderTypes (#BORDER_CONSTANT isn't supported)
    """
    pass
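The two formulas below give, respectively, the relation that the output and input image sizes must satisfy, and the Gaussian kernel used for downsampling: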
$$
\begin{array}{l}
| \texttt{dstsize.width} * 2 - \texttt{src.cols} | \leq 2 \\
| \texttt{dstsize.height} * 2 - \texttt{src.rows} | \leq 2
\end{array} \tag{1}
$$

$$
\frac{1}{256} \begin{bmatrix} 1 & 4 & 6 & 4 & 1 \\ 4 & 16 & 24 & 16 & 4 \\ 6 & 24 & 36 & 24 & 6 \\ 4 & 16 & 24 & 16 & 4 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix} \tag{2}
$$
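Both formulas can be checked with a few lines of NumPy. The sketch below rests on one observation that the text does not state: the 5 x 5 kernel of formula (2) is the outer product of the 1-D weights [1, 4, 6, 4, 1] / 16 with themselves, so it sums to 1 and preserves overall brightness. The helper dstsize_is_valid is a hypothetical name used here to restate condition (1); it is not an OpenCV function.

```python
import numpy as np

# 1-D binomial weights; their outer product gives the 5 x 5 kernel of formula (2).
w = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0
kernel = np.outer(w, w)


def dstsize_is_valid(src_rows, src_cols, dst_width, dst_height):
    """Check condition (1) for a requested pyrDown output size."""
    return (abs(dst_width * 2 - src_cols) <= 2
            and abs(dst_height * 2 - src_rows) <= 2)


print(np.round(kernel * 256).astype(int))  # integer weights of formula (2)
print(kernel.sum())                        # total weight of the kernel
print(dstsize_is_valid(342, 548, 274, 171))
```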
Example code
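import cv2

img = cv2.imread('messi5.jpg')
if img is None:
    raise FileNotFoundError('messi5.jpg could not be read')
row, col, _ = img.shape
# Default pyrDown output size: each dimension is halved, rounding up.
print((row + 1) // 2, (col + 1) // 2)

# Downsample three times: 1/2, 1/4 and 1/8 of the original size.
lower_reso1 = cv2.pyrDown(img)
lower_reso2 = cv2.pyrDown(lower_reso1)
lower_reso3 = cv2.pyrDown(lower_reso2)
cv2.imshow('origin image', img)
cv2.imshow('pyrDown./2', lower_reso1)
cv2.imshow('pyrDown./4', lower_reso2)
cv2.imshow('pyrDown./8', lower_reso3)

# Upsample the smallest image three times, back toward the original size.
higher_reso3 = cv2.pyrUp(lower_reso3)
higher_reso2 = cv2.pyrUp(higher_reso3)
higher_reso1 = cv2.pyrUp(higher_reso2)
cv2.imshow('pyrUp*2', higher_reso3)
cv2.imshow('pyrUp*4', higher_reso2)
cv2.imshow('pyrUp*8', higher_reso1)
cv2.waitKey(0)
cv2.destroyAllWindows()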
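The output is shown below. As the results show, downsampling discards image information, and upsampling cannot restore the image to its original appearance.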
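Laplacian pyramid
The Laplacian pyramid is built from the Gaussian pyramid; there is no dedicated function for it. A Laplacian pyramid image looks like an edge image, and most of its elements are zero. It is commonly used in image compression. Each level is computed as: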
$$
L_i = G_i - Up(G_{i+1})
$$
where $L_i$ is the level-$i$ Laplacian image, $G_i$ is the level-$i$ image of the Gaussian pyramid, and $Up(G_{i+1})$ denotes upsampling $G_{i+1}$.
The figure below is a schematic of the Laplacian pyramid and may help in understanding the whole process:
Example code
import cv2 as cv


def laplacian_demo(src, pyramid_images):
    """Show one Laplacian image per Gaussian pyramid level, coarsest first."""
    level = len(pyramid_images)
    for i in range(level - 1, -1, -1):
        # The finer image is the previous pyramid level, or the source for level 0.
        finer = src if i == 0 else pyramid_images[i - 1]
        h, w = finer.shape[:2]
        # dstsize keeps the upsampled image the same size as the finer image.
        expand = cv.pyrUp(pyramid_images[i], dstsize=(w, h))
        lpls = cv.subtract(finer, expand)  # + 127
        cv.imshow("lpls_" + str(i), lpls)


def pyramid_down(image, level=3):
    """Return `level` successively downsampled copies of image."""
    temp = image.copy()
    pyramid_images = []
    for i in range(level):
        dst = cv.pyrDown(temp)
        pyramid_images.append(dst)
        temp = dst.copy()
    return pyramid_images


src = cv.imread("messi5.jpg", 1)
if src is None:
    raise FileNotFoundError("messi5.jpg could not be read")
cv.namedWindow("input", cv.WINDOW_AUTOSIZE)
cv.imshow("input", src)
laplacian_demo(src, pyramid_down(src))
cv.waitKey(0)
cv.destroyAllWindows()
The result is shown below:
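Image Blending using Pyramids
One application of image pyramids is image blending. In image stitching, for example, two images have to be stacked together, and because the images differ, the result may look odd, for instance with a visible seam. Blending with pyramids makes the seam much less visible. A classic example blends two fruits; the procedure is roughly: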
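- Load the two images, apple and orange.
- Build a 4-level Gaussian pyramid for each image.
- From the Gaussian pyramids, build the Laplacian pyramids.
- At each level of the Laplacian pyramids, join the left half of the apple with the right half of the orange.
- Reconstruct the full-size image from the joined pyramid.
The code is as follows: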
import cv2
import numpy as np

# Both images are expected to have the same size.
A = cv2.imread('apple.jpg')
B = cv2.imread('orange.jpg')
print(A.shape)
print(B.shape)

# generate Gaussian pyramid for A
level = 4
G = A.copy()
gpA = [G]
for i in range(level):
    G = cv2.pyrDown(G)
    gpA.append(G)

# generate Gaussian pyramid for B
G = B.copy()
gpB = [G]
for i in range(level):
    G = cv2.pyrDown(G)
    gpB.append(G)

# generate Laplacian pyramid for A, coarsest level first
lpA = [gpA[level - 1]]
for i in range(level - 1, 0, -1):
    h, w = gpA[i - 1].shape[:2]
    GE = cv2.pyrUp(gpA[i], dstsize=(w, h))  # match the finer level's size
    L = cv2.subtract(gpA[i - 1], GE)
    lpA.append(L)

# generate Laplacian pyramid for B, coarsest level first
lpB = [gpB[level - 1]]
for i in range(level - 1, 0, -1):
    h, w = gpB[i - 1].shape[:2]
    GE = cv2.pyrUp(gpB[i], dstsize=(w, h))
    L = cv2.subtract(gpB[i - 1], GE)
    lpB.append(L)

# Now add left and right halves of images in each level
LS = []
for la, lb in zip(lpA, lpB):
    cols = la.shape[1]
    ls = np.hstack((la[:, :cols // 2], lb[:, cols // 2:]))
    LS.append(ls)

# now reconstruct, from the coarsest level up
ls_ = LS[0]
for i in range(1, level):
    h, w = LS[i].shape[:2]
    ls_ = cv2.pyrUp(ls_, dstsize=(w, h))
    ls_ = cv2.add(ls_, LS[i])

# image with direct connecting each half
cols = A.shape[1]
real = np.hstack((A[:, :cols // 2], B[:, cols // 2:]))
cv2.imwrite('Pyramid_blending2.jpg', ls_)
cv2.imwrite('Direct_blending.jpg', real)
The output is shown below: