OpenCV-Python -- Image Pyramids

X_Imagine

Category: OpenCV-Python. Tags: image pyramid, Gaussian image pyramid, Laplacian image pyramid, image blending, OpenCV

Article link: https://blog.csdn.net/kxh123456/article/details/109631030

Learning Goals

Learn about image pyramids
Use image pyramids to create a new fruit, "Orapple"
Learn the functions cv2.pyrUp() and cv2.pyrDown()

Theory

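We normally work with an image at one fixed size. In some cases, however, we need the same image at several resolutions. For example, when searching an image for an object such as a face, we do not know in advance how large the face will appear. In that case we build a set of copies of the image at different resolutions and search for the object in each of them. Such a set of images is called an image pyramid (Image Pyramids):
[Figure: an image pyramid]
There are two common kinds of image pyramid: (1) the Gaussian pyramid; (2) the Laplacian pyramid.
A higher level (lower resolution) of the Gaussian pyramid is obtained from the level below it (higher resolution) by smoothing with a Gaussian kernel and then removing every other row and column. Each pixel in the higher level is therefore a Gaussian-weighted combination of a 5 x 5 neighborhood (5 pixels along each axis) in the level below. An M x N image becomes an M/2 x N/2 image, so its area shrinks to 1/4 of the original. One such step is called an octave.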

Gaussian Pyramid Functions

low_resolution = cv2.pyrDown(img)

def pyrDown(src, dst=None, dstsize=None, borderType=None): # real signature unknown; restored from __doc__

    """
       @brief Blurs an image and downsamples it.
       
       By default, size of the output image is computed as `Size((src.cols+1)/2,
       (src.rows+1)/2)`, but in any case, the conditions in formula (1) below
       should be satisfied.

       The function performs the downsampling step of the Gaussian pyramid construction.
       First, it convolves the source image with the kernel (formula (2) below):

       \f[\frac{1}{256} \begin{bmatrix} 1 & 4 & 6 & 4 & 1  \\ 4 & 16 & 24 & 16 & 4  \\
       6 & 24 & 36 & 24 & 6  \\ 4 & 16 & 24 & 16 & 4  \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix}\f]
       
       Then, it downsamples the image by rejecting even rows and columns.
       
       @param src input image.
       @param dst output image; it has the specified size and the same type as src.
       @param dstsize size of the output image. 
       @param borderType Pixel extrapolation method, see #BorderTypes (#BORDER_CONSTANT isn't supported)
    """
    pass

The two formulas below give, respectively, the relation that the output and input image sizes must satisfy, and the Gaussian kernel used for downsampling:

$$\begin{array}{l} | \texttt{dstsize.width} \cdot 2 - \texttt{src.cols} | \leq 2 \\ | \texttt{dstsize.height} \cdot 2 - \texttt{src.rows} | \leq 2 \end{array} \tag{1}$$

$$\frac{1}{256} \begin{bmatrix} 1 & 4 & 6 & 4 & 1 \\ 4 & 16 & 24 & 16 & 4 \\ 6 & 24 & 36 & 24 & 6 \\ 4 & 16 & 24 & 16 & 4 \\ 1 & 4 & 6 & 4 & 1 \end{bmatrix} \tag{2}$$
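As a quick check of formulas (1) and (2), the sketch below builds the 5 x 5 kernel as the outer product of the 1-D weights [1, 4, 6, 4, 1] and tests a requested output size against condition (1). Both helper names are hypothetical, and the code is plain Python that does not call OpenCV.

```python
WEIGHTS = [1, 4, 6, 4, 1]  # 1-D binomial weights; they sum to 16


def pyrdown_kernel():
    """5 x 5 kernel of formula (2): outer product of WEIGHTS, divided by 256."""
    return [[a * b / 256 for b in WEIGHTS] for a in WEIGHTS]


def dstsize_is_valid(src_rows, src_cols, dst_width, dst_height):
    """Condition (1): twice the output size is within 2 of the input size."""
    return (abs(dst_width * 2 - src_cols) <= 2
            and abs(dst_height * 2 - src_rows) <= 2)


kernel = pyrdown_kernel()
center_weight = kernel[2][2] * 256             # centre entry of the matrix
kernel_sum = sum(sum(row) for row in kernel)   # a smoothing kernel sums to 1
default_ok = dstsize_is_valid(101, 101, 51, 51)   # default size (n + 1) // 2
too_small_ok = dstsize_is_valid(101, 101, 40, 40)
print(center_weight, kernel_sum, default_ok, too_small_ok)
```

Because the kernel is an outer product, the 2-D smoothing can be carried out as two 1-D passes, one along rows and one along columns.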

Example code

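import cv2

img = cv2.imread('messi5.jpg')
if img is None:
    raise FileNotFoundError('messi5.jpg could not be read')
row, col, _ = img.shape
# Default pyrDown output size is ((rows + 1) // 2, (cols + 1) // 2)
print((row + 1) // 2, (col + 1) // 2)

# Build three levels of the Gaussian pyramid (1/2, 1/4, 1/8 resolution)
lower_reso1 = cv2.pyrDown(img)
lower_reso2 = cv2.pyrDown(lower_reso1)
lower_reso3 = cv2.pyrDown(lower_reso2)
cv2.imshow('origin image', img)
cv2.imshow('pyrDown./2', lower_reso1)
cv2.imshow('pyrDown./4', lower_reso2)
cv2.imshow('pyrDown./8', lower_reso3)

# Upsample the smallest level three times; the detail lost above does not come back
higher_reso3 = cv2.pyrUp(lower_reso3)
higher_reso2 = cv2.pyrUp(higher_reso3)
higher_reso1 = cv2.pyrUp(higher_reso2)
cv2.imshow('pyrUp*2', higher_reso3)
cv2.imshow('pyrUp*4', higher_reso2)
cv2.imshow('pyrUp*8', higher_reso1)
cv2.waitKey(0)
cv2.destroyAllWindows()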

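The output is shown below. As the results show, downsampling discards image information, so upsampling cannot restore the image to its original appearance.
[Figure: downsampled and upsampled results]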

Laplacian Pyramid

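A Laplacian pyramid is built from a Gaussian pyramid; OpenCV has no dedicated function for it. Laplacian pyramid images look like edge images, and most of their elements are zero. They are commonly used for image compression. A level of the Laplacian pyramid is given by:
$L_i = G_i - Up(G_{i+1})$
where $L_i$ is the level-$i$ Laplacian image, $G_i$ is the level-$i$ Gaussian pyramid image, and $Up(G_{i+1})$ is $G_{i+1}$ upsampled to the size of $G_i$.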

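The figure below sketches the Laplacian pyramid and may help in following the whole process:
[Figure: construction of the Laplacian pyramid]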

Example code

import cv2 as cv


def laplacian_demo(src, pyramid_images):
    # levels[i] is G_i: G_0 is the source image, G_1 onward come from pyrDown
    levels = [src] + pyramid_images
    for i in range(len(levels) - 2, -1, -1):
        h, w = levels[i].shape[:2]
        # dstsize keeps the upsampled image the same size as G_i for odd dimensions
        expand = cv.pyrUp(levels[i + 1], dstsize=(w, h))
        # L_i = G_i - Up(G_{i+1}); on uint8 input cv.subtract clips negatives to 0
        lpls = cv.subtract(levels[i], expand)
        cv.imshow("lpls_" + str(i), lpls)


def pyramid_down(image, level=3):
    # Return [G_1, ..., G_level], each produced by pyrDown on the previous level
    temp = image.copy()
    pyramid_images = []
    for i in range(level):
        dst = cv.pyrDown(temp)
        pyramid_images.append(dst)
        temp = dst.copy()
    return pyramid_images


src = cv.imread("messi5.jpg", 1)
cv.namedWindow("input", cv.WINDOW_AUTOSIZE)
cv.imshow("input", src)
laplacian_demo(src, pyramid_down(src))

cv.waitKey(0)
cv.destroyAllWindows()

The result is shown below:
[Figure: output of the Laplacian pyramid example]

Image Blending using Pyramids

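One application of image pyramids is image blending. For example, when stitching two registered images together, simply stacking them tends to look wrong because the images differ, with a clearly visible seam where they meet. Blending with pyramids makes that seam far less visible. A classic example blends two fruits; the rough procedure is:

Load the two images, the apple and the orange.
Build a 4-level Gaussian pyramid for each image.
From the Gaussian pyramids, build the Laplacian pyramids.
At each level of the Laplacian pyramids, join the left half of the apple with the right half of the orange.
Reconstruct the blended image from the joined pyramid.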

The code is as follows

import cv2
import numpy as np

A = cv2.imread('apple.jpg')
B = cv2.imread('orange.jpg')
# the two images must have the same size for the halves to line up
print(A.shape)
print(B.shape)

# generate Gaussian pyramid for A
level = 4
G = A.copy()
gpA = [G]
for i in range(level):
    G = cv2.pyrDown(G)
    gpA.append(G)

# generate Gaussian pyramid for B
G = B.copy()
gpB = [G]
for i in range(level):
    G = cv2.pyrDown(G)
    gpB.append(G)

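# generate Laplacian Pyramid for A (coarsest level first)
lpA = [gpA[level-1]]
for i in range(level-1, 0, -1):
    # pass dstsize so the upsampled image matches gpA[i - 1] even for odd sizes
    h, w = gpA[i - 1].shape[:2]
    GE = cv2.pyrUp(gpA[i], dstsize=(w, h))
    L = cv2.subtract(gpA[i - 1], GE)
    lpA.append(L)

# generate Laplacian Pyramid for B (coarsest level first)
lpB = [gpB[level-1]]
for i in range(level-1, 0, -1):
    h, w = gpB[i - 1].shape[:2]
    GE = cv2.pyrUp(gpB[i], dstsize=(w, h))
    L = cv2.subtract(gpB[i - 1], GE)
    lpB.append(L)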

# Now add left and right halves of images in each level
LS = []
for la, lb in zip(lpA, lpB):
    cols = la.shape[1]
    ls = np.hstack((la[:, :cols // 2], lb[:, cols // 2:]))
    LS.append(ls)

# now reconstruct, from the coarsest level up to full resolution
ls_ = LS[0]
for i in range(1, level):
    h, w = LS[i].shape[:2]
    ls_ = cv2.pyrUp(ls_, dstsize=(w, h))
    ls_ = cv2.add(ls_, LS[i])

# image made by directly joining the two halves, for comparison
half = A.shape[1] // 2
real = np.hstack((A[:, :half], B[:, half:]))

cv2.imwrite('Pyramid_blending2.jpg', ls_)
cv2.imwrite('Direct_blending.jpg', real)