OpenCV Tutorial (25) -- Image Pyramids

Techblog of HaoWANG

Article link: https://blog.csdn.net/hhaowang/article/details/102533040

Goal

In this chapter,

We will learn about Image Pyramids
We will use Image pyramids to create a new fruit, "Orapple"
We will see these functions: cv2.pyrUp(), cv2.pyrDown()

Theory

Normally we work with an image at a single, fixed resolution. On some occasions, however, we need to work with the same image at several resolutions. For example, when searching an image for an object such as a face, we do not know in advance at what size the object will appear. In that case we create a set of copies of the original image at different resolutions and search for the object in all of them. This set of images is called an image pyramid: when the images are stacked with the largest at the bottom and the smallest at the top, the stack looks like a pyramid.

There are two kinds of image pyramids: 1) Gaussian pyramids and 2) Laplacian pyramids.

A higher level (lower resolution) of a Gaussian pyramid is formed from the level below it (higher resolution) by blurring with a Gaussian kernel and then removing every other row and column. Each pixel in the higher level is therefore a Gaussian-weighted average over a neighborhood 5 pixels wide in the underlying level. In this way an M×N image becomes an M/2×N/2 image (OpenCV rounds up for odd sizes), so its area shrinks to one quarter of the original. Each such reduction is called an octave. The same pattern continues as we move up the pyramid, with the resolution decreasing at every level. Conversely, when expanding, the area becomes 4 times larger at each level. Gaussian pyramids can be built with the functions cv2.pyrDown() and cv2.pyrUp().

cv2.pyrDown() builds the pyramid upward from a large, high-resolution image: each call produces a smaller image with lower resolution.

higher_reso = cv2.imread('messi5.jpg')
lower_reso = cv2.pyrDown(higher_reso)

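cv2.pyrUp() goes the other way, moving down the pyramid from a small, low-resolution image: the size increases, but the resolution (the amount of detail) does not.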

higher_reso2 = cv2.pyrUp(lower_reso)

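Remember that higher_reso2 is not the same as higher_reso: once cv2.pyrDown() has been applied, the resolution drops and information is lost. The figure below shows the image obtained by applying cv2.pyrUp() to the third level (counting from the bottom) of the pyramid produced by cv2.pyrDown(); compared with the original, it has far less detail.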

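Laplacian pyramids are computed from Gaussian pyramids. With $G_i$ denoting level $i$ of the Gaussian pyramid ($G_0$ is the original image), level $i$ of the Laplacian pyramid is:

$$L_i = G_i - \mathrm{pyrUp}(G_{i+1})$$

The images of a Laplacian pyramid look like edge maps, and most of their pixels are 0. They are often used in image compression. The figure below shows a three-level Laplacian pyramid: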

Image Blending using Pyramids

One application of pyramids is image blending. For example, in image stitching you need to place two images next to each other, but the result may look poor because of the discontinuity where the images meet. In that case, blending with pyramids gives a seamless transition while preserving most of the image data. A classic example is blending two fruits, an orange and an apple. The result below shows what this means:

For more details on image blending and Laplacian pyramids, see the additional resources at the end.

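The steps to achieve this result are:

1. Load the two images, apple (01) and orange (02).

2. Build Gaussian pyramids for the apple and the orange (6 levels).

3. Compute the Laplacian pyramids from the Gaussian pyramids.

4. Join the left half of the apple and the right half of the orange at each level of the Laplacian pyramids.

5. Reconstruct the full-size image from the joined pyramid.

The figure below, taken from "Learning OpenCV", shows how a pyramid is built and how the original image is reconstructed from it.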

The complete code follows. (For simplicity, each step is done separately, which uses more memory; you can optimize it if you like.)

#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
# @Time    : 2019/10/13 15:24
# @Author  : HaoWang
# @Site    : HongKong, China
# @project : $[PROJECT_NAME]
# @File    : 1102. org.py
# @Software: PyCharm
# @license: haowanghk@gmail.com 
"""
import cv2
import numpy as np

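# Both images must have the same size; dimensions divisible by 32
# (e.g. 512x512) keep the pyrUp/pyrDown sizes consistent at every level.
A = cv2.imread('apple.PNG')
B = cv2.imread('orange.PNG')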

# generate Gaussian pyramid for A
G = A.copy()
gpA = [G]
for i in range(6):
    G = cv2.pyrDown(G)
    gpA.append(G)

# generate Gaussian pyramid for B
G = B.copy()
gpB = [G]
for i in range(6):
    G = cv2.pyrDown(G)
    gpB.append(G)

# generate Laplacian Pyramid for A
lpA = [gpA[5]]
for i in range(5,0,-1):
    GE = cv2.pyrUp(gpA[i])
    L = cv2.subtract(gpA[i-1],GE)
    lpA.append(L)

# generate Laplacian Pyramid for B
lpB = [gpB[5]]
for i in range(5,0,-1):
    GE = cv2.pyrUp(gpB[i])
    L = cv2.subtract(gpB[i-1],GE)
    lpB.append(L)
# Now join the left half of A and the right half of B at each level.
# np.hstack takes a sequence of arrays and stacks them horizontally
# to make a single array.
LS = []
for la, lb in zip(lpA, lpB):
    rows, cols, dpt = la.shape
    # integer division: slice indices must be ints in Python 3
    ls = np.hstack((la[:, 0:cols//2], lb[:, cols//2:]))
    LS.append(ls)
# now reconstruct
ls_ = LS[0]
for i in range(1,6):
    ls_ = cv2.pyrUp(ls_)
    ls_ = cv2.add(ls_, LS[i])
# image formed by directly joining the two halves
cols = A.shape[1]
real = np.hstack((A[:, :cols//2], B[:, cols//2:]))
cv2.imwrite('Pyramid_blending2.jpg',ls_)
cv2.imwrite('Direct_blending.jpg',real)