[IPMV] Lab4 Blending: Linear Blending vs. Laplacian Pyramid Blending

Article link: https://blog.csdn.net/BIYing_Aurora/article/details/147914052


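Based on the IPMV lab instructions written by Jiang Lei at Tongji University.
Study notes for the IPMV course in the Automation programme at Tongji University (covering key concepts and labs).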

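Lab0 Setting Up the Lab Environment and an OpenCV Hello World
Lab1-1 Getting Started with OpenCV: From Basic Operations to Real-Time Video Processing (Part 1)
Lab1-2 Getting Started with OpenCV: From Basic Operations to Real-Time Video Processing (Part 2)
Lab2-1 Eigen from Scratch: A Handy Tool for Linear Algebra and OpenCV Interoperability (Part 1)
Lab2-2 Eigen from Scratch: A Handy Tool for Linear Algebra and OpenCV Interoperability (Part 2)
Lab4 Blending: Linear Blending vs. Laplacian Pyramid Blending
…
To be updated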

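0 Preface

Have you ever seen a "hybrid tiger" that is half orange and half white? In image processing, image blending lets us create exactly this kind of effect. In this post we will:

Linear blending: implement the blunt, "fifty-fifty" style mix
Laplacian pyramid blending: implement the more professional multi-scale blend
Compare the results of the two algorithms in depth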

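1 🎯 Lab Goals

Learn how to generate a weight mask for image blending
Implement the linear blending algorithm
Understand how Laplacian pyramid blending works
Compare the blending results for different transition widths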


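2 🌌 How Blending Works

2.1 Linear blending: the simplest way to merge two images

$I_{blend}(x,y) = \alpha(x,y) \cdot I_1(x,y) + (1-\alpha(x,y)) \cdot I_2(x,y)$
where $\alpha(x, y)$ is the position-dependent blend weight.

Pros: fast. The work per pixel is constant, so the cost is O(N) for an image with N pixels.
Cons: "ghosting" appears in the transition zone, where both images stay partially visible.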
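
To make the formula concrete, here is a minimal sketch of linear blending on one row of grey values, using only the C++ standard library. The helper names `blendSample` and `blendRow` are hypothetical and are not part of the lab template; the lab itself does the same thing on whole `cv::Mat` images.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Blend one pair of samples: alpha = 1 keeps a, alpha = 0 keeps b.
float blendSample(float a, float b, float alpha)
{
  return alpha * a + (1.0f - alpha) * b;
}

// Blend two equally sized rows using one weight per sample.
std::vector<float> blendRow(const std::vector<float>& row_1,
                            const std::vector<float>& row_2,
                            const std::vector<float>& alpha)
{
  assert(row_1.size() == row_2.size() && row_1.size() == alpha.size());
  std::vector<float> out(row_1.size());
  for (std::size_t x = 0; x < out.size(); ++x)
  {
    out[x] = blendSample(row_1[x], row_2[x], alpha[x]);
  }
  return out;
}
```

Where the weight is strictly between 0 and 1, both inputs contribute to the output sample, which is exactly where ghosting shows up when the two images disagree.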

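2.2 Laplacian pyramid blending: multi-scale blending

Mimicking the human visual system, the images are blended separately at several scales, roughly in three bands:

| Pyramid level | What is blended | Effect |
| --- | --- | --- |
| Top (small images) | Low-frequency background colour | Sets the overall tonal transition |
| Middle | Mid-frequency texture | Controls how naturally the tiger stripes join |
| Bottom (original size) | High-frequency detail | Preserves fine features such as fur |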

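Build the Gaussian pyramid:
$G_0 = I,\quad G_i = \mathrm{Downsample}(G_{i-1}),\ i = 1, 2, \dots, n$
Build the Laplacian pyramid (the coarsest Gaussian level is kept as the last level):
$L_i = G_i - \mathrm{Upsample}(G_{i+1}),\ i = 0, 1, \dots, n-1,\quad L_n = G_n$
Blend level by level, then reconstruct, where $\alpha_i$ is level $i$ of the Gaussian pyramid of the weight mask:
$I_{blend} = \mathrm{Collapse}(\{\alpha_i L_i^1 + (1-\alpha_i) L_i^2\})$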
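
As a quick check on these definitions, the short derivation below shows why collapsing an unblended Laplacian pyramid gives back the original image. The symbol $\hat{G}_i$ (the partial reconstruction at level $i$) is notation introduced here for illustration only, and $L_n = G_n$ assumes the coarsest Gaussian level is stored as the last Laplacian level, as the lab code in Step5 does.

```latex
\begin{aligned}
\hat{G}_n &= L_n = G_n \\
\hat{G}_i &= \mathrm{Upsample}(\hat{G}_{i+1}) + L_i \\
          &= \mathrm{Upsample}(G_{i+1}) + G_i - \mathrm{Upsample}(G_{i+1}) = G_i,
          \quad i = n-1, \dots, 0
\end{aligned}
```

By induction $\hat{G}_0 = G_0$, so the reconstruction is exact whatever Downsample and Upsample operators are used. The blended image differs from both inputs only because each $L_i$ is replaced by $\alpha_i L_i^1 + (1-\alpha_i) L_i^2$ before collapsing.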

[Figure: blending pipeline diagram]

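3 🛠️ Core Code

3.1 Generating the weight mask

cv::Mat weights(img.size(), CV_32FC1, cv::Scalar(1.0f));
weights.colRange(weights.cols / 2, weights.cols).setTo(0.0f);
// 601 px box kernel: the ramp extends 300 px to each side of the seam
cv::blur(weights, weights, cv::Size(2 * 300 + 1, 1));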

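3.2 Linear blending

cv::Mat linearBlend(const cv::Mat& img1, const cv::Mat& img2, const cv::Mat& w) {
    // w must have the same size and type as the images (e.g. CV_32FC3).
    // cv::Scalar::all(1.0) covers every channel; a plain 1 converts to cv::Scalar(1, 0, 0, 0).
    return w.mul(img1) + (cv::Scalar::all(1.0) - w).mul(img2);
}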

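3.3 Laplacian pyramid blending

Building the Gaussian pyramid

std::vector<cv::Mat> buildGaussianPyramid(const cv::Mat& img) {
    std::vector<cv::Mat> pyramid;
    cv::Mat current = img.clone();
    // Keep every level wider than 16 px; each step halves the resolution.
    while (current.cols > 16) {
        pyramid.push_back(current);
        cv::Mat next;
        cv::pyrDown(current, next);
        current = next;
    }
    return pyramid;
}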

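Reconstructing the image

cv::Mat reconstruct(const std::vector<cv::Mat>& pyramid) {
    // Start from the coarsest level; clone so the input pyramid is never modified.
    cv::Mat result = pyramid.back().clone();
    for (int i = static_cast<int>(pyramid.size()) - 2; i >= 0; --i) {
        cv::Mat upsampled;
        cv::pyrUp(result, upsampled, pyramid[i].size());
        result = upsampled + pyramid[i];
    }
    return result;
}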

4 🐯 Guide

Step1. Get lab code template

Download the lab code template with git from your Ubuntu terminal:

git clone https://git.tongji.edu.cn/ipmv/examples/lab_4.git

The template includes three PNG images:

lion.png, tiger.png, white_tiger.png

Each image is 1200x1200 pixels.

Step2. Read and convert images

Read the image files with the cv::imread() function.

Then convert the images to CV_32F and rescale the pixel values so that they lie in the interval [0, 1].

// Load images.
// TODO: Load the images using cv::imread() and convert to 32-bit floating point images.
// Using relative filenames such as "../tiger.png" should work.
// Remember to rescale so that they have values in the interval [0, 1].
// Hint: convertTo().
cv::Mat img_1, img_2;
img_1 = cv::imread("../white_tiger.png");
img_2 = cv::imread("../tiger.png");

img_1.convertTo(img_1, CV_32F, 1.0/255.0);
img_2.convertTo(img_2, CV_32F, 1.0/255.0);

showResult("Lab 4 - Image 1 original", img_1);
showResult("Lab 4 - Image 2 original", img_2);
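
The scale factor 1.0/255.0 is what maps 8-bit pixel values into [0, 1]; without it the float image would still hold values up to 255. Here is a minimal standard-library sketch of that mapping on a few samples. The helper name `toUnitFloat` is hypothetical and only illustrates the arithmetic, not how OpenCV implements convertTo().

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Map 8-bit samples in [0, 255] to floats in [0, 1].
std::vector<float> toUnitFloat(const std::vector<std::uint8_t>& samples)
{
  std::vector<float> out;
  out.reserve(samples.size());
  for (std::uint8_t s : samples)
  {
    out.push_back(static_cast<float>(s) / 255.0f);
  }
  return out;
}
```

Keeping both images and the weight mask in the same [0, 1] float range is what lets the later blending steps multiply and add them directly.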

Step3. Create an image of blend weights

The mask should fulfil the following requirements:

Same size as the input images
Left half of the columns is white (1.0), so img_1 is shown there
Right half of the columns is black (0.0), so img_2 is shown there
How do we make the ramp between the two halves?

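First create a horizontal mask: left side = 1.0 (white, selects img_1), right side = 0.0 (black, selects img_2).

Create a single-channel 32-bit float image with the same size as img_1 and initialise every pixel to 1.0f (white)
Compute the index of the middle column
Set the right half of the image (from the middle column to the last column) to 0.0f (black)

Use cv::blur() to create a smooth ramp between the two halves, so that no sharp seam appears.

Define ramp_width as 300 pixels
Blur in the horizontal (x) direction only to create the smooth transition
- Kernel size: (ramp_width * 2 + 1, 1), so the ramp extends ramp_width pixels to each side of the seam
- Vertical direction: 1 pixel (no vertical blur)

Finally, merge the single-channel mask into 3 channels so that it matches the 3-channel colour images.

Create three identical single-channel images (base)
Merge them into one 3-channel image with cv::merge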

  // Construct weights.
  // TODO: Create a 32-bit, 3 channel floating point weight image.
  // The first half of the columns should be white (1.0f).
  // The last half of the columns should be black (0.0f).
  // Then make a ramp between these two halves.
  // Hint: Use cv::blur() to make the ramp.

  // Create 1-channel float image: left half = 1.0, right half = 0.0
  cv::Mat base(img_1.size(), CV_32FC1, cv::Scalar(1.0f));
  int mid_col = base.cols / 2;
  base.colRange(mid_col, base.cols).setTo(0.0f);
  // Apply blur to create a smooth ramp
  const int ramp_width = 300;
  cv::blur(base, base, cv::Size(ramp_width * 2 + 1, 1));
  // Convert to 3 channels (one identical weight per colour channel)
  cv::Mat weights;
  cv::Mat channels[] = {base, base, base};
  cv::merge(channels, 3, weights);
  showResult("Lab 4 - Weights", weights);
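
To see what the blur does to the step, here is a minimal one-row sketch in standard C++. It is an illustration under stated assumptions, not the lab code: `boxBlurRow` and `makeRampRow` are hypothetical helpers, and out-of-range indices are clamped to the nearest valid sample, which may differ from the border mode cv::blur() uses. Away from the image borders the ramp itself is not affected by that choice.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Horizontal box blur of one row with a kernel of (2 * radius + 1) samples.
// Out-of-range indices are clamped to the nearest valid sample (assumption of this sketch).
std::vector<float> boxBlurRow(const std::vector<float>& row, int radius)
{
  assert(!row.empty() && radius >= 0);
  const int n = static_cast<int>(row.size());
  std::vector<float> out(row.size());
  for (int x = 0; x < n; ++x)
  {
    float sum = 0.0f;
    for (int k = -radius; k <= radius; ++k)
    {
      sum += row[std::max(0, std::min(n - 1, x + k))];
    }
    out[x] = sum / static_cast<float>(2 * radius + 1);
  }
  return out;
}

// One row of the weight mask: 1.0 in the left half, 0.0 in the right half, then blurred.
std::vector<float> makeRampRow(int width, int ramp_width)
{
  assert(width > 0);
  std::vector<float> row(width, 1.0f);
  std::fill(row.begin() + width / 2, row.end(), 0.0f);
  return boxBlurRow(row, ramp_width);
}
```

With width = 1200 and ramp_width = 300, the weights stay at 1.0 up to column 299, pass through roughly 0.5 at column 600, and reach 0.0 at column 900. The full transition therefore spans about 2 * ramp_width columns.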

Step4. Simple linear blending

  // TODO: Finish linear_blending.cpp.
  cv::Mat lin_blend = linearBlending(img_1, img_2, weights);
  showResult("Lab 4 - Linear blend", lin_blend);

Implement simple blending of two images using the weight mask you just created.

cv::Mat linearBlending(const cv::Mat& img_1, const cv::Mat& img_2, const cv::Mat& weights)
{
  // TODO: Blend the two images according to the weights: result = weights*img_1 + (1-weights)*img_2
  // No need to loop through all pixels!
  // Hint: https://docs.opencv.org/3.3.1/d1/d10/classcv_1_1MatExpr.html
  cv::Mat blended = weights.mul(img_1) + (cv::Scalar(1.0, 1.0, 1.0) - weights).mul(img_2);
  return blended;
}

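There is no need to loop over the pixels one by one: OpenCV's vectorised matrix expressions (mul()) are more efficient.
weights.mul(img_1) → multiplies the weight matrix and image 1 element by element
cv::Scalar(1.0, 1.0, 1.0) → a 3-channel scalar of all ones, so that 1 - weights is computed for every channel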

Refer to: https://docs.opencv.org/4.9.0/d1/d10/classcv_1_1MatExpr.html

Run the code and observe the result with ramp_width=300.

[Figure: linear blend result with ramp_width=300]

Step5. Laplace blending

  // TODO: Finish laplace_blending.cpp.
  cv::Mat lap_blend = laplaceBlending(img_1, img_2, weights);
  showResult("Lab 4 - Laplace blend", lap_blend);

Construct a Gaussian pyramid

std::vector<cv::Mat> constructGaussianPyramid(const cv::Mat& img)
{
  // Construct the pyramid starting with the original image.
  std::vector<cv::Mat> pyr;
  pyr.push_back(img.clone());

  // Add new downscaled images to the pyramid
  // until image width is <= 16 pixels
  while(pyr.back().cols > 16)
  {
    // TODO: Add the next level in the pyramid.
    // Hint cv::pyrDown(...)
    cv::Mat downsampled;
    cv::pyrDown(pyr.back(), downsampled);
    pyr.push_back(downsampled);
  }
  return pyr;
}

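Copy the image into the bottom level of the pyramid
- img.clone(): makes a deep copy of the input image
- pyr.push_back(): stores it as the bottom level of the pyramid (level 0)
Downsample in a loop
- Create an empty cv::Mat object, downsampled
- Call cv::pyrDown
  - Applies a Gaussian blur to the current last level (pyr.back())
  - Drops the even rows and columns to downsample by a factor of 2
  - Writes the result to downsampled
- Update the pyramid: append the result to the end of pyr
Return the complete pyramid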
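
To see how many levels this loop produces for the 1200x1200 lab images, the sketch below replays only the size arithmetic. It assumes cv::pyrDown's default output size of (cols + 1) / 2; the helper name `pyramidWidths` is hypothetical and not part of the lab template.

```cpp
#include <cassert>
#include <vector>

// Widths of the levels produced by the loop in constructGaussianPyramid(),
// assuming each downsampling step yields (width + 1) / 2 columns.
std::vector<int> pyramidWidths(int width, int min_width = 16)
{
  assert(width > 0);
  std::vector<int> widths{width};
  while (widths.back() > min_width)
  {
    widths.push_back((widths.back() + 1) / 2);
  }
  return widths;
}
```

For a 1200-pixel-wide image this gives 1200, 600, 300, 150, 75, 38, 19, 10, that is 8 levels, and the last level is the first one that is no wider than 16 pixels.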

Construct a Laplacian pyramid

std::vector<cv::Mat> constructLaplacianPyramid(const cv::Mat& img)
{
  // TODO: Use constructGaussianPyramid() to construct a laplacian pyramid.
  // Hint: cv::pyrUp(...)
  
  // Construct a Gaussian Pyramid
  std::vector<cv::Mat> gaussian_pyr = constructGaussianPyramid(img);
  std::vector<cv::Mat> laplacian_pyr;
  // Construct a Laplacian Pyramid
  for(size_t i = 0; i < gaussian_pyr.size() - 1; ++i){
    cv::Mat upsampled;
    cv::pyrUp(gaussian_pyr[i + 1], upsampled, gaussian_pyr[i].size());
    cv::Mat lap;
    cv::subtract(gaussian_pyr[i], upsampled, lap);
    laplacian_pyr.push_back(lap);
  }
  // Add the smallest image as the last layer
  laplacian_pyr.push_back(gaussian_pyr.back());
  return laplacian_pyr;
}

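First build the Gaussian pyramid (constructGaussianPyramid())
Compute the Laplacian pyramid
- Loop over every level of the Gaussian pyramid except the last one
  - Upsample
    - cv::pyrUp upsamples level i+1 of the pyramid so that its size matches level i
    - Upsampling first doubles the image size by inserting zero rows and columns, then convolves with the same Gaussian kernel multiplied by 4
  - Take the difference
    - The difference between the current Gaussian level and the upsampled result becomes the current Laplacian level
    - This level holds the high-frequency detail at the current scale
- Append the last level
  - The smallest Gaussian level becomes the last level of the Laplacian pyramid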

Reconstruct an image by collapsing a Laplacian pyramid

cv::Mat collapsePyramid(const std::vector<cv::Mat>& pyr)
{
  // TODO: Collapse the pyramid.

  cv::Mat current = pyr.back().clone(); 
  // Reconstruct the image
  for (int i = static_cast<int>(pyr.size()) - 2; i >= 0; --i){
    // Upsample the current image to the size of the next layer
    cv::Mat upsampled;
    cv::pyrUp(current, upsampled, pyr[i].size());
    current = upsampled + pyr[i];
  }
  return current;
}

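Rebuild the full-resolution image from the Laplacian pyramid. Each level adds back its high-frequency detail.
Start from the last level of the pyramid (the smallest and coarsest)
Reconstruction loop (from the second-to-last level up to level 0)
- Upsample: upsample the current result, current, to the size of the next finer level
- Add the detail level: add the upsampled result to the current Laplacian level (current = upsampled + pyr[i]) to recover the current Gaussian level
Return the reconstructed result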
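
The build and collapse steps can be sketched on a 1D signal with the standard library alone. This is a simplified illustration, not the lab implementation: `downsample` and `upsample` are hypothetical stand-ins for cv::pyrDown and cv::pyrUp (pair averaging and sample repetition instead of Gaussian filtering), and `Signal`, `buildLaplacian` and `collapse` are names made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Signal = std::vector<float>;

// Stand-in for cv::pyrDown: average neighbouring pairs (assumption of this sketch).
Signal downsample(const Signal& s)
{
  Signal out((s.size() + 1) / 2);
  for (std::size_t i = 0; i < out.size(); ++i)
  {
    const std::size_t a = 2 * i;
    const std::size_t b = (a + 1 < s.size()) ? a + 1 : a;
    out[i] = 0.5f * (s[a] + s[b]);
  }
  return out;
}

// Stand-in for cv::pyrUp: repeat each sample until the target size is reached.
Signal upsample(const Signal& s, std::size_t target_size)
{
  assert(!s.empty());
  Signal out(target_size);
  for (std::size_t i = 0; i < target_size; ++i)
  {
    const std::size_t src = (i / 2 < s.size()) ? i / 2 : s.size() - 1;
    out[i] = s[src];
  }
  return out;
}

// Build the Laplacian pyramid: detail levels first, coarsest Gaussian level last.
std::vector<Signal> buildLaplacian(const Signal& signal, std::size_t min_size)
{
  assert(!signal.empty() && min_size >= 1);
  std::vector<Signal> pyr;
  Signal current = signal;
  while (current.size() > min_size)
  {
    const Signal smaller = downsample(current);
    const Signal predicted = upsample(smaller, current.size());
    Signal detail(current.size());
    for (std::size_t i = 0; i < current.size(); ++i)
    {
      detail[i] = current[i] - predicted[i];
    }
    pyr.push_back(detail);
    current = smaller;
  }
  pyr.push_back(current);
  return pyr;
}

// Collapse: start at the coarsest level, upsample, and add each detail level back.
Signal collapse(const std::vector<Signal>& pyr)
{
  assert(!pyr.empty());
  Signal current = pyr.back();
  for (std::size_t i = pyr.size() - 1; i-- > 0;)
  {
    Signal up = upsample(current, pyr[i].size());
    for (std::size_t k = 0; k < up.size(); ++k)
    {
      up[k] += pyr[i][k];
    }
    current = up;
  }
  return current;
}
```

Collapsing an unmodified pyramid returns the input signal up to floating-point rounding, because every detail level stores exactly what the upsampled coarser level is missing. Blending changes the detail levels before this step, which is how the two images get mixed scale by scale.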

Perform the Laplace blending

cv::Mat laplaceBlending(const cv::Mat& img_1, const cv::Mat& img_2, const cv::Mat& weights)
{
  // Construct a gaussian pyramid of the weight image.
  // TODO: Finish constructGaussianPyramid().
  std::vector<cv::Mat> weights_pyr = constructGaussianPyramid(weights);

  // Construct a laplacian pyramid of each of the images.
  // TODO: Finish constructLaplacianPyramid().
  std::vector<cv::Mat> img_1_pyr = constructLaplacianPyramid(img_1);
  std::vector<cv::Mat> img_2_pyr = constructLaplacianPyramid(img_2);

  // Blend the laplacian pyramids according to the corresponding weight pyramid.
  std::vector<cv::Mat> blend_pyr(img_1_pyr.size());
  for (size_t i = 0; i < img_1_pyr.size(); ++i)
  {
    // TODO: Blend the images using linearBlending() on each pyramid level.
    blend_pyr[i] = linearBlending(img_1_pyr[i], img_2_pyr[i], weights_pyr[i]);
  }

  // Collapse the blended laplacian pyramid.
  // TODO: Finish collapsePyramid().
  return collapsePyramid(blend_pyr);
}