Article link: https://blog.csdn.net/feifeiiong/article/details/61614537

Discontinuous seam carving for video retargeting

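The purpose of this post is simple: I was unable to obtain a code implementation of the paper it covers, so I decided to write one myself. I am publishing the post first and will open-source the code once it is finished.
The work is now essentially complete; what remains is some efficiency optimization. The core ideas are published in this post. If you spot any errors, please leave a comment. The post is intended solely for academic sharing and discussion; please do not repost it.
Note: the code in this post runs, but it leaves out many tricks, so it will be very slow as written. For academic discussion, contact me or leave a comment below. My own implementation currently takes on the order of milliseconds per seam. Discussion is welcome.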

Preparation

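This post builds on seam carving, the classic algorithm for image and video resizing. Familiarity with that method is a prerequisite for reading this post. The relevant papers are: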

[SIGGRAPH 2007] Seam Carving for Content-Aware Image Resizing
[SIGGRAPH 2008] Improved Seam Carving for Video Retargeting
The corresponding code can be found on GitHub.

Core idea

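The paper's main contribution is a new energy function. With this energy function in place, the classic seam carving algorithm is applied to each frame of the video. The energy function has to preserve temporal and spatial coherence at the same time.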
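For readers who want a reminder of the classic step that the new energy function feeds into, here is a minimal sketch of the dynamic-programming search for a minimum-cost connected vertical seam, in the spirit of the 2007 paper. The function name `findVerticalSeam` and the use of nested `std::vector` instead of `cv::Mat` are assumptions made for illustration; this is not the code from this post's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the original post): returns, for every row,
// the column of the cheapest 8-connected vertical seam in a per-pixel cost map.
std::vector<int> findVerticalSeam(const std::vector<std::vector<std::int64_t>>& cost) {
    assert(!cost.empty() && !cost[0].empty());
    const int rows = static_cast<int>(cost.size());
    const int cols = static_cast<int>(cost[0].size());

    // acc[i][j] = cheapest cumulative cost of any seam that ends at pixel (i, j).
    std::vector<std::vector<std::int64_t>> acc(cost);
    for (int i = 1; i < rows; ++i) {
        for (int j = 0; j < cols; ++j) {
            std::int64_t best = acc[i - 1][j];
            if (j > 0) best = std::min(best, acc[i - 1][j - 1]);
            if (j + 1 < cols) best = std::min(best, acc[i - 1][j + 1]);
            acc[i][j] = cost[i][j] + best;
        }
    }

    // Backtrack from the cheapest cell of the last row.
    std::vector<int> seam(rows);
    const auto& last = acc[rows - 1];
    seam[rows - 1] = static_cast<int>(std::min_element(last.begin(), last.end()) - last.begin());
    for (int i = rows - 2; i >= 0; --i) {
        const int j = seam[i + 1];
        int bestJ = j;
        if (j > 0 && acc[i][j - 1] < acc[i][bestJ]) bestJ = j - 1;
        if (j + 1 < cols && acc[i][j + 1] < acc[i][bestJ]) bestJ = j + 1;
        seam[i] = bestJ;
    }
    return seam;
}
```

Calling `findVerticalSeam` on a cost map whose cheap pixels lie on the main diagonal returns the diagonal columns, one per row. In the video setting, the cost map would be the per-frame combined energy described in the following sections.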

Temporal coherence

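Preserving temporal coherence comes down to how the seams in adjacent frames are positioned relative to each other. If every frame is processed with the same seam, temporal coherence between frames is certainly preserved, but spatial coherence is destroyed. Following the approach proposed in the paper, the temporal coherence energy is computed as follows: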

void SeamCarver::calculateTempral(Mat& newEnergy, vector<uint> seam) {

    // Given the previous frame's seam, compute the temporal cost of every pixel.
    // This is the plain version: O(cols^2) work per row.
    vector<int> seamVal(energy.cols - 1);
    for (int i = 0; i < energy.rows; ++i) {
        const int pos = static_cast<int>(seam[i]);
        // Build the reference row: row i with the previous seam's pixel removed.
        for (int m = 0; m < energy.cols - 1; ++m) {
            seamVal[m] = static_cast<int>(energy.at<uint32_t>(i, m < pos ? m : m + 1));
        }
        for (int j = 0; j < energy.cols; ++j) {
            // Cost of removing pixel j: compare the row without pixel j
            // against the reference row.
            int totalEnergy = 0;
            // Pixels to the left of j keep their index.
            // Cast to int before subtracting to avoid unsigned wrap-around.
            for (int l = 0; l < j; ++l) {
                totalEnergy += abs(static_cast<int>(energy.at<uint32_t>(i, l)) - seamVal[l]);
            }
            // Pixels to the right of j shift one position to the left.
            for (int l = j + 1; l < energy.cols; ++l) {
                totalEnergy += abs(static_cast<int>(energy.at<uint32_t>(i, l)) - seamVal[l - 1]);
            }
            // Temporal term, weight 0.2; the result is truncated to an integer.
            newEnergy.at<uint32_t>(i, j) = static_cast<uint32_t>(totalEnergy * 0.2);
        }
    }
}
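
Written as a formula, the loops in calculateTempral compute the following for row i. The symbols E_i (row i of `energy`), s_i (the previous seam's column, `seam[i]`), R_i (the reference row, `seamVal`), W (`energy.cols`) and T_i are notation introduced here for readability. The formula mirrors the code in this post, which uses absolute differences, and is not quoted from the paper.

$$
R_i(m) =
\begin{cases}
E_i(m), & m < s_i \\
E_i(m+1), & m \ge s_i
\end{cases}
\qquad
T_i(j) = \sum_{l=0}^{j-1} \bigl\lvert E_i(l) - R_i(l) \bigr\rvert + \sum_{l=j+1}^{W-1} \bigl\lvert E_i(l) - R_i(l-1) \bigr\rvert
$$

The value stored in `newEnergy` at (i, j) is 0.2 · T_i(j), truncated to an integer.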

newEnergy stores the total cost of each pixel. Following the paper's method, the temporal cost is computed first.
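One possible speed-up for the temporal term, offered as a sketch rather than as one of the tricks the author has in mind: the row without pixel j and the reference row differ only between column j and the seam column, so every term outside that range is zero. The cost then reduces to a difference of prefix sums of adjacent absolute differences, which takes O(cols) per row instead of O(cols^2). The names `temporalCostRow` and `temporalCostRowNaive` are hypothetical, and plain `std::vector<int>` rows stand in for `cv::Mat`.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical reference version: same loops as calculateTempral, for one row,
// without the 0.2 weight. `pos` is the previous seam's column in this row.
std::vector<long long> temporalCostRowNaive(const std::vector<int>& row, int pos) {
    const int cols = static_cast<int>(row.size());
    assert(cols > 0 && pos >= 0 && pos < cols);
    std::vector<int> ref;  // the row with the seam pixel removed
    for (int m = 0; m < cols; ++m) {
        if (m != pos) ref.push_back(row[m]);
    }
    std::vector<long long> cost(cols, 0);
    for (int j = 0; j < cols; ++j) {
        for (int l = 0; l < j; ++l) cost[j] += std::abs(row[l] - ref[l]);
        for (int l = j + 1; l < cols; ++l) cost[j] += std::abs(row[l] - ref[l - 1]);
    }
    return cost;
}

// Hypothetical fast version: cost[j] = |prefix[j] - prefix[pos]|, where
// prefix[k] is the sum of |row[l + 1] - row[l]| over l in [0, k).
std::vector<long long> temporalCostRow(const std::vector<int>& row, int pos) {
    const int cols = static_cast<int>(row.size());
    assert(cols > 0 && pos >= 0 && pos < cols);
    std::vector<long long> prefix(cols, 0);
    for (int k = 1; k < cols; ++k) {
        prefix[k] = prefix[k - 1] + std::abs(row[k] - row[k - 1]);
    }
    std::vector<long long> cost(cols);
    for (int j = 0; j < cols; ++j) {
        cost[j] = std::abs(prefix[j] - prefix[pos]);
    }
    return cost;
}
```

Both functions return the same values for any row and seam position, so the naive one can serve as a check while optimizing. The 0.2 weight would be applied to the result afterwards, as in the function above.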

Spatial coherence

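The other core problem in video retargeting is preserving spatial coherence. The paper's authors argue that the effect of removing each pixel on overall spatial coherence can be computed. Here too, the previous frame's seam serves as the reference for computing each pixel's spatial coherence cost: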

void SeamCarver::calculateSpatial(Mat &newEnergy, vector<uint> seam) {
    // Spatial cost = horizontal cost + vertical cost, added to newEnergy with weight 0.8.
    for (int i = 0; i < energy.rows; ++i) {
        for (int j = 0; j < energy.cols; ++j) {
            int totalEnergy = 0;
            // Horizontal cost: how much the row's gradients change if pixel j is removed.
            // Values are cast to int before subtracting to avoid unsigned wrap-around.
            const int cur = static_cast<int>(energy.at<uint32_t>(i, j));
            if (j == 0 && j + 2 < energy.cols) { // two factors - border pixel
                const int next = static_cast<int>(energy.at<uint32_t>(i, j + 1));
                const int next2 = static_cast<int>(energy.at<uint32_t>(i, j + 2));
                const int before = abs(cur - next);
                const int after = abs(next - next2);
                totalEnergy += abs(before - after);
            } else if (j > 0 && j + 1 < energy.cols) { // three factors - interior pixel
                // The j > 0 check prevents reading column -1 on frames narrower than 3 columns.
                const int prev = static_cast<int>(energy.at<uint32_t>(i, j - 1));
                const int next = static_cast<int>(energy.at<uint32_t>(i, j + 1));
                const int before = abs(prev - cur) + abs(cur - next);
                const int after = abs(prev - next);
                totalEnergy += abs(before - after);
            }
            // Vertical cost, searched within 11 pixels according to the paper.
            // The paper is not clear enough about the vertical cost, so the seam is used here.
            int verticalEnergy = 0;
            if (i + 1 < energy.rows) {
                const int pos = static_cast<int>(seam[i + 1]); // seam pixel of row i + 1
                const int distance = abs(pos - j);
                if (distance > 11) {
                    // Outside the search window: assign an "infinite" value.
                    // INFINVAL is not defined in this snippet.
                    verticalEnergy = INFINVAL;
                } else if (j > pos) { // first case: j > pos
                    // The GV and GD operations (g_v, g_d_first, g_d_sec) must be
                    // defined first; they are not shown in this post.
                    for (int m = pos; m < j; m++) {
                        verticalEnergy += abs(g_v(i, m) - g_d_first(i, m));
                    }
                    for (int m = pos + 1; m <= j; m++) {
                        verticalEnergy += abs(g_v(i, m) - g_d_first(i, m - 1));
                    }
                } else if (j < pos) { // second case: j < pos, similar to the first
                    for (int m = j; m < pos; m++) {
                        verticalEnergy += abs(g_v(i, m) - g_d_sec(i, m));
                    }
                    for (int m = j + 1; m <= pos; m++) {
                        verticalEnergy += abs(g_v(i, m) - g_d_sec(i, m - 1));
                    }
                }
                // j == pos leaves verticalEnergy at 0.
            }
            totalEnergy += verticalEnergy;
            // All spatial energy is now accounted for; add it to the Mat with weight 0.8.
            newEnergy.at<uint32_t>(i, j) += static_cast<uint32_t>(0.8 * totalEnergy);
        }
    }
}
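
To make the horizontal part of the spatial cost easier to experiment with, here is a small standalone sketch that mirrors the horizontal branch of calculateSpatial on a single row. The names `horizontalCostAt` and `horizontalCostRow` are hypothetical, and a plain `std::vector<int>` replaces the `cv::Mat` row. The vertical part is left out because it depends on the g_v and g_d helpers, which the post does not show.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical helper: horizontal spatial cost of removing pixel j from `row`,
// following the same two cases as calculateSpatial.
int horizontalCostAt(const std::vector<int>& row, int j) {
    const int cols = static_cast<int>(row.size());
    assert(j >= 0 && j < cols);
    if (j == 0 && j + 2 < cols) {
        // Border pixel: compare the first gradient with the one that replaces it.
        return std::abs(std::abs(row[0] - row[1]) - std::abs(row[1] - row[2]));
    }
    if (j > 0 && j + 1 < cols) {
        // Interior pixel: two gradients disappear, one new gradient appears.
        const int before = std::abs(row[j - 1] - row[j]) + std::abs(row[j] - row[j + 1]);
        const int after = std::abs(row[j - 1] - row[j + 1]);
        return std::abs(before - after);
    }
    // Last column (and rows that are too short): no horizontal cost, as in the post.
    return 0;
}

// Hypothetical helper: horizontal cost for every pixel of one row.
std::vector<int> horizontalCostRow(const std::vector<int>& row) {
    std::vector<int> cost(row.size());
    for (int j = 0; j < static_cast<int>(row.size()); ++j) {
        cost[j] = horizontalCostAt(row, j);
    }
    return cost;
}
```

For the row {10, 10, 50, 10}, the bright pixel at index 2 gets the highest cost, since removing it flattens two strong gradients, while index 1 costs nothing because the gradient across it is unchanged by its removal.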