LK Optical Flow Performance: ACL vs. OpenCV

CNccion

Original article: https://blog.csdn.net/weixin_41965270/article/details/115702995

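1. What is optical flow

Optical flow is the instantaneous velocity of the pixel motion that a moving object in space produces on the imaging plane. As a method, it uses the change of pixels over time in an image sequence, together with the correlation between adjacent frames, to find the correspondence between the previous frame and the current frame, and from that computes the motion of objects between adjacent frames.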

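2. Five categories, by theoretical basis and mathematical method:

Gradient-based methods
Matching-based methods
Energy-based methods
Phase-based methods
Neurodynamic methods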

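3. Optical flow algorithms implemented in OpenCV

calcOpticalFlowPyrLK

Computes the optical flow for a given set of points (sparse optical flow) using the pyramidal Lucas-Kanade method.

Reference paper: Pyramidal Implementation of the Lucas Kanade Feature Tracker: Description of the Algorithm

calcOpticalFlowFarneback

Computes dense optical flow (that is, the flow of every pixel in the image) using Gunnar Farneback's algorithm.

Reference paper: Two-Frame Motion Estimation Based on Polynomial Expansion

CalcOpticalFlowBM

Computes optical flow by block matching.

CalcOpticalFlowHS

Computes dense optical flow using the Horn-Schunck algorithm.

Reference paper: Determining Optical Flow

calcOpticalFlowSF

An implementation of a paper presented at Eurographics 2012: "SimpleFlow: A Non-iterative, Sublinear Optical Flow Algorithm"

Project site: http://graphics.berkeley.edu/papers/Tao-SAN-2012-05/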

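4. How LK optical flow works

Both "Learning OpenCV 3" (Chinese edition), chapter 16, and "视觉SLAM十四讲" (Fourteen Lectures on Visual SLAM), lecture 8, explain the principle; please refer to them.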

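5. LK optical flow performance in ACL and OpenCV

5.1 ACL

ACL (Arm Compute Library) is an open-source project released by Arm. It aims to provide hardware-accelerated libraries on Arm platforms for developers working in image, video, multimedia, and computer vision. Project page: ACL

5.1.1 Build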

scons Werror=0 -j8 debug=0 asserts=0 neon=1 opencl=0 os=linux arch=arm64-v8a

5.1.2 Library version

Output of 'strings libarm_compute.so | grep arm_compute_version':

arm_compute_version=v21.02 Build options: {'os': 'linux', 'opencl': '0', 'neon': '1', 'asserts': '0', 'debug': '0', 'arch': 'arm64-v8a', 'Werror': '0'} Git hash=7dcb9fadb98cad05fca72de3273311d570d98b4e

5.1.3 Test platform

quad-core Arm Cortex-A53 756MHz CPU

5.2 Source code for the OpenCV and ACL LK comparison

#include <iostream>
#include <ctype.h>
#include <sys/time.h>
#include <cstdio>   /* printf */
#include <cstdlib>
#include <cstring>
#include <fstream>
#include <memory>
#include <random>
#include <string>
#include <tuple>
#include <vector>
/*opencv include*/
#include "opencv2/video/tracking.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/videoio.hpp"
#include "opencv2/highgui.hpp"
/*arm compute include*/
#include "arm_compute/runtime/NEON/NEFunctions.h"
#include "arm_compute/core/Helpers.h"
#include "arm_compute/core/ITensor.h"
#include "arm_compute/core/Types.h"
#include "arm_compute/core/Window.h"
#include "arm_compute/runtime/Tensor.h"
#include "Utils.h"


using namespace cv;
using namespace std;
using namespace arm_compute;

#define GETTIME(time)                                  \
{                                                      \
    struct timeval lTime;                              \
    gettimeofday(&lTime, 0);                           \
    *time = (lTime.tv_sec * 1000000 + lTime.tv_usec);  \
}

void help(char** argv)
{
    cout << "Call: " << argv[0] << " [image1] [image2]" << endl;
    cout << "Demonstrates Pyramid Lucas-Kanade optical flow." << endl;
}

int main( int argc, char** argv )
{
    if(argc != 3){
        help(argv);
        exit(1);
    }
    long  start_time1,end_time1;
    long  start_time2,end_time2;
    long  start_time3,end_time3;
    long  start_time4,end_time4;

    /*Arm Compute Library objects*/
    Pyramid               pyr_1st{};
    Pyramid               pyr_2nd{};
    NEGaussianPyramidHalf pyrf_1st{};
    NEGaussianPyramidHalf pyrf_2nd{};
    NEOpticalFlow         optkf{};
    Image                 src_1st{}, src_2nd{};
    KeyPointArray         input_points(1000);
    KeyPointArray         output_points(1000);
    KeyPointArray         point_estimates(1000);

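    /*Initialize, load two images from the file system, and
      allocate the images and other structures we will need
      for results*/
    cv::Mat imgA = cv::imread(argv[1], cv::IMREAD_GRAYSCALE);
    cv::Mat imgB = cv::imread(argv[2], cv::IMREAD_GRAYSCALE);
    int win_size = 10;
    /*load the second image as 3-channel BGR so the green/red flow lines stay visible*/
    cv::Mat  imgC = cv::imread(argv[2], cv::IMREAD_COLOR);
    if(imgA.empty() || imgB.empty() || imgC.empty()){
        cerr << "Failed to load the input images." << endl;
        exit(1);
    }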
    /*opencv read image  transfer to opticalFlow*/
    /*src_1st image data*/
    printf("imgA.cols:%d -- imgA.rows:%d\n",imgA.cols, imgA.rows);
    src_1st.allocator()->init(TensorInfo(imgA.cols,imgA.rows, Format::U8));
    src_1st.allocator()->allocate();
    memcpy(src_1st.buffer(), imgA.ptr(), imgA.cols * imgA.rows);
    save_to_ppm(src_1st,"1st.ppm");
    /*src_2nd image data*/
    printf("imgB.cols:%d -- imgB.rows:%d\n",imgB.cols, imgB.rows);
    src_2nd.allocator()->init(TensorInfo(imgB.cols,imgB.rows, Format::U8));
    src_2nd.allocator()->allocate();
    memcpy(src_2nd.buffer(), imgB.ptr(), imgB.cols * imgB.rows);
    save_to_ppm(src_2nd,"2nd.ppm");


    vector<cv::Point2f> cornersA, cornersB;
    const  int MAX_CORNERS = 500;
    static int seq_num = 0;
    printf("######## %d ########\n",seq_num++);
    //good features track
    GETTIME(&start_time1);
    cv::goodFeaturesToTrack(imgA, cornersA, MAX_CORNERS, 0.01, 5, cv::noArray(), 3, false, 0.04);
    GETTIME(&end_time1);
    printf("goodFeaturesToTrack time:%ldus\n",end_time1 - start_time1);

    //cornersubpix
    GETTIME(&start_time2);
    cv::cornerSubPix(imgA, cornersA, cv::Size(win_size, win_size), cv::Size(-1, -1), cv::TermCriteria(cv::TermCriteria::MAX_ITER|cv::TermCriteria::EPS,20,0.03));
    GETTIME(&end_time2);
    printf("cornerSubPix time:%ldus\n",end_time2 - start_time2);

    input_points.clear();
    input_points.resize(cornersA.size());
    point_estimates.resize(cornersA.size());
    printf("cornersA.size:%zu -- input_points.size:%zu -- point_estimates.size:%zu\n",cornersA.size(),input_points.num_values(),point_estimates.num_values());
    /*tracking points point_estimate*/
    for(size_t k = 0; k < cornersA.size(); k++){
        auto &keypoint1           = input_points.at(k);
        auto &keypoint2           = point_estimates.at(k);
        keypoint1.x = keypoint2.x = cornersA.at(k).x;
        keypoint1.y = keypoint2.y = cornersA.at(k).y;
        keypoint1.tracking_status = keypoint2.tracking_status = 1;
    }
    


    //call the Lucas Kanade algorithm for neon LK
    const unsigned int num_levels = 5;
    // Initialise and allocate pyramids
    PyramidInfo pyramid_info(num_levels, SCALE_PYRAMID_HALF, src_1st.info()->tensor_shape(), src_1st.info()->format());
    pyr_1st.init_auto_padding(pyramid_info);
    pyr_2nd.init_auto_padding(pyramid_info);

    pyrf_1st.configure(&src_1st, &pyr_1st, BorderMode::UNDEFINED, 0);
    pyrf_2nd.configure(&src_2nd, &pyr_2nd, BorderMode::UNDEFINED, 0);
    output_points.resize(input_points.num_values());
    optkf.configure(&pyr_1st, &pyr_2nd,
                        &input_points, &point_estimates, &output_points,
                        Termination::TERM_CRITERIA_BOTH , 0.3f, 20, 21, true, BorderMode::UNDEFINED, 0);
    pyr_1st.allocate();
    pyr_2nd.allocate();
    //Execute the functions:
    GETTIME(&start_time4);
    pyrf_1st.run();
    pyrf_2nd.run();
    optkf.run();
    GETTIME(&end_time4);
    printf("neon lk time:%ldus\n",end_time4 - start_time4);


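    //call the Lucas Kanade algorithm for opencv
    //note: maxLevel is 0-based, so 5 here allows up to 6 pyramid levels,
    //whereas num_levels = 5 gives the ACL pyramid 5 levels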
    vector<uchar> features_found;
    GETTIME(&start_time3);
    cv::calcOpticalFlowPyrLK(imgA,imgB,cornersA,cornersB,features_found,noArray(),cv::Size(win_size*2+1,win_size*2+1),5,cv::TermCriteria(cv::TermCriteria::MAX_ITER|cv::TermCriteria::EPS,20,0.3));
    GETTIME(&end_time3);
    printf("calcOpticalFlowPyrLK time:%ldus\n", end_time3 - start_time3);
    //Now make some image of what we are looking at:
    //Note that if you want to track cornersB further,i.e.
    //pass them as input to the next calcOpticalFlowPyrLK,
    //you would need to "compress" the vector, i.e.,exclude points for which
    //features_found[i] == false.

    for(int i = 0; i < (int)cornersA.size(); i++){
        if(!features_found[i]){
            continue;
        }
        line(imgC, cornersA[i], cornersB[i], Scalar(0, 255, 0), 2, cv::LINE_AA);
    }
    for(int i = 0; i < (int)output_points.num_values();i++){
        auto kp = output_points.at(i);
        if(!kp.tracking_status){
            continue;
        }
        cv::Point2f tmp;
        tmp.x = kp.x;
        tmp.y = kp.y;
        line(imgC,cornersA[i],tmp,Scalar(0, 0, 255), 2, cv::LINE_AA);
    }

    cv::imwrite("imageA.jpg",imgA);
    cv::imwrite("imageB.jpg",imgB);
    cv::imwrite("LK.jpg",imgC);

    return 0;
}

5.3 Result images saved on the Arm platform

OpenCV result image

ACL result image

Comparison of the results

5.4 Time consumption comparison

######## 1 ########
neon lk time:79194us
calcOpticalFlowPyrLK time:78242us

######## 2 ########
neon lk time:81771us
calcOpticalFlowPyrLK time:81385us

######## 3 ########
neon lk time:88099us
calcOpticalFlowPyrLK time:79475us

######## 4 ########
neon lk time:78677us
calcOpticalFlowPyrLK time:77828us

######## 5 ########
neon lk time:77822us
calcOpticalFlowPyrLK time:77820us

######## 6 ########
neon lk time:79310us
calcOpticalFlowPyrLK time:78403us

######## 7 ########
neon lk time:86065us
calcOpticalFlowPyrLK time:78048us

######## 8 ########
neon lk time:79414us
calcOpticalFlowPyrLK time:79199us

######## 9 ########
neon lk time:77931us
calcOpticalFlowPyrLK time:77827us

######## 10 ########
neon lk time:78240us
calcOpticalFlowPyrLK time:78007us

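6. Conclusion

With the LK input parameters kept the same (21x21 window, at most 20 iterations, epsilon 0.3), ACL and OpenCV perform almost identically: over the 10 runs above, ACL's NEON LK averaged about 80.7 ms and OpenCV's calcOpticalFlowPyrLK about 78.6 ms, a difference of under 3%.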