C++ OpenCL Hardware Acceleration

Overview

Development environment: RK3588
Goal: use OpenCL hardware acceleration to reduce CPU usage

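System support

The RK3588 family of chips supports GPU hardware acceleration, and the vendor provides some convenience interfaces for it.
The rest is mostly building and setting up the required libraries. This post focuses on video decoding and image processing.
The main libraries involved are libmali.so, OpenCV, and FFmpeg.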

libmali.so library

My board is a Firefly RK3588J, and the vendor firmware already ships this library. To check whether it is present, run sudo find /usr -name libmali.so on the command line:

firefly@firefly:~# find /usr -name libmali.so
/usr/lib/aarch64-linux-gnu/libmali.so
firefly@firefly:~# strings /usr/lib/aarch64-linux-gnu/libmali.so | grep Mali-G610
Mali-G610
 
firefly@firefly:~# strings /usr/lib/aarch64-linux-gnu/libmali.so | grep cl
.....
clReleaseCommandBufferKHR
clReleaseCommandQueue
clReleaseContext
clReleaseDevice
clReleaseEvent
clReleaseKernel
clReleaseMemObject
.....
 
firefly@firefly:~# ls -l /usr/lib/aarch64-linux-gnu/libmali.so
lrwxrwxrwx 1 1007 1008 12 Jul 29  2020 /usr/lib/aarch64-linux-gnu/libmali.so -> libmali.so.1

If the library is missing, it can be downloaded and installed from the official site.
Reference: https://blog.csdn.net/Graceful_scenery/article/details/135783830
Official link: https://developer.arm.com/downloads/-/mali-drivers/user-space

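FFmpeg library

FFmpeg is used mainly to pull and decode the stream, here together with OpenCV.
Note that if you build OpenCV and FFmpeg yourself, the two versions must match; otherwise the OpenCV build will not find the corresponding FFmpeg decoding libraries.
I use OpenCV 3.4.15 with FFmpeg 3.4.13 (which version pairs with which can be looked up in the OpenCV configuration files).
The build itself is not covered here.
Reference: https://blog.csdn.net/qq_21331593/article/details/135275482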

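OpenCV library

Building this library is not covered here either; there are plenty of guides online.
One pitfall is worth recording, though.
After building OpenCV from source with OpenCL enabled, image processing with OpenCL GPU acceleration failed with an error.
The error looks like the one below. (Mine occurred in resize; the log shown here is another instance picked at random, and the fix is the same.)
(PS: this is noted at this point because the fix requires rebuilding from source, so apply the change before building.)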

OpenCL program build log: imgproc/color_rgb
Status -11: CL_BUILD_PROGRAM_FAILURE
-D depth=0 -D scn=3 -D PIX_PER_WI_Y=1 -D dcn=1 -D bidx=0 -D STRIPE_SIZE=1
<built-in>:167:9: error: expected member name or ';' after declaration specifiers
int32_t depth;             /**< The image depth. */
~~~~~~~ ^
<built-in>:1:15: note: expanded from here
#define depth 0
              ^

<built-in>:167:8: error: expected ';' at end of declaration list
int32_t depth;             /**< The image depth. */
       ^

error: Compiler frontend failed (error code 62)

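The meaning is simple: a name clash. The build option -D depth=0 defines a macro named depth, and that macro collides with the depth member declared in the compiler's built-in header.
The fix is straightforward.
Go to the OpenCV source directory; mine is /opt/opencv/opencv-3.4.15/modules/imgproc/src.
Taking cv::resize as the example, open resize.cpp.
Search the file for 'depth='.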
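The search finds three matches, but only the last two need to change: replace 'depth=' with 'ocl_depth='.
(PS: changing all three also works, but then resize always goes through the OpenCL path, so it fails if OpenCL is disabled or the hardware does not support it.)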

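Test demo

At this point the environment is basically set up; what remains is testing.
OpenCV itself supports OpenCL acceleration, so calling its ocl module is enough; no kernel code has to be written by hand.
First create a directory: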

(base) firefly@firefly:~$ mkdir demoProject
(base) firefly@firefly:~$ cd demoProject/
(base) firefly@firefly:~/demoProject$

Create two files, demo.cpp and CMakeLists.txt:

(base) firefly@firefly:~/demoProject$ touch demo.cpp
(base) firefly@firefly:~/demoProject$ touch CMakeLists.txt

demo.cpp

#include <iostream>
#include <string>

#include <opencv2/opencv.hpp>
#include <opencv2/core/ocl.hpp>

int main() {
    // Ask OpenCV to route UMat operations through OpenCL when it is available.
    cv::ocl::setUseOpenCL(true);

    std::cout << "has cl:" << cv::ocl::haveOpenCL() << std::endl;
    std::cout << "use cl:" << cv::ocl::useOpenCL() << std::endl;

    const std::string url = "rtsp://admin:Aa123456@192.168.5.165:554";

    // Open the RTSP stream through the FFmpeg backend.
    cv::VideoCapture cap(url, cv::CAP_FFMPEG);
    if (!cap.isOpened()) {
        std::cout << "stream open failed" << std::endl;
        return 1;
    }
    std::cout << "stream open success" << std::endl;

    // UMat lets OpenCV keep the frame data on the OpenCL device.
    cv::UMat gpu_frame;
    cv::UMat gpu_result;

    while (true) {
        // Skip frames that could not be read.
        if (!cap.read(gpu_frame) || gpu_frame.empty()) {
            continue;
        }
        // fx and fy are 0 because the target size is given explicitly.
        cv::resize(gpu_frame, gpu_result, cv::Size(640, 360), 0, 0, cv::INTER_LINEAR);

        cv::imwrite("output_cpp.png", gpu_result);
        //cv::imshow("ls", gpu_result);
    }

    // Not reached: the loop above runs until the process is killed.
    cap.release();
    cv::ocl::setUseOpenCL(false);
    return 0;
}

CMakeLists.txt

cmake_minimum_required(VERSION 3.10)
project(demo)

# Set the C++ standard
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_STANDARD_REQUIRED True)

# Find the OpenCV package
find_package(OpenCV REQUIRED)

# Add the OpenCV header directories
include_directories(${OpenCV_INCLUDE_DIRS})


add_executable(demo
               demo.cpp
               )

# Link against the OpenCV libraries
target_link_libraries(demo ${OpenCV_LIBS})

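Then create a build directory, build, and run. (In the log below CMake picks up OpenCV 4.2.0 from /usr rather than the self-built 3.4.15; to use a self-built OpenCV, point OpenCV_DIR at the directory containing its OpenCVConfig.cmake.)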

(base) firefly@firefly:~/demoProject$ mkdir build
(base) firefly@firefly:~/demoProject$ cd build/
(base) firefly@firefly:~/demoProject/build$ cmake ../
-- The C compiler identification is GNU 9.4.0
-- The CXX compiler identification is GNU 9.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenCV: /usr (found version "4.2.0")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/firefly/demoProject/build
(base) firefly@firefly:~/demoProject/build$ make
Scanning dependencies of target demo
[ 50%] Building CXX object CMakeFiles/demo.dir/demo.cpp.o
[100%] Linking CXX executable demo
[100%] Built target demo
(base) firefly@firefly:~/demoProject/build$
(base) firefly@firefly:~/demoProject/build$ ./demo
arm_release_ver of this libmali is 'g6p0-01eac0', rk_so_ver is '7'.
has cl:1
use cl:1
stream open success

CPU usage is now around 20%.
Check GPU usage:

(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
4@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
4@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
2@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
2@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
3@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
3@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
4@300000000Hz
(base) firefly@firefly:~/demoProject/build$

The GPU is being used continuously.
Now disable OpenCL, rebuild, and run the demo again:

cv::ocl::setUseOpenCL(false);

With use cl disabled, CPU usage rises to around 30%.
Check GPU usage:

(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
0@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
0@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
0@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
0@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
0@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
0@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
0@300000000Hz

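Summary

That concludes the test. The demo uses only a single resize operation, yet it already saves roughly 10 percentage points of CPU usage (about 30% without OpenCL versus about 20% with it).
So GPU acceleration gives a clear improvement for image processing, at least in this test.
However, Rockchip's official examples say very little about OpenCL. My guess is that there are still some bugs when it is used from multiple threads: I tried packaging the demo as a shared library and found that the ocl calls failed, for reasons I have not yet worked out. If anyone has looked into this, please leave a comment.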
