Overview
Development environment: RK3588
Goal: use OpenCL for hardware acceleration to reduce CPU usage
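System support
The RK3588 family supports GPU hardware acceleration, and the vendor provides some convenience interfaces for it.
What remains is building and setting up the libraries. This article focuses on video decoding and image processing.
The main libraries involved are libmali.so, opencv and ffmpeg.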
The libmali.so library
My board is a Firefly RK3588J, and the official firmware already ships this library. You can check for it on the command line with sudo find /usr -name libmali.so
firefly@firefly:~# find /usr -name libmali.so
/usr/lib/aarch64-linux-gnu/libmali.so
firefly@firefly:~# strings /usr/lib/aarch64-linux-gnu/libmali.so | grep Mali-G610
Mali-G610
firefly@firefly:~# strings /usr/lib/aarch64-linux-gnu/libmali.so | grep cl
.....
clReleaseCommandBufferKHR
clReleaseCommandQueue
clReleaseContext
clReleaseDevice
clReleaseEvent
clReleaseKernel
clReleaseMemObject
.....
firefly@firefly:~# ls -l /usr/lib/aarch64-linux-gnu/libmali.so
lrwxrwxrwx 1 1007 1008 12 Jul 29 2020 /usr/lib/aarch64-linux-gnu/libmali.so -> libmali.so.1
If the library is missing, you can download and install it from the official site.
Reference: https://blog.csdn.net/Graceful_scenery/article/details/135783830
Official link: https://developer.arm.com/downloads/-/mali-drivers/user-space
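The strings | grep check shown above can also be done from a program, for example as a startup sanity check. The following is a minimal sketch that only uses the C++ standard library; the function name fileContainsToken is made up for this illustration, and it simply searches the raw bytes of the file, so it is no substitute for actually loading the library.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Returns true if the file at `path` contains the byte sequence `token`.
// Returns false if the file cannot be opened.
// (Hypothetical helper for illustration; not part of any library.)
bool fileContainsToken(const std::string& path, const std::string& token) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return false;
    }
    // Read the whole file into memory, then do a plain substring search.
    const std::string data((std::istreambuf_iterator<char>(in)),
                           std::istreambuf_iterator<char>());
    return data.find(token) != std::string::npos;
}
```

On the board used here, fileContainsToken("/usr/lib/aarch64-linux-gnu/libmali.so", "clReleaseContext") should correspond to the grep output above, and searching for "Mali-G610" to the GPU model check.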
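The ffmpeg library
ffmpeg is mainly used to pull and decode the video stream; here it is used together with opencv.
Note that if you build opencv and ffmpeg yourself, their versions must match; otherwise the opencv build will not find the corresponding ffmpeg decoding libraries.
I used opencv 3.4.15 with ffmpeg 3.4.13 (to find out which versions pair up, look in the opencv configuration files).
The build steps are not covered here.
Reference: https://blog.csdn.net/qq_21331593/article/details/135275482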
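The opencv library
Building the library is not covered here either; there are plenty of guides online.
Here is one pitfall to avoid.
After building opencv from source with OpenCL enabled, GPU-accelerated image processing through OpenCL failed at run time.
The error looked like the following. (Mine occurred in resize; the log below is a similar one from another kernel, and the fix is the same.)
(PS: I mention this here because the fix requires rebuilding from source, so it is best to patch before building.)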
OpenCL program build log: imgproc/color_rgb
Status -11: CL_BUILD_PROGRAM_FAILURE
-D depth=0 -D scn=3 -D PIX_PER_WI_Y=1 -D dcn=1 -D bidx=0 -D STRIPE_SIZE=1
<built-in>:167:9: error: expected member name or ';' after declaration specifiers
int32_t depth; /**< The image depth. */
~~~~~~~ ^
<built-in>:1:15: note: expanded from here
#define depth 0
^
<built-in>:167:8: error: expected ';' at end of declaration list
int32_t depth; /**< The image depth. */
^
error: Compiler frontend failed (error code 62)
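The meaning is simple: a name collision. The build option -D depth=0 defines a macro named depth, which clashes with the struct member int32_t depth declared in the compiler's built-in header.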
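The same kind of collision can be reproduced in plain C++, which may make the log easier to read. This is a minimal sketch, not code from opencv or from the Mali compiler: ImageDesc and depthFromBuildOption are hypothetical names standing in for the struct in the built-in header and for kernel code that reads the build option.

```cpp
#include <cstdint>

// Hypothetical stand-in for a struct declared in the compiler's built-in header.
// If "#define depth 0" were in effect before this point, the member below would
// expand to "int32_t 0;", which is the "expected member name" error in the log.
struct ImageDesc {
    int32_t depth;  // the image depth
};

// Renamed macro, analogous to passing -D ocl_depth=0 instead of -D depth=0.
// It no longer touches the identifier "depth" used by the struct.
#define ocl_depth 0

// Code that needs the build option reads the renamed macro.
int depthFromBuildOption() {
    return ocl_depth;
}
```

With the macro renamed, the struct member and the build option can coexist, which is the idea behind changing 'depth=' to 'ocl_depth=' in the opencv sources.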
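The fix is straightforward.
Go to the opencv source directory; mine is /opt/opencv/opencv-3.4.15/modules/imgproc/src
Taking cv::resize as the example, open resize.cpp
Search the file for 'depth='
There are three matches, but only the last two need changing: replace 'depth=' with 'ocl_depth='.
(PS: changing all three also works, but then resize would always go through the OpenCL path; if OpenCL is disabled or the hardware does not support it, resize would fail.)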
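Test demo
At this point the environment is basically set up; what remains is testing.
Because opencv itself supports OpenCL acceleration, you can call its ocl module directly without writing any kernel code yourself.
First create a folder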
(base) firefly@firefly:~$ mkdir demoProject
(base) firefly@firefly:~$ cd demoProject/
(base) firefly@firefly:~/demoProject$
Create two files, demo.cpp and CMakeLists.txt
(base) firefly@firefly:~/demoProject$ touch demo.cpp
(base) firefly@firefly:~/demoProject$ touch CMakeLists.txt
demo.cpp
#include <iostream>
#include <opencv2/opencv.hpp>
#include <opencv2/core/ocl.hpp>

int main(int argc, char** argv) {
    // Enable OpenCV's OpenCL path, then report whether OpenCL is available and in use.
    cv::ocl::setUseOpenCL(true);
    bool ret1 = cv::ocl::haveOpenCL();
    bool ret2 = cv::ocl::useOpenCL();
    std::cout << "has cl:" << ret1 << std::endl;
    std::cout << "use cl:" << ret2 << std::endl;

    // Open the RTSP stream through the ffmpeg backend.
    cv::String url = "rtsp://admin:Aa123456@192.168.5.165:554";
    cv::VideoCapture cap(url, cv::CAP_FFMPEG);
    if (!cap.isOpened()) {
        std::cout << "stream open failed" << std::endl;
        return 1;
    }
    std::cout << "stream open success" << std::endl;

    // UMat lets opencv dispatch resize to OpenCL when it is enabled.
    cv::UMat gpu_frame;
    cv::UMat gpu_result;
    while (true) {
        // Stop when the stream ends or a frame cannot be read.
        if (!cap.read(gpu_frame) || gpu_frame.empty()) {
            break;
        }
        // fx and fy must be passed explicitly so that INTER_LINEAR lands in the
        // interpolation parameter rather than in fx.
        cv::resize(gpu_frame, gpu_result, cv::Size(640, 360), 0, 0, cv::INTER_LINEAR);
        cv::imwrite("output_cpp.png", gpu_result);
        //cv::imshow("ls", gpu_result);
    }
    cap.release();
    cv::ocl::setUseOpenCL(false);
    return 0;
}
CMakeLists.txt
cmake_minimum_required(VERSION 3.10)
project(demo)
# Set the C++ standard
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_STANDARD_REQUIRED True)
# Find the OpenCV library
find_package(OpenCV REQUIRED)
# Add the OpenCV header directories
include_directories(${OpenCV_INCLUDE_DIRS})
add_executable(demo
    demo.cpp
)
# Link against the OpenCV libraries
target_link_libraries(demo ${OpenCV_LIBS})
Then create a build directory, compile, and run.
(base) firefly@firefly:~/demoProject$ mkdir build
(base) firefly@firefly:~/demoProject$ cd build/
(base) firefly@firefly:~/demoProject/build$ cmake ../
-- The C compiler identification is GNU 9.4.0
-- The CXX compiler identification is GNU 9.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found OpenCV: /usr (found version "4.2.0")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/firefly/demoProject/build
(base) firefly@firefly:~/demoProject/build$ make
Scanning dependencies of target demo
[ 50%] Building CXX object CMakeFiles/demo.dir/demo.cpp.o
[100%] Linking CXX executable demo
[100%] Built target demo
(base) firefly@firefly:~/demoProject/build$
(base) firefly@firefly:~/demoProject/build$ ./demo
arm_release_ver of this libmali is 'g6p0-01eac0', rk_so_ver is '7'.
has cl:1
use cl:1
stream open success
At this point CPU usage is around 20%.
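How the CPU figure was measured is not stated here. One common approach on Linux is to sample the aggregate cpu line of /proc/stat twice and compare the deltas; the sketch below assumes that method. The names CpuTimes, parseCpuLine and cpuUsagePercent are made up for this example, and the result is whole-system usage, not the usage of the demo process alone.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Busy and total jiffies taken from the aggregate "cpu" line of /proc/stat.
struct CpuTimes {
    std::uint64_t busy = 0;
    std::uint64_t total = 0;
};

// Parses a line such as "cpu  100 0 50 800 50 0 0 0 0 0".
// The first eight fields are: user nice system idle iowait irq softirq steal.
// The trailing guest fields are skipped because they are already part of user/nice.
CpuTimes parseCpuLine(const std::string& line) {
    std::istringstream in(line);
    std::string label;
    in >> label;  // "cpu"
    std::uint64_t v[8] = {0};
    for (int i = 0; i < 8 && (in >> v[i]); ++i) {
    }
    CpuTimes t;
    for (std::uint64_t x : v) {
        t.total += x;
    }
    const std::uint64_t idle = v[3] + v[4];  // idle + iowait
    t.busy = t.total - idle;
    return t;
}

// CPU usage in percent between two samples; returns 0 if no time has elapsed.
double cpuUsagePercent(const CpuTimes& before, const CpuTimes& after) {
    const std::uint64_t dTotal = after.total - before.total;
    if (dTotal == 0) {
        return 0.0;
    }
    const std::uint64_t dBusy = after.busy - before.busy;
    return 100.0 * static_cast<double>(dBusy) / static_cast<double>(dTotal);
}
```

In practice you would read the first line of /proc/stat, wait a second or so, read it again, and pass both parsed samples to cpuUsagePercent. For a per-process figure, top or /proc/<pid>/stat is the more direct source.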
Check GPU usage
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
4@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
4@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
2@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
2@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
3@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
3@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
4@300000000Hz
(base) firefly@firefly:~/demoProject/build$
The GPU is clearly being used the whole time.
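To log these readings instead of running cat by hand, the load string can be parsed in code. The sketch below assumes the format seen in the output above, <load>@<frequency>Hz, and that the number before @ is the GPU utilisation; GpuLoad and parseGpuLoad are hypothetical names used only for this illustration.

```cpp
#include <cstdio>
#include <string>

// One reading from the devfreq "load" file, e.g. "4@300000000Hz".
struct GpuLoad {
    int load = 0;      // number before '@' (assumed to be utilisation)
    long long hz = 0;  // current GPU frequency in Hz
};

// Parses "<load>@<frequency>Hz". Returns false if the text does not match.
bool parseGpuLoad(const std::string& text, GpuLoad& out) {
    int load = 0;
    long long hz = 0;
    char h = 0;
    char z = 0;
    // The two %c conversions capture the "Hz" suffix so it can be verified.
    if (std::sscanf(text.c_str(), "%d@%lld%c%c", &load, &hz, &h, &z) != 4) {
        return false;
    }
    if (h != 'H' || z != 'z') {
        return false;
    }
    out.load = load;
    out.hz = hz;
    return true;
}
```

A monitoring loop would read /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load once per interval with std::ifstream and std::getline, then pass the line to parseGpuLoad.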
Now disable OpenCL, rebuild, and run the demo again
cv::ocl::setUseOpenCL(false);
With use cl disabled, CPU usage rose to around 30%.
Check GPU usage
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
0@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
0@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
0@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
0@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
0@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
0@300000000Hz
(base) firefly@firefly:~/demoProject/build$ cat /sys/devices/platform/fb000000.gpu/devfreq/fb000000.gpu/load
0@300000000Hz
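Summary
That concludes the test. The demo performs only a single resize operation, yet it already saves about 10 percentage points of CPU usage (from roughly 30% down to roughly 20%).
GPU acceleration does bring a clear improvement for image processing.
However, Rockchip's official examples say very little about OpenCL. My guess is that this part still has some bugs under multithreaded use: when I packaged the demo as a shared library, the ocl calls failed, and I do not yet know why. If you have looked into this, please leave a comment.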