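1. Background
1.1 RK3588 Chip Features
The Rockchip RK3588 is a high-performance SoC for the AIoT market, built on an 8nm process. It includes:
- 4x Cortex-A76 + 4x Cortex-A55 in a big.LITTLE configuration
- Mali-G610 MP4 GPU (supports Vulkan 1.2/OpenCL 2.2)
- 6 TOPS NPU (not covered in this test)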
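1.2 Why MNN?
MNN (Mobile Neural Network), the inference engine open-sourced by Alibaba, offers the following advantages:
- Multi-platform support: covers iOS, Android, Linux, and Windows
- Heterogeneous computing: supports CPU, GPU, and NPU backends
- Lightweight: the core library is only about 500 KB
- Quantized acceleration: supports FP16 and INT8 quantization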
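1.3 Test Objectives
The ResNet50 model is used to compare the performance of different compute backends:
- CPU: general-purpose computation, used to establish baseline performance
- Vulkan: a modern cross-platform graphics and compute API with low-overhead parallel computation
- OpenCL: a general-purpose heterogeneous computing standard that supports many types of accelerators
- Quantization comparison: finding the balance point between accuracy and speed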
2. References
3. Procedure
3.1 Setting Up the Vulkan Environment
# Install the official Mali GPU driver (includes Vulkan support)
wget https://repo.rock-chips.com/edge/debian-release-v2.0.0/pool/main/r/rockchip-mali/rockchip-mali_1.9-12_arm64.deb
sudo dpkg -i rockchip-mali_1.9-12_arm64.deb
# Create a symbolic link so the driver can be found under the name libmali.so
sudo ln -s /usr/lib/aarch64-linux-gnu/libmali-valhall-g610-g6p0-wayland-gbm-vulkan.so /usr/lib/aarch64-linux-gnu/libmali.so
# Write the Vulkan driver manifest (ICD file)
sudo mkdir -p /etc/vulkan/icd.d/
echo '{
"file_format_version": "1.0.0",
"ICD": {
"library_path": "/usr/lib/aarch64-linux-gnu/libmali-valhall-g610-g6p0-wayland-gbm-vulkan.so",
"api_version": "1.0.0"
}
}' | sudo tee /etc/vulkan/icd.d/mali.json
sudo apt install vulkan-tools vulkan-utils -y
vulkaninfo
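Key configuration notes:
- libmali.so is the unified driver entry point for the Mali GPU
- The ICD (Installable Client Driver) file declares the path to the Vulkan driver
- The vulkaninfo tool is used to verify that the driver was installed successfully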
If everything is working, the console prints:
arm_release_ver of this libmali is 'g6p0-01eac0', rk_so_ver is '10'.
'DISPLAY' environment variable not set... skipping surface info
arm_release_ver of this libmali is 'g6p0-01eac0', rk_so_ver is '10'.
==========
VULKANINFO
==========
Vulkan Instance Version: 1.2.131
Instance Extensions: count = 10
====================
VK_EXT_debug_report : extension revision 9
VK_EXT_debug_utils : extension revision 1
VK_EXT_headless_surface : extension revision 1
VK_KHR_device_group_creation : extension revision 1
VK_KHR_display : extension revision 23
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_surface : extension revision 25
Layers: count = 0
=======
Presentable Surfaces:
=====================
Groups:
=======
Device Group Properties (Group 0):
physicalDeviceCount: count = 1
Mali-LODX (ID: 0)
subsetAllocation = 0
Device Group Present Capabilities (Group 0):
arm_release_ver of this libmali is 'g6p0-01eac0', rk_so_ver is '10'.
Mali-LODX (ID: 0)
Can present images from the following devices:
Mali-LODX (ID: 0)
Present modes:
DEVICE_GROUP_PRESENT_MODE_LOCAL_BIT_KHR
Device Properties and Extensions:
=================================
GPU0:
VkPhysicalDeviceProperties:
---------------------------
apiVersion = 4202661 (1.2.165)
driverVersion = 25165824 (0x1800000)
vendorID = 0x13b5
deviceID = 0xa8670000
deviceType = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
deviceName = Mali-LODX
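If vulkaninfo does not list the Mali device, one thing worth checking is whether the ICD manifest points at a library that actually exists. The following is a minimal sketch of such a check. The helper names icd_library_path and check_icd are hypothetical, and the sed pattern assumes library_path sits on a single line, as in the manifest written above.

```shell
#!/bin/sh
# Hypothetical helper: print the library_path declared in a Vulkan ICD manifest.
# Uses sed so that no JSON tool (such as jq) has to be installed.
icd_library_path() {
    sed -n 's/.*"library_path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Hypothetical helper: succeed only if the manifest exists and its
# library_path refers to a file that is present on disk.
check_icd() {
    manifest=$1
    [ -f "$manifest" ] || { echo "missing manifest: $manifest"; return 1; }
    lib=$(icd_library_path "$manifest")
    [ -n "$lib" ] || { echo "no library_path in $manifest"; return 1; }
    [ -e "$lib" ] || { echo "library not found: $lib"; return 1; }
    echo "ok: $lib"
}

# Check the manifest created in section 3.1 (path taken from the steps above).
check_icd /etc/vulkan/icd.d/mali.json || true
```

On a correctly configured board this should report the libmali-valhall-g610 library path; any other message points at the step that needs to be repeated.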
3.2 Installing the OpenCL Environment
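# Replace the system's default OpenCL library with the Mali driver (the original is kept as a .bk backup)
sudo mv /lib/aarch64-linux-gnu/libOpenCL.so.1 /lib/aarch64-linux-gnu/libOpenCL.so.1.bk
sudo ln -s /usr/lib/aarch64-linux-gnu/libmali.so /lib/aarch64-linux-gnu/libOpenCL.so.1
# Install the development toolchain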
sudo apt install -y opencl-headers
sudo apt install -y ocl-icd-libopencl1
sudo apt install -y ocl-icd-opencl-dev
sudo apt install -y clinfo
clinfo
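Because the OpenCL setup above relies on a chain of symbolic links (libOpenCL.so.1 to libmali.so to the versioned Mali library), it can help to confirm that the chain resolves as intended before reading the clinfo output. A small sketch follows. The helper name same_target is hypothetical, and it assumes a readlink that supports -f, as GNU coreutils does.

```shell
#!/bin/sh
# Hypothetical helper: succeed if both paths exist and resolve,
# after following all symbolic links, to the same file.
same_target() {
    [ -e "$1" ] && [ -e "$2" ] &&
        [ "$(readlink -f "$1")" = "$(readlink -f "$2")" ]
}

# Paths taken from the commands in this section.
if same_target /lib/aarch64-linux-gnu/libOpenCL.so.1 /usr/lib/aarch64-linux-gnu/libmali.so; then
    echo "libOpenCL.so.1 resolves to the Mali driver"
else
    echo "libOpenCL.so.1 does not resolve to libmali.so"
fi
```

If the second message appears, a later package installation may have restored its own libOpenCL.so.1, in which case the mv and ln steps would need to be repeated.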
3.3 Running the ReLU Operator with Vulkan
3.3.1 Installing glslang-tools
sudo apt install glslang-tools -y
3.3.2 Writing the Compute Shader (relu.comp)
- How the compute shader works
ReLU (Rectified Linear Unit) is a commonly used activation function in deep learning. Its mathematical expression is:
$f(x) = \max(0, x)$
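To make the formula concrete, and to have something to compare GPU results against later, here is a minimal CPU reference sketch. The function name relu_ref is hypothetical. It reads one number per line and uses awk for the floating-point comparison.

```shell
#!/bin/sh
# Hypothetical CPU reference: read one number per line and print max(0, x).
relu_ref() {
    awk '{ v = $1 + 0; if (v < 0) v = 0; print v }'
}

# Negative inputs are clamped to 0; non-negative inputs pass through unchanged.
printf '%s\n' -1.5 0 2.25 | relu_ref
# prints 0, 0 and 2.25 on separate lines
```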
- Generating the GLSL shader code
cat > relu.comp <<-'EOF'
#version 450
layout(local_size_x = 256) in; // 256 threads per workgroup
layout(binding = 0) buffer InputBuffer {
    float inputData[];
};
layout(binding = 1) buffer OutputBuffer {
    float outputData[];
};
void main() {
    uint idx = gl_GlobalInvocationID.x; // global thread index
    outputData[idx] = max(inputData[idx], 0.0);
}
EOF
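Notes:
- layout(local_size_x = 256) in; specifies the workgroup size of the compute shader.
- binding = 0 and binding = 1 bind the input and output buffers, respectively.
- gl_GlobalInvocationID.x gives the global thread ID, so each thread handles one data element and together the threads cover the whole buffer.
- max(inputData[idx], 0.0) implements the ReLU operation: for each element, the output is the larger of the element and zero.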
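Two practical steps follow from the shader above: it has to be compiled to SPIR-V before Vulkan can load it, and the host has to decide how many workgroups to dispatch. The sketch below illustrates both under stated assumptions. The helper names compile_shader and dispatch_groups are hypothetical, and the output name relu.spv is an assumption. glslangValidator is the compiler shipped in glslang-tools, and its -V option emits SPIR-V for Vulkan.

```shell
#!/bin/sh
# Hypothetical helper: compile a GLSL compute shader to SPIR-V.
# The default output name (source name with a .spv suffix) is an assumption.
compile_shader() {
    src=$1
    out=${2:-${src%.*}.spv}
    command -v glslangValidator >/dev/null 2>&1 ||
        { echo "glslangValidator not found" >&2; return 127; }
    [ -f "$src" ] || { echo "shader source not found: $src" >&2; return 1; }
    # -V selects Vulkan semantics and produces a SPIR-V binary; -o names the output.
    glslangValidator -V "$src" -o "$out"
}

# Hypothetical helper: number of workgroups needed to cover n elements,
# rounded up; the default of 256 matches local_size_x in relu.comp.
dispatch_groups() {
    n=$1
    local_size=${2:-256}
    echo $(( (n + local_size - 1) / local_size ))
}

# 1000 elements need 4 workgroups, which launch 4 * 256 = 1024 threads.
dispatch_groups 1000

# Compile only when the shader file from this section is present.
if [ -f relu.comp ]; then
    compile_shader relu.comp || echo "compile step failed"
fi
```

Note that rounding up launches more threads than there are elements whenever the element count is not a multiple of 256. The shader as written has no bounds check, so in that case the buffers would need to be padded to a multiple of 256, or the shader would need a guard on idx.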
- Vulkan execution flow