Converting a YOLOv5 Model from ONNX to Caffe

Latest recommended article published on 2025-03-05 09:07:56

pssbjn

Tags: YOLO, caffe, artificial intelligence

Article link: https://blog.csdn.net/pssbjn/article/details/138153159

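This article describes in detail how to convert a YOLOv5 model from ONNX to Caffe so that it can be deployed on specific hardware. It covers installing the Caffe environment, adding the upsample and permute operators, and the concrete steps and model adjustments for the ONNX-to-Caffe conversion.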

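Overview

When deploying YOLOv5 or another deep learning model to an embedded platform, converting the model from ONNX to Caffe is usually unnecessary. The conversion is still useful in some cases:

Some older or low-end processors may only support converting Caffe models, so the ONNX model has to be converted to Caffe first.
Once converted, the model can be validated and evaluated on a PC by calling the Caffe framework from C/C++ code, which is more convenient than validating on the target platform and makes algorithm iteration easier.

The 5-D transpose, division and similar operations in the YOLOv5 model are not directly supported by Caffe. Following the article "YOLO algorithm porting", trim the trained model before exporting it to an ONNX model file. The trimming rule is that the Detect layer keeps only the convolution layers that carry trained weights and uses sigmoid as the network output. Depending on preference, you can delete and add layers with a visual editing tool, or modify the export-related code in the Detect layer.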

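Installing the Caffe Environment

Caffe has not been updated for years, and the libraries it depends on are correspondingly old. Installing it directly on the development host easily leads to package conflicts, so a virtual machine or Docker is a better choice. For a Docker-based setup, see "Installing the caffe environment". Installing in a virtual machine or in Docker involves exactly the same dependencies and environment configuration.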

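Adding the upsample and permute Operators

The following is based on the blog post "YOLO model conversion: pytorch -> onnx -> caffe".

Caffe does not support the upsample and permute operators. Their implementations can be added from caffe_plus:

After downloading the caffe_plus source code, copy the following files from the caffe_plus source tree to the corresponding directories of the caffe source tree:

Operator	Files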
upsample	include/caffe/layers/upsample_layer.hpp src/caffe/layers/upsample_layer.cpp src/caffe/layers/upsample_layer.cu
permute	include/caffe/layers/permute_layer.hpp src/caffe/layers/permute_layer.cpp src/caffe/layers/permute_layer.cu

git clone https://github.com/jnulzl/caffe_plus.git
cp caffe_plus/include/caffe/layers/{upsample_layer.hpp,permute_layer.hpp} caffe/include/caffe/layers/
cp caffe_plus/src/caffe/layers/{upsample_layer.c*,permute_layer.c*} caffe/src/caffe/layers/

Modify src/caffe/proto/caffe.proto to add the protobuf definitions of the upsample and permute operators.

2.1 Add the parameters of the two operators at the end of message LayerParameter

message LayerParameter {
  optional WindowDataParameter window_data_param = 129;

  optional PermuteParameter permute_param = 150;
  optional UpsampleParameter upsample_param = 151;
}

2.2 Add the parameter type definitions of the two operators at the end of the file

message PermuteParameter {
  // The new orders of the axes of data. Notice it should be with
  // in the same range as the input data, and it starts from 0.
  // Do not provide repeated order.
  repeated uint32 order = 1;
}
 
message UpsampleParameter {
  optional int32 height = 1 [default = 32];
  optional int32 width = 2 [default = 32];
  optional int32 height_scale = 3 [default = 2];
  optional int32 width_scale = 4 [default = 2];
  enum UpsampleOp {
    NEAREST = 0;
    BILINEAR = 1;
  }
  optional UpsampleOp mode = 5 [default = BILINEAR];
}

Rebuild caffe

make all
make pycaffe

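Converting ONNX to Caffe

There are quite a few open-source libraries that can convert ONNX to Caffe, but all of them need some modification to convert YOLOv5. yolov5_onnx2caffe is used here because it needs relatively few changes.

The convertToCaffe function in convertCaffe.py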

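        # List of the name fields shown in the input column of the visualizer, in the same order.
        # If an entry has a parameter array, it also appears in input_tensors.
        inputs = node.inputs

        # Dict of the parameter arrays shown in the visualizer; its keys also appear in inputs,
        # one key per parameter array.
        inputs_tensor = node.input_tensors

        input_non_exist_flag = False

        # Each input is either the name of an upstream node or the name of one of this layer's parameters.
        for inp in inputs:
            if not inp: continue
            # Filter: this condition never holds for a normal node.
            if inp not in exist_edges and inp not in inputs_tensor:
                input_non_exist_flag = True
                break
        if input_non_exist_flag:
            continue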

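The following line is the one added to the code above (some nodes have an individual input that is empty, which keeps the node's remaining inputs from being added to the input list and causes errors in later processing):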

            if not inp: continue
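
The effect of the added line can be illustrated with a small stand-alone sketch. The `Node` tuple, the node names and the `convertible_nodes` helper are hypothetical stand-ins for the converter's graph structures, not code from yolov5_onnx2caffe; only the filtering logic mirrors the loop above. It also assumes, for simplicity, that each node's output edge carries the node's name.

```python
from collections import namedtuple

# Hypothetical stand-in for the converter's node type.
Node = namedtuple("Node", ["name", "inputs", "input_tensors"])

def convertible_nodes(nodes, graph_inputs, skip_empty=True):
    """Return names of nodes whose inputs are all known edges or own tensors."""
    exist_edges = set(graph_inputs)
    kept = []
    for node in nodes:
        missing = False
        for inp in node.inputs:
            if skip_empty and not inp:
                continue  # the added line: ignore empty input names
            if inp not in exist_edges and inp not in node.input_tensors:
                missing = True
                break
        if missing:
            continue  # node is skipped, as in convertToCaffe
        kept.append(node.name)
        exist_edges.add(node.name)  # assumption: output edge named after the node
    return kept

nodes = [
    Node("conv", ["images", "w"], {"w": [1.0]}),
    # e.g. a Resize whose optional roi input is left empty
    Node("resize", ["conv", "", "scales"], {"scales": [1.0, 1.0, 2.0, 2.0]}),
]
print(convertible_nodes(nodes, ["images"], skip_empty=True))   # ['conv', 'resize']
print(convertible_nodes(nodes, ["images"], skip_empty=False))  # ['conv']
```

Without the check, the empty name fails the membership test and the whole node is dropped, so every layer that depends on it is dropped as well.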

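Modifying the resize Operator

The Upsample layers in the YOLOv5 head network are exported to ONNX as Resize operators. The mode is nearest, and inputs[2] is scales.

Modify the function _convert_resize in onnx2caffe/_operators.py

2.1 scales is the third input of the Resize operator (index 2, counting from 0)

Change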

        scales = node.input_tensors.get(node.inputs[1])

to

        scales = node.input_tensors.get(node.inputs[2])
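
The index depends on the ONNX opset: in opset 10 Resize takes (X, scales), while from opset 11 on it takes (X, roi, scales, sizes), which is why index 2 is needed for recent exports. A minimal sketch of a lookup that tolerates both layouts is shown below. `find_scales` and the sample tensor names are hypothetical and not part of onnx2caffe.

```python
def find_scales(inputs, input_tensors):
    """Return the scales array of a Resize node, or None if absent.

    Assumes opset >= 11 lists inputs as (X, roi, scales[, sizes]) and
    opset 10 lists them as (X, scales).
    """
    index = 2 if len(inputs) >= 3 else 1
    if index >= len(inputs):
        return None
    return input_tensors.get(inputs[index])

# opset 11 style: empty roi, scales at index 2
print(find_scales(["x", "", "s"], {"s": [1.0, 1.0, 2.0, 2.0]}))
# opset 10 style: scales at index 1
print(find_scales(["x", "s"], {"s": [1.0, 1.0, 2.0, 2.0]}))
```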

2.2 Implement both linear and nearest with the Deconvolution operator

Change

    if  str(mode,encoding="gbk") == "linear": #mode=="linear":

to

    if  str(mode,encoding="gbk") == "linear" or str(mode,encoding="gbk") == "nearest": #mode=="linear":
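
Why a Deconvolution layer can stand in for nearest upsampling: a transposed convolution whose kernel size and stride both equal the scale factor, with an all-ones kernel applied per channel, copies every input pixel into its own scale x scale block. The pure-Python sketch below checks this on a single channel. It only illustrates the idea under that assumption about the weights and is not the converter's actual code.

```python
def nearest_upsample(img, scale):
    """Nearest-neighbour upsampling of a 2-D list by an integer factor."""
    return [[img[y // scale][x // scale]
             for x in range(len(img[0]) * scale)]
            for y in range(len(img) * scale)]

def deconv_ones(img, scale):
    """Transposed convolution with kernel = stride = scale and all-ones weights."""
    h, w = len(img), len(img[0])
    out = [[0.0] * (w * scale) for _ in range(h * scale)]
    for y in range(h):
        for x in range(w):
            for ky in range(scale):
                for kx in range(scale):
                    # each input pixel scatters into its own non-overlapping block
                    out[y * scale + ky][x * scale + kx] += img[y][x] * 1.0
    return out

img = [[1.0, 2.0], [3.0, 4.0]]
print(deconv_ones(img, 2) == nearest_upsample(img, 2))  # True
```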

Running the Conversion

Modify the input and output file names in convertCaffe.py and run it to obtain the converted caffe model files.

if __name__ == "__main__":
    # Modify the following three parameters
    # Input: path of the onnx model file
    onnx_path = "/home/willer/nanodet_concat/tools/nanodet-simple.onnx"
    # Output: network structure description file of the caffe model
    prototxt_path = "./nanodet-simple.prototxt"
    # Output: network weights file of the caffe model
    caffemodel_path = "./nanodet-simple.caffemodel"

    graph = getGraph(onnx_path)

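Testing the caffe Model

The general flow for calling a deep learning framework is:

Environment initialization: initialize configuration and settings. Deep learning libraries for embedded processors usually need to initialize the AI hardware first.
Load the model files
Prepare the input: a test program typically loads images from an image or video file and applies scaling, padding and similar operations to match the network input
Inference: copy the input into the network input, run network inference, and fetch the output
Post-processing

For calling a caffe model from the caffe framework, tools/caffe.cpp can serve as a reference. The flow is:

Initialization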

caffe::Caffe::set_mode(caffe::Caffe::CPU); // run on the CPU

Load the model files

// Load the network structure file
caffe::Net<float> caffe_net(prototxt_path, caffe::TEST);
// Load the weights file
caffe_net.CopyTrainedLayersFrom(caffemodel_path);

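Image preprocessing: this step is done mainly with OpenCV and mirrors the image preprocessing in the YOLOv5 Python code step by step

Get the network input size (ih, iw)
Load the image or video file and get the image size (imgh, imgw)
Compute the scale factor: take the smaller of (ih / imgh, iw / imgw)
Scale and pad: the final size is (ih, iw)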
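
The arithmetic of these four steps can be sketched as follows. `letterbox_params` is an illustrative helper, and centering the padding is an assumption borrowed from YOLOv5's letterbox behaviour; the steps above only fix how the scale factor is chosen.

```python
def letterbox_params(img_h, img_w, net_h, net_w):
    """Return (scale, new_h, new_w, pad_top, pad_left) for an aspect-preserving resize."""
    # the smaller ratio guarantees the scaled image fits inside the network input
    scale = min(net_h / img_h, net_w / img_w)
    new_h = int(round(img_h * scale))
    new_w = int(round(img_w * scale))
    # assumption: padding is split evenly, the remainder goes to the bottom/right
    pad_top = (net_h - new_h) // 2
    pad_left = (net_w - new_w) // 2
    return scale, new_h, new_w, pad_top, pad_left

print(letterbox_params(720, 1280, 640, 640))  # (0.5, 360, 640, 140, 0)
```

The same scale and padding offsets are needed again in post-processing to map detected boxes back to the original image.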

Inference

    auto input_layer = caffe_net.input_blobs()[0]; // the YOLO model has a single input Blob
    // Get the input buffer; its shape is (1, 3, ih, iw), matching the network input size
    float* input_data = input_layer->mutable_cpu_data();
    // Copy the preprocessed image into the input buffer
    // Run inference
    caffe_net.Forward();
    // Fetch the outputs
    auto& outputs = caffe_net.output_blobs();

The prototype of caffe::Net::output_blobs is in include/caffe/net.hpp. It returns a reference to a std::vector, and size() gives the number of outputs.

inline const vector<Blob<Dtype>*>& output_blobs() const

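caffe::Blob is defined in include/caffe/blob.hpp:

The member function shape() returns the dimension information
The member function cpu_data() gives read access to the data buffer
The member function mutable_cpu_data() returns a writable data buffer, allocating it if necessary
The member function set_cpu_data() installs an already allocated buffer

These member functions are used in the same way as protobuf interfaces.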
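
The buffer returned by cpu_data() is flat and row-major, so reading element (n, c, h, w) of a 4-D blob means computing a linear index from shape(). The sketch below mirrors that arithmetic in Python. `blob_offset` is an illustrative helper rather than a Caffe API, and the shape is only an example (one output map of an 80-class model at a 20 x 20 grid).

```python
def blob_offset(shape, n, c, h, w):
    """Linear index of element (n, c, h, w) in a row-major NCHW buffer."""
    _, channels, height, width = shape
    return ((n * channels + c) * height + h) * width + w

shape = (1, 255, 20, 20)  # example shape only
print(blob_offset(shape, 0, 2, 3, 4))  # 864
```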

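Post-processing

For post-processing, see the C++ part of "YOLOv5 algorithm porting". The ONNX-to-Caffe conversion involves no fixed-point conversion or quantization, so there is no scale parameter (or the scale can simply be treated as 1).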
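
As a rough illustration of what the decoding step looks like when the network already outputs sigmoid values, the sketch below decodes one anchor at one grid cell. It assumes the decoding formulas used by YOLOv5 release 4.0 and later; `decode_cell`, the anchor size and the sample values are illustrative only, and the referenced article remains the authoritative description.

```python
def decode_cell(s, gx, gy, stride, anchor_w, anchor_h):
    """Decode one anchor's sigmoid outputs at grid cell (gx, gy).

    s = [sx, sy, sw, sh, obj, cls0, cls1, ...], all already passed through sigmoid.
    Returns (cx, cy, w, h, score, class_id) in network-input pixels.
    """
    cx = (s[0] * 2.0 - 0.5 + gx) * stride
    cy = (s[1] * 2.0 - 0.5 + gy) * stride
    w = (s[2] * 2.0) ** 2 * anchor_w
    h = (s[3] * 2.0) ** 2 * anchor_h
    cls_scores = s[5:]
    class_id = max(range(len(cls_scores)), key=cls_scores.__getitem__)
    score = s[4] * cls_scores[class_id]  # objectness times best class probability
    return cx, cy, w, h, score, class_id

print(decode_cell([0.5, 0.5, 0.5, 0.5, 0.9, 0.1, 0.8], 3, 4, 32, 116, 90))
```

Boxes whose score passes a threshold would then go through NMS and be mapped back to the original image using the preprocessing scale and padding.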