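Up front:
I wanted to try a few more algorithms, so I am back to running YOLO. This post records how I converted a darknet model trained with the EfficientNet-B0 cfg (darknet-efficientB0.cfg) to Caffe.
Projects used:
Training tool, darknet: https://github.com/AlexeyAB/darknet
Conversion tool, darknet to caffe: https://github.com/marvis/pytorch-caffe-darknet-convert
C++ project for running caffe-yolo, caffe-yolov3: https://github.com/ChenYingpeng/caffe-yolov3
The earlier post on the modified ghost-yolo files: https://blog.csdn.net/weixin_38715903/article/details/105550619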
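This builds on the earlier ghost-yolo conversion and keeps modifying darknet2caffe.py. One difference: last time the resulting caffemodel worked with caffe-yolov3 unchanged, but this time caffe-yolov3 has to be modified as well.
The modified files are at: https://github.com/hualuluu/efficientNetB0-yolo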
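1. Modifying the conversion tool
First, work out which parts of the EfficientNet-B0 YOLO structure the converter does not handle yet:
- the swish activation function
- a different kind of shortcut operation
- dropout layers
A. The swish activation
I use the SSD version of Caffe, which has no swish activation; everything below assumes that.
Two things are needed: add a swish_layer to Caffe, and add swish handling to the conversion script darknet2caffe.py.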
elif block['type'] == 'convolutional':
    conv_layer = OrderedDict()
    conv_layer['bottom'] = bottom
    if block.has_key('name'):
        conv_layer['top'] = block['name']
        conv_layer['name'] = block['name']
    else:
        conv_layer['top'] = 'layer%d-conv' % layer_id
        conv_layer['name'] = 'layer%d-conv' % layer_id
    conv_layer['type'] = 'Convolution'
    convolution_param = OrderedDict()
    convolution_param['num_output'] = block['filters']
    prev_filters = block['filters']
    convolution_param['kernel_size'] = block['size']
    #print(block)
    if 'groups' in block:
        convolution_param['group'] = block['groups']
    if 'pad' in block:
        if block['pad'] == '1':
            convolution_param['pad'] = str(int(convolution_param['kernel_size']) // 2)
    else:
        # no 'pad' key in the cfg block: also pad by kernel_size // 2
        convolution_param['pad'] = str(int(convolution_param['kernel_size']) // 2)
    convolution_param['stride'] = block['stride']
    if block['batch_normalize'] == '1':
        convolution_param['bias_term'] = 'false'
    else:
        convolution_param['bias_term'] = 'true'
    conv_layer['convolution_param'] = convolution_param
    layers.append(conv_layer)
    bottom = conv_layer['top']
    if block['batch_normalize'] == '1':
        bn_layer = OrderedDict()
        bn_layer['bottom'] = bottom
        bn_layer['top'] = bottom
        if block.has_key('name'):
            bn_layer['name'] = '%s-bn' % block['name']
        else:
            bn_layer['name'] = 'layer%d-bn' % layer_id
        bn_layer['type'] = 'BatchNorm'
        batch_norm_param = OrderedDict()
        batch_norm_param['use_global_stats'] = 'true'
        bn_layer['batch_norm_param'] = batch_norm_param
        layers.append(bn_layer)
        scale_layer = OrderedDict()
        scale_layer['bottom'] = bottom
        scale_layer['top'] = bottom
        if block.has_key('name'):
            scale_layer['name'] = '%s-scale' % block['name']
        else:
            scale_layer['name'] = 'layer%d-scale' % layer_id
        scale_layer['type'] = 'Scale'
        scale_param = OrderedDict()
        scale_param['bias_term'] = 'true'
        scale_layer['scale_param'] = scale_param
        layers.append(scale_layer)
    # Added here: a Sigmoid layer for activation=logistic
    if block['activation'] == 'logistic':
        sigmoid_layer = OrderedDict()
        sigmoid_layer['bottom'] = bottom
        sigmoid_layer['top'] = bottom
        if block.has_key('name'):
            sigmoid_layer['name'] = '%s-act' % block['name']
        else:
            sigmoid_layer['name'] = 'layer%d-act' % layer_id
        sigmoid_layer['type'] = 'Sigmoid'
        layers.append(sigmoid_layer)
    # Added here: a Swish layer for activation=swish
    elif block['activation'] == 'swish':
        swish_layer = OrderedDict()
        swish_layer['bottom'] = bottom
        swish_layer['top'] = bottom
        if block.has_key('name'):
            swish_layer['name'] = '%s-swish' % block['name']
        else:
            swish_layer['name'] = 'layer%d-swish' % layer_id
        swish_layer['type'] = 'Swish'
        layers.append(swish_layer)
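A quick way to sanity-check a newly added Swish layer is to compute the activation by hand and compare it with a few values from the Caffe output. The sketch below assumes the usual definition swish(x) = x * sigmoid(x) with beta = 1; the helper name `swish` is only illustrative and is not part of darknet2caffe.py or Caffe.

```python
import math


def swish(x):
    """Swish activation with beta = 1, i.e. x * sigmoid(x)."""
    if x >= 0:
        # For non-negative x, exp(-x) cannot overflow.
        return x / (1.0 + math.exp(-x))
    # For negative x, rewrite sigmoid(x) as exp(x) / (1 + exp(x)) to avoid overflow.
    e = math.exp(x)
    return x * e / (1.0 + e)


if __name__ == "__main__":
    for v in (-2.0, -1.0, 0.0, 1.0, 2.0):
        print("swish(%.1f) = %.6f" % (v, swish(v)))
```

Feeding the same inputs through a one-layer prototxt and comparing against these values should reveal a mis-registered or mis-implemented layer early, before the full model is converted.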
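B. The shortcut operation
As mentioned before, a shortcut is equivalent to an Eltwise operation. The catch is that the EfficientNet-B0 YOLO structure contains shortcuts that add features with different channel counts:
   layer   filters  size/strd(dil)      input                output
  90 conv    576       1 x 1/ 1     60 x  34 x 112 ->   60 x  34 x 576 0.263 BF
 ....
 ....
 139 upsample                 2x    30 x  17 x 128 ->   60 x  34 x 128
 140 Shortcut Layer: 90,  wt = 0, wn = 0, outputs:  60 x  34 x 128 0.000 BF
Layer 140 above adds the features of layer 90 and layer 139, whose shapes are 60*34*576 and 60*34*128, and the output is 60*34*128. I checked the darknet source: it uses the output shape to decide which feature channels to drop, so this layer adds the first 128 channels of layer 90 (60*34*128) to the 60*34*128 output of layer 139.
Caffe has no such layer and I did not want to write one, so a Slice layer is used to cut the layer 90 output along the channel axis into two parts: channels 0 to 127 and channels 128 to 575. [Note: only the first part (channels 0 to 127) is used and the second part is discarded. The discarded blob affects how the Caffe model is used later, so detectnet.cpp in caffe-yolov3 has to be modified.]
The final change to darknet2caffe.py: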
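To make the truncation concrete, here is a minimal sketch of the behaviour described above, using nested lists in place of tensors. It assumes both inputs have the same spatial size and that only the channel count differs; the function name `shortcut_truncate` and the toy data are made up for illustration and are not taken from darknet.

```python
def shortcut_truncate(out, add):
    """Add the leading channels of `add` onto `out`, channel by channel.

    Both arguments are lists of channels; each channel is a flat list of
    values with the same length. Extra channels in `add` are ignored.
    """
    result = []
    for c, channel in enumerate(out):
        if c < len(add):
            result.append([a + b for a, b in zip(channel, add[c])])
        else:
            # `add` has fewer channels than `out`: keep the channel unchanged.
            result.append(list(channel))
    return result


if __name__ == "__main__":
    upsampled = [[1.0, 1.0], [2.0, 2.0]]               # 2 channels, stands in for layer 139
    wide = [[10.0, 10.0], [20.0, 20.0], [30.0, 30.0]]  # 3 channels, stands in for layer 90
    print(shortcut_truncate(upsampled, wide))
```

A Slice with `slice_point` equal to the narrower channel count followed by an Eltwise SUM reproduces exactly this: the third channel of `wide` never reaches the sum.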
elif block['type'] == 'shortcut':
    if int(block['from']) > 0:
        # How this works:
        # a darknet shortcut has a `from` parameter that names the layer whose
        # features are added to those of the previous layer.
        # For layer 140 above, from is 90 (-50 would point to the same layer).
        # Why slice whenever from > 0?
        # It is a coincidence of effi_B0.cfg: shortcuts whose two inputs have
        # the same channel count use a negative from, and shortcuts whose
        # inputs differ use a positive from. That pattern is used here to tell
        # the two cases apart. If the cfg changes, adjust accordingly.
        # Add the Slice layer
        prev_layer_id1 = int(block['from']) + 1
        slice_layer = OrderedDict()
        slice_layer['bottom'] = topnames[prev_layer_id1]
        slice_layer['name'] = 'layer%d-slice' % layer_id
        top1 = slice_layer['name'] + '_1'
        top2 = slice_layer['name'] + '_2'
        slice_layer['top'] = [top1, top2]
        #slice_layer['top'] = top1
        slice_layer['type'] = 'Slice'
        slice_param = OrderedDict()
        slice_param['axis'] = '1'
        # hard-coded: the narrower input has 128 channels in effi_B0.cfg
        slice_param['slice_point'] = '128'
        slice_layer['slice_param'] = slice_param
        layers.append(slice_layer)
        bottom1 = top1
    else:
        prev_layer_id1 = layer_id + int(block['from'])
        bottom1 = topnames[prev_layer_id1]
    # The rest is the usual Eltwise handling, unchanged
    prev_layer_id2 = layer_id - 1
    #print('^^^^^^^^^^^^^^^^^^^^^^^^^^^')
    #print('topnames:',topnames)
    #print(layer_id,prev_layer_id1,prev_layer_id2)
    bottom2 = topnames[prev_layer_id2]
    shortcut_layer = OrderedDict()
    shortcut_layer['bottom'] = [bottom1, bottom2]
    if block.has_key('name'):
        shortcut_layer['top'] = block['name']
        shortcut_layer['name'] = block['name']
    else:
        shortcut_layer['top'] = 'layer%d-shortcut' % layer_id
        shortcut_layer['name'] = 'layer%d-shortcut' % layer_id
    shortcut_layer['type'] = 'Eltwise'
    eltwise_param = OrderedDict()
    eltwise_param['operation'] = 'SUM'
    shortcut_layer['eltwise_param'] = eltwise_param
    layers.append(shortcut_layer)
    bottom = shortcut_layer['top']
    if block['activation'] != 'linear':
        relu_layer = OrderedDict()
        relu_layer['bottom'] = bottom
        relu_layer['top'] = bottom
        if block.has_key('name'):
            relu_layer['name'] = '%s-act' % block['name']
        else:
            relu_layer['name'] = 'layer%d-act' % layer_id
        relu_layer['type'] = 'ReLU'
        if block['activation'] == 'leaky':
            relu_param = OrderedDict()
            relu_param['negative_slope'] = '0.1'
            relu_layer['relu_param'] = relu_param
        layers.append(relu_layer)
    topnames[layer_id] = bottom
    layer_id = layer_id + 1
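The two index computations in the shortcut branch are easy to mix up, so here they are isolated in a small helper. This is only a sketch: `resolve_from` is a hypothetical name, and it assumes (as the `+ 1` in the code above suggests) that the converter's `layer_id` runs one ahead of darknet's zero-based layer index.

```python
def resolve_from(layer_id, from_value):
    """Return the topnames index that a shortcut's `from` field points to.

    layer_id   -- the converter's index of the current shortcut layer
    from_value -- the cfg value, as a string or int
    """
    from_value = int(from_value)
    if from_value > 0:
        # Absolute darknet layer index, shifted by one to match layer_id.
        return from_value + 1
    # Relative offset counted back from the current layer.
    return layer_id + from_value


if __name__ == "__main__":
    # darknet layer 140 is layer_id 141 under the offset assumed above.
    print(resolve_from(141, '90'))   # absolute form
    print(resolve_from(141, '-50'))  # relative form
```

Under that assumption both calls resolve to the same entry, which matches the remark that -50 can stand in for 90 at layer 140.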
C. Adding the dropout layer
Nothing special here, just add a Dropout layer.
elif block['type'] == 'dropout':
    dropout_layer = OrderedDict()
    dropout_layer['bottom'] = bottom
    if block.has_key('name'):
        dropout_layer['top'] = block['name']
        dropout_layer['name'] = block['name']
    else:
        dropout_layer['top'] = 'layer%d-dropout' % layer_id
        dropout_layer['name'] = 'layer%d-dropout' % layer_id
    dropout_layer['type'] = 'Dropout'
    dropout_param = OrderedDict()
    dropout_param['dropout_ratio'] = block['probability']
    dropout_layer['dropout_param'] = dropout_param
    layers.append(dropout_layer)
    bottom = dropout_layer['top']
    topnames[layer_id] = bottom
    layer_id = layer_id + 1
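2. Modifying the caffe-yolo files
First be clear about what the output layers of the generated prototxt are. Are they only the two convolution layers that feed the yolo layers? No.
Besides those two convolution layers, the outputs also include the leftover parts of the Slice layers added for the shortcut operation: the first part of each slice is used, while the second part has no consumer and is therefore exposed as a network output.
Around lines 98-101 of detectnet.cpp is the following: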
net->Forward();
for(int i = 0; i < net->num_outputs(); ++i){
    blobs.push_back(net->output_blobs()[i]);
    // Every output blob is pushed into blobs here and later passed to get_detections to produce the boxes.
    // As analysed above there are 4 outputs in total: the two convs connected to yolo and the unused slice parts
    // [the effi_B0 structure has two such slice outputs], so net->num_outputs() = 4.
    // get_detections only needs the features of the layers connected to yolo, nothing else,
    // so the required layers have to be picked out and put into blobs.
}
dets = get_detections(blobs, im.w, im.h,
    net_input_data_blobs->width(), net_input_data_blobs->height(), &nboxes);
// Modified version below; adapt it to your own case, there is surely a better way
for(int i = 0; i < net->num_outputs(); ++i){
    if (i == 0 || i == 3) // put net->output_blobs()[0] and net->output_blobs()[3] into blobs
    {
        blobs.push_back(net->output_blobs()[i]);
        LOG(INFO) << net->blob_names()[net->output_blob_indices()[i]];
        // Logs the name of each blob stored in blobs so you can check the selection; can be commented out
    }
}
dets = get_detections(blobs, im.w, im.h,
    net_input_data_blobs->width(), net_input_data_blobs->height(), &nboxes);
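One possible alternative to hard-coding indices 0 and 3 is to select outputs by blob name, since the converter above names the unused slice tops 'layer%d-slice_2'. The following is only a sketch of that selection logic, written in Python for brevity; it would still have to be ported to the C++ loop, where the names come from `net->blob_names()[net->output_blob_indices()[i]]`. The function name and the example blob names are hypothetical.

```python
def select_yolo_outputs(output_names, unwanted_marker="-slice"):
    """Return the indices of output blobs whose names lack the marker."""
    return [i for i, name in enumerate(output_names)
            if unwanted_marker not in name]


if __name__ == "__main__":
    # Made-up names; print the real ones with the LOG(INFO) line shown above.
    names = ["conv_a", "layer141-slice_2", "layer151-slice_2", "conv_b"]
    print(select_yolo_outputs(names))
```

Selecting by name keeps working if the order of the network outputs changes, which index-based selection does not guarantee.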
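3. Step-by-step procedure
Training is the same as before, so I will skip it and assume you know how; I used the official effi_B0.cfg. [If you don't, see https://blog.csdn.net/weixin_38715903/article/details/103695844.] Download the cfg file and train following the usual darknet steps.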
A.darknet to caffe
git clone https://github.com/marvis/pytorch-caffe-darknet-convert
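Then download darknet2caffe.py and use it to replace the original darknet2caffe.py
[https://github.com/hualuluu/efficientNetB0-yolo]
Note: the downloaded darknet2caffe.py contains some paths that must be changed to your own, for example your Caffe path [some Caffe files are needed].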
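Then run (the file names in this command are carried over from the ghost-yolo post; substitute your own cfg and weights):
sudo python2.7 darknet2caffe.py cfg/ghostnet-yolo.cfg ghostnet-yolo.weights ghostnet-yolo.prototxt ghostnet-yolo.caffemodel
This step produces the prototxt and the caffemodel.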
B.caffe-yolo
git clone https://github.com/ChenYingpeng/caffe-yolov3
cd caffe-yolov3
Modify detectnet.cpp:
Around lines 98-101 is the following:
net->Forward();
for(int i = 0; i < net->num_outputs(); ++i){
    blobs.push_back(net->output_blobs()[i]);
}
dets = get_detections(blobs, im.w, im.h,
    net_input_data_blobs->width(), net_input_data_blobs->height(), &nboxes);
// Modified version below; adapt it to your own case, there is surely a better way
for(int i = 0; i < net->num_outputs(); ++i){
    if (i == 0 || i == 3) // put net->output_blobs()[0] and net->output_blobs()[3] into blobs
    {
        blobs.push_back(net->output_blobs()[i]);
        LOG(INFO) << net->blob_names()[net->output_blob_indices()[i]];
    }
}
dets = get_detections(blobs, im.w, im.h,
    net_input_data_blobs->width(), net_input_data_blobs->height(), &nboxes);
Put the generated caffemodel and prototxt in ./caffemodel and ./prototxt [create the directories if they do not exist]
Edit CMakeLists.txt
# Change every Caffe path below to your own
# build C/C++ interface
include_directories(${PROJECT_INCLUDE_DIR} ${GIE_PATH}/include)
include_directories(${PROJECT_INCLUDE_DIR}
    /home/ubuntu247/liliang/caffe-ssd/include
    /home/ubuntu247/liliang/caffe-ssd/build/include
)
file(GLOB inferenceSources *.cpp *.cu )
file(GLOB inferenceIncludes *.h )
cuda_add_library(yolov3-plugin SHARED ${inferenceSources})
target_link_libraries(yolov3-plugin
    /home/ubuntu247/liliang/caffe-ssd/build/lib/libcaffe.so
    /usr/lib/x86_64-linux-gnu/libglog.so
    /usr/lib/x86_64-linux-gnu/libgflags.so.2
    /usr/lib/x86_64-linux-gnu/libboost_system.so
    /usr/lib/x86_64-linux-gnu/libGLEW.so.1.13
)
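If you trained with your own anchors, change the anchor values (in yolo.cpp) before compiling, and also the number of classes in yolo.h. (Judging by the include and the header guard in the two excerpts below, these are the yolo_layer source and header files.)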
/*
* Company: Synthesis
* Author: Chen
* Date: 2018/06/04
*/
#include "yolo_layer.h"
#include "blas.h"
#include "cuda.h"
#include "activations.h"
#include "box.h"
#include <stdio.h>
#include <math.h>
//yolov3
//float biases[18] = {10,13,16,30,33,23,30,61,62,45,59,119,116,90,156,198,373,326};
float biases[18] = {7, 15, 16, 18, 22, 32, 9, 40, 20, 71, 37, 39, 52, 65, 70, 110, 105, 208};
/*
* Company: Synthesis
* Author: Chen
* Date: 2018/06/04
*/
#ifndef __YOLO_LAYER_H_
#define __YOLO_LAYER_H_
#include <caffe/caffe.hpp>
#include <string>
#include <vector>
using namespace caffe;
const int classes = 3;
const float thresh = 0.5;
const float hier_thresh = 0.5;
const float nms_thresh = 0.5;
const int num_bboxes = 3;
const int relative = 1;
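Copying anchors by hand from the cfg into the `biases` array is error-prone, so a small helper can format them. This is a sketch under one assumption: that the array simply mirrors the order of the `anchors` line in the cfg, as the commented-out default YOLOv3 values above suggest. The function name `anchors_to_biases` is made up for illustration.

```python
def anchors_to_biases(anchors_line):
    """Turn a cfg line such as 'anchors = 10,13, 16,30' into a C array definition."""
    values = anchors_line.split("=", 1)[1]
    numbers = [v.strip() for v in values.split(",") if v.strip()]
    return "float biases[%d] = {%s};" % (len(numbers), ", ".join(numbers))


if __name__ == "__main__":
    line = "anchors = 7, 15, 16, 18, 22, 32, 9, 40, 20, 71, 37, 39, 52, 65, 70, 110, 105, 208"
    print(anchors_to_biases(line))
```

The printed line can be pasted over the existing `float biases[18]` definition; the element count in the brackets doubles as a check that no value was dropped.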
Build
mkdir build
cd build
cmake ..
make -j12
Run
./x86_64/bin/detectnet ../prototxt/effi-yolo.prototxt ../caffemodel/effi-yolo.caffemodel ../images/bicycle.jpg
I don't think I have left anything out; I'll add more if something comes to mind. That's all!