YOLOv4-tiny: from installation to training to calling it from Python

细泡儿

Column: 小白玩jetson nano (Jetson Nano for beginners). Tags: python, computer vision, object detection

Article link: https://blog.csdn.net/yzy1119/article/details/121337573

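This article walks through installing and configuring YOLOv4-tiny in the Darknet framework from scratch: downloading the source, editing the build configuration, compiling, and testing on an image, a video, and a camera. It then covers training on your own dataset: creating the directory structure, editing the configuration files, and training the model. Finally, it gives two ways to run image detection from Python, each with a full code example.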

(I) Installation

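Download the latest AlexeyAB darknet source from GitHub: https://github.com/AlexeyAB/darknet
Extracting the archive produces a folder named darknet-master.
Move the extracted darknet-master folder into an empty folder named darknet.
Download the yolov4-tiny model weights (yolov4-tiny.weights) and put the file in the darknet-master directory.
Open the Makefile in the darknet-master directory and adjust the parameters for your machine. GPU, CUDNN, and OPENCV are the main ones to change; set the others as needed.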

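# Comments are kept on their own lines: make keeps the spaces before an
# inline '#' as part of the value, which can break the flag comparisons.
# 1 = build with GPU (CUDA) support, 0 = CPU only
GPU=0
# 1 = build with cuDNN acceleration (requires GPU=1)
CUDNN=0
# 1 = use half precision for extra speed on GPUs that support it
CUDNN_HALF=0
# 1 = build with OpenCV
OPENCV=1
AVX=0
OPENMP=0
# 1 = build libdarknet.so so that Python can call the darknet model
LIBSO=1
ZED_CAMERA=0
ZED_CAMERA_v2_8=0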

Note: to call the darknet model from Python, LIBSO must be set to 1.
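One way to confirm the setting before building is to read the flags back from the Makefile. The sketch below uses a hypothetical helper, `read_makefile_flags` (not part of darknet), that parses simple `NAME=value` lines; the `sample` string stands in for the real Makefile contents, which you would normally load with `open("Makefile").read()`.

```python
import re


def read_makefile_flags(makefile_text, names=("GPU", "CUDNN", "OPENCV", "LIBSO")):
    """Return {flag: value} for simple NAME=value lines, ignoring inline comments."""
    flags = {}
    for line in makefile_text.splitlines():
        match = re.match(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*([^#]*)", line)
        if match and match.group(1) in names:
            flags[match.group(1)] = match.group(2).strip()
    return flags


# Stand-in for the real Makefile text (assumption for this example)
sample = "GPU=0\nCUDNN=0\nCUDNN_HALF=0\nOPENCV=1\nLIBSO=1   # build libdarknet.so\n"
flags = read_makefile_flags(sample)
print(flags)
if flags.get("LIBSO") != "1":
    print("LIBSO is not 1: libdarknet.so will not be built")
```

If the check reports that LIBSO is not 1, edit the Makefile and rebuild before trying the Python interface.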

Once the steps above are done,
run make in the darknet-master directory to compile:

make

After compilation finishes,
use the commands below to test detection on an image, a video, and a live camera:

./darknet detector test cfg/coco.data cfg/yolov4-tiny.cfg yolov4-tiny.weights data/dog.jpg	# image test
./darknet detector demo cfg/coco.data cfg/yolov4-tiny.cfg yolov4-tiny.weights -ext_output test.mp4		# video test
./darknet detector demo cfg/coco.data cfg/yolov4-tiny.cfg yolov4-tiny.weights -c 0		# camera test

(II) Training your own weights
Training with YOLOv4-tiny requires the YOLOv4-tiny pre-trained weights (yolov4-tiny.conv.29).
Download the file and put it in the darknet-master directory.

Next, create the directory structure that YOLOv4-tiny training needs:

---darknet
  ---darknet-master
	---VOCdevkit
		---VOC2007
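			---Annotations	# XML annotation files
			---ImageSets
				---Main		# txt files listing the training-set and test-set images
				  ---test.txt # paths of the test-set images
				  ---train.txt # paths of the training-set images
			---JPEGImages	# image files
			---labels
			  ---***.txt   # annotations for one image, one object per line: class x_center y_center width height, normalized to 0-1 (e.g. 0 0.002221 0.002221 0.002221 0.002221); *** is the same as the image name
			  ---***.txt   # same format; *** is the same as the image name
			  ---  # and so on, one txt file per image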
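The label files can be generated from the XML annotations. The sketch below assumes the XML files follow the usual Pascal VOC layout (`size/width`, `size/height`, and `object/bndbox` with `xmin`, `ymin`, `xmax`, `ymax`); the function names and the sample XML are made up for illustration. It turns one annotation into the `class x_center y_center width height` lines described above.

```python
import xml.etree.ElementTree as ET


def voc_box_to_yolo(img_w, img_h, xmin, ymin, xmax, ymax):
    """Convert corner coordinates in pixels to a normalized center/size box."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def voc_xml_to_yolo_lines(xml_text, class_names):
    """Return one label line per annotated object whose name is in class_names."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)
    lines = []
    for obj in root.iter("object"):
        name = obj.find("name").text.strip()
        if name not in class_names:
            continue  # skip classes that are not being trained
        box = obj.find("bndbox")
        coords = [float(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")]
        x, y, w, h = voc_box_to_yolo(img_w, img_h, *coords)
        lines.append("%d %.6f %.6f %.6f %.6f" % (class_names.index(name), x, y, w, h))
    return lines


# Hypothetical annotation for a 200x100 image containing one object of class "3"
sample_xml = """<annotation>
  <size><width>200</width><height>100</height><depth>3</depth></size>
  <object><name>3</name>
    <bndbox><xmin>50</xmin><ymin>20</ymin><xmax>150</xmax><ymax>80</ymax></bndbox>
  </object>
</annotation>"""
classes = ["1", "2", "3", "4", "5", "6", "7", "8"]
label_lines = voc_xml_to_yolo_lines(sample_xml, classes)
print(label_lines)
```

Each returned list would be written to `labels/<image name>.txt`. The class index is the zero-based position of the class in the .names file.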

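Next, edit the three files that training needs: **.names, **.cfg, and **.data.
Edit **.names
Create **.names in the darknet-master/data/ directory,
using coco.names as a reference. List your own classes one per line (here the classes are 1 to 8). Do not put a comment line in the file: darknet reads every line as a class name.

1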
2
3
4
5
6
7
8

Edit **.data
Create **.data in the darknet-master/cfg/ directory,
using coco.data as a reference:

classes= 8
train  = /darknet/VOCdevkit/VOC2020/ImageSets/Main/number_train.txt
valid  = /darknet/VOCdevkit/VOC2020/ImageSets/Main/number_text.txt
names = /darknet/darknet-master/data/**.names
# directory where the weights generated during training are saved
backup = /darknet/darknet-master/backup/
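The `train` and `valid` entries point at text files that list one image path per line. The sketch below shows one way to produce the two lists; `split_image_list`, the 20% validation share, and the example paths are assumptions for illustration, not something the original setup prescribes.

```python
import random


def split_image_list(image_paths, valid_fraction=0.2, seed=0):
    """Shuffle the paths reproducibly and split them into (train, valid)."""
    paths = sorted(image_paths)
    random.Random(seed).shuffle(paths)
    n_valid = int(len(paths) * valid_fraction)
    return paths[n_valid:], paths[:n_valid]


# Hypothetical image paths; use the real contents of JPEGImages instead
images = ["/darknet/VOCdevkit/VOC2007/JPEGImages/%03d.jpg" % i for i in range(10)]
train, valid = split_image_list(images)
print(len(train), "training images,", len(valid), "validation images")

# One path per line, ready to be written to the train and valid list files
train_txt = "\n".join(train) + "\n"
valid_txt = "\n".join(valid) + "\n"
```

Writing `train_txt` and `valid_txt` to the files named in the .data file completes the setup. The paths in the lists should be absolute, or relative to the directory from which darknet is run.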

Edit **.cfg
Create **.cfg in the darknet-master/cfg/ directory,
using yolov4-tiny.cfg as a reference.

(1) Lines 1-7 of yolov4-tiny.cfg

[net]
#Testing
#batch=1
#subdivisions=1
#Training
batch=64
subdivisions=16
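# Note: since this is for training, nothing needs to change here.
# If training reports CUDA out of memory, increase subdivisions,
# e.g. to 32 or 64; larger values make training slower, so it is a trade-off.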

(2) Lines 8-9 of yolov4-tiny.cfg

width=224
height=224
# Other sizes are allowed; the ratio here is 1:1.
# The values must be multiples of 32.
# Larger values also make training slower.

(3) The max_batches parameter on line 20

# max_batches = classes*2000 (some guides use classes*1000); here it is 8*2000
max_batches = 16000
policy=steps
# steps = max_batches*0.8, max_batches*0.9
steps=12800,14400
scales=.1,.1
# Change two places: max_batches and steps
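The arithmetic behind these two settings can be written as a small helper. This is only a sketch of the rule stated in the comments (classes*2000, then 80% and 90% of that); `training_schedule` is a made-up name.

```python
def training_schedule(num_classes, batches_per_class=2000):
    """Return (max_batches, steps) following the classes*2000 rule of thumb."""
    max_batches = num_classes * batches_per_class
    # Integer arithmetic keeps the 80% and 90% points exact
    steps = (max_batches * 8 // 10, max_batches * 9 // 10)
    return max_batches, steps


max_batches, steps = training_schedule(8)
print("max_batches = %d" % max_batches)
print("steps=%d,%d" % steps)
```

For the 8 classes used in this article, this reproduces the values in the listing: max_batches = 16000 and steps=12800,14400.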

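(4) Continue editing yolov4-tiny.cfg: press Ctrl+F and search for "classes", change classes=80 to classes=8, and change the nearest filters above each classes line to 39, which comes from (classes+5)*3 = (8+5)*3 = 39.

Note: change every occurrence.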

[convolutional]
size=1
stride=1
pad=1
filters=39  # (classes+5)*3; here (8+5)*3 = 39
activation=linear

[yolo]
mask = 3,4,5
anchors = 10,14,  23,27,  37,58,  81,82,  135,169,  344,319
classes=8  # your own number of classes
num=6
# Change two places: filters and classes

Note: repeat this for every [yolo] section (yolov4-tiny.cfg has two).
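Because the same two edits have to be repeated for every [yolo] section, they can also be applied programmatically. The sketch below is a hypothetical helper, `patch_yolo_cfg`, which assumes that the last `filters=` line before each `[yolo]` header belongs to the convolutional layer feeding it, as in yolov4-tiny.cfg, and that each head uses 3 anchors.

```python
import re


def patch_yolo_cfg(cfg_text, num_classes, anchors_per_head=3):
    """Set classes= everywhere and filters= in the layer just before each [yolo]."""
    filters = (num_classes + 5) * anchors_per_head
    lines = cfg_text.splitlines()
    last_filters_idx = None
    for i, line in enumerate(lines):
        stripped = line.strip()
        if re.match(r"^filters\s*=", stripped):
            last_filters_idx = i  # remember the most recent filters line
        elif stripped == "[yolo]" and last_filters_idx is not None:
            lines[last_filters_idx] = "filters=%d" % filters
        elif re.match(r"^classes\s*=", stripped):
            lines[i] = "classes=%d" % num_classes
    return "\n".join(lines) + "\n"


# Shortened stand-in for part of yolov4-tiny.cfg (assumption for this example)
sample_cfg = """[convolutional]
filters=512
activation=leaky

[convolutional]
size=1
filters=255
activation=linear

[yolo]
mask = 3,4,5
classes=80
num=6
"""
patched = patch_yolo_cfg(sample_cfg, 8)
print(patched)
```

Only the convolutional layer directly in front of a [yolo] section is touched; earlier layers such as the `filters=512` one keep their values.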

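Finally, open a terminal, change to the darknet-master directory,
and run:

./darknet detector train cfg/***.data cfg/***.cfg yolov4-tiny.conv.29
# Whether training runs on the CPU or the GPU is decided by the GPU setting
# in the Makefile at build time; on a GPU build, -gpus 0 selects the GPU to use

Press Enter to start training.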

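During training, the generated weights files are saved in the backup directory set in the .data file (here /darknet/darknet-master/backup/).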

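(III) Calling the darknet interface from Python
There are two methods.
Method 1:
Before calling darknet from Python, LIBSO must be set to 1 in the Makefile before compiling with make.
If it was not set, set it and rebuild: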

make clean # remove the previous build
make       # compile again
# Running make alone may also work; to be safe, run make clean first and then make

Then create darknet_me.py in the darknet-master directory, next to darknet.py and libdarknet.so:

import os
import cv2
import numpy as np
import darknet


class Detect:
    def __init__(self, metaPath, configPath, weightPath, gpu_id=0, batch=1):
        '''
        :param metaPath:   ***.data file holding the dataset parameters
        :param configPath: ***.cfg network structure file
        :param weightPath: ***.weights YOLO weights
        :param gpu_id:     index of the GPU to use
        :param batch:      this class only supports batch=1
        '''
        assert batch == 1, "batch must be 1"
        # Select the GPU
        darknet.set_gpu(gpu_id)
        # Load the network
        network, class_names, class_colors = darknet.load_network(
            configPath,
            metaPath,
            weightPath,
            batch_size=batch
        )
        self.network = network
        self.class_names = class_names
        self.class_colors = class_colors

    def bbox2point(self, bbox):
        # (center x, center y, width, height) -> (xmin, ymin, xmax, ymax)
        x, y, w, h = bbox
        xmin = x - (w / 2)
        xmax = x + (w / 2)
        ymin = y - (h / 2)
        ymax = y + (h / 2)
        return (xmin, ymin, xmax, ymax)

    def point2bbox(self, point):
        # (xmin, ymin, xmax, ymax) -> (center x, center y, width, height)
        x1, y1, x2, y2 = point
        x = (x1 + x2) / 2
        y = (y1 + y2) / 2
        w = (x2 - x1)
        h = (y2 - y1)
        return (x, y, w, h)

    def image_detection(self, image_bgr, network, class_names, class_colors, thresh=0.5):
        # Convert a single-channel image to 3 channels
        if len(image_bgr.shape) == 2:
            image_bgr = np.stack([image_bgr] * 3, axis=-1)
        # Size of the original image
        orig_h, orig_w = image_bgr.shape[:2]

        width = darknet.network_width(network)
        height = darknet.network_height(network)
        darknet_image = darknet.make_image(width, height, 3)

        # image = cv2.imread(image_path)
        image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
        image_resized = cv2.resize(image_rgb, (width, height), interpolation=cv2.INTER_LINEAR)

        darknet.copy_image_from_bytes(darknet_image, image_resized.tobytes())
        detections = darknet.detect_image(network, class_names, darknet_image, thresh=thresh)
        darknet.free_image(darknet_image)
        new_detections = []
        for detection in detections:
            pred_label, pred_conf, (x, y, w, h) = detection
            # Scale the box from network input size back to the original image size
            new_x = x / width * orig_w
            new_y = y / height * orig_h
            new_w = w / width * orig_w
            new_h = h / height * orig_h

            # Clamp the box to the image bounds
            (x1, y1, x2, y2) = self.bbox2point((new_x, new_y, new_w, new_h))
            x1 = x1 if x1 > 0 else 0
            x2 = x2 if x2 < orig_w else orig_w
            y1 = y1 if y1 > 0 else 0
            y2 = y2 if y2 < orig_h else orig_h

            (new_x, new_y, new_w, new_h) = self.point2bbox((x1, y1, x2, y2))

            new_detections.append((pred_label, pred_conf, (new_x, new_y, new_w, new_h)))

        image = darknet.draw_boxes(new_detections, image_rgb, class_colors)
        return cv2.cvtColor(image, cv2.COLOR_RGB2BGR), new_detections

    def predict_image(self, image_bgr, thresh=0.5, is_show=True, save_path=''):
        '''
        :param image_bgr:  input image (BGR)
        :param thresh:     confidence threshold
        :param is_show:    whether to return the original image with boxes drawn
        :param save_path:  where to save the annotated image, e.g. '/home/aaa.jpg'
        :return: the annotated image if is_show is True, otherwise the detections
        '''
        draw_bbox_image, detections = self.image_detection(image_bgr, self.network, self.class_names, self.class_colors,
                                                           thresh)
        # detections = [('helmet', '99.76', (271.18813918187067, 162.8237687624418, 88.92447724709143, 112.84086117377649))]
        if is_show:
            if save_path:
                cv2.imwrite(save_path, draw_bbox_image)
            return draw_bbox_image
        return detections


if __name__ == '__main__':
    detect = Detect(metaPath=r'cfg/***.data',
                    configPath=r'cfg/***.cfg',
                    weightPath=r'***.weights',  # the weights file produced by training
                    gpu_id=0)
    image_root = r'/home/xipaoer/yolov4_tiny/darknet-master/data/78_2.jpg'
    image = cv2.imread(image_root)
    if image is None:
        raise FileNotFoundError(image_root)
    save_root = r'/home/xipaoer/Desktop/number/'
    draw_bbox_image = detect.predict_image(image, save_path=os.path.join(save_root, "12.jpg"))

Then adapt it to your own needs.
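When `predict_image` is called with `is_show=False`, it returns the raw detection list instead of the image. Judging by the example in the code comment, each entry is `(label, confidence, (x, y, w, h))`, with the confidence given as a percentage string and the box given by its center and size. The sketch below uses a hypothetical helper, `detections_to_boxes`, to turn that list into corner boxes; the sample data is made up and the function does not need darknet to run.

```python
def detections_to_boxes(detections, min_confidence=50.0):
    """Convert (label, 'conf%', (cx, cy, w, h)) tuples to dicts with corner boxes."""
    results = []
    for label, confidence, (x, y, w, h) in detections:
        score = float(confidence)  # the confidence arrives as a string
        if score < min_confidence:
            continue
        left = int(round(x - w / 2))
        top = int(round(y - h / 2))
        right = int(round(x + w / 2))
        bottom = int(round(y + h / 2))
        results.append({"label": label, "confidence": score,
                        "box": (left, top, right, bottom)})
    return results


# Made-up detections in the same shape as the example in the code comment
sample_detections = [
    ("helmet", "99.76", (271.19, 162.82, 88.92, 112.84)),
    ("helmet", "12.5", (10.0, 10.0, 4.0, 4.0)),
]
boxes = detections_to_boxes(sample_detections)
print(boxes)
```

The resulting pixel coordinates can be passed to `cv2.rectangle` or used to crop the detected region.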

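Method 2:
Call yolov4-tiny through OpenCV (requires OpenCV 4.4.0 or later).
Note: on Jetson Nano, do not casually change the OpenCV version; doing so can cause "core dumped" crashes. This method is not recommended on Jetson Nano.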

import numpy as np
import cv2
import os
import random

weights_path = '***.weights'     # model weights file
cfg_path = 'cfg/***.cfg'         # model configuration file
labels_path = 'data/***.names'   # class label file

# Initialize some parameters
with open(labels_path) as f:
    LABELS = f.read().strip().split("\n")
boxes = []
confidences = []
classIDs = []
color_list = []
for i in range(len(LABELS)):
    color_list.append([random.randint(0, 255), random.randint(0, 255), random.randint(0, 255)])

# Load the network configuration and the trained weights to build the network
net = cv2.dnn.readNetFromDarknet(cfg_path, weights_path)

# Read the image to run detection on
image_path = os.path.join("img", "1.jpg")
image = cv2.imread(image_path)
if image is None:
    raise FileNotFoundError(image_path)
# Height and width of the image
(H, W) = image.shape[0:2]

# Get the names of the output layers that YOLO needs
layer_names = net.getLayerNames()
# 1-based indices of the unconnected output layers. Older OpenCV versions return
# them with shape (N, 1) and newer ones as a flat array, so flatten handles both.
out = np.array(net.getUnconnectedOutLayers()).flatten()
ln = [layer_names[i - 1] for i in out]
# For YOLOv3, for example, ln = ['yolo_82', 'yolo_94', 'yolo_106']

# Build a blob from the input image and pass it through the loaded model,
# which returns the bounding boxes and their associated probabilities
# blobFromImage(image, scalefactor=None, size=None, mean=None, swapRB=None, crop=None, ddepth=None)
# The blob normalizes the image and resizes it; the size should correspond to
# the width/height the model was trained with (see the .cfg file)
blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)

net.setInput(blob)
layerOutputs = net.forward(ln)  # forward pass through the named output layers

for output in layerOutputs:  # loop over the output layers
    for detection in output:  # loop over every candidate box in this layer
        # detection = [x, y, w, h, objectness, class1, class2, ...]
        scores = detection[5:]  # class scores: from the 6th value to the end
        classID = np.argmax(scores)  # index of the highest score
        confidence = scores[classID]
        if confidence > 0.5:  # drop low-confidence detections
            box = detection[0:4] * np.array([W, H, W, H])
            (centerX, centerY, width, height) = box.astype("int")
            # Top-left corner of the box
            left = int(centerX - (width / 2))
            top = int(centerY - (height / 2))
            # Record the detected box
            boxes.append([left, top, int(width), int(height)])
            confidences.append(float(confidence))
            classIDs.append(classID)

# Non-maximum suppression: score threshold 0.2, NMS threshold 0.3
idxs = cv2.dnn.NMSBoxes(boxes, confidences, 0.2, 0.3)
# The result is empty when no box is kept, so check before flattening
if len(idxs) > 0:
    for seq in np.array(idxs).flatten():  # e.g. [ 2  9  7 10  6  5  4]
        (x, y) = (boxes[seq][0], boxes[seq][1])  # top-left corner
        (w, h) = (boxes[seq][2], boxes[seq][3])  # width and height
        # if classIDs[seq]==0: # choose the box color by class
        #     color = [0,0,255]
        # else:
        #     color = [0,255,0]
        cv2.rectangle(image, (x, y), (x + w, y + h), color_list[classIDs[seq]], 2)  # draw the box
        text = "{}: {:.4f}".format(LABELS[classIDs[seq]], confidences[seq])
        cv2.putText(image, text, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.8, color_list[classIDs[seq]], 2)  # draw the label
cv2.namedWindow('Image', cv2.WINDOW_AUTOSIZE)
cv2.imshow("Image", image)
cv2.waitKey(0)

Then adapt it to your own needs.
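The decoding loop in the script can be pulled into a function, which makes it easier to check without a model or an image. The sketch below assumes each output row has the layout `[x, y, w, h, objectness, class scores...]` with coordinates normalized to 0-1, as in the script; `decode_yolo_rows` and the sample rows are made up for illustration. It is written in plain Python, so it accepts nested lists as well as NumPy arrays.

```python
def decode_yolo_rows(output, img_w, img_h, conf_threshold=0.5):
    """Turn raw YOLO rows into pixel boxes [left, top, w, h], confidences, class ids."""
    boxes, confidences, class_ids = [], [], []
    for row in output:
        scores = list(row[5:])  # class scores follow the 5 box/objectness values
        class_id = max(range(len(scores)), key=scores.__getitem__)
        confidence = float(scores[class_id])
        if confidence <= conf_threshold:
            continue  # drop low-confidence detections
        center_x = row[0] * img_w
        center_y = row[1] * img_h
        width = row[2] * img_w
        height = row[3] * img_h
        left = int(round(center_x - width / 2))
        top = int(round(center_y - height / 2))
        boxes.append([left, top, int(round(width)), int(round(height))])
        confidences.append(confidence)
        class_ids.append(class_id)
    return boxes, confidences, class_ids


# Two made-up rows for a model with 2 classes; only the first passes the threshold
rows = [
    [0.5, 0.5, 0.25, 0.5, 0.9, 0.125, 0.75],
    [0.1, 0.1, 0.1, 0.1, 0.2, 0.25, 0.125],
]
boxes, confidences, class_ids = decode_yolo_rows(rows, img_w=400, img_h=200)
print(boxes, confidences, class_ids)
```

The three returned lists have the same shape as `boxes`, `confidences`, and `classIDs` in the script, so they can be passed to `cv2.dnn.NMSBoxes` in the same way.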