How to call YOLOv3 from OpenCV (Python version)

陈子迩

Last modified: 2023-07-31 21:17:46

First published: 2023-07-31 12:58:52

Article link: https://blog.csdn.net/weixin_45303602/article/details/132020166

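YOLO is short for "You Only Look Once". It is not the most accurate detection algorithm, but it strikes a good balance between accuracy and speed and performs well in practice. YOLOv3 builds on YOLOv1 and YOLOv2; it introduces few new ideas, but it keeps the speed advantage of the YOLO family while improving detection accuracy, especially for small objects. YOLOv3 applies a single neural network to the whole image, divides the image into regions, and predicts bounding boxes and probabilities for each region.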

Note: this article uses opencv-python version 4.5.2.52.

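The code in this article runs real-time object detection on a camera feed. It also works with a local video file if you pass the file path to cv2.VideoCapture instead of a camera ID.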
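
As a minimal sketch of switching between the camera and a local video, the helper below turns a command-line style argument into a value that cv2.VideoCapture accepts: an integer camera ID or a file path. The names parse_source and open_capture are hypothetical and do not appear in the original code.

```python
def parse_source(arg):
    """Return an int camera ID for digit strings, otherwise the value as a file path.

    Hypothetical helper, not part of the original article.
    """
    text = str(arg).strip()
    return int(text) if text.isdigit() else text


def open_capture(arg):
    """Open a camera or a video file and fail early if it cannot be opened."""
    import cv2  # imported lazily so parse_source can be used without OpenCV

    cap = cv2.VideoCapture(parse_source(arg))
    if not cap.isOpened():
        raise RuntimeError(f"Cannot open video source: {arg!r}")
    return cap
```

With this in place, open_capture("0") would use the default camera and open_capture("demo.mp4") would read a local file (the file name is only an example).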

import cv2
import numpy as np

cap = cv2.VideoCapture(0)
whT = 320
confThreshold = 0.5
nmsThreshold = 0.3

classFile = 'classes.txt'
classNames = []
with open(classFile, 'rt') as f:
    classNames = f.read().rstrip('\n').split('\n')

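Import the required libraries (OpenCV and NumPy).
Open the camera (ID 0 is the default camera).
Set the network input size (whT), the confidence threshold (confThreshold), and the non-maximum suppression threshold (nmsThreshold).
Read the class names (classNames) from a file, one name per line.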
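
The class file is expected to hold one name per line, in the same order as the class IDs the model outputs. A slightly more defensive way to read it might look like the sketch below; parse_class_names and load_class_names are hypothetical names, and the sketch assumes that blank lines are accidental rather than placeholders for a class ID.

```python
from pathlib import Path


def parse_class_names(text):
    """Split file contents into class names, trimming whitespace and skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def load_class_names(path, encoding="utf-8"):
    """Read a class-name file (one name per line) into a list."""
    return parse_class_names(Path(path).read_text(encoding=encoding))
```

Note that skipping a blank line in the middle of the file shifts every later index, so a file with such gaps should be fixed rather than silently loaded.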

modelConfiguration = 'yolov3.cfg'
modelWeights = 'yolov3.weights'

net = cv2.dnn.readNetFromDarknet(modelConfiguration, modelWeights)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)

Load the YOLOv3 configuration file (modelConfiguration) and the pretrained weights file (modelWeights).

Create the network (net) and set its computation backend (OpenCV) and target device (CPU).
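
If the configuration or weights file is missing, the error raised while loading the network can be hard to read. One option is to check the files first; the sketch below uses only the standard library, and missing_files is a hypothetical helper name.

```python
from pathlib import Path


def missing_files(paths):
    """Return the paths that do not exist as regular files, in input order."""
    return [str(p) for p in paths if not Path(p).is_file()]


# Possible use before cv2.dnn.readNetFromDarknet (file names taken from the article):
# missing = missing_files(['yolov3.cfg', 'yolov3.weights', 'classes.txt'])
# if missing:
#     raise FileNotFoundError(f"Missing model files: {missing}")
```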

def findObejects(outputs, img):
    hT, wT, cT = img.shape
    bbox = []
    classIds = []
    confs = []

    for output in outputs:
        for det in output:
            scores = det[5:]
            classId = np.argmax(scores)
            confidence = scores[classId]
            if confidence > confThreshold:
                w, h = int(det[2] * wT), int(det[3] * hT)
                x, y = int((det[0] * wT) - w / 2), int((det[1] * hT) - h / 2)
                bbox.append([x, y, w, h])
                classIds.append(classId)
                confs.append(float(confidence))

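Define the function findObejects, which extracts detections from the YOLOv3 outputs.

Each output row holds the box center coordinates, width, and height (all normalized to the image size), an objectness score, and the per-class scores. The function takes the highest class score as the confidence and, when it exceeds confThreshold, converts the box to pixel coordinates and appends the result to the bbox, classIds, and confs lists.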
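
To make the row layout concrete, the sketch below decodes a single detection row with plain Python lists. It mirrors the arithmetic in findObejects, but decode_detection is a hypothetical helper and the row values in the usage note are made up for illustration.

```python
def decode_detection(det, img_w, img_h, conf_threshold=0.5):
    """Decode one YOLO output row into ([x, y, w, h], class_id, confidence).

    det layout: [cx, cy, w, h, objectness, score_0, score_1, ...], with the
    first four values normalized to the range 0..1.
    Returns None when the best class score does not exceed conf_threshold.
    """
    scores = det[5:]
    class_id = max(range(len(scores)), key=lambda k: scores[k])
    confidence = float(scores[class_id])
    if confidence <= conf_threshold:
        return None
    # Scale the normalized size to pixels, then move from center to top-left corner
    w = int(det[2] * img_w)
    h = int(det[3] * img_h)
    x = int(det[0] * img_w - w / 2)
    y = int(det[1] * img_h - h / 2)
    return [x, y, w, h], class_id, confidence
```

For a 640x480 frame, a row such as [0.5, 0.5, 0.25, 0.5, 0.9, 0.1, 0.8, 0.05] describes a centered box that is a quarter of the frame wide and half the frame high, and its best class is index 1.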

    # print(len(bbox))
    indices = cv2.dnn.NMSBoxes(bbox, confs, confThreshold, nmsThreshold)
    print(indices)
    # NMSBoxes returns an Nx1 array in opencv-python 4.5.2 and a flat array in
    # some newer releases; flattening handles both layouts.
    for i in np.array(indices).flatten():
        box = bbox[i]
        x, y, w, h = box[0], box[1], box[2], box[3]
        cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 255), 2)
        cv2.putText(img, f'{classNames[classIds[i]]} {int(confs[i]*100)}%',
                    (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 0, 255), 2)

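Apply non-maximum suppression (NMS) to the detected boxes to remove boxes that overlap heavily with a higher-scoring box.

Iterate over the boxes that survive NMS, draw each one on the original image, and show the class name and confidence above the box.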
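
The idea behind NMS can be shown with a small pure-Python sketch of the classic greedy variant. This is an illustration of the technique only, not OpenCV's implementation; the function names iou and greedy_nms are hypothetical, and boxes use the same [x, y, w, h] layout as bbox.

```python
def iou(a, b):
    """Intersection over union of two boxes given as [x, y, w, h]."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, score_threshold, iou_threshold):
    """Return indices of kept boxes, highest score first."""
    # Drop low-scoring boxes, then visit the rest in descending score order
    order = sorted(
        (i for i, s in enumerate(scores) if s > score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

A lower nmsThreshold (here iou_threshold) removes more overlapping boxes; the article's value of 0.3 is fairly aggressive.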

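while True:
    success, img = cap.read()
    if not success:
        # No frame available (camera unavailable or end of the video file)
        break
    # OpenCV reads frames as BGR; swapRB=True converts to the RGB order the model expects
    blob = cv2.dnn.blobFromImage(img, 1/255, (whT, whT), [0, 0, 0], swapRB=True, crop=False)
    net.setInput(blob)

    # Names of the unconnected output layers (the YOLO detection layers)
    outputNames = net.getUnconnectedOutLayersNames()

    outputs = net.forward(outputNames)
    findObejects(outputs, img)

    cv2.imshow('image', img)
    cv2.waitKey(1)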

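In an infinite loop, read frames from the camera.
Preprocess each frame into a blob (a 4-D array in NCHW layout) at the network input size.
Run YOLOv3 inference to get the outputs (outputs).
Call findObejects to parse the detections and draw them on the frame.
Show the processed frame in a window. cv2.waitKey(1) gives the window 1 ms to refresh; the snippet above does not check which key was pressed, so it does not exit on a key press, whereas the full listing below exits the loop when 'q' is pressed.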
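
The exit check compares the low byte of the value returned by cv2.waitKey with the code of the quit key. The sketch below isolates that logic in a hypothetical helper, is_quit_key, so it can be read and tested without a window.

```python
def is_quit_key(key_code, quit_char="q"):
    """Return True if the code returned by cv2.waitKey corresponds to quit_char."""
    # waitKey returns -1 when no key was pressed; on some platforms the code
    # carries extra high bits, so only the low byte is compared.
    return key_code != -1 and (key_code & 0xFF) == ord(quit_char)
```

Inside the loop this would be used as: if is_quit_key(cv2.waitKey(1)): break.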

The complete code follows. The configuration and weights files it needs are available on Baidu Netdisk:

Link: https://pan.baidu.com/s/1zcy0fVQ38NvmM7763mNr7A
Extraction code: eo38

import cv2
import numpy as np

# Initialize camera capture
cap = cv2.VideoCapture(0)

# YOLOv3 model parameters
whT = 320
confThreshold = 0.5
nmsThreshold = 0.3

# Load class names from file
classFile = 'classes.txt'
classNames = []
with open(classFile, 'rt') as f:
    classNames = f.read().rstrip('\n').split('\n')

# Load the YOLOv3 model
modelConfiguration = 'yolov3.cfg'
modelWeights = 'yolov3.weights'
net = cv2.dnn.readNetFromDarknet(modelConfiguration, modelWeights)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)

def findObejects(outputs, img):
    hT, wT, cT = img.shape
    bbox = []
    classIds = []
    confs = []

    # Process the YOLOv3 outputs and keep detections above the confidence threshold
    for output in outputs:
        for det in output:
            scores = det[5:]
            classId = np.argmax(scores)
            confidence = scores[classId]
            if confidence > confThreshold:
                # Convert the normalized box size and center to pixel coordinates
                w, h = int(det[2] * wT), int(det[3] * hT)
                x, y = int((det[0] * wT) - w / 2), int((det[1] * hT) - h / 2)
                bbox.append([x, y, w, h])
                classIds.append(classId)
                confs.append(float(confidence))

    # Run non-maximum suppression to remove duplicate detections
    indices = cv2.dnn.NMSBoxes(bbox, confs, confThreshold, nmsThreshold)

    # Draw bounding boxes and labels.
    # NMSBoxes returns an Nx1 array in opencv-python 4.5.2 and a flat array in
    # some newer releases; flattening handles both layouts.
    for i in np.array(indices).flatten():
        box = bbox[i]
        x, y, w, h = box[0], box[1], box[2], box[3]
        cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 255), 2)
        cv2.putText(img, f'{classNames[classIds[i]]} {int(confs[i]*100)}%',
                    (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 0, 255), 2)

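while True:
    # Grab one frame from the camera
    success, img = cap.read()
    if not success:
        # No frame available (camera unavailable or end of the video file)
        break

    # Prepare the frame for the YOLOv3 model.
    # OpenCV reads frames as BGR; swapRB=True converts to the RGB order the model expects
    blob = cv2.dnn.blobFromImage(img, 1/255, (whT, whT), [0, 0, 0], swapRB=True, crop=False)
    net.setInput(blob)

    # Get the names of the YOLOv3 output layers
    outputNames = net.getUnconnectedOutLayersNames()

    # YOLOv3 forward pass
    outputs = net.forward(outputNames)

    # Detect objects in the frame and draw their bounding boxes
    findObejects(outputs, img)

    # Show the frame with bounding boxes
    cv2.imshow('image', img)

    # Exit the loop when 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the camera and close the windows
cap.release()
cv2.destroyAllWindows()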
