[Deep Learning] Calling a Python YOLO Model from C++ for Object Detection

Latest recommended article published 2025-01-03 22:23:52

Chris_Liu_

Tags: python, artificial intelligence, opencv, deep learning

Article link: https://blog.csdn.net/Chris_Liu_/article/details/119739884

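This article describes how to call a Python YOLOv5 model from a C++ environment for object detection, covering environment configuration, the steps for calling Python from C++, the changes to the YOLOv5 source, and how C++ reads the values Python returns. Example code walks through each step: C++ reads an image, runs it through the YOLOv5 model, and receives the detection results back.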

Contents:

Preface
Steps for Calling Python from C++
Modifying the YOLOv5 Source
Reading the Python Return Value in C++

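Preface

Most deep learning algorithms today are implemented in Python, but some projects are built on C++ frameworks, which raises the problem of calling a model from C++. This article records the steps for calling a Python YOLOv5 model from C++: an image is read in C++, passed to the YOLO model for detection, and the class names, coordinates, and confidences are returned to C++. The development environment is Qt 5, Python 3.8, OpenCV 3.4.14, and the virtual environment used to run the YOLOv5 source.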

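I. Steps for Calling Python from C++

1. Configuring the Environment

Python ships with a C API that can be used from C++. The first step is to add the include folder and the library files from the Python directory to the project.

My Python was installed with Anaconda, so the paths I add here point into the Anaconda virtual environment. OpenCV 3.4 is added as well.

Add the following to the .pro file, adjusting the paths to your own setup: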

INCLUDEPATH+=D:/Opencv_Source/build_x64/install/include     \
             D:/Anaconda/envs/yolov5/include        \
             D:/Anaconda/envs/yolov5/Lib/site-packages/numpy/core/include/numpy

# qmake requires the opening brace on the same line as the condition
CONFIG(debug, debug|release) {
# python38_d.lib exists only if a debug build of Python is installed
LIBS+=D:/Opencv_Source/build_x64/install/x64/vc14/lib/opencv_world3414d.lib     \
      D:/Anaconda/envs/yolov5/libs/python38_d.lib
} else {
LIBS+=D:/Opencv_Source/build_x64/install/x64/vc14/lib/opencv_world3414.lib      \
      D:/Anaconda/envs/yolov5/libs/python38.lib
}
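
Since these directories differ from machine to machine, it can help to ask the interpreter itself where they are. The following is a minimal sketch, meant to be run inside the yolov5 environment; the function name `embedding_paths` is made up for this example, and the NumPy lookup is skipped if NumPy is not installed.

```python
import os
import sys
import sysconfig


def embedding_paths():
    """Collect the directories a C++ project needs to embed this interpreter."""
    paths = {
        # Directory that contains Python.h
        "python_include": sysconfig.get_paths()["include"],
        # Root of the environment (the value to pass to Py_SetPythonHome)
        "python_home": sys.prefix,
        # On Windows the import libraries (python38.lib) live in <root>/libs
        "python_libs_windows": os.path.join(sys.prefix, "libs"),
    }
    try:
        import numpy
        # Parent directory of the numpy/ header folder
        paths["numpy_include"] = numpy.get_include()
    except ImportError:
        paths["numpy_include"] = None  # NumPy is not installed in this environment
    return paths


if __name__ == "__main__":
    for name, value in embedding_paths().items():
        print(f"{name}: {value}")
```

Note that `numpy.get_include()` returns the parent of the `numpy` header folder, so with that path the header is included as `<numpy/arrayobject.h>`, whereas the .pro file above points one level deeper.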

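2. Steps for Calling Python

The rough sequence is:

(1) Initialize the interpreter

(2) Set the path where the file is located

(3) Import the file

(4) Get the module's function dictionary

(5) Call the function

The code below covers the first four steps; the comments explain each one: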

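#include <Python.h>
#include <iostream>
using namespace std;

// The statements below run inside a function that returns int
PyObject* pModule;   // the imported .py module
PyObject* pFunc;     // a function inside the module
PyObject* pClass;    // a class
PyObject* pInstance; // an instance
PyObject* args;      // arguments

Py_SetPythonHome(L"D:/Anaconda/envs/yolov5"); // root directory of the Python environment (where python.exe lives)
Py_Initialize(); // must be called before any other use of the interpreter

PyRun_SimpleString("import sys");
PyRun_SimpleString("sys.path.append('./')"); // directory that contains the .py file

// file is the module name, i.e. the file name without the .py extension
pModule = PyImport_ImportModule(file); // import the .py file from the path above
if (pModule == NULL)
{
    PyErr_Print(); // print the Python traceback explaining why the import failed
    cout << "Can't find the python file!" << endl;
    return 0;
}
cout << "find file succeeded" << endl;

// Dictionary of the module's attributes
PyObject* pDict = PyModule_GetDict(pModule); // borrowed reference to the module's namespace (functions, classes, ...)
if (pDict == NULL)
{
    cout << "Can't find the dictionary!" << endl;
    return 0;
}
cout << "find dictionary succeeded" << endl;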
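
For orientation, each of these C API calls has a close counterpart in ordinary Python. The sketch below walks through the same sequence from the Python side; `demo_module` and `add` are hypothetical names, and the module is written to a temporary directory only so that the example runs on its own.

```python
import importlib
import os
import sys
import tempfile

# Write a tiny module to a temporary directory so the sketch is self-contained.
# "demo_module" and "add" are hypothetical names used only for illustration.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "demo_module.py"), "w") as f:
    f.write("def add(x, y):\n    return x + y\n")

# PyRun_SimpleString("sys.path.append(...)"): make the module importable
sys.path.append(module_dir)

# PyImport_ImportModule(file): import by name, without the .py extension
module = importlib.import_module("demo_module")

# PyModule_GetDict(pModule): the module's namespace as a dictionary
namespace = vars(module)

# PyDict_GetItemString(pDict, function): look the function up by name
func = namespace["add"]

# PyObject_CallObject(pFunc, args): call it with a tuple of positional arguments
args = (2, 3)
result = func(*args)
print(result)  # 5
```

Seen this way, a failed `PyImport_ImportModule` is usually the same problem as a failed `import` in Python: the directory is not on `sys.path`, or the module raises an exception while loading.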

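3. Calling the Function and Passing Arguments

Calling a function in a Python module takes only two lines of code, but C++ has no built-in function that converts an OpenCV Mat directly into a Python data type.

Here I borrowed code from another blog post. It copies the data of the C++ Mat into a NumPy array, which is then placed in the argument tuple passed to the Python module. Link: https://blog.csdn.net/qq_38109843/article/details/87969732

The C++ code is as follows: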

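    import_array(); // initialize the NumPy C API (a macro that returns from the enclosing function on failure)

    // Copy the 3-channel Mat into a contiguous buffer of m rows by n bytes
    int m, n;
    n = img.cols*3;
    m = img.rows;
    unsigned char *data = (unsigned  char*)malloc(sizeof(unsigned char) * m * n);
    int p = 0;
    for (int i = 0; i < m; i++)
    {
        const unsigned char *row = img.ptr<unsigned char>(i); // start of row i
        for (int j = 0; j < n; j++)
        {
            data[p] = row[j];
            p++;
        }
    }

    npy_intp Dims[2] = { m, n }; // array shape: rows, cols*3
    // The array wraps data without copying or owning it: keep data alive while Python uses it, then free(data)
    PyObject*PyArray = PyArray_SimpleNewFromData(2, Dims, NPY_UBYTE, data);

    PyObject *ArgArray = PyTuple_New(2);
    PyObject *arg = PyLong_FromLong(30);
    PyTuple_SetItem(ArgArray, 0, PyArray); // PyTuple_SetItem steals the reference
    PyTuple_SetItem(ArgArray, 1, arg);

    // pDict is the module's dictionary, function is the function name
    PyObject*pFunc = PyDict_GetItemString(pDict, function);     // look up the function (borrowed reference)

    // ArgArray holds the arguments, pRet receives the return value
    PyObject* pRet= PyObject_CallObject(pFunc, ArgArray);       // call the function
    if (pRet == NULL)
        PyErr_Print(); // the call raised a Python exception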

The Python code:

import numpy as np

def arrayreset(array):
    # array has shape (rows, cols*3); take every third byte to separate the interleaved channels
    a = array[:, 0::3]
    b = array[:, 1::3]
    c = array[:, 2::3]
    a = a[:, :, None]
    b = b[:, :, None]
    c = c[:, :, None]
    m = np.concatenate((a, b, c), axis=2)  # back to (rows, cols, 3)
    return m

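Note that the image passed in must be a 3-channel color image, because the conversion assumes three interleaved bytes per pixel. A Mat loaded by OpenCV is in BGR order, and the detection code converts it to RGB.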
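
To see why the conversion works, recall that the C++ side sends each image row as cols*3 consecutive bytes with the channels interleaved. The following sketch checks this on a tiny synthetic image. The helper names (`flatten_like_cpp`, `restore_by_slicing`, `restore_by_reshape`) are made up for this example; the last one shows that, under the same layout assumption, a single reshape gives the same result as the channel-by-channel slicing.

```python
import numpy as np


def flatten_like_cpp(img):
    """Mimic the C++ copy loop: (rows, cols, 3) -> (rows, cols * 3)."""
    rows, cols, channels = img.shape
    return img.reshape(rows, cols * channels)


def restore_by_slicing(array):
    """Channel-by-channel slicing, as in arrayreset."""
    planes = [array[:, k::3][:, :, None] for k in range(3)]
    return np.concatenate(planes, axis=2)


def restore_by_reshape(array):
    """Equivalent single-step form: split every row into groups of 3 bytes."""
    return array.reshape(array.shape[0], -1, 3)


# Synthetic 2x4 image with 3 channels; every byte value is distinct
img = np.arange(2 * 4 * 3, dtype=np.uint8).reshape(2, 4, 3)
flat = flatten_like_cpp(img)

print(flat.shape)                                      # (2, 12)
print(np.array_equal(restore_by_slicing(flat), img))   # True
print(np.array_equal(restore_by_reshape(flat), img))   # True
```

One difference worth knowing: the reshape returns a view on the same memory, while `np.concatenate` makes a copy, so the slicing version stays valid even if the C++ buffer is released later.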

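II. Modifying the YOLOv5 Source

1. YOLOv5 Environment Setup

I set up the YOLO environment by following someone else's blog post. As a university student who still has a lot to learn, I won't write another guide that might mislead people.

Here is the link: https://blog.csdn.net/kasaiki/article/details/108651751

How to train your own model: https://blog.csdn.net/weixin_44936889/article/details/110661862

2. Modifying the YOLOv5 Code

The YOLOv5 source already includes a detect.py file for object detection. It is feature-rich and supports many input and output modes through its parameters. Here I simply strip it down: it takes a single image, runs detection, and returns each target's class name, coordinates, and confidence.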

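# Assumed import paths, matching the YOLOv5 repository layout of mid-2021.
# Check them against your checkout (for example, letterbox later moved to utils.augmentations).
import time

import cv2
import numpy as np
import torch

from models.experimental import attempt_load
from utils.datasets import letterbox
from utils.general import check_img_size, non_max_suppression, apply_classifier, scale_coords, set_logging
from utils.plots import colors, plot_one_box
from utils.torch_utils import select_device, load_classifier, time_synchronized

def detect(image,a=1,  # a receives the second argument passed from C++ and is not used here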
           weights='best.pt',  # model.pt path(s)
           imgsz=640,  # inference size (pixels)
           conf_thres=0.25,  # confidence threshold
           iou_thres=0.45,  # NMS IOU threshold
           device='cpu',
           max_det=1000,  # maximum detections per image
           classes=None,  # filter by class: --class 0, or --class 0 2 3
           agnostic_nms=False,  # class-agnostic NMS
           augment=False,  # augmented inference
           line_thickness=3,  # bounding box thickness (pixels)
           hide_labels=False,  # hide labels
           hide_conf=False,  # hide confidences
           half=False,  # use FP16 half-precision inference
           ):

    #Initialize
    set_logging()
    device = select_device(device)
    half &= device.type != 'cpu'  # half precision only supported on CUDA

    # Load image
    im0s = arrayreset(image)
    img = letterbox(im0s)[0]
    img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW (3 x height x width)
    img = np.ascontiguousarray(img)

    # Load model
    model = attempt_load(weights, map_location=device)  # load FP32 model
    stride = int(model.stride.max())  # model stride
    imgsz = check_img_size(imgsz, s=stride)  # check image size
    names = model.module.names if hasattr(model, 'module') else model.names  # get class names
    if half:
        model.half()  # to FP16

    # Second-stage classifier
    classify = False
    if classify:
        modelc = load_classifier(name='resnet101', n=2)  # initialize
        modelc.load_state_dict(torch.load('weights/resnet101.pt', map_location=device)['model']).to(device).eval()

    # Run inference
    if device.type != 'cpu':
        model(torch.zeros(1, 3, imgsz, imgsz).to(device).type_as(next(model.parameters())))  # run once
    t0 = time.time()

    img = torch.from_numpy(img).to(device)
    img = img.half() if half else img.float()  # uint8 to fp16/32
    img /= 255.0  # 0 - 255 to 0.0 - 1.0
    if img.ndimension() == 3:
        img = img.unsqueeze(0)

    # Inference
    t1 = time_synchronized()
    pred = model(img, augment=augment)[0]

    # Apply NMS
    pred = non_max_suppression(pred, conf_thres, iou_thres, classes, agnostic_nms, max_det=max_det)
    t2 = time_synchronized()

    # Apply Classifier
    if classify:
        pred = apply_classifier(pred, modelc, img, im0s)

    info = []

    for i, det in enumerate(pred):  # detections per image
        if len(det):
            # Rescale boxes from img_size to im0 size
            det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0s.shape).round()

            for *xyxy, conf, cls in reversed(det):# Add bbox to image
                c = int(cls)  # integer class
                label = None if hide_labels else (names[c] if hide_conf else f'{names[c]} {conf:.2f}')
                plot_one_box(xyxy, im0s, label=label, color=colors(c, True), line_thickness=line_thickness)
                x1,y1,x2,y2=xyxy[0].item(), xyxy[1].item(), xyxy[2].item(), xyxy[3].item()
                info.append(names[c])
                info.append((x1, y1, x2, y2,conf.item()))

    cv2.imshow('show', im0s)
    print(f'Done. ({time.time() - t0:.3f}s)')
    cv2.waitKey(0)  # block until a key is pressed
    return info
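
The preprocessing in the middle of detect() (channel reversal, moving channels first, scaling to 0..1, adding a batch dimension) can be followed with NumPy alone. This is a minimal sketch on a synthetic 4x6 image; it leaves out letterbox and the torch conversion and only illustrates the array shapes and channel order.

```python
import numpy as np

# Synthetic BGR image, height 4, width 6 (stand-in for the letterboxed frame)
bgr = np.zeros((4, 6, 3), dtype=np.uint8)
bgr[..., 0] = 255  # blue channel at full intensity

# Reverse the channel axis (BGR -> RGB), then move channels first (HWC -> CHW)
chw = bgr[:, :, ::-1].transpose(2, 0, 1)
chw = np.ascontiguousarray(chw)

# Scale 0..255 to 0.0..1.0 and add a leading batch dimension
batch = (chw.astype(np.float32) / 255.0)[None]

print(batch.shape)        # (1, 3, 4, 6)
print(batch[0, 2].max())  # 1.0 (blue is now the last channel)
print(batch[0, 0].max())  # 0.0 (red channel is empty)
```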

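The returned info is a list in the format [name, (x1, y1, x2, y2, value), ...], where name is the class name as a string and the tuple holds the box coordinates followed by the confidence value. Detected targets are appended to the list in order, and the list is then returned to C++ to be parsed.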
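
Because names and tuples alternate in a flat list, any consumer has to step through it two entries at a time. The sketch below shows that pairing on the Python side; `pair_detections` is a hypothetical helper and the sample values are made up for illustration, not real model output.

```python
def pair_detections(info):
    """Turn [name, (x1, y1, x2, y2, conf), ...] into a list of dicts."""
    if len(info) % 2 != 0:
        raise ValueError("expected alternating name / tuple entries")
    detections = []
    for i in range(0, len(info), 2):
        name = info[i]
        x1, y1, x2, y2, conf = info[i + 1]
        detections.append({"name": name, "box": (x1, y1, x2, y2), "conf": conf})
    return detections


# Made-up values for illustration only, not real model output
sample = ["cat", (10.0, 20.0, 110.0, 220.0, 0.91),
          "dog", (5.0, 8.0, 50.0, 80.0, 0.47)]

for det in pair_detections(sample):
    print(det["name"], det["box"], det["conf"])
```

The same index pattern (element i is the name, element i+1 is the tuple) is what the C++ reader has to follow.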

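III. Reading the Return Value in C++

The Python C API provides functions for converting Python data types. The return value here contains a list, strings, and tuples, and each has a corresponding conversion function.

I learned this from the following blog post: https://blog.csdn.net/stu_csdn/article/details/69488385

The code is below: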

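    const char * buffer1;     // points at the string returned from Python
    std::string buffer2;      // C++ copy of the class name (requires <string>)
    PyObject *ListItem;

    // coordinates and confidence
    float x1 = 0;
    float y1=0;
    float x2 = 0;
    float y2=0;
    float value=0;

    if(pRet != NULL && PyList_Check(pRet))  // make sure the return value is a list
    {
         for(Py_ssize_t i=0;i+1<PyList_Size(pRet);i+=2)
         {
             ListItem=PyList_GetItem(pRet,i);      // i-th element (borrowed reference): the class name
             PyArg_Parse(ListItem,"s",&buffer1);    // convert to a C string
             buffer2=buffer1;
             cout<<buffer2<<endl;

             ListItem=PyList_GetItem(pRet,i+1);    // the (x1, y1, x2, y2, confidence) tuple
             PyArg_ParseTuple(ListItem, "fffff", &x1,&y1,&x2,&y2,&value); // five floats, one format unit each
             cout<<"x1: "<<x1<<" y1: "<<y1<<" x2: "<<x2<<" y2: "<<y2<<"  "<<value<<endl;
         }
    }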

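Result of running the complete code:

Summary

I'd like to share some thoughts after a little over half a year of dabbling in deep learning. Before that I had only learned OpenCV for competitions. Later, since I had an image processing background, I joined a teacher's project and moved on to deep learning. But I really am like the "package caller" that people often mock online: in deep learning all I can do is call a library and tweak parameters. My engineering skills have had a good workout, but I lack the theoretical foundation. So next I plan to catch up on theory, and rather than studying only deep learning algorithms, I also want to learn more of the excellent algorithms in OpenCV.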