2021SC@SDUSC Shandong University School of Software, Software Engineering Application and Practice: YOLOv5 Code Analysis, Part 12: detect.py (1)

Grey Cluster

Category: yolov5. Tags: pytorch, deep learning, python

Article link: https://blog.csdn.net/qq_53219137/article/details/122184207

2021SC@SDUSC

Please do not overlook the comments in the code.

Importing libraries

import argparse
import sys
from pathlib import Path

import cv2
import numpy as np
import torch
import torch.backends.cudnn as cudnn

FILE = Path(__file__).resolve()
ROOT = FILE.parents[0]  # YOLOv5 root directory
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))  # add ROOT to PATH

from models.experimental import attempt_load
from utils.datasets import LoadImages, LoadStreams
from utils.general import apply_classifier, check_img_size, check_imshow, check_requirements, check_suffix, colorstr, \
    increment_path, is_ascii, non_max_suppression, print_args, save_one_box, scale_coords, set_logging, \
    strip_optimizer, xyxy2xywh
from utils.plots import Annotator, colors
from utils.torch_utils import load_classifier, select_device, time_sync

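argparse: Python's standard module for command-line parsing. It lets us pass arguments to the program directly on the command line when we run it.

sys: the system module, which contains functions related to the Python interpreter and its environment.

Path: the class from pathlib that turns a str path into a Path object, which makes path strings easier to manipulate.

cv2: the OpenCV module.

torch: the PyTorch module.

cudnn: torch.backends.cudnn, PyTorch's interface to the cuDNN backend settings (used later for cudnn.benchmark).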
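To make the FILE / ROOT lines above more concrete, here is a minimal sketch of the same pathlib and sys.path pattern. The path /home/user/yolov5/detect.py is a made-up location used only for illustration; in detect.py the value comes from Path(__file__).resolve().

```python
import sys
from pathlib import Path

# Hypothetical file location, used only for illustration.
file = Path('/home/user/yolov5/detect.py')

# parents[0] is the directory that contains the file (same as file.parent).
root = file.parents[0]

# stem is the file name without its suffix; detect.py passes FILE.stem to print_args.
script_name = file.stem

# Add the directory to the import search path only once.
if str(root) not in sys.path:
    sys.path.append(str(root))
```

With the directory on sys.path, imports such as `from models.experimental import attempt_load` can be resolved relative to the repository root regardless of the current working directory.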

Setting the opt parameters

def parse_opt():
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights', nargs='+', type=str, default='yolov5s.pt', help='model path(s)')
    parser.add_argument('--source', type=str, default='data/images', help='file/dir/URL/glob, 0 for webcam')
    parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640], help='inference size h,w')
    parser.add_argument('--conf-thres', type=float, default=0.25, help='confidence threshold')
    parser.add_argument('--iou-thres', type=float, default=0.45, help='NMS IoU threshold')
    parser.add_argument('--max-det', type=int, default=1000, help='maximum detections per image')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--view-img', action='store_true', help='show results')
    parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
    parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
    parser.add_argument('--save-crop', action='store_true', help='save cropped prediction boxes')
    parser.add_argument('--nosave', action='store_true', help='do not save images/videos')
    parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --class 0, or --class 0 2 3')
    parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
    parser.add_argument('--augment', action='store_true', help='augmented inference')
    parser.add_argument('--visualize', action='store_true', help='visualize features')
    parser.add_argument('--update', action='store_true', help='update all models')
    parser.add_argument('--project', default='runs/detect', help='save results to project/name')
    parser.add_argument('--name', default='exp', help='save results to project/name')
    parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
    parser.add_argument('--line-thickness', default=3, type=int, help='bounding box thickness (pixels)')
    parser.add_argument('--hide-labels', default=False, action='store_true', help='hide labels')
    parser.add_argument('--hide-conf', default=False, action='store_true', help='hide confidences')
    parser.add_argument('--half', action='store_true', help='use FP16 half-precision inference')
    opt = parser.parse_args()
    opt.imgsz *= 2 if len(opt.imgsz) == 1 else 1  # expand
    print_args(FILE.stem, opt)
    return opt

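weights: path(s) to the model weights; default yolov5s.pt.

source: path of the test data (images or video); default data/images. Passing 0 selects the webcam.

imgsz: input image size for the network; default 640. A single value is expanded to [640, 640].

conf-thres: object confidence threshold; default 0.25.

iou-thres: IoU threshold used in NMS; default 0.45.

max-det: maximum number of detections per image; default 1000.

device: the device the code runs on: a CUDA device such as 0 or 0,1,2,3, or cpu.

view-img: whether to display the predicted images or video; default False.

save-txt: whether to save the predicted box coordinates as txt files; default False. When set, one txt file per image is written under runs/detect/expn/labels.

save-conf: whether to also write each object's confidence into the txt files; default False.

save-crop: whether to crop the detected objects out of the original image and save them; default False. When set, a crops folder is created under runs/detect/expn and the cropped images are saved there.

nosave: whether to skip saving the predicted images; default False, which means the predicted images are saved.

classes: keep only the given classes in NMS; default None, which means every class that meets the thresholds is kept.

agnostic-nms: whether NMS also suppresses overlapping boxes that belong to different classes; default False.

augment: whether to use test-time augmentation (TTA) during inference; default False.

visualize: whether to visualize features; default False.

update: whether to strip the optimizer from the checkpoint and update the model; default False.

project: the parent folder for the results of this run; default runs/detect.

name: the name of the results folder under runs/detect; default exp.

exist-ok: whether an existing project/name folder may be reused; default False, so normally a new, incremented folder is created.

line-thickness: line width of the drawn boxes in pixels; default 3.

hide-labels: whether to hide the label text on the drawn boxes; default False.

hide-conf: whether to hide the confidence on the drawn boxes; default False.

half: whether to use half-precision (Float16) inference, which can shorten inference time; default False.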
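The defaults listed above follow directly from how argparse treats `action='store_true'` and `nargs='+'`. The sketch below rebuilds only three of the arguments from parse_opt to show this; `build_parser` and `parse` are helper names introduced here for illustration and are not part of detect.py.

```python
import argparse


def build_parser():
    """Rebuild a small subset of the detect.py arguments (illustration only)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int, default=[640])
    parser.add_argument('--save-txt', action='store_true')  # False unless the flag is given
    parser.add_argument('--classes', nargs='+', type=int)   # None unless the flag is given
    return parser


def parse(argv):
    opt = build_parser().parse_args(argv)
    # Same expansion as in parse_opt: a single size becomes [size, size].
    opt.imgsz *= 2 if len(opt.imgsz) == 1 else 1
    return opt


defaults = parse([])
custom = parse(['--img', '320', '--save-txt', '--classes', '0', '2'])
rect = parse(['--imgsz', '480', '640'])

print(defaults.imgsz, defaults.save_txt, defaults.classes)
print(custom.imgsz, custom.save_txt, custom.classes)
print(rect.imgsz)
```

Note that argparse converts the dashes in option names to underscores, so `--save-txt` is read back as `opt.save_txt`.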

The main function

def main(opt):
    # Call colorstr to print the selected opt parameters in color
    print(colorstr('detect: ') + ', '.join(f'{k}={v}' for k, v in vars(opt).items()))
    # Check that the installed packages satisfy the requirements txt file
    check_requirements(exclude=('tensorboard', 'thop'))
    # Call run to start inference
    run(**vars(opt))
The run function

Receiving the parameters

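def run(weights='weights/yolov5s.pt',  # path to the weights file
        source='data/images',          # path of the test data (images or video), default data/images
        imgsz=640,                     # input image size, default 640 (pixels)
        conf_thres=0.25,               # object confidence threshold, default 0.25, used in NMS
        iou_thres=0.45,                # IoU threshold for NMS, default 0.45, used in NMS
        max_det=1000,                  # maximum number of detections per image, used in NMS
        device='',                     # device to run on: cuda device, i.e. 0 or 0,1,2,3 or cpu
        view_img=False,                # whether to display the predicted images or video, default False
        save_txt=False,   # whether to save predicted box coordinates as txt files, default False; writes one txt per image under runs/detect/expn/labels
        save_conf=False,  # whether to write each object's confidence into the txt files, default False
        save_crop=False,  # whether to crop detected objects from the original image and save them under runs/detect/expn/crops, default False
        nosave=False,     # whether to skip saving the predicted images, default False, i.e. images are saved
        classes=None,     # keep only the given classes in NMS, default None, i.e. all classes that qualify are kept
        agnostic_nms=False,     # whether NMS also suppresses boxes across different classes, default False
        augment=False,          # whether to use test-time augmentation (TTA), default False
        visualize=False,        # whether to visualize features, default False (needed because parse_opt defines --visualize)
        update=False,           # whether to strip the optimizer from the checkpoint and update the model, default False
        project='runs/detect',  # parent folder for the results, default runs/detect
        name='exp',             # name of the results folder under runs/detect, default exp  =>  runs/detect/exp
        exist_ok=False,         # whether an existing folder may be reused, default False, so a new folder is normally created
        line_thickness=3,       # bounding box thickness (pixels), default 3
        hide_labels=False,      # whether to hide the label text on the drawn boxes, default False
        hide_conf=False,        # whether to hide the confidence on the drawn boxes, default False
        half=False,             # whether to use half-precision (Float16) inference, which can shorten inference time, default False
        prune_model=False,      # whether to use model pruning to speed up inference
        fuse=False,             # whether to use conv + bn fusion to speed up inference
        ):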

Initializing configuration


    # Whether to save the predicted images. nosave defaults to False, so images are saved unless source ends with .txt
    save_img = not nosave and not source.endswith('.txt')  # save inference images
    # Whether the source is a webcam or stream (numeric id, .txt list of streams, or a stream URL).
    # Usually False, because we normally read files through LoadImages (which handles image and video files)
    webcam = source.isnumeric() or source.endswith('.txt') or source.lower().startswith(
        ('rtsp://', 'rtmp://', 'http://', 'https://'))

    # Directories
    # Check whether Path(project) / name already exists; with the default exist_ok=False an existing path gets a new save_dir
    # The name passed in is extended into a new save_dir, e.g. if runs/detect/exp exists it becomes runs/detect/exp2
    save_dir = increment_path(Path(project) / name, exist_ok=exist_ok)  # increment run
    # Create save_dir / 'labels' if txt output is requested, otherwise create save_dir
    # save_txt defaults to False, so normally only save_dir (runs/detect/expn) is created
    (save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True)  # make dir

    # Initialize
    # Set up logging
    set_logging()
    # Select an available device on the current machine
    device = select_device(device)
    # half (float16) stays enabled only when the device is not the CPU; it applies to both the model and the input images
    half &= device.type != 'cpu'  # half precision only supported on CUDA
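The three decisions made in this block (stream or file, which output folder, whether half precision survives) can be tried out without YOLOv5 installed. In the sketch below, `is_stream` copies the webcam expression from the code above, while `next_run_dir` is a simplified, hypothetical stand-in for increment_path: it only illustrates the exp, exp2, exp3 naming and is not the implementation in utils.general.

```python
import tempfile
from pathlib import Path


def is_stream(source):
    """Same test as the webcam line in run(): numeric id, .txt list, or stream URL."""
    return source.isnumeric() or source.endswith('.txt') or source.lower().startswith(
        ('rtsp://', 'rtmp://', 'http://', 'https://'))


def next_run_dir(path, exist_ok=False):
    """Simplified stand-in for increment_path (hypothetical helper)."""
    path = Path(path)
    if exist_ok or not path.exists():
        return path
    n = 2
    while Path(f'{path}{n}').exists():  # try exp2, exp3, ... until one is free
        n += 1
    return Path(f'{path}{n}')


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / 'exp'
    first = next_run_dir(base)
    first.mkdir(parents=True)
    second = next_run_dir(base)
    second.mkdir()
    third = next_run_dir(base)
    reused = next_run_dir(base, exist_ok=True)
    names = [first.name, second.name, third.name, reused.name]

# half precision is switched off again when the device is the CPU.
half = True
device_type = 'cpu'  # assumed device type for this example
half &= device_type != 'cpu'
```

The `&=` form means a user request for `--half` is silently dropped on CPU rather than raising an error.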

Loading the model and its parameters, and adjusting the model

    # Load model
    w = weights[0] if isinstance(weights, list) else weights
    classify, suffix, suffixes = False, Path(w).suffix.lower(), ['.pt', '.onnx', '.tflite', '.pb', '']
    check_suffix(w, suffixes)  # check weights have acceptable suffix
    pt, onnx, tflite, pb, saved_model = (suffix == x for x in suffixes)  # backend booleans
    stride, names = 64, [f'class{i}' for i in range(1000)]  # assign defaults
    if pt:
        model = attempt_load(weights, map_location=device)  # load FP32 model
        stride = int(model.stride.max())  # model stride
        names = model.module.names if hasattr(model, 'module') else model.names  # get class names
        if half:
            model.half()  # to FP16
        if classify:  # second-stage classifier
            modelc = load_classifier(name='resnet50', n=2)  # initialize
            modelc.load_state_dict(torch.load('resnet50.pt', map_location=device)['model']).to(device).eval()
    elif onnx:
        check_requirements(('onnx', 'onnxruntime'))
        import onnxruntime
        session = onnxruntime.InferenceSession(w, None)
    else:  # TensorFlow models
        check_requirements(('tensorflow>=2.4.1',))
        import tensorflow as tf
        if pb:  # https://www.tensorflow.org/guide/migrate#a_graphpb_or_graphpbtxt
            def wrap_frozen_graph(gd, inputs, outputs):
                x = tf.compat.v1.wrap_function(lambda: tf.compat.v1.import_graph_def(gd, name=""), [])  # wrapped import
                return x.prune(tf.nest.map_structure(x.graph.as_graph_element, inputs),
                               tf.nest.map_structure(x.graph.as_graph_element, outputs))

            graph_def = tf.Graph().as_graph_def()
            graph_def.ParseFromString(open(w, 'rb').read())
            frozen_func = wrap_frozen_graph(gd=graph_def, inputs="x:0", outputs="Identity:0")
        elif saved_model:
            model = tf.keras.models.load_model(w)
        elif tflite:
            interpreter = tf.lite.Interpreter(model_path=w)  # load TFLite model
            interpreter.allocate_tensors()  # allocate
            input_details = interpreter.get_input_details()  # inputs
            output_details = interpreter.get_output_details()  # outputs
            int8 = input_details[0]['dtype'] == np.uint8  # is TFLite quantized uint8 model
    imgsz = check_img_size(imgsz, s=stride)  # check image size
    ascii = is_ascii(names)  # names are ascii (use PIL for UTF-8)
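Two small idioms in the loading code are easy to miss: the backend is chosen by unpacking a generator of suffix comparisons, and the image size is adjusted to fit the model stride. The sketch below reproduces the first idiom as written above. `round_up_to_stride` is a hypothetical helper that assumes the adjustment rounds each side up to the next multiple of the stride; the real check_img_size in utils.general may differ in detail.

```python
import math
from pathlib import Path

SUFFIXES = ['.pt', '.onnx', '.tflite', '.pb', '']


def backend_flags(weights):
    """Return one boolean per backend, exactly one of which is True."""
    suffix = Path(weights).suffix.lower()
    if suffix not in SUFFIXES:
        raise ValueError(f'{weights}: unsupported suffix {suffix!r}')
    # Unpack the generator into five booleans, in the same order as SUFFIXES.
    pt, onnx, tflite, pb, saved_model = (suffix == x for x in SUFFIXES)
    return {'pt': pt, 'onnx': onnx, 'tflite': tflite, 'pb': pb, 'saved_model': saved_model}


def round_up_to_stride(size, stride=32):
    """Hypothetical helper: round each side up to a multiple of stride."""
    return [math.ceil(x / stride) * stride for x in size]


flags_pt = backend_flags('yolov5s.pt')
flags_onnx = backend_flags('model.ONNX')
flags_dir = backend_flags('yolov5s_saved_model')  # no suffix, treated as a SavedModel directory
sizes = round_up_to_stride([640, 641], stride=32)
```

A path without any suffix matches the empty string in the list, which is how a TensorFlow SavedModel directory is recognized.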

Loading the inference data

    # Dataloader
    # Choose the data loading method according to the input source
    if webcam:
        # Webcam / stream mode is not the usual case
        view_img = check_imshow()
        cudnn.benchmark = True  # set True to speed up constant image size inference
        dataset = LoadStreams(source, img_size=imgsz, stride=stride, auto=pt)
        bs = len(dataset)  # batch_size
    else:
        # Usually images or videos are read directly from the source directory
        dataset = LoadImages(source, img_size=imgsz, stride=stride, auto=pt)
        bs = 1  # batch_size
    vid_path, vid_writer = [None] * bs, [None] * bs

Warm-up run before inference

    # Run inference
    # Run one forward pass on an all-zero tensor first, to warm up the model and check that it runs correctly
    if pt and device.type != 'cpu':
        model(torch.zeros(1, 3, *imgsz).to(device).type_as(next(model.parameters())))  # run once
    dt, seen = [0.0, 0.0, 0.0], 0
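The warm-up line can be reproduced with any PyTorch module. The sketch below uses a single convolution layer as a stand-in for the YOLOv5 model and runs on the CPU so that it works without a GPU; both choices are assumptions made for the example (detect.py itself only warms up when the device is not the CPU).

```python
import torch
import torch.nn as nn

# Stand-in model: one convolution instead of the real YOLOv5 network.
model = nn.Conv2d(3, 8, kernel_size=3, padding=1).eval()
device = torch.device('cpu')  # assumed device for this example
imgsz = [64, 64]              # small size to keep the example fast

# Build a dummy batch of one 3-channel image and match the dtype of the model weights,
# so the same line works for FP32 and FP16 models.
dummy = torch.zeros(1, 3, *imgsz).to(device).type_as(next(model.parameters()))

with torch.no_grad():
    out = model(dummy)  # one forward pass, result is discarded in detect.py

print(out.shape)
```

`type_as(next(model.parameters()))` is what keeps the dummy input consistent with `model.half()`: if the weights were converted to FP16, the zero tensor is converted too.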
