Object Detection: How to Change the Line Thickness of Bbox Detection Boxes in YOLOv5/YOLOv7

Article link: https://blog.csdn.net/m0_53578855/article/details/124056604

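This article covers two problems that come up in YOLO object detection: prediction boxes drawn too thin, and boxes that occlude the target. It shows how to change the box thickness by adjusting the `line_thickness` parameter in `detect.py`. It also shares a number of innovation and improvement projects for the YOLO family of algorithms, covering backbone networks, lightweight structures, attention mechanisms, detection heads, loss functions, and more, all aimed at improving detection accuracy and performance.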

Common Bounding Box Coordinate Formats in YOLOv5 Object Detection

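A bounding box is a rectangle that marks an object in an image. Box annotations come in several formats, and each format has its own coordinate representation. Common ones are Pascal VOC, COCO, and YOLO.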

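pascal_voc
The box is encoded as [x_min, y_min, x_max, y_max].
x_min and y_min are the coordinates of the top-left corner of the box; x_max and y_max are the coordinates of the bottom-right corner.
Example: [98, 345, 420, 462]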

coco
The box is encoded as [x_min, y_min, width, height],
that is, the coordinates of the top-left corner followed by the width and height of the box.
Example: [98, 345, 322, 117]

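yolo
The box is encoded as [x_center, y_center, width, height], and all four values are normalized.
x_center and y_center give the center of the box; width and height give its width and height.
Example: take the same box as above (width 322, height 117) in an image 640 pixels wide and 480 pixels high.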

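Before normalization, the box is

[98 + (322 / 2), 345 + (117 / 2), 322, 117] = [259, 403.5, 322, 117]

Normalization divides the x values by the image width (640) and the y values by the image height (480):

[259 / 640, 403.5 / 480, 322 / 640, 117 / 480]

The final result is

[0.4046875, 0.840625, 0.503125, 0.24375]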
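The conversions between the three formats above can be sketched in a few lines of Python. The function names `voc_to_coco`, `coco_to_yolo`, and `yolo_to_voc` are hypothetical helpers introduced here for illustration, not part of YOLOv5; the 640×480 image size is the one used in the worked example.

```python
def voc_to_coco(box):
    """[x_min, y_min, x_max, y_max] -> [x_min, y_min, width, height]."""
    x_min, y_min, x_max, y_max = box
    return [x_min, y_min, x_max - x_min, y_max - y_min]


def coco_to_yolo(box, img_w, img_h):
    """[x_min, y_min, width, height] -> normalized [x_center, y_center, width, height]."""
    x_min, y_min, w, h = box
    x_center = x_min + w / 2
    y_center = y_min + h / 2
    # x values are divided by the image width, y values by the image height
    return [x_center / img_w, y_center / img_h, w / img_w, h / img_h]


def yolo_to_voc(box, img_w, img_h):
    """Normalized [x_center, y_center, width, height] -> pixel [x_min, y_min, x_max, y_max]."""
    x_center, y_center, w, h = box
    x_min = (x_center - w / 2) * img_w
    y_min = (y_center - h / 2) * img_h
    x_max = (x_center + w / 2) * img_w
    y_max = (y_center + h / 2) * img_h
    return [x_min, y_min, x_max, y_max]


# Worked example from the text: a Pascal VOC box in a 640x480 image
coco_box = voc_to_coco([98, 345, 420, 462])
yolo_box = coco_to_yolo(coco_box, img_w=640, img_h=480)
print(coco_box)  # [98, 345, 322, 117]
print(yolo_box)  # should match the normalized result above, up to float rounding
```

Converting the YOLO box back with `yolo_to_voc(yolo_box, 640, 480)` should recover the original Pascal VOC coordinates, again up to floating-point rounding.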

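Problem 1:

Image resolution varies from dataset to dataset. For example, with a custom dataset of 1920×1080 images, the bbox lines drawn during prediction are so thin that the predicted values can hardly be read.

Problem 2:

For some readers the detection boxes are too thick and occlude the detected targets. When the targets are small and densely packed in particular, they are easily covered by the boxes, which makes it hard to judge the overall detection result.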

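Solution:

In detect.py, find the section below and change the `line_thickness` argument of the `plot_one_box` call at the end: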

        # Process detections
        for i, det in enumerate(pred):  # detections per image
            if webcam:  # batch_size >= 1
                p, s, im0, frame = path[i], '%g: ' % i, im0s[i].copy(), dataset.count
            else:
                p, s, im0, frame = path, '', im0s, getattr(dataset, 'frame', 0)

            p = Path(p)  # to Path
            save_path = str(save_dir / p.name)  # img.jpg
            txt_path = str(save_dir / 'labels' / p.stem) + ('' if dataset.mode == 'image' else f'_{frame}')  # img.txt
            s += '%gx%g ' % img.shape[2:]  # print string
            gn = torch.tensor(im0.shape)[[1, 0, 1, 0]]  # normalization gain whwh
            if len(det):
                # Rescale boxes from img_size to im0 size
                det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round()

                # Print results
                for c in det[:, -1].unique():
                    n = (det[:, -1] == c).sum()  # detections per class
                    s += f"{n} {names[int(c)]}{'s' * (n > 1)}, "  # add to string

                # Write results
                for *xyxy, conf, cls in reversed(det):
                    if save_txt:  # Write to file
                        xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
                        line = (cls, *xywh, conf) if opt.save_conf else (cls, *xywh)  # label format
                        with open(txt_path + '.txt', 'a') as f:
                            f.write(('%g ' * len(line)).rstrip() % line + '\n')

                    if save_img or view_img:  # Add bbox to image
                        label = f'{names[int(cls)]} {conf:.3f}'
                        
                        # Change line_thickness here to set the box line width (in pixels)
                        plot_one_box(xyxy, im0, label=label,
                                     color=colors[int(cls)], line_thickness=6)
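A fixed value such as 6 may suit one resolution but not another, which is what lies behind both problems above. As a rough sketch, the thickness could instead be derived from the image size. The helper `auto_line_thickness` and its `scale` and `minimum` parameters are hypothetical names introduced here. The 0.002 factor is an assumption, modeled on the resolution-dependent default that some YOLOv5 versions of `plot_one_box` appear to use when no `line_thickness` is given, so check your own `utils/plots.py` before relying on it.

```python
def auto_line_thickness(height, width, scale=0.002, minimum=1):
    """Return a line thickness in pixels that grows with image size.

    Hypothetical helper: `scale` and `minimum` are illustrative
    assumptions, not values taken from the article.
    """
    thickness = round(scale * (height + width) / 2) + 1
    return max(thickness, minimum)


# A 1920x1080 frame gets a thicker line than a 640x480 one
print(auto_line_thickness(1080, 1920))
print(auto_line_thickness(480, 640))
```

The result could then be passed in the call above as `line_thickness=auto_line_thickness(*im0.shape[:2])`, since `im0` is an image array whose first two dimensions are height and width. Lower `scale` if small, dense targets get covered; raise it if the boxes are hard to see. Some later YOLOv5 versions of detect.py also expose this value as a `--line-thickness` command-line option; whether yours does depends on the version.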