YOLOv10 Improvement in Practice | Adding NWDLoss to Improve Small-Object Detection

小猫发财

Last modified 2024-05-31 02:38:13

First published 2024-05-30 20:24:41

Article link: https://blog.csdn.net/Dora_blank/article/details/139333155

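This article describes how to integrate NWDLoss into YOLOv10 to improve small-object detection. It modifies the configuration file, adds a wasserstein_loss function, adjusts the BboxLoss and v8DetectionLoss classes, and provides a training script, so that the Wasserstein loss is combined with the IoU loss. These changes are intended to improve the model's ability to recognize small objects.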

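Preface

This post walks through adding NWDLoss to a YOLOv10 project: modifying the configuration file, adding the new loss function, adjusting the existing loss-computation modules, and adding training code that uses the new loss. Working through these steps should also make the overall structure of the YOLOv10 project more familiar.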

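Overview

1. Added two parameters, nwdloss and iou_ratio, to configure whether the Wasserstein loss is used and how much weight the IoU loss receives
2. Added the wasserstein_loss function in ultralytics/utils/loss.py
3. Modified the __init__ and forward methods of the BboxLoss class to compute the Wasserstein loss
4. Modified the __init__ method of the v8DetectionLoss class to read the nwdloss and iou_ratio parameters
5. Wrote a main script for training and validation that accepts command-line arguments, including the NWDLoss options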

Prerequisites

To set up the YOLOv10 environment, see the earlier post
Link: https://blog.csdn.net/Dora_blank/article/details/139302363?spm=1001.2014.3001.5502
Paper
Link: https://arxiv.org/abs/2110.13389

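I. Modifications

1. Modify the configuration file

We need to add two new parameters, nwdloss and iou_ratio, to the configuration file ultralytics/cfg/default.yaml. They control whether NWDLoss is used and how the IoU loss and NWDLoss are weighted against each other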

nwdloss: False
iou_ratio: 0.5

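Parameter details:
1. nwdloss: whether to enable NWDLoss. The default is False, meaning NWDLoss is not used
2. iou_ratio: the weight given to the IoU loss when NWDLoss is enabled; NWDLoss receives the remaining 1 - iou_ratio. The default of 0.5 weights the two losses equally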
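
To make the weighting concrete, here is a minimal sketch of how iou_ratio splits the box loss between the two terms. The helper name blend_box_loss and the sample loss values are hypothetical and serve only as illustration; the formula mirrors the blending line used later in BboxLoss.forward.

```python
def blend_box_loss(iou_loss: float, nwd_loss: float, iou_ratio: float = 0.5) -> float:
    """Weighted sum: iou_ratio for the IoU term, the remainder for the NWD term."""
    if not 0.0 <= iou_ratio <= 1.0:
        raise ValueError("iou_ratio must lie in [0, 1]")
    return iou_ratio * iou_loss + (1.0 - iou_ratio) * nwd_loss


# Hypothetical per-batch loss values, chosen only for illustration.
iou_loss, nwd_loss = 0.8, 0.4
for ratio in (1.0, 0.5, 0.0):
    # ratio=1.0 keeps only the IoU term, ratio=0.0 keeps only the NWD term.
    print(ratio, blend_box_loss(iou_loss, nwd_loss, ratio))
```

With iou_ratio set to 1.0 the result equals the plain IoU loss even when nwdloss is enabled, so values strictly below 1.0 are needed for NWD to have any effect.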

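2. Add the wasserstein_loss function

Add a function named wasserstein_loss to ultralytics/utils/loss.py. It splits the coordinates of the predicted and target boxes, computes each box's width, height and center, computes the center distance and the width-height distance, and returns the normalized Wasserstein similarity exp(-sqrt(W2) / constant). The caller turns this into a loss by taking 1 minus the returned value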

import torch
import torch.nn as nn

def wasserstein_loss(pred, target, eps=1e-7, constant=12.8):
    r"""Normalized Gaussian Wasserstein distance (NWD) between box pairs, following
    `A Normalized Gaussian Wasserstein Distance for Tiny Object Detection
    <https://arxiv.org/abs/2110.13389>`_.
    Args:
        pred (Tensor): Predicted bboxes of format (x_min, y_min, x_max, y_max),
            shape (n, 4).
        target (Tensor): Corresponding gt bboxes, shape (n, 4).
        eps (float): Small value added to the squared distance so the square
            root stays differentiable when the boxes coincide.
        constant (float): Scale factor C that normalizes the distance.
    Return:
        Tensor: NWD similarity in (0, 1), shape (n, 1). Higher means a closer
            match, so the caller uses 1 - NWD as the loss.
    """

    # Split the coordinates
    b1_x1, b1_y1, b1_x2, b1_y2 = pred.split(1, dim=-1)
    b2_x1, b2_y1, b2_x2, b2_y2 = target.split(1, dim=-1)

    # Box widths and heights (only their differences are used below)
    w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1
    w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1

    # Box centers
    b1_x_center, b1_y_center = (b1_x1 + b1_x2) / 2, (b1_y1 + b1_y2) / 2
    b2_x_center, b2_y_center = (b2_x1 + b2_x2) / 2, (b2_y1 + b2_y2) / 2

    # Squared center distance and squared width-height distance
    center_distance = (b1_x_center - b2_x_center).pow(2) + (b1_y_center - b2_y_center).pow(2) + eps
    wh_distance = ((w1 - w2).pow(2) + (h1 - h2).pow(2)) / 4

    # Squared 2-Wasserstein distance between the two Gaussians
    wasserstein_2 = center_distance + wh_distance

    # Map the distance to a similarity in (0, 1)
    return torch.exp(-torch.sqrt(wasserstein_2) / constant)

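Parameter details:
1. eps: a tiny value added to the squared distance so that the square root and its gradient stay finite when the two boxes coincide
2. constant: the scale factor C that normalizes the Wasserstein distance before the exponential; the NWD paper treats it as a dataset-dependent constant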
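
To see why this measure suits small objects, the sketch below compares plain IoU with NWD for a 4x4 box shifted horizontally by a few units. It is a torch-free illustration written for this note, not part of the project code; the helper names nwd and iou and the sample boxes are assumptions, and eps is omitted for readability.

```python
import math


def nwd(box_a, box_b, constant=12.8):
    """Normalized Wasserstein distance for two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Squared distance between box centers
    center = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared width/height difference, divided by 4 as in wasserstein_loss
    size = (((ax2 - ax1) - (bx2 - bx1)) ** 2 + ((ay2 - ay1) - (by2 - by1)) ** 2) / 4
    return math.exp(-math.sqrt(center + size) / constant)


def iou(box_a, box_b):
    """Plain intersection over union for two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


target = (0.0, 0.0, 4.0, 4.0)  # a tiny 4x4 ground-truth box
for shift in (0.0, 2.0, 4.0, 6.0):
    pred = (shift, 0.0, 4.0 + shift, 4.0)
    print(f"shift={shift}: IoU={iou(pred, target):.3f}  NWD={nwd(pred, target):.3f}")
```

Once the shift reaches the box width, IoU is zero and stays zero for larger shifts, whereas NWD keeps decreasing smoothly with the distance, so it still distinguishes a near miss from a far one.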

3. Modify the BboxLoss class

NWDLoss is integrated into the BboxLoss class in ultralytics/utils/loss.py by replacing its __init__ and forward methods with the code below

class BboxLoss(nn.Module):
    """Criterion class for computing training losses during training."""

    def __init__(self, reg_max, use_dfl=False, nwd_loss=False, iou_ratio=0.5):
        """Initialize the BboxLoss module with regularization maximum and DFL settings."""
        super().__init__()
        self.reg_max = reg_max
        self.use_dfl = use_dfl
        self.iou_ratio = iou_ratio
        self.nwd_loss = nwd_loss

    def forward(self, pred_dist, pred_bboxes, anchor_points, target_bboxes, target_scores, target_scores_sum, fg_mask):
        """IoU loss, optionally blended with the NWD loss, plus the DFL loss."""
        weight = target_scores.sum(-1)[fg_mask].unsqueeze(-1)
        iou = bbox_iou(pred_bboxes[fg_mask], target_bboxes[fg_mask], xywh=False, CIoU=True)
        loss_iou = ((1.0 - iou) * weight).sum() / target_scores_sum

        if self.nwd_loss:
            # Blend the two terms: iou_ratio weights the CIoU loss, 1 - iou_ratio the NWD loss
            nwd = wasserstein_loss(pred_bboxes[fg_mask], target_bboxes[fg_mask])
            nwd_loss = ((1.0 - nwd) * weight).sum() / target_scores_sum
            loss_iou = self.iou_ratio * loss_iou + (1 - self.iou_ratio) * nwd_loss

        # DFL loss
        if self.use_dfl:
            target_ltrb = bbox2dist(anchor_points, target_bboxes, self.reg_max)
            loss_dfl = self._df_loss(pred_dist[fg_mask].view(-1, self.reg_max + 1), target_ltrb[fg_mask]) * weight
            loss_dfl = loss_dfl.sum() / target_scores_sum
        else:
            loss_dfl = torch.tensor(0.0).to(pred_dist.device)

        return loss_iou, loss_dfl

Parameter details:
1. nwd_loss: whether to use NWDLoss
2. iou_ratio: the weight of the IoU loss in the blended box loss; NWDLoss receives 1 - iou_ratio
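
Written out, the box-regression term that forward computes when nwd_loss is enabled takes the form below. The symbols are shorthand introduced for this note rather than notation from the code: r stands for iou_ratio, w_i for the per-anchor weight (the summed target scores), S for target_scores_sum, C for constant, and the sums run over the foreground anchors selected by fg_mask.

```latex
\mathcal{L}_{\text{box}}
  = r \cdot \frac{\sum_{i} w_i \left(1 - \mathrm{CIoU}_i\right)}{S}
  + (1 - r) \cdot \frac{\sum_{i} w_i \left(1 - \mathrm{NWD}_i\right)}{S},
\qquad
\mathrm{NWD}_i = \exp\!\left(-\frac{\sqrt{W_2^2(p_i, g_i)}}{C}\right)
```

Here W_2^2(p_i, g_i) is the squared center distance plus one quarter of the squared width and height differences between the predicted box p_i and the target box g_i, as computed in wasserstein_loss.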

4. Modify the __init__ method of the v8DetectionLoss class

The NWDLoss parameters also need to reach the v8DetectionLoss class in ultralytics/utils/loss.py. Replace its __init__ method with the following code

class v8DetectionLoss:
    """Criterion class for computing training losses."""

    def __init__(self, model, tal_topk=10):  # model must be de-paralleled
        """Initializes v8DetectionLoss with the model, defining model-related properties and BCE loss function."""
        device = next(model.parameters()).device  # get model device
        h = model.args  # hyperparameters

        m = model.model[-1]  # Detect() module
        self.bce = nn.BCEWithLogitsLoss(reduction="none")
        self.hyp = h
        self.stride = m.stride  # model strides
        self.nc = m.nc  # number of classes
        self.no = m.no
        self.reg_max = m.reg_max
        self.device = device

        self.use_dfl = m.reg_max > 1

        # NWDLoss settings, taken from the hyperparameters (defaults live in default.yaml)
        self.nwdloss = self.hyp.nwdloss
        self.iou_ratio = self.hyp.iou_ratio

        self.assigner = TaskAlignedAssigner(topk=tal_topk, num_classes=self.nc, alpha=0.5, beta=6.0)
        self.bbox_loss = BboxLoss(m.reg_max - 1, use_dfl=self.use_dfl, nwd_loss=self.nwdloss,
                                  iou_ratio=self.iou_ratio).to(device)

        self.proj = torch.arange(m.reg_max, dtype=torch.float, device=device)

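Parameter details:
1. nwdloss: read from the hyperparameters, with its default set in default.yaml; whether to use NWDLoss
2. iou_ratio: read from the hyperparameters, with its default set in default.yaml; the weight of the IoU loss, with NWDLoss receiving 1 - iou_ratio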
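
If the hyperparameter object ever comes from a configuration saved before the two keys were added, direct attribute access such as self.hyp.nwdloss could fail. Whether that can happen depends on how the arguments are loaded, which this post does not cover. The sketch below only illustrates a defensive read with getattr and the defaults from step 1; types.SimpleNamespace stands in for model.args, and read_nwd_settings is a hypothetical helper name.

```python
from types import SimpleNamespace


def read_nwd_settings(hyp):
    """Return (nwdloss, iou_ratio), falling back to the defaults from default.yaml."""
    use_nwd = bool(getattr(hyp, "nwdloss", False))
    iou_ratio = float(getattr(hyp, "iou_ratio", 0.5))
    if not 0.0 <= iou_ratio <= 1.0:
        raise ValueError(f"iou_ratio must lie in [0, 1], got {iou_ratio}")
    return use_nwd, iou_ratio


# Stand-ins for model.args: one with the new keys, one without them.
with_keys = SimpleNamespace(nwdloss=True, iou_ratio=0.3)
without_keys = SimpleNamespace()

print(read_nwd_settings(with_keys))     # (True, 0.3)
print(read_nwd_settings(without_keys))  # (False, 0.5)
```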

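II. Training code

The complete training script follows. The usenwd argument controls whether NWDLoss is used, and iou_ratio sets the weight of the IoU loss relative to NWDLoss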

# -*- coding:utf-8 -*-
import argparse

from ultralytics import YOLOv10


def str2bool(value):
    """Parse True/False from the command line; with type=str, the string 'False' would be truthy."""
    if isinstance(value, bool):
        return value
    if value.lower() in ('true', '1', 'yes'):
        return True
    if value.lower() in ('false', '0', 'no'):
        return False
    raise argparse.ArgumentTypeError(f'Expected True or False, got {value!r}')


def parse_args():
    """Parse command-line arguments."""
    parser = argparse.ArgumentParser(description='Train or validate YOLO model.')
    # 'train' trains the model; any other value runs validation to report accuracy metrics
    parser.add_argument('--mode', type=str, default='train', help='Mode of operation: train or val.')
    # Pretrained weights
    parser.add_argument('--weights', type=str, default='yolov10n.pt', help='Path to model file.')
    # Whether to use NWDLoss
    parser.add_argument('--usenwd', type=str2bool, default=False,
                        help='Whether to use NWDLoss or not (True/False)')
    # Weight of the IoU loss when NWDLoss is enabled
    parser.add_argument('--iou_ratio', type=float, default=0.5,
                        help='Weight of the IoU loss; NWDLoss gets 1 - iou_ratio')

    # Dataset configuration file
    parser.add_argument('--data', type=str, default='data-test.yaml', help='Path to data file.')
    parser.add_argument('--epoch', type=int, default=50, help='Number of epochs.')
    parser.add_argument('--batch', type=int, default=8, help='Batch size.')
    parser.add_argument('--workers', type=int, default=0, help='Number of workers.')
    parser.add_argument('--device', type=str, default='0', help='Device to use.')
    parser.add_argument('--name', type=str, default='', help='Name of the run.')
    return parser.parse_args()


def train(model, data, epoch, batch, workers, device, name, usenwd, iou_ratio):
    model.train(data=data, epochs=epoch, batch=batch, workers=workers, device=device, name=name,
                nwdloss=usenwd, iou_ratio=iou_ratio, val_period=1)


def validate(model, data, batch, workers, device, name):
    model.val(data=data, batch=batch, workers=workers, device=device, name=name)


def main():
    args = parse_args()
    model = YOLOv10(args.weights)
    if args.mode == 'train':
        train(model, args.data, args.epoch, args.batch, args.workers, args.device, args.name, args.usenwd,
              args.iou_ratio)
    else:
        validate(model, args.data, args.batch, args.workers, args.device, args.name)


if __name__ == '__main__':
    main()