[YOLOv8] A Detailed Walkthrough of the preprocess Code Block

Latest recommended article published on 2024-10-15 18:37:52

生菜模拟器

Tags: YOLO, computer vision, deep learning

Original link: https://zhuanlan.zhihu.com/p/666726337

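Images in the same batch may carry different numbers of annotated objects, so their labels have to be aligned. For example, with batch_size=2, suppose the second image has 5 annotated boxes, giving a label shape of 5 x 6 (the 6 columns are [img_idx, cls_id, cx, cy, width, height]), while the first image has 2 boxes, giving 2 x 6. Alignment uses the largest per-image box count: the first image's 2 x 6 labels are padded to 5 x 6, with the empty rows filled with 0. In the code below, the img_idx column is dropped once the rows are grouped by image, so the aligned tensor has shape batch_size x M x 5.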
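The alignment described above can be sketched in plain Python, without any tensor library. The toy `targets` list and the helper name `pad_targets` are assumptions made for this illustration only; they are not part of YOLOv8.

```python
# Hypothetical toy batch: each row is [img_idx, cls_id, cx, cy, width, height].
targets = [
    [0, 1, 0.50, 0.50, 0.20, 0.40],
    [0, 0, 0.25, 0.25, 0.10, 0.10],
    [1, 2, 0.10, 0.20, 0.05, 0.05],
    [1, 2, 0.30, 0.40, 0.10, 0.20],
    [1, 0, 0.50, 0.50, 0.30, 0.30],
    [1, 1, 0.70, 0.60, 0.20, 0.10],
    [1, 0, 0.90, 0.80, 0.10, 0.10],
]
batch_size = 2


def pad_targets(targets, batch_size):
    """Group rows by image index, drop that index, and zero-pad to the largest count."""
    per_image = [[] for _ in range(batch_size)]
    for row in targets:
        per_image[int(row[0])].append(row[1:])  # keep [cls_id, cx, cy, width, height]
    max_boxes = max((len(rows) for rows in per_image), default=0)
    # Append all-zero rows until every image holds exactly max_boxes rows.
    return [
        rows + [[0.0] * 5 for _ in range(max_boxes - len(rows))]
        for rows in per_image
    ]


padded = pad_targets(targets, batch_size)
print([len(rows) for rows in padded])  # [5, 5]
print(padded[0][2])                    # [0.0, 0.0, 0.0, 0.0, 0.0]
```

Image 0 starts with 2 boxes and image 1 with 5, so image 0 receives three all-zero rows and both images end up with 5 rows of 5 values each.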

def preprocess(self, targets, batch_size, scale_tensor):
    """Preprocesses the target counts and matches with the input batch size to output a tensor."""
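    # Convert box coordinates from the normalized scale to the input-image scale, and align the number of GT boxes per image within the batch (every image gets the same count M, which makes batched tensor operations possible)
    # M (n_max_boxes) is set to the largest per-image gt_num in the batch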
    if targets.shape[0] == 0:
        out = torch.zeros(batch_size, 0, 5, device=self.device)
    else:
        # Get the image id that each bounding box belongs to
        i = targets[:, 0]  # image index
        # Count the boxes per image by tallying identical values in the tensor
        _, counts = i.unique(return_counts=True)
        counts = counts.to(dtype=torch.int32)
        # Create an all-zero tensor of shape b x M x 5, where M is the largest GT count in the batch and 5 = cls, cx, cy, width, height
        out = torch.zeros(batch_size, counts.max(), 5, device=self.device)
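        # Iterate over every image in the batch
        for j in range(batch_size):
            # Boolean mask selecting the target rows that belong to image j
            matches = i == j
            # Number of boxes in the current image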
            n = matches.sum()
            if n:
                out[j, :n] = targets[matches, 1:]
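        # The boxes in out are still normalized (cx, cy, w, h): scale them to the input size with scale_tensor, then convert them to (x1, y1, x2, y2)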
        out[..., 1:5] = xywh2xyxy(out[..., 1:5].mul_(scale_tensor))
    return out
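
To try the method outside the loss class, here is a minimal standalone sketch. It assumes PyTorch is installed, replaces `self.device` with a hypothetical `device` argument, and defines a simplified local `xywh2xyxy` as a stand-in for the helper that the original code imports. The sample `targets` and the 640-pixel `scale_tensor` are made up for illustration.

```python
import torch


def xywh2xyxy(x):
    """Simplified stand-in: convert (cx, cy, w, h) boxes to (x1, y1, x2, y2)."""
    y = torch.empty_like(x)
    half_wh = x[..., 2:] / 2
    y[..., :2] = x[..., :2] - half_wh  # top-left corner
    y[..., 2:] = x[..., :2] + half_wh  # bottom-right corner
    return y


def preprocess(targets, batch_size, scale_tensor, device="cpu"):
    """Standalone version of the method above; `device` replaces self.device."""
    if targets.shape[0] == 0:
        return torch.zeros(batch_size, 0, 5, device=device)
    i = targets[:, 0]  # image index of every box
    _, counts = i.unique(return_counts=True)
    out = torch.zeros(batch_size, int(counts.max()), 5, device=device)
    for j in range(batch_size):
        matches = i == j
        n = int(matches.sum())
        if n:
            out[j, :n] = targets[matches, 1:]
    # Scale normalized xywh to pixels in place, then convert to corner format.
    out[..., 1:5] = xywh2xyxy(out[..., 1:5].mul_(scale_tensor))
    return out


# Rows are [img_idx, cls_id, cx, cy, width, height], normalized to [0, 1].
targets = torch.tensor([
    [0.0, 1.0, 0.50, 0.50, 0.20, 0.40],
    [1.0, 0.0, 0.25, 0.25, 0.50, 0.50],
    [1.0, 2.0, 0.75, 0.50, 0.50, 0.25],
])
scale_tensor = torch.tensor([640.0, 640.0, 640.0, 640.0])

out = preprocess(targets, batch_size=2, scale_tensor=scale_tensor)
print(tuple(out.shape))    # (2, 2, 5)
print(out[0, 0].tolist())  # [1.0, 256.0, 192.0, 384.0, 448.0]
print(out[0, 1].tolist())  # [0.0, 0.0, 0.0, 0.0, 0.0]
```

For the first box, the center (0.5, 0.5) and size (0.2, 0.4) become (320, 320) and (128, 256) in pixels, so the corners are (256, 192) and (384, 448). The padded row stays all zeros because both the scaling and the corner conversion map zeros to zeros.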