YOLOV8的Detect head 逐行解读

Qiming_v

已于 2024-07-06 13:49:03 修改

阅读量2.6k

点赞数 53

文章标签：深度学习 pytorch

于 2024-07-04 17:39:17 首次发布

本文链接：https://blog.csdn.net/zhilaizhiwang/article/details/140181955

版权

YOLOV8从不同的特征层，得到不同大小的特征图，然后预测每个特征图的每个格子anchor的类别概率，以及每个格子中物体的边框，即相对于中心点上下左右的偏移量box。

shape为[(1, 144, 80, 80),(1, 144, 40, 40),(1,144,20,20)]。

输入x为从不同的上采样层得到的结果

x_cat = torch.cat([xi.view(shape[0], self.no, -1) for xi in x], 2)
box, cls = x_cat.split((self.reg_max * 4, self.nc), 1)
 #(1,64,8400),(1,80,8400)

整合这些结果，得到的shape为 (1,144,8400)。其中：
8400 = 80 * 80+40 * 40+20 * 20，总的预测数
144 为80个class和4*16个box
4 为预测的四个边框距离中心点的距离，是Anchor-Free的预测目标,格式为[left,top,right,bottom]。
self.reg_max = 16，是中心点的最大预测范围，即边框距离中心点的最远距离为16，但并不是16个像素，因为预测值都进行了不同stride的缩放。这个参数也决定了检测物体最大边框为 reg_max * stride*2。

self.anchors, self.strides = (x.transpose(0, 1) for x in make_anchors(x, self.stride, 0.5))
#(2,8400),(1,8400)
self.shape = shape #(1, 144, 80, 80)

def make_anchors(feats, strides, grid_cell_offset=0.5):
    """Generate anchors from features."""
    anchor_points, stride_tensor = [], []
    assert feats is not None
    dtype, device = feats[0].dtype, feats[0].device
    for i, stride in enumerate(strides):
        _, _, h, w = feats[i].shape
        sx = torch.arange(end=w, device=device, dtype=dtype) + grid_cell_offset  # shift x
        sy = torch.arange(end=h, device=device, dtype=dtype) + grid_cell_offset  # shift y
        sy, sx = torch.meshgrid(sy, sx, indexing="ij") if TORCH_1_10 else torch.meshgrid(sy, sx)
        anchor_points.append(torch.stack((sx, sy), -1).view(-1, 2))
        stride_tensor.append(torch.full((h * w, 1), stride, dtype=dtype, device=device))
    return torch.cat(anchor_points), torch.cat(stride_tensor)


self.anchors[:,:10]
tensor([[0.5000, 1.5000, 2.5

最低0.47元/天解锁文章