YOLOv5 yolo.py: class Model

1 Custom sections

1.1 Reading the yaml/dict config

def __init__(self, cfg='yolov5s.yaml', ch=3, nc=None, anchors=None):  # model, input channels, number of classes
    super().__init__()
    if isinstance(cfg, dict):
        self.yaml = cfg  # model dict
    else:  # is *.yaml
        import yaml  # for torch hub
        self.yaml_file = Path(cfg).name
        with open(cfg, encoding='ascii', errors='ignore') as f:
            self.yaml = yaml.safe_load(f)  # model dict
  • Explanation: if cfg is already a dict, just set self.yaml = cfg; otherwise it is a *.yaml path, so open it and parse it with yaml.safe_load. A short usage sketch follows below.
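A minimal usage sketch of both paths, assuming models/yolo.py from the YOLOv5 repo is importable and a yolov5s.yaml exists on disk (the variable names here are illustrative):

import yaml
from models.yolo import Model

# 1) from a yaml path: __init__ opens the file and runs yaml.safe_load on it
model_a = Model('yolov5s.yaml', ch=3, nc=2)

# 2) from an already-parsed dict: __init__ simply stores it as self.yaml
with open('yolov5s.yaml', encoding='ascii', errors='ignore') as f:
    cfg_dict = yaml.safe_load(f)
model_b = Model(cfg_dict, ch=3, nc=2)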

1.2 Defining the model structure

    # Define model
    ch = self.yaml['ch'] = self.yaml.get('ch', ch)  # input channels
    if nc and nc != self.yaml['nc']:
        LOGGER.info(f"Overriding model.yaml nc={self.yaml['nc']} with nc={nc}")
        self.yaml['nc'] = nc  # override yaml value
    if anchors:
        LOGGER.info(f'Overriding model.yaml anchors with anchors={anchors}')
        self.yaml['anchors'] = round(anchors)  # override yaml value
    self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch])  # model, savelist
    self.names = [str(i) for i in range(self.yaml['nc'])]  # default names
    self.inplace = self.yaml.get('inplace', True)

Parameters:

  • ch: number of input channels, 3 here.
  • nc: number of classes; the nc passed in from train.py takes precedence over the value in the model yaml.

Functions:

  • parse_model(deepcopy(self.yaml), ch=[ch]) parses the cfg and builds the network structure automatically; when editing the cfg, keep each line's indentation identical to the original. A sketch of the layout parse_model expects follows below.
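For reference, a small sketch of what parse_model actually iterates over in the loaded dict; it assumes a yolov5s.yaml is available and only prints the rows rather than rebuilding the model:

import yaml

with open('yolov5s.yaml', encoding='ascii', errors='ignore') as f:
    d = yaml.safe_load(f)

# Each row of backbone/head is [from, number, module, args]:
#   from   -> index of the input layer (-1 = previous layer; a list means several inputs)
#   number -> repetition count for the module (scaled by depth_multiple inside parse_model)
#   module -> module name such as Conv, C3, SPPF, Concat, Detect
#   args   -> constructor arguments (output channels, kernel size, stride, ...)
for i, (f_, n, m, args) in enumerate(d['backbone'] + d['head']):
    print(i, f_, n, m, args)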

1.3 Building strides and anchors

    # Build strides, anchors
    m = self.model[-1]  # Detect()
    if isinstance(m, Detect):
        s = 256  # 2x min stride
        m.inplace = self.inplace
        x = self.forward(torch.zeros(1, ch, s, s))  # dry run on an all-zero image
        m.stride = torch.tensor([s / xi.shape[-2] for xi in x])  # downsampling ratio per detection layer
        check_anchor_order(m)  # must be in pixel-space (not grid-space)
        m.anchors /= m.stride.view(-1, 1, 1)
        self.stride = m.stride
        self._initialize_biases()  # only run once

Parameters:

  • m is the last layer of the whole model, i.e. Detect().
  • s is a made-up input width/height, fed through the network below so the pyramid's scaling ratios can be read off.
  • m.stride holds the feature-map downsampling ratios, i.e. tensor([ 8., 16., 32.]).

Functions

1.3.1 Computing the stride downsampling factors

  • self.forward(torch.zeros(1, ch, s, s)) pushes an all-zero tensor of size (1, 3, 256, 256) through the model; the three outputs have sizes:
    ① torch.Size([1, 3, 32, 32, 7])
    ② torch.Size([1, 3, 16, 16, 7])
    ③ torch.Size([1, 3, 8, 8, 7])
    (the last dimension is 7 because my nc = 2). Each stride is s divided by the corresponding output's spatial size, as in the sketch below.
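A quick sketch that redoes the stride arithmetic from those output shapes (shapes copied from my run with nc = 2):

import torch

s = 256
out_shapes = [(1, 3, 32, 32, 7), (1, 3, 16, 16, 7), (1, 3, 8, 8, 7)]  # shapes of the three Detect outputs
stride = torch.tensor([s / shape[-2] for shape in out_shapes])
print(stride)  # tensor([ 8., 16., 32.])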

1.3.2 Checking the anchor order

  • check_anchor_order checks whether the anchors are listed from small to large, so that their order matches the network's increasing downsampling ratios.
def check_anchor_order(m):
    # Check anchor order against stride order for YOLOv5 Detect() module m, and correct if necessary
    a = m.anchors.prod(-1).mean(-1).view(-1)  # mean anchor area per output layer
    da = a[-1] - a[0]  # delta a
    ds = m.stride[-1] - m.stride[0]  # delta s
    if da and (da.sign() != ds.sign()):  # order differs from the stride order
        LOGGER.info(f'{PREFIX}Reversing anchor order')
        m.anchors[:] = m.anchors.flip(0)

m.anchors =
tensor([[[ 10.,  13.],
         [ 16.,  30.],
         [ 33.,  23.]],

        [[ 30.,  61.],
         [ 62.,  45.],
         [ 59., 119.]],

        [[116.,  90.],
         [156., 198.],
         [373., 326.]]])
  • .prod is a product reduction; .prod(-1) multiplies over the last dimension, giving the anchor areas
    tensor([[   130.,    480.,    759.],
            [  1830.,   2790.,   7021.],
            [ 10440.,  30888., 121598.]])
  • .mean(-1) averages over the last dimension:
    tensor([  456.33334,  3880.33325, 54308.66797])
  • .view(-1) flattens to 1-D (after the mean the tensor is effectively already 1-D), so
    a = tensor([  456.33334,  3880.33325, 54308.66797])
    stride = [8, 16, 32]
    Comparing the signs of da and ds tells us whether the anchors are ordered correctly, i.e. from small to large; if the signs differ, the anchor order is flipped. The arithmetic is reproduced in the sketch below.
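A small sketch that re-runs this arithmetic on the default COCO anchors and strides (plain tensors here, not a real Detect module):

import torch

anchors = torch.tensor([[[ 10.,  13.], [ 16.,  30.], [ 33.,  23.]],
                        [[ 30.,  61.], [ 62.,  45.], [ 59., 119.]],
                        [[116.,  90.], [156., 198.], [373., 326.]]])
stride = torch.tensor([8., 16., 32.])

a = anchors.prod(-1).mean(-1).view(-1)  # mean anchor area per detection layer
da = a[-1] - a[0]                       # delta area
ds = stride[-1] - stride[0]             # delta stride
print(a)                                # ~ tensor([456.33, 3880.33, 54308.67])
print(da.sign() == ds.sign())           # tensor(True) -> order already matches, nothing to flip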

1.3.3 Initializing the Detect biases

    def _initialize_biases(self, cf=None):  # initialize biases into Detect(), cf is class frequency
        # https://arxiv.org/abs/1708.02002 section 3.3
        # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1.
        m = self.model[-1]  # Detect() module
        for mi, s in zip(m.m, m.stride):  # from
            b = mi.bias.view(m.na, -1).detach()  # conv.bias(255) to (3,85)
            b[:, 4] += math.log(8 / (640 / s) ** 2)  # obj (8 objects per 640 image)
            b[:, 5:] += math.log(0.6 / (m.nc - 0.999999)) if cf is None else torch.log(cf / cf.sum())  # cls
            mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)

This initializes the biases of the Detect head; Detect has three output layers, so the loop runs three times.

  • In my case each output conv has na*(4+2+1) = 3*7 = 21 bias values, e.g. mi.bias = tensor([ 0.05255, -0.00674, 0.06468, -0.02544, -0.06994, -0.03853, -0.03931, 0.02318, 0.01586, 0.07249, 0.07560, -0.08699, -0.02540, 0.03635, 0.06441, -0.05532, 0.07548, -0.01346, 0.08670, 0.02409, 0.02118], requires_grad=True).
  • So b reshapes that 21-element vector into a 3x7 matrix (one row per anchor). Columns 0-3 (x, y, w, h) are left untouched; column 4 is the objectness bias and is shifted by
    b[:, 4] += math.log(8 / (640 / s) ** 2), which assumes roughly 8 objects per 640x640 image (s is the stride).
  • b[:, 5:] += math.log(0.6 / (m.nc - 0.999999)) if cf is None else torch.log(cf / cf.sum()) shifts the per-class biases (using the class frequencies cf when they are given). The resulting offsets are evaluated in the sketch below.
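A short sketch evaluating those offsets for my setting (nc = 2, strides 8/16/32, 640 reference image size); the numbers are just math.log evaluations, not values from a trained model:

import math

nc = 2
for s in (8, 16, 32):
    obj_bias = math.log(8 / (640 / s) ** 2)     # expect ~8 objects per 640x640 image
    cls_bias = math.log(0.6 / (nc - 0.999999))  # prior class probability ~0.6/nc
    print(f'stride {s}: obj {obj_bias:.3f}, cls {cls_bias:.3f}')
# stride 8: obj -6.685, cls -0.511
# stride 16: obj -5.298, cls -0.511
# stride 32: obj -3.912, cls -0.511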

1.4 Forward inference

  • _forward_augment
  • _forward_once

1.4.1 _forward_once

def _forward_once(self, x, profile=False, visualize=False):
    y, dt = [], []  # outputs
    for m in self.model:
        if m.f != -1:  # if not from previous layer
            x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f]  # from earlier layers
        if profile:
            self._profile_one_layer(m, x, dt)
        x = m(x)  # run: with the inputs resolved, compute this layer's output
        y.append(x if m.i in self.save else None)  # save the output only if a later layer needs it
        if visualize:
            feature_visualization(x, m.type, m.i, save_dir=visualize)
    return x

Walkthrough:
For each layer of the model:

  • If the layer's input is not the previous layer's output, use its f attribute to look up the right inputs (outputs that later layers need were stored in y for exactly this purpose); e.g. for f = [-1, 20, 23], -1 takes the current x while 20 and 23 take y[20] and y[23].

Parameters:

  • y stores the outputs that must be kept, e.g. those that will be concatenated later or fed into the final Detect.

  • self.save records the indices of the layers whose outputs must be kept, e.g. [4, 6, 10, 14, 17, 20, 23].

Functions:

  • x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f]
    If m.f is an int, index y directly; if it is something else, such as a list, iterate over it and gather each input.

How the savelist is built (in parse_model):

save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1)

If an entry of f is -1 nothing needs to be recorded; otherwise f % i is recorded, where f is the index of the required source layer and i is the index of the current layer. Sources normally sit before the current layer, so x % i == x for positive indices; the modulo also maps a negative (relative) index onto an absolute one, which is presumably why it is written this way. A toy run is shown below.
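A toy run of that savelist line, using the Detect row of yolov5s.yaml (layer index 24 with from = [17, 20, 23]):

i = 24                # current layer index (Detect in yolov5s)
f = [17, 20, 23]      # its 'from' field: take the outputs of layers 17, 20 and 23

save = []
save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1)
print(save)           # [17, 20, 23] -> these layers' outputs are kept in y

print(-2 % 10)        # 8: the modulo turns a negative (relative) index into an absolute one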
——————————————————————————————

Forward outputs inside _forward_once

  • Detect layer:
    [1, 3, 32, 32, 7]
    [1, 3, 16, 16, 7]
    [1, 3, 8, 8, 7]

The list holds three tensors, one per detection layer; within each tensor, 1 is the batch size, 3 is the number of anchors per layer, 32, 32 (etc.) are the feature-map h, w, and 7 = 4 (box) + 1 (objectness) + 2 (classes). A splitting sketch follows below.
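A minimal sketch of how that last dimension splits, with a zero tensor standing in for one training-mode Detect output:

import torch

p = torch.zeros(1, 3, 32, 32, 7)  # one of the three Detect outputs (nc = 2)
box = p[..., 0:4]                 # x, y, w, h (raw, still to be decoded against grid/anchors)
obj = p[..., 4]                   # objectness
cls = p[..., 5:]                  # per-class scores (2 classes here)
print(box.shape, obj.shape, cls.shape)
# torch.Size([1, 3, 32, 32, 4]) torch.Size([1, 3, 32, 32]) torch.Size([1, 3, 32, 32, 2])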

  • My own added output layer produces [1, 4, 16, 16], which still needs to be reshaped.
