yolov5 create_dataloader: source code and walkthrough

华农度假村村长

Category: yolov5. Tags: computer vision, yolov5

Article link: https://blog.csdn.net/weixin_50862344/article/details/126796709

Call chain for creating the dataset
create_dataloader(…) -----> LoadImagesAndLabels(…)

create_dataloader

part 1: parameters

def create_dataloader(path,
                      imgsz,
                      batch_size,
                      stride,
                      single_cls=False,
                      hyp=None,
                      augment=False,
                      cache=False,
                      pad=0.0,
                      rect=False,
                      rank=-1,
                      workers=8,
                      image_weights=False,
                      quad=False,
                      prefix='',
                      shuffle=False):

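The function returns two values, the dataloader and the underlying dataset object, which train.py receives as train_loader, dataset.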

(1)rect

parser.add_argument('--rect', action='store_true', help='rectangular training')

The figure below compares square inference with rectangular inference.

[Figure: square inference vs. rectangular inference]

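Rectangular inference speeds up the model's inference: the image is padded only up to the nearest multiple of the stride rather than to a full square, so less redundant padding is processed.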
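To make the saving concrete, here is a minimal sketch that compares the input size the network sees with and without rectangular padding for a single image. padded_shape is a hypothetical helper written for this post, not a yolov5 function, and the defaults img_size=640 and stride=32 are assumptions.

```python
import math

def padded_shape(h, w, img_size=640, stride=32, rect=True):
    """Return the (height, width) fed to the network for one h x w image.

    Hypothetical helper for illustration only, not part of yolov5.
    """
    if not rect:
        return img_size, img_size  # square: always pad to img_size x img_size
    r = img_size / max(h, w)  # scale so that the long side equals img_size
    new_h, new_w = round(h * r), round(w * r)
    # rectangular: pad each side only up to the next multiple of stride
    return (math.ceil(new_h / stride) * stride,
            math.ceil(new_w / stride) * stride)

h, w = 720, 1280  # a 16:9 image
sq = padded_shape(h, w, rect=False)
rc = padded_shape(h, w, rect=True)
print(sq, rc)  # (640, 640) (384, 640)
print(f"pixels saved: {1 - rc[0] * rc[1] / (sq[0] * sq[1]):.0%}")  # 40%
```

For a 16:9 image the rectangular input is 384 x 640 instead of 640 x 640, so roughly 40% fewer pixels go through the network.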

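[Reference blog]: 手把手带你调参Yolo v5 (v6.2)（二） (a hands-on guide to tuning Yolo v5 (v6.2), part 2)

[Reference blog]: yolov5中的Rectangular training和Rectangular inference (Rectangular training and Rectangular inference in yolov5)
The code that yolov5 uses for this is shown below: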

        # Rectangular Training
        if self.rect:
            # Sort by aspect ratio
            s = self.shapes  # wh

            ar = s[:, 1] / s[:, 0]  # aspect ratio: height / width of each input image
            irect = ar.argsort()  # indices that sort ar in ascending order

            # Reorder files, labels and shapes by the sorted indices (irect)
            self.im_files = [self.im_files[i] for i in irect]
            self.label_files = [self.label_files[i] for i in irect]
            self.labels = [self.labels[i] for i in irect]  # self.labels is a list (self.labels = list(labels))
            self.shapes = s[irect]  # wh

            # Reorder ar itself so that it matches the new image order
            ar = ar[irect]

            # Set training image shapes
            shapes = [[1, 1]] * nb
            for i in range(nb):  # nb: number of batches
                ari = ar[bi == i]  # bi: batch index; ari holds the aspect ratios of the images in batch i
                mini, maxi = ari.min(), ari.max()  # smallest and largest aspect ratio in this batch

                # Remember that ar is height / width and that each entry of shapes is [h, w]
                if maxi < 1:
                    shapes[i] = [maxi, 1]  # every image is wider than tall: reduced height, full width
                elif mini > 1:
                    shapes[i] = [1, 1 / mini]  # every image is taller than wide: full height, reduced width

            # Scale to img_size and round up to a multiple of stride: the smallest padding that fits each batch
            self.batch_shapes = np.ceil(np.array(shapes) * img_size / stride + pad).astype(int) * stride
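The excerpt above relies on bi, nb, img_size, stride and pad being defined earlier in the class. As a self-contained sketch of the same steps, the toy function below recomputes the batch shapes for a handful of made-up image sizes. rect_batch_shapes and the sample sizes are hypothetical; img_size=640, stride=32 and pad=0.0 are assumed values.

```python
import numpy as np

def rect_batch_shapes(wh, batch_size, img_size=640, stride=32, pad=0.0):
    """Toy version of the rectangular-training shape logic (not the yolov5 API).

    wh: list of (width, height) pairs. Returns the batch index of each sorted
    image and one [h, w] shape per batch.
    """
    wh = np.asarray(wh, dtype=float)
    ar = wh[:, 1] / wh[:, 0]          # aspect ratio = height / width
    ar = ar[ar.argsort()]             # sort ascending, as in the excerpt
    n = len(ar)
    bi = np.floor(np.arange(n) / batch_size).astype(int)  # batch index per image
    nb = bi[-1] + 1                   # number of batches
    shapes = [[1, 1]] * nb
    for i in range(nb):
        ari = ar[bi == i]
        mini, maxi = ari.min(), ari.max()
        if maxi < 1:                  # all images in the batch are wide
            shapes[i] = [maxi, 1]
        elif mini > 1:                # all images in the batch are tall
            shapes[i] = [1, 1 / mini]
    batch_shapes = np.ceil(np.array(shapes) * img_size / stride + pad).astype(int) * stride
    return bi, batch_shapes

# Made-up image sizes as (width, height)
wh = [(1280, 720), (1920, 1080), (640, 640), (480, 640), (720, 1280), (600, 1200)]
bi, batch_shapes = rect_batch_shapes(wh, batch_size=2)
print(bi)            # [0 0 1 1 2 2]
print(batch_shapes)  # [[384 640] [640 640] [640 384]]
```

The batch of wide images gets a 384 x 640 (h x w) input, the mixed batch falls back to the full 640 x 640 square, and the batch of tall images gets 640 x 384.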

ar = ar[irect] converts the sort indices into the sorted values.
A simple example:

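import numpy as np

x = np.array([1, 4, 3, -1, 6, 9])
print(f"before sorting: {x}")
y = x.argsort()  # indices that would sort x; x itself is not modified
print(f"x after argsort: {x}")
print(f"y (sort indices): {y}")
x = x[y]  # index with y to obtain the sorted values
print(x)

Output:
before sorting: [ 1  4  3 -1  6  9]
x after argsort: [ 1  4  3 -1  6  9]
y (sort indices): [3 0 2 1 4 5]
[-1  1  3  4  6  9]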

How the loop around ari = ar[bi == i] works
Another example, continuing from the code above (x is now the sorted array):

batch_size = 2
n = 6
bi = np.floor(np.arange(n) / batch_size).astype(int)  # batch index of each image
nb = bi[-1] + 1  # number of batches
print(f"bi:{bi}", f"nb:{nb}")
for i in range(nb):
    print(bi == i)  # boolean mask selecting the images in batch i
    ari = x[bi == i]
    mini, maxi = ari.min(), ari.max()
    print(mini, maxi)
Output:
bi:[0 0 1 1 2 2] nb:3
[ True  True False False False False]
-1 1
[False False  True  True False False]
3 4
[False False False False  True  True]
6 9

(2)quad

parser.add_argument('--quad', action='store_true', help='quad dataloader')

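The benefit is better training results when training at an image size larger than the default 640.
The side effect is that results may be slightly worse when training at the default size of 640.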

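(3)image weights: image_weights

During training, when --image-weights is set, a sampling weight is computed for every image; the larger an image's weight, the more likely that image is to be sampled. Images are then visited according to the resampled indices stored in dataset.indices.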
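As a rough sketch of this idea, the snippet below turns per-class weights into per-image weights and then resamples image indices in proportion to them. The labels and class_weights values are made up for illustration, and the weighting rule (class weight times the number of boxes of that class, summed per image) is an assumption modelled on the description above rather than a copy of the yolov5 training loop.

```python
import random
import numpy as np

nc = 3  # number of classes (toy value)

# Hypothetical per-image labels; each row is [class, x, y, w, h]
labels = [
    np.array([[0, .5, .5, .2, .2], [0, .3, .3, .1, .1]]),  # image 0: two boxes of class 0
    np.array([[1, .5, .5, .2, .2]]),                        # image 1: one box of class 1
    np.array([[2, .5, .5, .2, .2], [0, .4, .4, .1, .1]]),  # image 2: class 2 and class 0
]
class_weights = np.array([0.1, 0.3, 0.6])  # assumed: rarer classes weigh more

# Count the boxes of each class in every image -> shape (num_images, nc)
class_counts = np.array([np.bincount(lb[:, 0].astype(int), minlength=nc) for lb in labels])

# Image weight = sum over classes of (class weight * boxes of that class)
image_weights = (class_weights.reshape(1, nc) * class_counts).sum(1)
print(image_weights)  # [0.2 0.3 0.7]

# Resample indices with replacement, proportional to the image weights
random.seed(0)
indices = random.choices(range(len(labels)), weights=image_weights.tolist(), k=len(labels))
print(indices)
```

Image 2 contains the rarest class, so it has the largest weight and is the most likely to appear (possibly more than once) in the resampled indices.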

Class weights

 model.class_weights = labels_to_class_weights(dataset.labels, nc).to(device) * nc  # attach class weights

labels_to_class_weights is defined in utils/general.py
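Before reading the source, it may help to see the inverse-frequency idea on toy data. The sketch below uses made-up labels for three images and four classes and follows the same steps (concatenate, count per class, invert, normalize); it is an illustration in plain NumPy, not the library function itself.

```python
import numpy as np

nc = 4  # number of classes (toy value)

# Hypothetical labels for three images; each row is [class, x, y, w, h]
labels = [
    np.array([[0, .5, .5, .2, .2], [0, .3, .3, .1, .1], [1, .6, .6, .2, .2]]),
    np.array([[0, .5, .5, .2, .2]]),
    np.array([[2, .5, .5, .2, .2]]),
]

all_labels = np.concatenate(labels, 0)        # stack rows: shape (5, 5)
classes = all_labels[:, 0].astype(int)        # class id of every box
counts = np.bincount(classes, minlength=nc)   # boxes per class: [3 1 1 0]
counts[counts == 0] = 1                       # empty classes count as 1 to avoid dividing by zero
weights = 1 / counts                          # inverse frequency
weights /= weights.sum()                      # normalize so the weights sum to 1
print(weights)  # approximately [0.1 0.3 0.3 0.3]
```

Class 0 appears three times and gets the smallest weight. Class 3 never appears, but because its empty bin is replaced by 1 it ends up with the same weight as a class that appears once.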

def labels_to_class_weights(labels, nc=80):
    # Get class weights (inverse frequency) from training labels
    if labels[0] is None:  # no labels loaded
        return torch.Tensor()
    # Stack the per-image label arrays row-wise (axis 0) into one array
    labels = np.concatenate(labels, 0)  # labels.shape = (866643, 5) for COCO

    # Column 0 of each label row is the class id
    classes = labels[:, 0].astype(int)  # labels = [class xywh]
    weights = np.bincount(classes, minlength=nc)  # occurrences per class

    # Prepend gridpoint count (for uCE training)
    # gpi = ((320 / 32 * np.array([1, 2, 4])) ** 2 * 3).sum()  # gridpoints per image
    # weights = np.hstack([gpi * len(labels)  - weights.sum() * 9, weights * 9]) ** 0.5  # prepend gridpoints to start

    weights[weights == 0] = 1  # replace empty bins with 1
    weights = 1 / weights  # inverse of the number of targets (labelled boxes) per class
    weights /= weights.sum()  # normalize