YOLOv5: Modifications for Rectangular Training

EmoC001

Last modified: 2024-02-29 21:50:24


Category: 鼠鼠的AI笔记. Tags: YOLO, deep learning, PyTorch

First published: 2023-04-25 22:49:28

Article link: https://blog.csdn.net/u013302570/article/details/130375818


With thanks to the original author; the following is adapted from: http://t.csdn.cn/37m2w

train.py

Set the default image size to (480, 640) for both train and val:

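# Default is [train [h, w], val [h, w]]; type=int only converts command-line strings, so this default is used as-is.
parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=[[480, 640], [480, 640]], help='train, val image size (pixels)')
# With action='store_true' and default=True, rect is always enabled.
parser.add_argument('--rect', action='store_true', default=True, help='rectangular training')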
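As an alternative to a nested-list default, the size could also be accepted on the command line with `nargs='+'`. The sketch below is not from the original post: the helper name `parse_imgsz` and the convention of 1, 2, or 4 integers are assumptions made purely for illustration.

```python
import argparse


def parse_imgsz(argv):
    """Hypothetical helper: parse --imgsz into ([h, w] for train, [h, w] for val)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--imgsz', '--img', '--img-size', nargs='+', type=int,
                        default=[480, 640, 480, 640],
                        help='1 value: square; 2 values: h w; 4 values: train h w, val h w')
    opt = parser.parse_args(argv)
    v = opt.imgsz
    if len(v) == 1:      # one square size shared by train and val
        v = [v[0]] * 4
    elif len(v) == 2:    # one (h, w) pair shared by train and val
        v = v * 2
    elif len(v) != 4:
        parser.error('--imgsz expects 1, 2 or 4 integers')
    return v[:2], v[2:]
```

For example, `parse_imgsz(['--imgsz', '480', '640'])` yields the same (480, 640) pair for both train and val.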

Split the size into separate train and val sizes:

# imgsz = check_img_size(opt.imgsz, gs, floor=gs * 2)  # verify imgsz is gs-multiple
if isinstance(opt.imgsz, int):  # a single int: use it for both train and val
    imgsz_train = check_img_size(opt.imgsz, gs, floor=gs * 2)  # verify imgsz is gs-multiple
    imgsz_val = imgsz_train
else:  # [train_size, val_size], each an [h, w] pair
    imgsz_train = check_img_size(opt.imgsz[0], gs, floor=gs * 2)  # verify imgsz is gs-multiple
    imgsz_val = check_img_size(opt.imgsz[1], gs, floor=gs * 2)  # verify imgsz is gs-multiple
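To see what this size check does to a non-square size, here is a minimal stand-alone sketch. `round_size` is a hypothetical stand-in written for this note, not YOLOv5's `check_img_size` itself; it assumes each side is rounded up to a multiple of the stride `gs` and floored at `gs * 2`.

```python
import math


def round_size(imgsz, gs=32, floor=None):
    """Hypothetical stand-in: round each side up to a multiple of gs."""
    floor = gs * 2 if floor is None else floor

    def fix(x):
        return max(math.ceil(x / gs) * gs, floor)

    if isinstance(imgsz, int):
        return fix(imgsz)
    return [fix(x) for x in imgsz]


def split_train_val(imgsz, gs=32):
    """Return (imgsz_train, imgsz_val) from an int or a [train, val] pair."""
    if isinstance(imgsz, int):
        size = round_size(imgsz, gs)
        return size, size
    return round_size(imgsz[0], gs), round_size(imgsz[1], gs)
```

With the default `[[480, 640], [480, 640]]` both sizes pass through unchanged, because 480 and 640 are already multiples of 32.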

create_dataloader
Both train and val call create_dataloader, and the arguments passed to it need to change:

train_loader, dataset = create_dataloader(train_path,
                                              imgsz_train,
                                              batch_size // WORLD_SIZE,
                                              gs,
                                              single_cls,
                                              hyp=hyp,
                                              augment=True,
                                              cache=None if opt.cache == 'val' else opt.cache,
                                              rect=opt.rect,
                                              rank=LOCAL_RANK,
                                              workers=workers,
                                              image_weights=opt.image_weights,
                                              quad=opt.quad,
                                              prefix=colorstr('train: '),
                                              shuffle=True,
                                              seed=opt.seed)

val_loader = create_dataloader(val_path,
                                       imgsz_val,
                                       batch_size // WORLD_SIZE * 2,
                                       gs,
                                       single_cls,
                                       hyp=hyp,
                                       cache=None if noval else opt.cache,
                                       rect=True,
                                       rank=-1,
                                       workers=workers * 2,
                                       pad=0.5,
                                       prefix=colorstr('val: '))[0]

if not resume:
    if not opt.noautoanchor:
        check_anchors(dataset, model=model, thr=hyp['anchor_t'], imgsz=imgsz_train)  # run AutoAnchor
    model.half().float()  # pre-reduce anchor precision

# Multi-scale (imgsz_train may be an int or an [h, w] list)
if opt.multi_scale:
    base = imgsz_train if isinstance(imgsz_train, int) else max(imgsz_train)  # longer side
    sz = random.randrange(int(base * 0.5), int(base * 1.5) + gs) // gs * gs  # size
    sf = sz / max(imgs.shape[2:])  # scale factor
    if sf != 1:
        ns = [math.ceil(x * sf / gs) * gs for x in imgs.shape[2:]]  # new shape (stretched to gs-multiple)
        imgs = nn.functional.interpolate(imgs, size=ns, mode='bilinear', align_corners=False)
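The size arithmetic of the multi-scale step can be checked without PyTorch. The following is only a sketch of that arithmetic under the assumption that the longer side of `imgsz_train` is used as the base; `multiscale_shape` and its `rng` parameter are hypothetical names introduced here.

```python
import math
import random


def multiscale_shape(batch_hw, imgsz_train, gs=32, rng=random):
    """Pick a random target size and return the new (h, w) for the batch."""
    base = imgsz_train if isinstance(imgsz_train, int) else max(imgsz_train)
    # Random size between 0.5x and 1.5x of the base, snapped down to a gs-multiple.
    sz = rng.randrange(int(base * 0.5), int(base * 1.5) + gs) // gs * gs
    sf = sz / max(batch_hw)  # scale factor relative to the longer side
    if sf == 1:
        return tuple(batch_hw)
    # Each side is stretched up to the next gs-multiple.
    return tuple(math.ceil(x * sf / gs) * gs for x in batch_hw)
```

Calling `multiscale_shape((480, 640), [480, 640], rng=random.Random(0))` returns a height and width that are both multiples of 32, with the height never exceeding the width.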

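Other changes

This adjusts the objectness loss gain. The larger imgsz_train is, the larger (max(imgsz_train) / 640) ** 2 becomes; the more detection layers (nl) there are, the smaller 3 / nl becomes. The two factors offset each other.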

if isinstance(imgsz_train, int):
    hyp['obj'] *= (imgsz_train / 640) ** 2 * 3 / nl  # scale to image size and layers
else:
    hyp['obj'] *= (max(imgsz_train) / 640) ** 2 * 3 / nl  # scale to image size and layers
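A few concrete numbers make the scaling easier to follow. The function below simply restates the expression above as a stand-alone sketch; `scale_obj_gain` is a hypothetical name, and the default of three detection layers is an assumption for the example.

```python
def scale_obj_gain(obj, imgsz_train, nl=3):
    """Scale the objectness gain by image size (longer side) and layer count."""
    base = imgsz_train if isinstance(imgsz_train, int) else max(imgsz_train)
    return obj * (base / 640) ** 2 * 3 / nl
```

Because only the longer side enters the formula, a 480x640 training size gets the same gain as a 640x640 one; doubling the longer side to 1280 multiplies the gain by 4, and going from 3 to 4 detection layers multiplies it by 3/4.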

LOGGER.info(f'Image sizes {imgsz_train} train, {imgsz_val} val\n'
                f'Using {train_loader.num_workers * WORLD_SIZE} dataloader workers\n'
                f"Logging results to {colorstr('bold', save_dir)}\n"
                f'Starting training for {epochs} epochs...')

# Multi-scale (imgsz_train may be an int or an [h, w] list)
if opt.multi_scale:
    base = imgsz_train if isinstance(imgsz_train, int) else max(imgsz_train)  # longer side
    sz = random.randrange(int(base * 0.5), int(base * 1.5) + gs) // gs * gs  # size
    sf = sz / max(imgs.shape[2:])  # scale factor
    if sf != 1:
        ns = [math.ceil(x * sf / gs) * gs for x in imgs.shape[2:]]  # new shape (stretched to gs-multiple)
        imgs = nn.functional.interpolate(imgs, size=ns, mode='bilinear', align_corners=False)

results, maps, _ = validate.run(data_dict,
                                batch_size=batch_size // WORLD_SIZE * 2,
                                imgsz=imgsz_val,
                                half=amp,
                                model=ema.ema,
                                single_cls=single_cls,
                                dataloader=val_loader,
                                save_dir=save_dir,
                                plots=False,
                                callbacks=callbacks,
                                compute_loss=compute_loss)

results, _, _ = validate.run(
                        data_dict,
                        batch_size=batch_size // WORLD_SIZE * 2,
                        imgsz=imgsz_val,
                        model=attempt_load(f, device).half(),
                        iou_thres=0.65 if is_coco else 0.60,  # best pycocotools at iou 0.65
                        single_cls=single_cls,
                        dataloader=val_loader,
                        save_dir=save_dir,
                        save_json=is_coco,
                        verbose=True,
                        plots=plots,
                        callbacks=callbacks,
                        compute_loss=compute_loss)  # val best model with plots

dataloaders.py

class: LoadImagesAndLabels()

[mosaic] Handle an image size given either as a single value or as several values.
Comment out self.mosaic = self.augment and not self.rect, so that mosaic is applied even when rect is enabled.

# self.mosaic = self.augment and not self.rect
self.mosaic = self.augment
if isinstance(img_size, int):
    self.mosaic_border = [-img_size // 2, -img_size // 2]
else:
    self.mosaic_border = [-img_size[0] // 2, -img_size[1] // 2]  # height, width

def load_image()

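If the commented-out lines are enabled, the image is resized (scaled up) here, so letterbox receives the enlarged image rather than the original.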

if isinstance(self.img_size, int):
    r = self.img_size / max(h0, w0)  # ratio
    if r != 1:  # if sizes are not equal
        interp = cv2.INTER_LINEAR if (self.augment or r > 1) else cv2.INTER_AREA
        im = cv2.resize(im, (math.ceil(w0 * r), math.ceil(h0 * r)), interpolation=interp)
else:  # img_size is an [h, w] pair
    rh, rw = self.img_size[0] / max(h0, w0), self.img_size[1] / max(h0, w0)
    # if rh != 1 or rw != 1:
    #     interp = cv2.INTER_LINEAR if (self.augment or rh > 1 or rw > 1) else cv2.INTER_AREA
    #     im = cv2.resize(im, (math.ceil(w0 * rw), math.ceil(h0 * rh)), interpolation=interp)

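With rh, rw = self.img_size[0] / max(h0, w0), self.img_size[1] / max(h0, w0), both sides are scaled relative to the longer side of the original. Note that rh and rw are equal only when img_size[0] == img_size[1], so the aspect ratio is strictly preserved only for a square target.
With rh, rw = self.img_size[0] / h0, self.img_size[1] / w0, the image is resized directly to a height x width of 480x640, which distorts it.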
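The two ratio choices can be compared on plain numbers. This is a sketch of the arithmetic only (no image is resized); `resized_hw` and its `per_axis` flag are hypothetical names introduced for the comparison.

```python
import math


def resized_hw(h0, w0, img_size, per_axis=False):
    """Return the (h, w) that each ratio choice would produce."""
    if per_axis:   # rh and rw computed from h0 and w0 separately
        rh, rw = img_size[0] / h0, img_size[1] / w0
    else:          # both ratios computed from the longer side
        rh, rw = img_size[0] / max(h0, w0), img_size[1] / max(h0, w0)
    return math.ceil(h0 * rh), math.ceil(w0 * rw)
```

For a 960x1280 (h x w) image and a 480x640 target, the per-axis ratios give exactly 480x640, while the longer-side ratios give 360x640, so the longer side matches the target width and the height ends up shorter than the target.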

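Depending on your needs: if you do not want the image enlarged and prefer to keep the original, use the code above as shown, with those lines left commented out.
In that case, going from load_image to letterbox, load_image leaves the image size unchanged and the original size is kept.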

After letterbox:

If the image is too large, it is scaled down proportionally

(padding is then needed to reach the target size).

Original (h*w): 1107*1222
Target size (h*w): 480*672
ratio: 0.43, i.e. min(480/1107, 672/1222)
New size (w*h): 530*480 (resize)

If the image is too small:

Original (h*w): 122*128
Target size (h*w): 480*672
ratio: 3.93 (scale-up is disabled, so ratio = 1)
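The two cases above can be reproduced with a few lines of arithmetic. This is a sketch of the ratio and padding computation only, assuming the usual letterbox rule of taking the smaller of the two side ratios; `letterbox_plan` is a hypothetical helper, not the YOLOv5 `letterbox` function, and it does not touch any image data.

```python
def letterbox_plan(h0, w0, target_hw, scaleup=True):
    """Return (ratio, (new_h, new_w), (pad_h, pad_w)) for a letterbox resize."""
    r = min(target_hw[0] / h0, target_hw[1] / w0)  # limiting side decides the ratio
    if not scaleup:  # only shrink, never enlarge
        r = min(r, 1.0)
    new_h, new_w = round(h0 * r), round(w0 * r)
    # Total padding needed on each axis to reach the target size.
    return r, (new_h, new_w), (target_hw[0] - new_h, target_hw[1] - new_w)
```

For the large image, `letterbox_plan(1107, 1222, (480, 672))` gives a resized image of 480 high by 530 wide, leaving 142 pixels of width to pad. For the small image with `scaleup=False`, the ratio is capped at 1 and the whole difference to 480x672 is filled by padding.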

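If you choose the starred option, letterbox needs almost no padding, because the previous step (load_image) has already brought the image close to 640x480.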

load_mosaic()

s = self.img_size
if isinstance(self.img_size, int):
    yc, xc = (int(random.uniform(-x, 2 * s + x)) for x in self.mosaic_border)  # mosaic center y, x
else:
    s_h, s_w = s[0], s[1]
    yc, xc = [int(random.uniform(-b, 2 * side + b)) for b, side in zip(self.mosaic_border, self.img_size)]
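To check the range the mosaic center can fall in for a rectangular size, the sampling can be run on its own. This is a sketch under the assumption that the border is minus half of each side, as set in the constructor; `mosaic_center` and its `rng` parameter are hypothetical names.

```python
import random


def mosaic_center(img_size, rng=random):
    """Sample the mosaic center (yc, xc) for an int or an (h, w) size."""
    if isinstance(img_size, int):
        img_size = (img_size, img_size)
    border = [-s // 2 for s in img_size]  # same values as mosaic_border
    # Each coordinate is drawn from [s / 2, 3 * s / 2] of its own axis.
    yc, xc = (int(rng.uniform(-b, 2 * s + b)) for b, s in zip(border, img_size))
    return yc, xc
```

For a 480x640 size, yc lies between 240 and 720 and xc between 320 and 960, i.e. the center stays in the middle half of the 960x1280 mosaic canvas on each axis.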

Clip the label coordinates to the range from 0 to the maximum:

# Concat/clip labels
labels4 = np.concatenate(labels4, 0)
for x in (labels4[:, 1:], *segments4):
    if isinstance(s, int):
        np.clip(x, 0, 2 * s, out=x)  # clip when using random_perspective()
    else:
        np.clip(x, 0, 2 * max(s), out=x)  # clip to the longer side when using random_perspective()

self.rect

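The height x width was set to 480x640 earlier.
If self.rect == True, the loader also accounts for the stride and padding and expands the 480x640 size outward, which gives the padded batch shape 512x672.
On entering letterbox, the target size becomes 512x672.

If self.rect == False, the target size on entering letterbox remains 480x640.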
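The 512x672 figure is consistent with rounding each side up to a stride multiple after adding half a stride of padding. The sketch below assumes a stride of 32 and pad=0.5 (the value passed to the validation create_dataloader); `padded_shape` is a hypothetical helper that simplifies the real batch-shape computation, which also takes the aspect ratio of each batch into account.

```python
import math


def padded_shape(hw, stride=32, pad=0.5):
    """Round each side up to a stride multiple after adding `pad` strides."""
    return [math.ceil(x / stride + pad) * stride for x in hw]
```

With these assumptions `padded_shape([480, 640])` gives `[512, 672]`, while `pad=0.0` leaves the size at `[480, 640]`.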

In train.py:

val_loader = create_dataloader(val_path,
                                       imgsz_val,
                                       batch_size // WORLD_SIZE * 2,
                                       gs,
                                       single_cls,
                                       hyp=hyp,
                                       cache=None if noval else opt.cache,
                                       rect=opt.rect,
                                       rank=-1,
                                       workers=workers * 2,
                                       pad=0.5,
                                       prefix=colorstr('val: '))[0]
