YOLOv5 training parameters explained

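There are a lot of common terms I don't really understand. This isn't my field and I'm still a beginner, so being a novice is my original sin and I'll learn what I can.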

1. Official explanation

See https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data, which contains this sentence:

For training command outputs and further details please see the training section of Google Colab Notebook.

Open that notebook (you may need a workaround to reach it; you know what I mean).
Here is a summary of what the notebook says about training.

  1. Actual training is much longer, around 300-1000 epochs, depending on your dataset.
  2. --cfg selects the model file (models/yolov5s.yaml).
  3. --data selects the dataset file (data/coco128.yaml).
  4. --weights specifies the initial weights file (for random initialization, pass --weights '').
  5. All training results are saved to runs/exp0 for the first experiment, then runs/exp1, runs/exp2 etc. for subsequent experiments. (In my experiments the numbering stopped at 10; after that, exp10 just kept being overwritten.)
  6. TensorBoard is optional (I don't know how to use it yet).
  7. A Mosaic Dataloader is used for training.
  8. View test_batch0_gt.jpg to see test batch 0 ground truth labels.
  9. View test_batch0_pred.jpg to see test batch 0 predictions.
  10. Training losses and performance metrics are saved to Tensorboard and also to a runs/exp0/results.txt logfile. results.txt is plotted as results.png after training completes.
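
One plausible explanation for the numbering getting stuck at exp10 (a guess, not confirmed against the yolov5 source) is that the "latest" run directory is found by sorting names as strings, where 'exp9' sorts after 'exp10'. The sketch below uses a hypothetical helper, next_run_index, to show how a string sort keeps returning 10 while a numeric sort moves on to 11.

```python
import re

def next_run_index(names, numeric=True):
    """Return the index for the next run directory, given names like 'exp3'.

    Hypothetical helper for illustration; not part of yolov5.
    """
    if not names:
        return 0
    if numeric:
        # Compare the trailing numbers as integers.
        last = max(int(re.search(r'\d+$', n).group()) for n in names)
    else:
        # Take the last name in plain string order, then read its number.
        last = int(re.search(r'\d+$', sorted(names)[-1]).group())
    return last + 1

existing = ['exp%d' % i for i in range(11)]  # exp0 .. exp10

print(sorted(existing)[-1])                      # last name in string order
print(next_run_index(existing, numeric=False))   # string sort
print(next_run_index(existing, numeric=True))    # numeric sort
```

With the string sort, every new run after exp10 would compute the same index again, which would match the overwriting described in item 5.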

And that's all there is. Clearly it doesn't help much with deeper understanding; it's barely enough to get by.

2. Reading the source

All the arguments are defined here.

if __name__ == '__main__':
    check_git_status()
    parser = argparse.ArgumentParser()
    parser.add_argument('--cfg', type=str, default='models/yolov5s.yaml', help='model.yaml path')
    parser.add_argument('--data', type=str, default='data/coco128.yaml', help='data.yaml path')
    parser.add_argument('--hyp', type=str, default='', help='hyp.yaml path (optional)')
    parser.add_argument('--epochs', type=int, default=300)
    parser.add_argument('--batch-size', type=int, default=16)
    parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='train,test sizes')
    parser.add_argument('--rect', action='store_true', help='rectangular training')
    parser.add_argument('--resume', nargs='?', const='get_last', default=False,
                        help='resume from given path/to/last.pt, or most recent run if blank.')
    parser.add_argument('--nosave', action='store_true', help='only save final checkpoint')
    parser.add_argument('--notest', action='store_true', help='only test final epoch')
    parser.add_argument('--noautoanchor', action='store_true', help='disable autoanchor check')
    parser.add_argument('--evolve', action='store_true', help='evolve hyperparameters')
    parser.add_argument('--bucket', type=str, default='', help='gsutil bucket')
    parser.add_argument('--cache-images', action='store_true', help='cache images for faster training')
    parser.add_argument('--weights', type=str, default='', help='initial weights path')
    parser.add_argument('--name', default='', help='renames results.txt to results_name.txt if supplied')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%%')
    parser.add_argument('--single-cls', action='store_true', help='train as single-class dataset')
    opt = parser.parse_args()

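Summary:

  1. cfg, data, weights: the three arguments covered above; in practice I always pass them.
  2. hyp: I don't need this for now; it points to a file of hyperparameters (learning rate and so on).
  3. epochs: number of training epochs, default 300; I set it explicitly.
  4. batch-size: how many images are fed in per step. My memory can only handle 16, so I leave it at the default of 16.
  5. img-size: image size for the training and test sets (I read it as resolution), default 640, 640. nargs='+' means the argument accepts one or more values.
  6. rect: adding --rect sets rect to True. I haven't verified the effect (going by the help string, it enables rectangular training).
  7. resume: continue an earlier run from its last.pt checkpoint, or from the most recent run if no path is given (per the help string). I first took this to mean retraining with the epoch count starting over, but resuming from a checkpoint should pick up where it left off.
  8. notest: only test the final epoch (so the trend during training presumably can't be seen).
  9. evolve: evolve the hyperparameters (hyp); worth a try.
  10. cache-images: cache images for faster training; worth a try.
  11. name: renames results.txt to results_name.txt if supplied.
  12. device: cuda device, i.e. 0 or 0,1,2,3 or cpu. Mine already uses the GTX 1060 by default, so no change needed.
  13. single-cls: train as single-class dataset; no use for it yet.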
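
To make the argparse behaviour behind img-size, rect and resume concrete, here is a small sketch that copies just those three definitions from the parser above and parses a few hand-written argument lists. The argument lists are made up for illustration; 'path/to/last.pt' is the placeholder from the help string, not a real file.

```python
import argparse

parser = argparse.ArgumentParser()
# nargs='+': one or more integers are collected into a list.
parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640])
# store_true: False unless the flag is present.
parser.add_argument('--rect', action='store_true')
# nargs='?': flag absent -> default, flag alone -> const, flag with value -> value.
parser.add_argument('--resume', nargs='?', const='get_last', default=False)

defaults = parser.parse_args([])
flags_only = parser.parse_args(['--img-size', '416', '--rect', '--resume'])
with_path = parser.parse_args(['--img-size', '416', '512',
                               '--resume', 'path/to/last.pt'])

print(defaults.img_size, defaults.rect, defaults.resume)
print(flags_only.img_size, flags_only.rect, flags_only.resume)
print(with_path.img_size, with_path.rect, with_path.resume)
```

Note that dashes in option names become underscores on the parsed object, so --img-size is read back as img_size.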

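I didn't really understand the following ones:
noautoanchor: disable autoanchor check
nosave: only save final checkpoint
bucket: gsutil bucket (probably related to Google Cloud; I likely won't need it)
multi-scale: vary img-size +/- 50% (the %% in the source is just argparse's escape for a literal %)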
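
As a rough picture of what "vary img-size +/- 50%" could mean, the sketch below draws a random size between 50% and 150% of the base size and rounds it down to a multiple of the network stride. This is not taken from train.py: the function name random_multiscale_size and the stride of 32 are assumptions made for illustration.

```python
import random

def random_multiscale_size(img_size=640, stride=32, rng=random):
    """Pick a training size within +/- 50% of img_size, aligned to stride.

    Hypothetical helper; stride=32 is an assumed value.
    """
    low = int(img_size * 0.5)
    high = int(img_size * 1.5)
    size = rng.randint(low, high)                  # inclusive on both ends
    return max(stride, size // stride * stride)    # round down to a stride multiple

rng = random.Random(0)  # fixed seed so the run is repeatable
sizes = [random_multiscale_size(640, 32, rng) for _ in range(5)]
print(sizes)
```

For a base size of 640 this yields sizes between 320 and 960 in steps of 32, so each batch could be trained at a different resolution.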

Having read through this, my command line should become:

python train.py --epoch 53 --data .\data\junk2020.yaml --cfg .\models\yolov5s.yaml --weight runs\exp10\weights\best.pt --evolve --cache-images 
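
A side note on the spelling of the flags: the command uses --epoch and --weight, while the parser defines --epochs and --weights. argparse accepts unambiguous prefixes of long options by default, which is why this still works. The sketch below reproduces that with only the two relevant definitions; 'best.pt' is a placeholder value.

```python
import argparse

# allow_abbrev defaults to True, so unique prefixes of long options are accepted.
parser = argparse.ArgumentParser()
parser.add_argument('--epochs', type=int, default=300)
parser.add_argument('--weights', type=str, default='')

opt = parser.parse_args(['--epoch', '53', '--weight', 'best.pt'])
print(opt.epochs, opt.weights)
```

Writing the full names is still the safer habit, since an abbreviation stops working as soon as another option with the same prefix is added.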

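Let's test it.
Memory apparently wasn't enough: the computer nearly crashed and I had to kill python midway, so leave out the cache option if your machine can't handle it.
I also don't know where the hyp produced by evolve gets stored. I'll look into it tomorrow.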

python train.py --epoch 80 --data .\data\junk2020.yaml --cfg .\models\yolov5s.yaml --weight runs\exp10\weights\best.pt --evolve

The result: evolve fails.

Traceback (most recent call last):
  File "train.py", line 449, in <module>
    print_mutation(hyp, results, opt.bucket)
  File "D:\ForSpeed\junk_yolov5\yolov5\utils\utils.py", line 823, in print_mutation
    b = '%10.3g' * len(hyp) % tuple(hyp.values())  # hyperparam values
TypeError: must be real number, not str

Since this is awkward to debug, I removed evolve for now.

3. Interpreting the visualized results

Here is what each panel of results.png shows:
[Figure: results.png]

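  1. GIoU: presumably the mean GIoU loss; the smaller it is, the more accurate the boxes.
  2. Objectness: presumably the mean objectness (detection) loss; the smaller it is, the more accurate the detection.
  3. Classification: presumably the mean classification loss; the smaller it is, the more accurate the classification.
  4. Precision: correct detections / all detections made.
  5. Recall: correct detections / all objects that should have been detected.
  6. mAP@0.5 & mAP@0.5:0.95: the explanation I found puts it well. In short, AP is the area enclosed by the curve drawn with Precision and Recall as the two axes, m means the mean over classes, and the number after @ is the IoU threshold used to decide whether a detection counts as positive or negative. @0.5:0.95 means the thresholds 0.5:0.05:0.95 are each evaluated and then averaged.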
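
The definitions in items 4 to 6 can be sketched in a few lines. This is a simplified illustration, not the evaluation code yolov5 uses: the AP here is a plain rectangle sum over the precision-recall points with no interpolation, and the detection list is made-up toy data.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def average_precision(is_true_positive, n_ground_truth):
    """Area under the precision-recall curve (simplified, no interpolation).

    is_true_positive lists detections sorted by descending confidence;
    each entry says whether that detection matched a ground-truth box.
    """
    ap, tp, prev_recall = 0.0, 0, 0.0
    for rank, hit in enumerate(is_true_positive, start=1):
        tp += hit
        precision = tp / rank
        recall = tp / n_ground_truth
        ap += (recall - prev_recall) * precision  # rectangle under this step
        prev_recall = recall
    return ap

# IoU thresholds behind mAP@0.5:0.95, i.e. 0.5 to 0.95 in steps of 0.05.
iou_thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

toy_hits = [True, True, False, True]  # 4 detections, 4 ground-truth objects
p, r = precision_recall(tp=3, fp=1, fn=1)
ap = average_precision(toy_hits, n_ground_truth=4)
print(p, r, ap)
print(iou_thresholds)
```

mAP@0.5 would average such AP values over all classes at an IoU threshold of 0.5, and mAP@0.5:0.95 would additionally average over the ten thresholds in iou_thresholds.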

4. Fixing the evolve error

Traceback (most recent call last):
  File "train.py", line 449, in <module>
    print_mutation(hyp, results, opt.bucket)
  File "D:\ForSpeed\junk_yolov5\yolov5\utils\utils.py", line 823, in print_mutation
    b = '%10.3g' * len(hyp) % tuple(hyp.values())  # hyperparam values
TypeError: must be real number, not str

Look at the line b = '%10.3g' * len(hyp) % tuple(hyp.values()): it pulls all the values out of the hyp dict into a tuple and formats them in bulk with %10.3g.

hyp = {'optimizer': 'SGD',  # ['adam', 'SGD', None] if none, default is SGD
       'lr0': 0.01,  # initial learning rate (SGD=1E-2, Adam=1E-3)
       'momentum': 0.937,  # SGD momentum/Adam beta1
       'weight_decay': 5e-4,  # optimizer weight decay
       'giou': 0.05,  # giou loss gain
       'cls': 0.58,  # cls loss gain
       'cls_pw': 1.0,  # cls BCELoss positive_weight
       'obj': 1.0,  # obj loss gain (*=img_size/320 if img_size != 320)
       'obj_pw': 1.0,  # obj BCELoss positive_weight
       'iou_t': 0.20,  # iou training threshold
       'anchor_t': 4.0,  # anchor-multiple threshold
       'fl_gamma': 0.0,  # focal loss gamma (efficientDet default is gamma=1.5)
       'hsv_h': 0.014,  # image HSV-Hue augmentation (fraction)
       'hsv_s': 0.68,  # image HSV-Saturation augmentation (fraction)
       'hsv_v': 0.36,  # image HSV-Value augmentation (fraction)
       'degrees': 0.0,  # image rotation (+/- deg)
       'translate': 0.0,  # image translation (+/- fraction)
       'scale': 0.5,  # image scale (+/- gain)
       'shear': 0.0}  # image shear (+/- deg)

Looking at the values, the first one is the string 'SGD', so the numeric format fails.
Change b = '%10.3g' * len(hyp) % tuple(hyp.values()) to
b = '%10s' * 1 % (list(hyp.values())[0],) + '%10.3g' * (len(hyp) - 1) % tuple(list(hyp.values())[1:])
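
The failure and an alternative fix can be reproduced in isolation. The sketch below uses a shortened hyp dict and a hypothetical helper, format_hyp_values, that picks the format by value type instead of relying on the string being in first position. It is one possible variant, not the change made in the repository.

```python
# Shortened copy of the hyp dict, enough to trigger the error.
hyp = {'optimizer': 'SGD', 'lr0': 0.01, 'momentum': 0.937}

def format_hyp_values(hyp):
    """Format strings with %10s and numbers with %10.3g (hypothetical helper)."""
    parts = []
    for value in hyp.values():
        if isinstance(value, str):
            parts.append('%10s' % value)
        else:
            parts.append('%10.3g' % value)
    return ''.join(parts)

try:
    b = '%10.3g' * len(hyp) % tuple(hyp.values())  # the original line
except TypeError as err:
    error_message = str(err)
    print('original line fails:', error_message)

formatted = format_hyp_values(hyp)
print(repr(formatted))
```

Checking the type per value keeps working if a string entry is added somewhere other than the first position.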
Train for one epoch to test it:

Traceback (most recent call last):
  File "train.py", line 449, in <module>
    print_mutation(hyp, results, opt.bucket)
  File "D:\ForSpeed\junk_yolov5\yolov5\utils\utils.py", line 837, in print_mutation
    x = np.unique(np.loadtxt('evolve.txt', ndmin=2), axis=0)  # load unique rows
  File "C:\Users\15518\AppData\Local\Programs\Python\Python37\lib\site-packages\numpy\lib\npyio.py", line 1146, in loadtxt
    for x in read_data(_loadtxt_chunksize):
  File "C:\Users\15518\AppData\Local\Programs\Python\Python37\lib\site-packages\numpy\lib\npyio.py", line 1074, in read_data
    items = [conv(val) for (conv, val) in zip(converters, vals)]
  File "C:\Users\15518\AppData\Local\Programs\Python\Python37\lib\site-packages\numpy\lib\npyio.py", line 1074, in <listcomp>
    items = [conv(val) for (conv, val) in zip(converters, vals)]
  File "C:\Users\15518\AppData\Local\Programs\Python\Python37\lib\site-packages\numpy\lib\npyio.py", line 781, in floatconv
    return float(x)
ValueError: could not convert string to float: 'SGD'

The problem here is np.loadtxt('evolve.txt', ndmin=2): the txt file now contains a string, so the conversion to float fails.
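
The same conversion failure can be shown without numpy, together with one way around it: write only the numeric hyperparameters to the row. This is a sketch on a shortened hyp dict, not the code in utils.py, and the names numeric_hyp, row and parsed are introduced here for illustration.

```python
hyp = {'optimizer': 'SGD', 'lr0': 0.01, 'momentum': 0.937}

# float() is what the loader ends up calling on each field.
try:
    float(hyp['optimizer'])
except ValueError as err:
    conversion_error = str(err)
    print('float() fails:', conversion_error)

# Keep only numeric values before building the text row.
numeric_hyp = {k: v for k, v in hyp.items() if isinstance(v, (int, float))}
row = ''.join('%10.3g' % v for v in numeric_hyp.values())

# A row built this way parses back into floats cleanly.
parsed = [float(token) for token in row.split()]
print(parsed)
```

A row that was already written with 'SGD' in it would presumably still have to be deleted from evolve.txt by hand before the file loads again.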

Let's see what happens with the first item (the 'SGD' entry) removed:

Traceback (most recent call last):
  File "train.py", line 437, in <module>
    hyp[k] = x[i + 7] * v[i]  # mutate
IndexError: index 18 is out of bounds for axis 0 with size 18
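
Reading the numbers off this traceback: index 18 comes from i + 7, so the loop was at i = 11, and the row x holds only 18 values. If the offset of 7 means that seven result columns precede the hyperparameters (an assumption based only on the x[i + 7] indexing), then with the 18 hyperparameters left after removing 'optimizer' the row would need 7 + 18 = 25 values. A stale evolve.txt from the earlier attempts is one possible cause, but that is a guess. The sketch below adds a hypothetical guard, check_row_width, that reports such a mismatch before the mutation loop runs.

```python
N_RESULT_COLUMNS = 7  # assumption: taken from the x[i + 7] offset in the traceback

def check_row_width(row, hyp):
    """Raise a readable error if the row cannot hold results plus all hyperparameters."""
    expected = N_RESULT_COLUMNS + len(hyp)
    if len(row) != expected:
        raise ValueError('evolve row has %d columns, expected %d'
                         % (len(row), expected))

hyp = {'h%d' % i: 0.0 for i in range(18)}  # 18 placeholder hyperparameters
row = [0.0] * 18                           # same width as x in the traceback

try:
    check_row_width(row, hyp)
except ValueError as err:
    width_error = str(err)
    print(width_error)
```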

To be continued...
