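While training YOLOv9, resuming from a checkpoint failed with "ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's ". The error is raised when the optimizer state saved in the checkpoint is loaded back into the newly built optimizer. Most of the explanations I found attribute it to the SGD optimizer, so for now I use the following workaround.
When resuming training, comment out the line that reloads the optimizer state.
In train_dual, find these lines: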
# Resume
best_fitness, start_epoch = 0.0, 0
if pretrained:
    if resume:
        best_fitness, start_epoch, epochs = smart_resume(ckpt, optimizer, ema, weights, epochs, resume)
    del ckpt, csd
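Before changing anything, it may help to confirm which parameter groups actually differ. The sketch below is not part of YOLOv9; `param_group_sizes` and `describe_mismatch` are hypothetical helper names. It assumes only the usual layout of a PyTorch optimizer state dict, a dict with a `param_groups` list whose entries each carry a `params` list, and it runs here on hand-written dicts so that it stays self-contained.

```python
def param_group_sizes(state_dict):
    """Return the number of parameters in each param group of an optimizer state dict."""
    return [len(group["params"]) for group in state_dict["param_groups"]]


def describe_mismatch(current_state, saved_state):
    """Compare two optimizer state dicts and list the differences in group sizes."""
    current = param_group_sizes(current_state)
    saved = param_group_sizes(saved_state)
    problems = []
    if len(current) != len(saved):
        problems.append(
            f"group count differs: optimizer has {len(current)}, checkpoint has {len(saved)}"
        )
    for i, (c, s) in enumerate(zip(current, saved)):
        if c != s:
            problems.append(f"group {i}: optimizer has {c} params, checkpoint has {s}")
    return problems


# Hand-written example data shaped like optimizer.state_dict() (illustrative values only)
current_state = {
    "state": {},
    "param_groups": [{"lr": 0.01, "params": [0, 1, 2]}, {"lr": 0.01, "params": [3]}],
}
saved_state = {
    "state": {},
    "param_groups": [{"lr": 0.01, "params": [0, 1]}, {"lr": 0.01, "params": [2, 3]}],
}

for line in describe_mismatch(current_state, saved_state):
    print(line)
```

In train_dual, one could presumably call `describe_mismatch(optimizer.state_dict(), ckpt['optimizer'])` just before `smart_resume`, while `ckpt` still exists and after checking that `ckpt['optimizer']` is not None.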
Then go into the smart_resume function:
if ckpt['optimizer'] is not None:
    # optimizer.load_state_dict(ckpt['optimizer'])  # optimizer
    best_fitness = ckpt['best_fitness']
if ema and ckpt.get('ema'):
    # ema.ema.load_state_dict(ckpt['ema'].float().state_dict())  # EMA
    ema.updates = ckpt['updates']
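Comment out both optimizer.load_state_dict(ckpt['optimizer']) and ema.ema.load_state_dict(ckpt['ema'].float().state_dict()), as in the snippet above. Keep in mind that this skips restoring the optimizer state (such as SGD momentum buffers) and the saved EMA weights, so it is a workaround rather than a real fix.
Training then resumes successfully.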