[PyTorch] Notes on a loss that does not decrease

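While training with PyTorch, the loss kept oscillating around the same large value, so something was clearly wrong with the model. After a long search I found that PyTorch had been reporting the mistake all along, and I had ignored it.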

/home/anaconda3/envs/M/lib/python3.6/site-packages/torch/nn/modules/loss.py:528: UserWarning: Using a target size (torch.Size([256])) that is different to the input size (torch.Size([256, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
  return F.mse_loss(input, target, reduction=self.reduction)
/home/anaconda3/envs/M/lib/python3.6/site-packages/torch/nn/modules/loss.py:528: UserWarning: Using a target size (torch.Size([186])) that is different to the input size (torch.Size([186, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
  return F.mse_loss(input, target, reduction=self.reduction)
epoch =    0   loss = 35.4517
epoch =   50   loss = 11.7757
epoch =  100   loss = 11.7866
epoch =  150   loss = 11.9761
epoch =  200   loss = 11.6557
epoch =  250   loss = 11.6361
epoch =  300   loss = 11.7160
epoch =  350   loss = 11.6047
epoch =  400   loss = 11.6412
epoch =  450   loss = 11.6390
epoch =  500   loss = 11.7137
epoch =  550   loss = 11.6388
epoch =  600   loss = 11.7710
epoch =  650   loss = 11.6640
epoch =  700   loss = 11.6348
epoch =  750   loss = 11.7574
epoch =  800   loss = 11.5811
epoch =  850   loss = 11.7239
epoch =  900   loss = 11.8139
epoch =  950   loss = 11.7817

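As the output above shows, PyTorch warns that the two arguments to the loss (input, target) have different shapes: target is 1-D with shape [256], while input is 2-D with shape [256, 1] (one extra dimension of size 1). Although they hold the same number of elements, PyTorch does not transpose or reshape them automatically; it broadcasts. Both tensors are expanded to shape [256, 256], so every prediction is compared with every target in the batch instead of only its own. Under that loss the best the model can do is predict roughly the mean of the targets for every sample, which explains why the predictions all fall within a very narrow range, as shown below: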

Predicted: [2.542457342147827, 2.467282295227051, 2.467350721359253, 2.4719324111938477, 2.4960854053497314, 2.5416831970214844, 2.5303843021392822]
Actual: [1.9479999542236328, 2.5869998931884766, 2.2699999809265137, 2.38100004196167, 1.4229999780654907, 1.8990000486373901, 1.4620000123977661]
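
To make the broadcasting effect concrete, here is a minimal sketch using three made-up values (the tensors `pred` and `target` are illustrative, not the data from this post). It shows that a (3, 1) input and a (3,) target are expanded to (3, 3), that even perfect predictions get a non-zero loss, and that predicting the target mean everywhere scores better under the mismatched shapes.

```python
import warnings

import torch
import torch.nn.functional as F

# Toy stand-ins for the shapes in the warning: model output (N, 1), labels (N,).
pred = torch.tensor([[1.0], [2.0], [3.0]])   # shape (3, 1)
target = torch.tensor([1.0, 2.0, 3.0])       # shape (3,)

# Broadcasting expands both operands to (3, 3): every prediction minus every target.
diff = pred - target
print(diff.shape)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the size-mismatch UserWarning
    # Perfect predictions, yet the mismatched shapes give a non-zero loss.
    loss_mismatched = F.mse_loss(pred, target)
    # Predicting the target mean for every sample scores better under the bug.
    mean_pred = torch.full((3, 1), target.mean().item())
    loss_collapsed = F.mse_loss(mean_pred, target)

# With matching shapes, the same perfect predictions give zero loss.
loss_matched = F.mse_loss(pred.squeeze(1), target)

print(loss_mismatched.item(), loss_collapsed.item(), loss_matched.item())
```

In this toy case the collapsed prediction has the lower loss, so gradient descent is pushed toward a constant output near the target mean, which is consistent with the narrow range of predictions listed above.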

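Solution:

Use the tensor squeeze operation (removes a dimension of size 1) or the unsqueeze operation (adds a dimension of size 1) so that input and target have the same shape. After this change the loss behaved normally and the warning disappeared, as shown below: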

Starting training with saved checkpoints
epoch =    0   loss = 29.6703
epoch =   50   loss = 1.3070
epoch =  100   loss = 0.3954
epoch =  150   loss = 0.2571
epoch =  200   loss = 0.2827
epoch =  250   loss = 0.2776
epoch =  300   loss = 0.1670
epoch =  350   loss = 0.1842
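
A minimal sketch of the fix, assuming the same shapes as in the warning (the random tensors and the helper `checked_mse` are hypothetical and only for illustration). Either side can be reshaped; passing an explicit dimension to `squeeze` avoids accidentally dropping the batch dimension when a batch has a single sample.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

criterion = nn.MSELoss()
output = torch.randn(256, 1)   # model output, shape (256, 1)
target = torch.randn(256)      # labels, shape (256,)

# Option 1: drop the trailing size-1 dimension of the output -> both (256,)
loss_a = criterion(output.squeeze(1), target)

# Option 2: add a trailing dimension to the target -> both (256, 1)
loss_b = criterion(output, target.unsqueeze(1))


def checked_mse(output, target):
    """Hypothetical helper: fail loudly instead of broadcasting silently."""
    if output.shape != target.shape:
        raise ValueError(
            f"shape mismatch: output {tuple(output.shape)} vs target {tuple(target.shape)}"
        )
    return F.mse_loss(output, target)


loss_c = checked_mse(output.squeeze(1), target)
print(loss_a.item(), loss_b.item(), loss_c.item())
```

Both options compute the same value; a shape check like the one in `checked_mse` turns an easily missed warning into an error at the first batch.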

Second case: the loss stays at a high value.


[INFO] EPOCH: 1/200 --> Val loss: 1.590242, Val accuracy: 0.2767

[INFO] EPOCH: 2/200 --> Val loss: 1.589383, Val accuracy: 0.2767

[INFO] EPOCH: 3/200 --> Val loss: 1.592154, Val accuracy: 0.2767

[INFO] EPOCH: 4/200 --> Val loss: 1.590423, Val accuracy: 0.2767

After checking, the cause turned out to be the weight_decay setting passed to the optimizer (not the loss function itself):

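# Loss & Optimizer
# (fragment from inside a trainer method; assumes `import torch.nn as nn` and `import torch.optim as optim`)
criterion = nn.NLLLoss()
# Version that stalled: weight_decay = 0.001
optimizer = optim.Adam(self.model.parameters(), lr=self.init_lr, weight_decay=0.001)
# Version that trained normally: no weight_decay
# optimizer = optim.Adam(self.model.parameters(), lr=self.init_lr)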

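Without weight_decay the loss decreases normally; with weight_decay set, the loss stays flat. In that situation the weight_decay value should be reduced. Note that weight_decay does not change the learning rate: it is an L2 penalty that pulls every weight toward zero on each step. When it is too large relative to the gradient coming from the data, the penalty dominates the update and the model cannot fit the data, so the loss stops changing.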
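
The following sketch illustrates what weight_decay does in `torch.optim.Adam`, under the artificial assumption of a loss whose gradient is exactly zero (the parameter `w` and the values used are made up). With no signal from the data, the weights still shrink toward zero on every step, while the learning rate stored in the optimizer is left untouched.

```python
import torch

# A toy parameter; the "loss" below contributes a gradient of exactly zero.
w = torch.nn.Parameter(torch.tensor([1.0, -2.0]))
start = w.detach().clone()

opt = torch.optim.Adam([w], lr=0.01, weight_decay=0.001)

for _ in range(10):
    opt.zero_grad()
    loss = (w * 0.0).sum()   # zero gradient with respect to w
    loss.backward()
    opt.step()               # only the weight_decay term moves w

print(start, w.detach())             # both entries have moved toward zero
print(opt.param_groups[0]["lr"])     # learning rate is unchanged
```

Because Adam normalizes the gradient, the decay term alone can move the weights by roughly one learning-rate-sized step per iteration when the data gradient is small. If regularization is still wanted, a smaller weight_decay, or the decoupled variant `torch.optim.AdamW`, may be worth trying; whether either helps in this particular model is not something this post verifies.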
