Loss does not converge during model distillation (Hugging Face)

  1. Check the Data: Ensure that your data is clean, properly preprocessed, and balanced. Imbalanced datasets can cause the model to be biased towards the majority class, leading to poor convergence.

  2. Learning Rate: The learning rate you’ve set (5e-3) is likely too high for fine-tuning a pretrained transformer. Try reducing it, for example to 2e-5, to see if that helps the loss converge. You can also use learning rate schedulers that adjust the learning rate during training.

  3. Batch Size: Experiment with different batch sizes. Smaller batches sometimes help with convergence because they produce noisier gradients, which can help the optimizer escape local minima.

  4. Model Complexity: If your model is too complex for the task, it might overfit and not converge well. Try simplifying the model or using regularization techniques like dropout.

  5. Loss Function: Make sure you are using the appropriate loss function for your task. For binary classification, Binary Cross-Entropy is commonly used.

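trainer = Trainer(
...
# Trainer has no compute_loss keyword argument. Recent transformers releases accept compute_loss_func
# (added around v4.46, as far as I know); on older versions, subclass Trainer and override compute_loss instead.
# unsqueeze(1) assumes the model emits a single logit per example, i.e. logits of shape (batch, 1).
compute_loss_func=lambda outputs, labels, num_items_in_batch=None: torch.nn.functional.binary_cross_entropy_with_logits(outputs.logits, labels.float().unsqueeze(1))
)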
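
To make the learning-rate advice in item 2 concrete, here is a minimal pure-Python sketch of a linear warmup followed by linear decay, which is the shape of the "linear" schedule that Trainer uses by default. The function name and the values for warmup_steps and total_steps are illustrative assumptions, not settings from the original post.

```python
def linear_warmup_decay(step, base_lr=2e-5, warmup_steps=100, total_steps=1000):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0.

    All default values here are illustrative assumptions.
    """
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        scale = step / max(1, warmup_steps)
    else:
        # Decay linearly from base_lr down to 0 at total_steps.
        scale = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * scale


# Print the learning rate at a few points of the schedule.
for step in (0, 50, 100, 550, 1000):
    print(step, linear_warmup_decay(step))
```

In TrainingArguments, the corresponding knobs are learning_rate, lr_scheduler_type, and warmup_steps (or warmup_ratio).
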
  6. Early Stopping: Use early stopping to prevent overfitting. This stops training if the model’s performance on the validation set doesn’t improve for a specified number of consecutive evaluations.
trainer = Trainer(
...
callbacks=[EarlyStoppingCallback(early_stopping_patience=3)] # Added early stopping
)

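Remember to import EarlyStoppingCallback from transformers if you decide to use early stopping. The callback also needs load_best_model_at_end=True and a periodic evaluation strategy set in TrainingArguments; otherwise it has no metric to monitor.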
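
The patience logic behind early stopping can be sketched in a few lines of plain Python. This is a simplified illustration, not the actual EarlyStoppingCallback implementation; stopping_step and min_delta are hypothetical names (the callback's own threshold parameter is early_stopping_threshold).

```python
def stopping_step(eval_losses, patience=3, min_delta=0.0):
    """Return the index of the evaluation at which training would stop, or None.

    Simplified sketch: a lower loss counts as an improvement only if it beats
    the best value seen so far by more than min_delta.
    """
    best = float("inf")
    bad_evals = 0
    for i, loss in enumerate(eval_losses):
        if best - loss > min_delta:
            best = loss
            bad_evals = 0  # improvement resets the counter
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return i
    return None


# Hypothetical validation losses, one per evaluation.
print(stopping_step([0.9, 0.8, 0.81, 0.82, 0.80, 0.7], patience=3))
```

Note that patience counts evaluations, not epochs, so how soon training stops depends on how often you evaluate.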

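  7. Gradient Clipping: Gradient clipping can prevent exploding gradients, which can cause the loss to diverge. max_grad_norm already defaults to 1.0 in TrainingArguments, so setting it explicitly mainly documents the choice.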
training_args = TrainingArguments(
..
max_grad_norm=1.0, # Gradient clipping
)
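
For intuition about what max_grad_norm controls, the following pure-Python sketch clips a flat list of gradient values by their global L2 norm. It follows the same scaling rule as torch.nn.utils.clip_grad_norm_ as I understand it, but the function itself and its eps default are illustrative assumptions rather than library code.

```python
import math


def clip_by_global_norm(grads, max_norm=1.0, eps=1e-6):
    """Scale a flat list of gradient values so their L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Gradients under the threshold are left untouched (scale capped at 1.0).
    scale = min(1.0, max_norm / (total_norm + eps))
    return [g * scale for g in grads], total_norm


# Hypothetical gradient with norm 5.0, clipped to a norm of about 1.0.
clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
print(norm_before, clipped)
```

Logging the pre-clipping norm during training is a cheap way to check whether exploding gradients are really the cause of a diverging loss.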