Problem 1
The main issue is that deepcopy can break some of a model's attributes. After the model is snapshotted with
self.best_model = copy.deepcopy(model).cpu()
only the model's parameters (its state_dict) should be saved, not the entire model object:
torch.save(self.best_model.state_dict(),'...')
When loading it again, construct the model first and then load the parameters into it:
model.load_state_dict(torch.load('model.pth'))
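A minimal sketch of this save/load pattern, assuming PyTorch is installed. The TinyClassifier class and the checkpoint path are illustrative placeholders, not from the original code:

```python
import os
import tempfile
import torch
import torch.nn as nn

# Hypothetical stand-in for the real model class being checkpointed.
class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)

model = TinyClassifier()

# Save only the parameters: the checkpoint then does not pickle the
# model class itself, so it survives refactors of the model code.
path = os.path.join(tempfile.mkdtemp(), 'model.pth')
torch.save(model.state_dict(), path)

# Load: define/construct the model first, then copy the parameters in.
restored = TinyClassifier()
restored.load_state_dict(torch.load(path))
```

The restored weights are exact copies of the originals, which is why saving the state_dict is sufficient for inference or resumed training.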
Problem 2
Change this:
optimizer = torch.optim.AdamW(model.parameters(),lr=1e-5)
deberta = DebertaV2Model.from_pretrained("/home/xiaoguzai/模型/deberta-v3-large")
model = ClassificationModel(deberta)
to this:
deberta = DebertaV2Model.from_pretrained("/home/xiaoguzai/模型/deberta-v3-large")
model = ClassificationModel(deberta)
optimizer = torch.optim.AdamW(model.parameters(),lr=1e-5)
The reason: in the first version,
optimizer = torch.optim.AdamW(model.parameters(),lr=1e-5)
is constructed before model = ClassificationModel(deberta), so model.parameters() refers to whatever object the name model was bound to at that point, not the model you are about to train. In the second version the optimizer is constructed after the model, so it holds the parameters of the real model.
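The root cause is ordinary Python name binding: the optimizer captures the parameter objects that exist when it is constructed, and rebinding the name model afterwards does not retarget it. A torch-free sketch of the pitfall, with FakeModel/FakeOptimizer as hypothetical stand-ins for the real classes:

```python
# FakeOptimizer mimics how an optimizer holds references to the
# parameter objects it was given at construction time.
class FakeModel:
    def __init__(self, name):
        self.name = name
        self.params = [0.0]           # stand-in for trainable weights

    def parameters(self):
        return self.params

class FakeOptimizer:
    def __init__(self, params):
        self.params = params          # keeps references, like AdamW

    def step(self):
        for i in range(len(self.params)):
            self.params[i] += 1.0     # "update" the captured parameters

old_model = FakeModel('old')
model = old_model
optimizer = FakeOptimizer(model.parameters())  # built too early

model = FakeModel('real')             # the model we actually train

optimizer.step()
print(old_model.params)   # the stale model got updated
print(model.params)       # the real model was never touched
```

After step(), old_model.params has changed while model.params has not: the optimizer silently trains the wrong object, which is exactly why it must be created after the final model.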