Set self.embedding.weight.requires_grad = False to freeze the embedding layer so it is not trained.
Load pretrained weights: self.embedding.weight.data.copy_(tensor)
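A minimal sketch of both steps, with a toy nn.Embedding and a random tensor standing in for real pretrained vectors:

```python
import torch
import torch.nn as nn

# Hypothetical pretrained matrix: 5 words, 3-dimensional vectors.
pretrained = torch.randn(5, 3)

embedding = nn.Embedding(5, 3)
embedding.weight.data.copy_(pretrained)   # load the pretrained weights
embedding.weight.requires_grad = False    # freeze: excluded from training

print(embedding.weight.requires_grad)                   # False
print(torch.equal(embedding.weight.data, pretrained))   # True
```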
To take gradients with respect to the input, wrap it as input = Variable(input, requires_grad=True) before the forward pass; you can then check input.requires_grad to confirm gradients are tracked. (In current PyTorch, Variable is deprecated; input.requires_grad_(True) does the same.)
If the model has a submodule linear1, its parameters can be inspected via model.linear1.weight. A manual gradient-descent step can be written as RNN_stock.linear_layer.weight.data.copy_(RNN_stock.linear_layer.weight - RNN_stock.linear_layer.weight.grad). The input's gradient can be viewed with train_x_batch.grad.data, and the input itself can be updated with train_x_batch = train_x_batch - train_x_batch.grad.
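A minimal sketch of this pattern, with a toy nn.Linear standing in for the model's linear layer:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(3, 1)                     # stands in for linear_layer
x = torch.randn(2, 3, requires_grad=True)   # input with gradient tracking

loss = model(x).sum()
loss.backward()

print(model.weight)   # inspect the layer's parameters
print(x.grad)         # gradient with respect to the input

# One manual gradient-descent step, same pattern as weight.data.copy_(...):
w_before = model.weight.data.clone()
model.weight.data.copy_(model.weight - model.weight.grad)

# Update the input itself using its gradient.
x_updated = x - x.grad
```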
If a tensor lives on the GPU, tensor.detach().cpu().numpy() moves it to the CPU and converts it to a NumPy array.
Saving and loading a model:
torch.save(reason_model.state_dict(), './best_model/' + 'word_by_word_attention_best.pkl')
reason_model.load_state_dict(torch.load('./best_model/best.pkl'))
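A runnable round-trip sketch of the same save/load pattern; the temporary path here is a stand-in for the './best_model/...' paths above:

```python
import os
import tempfile
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
path = os.path.join(tempfile.mkdtemp(), 'best.pkl')  # hypothetical save path

torch.save(model.state_dict(), path)        # saves only the parameters

restored = nn.Linear(4, 2)                  # must have the same architecture
restored.load_state_dict(torch.load(path))  # load the parameters back

print(torch.equal(model.weight, restored.weight))  # True
```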
train_data_train = TensorDataset(data_pre_train, data_hyp_train, data_label_train)
train_sampler_train = RandomSampler(train_data_train)
train_loader_train = DataLoader(dataset=train_data_train,
                                batch_size=args.batch_size,
                                sampler=train_sampler_train)
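The same pipeline end to end, with small random tensors standing in for the premise/hypothesis/label data:

```python
import torch
from torch.utils.data import TensorDataset, RandomSampler, DataLoader

# Toy stand-ins for data_pre_train, data_hyp_train, data_label_train.
data_pre = torch.randn(8, 5)
data_hyp = torch.randn(8, 5)
data_label = torch.randint(0, 3, (8,))

dataset = TensorDataset(data_pre, data_hyp, data_label)
sampler = RandomSampler(dataset)   # shuffles the sample indices
loader = DataLoader(dataset=dataset, batch_size=4, sampler=sampler)

batches = [b for b in loader]
print(len(batches))           # 2 batches of 4
print(batches[0][0].shape)    # torch.Size([4, 5])
```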
Initializing weights
def weights_init(m):
    classname = m.__class__.__name__
    if classname.find('Linear') != -1:
        nn.init.uniform_(m.weight, -0.5, 0.5)
        nn.init.constant_(m.bias, 0.0)
reason_model.apply(weights_init)
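An initializer like the one above applied to a toy model, as a quick sanity check that apply() visits every submodule and the bounds hold:

```python
import torch
import torch.nn as nn

def weights_init(m):
    # Initialize only the Linear layers, as in the note above.
    if m.__class__.__name__.find('Linear') != -1:
        nn.init.uniform_(m.weight, -0.5, 0.5)
        nn.init.constant_(m.bias, 0.0)

model = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2))
model.apply(weights_init)  # apply() recursively visits every submodule

print(float(model[0].weight.min()), float(model[0].weight.max()))
```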
word2vec = models.KeyedVectors.load_word2vec_format(data_dir, binary=True)
word2vec[['man', 'eat', 'srfhdgdhgsg']] looks up several words at once and returns their vectors stacked; a word missing from the vocabulary (such as 'srfhdgdhgsg') raises a KeyError.
params = filter(lambda p: p.requires_grad, cnn_model.parameters()) keeps only the trainable parameters, e.g. to pass to the optimizer.
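A short sketch of why this filter matters, using a toy model with a frozen embedding; only the trainable parameters reach the optimizer:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Embedding(10, 4), nn.Linear(4, 2))
model[0].weight.requires_grad = False   # frozen embedding

# Only parameters with requires_grad=True are handed to the optimizer.
params = list(filter(lambda p: p.requires_grad, model.parameters()))
optimizer = torch.optim.SGD(params, lr=0.1)

print(len(params))  # 2: the linear weight and bias; the embedding is excluded
```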