Notes from Reading the NBNet PyTorch Source Code
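1: torch.set_num_threads sets the number of threads PyTorch uses for intra-op parallelism on the CPU, so it can be used to cap how many threads CPU computation occupies.
When it is not set, torch spreads the work across many CPU cores by default, which can push CPU utilization very high.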
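A minimal sketch of capping the thread count. The value 2 is an arbitrary choice for illustration, not a setting taken from the NBNet code:

```python
import torch

# Assumption: 2 is an illustrative cap; pick a value that suits your machine.
torch.set_num_threads(2)

# get_num_threads reports the current intra-op thread count.
num_threads = torch.get_num_threads()
print(num_threads)

# CPU operations such as this matmul now use at most that many intra-op threads.
a = torch.randn(256, 256)
b = torch.randn(256, 256)
c = a @ b
```

Call it once near the start of the script, before any heavy CPU work begins.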
torch.set_printoptions: sets the numeric precision and format used when printing tensors.
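As a small sketch of the effect, the snippet below prints the same tensor with a reduced precision and then restores the defaults. The tensor values are made up for illustration:

```python
import torch

# Illustrative values only.
x = torch.tensor([1.23456789, 2.0, 3.14159265])

# Show 2 digits after the decimal point when printing.
torch.set_printoptions(precision=2)
short = str(x)
print(short)

# Restore PyTorch's default print settings (precision=4).
torch.set_printoptions(profile="default")
default = str(x)
print(default)
```

Only the printed representation changes; the values stored in the tensor are untouched.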
Using checkpoints:
Checkpoints let you resume training from where it stopped and guard against losing progress when a run fails.
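```python
import torch

# Fragment of a training script: net, optimizer, scheduler, start_epoch and
# checkpoint_interval are defined elsewhere; max_epoch is an assumed name.
for epoch in range(start_epoch, max_epoch):
    # ... one epoch of training ...

    # Save a checkpoint every `checkpoint_interval` epochs.
    if (epoch + 1) % checkpoint_interval == 0:
        checkpoint = {"model_state_dict": net.state_dict(),
                      "optimizer_state_dict": optimizer.state_dict(),
                      "epoch": epoch}
        path_checkpoint = "./checkpoint_{}_epoch.pkl".format(epoch)
        torch.save(checkpoint, path_checkpoint)

    # Simulate an unexpected interruption of training.
    if epoch > 5:
        print("Training interrupted unexpectedly...")
        break

# ============================ step 5+/5 resume from checkpoint ============================
path_checkpoint = "./checkpoint_4_epoch.pkl"
checkpoint = torch.load(path_checkpoint)
net.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
start_epoch = checkpoint['epoch']
scheduler.last_epoch = start_epoch

# Two ways to save a model:
# 1: save only the weights, not the network structure
torch.save(model.state_dict(), "mymodel.pth")
model = My_model(*args, **kwargs)  # the structure (My_model) must be rebuilt first
model.load_state_dict(torch.load("mymodel.pth"))  # load the stored weights into it
model.eval()

# 2: save both the weights and the network structure
torch.save(model, "mymodel.pth")
# Note: PyTorch 2.6+ defaults torch.load to weights_only=True, so loading a whole
# pickled model needs weights_only=False (only for files you trust).
model = torch.load("mymodel.pth")
model.eval()
```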
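What args does
An argument declared with action='store_true' defaults to False, for example:
parser.add_argument('--cuda', '-c', action='store_true')
When the script is run as python train.py --cuda, the value becomes True.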
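The store_true behaviour can be checked without a real command line by passing argument lists directly to parse_args. This is a minimal sketch; the parser description and help text are placeholders, not taken from the NBNet code:

```python
import argparse

# Hypothetical parser mirroring the --cuda / -c flag described above.
parser = argparse.ArgumentParser(description="example train.py options")
parser.add_argument("--cuda", "-c", action="store_true", help="use the GPU if given")

# No flag on the command line: the value falls back to False.
args_default = parser.parse_args([])
print(args_default.cuda)

# Equivalent to running: python train.py --cuda
args_flag = parser.parse_args(["--cuda"])
print(args_flag.cuda)

# The short form -c behaves the same way.
args_short = parser.parse_args(["-c"])
print(args_short.cuda)
```

In a real script, parse_args() is called with no arguments so that it reads sys.argv.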