Quite often we need to make a copy of a model so that the two copies can serve different purposes (e.g., being trained separately, or used as a teacher/student pair). Although PyTorch does not provide an interface like model.clone(), this is easily achieved with copy.deepcopy(). As shown below, copy.deepcopy() replicates every parameter of the original model exactly as it is, and even preserves the device placement; the two copies can then be optimized independently without interfering with each other.
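A minimal sketch of that claim, assuming a small Linear layer and whatever device happens to be available (the layer size and device string are placeholders, not anything from the session below):

import copy
import torch

device = "cuda:0" if torch.cuda.is_available() else "cpu"  # placeholder device
m = torch.nn.Linear(3, 3).to(device)
mc = copy.deepcopy(m)

for p, q in zip(m.parameters(), mc.parameters()):
    assert torch.equal(p, q)              # same values
    assert p.device == q.device           # same device placement
    assert p.data_ptr() != q.data_ptr()   # but independent storage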
A fuller verification in an interactive session:
>>> import torch
>>> import copy
>>> m=torch.nn.Linear(3,3).to("cuda:4")
>>> mc=copy.deepcopy(m) # deep-copy the whole model (parameters and device included)
>>> t1=torch.randn(3,3) # for model 'm'
>>> t1=t1.to("cuda:4")
>>> t1
tensor([[-0.6198, 0.2503, 0.9287],
        [ 0.6553, -0.6422, -2.0498],
        [-0.7867, -0.6862, 1.9102]], device='cuda:4')
>>> t2=torch.randn(3,3) # for model 'mc'
>>> t2=t2.to("cuda:4")
>>> t2
tensor([[ 0.9616, 1.1679, -0.3201],
        [ 0.6383, -0.4115, -1.5540],
        [ 0.6649, 0.8439, -1.3090]], device='cuda:4')
>>> out1=m(t1)
>>> loss1=torch.sum(out1)
# loss1 backpropagates to m, as expected
>>> torch.autograd.grad(loss1,list(m.parameters())[0],retain_graph=True)
(tensor([[-0.7512, -1.0782, 0.7891],
         [-0.7512, -1.0782, 0.7891],
         [-0.7512, -1.0782, 0.7891]], device='cuda:4'),)
# but it cannot be backpropagated to mc
>>> torch.autograd.grad(loss1,list(mc.parameters())[0],retain_graph=True)
Traceback (most recent call last):
RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.
# the symmetric case behaves the same way: loss2 reaches mc but not m
>>> out2=mc(t2)
>>> loss2=torch.sum(out2)
>>> torch.autograd.grad(loss2,list(m.parameters())[0],retain_graph=True)
Traceback (most recent call last):
RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.
>>> torch.autograd.grad(loss2,list(mc.parameters())[0],retain_graph=True)
(tensor([[ 2.2648, 1.6003, -3.1831],
         [ 2.2648, 1.6003, -3.1831],
         [ 2.2648, 1.6003, -3.1831]], device='cuda:4'),)
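Beyond the gradient check above, the "optimized independently" claim can be sketched as follows. This is only an illustration: the optimizer, learning rate, batch size, and device choice are arbitrary assumptions, not anything prescribed by PyTorch. Stepping an optimizer on the original model leaves the deep-copied model's parameters untouched, and vice versa.

import copy
import torch

device = "cuda:0" if torch.cuda.is_available() else "cpu"  # placeholder device
m = torch.nn.Linear(3, 3).to(device)
mc = copy.deepcopy(m)

opt_m = torch.optim.SGD(m.parameters(), lr=0.1)    # one optimizer per copy
opt_mc = torch.optim.SGD(mc.parameters(), lr=0.1)

snapshot = [q.detach().clone() for q in mc.parameters()]  # record mc before training m

x = torch.randn(4, 3, device=device)
opt_m.zero_grad()
m(x).sum().backward()   # gradients flow only into m's parameters
opt_m.step()            # update only m

for q, q0 in zip(mc.parameters(), snapshot):
    assert torch.equal(q, q0)   # mc is unchanged; it can be trained separately with opt_mc

This independence is what makes copy.deepcopy() convenient for the use cases mentioned at the start: each copy keeps its own parameters, its own optimizer state, and its own computation graph.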