Freezing layers during training in PyTorch so they are excluded from backpropagation

Douzi1024

Article link: https://blog.csdn.net/Xiao_CangTian/article/details/127971214

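This article shows how to freeze some layers of a network during training in PyTorch so that they take no part in backpropagation. There are two approaches: set requires_grad=False on the parameters to be frozen, or pass only the parameters that should be updated to the optimizer. The best practice is to combine them: keep requires_grad=False on the frozen parameters and also leave them out of the optimizer, which saves GPU memory and speeds up training.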
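The combined approach described above can be sketched as follows. This is a minimal illustration, not the original author's script: the `nn.Sequential` model, the names `frozen_layer` and `trainable_layer`, and the random data are all assumptions made so the example runs on its own.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Hypothetical two-layer model used only for illustration
model = nn.Sequential(nn.Linear(8, 4), nn.Linear(4, 10))
frozen_layer, trainable_layer = model[0], model[1]

# Step 1: mark the frozen layer's parameters as not requiring gradients
for p in frozen_layer.parameters():
    p.requires_grad = False

# Step 2: give the optimizer only the parameters that still require gradients
trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = optim.SGD(trainable_params, lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Snapshots to compare against after one training step
frozen_before = frozen_layer.weight.detach().clone()
trainable_before = trainable_layer.weight.detach().clone()

x = torch.randn(3, 8)               # random batch: 3 samples, 8 features
label = torch.randint(0, 10, (3,))  # random class labels in [0, 10)
loss = loss_fn(model(x), label)
optimizer.zero_grad()
loss.backward()
optimizer.step()

frozen_unchanged = torch.equal(frozen_before, frozen_layer.weight)
trainable_changed = not torch.equal(trainable_before, trainable_layer.weight)
print("frozen layer unchanged:", frozen_unchanged)
print("trainable layer changed:", trainable_changed)
```

Because the frozen parameters never receive a gradient, `frozen_layer.weight.grad` stays `None` after `backward()`, and the optimizer holds no state for them.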


Notes excerpted from: https://blog.csdn.net/qq_36429555/article/details/118547133

Defining the network

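import torch
import torch.nn as nn
import torch.optim as optim


# Define a simple network with two fully connected layers
class net(nn.Module):
    def __init__(self, num_class=10):
        super(net, self).__init__()
        self.fc1 = nn.Linear(8, 4)
        self.fc2 = nn.Linear(4, num_class)

    def forward(self, x):
        # No activation between the layers; the example is kept minimal
        return self.fc2(self.fc1(x))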
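Before freezing anything, it can help to list the network's parameters and their `requires_grad` flags. The sketch below repeats the network definition so that it runs on its own; the `summary` list is an illustrative name, not part of the original notes.

```python
import torch.nn as nn


class net(nn.Module):
    def __init__(self, num_class=10):
        super().__init__()
        self.fc1 = nn.Linear(8, 4)
        self.fc2 = nn.Linear(4, num_class)

    def forward(self, x):
        return self.fc2(self.fc1(x))


model = net()

# Each entry is (parameter name, shape, requires_grad)
summary = [(name, tuple(p.shape), p.requires_grad)
           for name, p in model.named_parameters()]
for row in summary:
    print(row)
```

By default every parameter has `requires_grad=True`, so all four tensors (weight and bias of `fc1` and `fc2`) would be updated if `model.parameters()` were passed to an optimizer.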

Case 1: no layers are frozen

model = net()

# Case 1: no parameters are frozen
loss_fn = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-2)  # all parameters are passed to the optimizer

# Model parameters before training
print("model.fc1.weight", model.fc1.weight)
print("model.fc2.weight", model.fc2.weight)

for epoch in range(10):
    x = torch.randn((3, 8))  # random batch: 3 samples, 8 features
    label = torch.randint(0, 10, [3]).long()  # random class labels in [0, 10)
    output = model(x)

    loss = loss_fn(output, label)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Model parameters after training
print("model.fc1.weight", model.fc1.weight)
print("model.fc2.weight", model.fc2.weight)

Result:

(bbn) jyzhang@admin2-X10DAi:~/test$ python -u "/home/jyzhang/test/net.py"
model.fc1.weight Parameter containing:
tensor([[ 0.3362, -0.2676, -0.3497, -0.3009, -0.1013, -0.2316, -0.0189,  0.1430],
        [-0.2486,  0.2900, -0.1818, -0.0942,