PyTorch: notes on updating only some parameters (freezing parameters)

PyTorch version used in these experiments: 1.2.0

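During training you may need to freeze some of a model's parameters and update only the rest. There are two ways to do this: set requires_grad=False on the parameters of the layers that should not be updated, or pass only the parameters you want updated when constructing the optimizer. The best approach is to do both, so that the optimizer receives only parameters with requires_grad=True. This uses somewhat less memory and is more efficient, because no gradients are computed or stored for the frozen parameters.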
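As a minimal sketch of "pass only the requires_grad=True parameters to the optimizer", the snippet below filters the parameters before building the optimizer. The two-layer nn.Sequential model and the name trainable_params are assumptions made for illustration; they are not part of the experiments in this post.

```python
import torch.nn as nn
import torch.optim as optim

# Hypothetical two-layer model, used only for illustration.
model = nn.Sequential(nn.Linear(8, 4), nn.Linear(4, 3))

# Freeze the first layer.
for param in model[0].parameters():
    param.requires_grad = False

# Keep only the parameters that still require gradients.
trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = optim.SGD(trainable_params, lr=1e-2)

# Count how many tensors the optimizer actually manages
# (here: the weight and bias of the second layer).
num_in_optimizer = sum(len(group["params"]) for group in optimizer.param_groups)
print(num_in_optimizer)
```

Filtering on requires_grad avoids hard-coding which submodule to pass, so the optimizer stays in sync with whatever was frozen.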

1. Setting requires_grad=False

import torch
import torch.nn as nn
import torch.optim as optim

# Define a simple network
class net(nn.Module):
    def __init__(self, num_class=10):
        super(net, self).__init__()
        self.fc1 = nn.Linear(8, 4)
        self.fc2 = nn.Linear(4, num_class)
    def forward(self, x):
        return self.fc2(self.fc1(x))


model = net()

# Freeze the parameters of the fc1 layer
for name, param in model.named_parameters():
    if "fc1" in name:
        param.requires_grad = False


loss_fn = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-2)  # all parameters are passed in
print("model.fc1.weight", model.fc1.weight)
print("model.fc2.weight", model.fc2.weight)

for epoch in range(10):
    x = torch.randn((3, 8))
    label = torch.randint(0,10,[3]).long()
    output = model(x)

    loss = loss_fn(output, label)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("model.fc1.weight", model.fc1.weight)
print("model.fc2.weight", model.fc2.weight)

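The results show that once requires_grad=False is set, only the parameters with requires_grad=True are updated, even though all of the model's parameters were passed to the optimizer.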
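Comparing printed weight tensors by eye is error-prone, so here is a hedged sketch that checks the same behavior programmatically. It rebuilds the experiment with nn.Sequential instead of the net class, and the snapshot variables (fc1_before, fc2_before) are illustrative names, not part of the original code.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

fc1 = nn.Linear(8, 4)
fc2 = nn.Linear(4, 10)
model = nn.Sequential(fc1, fc2)

# Freeze fc1 but still hand every parameter to the optimizer.
for param in fc1.parameters():
    param.requires_grad = False

# Snapshot the weights before training.
fc1_before = fc1.weight.detach().clone()
fc2_before = fc2.weight.detach().clone()

loss_fn = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-2)

for _ in range(10):
    x = torch.randn(3, 8)
    label = torch.randint(0, 10, (3,))
    loss = loss_fn(model(x), label)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

fc1_unchanged = torch.equal(fc1.weight.detach(), fc1_before)
fc2_changed = not torch.equal(fc2.weight.detach(), fc2_before)
fc1_grad_is_none = fc1.weight.grad is None  # no gradient was ever computed for fc1

print(fc1_unchanged, fc2_changed, fc1_grad_is_none)
```

The frozen layer's .grad stays None, which is why SGD has nothing to apply to it.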

2. Passing only the parameters to be updated

# Define a simple network
class net(nn.Module):
    def __init__(self, num_class=3):
        super(net, self).__init__()
        self.fc1 = nn.Linear(8, 4)
        self.fc2 = nn.Linear(4, num_class)
    def forward(self, x):
        return self.fc2(self.fc1(x))


model = net()

# fc1 is intentionally NOT frozen in this variant (the freezing loop is commented out)
# for name, param in model.named_parameters():
#     if "fc1" in name:
#         param.requires_grad = False


loss_fn = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.fc2.parameters(), lr=1e-2)  # only fc2's parameters are passed in
print("model.fc1.weight", model.fc1.weight)
print("model.fc2.weight", model.fc2.weight)

for epoch in range(10):
    x = torch.randn((3, 8))
    label = torch.randint(0,3,[3]).long()
    output = model(x)

    loss = loss_fn(output, label)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("model.fc1.weight", model.fc1.weight)
print("model.fc2.weight", model.fc2.weight)
print()

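The results show that only the parameters passed to the optimizer are updated. Parameters that were not passed in still have gradients computed for them during the backward pass, but their values are never updated.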
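To make the "gradients are still computed" point concrete, the sketch below repeats this setup and inspects fc1.weight.grad after training. As before, nn.Sequential stands in for the net class and the variable names are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

fc1 = nn.Linear(8, 4)
fc2 = nn.Linear(4, 3)
model = nn.Sequential(fc1, fc2)

fc1_before = fc1.weight.detach().clone()
fc2_before = fc2.weight.detach().clone()

loss_fn = nn.CrossEntropyLoss()
# fc1 keeps requires_grad=True, but only fc2 is given to the optimizer.
optimizer = optim.SGD(fc2.parameters(), lr=1e-2)

for _ in range(10):
    x = torch.randn(3, 8)
    label = torch.randint(0, 3, (3,))
    loss = loss_fn(model(x), label)
    # Note: this only clears the gradients of fc2, the parameters the optimizer holds.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

fc1_unchanged = torch.equal(fc1.weight.detach(), fc1_before)
fc2_changed = not torch.equal(fc2.weight.detach(), fc2_before)
fc1_has_grad = fc1.weight.grad is not None  # backward still filled in fc1's gradient

print(fc1_unchanged, fc2_changed, fc1_has_grad)
```

The gradient tensor for fc1 is computed and kept in memory but never used, which is the overhead that the combined approach avoids.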

3. Best practice:

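Combine the two approaches above: set requires_grad=False on the parameters that should not be updated, and also leave them out of the optimizer.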

# Define a simple network
class net(nn.Module):
    def __init__(self, num_class=3):
        super(net, self).__init__()
        self.fc1 = nn.Linear(8, 4)
        self.fc2 = nn.Linear(4, num_class)
    def forward(self, x):
        return self.fc2(self.fc1(x))


model = net()

# Freeze the parameters of the fc1 layer
for name, param in model.named_parameters():
    if "fc1" in name:
        param.requires_grad = False


loss_fn = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.fc2.parameters(), lr=1e-2)
print("model.fc1.weight", model.fc1.weight)
print("model.fc2.weight", model.fc2.weight)

for epoch in range(10):
    x = torch.randn((3, 8))
    label = torch.randint(0,3,[3]).long()
    output = model(x)

    loss = loss_fn(output, label)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("model.fc1.weight", model.fc1.weight)
print("model.fc2.weight", model.fc2.weight)
print()
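If several layers need to be frozen, the combined approach can be wrapped in a small helper. The function freeze_by_prefix below is a hypothetical helper written for this sketch, not a PyTorch API. It matches parameter names with startswith rather than the substring test ("fc1" in name) used above, which is a deliberate choice to avoid accidentally matching names such as "block.fc10".

```python
import torch.nn as nn
import torch.optim as optim


def freeze_by_prefix(model, prefixes):
    """Hypothetical helper: freeze parameters whose name starts with any
    of the given prefixes and return the list of remaining parameters."""
    prefixes = tuple(prefixes)
    trainable = []
    for name, param in model.named_parameters():
        if name.startswith(prefixes):
            param.requires_grad = False
        else:
            trainable.append(param)
    return trainable


class Net(nn.Module):
    def __init__(self, num_class=3):
        super().__init__()
        self.fc1 = nn.Linear(8, 4)
        self.fc2 = nn.Linear(4, num_class)

    def forward(self, x):
        return self.fc2(self.fc1(x))


model = Net()
trainable = freeze_by_prefix(model, ["fc1."])

# The optimizer only ever sees the parameters that are still trainable.
optimizer = optim.SGD(trainable, lr=1e-2)

frozen_names = [n for n, p in model.named_parameters() if not p.requires_grad]
print(frozen_names)
```

Returning the trainable list from the same function that does the freezing keeps the two steps from drifting apart.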

 
