When fine-tuning a network, it is common to give the layers that carry pretrained weights a small learning rate and the newly added layers a larger one. Many tutorials online define the parameter groups like this:
# e.g.
import torch
from torchvision.models.vgg import vgg16_bn
from torch import optim
num_classes = 2
lr = 1e-3
model = vgg16_bn(pretrained=True)
model.classifier[6] = torch.nn.Linear(4096, num_classes)
params_group1 = list(map(id, model.classifier[6].parameters()))
params_group2 = filter(lambda p: id(p) not in params_group1, model.parameters())
optimizer = optim.Adam([{'params': params_group1, 'lr': lr},
                        {'params': params_group2, 'lr': lr * 0.1}], lr=lr)
Running this immediately raises an error along the lines of:

TypeError: optimizer can only optimize Tensors, but one of the params is int
The cause is that id(object) returns the object's memory address, which is a plain int rather than a parameter tensor, so the optimizer has nothing it can update. The ids should be used only for filtering, while the parameters themselves go into the groups. The code therefore becomes:
params_group1 = list(model.classifier[6].parameters())  # the parameter tensors themselves
params_group1_id = list(map(id, params_group1))         # their ids, used only for membership tests
params_group2 = filter(lambda p: id(p) not in params_group1_id, model.parameters())
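Putting it all together, here is a minimal runnable sketch of the corrected setup (the variable names new_params / pretrained_params are mine, chosen for readability; the logic is identical to the fixed snippet above):

import torch
from torch import optim
from torchvision.models.vgg import vgg16_bn

num_classes = 2
lr = 1e-3

model = vgg16_bn(pretrained=True)
model.classifier[6] = torch.nn.Linear(4096, num_classes)

# The replaced classifier layer is trained with the full learning rate.
new_params = list(model.classifier[6].parameters())
new_params_id = set(map(id, new_params))  # a set gives O(1) membership tests

# All remaining (pretrained) parameters are trained with a 10x smaller rate.
pretrained_params = [p for p in model.parameters() if id(p) not in new_params_id]

optimizer = optim.Adam([{'params': new_params, 'lr': lr},
                        {'params': pretrained_params, 'lr': lr * 0.1}], lr=lr)

# Sanity check: the two groups together should cover every parameter exactly once.
assert len(new_params) + len(pretrained_params) == len(list(model.parameters()))

One more detail worth knowing: filter() returns a lazy iterator that can only be consumed once, so materializing the groups as lists (as above) makes them safe to inspect, reuse, or run through a sanity check before handing them to the optimizer.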