When working on real projects, we often need to modify the backbone network, but the data at hand is limited. To finish the project with high quality, we need to use a pretrained model. Here I introduce a way to load a pretrained model while changing the backbone.
1. Adding layers to the original backbone and loading the pretrained model
class ResNet(nn.Module):
    def __init__(self, block, layers, num_classes=1000, zero_init_residual=False):
        super(ResNet, self).__init__()
        self.inplanes = 64
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        # newly added: a parallel stem for 35-channel input
        self.conv1_sj = nn.Conv2d(35, 64, kernel_size=7, stride=2, padding=3,
                                  bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.bn1_sj = nn.BatchNorm2d(64)  # newly added
        self.relu = nn.ReLU(inplace=False)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        # changed: in_features adjusted from 512 to 576 to match the modified forward
        self.fc = nn.Linear(576 * block.expansion, num_classes)

    def _make_layer(self, block, planes, blocks, stride=1):
        downsample = None
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                conv1x1(self.inplanes, planes * block.expansion, stride),
                nn.BatchNorm2d(planes * block.expansion),
            )
        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample))
        self.inplanes = planes * block.expansion
        for _ in range(1, blocks):
            layers.append(block(self.inplanes, planes))
        return nn.Sequential(*layers)

    def forward(self, x):
        ...
Taking resnet18 as an example, you can see above that I made a few adjustments to it by adding the following code:
self.conv1_sj = nn.Conv2d(35, 64, kernel_size=7, stride=2, padding=3, bias=False)
self.fc = nn.Linear(576 * block.expansion, num_classes)
Note that when training on new data with a pretrained model, the layers that already exist in the pretrained model should get a small learning rate, while the layers you added yourself should get a slightly larger one. This preserves the original classification performance as much as possible while improving generalization to the new data.
model_ft = models.resnet18(pretrained=False)  # resnet18 here is the modified ResNet defined above, not the stock torchvision one
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, 9)
conv_params = list(map(id, model_ft.conv1_sj.parameters()))  # collect the ids of the newly added layers' parameters
conv_params += list(map(id, model_ft.fc.parameters()))
print(conv_params)
rest_params = filter(lambda x: id(x) not in conv_params, model_ft.parameters())
state_dict = torch.load('./wights/latest.pt')
# strict=False skips keys that exist on only one side (conv1_sj, bn1_sj, the resized fc)
model_ft.load_state_dict(state_dict['model_state_dict'], strict=False)
optimizer_ft = optim.SGD([{'params': model_ft.conv1_sj.parameters(), 'lr': 0.001},
                          {'params': rest_params, 'lr': 0.0003}])
We collect the ids of the newly added layers and of the original layers, then set a separate lr for each group in the optimizer.
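The id-based grouping above can be sketched on a toy model. `ToyNet`, `backbone`, and `new_head` are illustrative names (stand-ins for the pretrained layers and the newly added conv1_sj/fc), not part of the original code:

```python
import torch.nn as nn
import torch.optim as optim

class ToyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 8)   # stands in for the pretrained layers
        self.new_head = nn.Linear(8, 2)   # stands in for the newly added layers

model = ToyNet()

# Collect the ids of the new layers' parameters, then filter out the rest.
new_ids = list(map(id, model.new_head.parameters()))
rest_params = [p for p in model.parameters() if id(p) not in new_ids]

# Larger lr for the new layers, smaller lr for the pretrained ones.
optimizer = optim.SGD([
    {'params': model.new_head.parameters(), 'lr': 0.001},
    {'params': rest_params, 'lr': 0.0003},
])

print([g['lr'] for g in optimizer.param_groups])  # [0.001, 0.0003]
```

Because every parameter group carries its own `lr`, no default learning rate needs to be passed to `SGD` itself.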
2. Removing layers from the backbone and loading the pretrained weights
net_b = NetB()  # don't forget to pass the required arguments
net_b_dict = net_b.state_dict()
state_dict = torch.load(net_a_ckpt_path)  # load the .pth file of the pretrained net_a
new_state_dict = OrderedDict()  # not strictly necessary [from collections import OrderedDict]
new_state_dict = {k: v for k, v in state_dict.items() if k in net_b_dict}  # drop the keys net_b does not need
net_b_dict.update(new_state_dict)  # update the parameters
net_b.load_state_dict(net_b_dict)  # load the parameters
for name, para in net_a.named_parameters():
    print(name, torch.max(para))
for name, para in net_b.named_parameters():
    print(name, torch.max(para))
We have a pretrained model at net_a_ckpt_path that needs to be loaded into net_b, where net_b is a subset of net_a. We write only the keys net_b needs into new_state_dict, update the parameters, and load them. Finally, the print loops verify the load: the max of each parameter should match across the shared layers of the two nets.
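A minimal, self-contained version of this subset loading, with a direct check that the shared layer's weights match afterwards. `NetA`, `NetB`, and the layer names are hypothetical stand-ins for the net_a/net_b pair, and `net_a.state_dict()` stands in for the checkpoint loaded from disk:

```python
import torch
import torch.nn as nn
from collections import OrderedDict

class NetA(nn.Module):
    """The 'full' pretrained network (stand-in for net_a)."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 4)
        self.fc2 = nn.Linear(4, 2)

class NetB(nn.Module):
    """A subset of NetA that keeps only fc1 (stand-in for net_b)."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 4)

net_a = NetA()
net_b = NetB()

state_dict = net_a.state_dict()  # stands in for torch.load(net_a_ckpt_path)
net_b_dict = net_b.state_dict()

# Keep only the keys NetB actually has, then load.
new_state_dict = OrderedDict((k, v) for k, v in state_dict.items() if k in net_b_dict)
net_b_dict.update(new_state_dict)
net_b.load_state_dict(net_b_dict)

# Verify: the shared layer's weights now match exactly.
print(torch.equal(net_a.fc1.weight, net_b.fc1.weight))  # True
```

`torch.equal` gives a stricter check than eyeballing `torch.max` of each parameter, since it compares every element.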