PyTorch: How to Use a Pretrained Model After Changing the Base Network

小小小绿叶

Article link: https://blog.csdn.net/litt1e/article/details/103584016

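In real projects we often need to modify the base network, yet the data at hand is limited, so getting good results usually means starting from a pretrained model. This post describes one way to load pretrained weights into a base network that has been changed.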

1. Adding layers to the existing backbone and loading the pretrained model

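import torch.nn as nn
from torchvision.models.resnet import conv1x1  # 1x1 conv helper used in _make_layer


class ResNet(nn.Module):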

    def __init__(self, block, layers, num_classes=1000, zero_init_residual=False):
        super(ResNet, self).__init__()
        self.inplanes = 64
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.conv1_sj = nn.Conv2d(35, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.bn1_sj = nn.BatchNorm2d(64)
        self.relu = nn.ReLU(inplace=False)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(576 * block.expansion, num_classes)

    def _make_layer(self, block, planes, blocks, stride=1):
        downsample = None
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                conv1x1(self.inplanes, planes * block.expansion, stride),
                nn.BatchNorm2d(planes * block.expansion),
            )

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample))
        self.inplanes = planes * block.expansion
        for _ in range(1, blocks):
            layers.append(block(self.inplanes, planes))

        return nn.Sequential(*layers)

    def forward(self, x):
        ...  # forward pass omitted

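Taking resnet18 as the example, the listing above shows the adjustments I made to it. The following lines are the ones that were added or changed.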

self.conv1_sj = nn.Conv2d(35, 64, kernel_size=7, stride=2, padding=3, bias=False)
self.fc = nn.Linear(576 * block.expansion, num_classes)

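One thing to watch when using a pretrained model: when training on the new data, give the layers that already exist in the pretrained model a small learning rate, and give the layers you added a somewhat larger one. This keeps as much of the original classification performance as possible while improving generalization to the new data.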

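import torch
import torch.nn as nn
import torch.optim as optim

# NOTE: `models` must be the module that contains the modified ResNet above;
# torchvision's stock resnet18 has no conv1_sj layer.
model_ft = models.resnet18(pretrained=False)
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, 9)

# Collect the ids of the parameters that belong to the new layers (conv1_sj and fc)
new_param_ids = list(map(id, model_ft.conv1_sj.parameters()))
new_param_ids += list(map(id, model_ft.fc.parameters()))
print(new_param_ids)

# Every other parameter belongs to a pretrained layer
rest_params = filter(lambda p: id(p) not in new_param_ids, model_ft.parameters())
state_dict = torch.load('./wights/latest.pt')
model_ft.load_state_dict(state_dict['model_state_dict'])

# Larger lr for the new layers, smaller lr for the pretrained ones.
# fc is excluded from rest_params, so it needs a group of its own to be trained.
optimizer_ft = optim.SGD([
    {'params': model_ft.conv1_sj.parameters(), 'lr': 0.001},
    {'params': model_ft.fc.parameters(), 'lr': 0.001},
    {'params': rest_params, 'lr': 0.0003},
])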

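We collect the ids of the parameters in the newly added layers, treat every remaining parameter as belonging to the original layers, and then set a separate lr for each group in the optimizer.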
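The same split can also be done by parameter name instead of by id. The helper below is only a sketch: `split_param_groups` and its arguments are hypothetical names, not part of PyTorch, and the default learning rates are simply the values used above. It takes the `(name, parameter)` pairs that `model.named_parameters()` yields and returns a list of parameter-group dicts in the form that `optim.SGD` accepts.

```python
def split_param_groups(named_params, new_prefixes, new_lr=1e-3, base_lr=3e-4):
    """Split (name, parameter) pairs into two optimizer parameter groups.

    Parameters whose name starts with one of `new_prefixes` go into the
    group trained with `new_lr`; all other parameters use `base_lr`.
    """
    prefixes = tuple(new_prefixes)
    new_params, base_params = [], []
    for name, param in named_params:
        if name.startswith(prefixes):
            new_params.append(param)
        else:
            base_params.append(param)
    return [
        {"params": new_params, "lr": new_lr},    # newly added layers
        {"params": base_params, "lr": base_lr},  # pretrained layers
    ]
```

Assuming the modified model above, it could be used as `optim.SGD(split_param_groups(model_ft.named_parameters(), ["conv1_sj.", "fc."]))`. The trailing dot in each prefix matters: without it, the prefix `conv1` would also match `conv1_sj`.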

2. Removing layers from the backbone and loading the pretrained model

import torch

net_b = NetB()  # don't forget to pass any required arguments
net_b_dict = net_b.state_dict()
state_dict = torch.load(net_a_ckpt_path)  # load the .pth file of the pretrained net-a

# Keep only the entries whose keys also exist in net-b
new_state_dict = {k: v for k, v in state_dict.items() if k in net_b_dict}
net_b_dict.update(new_state_dict)  # overwrite net-b's values with the pretrained ones
net_b.load_state_dict(net_b_dict)  # load the parameters

# Sanity check: layers shared by both networks should print the same values.
# net_a is an instance of the original network with its checkpoint loaded.
for name, para in net_a.named_parameters():
    print(name, torch.max(para))
for name, para in net_b.named_parameters():
    print(name, torch.max(para))
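Filtering by key alone assumes that every shared key also has the same tensor shape in both networks. If a shared layer was resized (for example a replaced `fc`), `load_state_dict` raises a size-mismatch error. A slightly more defensive variant of the filtering step is sketched below; `filter_state_dict` is a hypothetical helper name, and it relies only on the values having a `.shape` attribute, as tensors do.

```python
def filter_state_dict(pretrained, target):
    """Keep the pretrained entries that the target network can accept.

    An entry is kept when its key exists in `target` and its shape matches
    the shape of the corresponding target entry. Returns the kept entries
    and the list of skipped keys, so the skipped ones can be inspected.
    """
    kept, skipped = {}, []
    for key, value in pretrained.items():
        same_shape = (
            key in target
            and getattr(value, "shape", None) == getattr(target[key], "shape", None)
        )
        if same_shape:
            kept[key] = value
        else:
            skipped.append(key)
    return kept, skipped
```

With the names used above, this would be called as `kept, skipped = filter_state_dict(state_dict, net_b_dict)`, followed by `net_b_dict.update(kept)` and `net_b.load_state_dict(net_b_dict)`. Printing `skipped` shows which pretrained weights were not transferred.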