PyTorch: Loading a Pretrained Model After Modifying the Network Structure

真的只会一点点

Tags: deep learning

Article link: https://blog.csdn.net/weixin_46269983/article/details/110670666

When training a model, we usually initialize the network from a pretrained model. Take AlexNet as an example:

import torch
import torch.nn as nn


class AlexNet(nn.Module):
    def __init__(self, num_classes=1000, init_weights=False):
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 48, kernel_size=11, stride=4, padding=2),  # input[3, 224, 224]  output[48, 55, 55]
            nn.ReLU(inplace=True),  # inplace overwrites the input tensor to save memory
            nn.MaxPool2d(kernel_size=3, stride=2),                  # output[48, 27, 27]; kernel counts are half those of the original paper
            nn.Conv2d(48, 128, kernel_size=5, padding=2),           # output[128, 27, 27]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                  # output[128, 13, 13]
            nn.Conv2d(128, 192, kernel_size=3, padding=1),          # output[192, 13, 13]
            nn.ReLU(inplace=True),
            nn.Conv2d(192, 192, kernel_size=3, padding=1),          # output[192, 13, 13]
            nn.ReLU(inplace=True),
            nn.Conv2d(192, 128, kernel_size=3, padding=1),          # output[128, 13, 13]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),                  # output[128, 6, 6]
            # nn.Conv2d(128, 128, kernel_size=3, padding=1),  # layer added by me
        )
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.3),
            # fully connected layers
            nn.Linear(128 * 6 * 6, 2048),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(2048, 2048),
            nn.ReLU(inplace=True),
            nn.Linear(2048, num_classes),
        )
        if init_weights:
            self._initialize_weights()

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, start_dim=1)  # flatten all dims except batch; view() would also work
        x = self.classifier(x)
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')  # Kaiming (He) initialization
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)  # normal distribution, mean 0, std 0.01
                nn.init.constant_(m.bias, 0)
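
The output sizes noted in the comments of self.features can be checked by hand with the usual size formula for convolution and pooling layers, out = floor((in + 2 * padding - kernel) / stride) + 1. The sketch below is only an illustration of that arithmetic; the helper name out_size and the layers list are made up for this example and are not part of the original code.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial size after a conv/pool layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


# (kernel, stride, padding) of each conv/pool layer in self.features, in order
layers = [
    (11, 4, 2),  # Conv2d(3, 48)
    (3, 2, 0),   # MaxPool2d
    (5, 1, 2),   # Conv2d(48, 128)
    (3, 2, 0),   # MaxPool2d
    (3, 1, 1),   # Conv2d(128, 192)
    (3, 1, 1),   # Conv2d(192, 192)
    (3, 1, 1),   # Conv2d(192, 128)
    (3, 2, 0),   # MaxPool2d
]

sizes = []
n = 224  # input height/width
for kernel, stride, padding in layers:
    n = out_size(n, kernel, stride, padding)
    sizes.append(n)

print(sizes)  # [55, 27, 27, 13, 13, 13, 13, 6]
```

By the same formula, the commented-out Conv2d(128, 128, kernel_size=3, padding=1) maps 6 to 6, so enabling it leaves the 128 * 6 * 6 input of the classifier unchanged.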

If the network has not been modified

Loading the pretrained model is simple and only needs the following code:

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
net = AlexNet(num_classes=5, init_weights=True)
net.to(device)
net.load_state_dict(torch.load('alexnet_model.pth', map_location=device))

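Here, alexnet_model.pth is the file holding the pretrained weights; its parameter names and tensor shapes must match those of the network.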
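
For context, here is a minimal sketch of how a weights file of this kind is typically produced and read back. It assumes the checkpoint was written with torch.save(model.state_dict(), path), which the article does not state explicitly; the tiny nn.Sequential model and the file name tiny_model.pth are stand-ins chosen only to keep the example fast.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Tiny stand-in model (hypothetical), not the article's AlexNet.
model = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tiny_model.pth")  # hypothetical file name
    torch.save(model.state_dict(), path)  # stores parameter tensors only, not the class
    # map_location lets a checkpoint saved on a GPU be read on a CPU-only machine.
    state = torch.load(path, map_location="cpu")

# A state dict maps parameter names to tensors; ReLU (index 1) has no parameters.
print(list(state.keys()))  # ['0.weight', '0.bias', '2.weight', '2.bias']

restored = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))
restored.load_state_dict(state)  # works because every key and shape matches
```

The key names come from the position of each layer inside nn.Sequential, which is why adding or removing a layer changes which keys the network expects.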

If the network has been modified

Suppose we add one more layer at the end of the feature extractor:

nn.Conv2d(128, 128, kernel_size=3, padding=1),

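That is, we uncomment the last line of self.features in the AlexNet above. If we then load the pretrained model the same way as before, load_state_dict raises an error:
[Image: screenshot of the error message] The error reports that features.13 does not exist. In other words, the convolution layer we just added has no entry in the pretrained model alexnet_model.pth.
The weights should instead be loaded as follows: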

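device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
net = AlexNet(num_classes=5, init_weights=True)
net.to(device)
# net.load_state_dict(torch.load('alexnet_model.pth'))  # fails: features.13 is not in the checkpoint

net_dict = net.state_dict()
predict_model = torch.load('alexnet_model.pth', map_location=device)
print('start')
# Find the layers shared by both models and keep their pretrained parameters
state_dict = {k: v for k, v in predict_model.items() if k in net_dict.keys()}
print(state_dict.keys())
net_dict.update(state_dict)  # overwrite the shared entries with the pretrained parameters
net.load_state_dict(net_dict)  # load the merged parameters into the network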

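net_dict = net.state_dict() retrieves the network's state dict, a mapping from parameter names to tensors. At this point its values are the ones generated by _initialize_weights(self).

predict_model = torch.load('alexnet_model.pth') only reads the pretrained checkpoint from disk; it does not copy any parameters into the network yet.

state_dict = {k: v for k, v in predict_model.items() if k in net_dict.keys()} collects the entries whose names appear in both net_dict and the pretrained model, and stores them in state_dict.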
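
Note that this filter compares parameter names only. If a layer keeps its name but changes shape (for example, a final Linear layer with a different num_classes), the later load_state_dict call would still fail. The sketch below is a variant, not the article's code, that also compares shapes; the two small nn.Sequential models and the skipped list are hypothetical stand-ins for illustration.

```python
import torch
import torch.nn as nn

# Stand-ins (hypothetical): the "pretrained" net has 5 outputs,
# the new net keeps the first layer but has a 3-output head.
pretrained = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 5))
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))

net_dict = net.state_dict()
predict_model = pretrained.state_dict()

# Keep an entry only if the name exists in the new net AND the shapes agree.
state_dict = {
    k: v for k, v in predict_model.items()
    if k in net_dict and v.shape == net_dict[k].shape
}
skipped = [k for k in predict_model if k not in state_dict]

net_dict.update(state_dict)
net.load_state_dict(net_dict)

print(sorted(state_dict))  # ['0.bias', '0.weight']
print(skipped)             # ['2.weight', '2.bias']
```

Printing the skipped names is a cheap way to confirm that only the layers you intended to replace were left at their initial values.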

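net_dict.update(state_dict) overwrites the entries that net_dict shares with predict_model with the pretrained values; entries that exist only in net_dict keep their initialized values.

Finally, net.load_state_dict(net_dict) loads the merged parameters into the network.
[Image: screenshot of the printed parameters] As the screenshot shows, the parameters of features.13 come from the _initialize_weights() function, while the parameters of all other layers come from predict_model.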
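
As an alternative that the article does not use, load_state_dict also accepts strict=False, which loads the matching keys and reports the rest instead of raising. The sketch below reproduces both behaviours on two small stand-in models (old_net, new_net and extra_before are hypothetical names for this example) and checks that the extra layer keeps its initial values.

```python
import torch
import torch.nn as nn

# Stand-ins (hypothetical): new_net has one extra layer at index 2.
old_net = nn.Sequential(nn.Linear(4, 4), nn.ReLU())
new_net = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 4))
checkpoint = old_net.state_dict()
extra_before = new_net[2].weight.detach().clone()

# Default strict=True: keys missing from the checkpoint raise a RuntimeError.
try:
    new_net.load_state_dict(checkpoint)
    raised = False
except RuntimeError:
    raised = True

# strict=False: matching keys are loaded, the others are only reported.
result = new_net.load_state_dict(checkpoint, strict=False)

print(raised)                  # True
print(result.missing_keys)     # ['2.weight', '2.bias']
print(result.unexpected_keys)  # []
print(torch.equal(new_net[0].weight, old_net[0].weight))  # True: taken from the checkpoint
print(torch.equal(new_net[2].weight, extra_before))       # True: still the initial values
```

strict=False only tolerates missing or unexpected names; a tensor whose name matches but whose shape differs still raises an error, so the explicit filtering approach remains useful when shapes change.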

The AlexNet code comes from a Zhihu article that uses AlexNet to classify five kinds of flowers; it is well worth reading.
https://zhuanlan.zhihu.com/p/180554948
