[weights] 1. Loading Weights: Loading Partial Weights

呼啦圈正在输入中...

Category column: Weight Usage

Article link: https://blog.csdn.net/weixin_45745378/article/details/112644980

Loading Partial Network Weights

I. Why load partial weights

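It can be used for transfer learning. Transfer learning means continuing to train from weights that have already been trained. Because the network builds on what was learned before, training is faster. Pay attention to the optimizer, though. Other people's networks are usually trained with SGD in the final stage, so when loading their weights I avoid switching to Adam: in my experience Adam disturbs the distribution of the pretrained weights, and the benefit of transfer learning is lost. [Note]: when using someone else's network for transfer learning, what gets transferred is the feature extraction layers. The network can be modified, but it is best to change only the final fully connected layers (the classification or regression layers). The feature extraction layers should generally be left alone (they can also be modified with the method below, but I do not recommend it).
It can be used to modify the weights in a network. Case 1 (laziness): sometimes one part of the network has to change, and once the structure has changed, loading the old weights directly causes errors. We still want the previously trained weights and do not want to spend time retraining from scratch. Case 2: (for example, in face detection, only the face regression and the confidence were handled at first, and part of the data handling was forgotten).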
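The transfer-learning advice above (keep the feature extraction layers, retrain only the final layers, continue with SGD) can be sketched as follows. This is a minimal sketch and not the setup used in this post: `TinyNet` is a hypothetical stand-in for a pretrained network, and the learning rate and momentum are placeholder values.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in: a feature extractor followed by one head."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, 3, 1, 1),
            nn.BatchNorm2d(8),
            nn.PReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(8, 4)

    def forward(self, x):
        out = self.features(x)
        return self.head(out.reshape(x.size(0), -1))


net = TinyNet()
# In practice the pretrained weights would be loaded here with load_state_dict.

# Freeze the feature extraction layers so that only the head is updated.
for param in net.features.parameters():
    param.requires_grad = False

# Assumed hyperparameters; SGD only receives the parameters left trainable.
trainable = [p for p in net.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=1e-3, momentum=0.9)

trainable_names = [n for n, p in net.named_parameters() if p.requires_grad]
print(trainable_names)
```

Setting `requires_grad = False` stops gradient updates only. BatchNorm layers still update their running statistics in training mode, so calling `net.features.eval()` may also be needed if those statistics should stay fixed.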

II. How to load partial weights

Below is a simple network with the following structure:

import torch
import torch.nn as nn


class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 32, 3, 1, 1),
            nn.BatchNorm2d(32),
            nn.PReLU(),
            nn.Conv2d(32, 32, 3, 2, 0),
            nn.BatchNorm2d(32),
            nn.PReLU()
        )
        self.conv2 = nn.Sequential(
            nn.Conv2d(32, 64, 3, 1, 0),
            nn.BatchNorm2d(64),
            nn.PReLU(),
            nn.Conv2d(64, 64, 3, 2, 0),
            nn.BatchNorm2d(64),
            nn.PReLU()
        )
        self.conv3 = nn.Sequential(
            nn.Conv2d(64, 64, 3, 1, 0),
            nn.BatchNorm2d(64),
            nn.PReLU(),
            nn.Conv2d(64, 64, 2, 2, 0),
            nn.BatchNorm2d(64),
            nn.PReLU()
        )
        self.conv4 = nn.Sequential(
            nn.Conv2d(64, 128, 2, 1, 0),
            nn.BatchNorm2d(128),
            nn.PReLU()
        )
        self.fc = nn.Sequential(
            nn.Linear(3 * 3 * 128, 256),
            nn.PReLU()
        )
        self.out1 = nn.Linear(256, 1)
        self.out2 = nn.Linear(256, 4)

    def forward(self, x):
        out = self.conv1(x)
        out = self.conv2(out)
        out = self.conv3(out)
        out = self.conv4(out)
        out = out.reshape(x.size(0), -1)
        out = self.fc(out)
        confidence = torch.sigmoid(self.out1(out))  
        offset = self.out2(out)
        return confidence, offset

# Excerpt of the printed state_dict: the trained out2 entries and their shapes.
"""
('out2.weight', tensor([[ 0.0707,  0.0037, -0.0167,  ...,  0.0078, -0.0061, -0.0198],
        [ 0.1338,  0.0018,  0.0071,  ...,  0.0016, -0.0098, -0.0068],
        [-0.0671, -0.0028, -0.0131,  ..., -0.0198,  0.0182,  0.0117],
        [-0.0068,  0.0103,  0.0111,  ..., -0.0253,  0.0257,  0.0047]],
       device='cuda:0')), 
('out2.bias', tensor([ 0.0151, -0.0139, -0.0132,  0.0394], device='cuda:0'))])

'out2.weight' --> torch.Size([4, 256])
'out2.bias'   --> torch.Size([4])
"""

The output above shows the trained weights of the out2 layer and their shapes. The code below modifies the out2 layer:

# The network after modification
import torch
import torch.nn as nn


class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 32, 3, 1, 1),
            nn.BatchNorm2d(32),
            nn.PReLU(),
            nn.Conv2d(32, 32, 3, 2, 0),
            nn.BatchNorm2d(32),
            nn.PReLU()
        )
        self.conv2 = nn.Sequential(
            nn.Conv2d(32, 64, 3, 1, 0),
            nn.BatchNorm2d(64),
            nn.PReLU(),
            nn.Conv2d(64, 64, 3, 2, 0),
            nn.BatchNorm2d(64),
            nn.PReLU()
        )
        self.conv3 = nn.Sequential(
            nn.Conv2d(64, 64, 3, 1, 0),
            nn.BatchNorm2d(64),
            nn.PReLU(),
            nn.Conv2d(64, 64, 2, 2, 0),
            nn.BatchNorm2d(64),
            nn.PReLU()
        )
        self.conv4 = nn.Sequential(
            nn.Conv2d(64, 128, 2, 1, 0),
            nn.BatchNorm2d(128),
            nn.PReLU()
        )
        self.fc = nn.Sequential(
            nn.Linear(3 * 3 * 128, 256),
            nn.PReLU()
        )
        self.out1 = nn.Linear(256, 1)
        self.out2 = nn.Linear(256, 14)   # out2 changed from 4 outputs to 14 outputs

    def forward(self, x):
        out = self.conv1(x)
        out = self.conv2(out)
        out = self.conv3(out)
        out = self.conv4(out)
        out = out.reshape(x.size(0), -1)
        out = self.fc(out)
        confidence = torch.sigmoid(self.out1(out))  
        offset = self.out2(out)
        return confidence, offset


if __name__ == '__main__':
    net = Net().cuda()
    save_path = r"./params/net.pth"
    pre_weights = torch.load(save_path)
    print(pre_weights)
    print(pre_weights['out2.weight'].shape)
    print(pre_weights['out2.bias'].shape)

    # Alternative (not used below): random values for the 10 new rows,
    # shape [10, 256], which could be concatenated after the old weights.
    add_weights = torch.randn((10, pre_weights['out2.weight'].shape[1])).cuda()
    add_bias = torch.randn(10).cuda()

    # Used here: tile the trained rows (4 + 4 + 4 + 2 = 14) to fill the new layer.
    pre_weights['out2.weight'] = torch.cat(
        [pre_weights['out2.weight'], pre_weights['out2.weight'], pre_weights['out2.weight'],
         pre_weights['out2.weight'][0:2]], dim=0)

    pre_weights['out2.bias'] = torch.cat(
        [pre_weights['out2.bias'], pre_weights['out2.bias'], pre_weights['out2.bias'],
         pre_weights['out2.bias'][0:2]], dim=0)

    net.load_state_dict(pre_weights)  # load the modified weights, which now fit the new network
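
When the changed layer does not need to reuse the old values at all, a common alternative to concatenating tensors by hand is to keep only the checkpoint entries whose name and shape still match the new network, and to load them with `strict=False`. Here is a minimal sketch, assuming a hypothetical `SmallNet` in place of the network above and an in-memory state_dict instead of `./params/net.pth`. The helper `load_matching` is illustrative and not part of PyTorch.

```python
import torch
import torch.nn as nn


class SmallNet(nn.Module):
    """Hypothetical small network; only the size of out2 varies."""

    def __init__(self, num_offsets):
        super().__init__()
        self.fc = nn.Linear(16, 8)
        self.out1 = nn.Linear(8, 1)
        self.out2 = nn.Linear(8, num_offsets)


def load_matching(model, pre_weights):
    """Load entries whose key and shape match the model; return the skipped keys."""
    model_state = model.state_dict()
    matched = {
        k: v for k, v in pre_weights.items()
        if k in model_state and v.shape == model_state[k].shape
    }
    skipped = [k for k in pre_weights if k not in matched]
    # strict=False tolerates the keys that were filtered out above.
    model.load_state_dict(matched, strict=False)
    return skipped


old_net = SmallNet(num_offsets=4)    # plays the role of the trained network
new_net = SmallNet(num_offsets=14)   # the network after modification
skipped = load_matching(new_net, old_net.state_dict())
print(skipped)
```

With this approach the skipped layers keep their fresh initialization, so the returned list is worth checking to confirm that only the intended layers were left out.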

III. Results

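After doing this, the network can continue training from the previous weights, and training is fast. For a network with few tasks, accuracy and the R2 score can improve somewhat. With the previous parameters loaded, the R2 score over the 14 dimensions also rises fairly quickly.
The thing to watch is how the added weight parameters are initialized, which takes some experience. The newly added weights do have some effect on continued training, but the effect is not large.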
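
One way to handle the initialization concern above is to copy the trained rows into the first 4 positions and leave the 10 new rows at the enlarged layer's default initialization, instead of tiling the old rows. This is a sketch under assumptions: the 256 input features and the change from 4 to 14 outputs mirror this post, but both layers here are freshly created stand-ins rather than the trained checkpoint.

```python
import torch
import torch.nn as nn

old_out2 = nn.Linear(256, 4)    # stand-in for the trained out2 layer
new_out2 = nn.Linear(256, 14)   # the enlarged layer with default initialization

n_old = old_out2.out_features
with torch.no_grad():
    # Rows 0..3 reuse the trained values; rows 4..13 keep the default init.
    new_out2.weight[:n_old].copy_(old_out2.weight)
    new_out2.bias[:n_old].copy_(old_out2.bias)

x = torch.randn(2, 256)
# Check that the first 4 outputs of the enlarged layer match the old layer.
same_outputs = torch.allclose(new_out2(x)[:, :n_old], old_out2(x), atol=1e-5)
print(same_outputs)
```

The design choice is that the original 4 outputs behave exactly as before at the start of training, so only the new outputs have to be learned from scratch.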

The above is my personal view. If anything is wrong, corrections are welcome.