CNN Model Trimming / Transfer Learning (PyTorch Official Documentation)


MiaL


CNN Model Trimming and Transfer Learning

Two Approaches to Transfer Learning

In practice, very few people train an entire Convolutional Network from scratch (with random initialization), because it is relatively rare to have a dataset of sufficient size. Instead, it is common to pretrain a ConvNet on a very large dataset (e.g. ImageNet, which contains 1.2 million images with 1000 categories), and then use the ConvNet either as an initialization or a fixed feature extractor for the task of interest.

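In practice, almost nobody trains a complete convolutional network from scratch, because a sufficiently large dataset is rarely available. The usual approach is to take a network already trained on a large dataset (such as ImageNet) and use it either to initialize the new model or as a feature extractor.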

These two major transfer learning scenarios look as follows:
Finetuning the convnet: Instead of random initialization, we initialize the network with a pretrained network, like the one that is trained on the ImageNet 1000 dataset. The rest of the training looks as usual.
ConvNet as fixed feature extractor: Here, we will freeze the weights for all of the network except that of the final fully connected layer. This last fully connected layer is replaced with a new one with random weights and only this layer is trained.

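In other words (with the translator's notes):
Fine-tuning: the network starts from pretrained weights, for example from a network trained on ImageNet, instead of random ones, and training then proceeds as usual. (Translator's note: this is standing on the shoulders of giants; the parameters of every layer keep being adjusted as training continues on the new task.)
Feature extractor: every weight except those of the final fully connected layer is frozen; the final layer is replaced by a fully connected layer with random weights, and only that layer is trained. (Translator's note: only the fully connected layer is cut off and retrained; all preceding layers act as the feature extractor.)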
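
The practical difference between the two scenarios is which parameters receive gradient updates. The sketch below illustrates this on a tiny stand-in network rather than a real pretrained model; `make_backbone` and `count_trainable` are hypothetical helpers introduced only for this illustration.

```python
import torch.nn as nn


def count_trainable(model):
    """Number of parameters that will receive gradient updates."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


def make_backbone():
    # Tiny stand-in for a pretrained network (hypothetical, not ResNet-18).
    return nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 3))


# Scenario 1: fine-tuning. Replace the last layer, keep every layer trainable.
finetune = make_backbone()
finetune[2] = nn.Linear(4, 2)

# Scenario 2: fixed feature extractor. Freeze everything, then replace the last layer.
extractor = make_backbone()
for p in extractor.parameters():
    p.requires_grad = False
extractor[2] = nn.Linear(4, 2)  # a freshly built layer has requires_grad=True by default

print(count_trainable(finetune))   # parameters of all layers
print(count_trainable(extractor))  # parameters of the new last layer only
```

Because the replacement layer is created after the freeze loop, it is the only part of the second model left trainable.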

Core Code Explained

Finetuning the pretrained convnet

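import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
from torchvision import models

# train_model is the training loop from the official tutorial (not shown here).
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")  # GPU if available, else CPU
model_ft = models.resnet18(pretrained=True)  # load the pretrained network (torchvision 0.13+ deprecates pretrained= in favor of weights=)
num_ftrs = model_ft.fc.in_features  # number of input features of the fully connected layer
model_ft.fc = nn.Linear(num_ftrs, 2)  # replace the fully connected layer with a new 2-output layer
model_ft = model_ft.to(device)  # move the model to the chosen device
criterion = nn.CrossEntropyLoss()  # loss function
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)  # optimize all parameters
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)  # multiply the learning rate by 0.1 every 7 epochs
model_ft = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler,
                       num_epochs=25)  # train and update the weights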

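The code above replaces the fully connected layer with a new one whose weights are randomly initialized, and then fine-tunes the entire network.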
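
The scheduler setting `StepLR(step_size=7, gamma=0.1)` multiplies the learning rate by 0.1 after every 7 epochs. As a rough illustration, the same schedule can be written as a closed-form expression in plain Python; `step_lr` is a hypothetical helper for this sketch, not part of PyTorch.

```python
def step_lr(base_lr, epoch, step_size=7, gamma=0.1):
    """Learning rate in effect at `epoch` under a StepLR-style decay."""
    return base_lr * gamma ** (epoch // step_size)


# Print the learning rate at the start and end of each 7-epoch stage of a 25-epoch run.
for epoch in (0, 6, 7, 13, 14, 20, 21, 24):
    print(f"epoch {epoch:2d}: lr = {step_lr(0.001, epoch):.0e}")
```

With `lr=0.001` and 25 epochs, the rate is about 1e-3 for epochs 0 to 6, 1e-4 for epochs 7 to 13, 1e-5 for epochs 14 to 20, and 1e-6 for the remaining epochs.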

ConvNet as fixed feature extractor

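import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
from torch.optim import lr_scheduler

# train_model is the training loop from the official tutorial (not shown here).
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")  # GPU if available, else CPU
model_conv = torchvision.models.resnet18(pretrained=True)  # load the pretrained model (torchvision 0.13+ deprecates pretrained= in favor of weights=)
for param in model_conv.parameters():
    param.requires_grad = False  # freeze every pretrained parameter
num_ftrs = model_conv.fc.in_features  # number of input features of the fully connected layer
model_conv.fc = nn.Linear(num_ftrs, 2)  # replace the fully connected layer; the new layer is trainable by default
model_conv = model_conv.to(device)
criterion = nn.CrossEntropyLoss()
optimizer_conv = optim.SGD(model_conv.fc.parameters(), lr=0.001, momentum=0.9)  # optimize only the new layer
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)
model_conv = train_model(model_conv, criterion, optimizer_conv,
                         exp_lr_scheduler, num_epochs=25)  # train the model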

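With the original parameters frozen, the model above trains only the weights of the newly initialized fully connected layer.
In my test, the fine-tuned model after 1 epoch and the feature-extraction model after 3 epochs reached comparable accuracy in comparable time, so the two approaches showed no obvious difference.

The results may differ with other pretrained models. To be continued.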
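
Both snippets call `train_model`, which is defined in the official tutorial and is not reproduced in this post. The following is a minimal sketch of what such a loop might look like, not the tutorial's code: it assumes a `dataloaders` dict with `"train"` and `"val"` entries passed in as an extra argument (the calls above do not pass one), and it is exercised here on random synthetic data with a small linear model instead of ResNet-18.

```python
import copy

import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
from torch.utils.data import DataLoader, TensorDataset


def train_model(model, criterion, optimizer, scheduler, dataloaders,
                device="cpu", num_epochs=25):
    """Hypothetical sketch: train, validate each epoch, keep the best weights."""
    best_acc = 0.0
    best_weights = copy.deepcopy(model.state_dict())
    for epoch in range(num_epochs):
        for phase in ("train", "val"):
            if phase == "train":
                model.train()
            else:
                model.eval()
            correct, total = 0, 0
            for inputs, labels in dataloaders[phase]:
                inputs, labels = inputs.to(device), labels.to(device)
                optimizer.zero_grad()
                # Track gradients only during the training phase.
                with torch.set_grad_enabled(phase == "train"):
                    outputs = model(inputs)
                    loss = criterion(outputs, labels)
                    if phase == "train":
                        loss.backward()
                        optimizer.step()
                correct += (outputs.argmax(dim=1) == labels).sum().item()
                total += labels.size(0)
            if phase == "train":
                scheduler.step()  # one scheduler step per epoch
            elif correct / total > best_acc:
                # Remember the weights with the best validation accuracy so far.
                best_acc = correct / total
                best_weights = copy.deepcopy(model.state_dict())
    model.load_state_dict(best_weights)
    return model


# Usage on synthetic data (random inputs and labels, for illustration only).
torch.manual_seed(0)
x, y = torch.randn(32, 4), torch.randint(0, 2, (32,))
loaders = {
    "train": DataLoader(TensorDataset(x, y), batch_size=8, shuffle=True),
    "val": DataLoader(TensorDataset(x, y), batch_size=8),
}
net = nn.Linear(4, 2)
opt = optim.SGD(net.parameters(), lr=0.01, momentum=0.9)
sched = lr_scheduler.StepLR(opt, step_size=7, gamma=0.1)
net = train_model(net, nn.CrossEntropyLoss(), opt, sched, loaders, num_epochs=2)
```

Passing the data loaders explicitly is a design choice of this sketch; it keeps the function self-contained instead of relying on module-level variables.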
