- What is transfer learning?
When applying deep neural networks to problems with large-scale data, once the network is built we inevitably spend a great deal of compute and time training the model and tuning its parameters, and the model obtained at such cost can solve only that one problem, which is a poor return on the resources invested. If a model trained with those resources could instead solve a whole class of similar problems, its value would rise considerably, and this is what motivated transfer learning. With transfer learning, a trained model can be applied to a similar problem after only minor fine-tuning and still achieve good results; it is also an effective approach when the original problem has little data. Choosing an appropriate transfer learning method can therefore be a great help in solving the problem at hand.
Below we use the "Dogs vs. Cats" competition on Kaggle as an example to study transfer learning. The dataset is freely available. Its training set contains 25,000 images of cats and dogs: 12,500 cats and 12,500 dogs. The test set contains 12,500 images, but the cat and dog images in it are shuffled together and carry no labels. These datasets will be used to train the model, optimize its parameters, and finally validate its generalization ability. Download link: https://download.csdn.net/download/Galen_xia/12888565
1. Data preparation
After downloading the dataset, we take 2,000 cat images and 2,000 dog images, 4,000 in total, as a validation set. The directory structure is as follows:
|-DvC
|-train
|- cat
|- dog
|-valid
|- cat
|- dog
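The split above can be automated with a short script. The sketch below assumes all downloaded training images start out under DvC/train; the function name make_validation_split is ours, not part of any library.

```python
import os
import random
import shutil

def make_validation_split(train_dir, valid_dir,
                          classes=('cat', 'dog'), n_per_class=2000, seed=0):
    """Move n_per_class randomly chosen images per class from train_dir to valid_dir."""
    rng = random.Random(seed)
    for cls in classes:
        src = os.path.join(train_dir, cls)
        dst = os.path.join(valid_dir, cls)
        os.makedirs(dst, exist_ok=True)
        for name in rng.sample(os.listdir(src), n_per_class):
            shutil.move(os.path.join(src, name), os.path.join(dst, name))

# make_validation_split('DvC/train', 'DvC/valid')
```

A fixed seed keeps the split reproducible across runs.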
2. Data preprocessing, loading, and preview
import torch
from torchvision import datasets, transforms
import os
import time
from torch.autograd import Variable

data_dir = 'DvC'
data_transform = {
    x: transforms.Compose([transforms.Resize([64, 64]), transforms.ToTensor()])
    for x in ['train', 'valid']
}
image_datasets = {
    x: datasets.ImageFolder(root=os.path.join(data_dir, x), transform=data_transform[x])
    for x in ['train', 'valid']
}
dataloader = {
    x: torch.utils.data.DataLoader(dataset=image_datasets[x], batch_size=16, shuffle=True)
    for x in ['train', 'valid']
}
When loading the data, we use the Resize class from torchvision.transforms to scale every original image uniformly to 64×64. The code above defines both the transforms and the dataset loading as dictionaries: the training set and validation set each need their own loading definition, so dictionaries keep the code compact and make the subsequent calls and operations convenient.
os.path.join comes from the os package mentioned earlier; it joins the names passed in into one complete file path. Other commonly used os.path methods are described at the link below:
https://blog.csdn.net/Galen_xia/article/details/108800213
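For example, a few of the helpers used throughout this section behave as follows:

```python
import os

# os.path.join assembles path components with the platform's separator.
print(os.path.join('DvC', 'train'))       # 'DvC/train' on Linux/macOS

# Related helpers that often appear alongside it:
print(os.path.basename('DvC/train/cat'))  # 'cat'
print(os.path.splitext('cat.0.jpg'))      # ('cat.0', '.jpg')
```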
Next we fetch one batch of data for preview and analysis. The code is as follows:
x_example, y_example = next(iter(dataloader['train']))
print('number of x_example: {}'.format(len(x_example)))
print('number of y_example: {}'.format(len(y_example)))
number of x_example: 16
number of y_example: 16
The code above uses iter and next to fetch one batch of loaded data.
x_example is a Tensor. Because the images were resized, they are all 64×64 now, so x_example has shape (16, 3, 64, 64): 16 is the number of images in this batch; 3 is the number of color channels, since the original images are in color and use the R, G, and B channels; 64 is the image width and height.
y_example is also a Tensor, and its elements are all 0s and 1s. Why 0 and 1? When the data was loaded, the contents of the cat and dog folders were mapped to integer class labels, so 0 and 1 are the labels of the images, corresponding to cat pictures and dog pictures respectively. We can print the mapping to verify this correspondence. The code is as follows:
index_classes = image_datasets['train'].class_to_idx
print(index_classes)
{'cat': 0, 'dog': 1}
However, to make the labels on the images we plot later easier to recognize, we also store the original class names, obtained via image_datasets['train'].classes, in a variable named example_classes. The code is as follows:
example_classes = image_datasets['train'].classes
print(example_classes)
['cat', 'dog']
We use Matplotlib to plot one batch of images. The code is as follows:
import matplotlib.pyplot as plt
import torchvision
%matplotlib inline

img = torchvision.utils.make_grid(x_example)
img = img.numpy().transpose([1, 2, 0])
print([example_classes[i.item()] for i in y_example])
plt.imshow(img)
plt.show()
['dog', 'dog', 'cat', 'cat', 'dog', 'dog', 'dog', 'dog', 'cat', 'cat', 'dog', 'dog', 'cat', 'cat', 'cat', 'dog']
3. Model building and parameter optimization
A custom VGGNet: we build a simplified VGGNet based on the VGG16 architecture. The simplified model requires input images scaled to 64×64, whereas the standard VGG16 expects 224×224 input. It also drops VGG16's last three convolutional layers and final pooling layer and changes the sizes of the fully connected layers; all of these changes reduce the number of trainable parameters. The code for the simplified model is as follows:
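The 4 * 4 * 512 input size of the first fully connected layer follows directly from the pooling arithmetic: each of the four 2×2 max-pool layers halves the spatial size of the 64×64 input, while the 3×3, stride-1, padding-1 convolutions leave it unchanged.

```python
# Four 2x2 max-pools each halve the spatial size: 64 -> 32 -> 16 -> 8 -> 4.
size = 64
for _ in range(4):
    size //= 2
print(size)               # 4
print(size * size * 512)  # 8192: in_features of the first Linear layer
```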
class Models(torch.nn.Module):
    def __init__(self):
        super(Models, self).__init__()
        self.Conv = torch.nn.Sequential(
            torch.nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2),
            torch.nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2),
            torch.nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2),
            torch.nn.Conv2d(256, 512, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.Classes = torch.nn.Sequential(
            torch.nn.Linear(4 * 4 * 512, 1024),
            torch.nn.ReLU(),
            torch.nn.Dropout(p=0.5),
            torch.nn.Linear(1024, 1024),
            torch.nn.ReLU(),
            torch.nn.Dropout(p=0.5),
            torch.nn.Linear(1024, 2)
        )

    def forward(self, input):
        x = self.Conv(input)
        x = x.view(-1, 4 * 4 * 512)
        x = self.Classes(x)
        return x
After the model is built, we print it to display its details. The code is as follows:
model = Models()
print(model)
Models(
  (Conv): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU()
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU()
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU()
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU()
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU()
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU()
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU()
    (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (18): ReLU()
    (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (20): ReLU()
    (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (22): ReLU()
    (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (Classes): Sequential(
    (0): Linear(in_features=8192, out_features=1024, bias=True)
    (1): ReLU()
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=1024, out_features=1024, bias=True)
    (4): ReLU()
    (5): Dropout(p=0.5, inplace=False)
    (6): Linear(in_features=1024, out_features=2, bias=True)
  )
)
Then we define the model's loss function and the optimizer for its parameters. The code is as follows:
loss_f = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

epoch_n = 10
time_open = time.time()

for epoch in range(epoch_n):
    print('Epoch {}/{}'.format(epoch + 1, epoch_n))
    print('----' * 10)
    for phase in ['train', 'valid']:
        if phase == 'train':
            print('Training...')
            model.train(True)
        else:
            print('Validing...')
            model.train(False)
        running_loss = 0.0
        running_corrects = 0
        for batch, data in enumerate(dataloader[phase], 1):
            x, y = data
            x, y = Variable(x), Variable(y)
            y_pred = model(x)
            _, pred = torch.max(y_pred.data, 1)
            optimizer.zero_grad()
            loss = loss_f(y_pred, y)
            if phase == 'train':
                loss.backward()
                optimizer.step()
            running_loss += loss.data
            running_corrects += torch.sum(pred == y.data)
            if batch % 500 == 0 and phase == 'train':
                print('Batch {},Train Loss:{},Train ACC:{}'.format(
                    batch, running_loss / batch, 100 * running_corrects / (16 * batch)))
        epoch_loss = running_loss * 16 / len(image_datasets[phase])
        epoch_acc = 100 * running_corrects / len(image_datasets[phase])
        print('{} Loss:{} ACC:{}'.format(phase, epoch_loss, epoch_acc))

time_end = time.time() - time_open
print(time_end)
In this code the optimizer is Adam and the loss function is cross-entropy; training runs for 10 epochs in total. The final output is as follows:
......
Epoch 9/10
----------------------------------------
Training...
Batch 500,Train Loss:0.5421267151832581,Train ACC:72
Batch 1000,Train Loss:0.5455202460289001,Train ACC:72
train Loss:0.5441449284553528 ACC:72
Validing...
valid Loss:0.5180226564407349 ACC:74
Epoch 10/10
----------------------------------------
Training...
Batch 500,Train Loss:0.5260576009750366,Train ACC:73
Batch 1000,Train Loss:0.5251520872116089,Train ACC:73
train Loss:0.5234497785568237 ACC:73
Validing...
valid Loss:0.5083054900169373 ACC:75
15662.592585325241
The accuracy is barely satisfactory, and because the entire run used only the CPU, the whole process was very time-consuming. Below we adjust the original code so that all parameters computed during training are moved to the GPU. The process is simple and convenient: we only need to convert the types of those parameters. Before doing so, of course, we must confirm that a GPU is available. The code is as follows:
use_gpu = torch.cuda.is_available()
print(use_gpu)
True
The returned value is True, which means the GPU meets all the conditions for use. The new training code is as follows:
use_gpu = torch.cuda.is_available()
if use_gpu:
    model = Models().cuda()
else:
    model = Models()

loss_f = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

epoch_n = 10
time_open = time.time()

for epoch in range(epoch_n):
    print('Epoch {}/{}'.format(epoch + 1, epoch_n))
    print('----' * 10)
    for phase in ['train', 'valid']:
        if phase == 'train':
            print('Training...')
            model.train(True)
        else:
            print('Validing...')
            model.train(False)
        running_loss = 0.0
        running_corrects = 0
        for batch, data in enumerate(dataloader[phase], 1):
            x, y = data
            if use_gpu:
                x, y = Variable(x.cuda()), Variable(y.cuda())
            else:
                x, y = Variable(x), Variable(y)
            y_pred = model(x)
            _, pred = torch.max(y_pred.data, 1)
            optimizer.zero_grad()
            loss = loss_f(y_pred, y)
            if phase == 'train':
                loss.backward()
                optimizer.step()
            running_loss += loss.data
            running_corrects += torch.sum(pred == y.data)
            if batch % 500 == 0 and phase == 'train':
                print('Batch {},Train Loss:{},Train ACC:{}'.format(
                    batch, running_loss / batch, 100 * running_corrects / (16 * batch)))
        epoch_loss = running_loss * 16 / len(image_datasets[phase])
        epoch_acc = 100 * running_corrects / len(image_datasets[phase])
        print('{} Loss:{} ACC:{}'.format(phase, epoch_loss, epoch_acc))

time_end = time.time() - time_open
print(time_end)
In the code above, model = Models().cuda() and x, y = Variable(x.cuda()), Variable(y.cuda()) are the lines that move the computation to the GPU. After training for 10 epochs, the output is as follows:
......
Epoch 9/10
----------------------------------------
Training...
Batch 500,Train Loss:0.5389402508735657,Train ACC:72
Batch 1000,Train Loss:0.5377429127693176,Train ACC:72
train Loss:0.5390774607658386 ACC:72
Validing...
valid Loss:0.5216068029403687 ACC:74
Epoch 10/10
----------------------------------------
Training...
Batch 500,Train Loss:0.5351917743682861,Train ACC:72
Batch 1000,Train Loss:0.5298460721969604,Train ACC:73
train Loss:0.5265329480171204 ACC:73
Validing...
valid Loss:0.498246431350708 ACC:75
1019.3638801574707
Compared with the previous run, the elapsed time dropped sharply; the GPU is clearly far more efficient than the CPU for these computations.
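As an aside, the same CPU/GPU switch is usually written today with a single torch.device object and .to(), which removes the use_gpu branching from the loop. A minimal sketch, where the Linear layer is merely a stand-in for the Models network:

```python
import torch

# One device object drives every placement decision.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = torch.nn.Linear(8, 2).to(device)  # stand-in for Models()
x = torch.randn(4, 8).to(device)          # batches are moved the same way
y_pred = model(x)
print(y_pred.shape)  # torch.Size([4, 2])
```

The same code then runs unchanged on either hardware; only the device string differs.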
The complete code is given below:
import torch
from torchvision import datasets, transforms
import os
import time
from torch.autograd import Variable

data_dir = 'DvC'
data_transform = {
    x: transforms.Compose([transforms.Resize([64, 64]), transforms.ToTensor()])
    for x in ['train', 'valid']
}
image_datasets = {
    x: datasets.ImageFolder(root=os.path.join(data_dir, x), transform=data_transform[x])
    for x in ['train', 'valid']
}
dataloader = {
    x: torch.utils.data.DataLoader(dataset=image_datasets[x], batch_size=16, shuffle=True)
    for x in ['train', 'valid']
}

class Models(torch.nn.Module):
    def __init__(self):
        super(Models, self).__init__()
        self.Conv = torch.nn.Sequential(
            torch.nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2),
            torch.nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2),
            torch.nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2),
            torch.nn.Conv2d(256, 512, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.Classes = torch.nn.Sequential(
            torch.nn.Linear(4 * 4 * 512, 1024),
            torch.nn.ReLU(),
            torch.nn.Dropout(p=0.5),
            torch.nn.Linear(1024, 1024),
            torch.nn.ReLU(),
            torch.nn.Dropout(p=0.5),
            torch.nn.Linear(1024, 2)
        )

    def forward(self, input):
        x = self.Conv(input)
        x = x.view(-1, 4 * 4 * 512)
        x = self.Classes(x)
        return x

use_gpu = torch.cuda.is_available()
if use_gpu:
    model = Models().cuda()
else:
    model = Models()

loss_f = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

epoch_n = 10
time_open = time.time()

for epoch in range(epoch_n):
    print('Epoch {}/{}'.format(epoch + 1, epoch_n))
    print('----' * 10)
    for phase in ['train', 'valid']:
        if phase == 'train':
            print('Training...')
            model.train(True)
        else:
            print('Validing...')
            model.train(False)
        running_loss = 0.0
        running_corrects = 0
        for batch, data in enumerate(dataloader[phase], 1):
            x, y = data
            if use_gpu:
                x, y = Variable(x.cuda()), Variable(y.cuda())
            else:
                x, y = Variable(x), Variable(y)
            y_pred = model(x)
            _, pred = torch.max(y_pred.data, 1)
            optimizer.zero_grad()
            loss = loss_f(y_pred, y)
            if phase == 'train':
                loss.backward()
                optimizer.step()
            running_loss += loss.data
            running_corrects += torch.sum(pred == y.data)
            if batch % 500 == 0 and phase == 'train':
                print('Batch {},Train Loss:{},Train ACC:{}'.format(
                    batch, running_loss / batch, 100 * running_corrects / (16 * batch)))
        epoch_loss = running_loss * 16 / len(image_datasets[phase])
        epoch_acc = 100 * running_corrects / len(image_datasets[phase])
        print('{} Loss:{} ACC:{}'.format(phase, epoch_loss, epoch_acc))

time_end = time.time() - time_open
print(time_end)