Training a Classification Model with Transfer Learning in Practice, Part 1

Preface

For brevity, this article does not cover any training. It only walks through preparing the data, building the model, and running inference with randomly initialized weights;
how to use a pretrained model and run the full training pipeline will be covered in later parts.

Getting and Preprocessing the Data

Dataset: 102 Category Flower Dataset

It contains 102 flower categories, with 40 to 258 images per category. The images show large variations in scale, pose, and lighting. In addition, some categories have large intra-class variation, and several categories are very similar to one another.

!unzip flower_data.zip
# Import the required libraries
from collections import OrderedDict
import numpy as np
import torch
from torch import nn, optim
from torchvision import datasets, transforms, models
import torchvision.transforms.functional as TF
from torch.utils.data import Subset
from thop import profile, clever_format
from torchsummary import summary
from PIL import Image
data_dir = 'flower_data'
input_size = 224
# Mean and std used for normalization (standard ImageNet statistics)
normalize_mean = np.array([0.485, 0.456, 0.406])
normalize_std = np.array([0.229, 0.224, 0.225])
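
The datasets, transforms, and Subset imports above are not used in this part; they become relevant once training starts. As a preview, here is a minimal data-loading sketch (my own illustration, assuming the unzipped archive contains train/ and valid/ subfolders; the valid/ layout is confirmed by the test image path used later):

from torch.utils.data import DataLoader

# Hypothetical data-loading setup, not required for the rest of this article
data_transforms = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(input_size),
    transforms.ToTensor(),
    transforms.Normalize(normalize_mean, normalize_std),
])

train_dataset = datasets.ImageFolder(data_dir + '/train', transform=data_transforms)
valid_dataset = datasets.ImageFolder(data_dir + '/valid', transform=data_transforms)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
valid_loader = DataLoader(valid_dataset, batch_size=32, shuffle=False)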

Building the Model

We use the ResNet provided by torchvision and modify its classifier to fit the dataset: the stock model is designed for ImageNet, so its classifier has 1000 output classes, which does not suit this dataset. The original classifier is replaced with a 102-class one.
Other models could be used here as well; in later parts the model will be adjusted according to results and requirements.

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

print(f'Running on: {str(device).upper()}')
    Running on: CUDA
output_size = 102
model = models.resnet18()
# Replace the classifier with a 102-class head
classifier = OrderedDict()
classifier['layer0'] = nn.Linear(model.fc.in_features, output_size)
classifier['output_function'] = nn.LogSoftmax(dim=1)
model.fc = nn.Sequential(classifier)

model.to(device)
    ResNet(
      (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
      (layer1): Sequential(
        (0): BasicBlock(
          (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace=True)
          (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
        (1): BasicBlock(
          (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace=True)
          (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (layer2): Sequential(
        (0): BasicBlock(
          (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace=True)
          (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (downsample): Sequential(
            (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
            (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          )
        )
        (1): BasicBlock(
          (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace=True)
          (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (layer3): Sequential(
        (0): BasicBlock(
          (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace=True)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (downsample): Sequential(
            (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
            (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          )
        )
        (1): BasicBlock(
          (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace=True)
          (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (layer4): Sequential(
        (0): BasicBlock(
          (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace=True)
          (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (downsample): Sequential(
            (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
            (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          )
        )
        (1): BasicBlock(
          (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (relu): ReLU(inplace=True)
          (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
      (fc): Sequential(
        (layer0): Linear(in_features=512, out_features=102, bias=True)
        (output_function): LogSoftmax(dim=1)
      )
    )

Inspecting the Model's Parameter Count and FLOPs

First, distinguish FLOPS from FLOPs.

FLOPs: note the lowercase s. It is short for floating point operations (the s marks the plural), i.e. the number of floating point operations, which can be read as the amount of computation. It is used to measure the complexity of an algorithm or model.

FLOPS: with an uppercase S, it is short for floating point operations per second, i.e. how many floating point operations the hardware can execute each second. It measures hardware speed, not model complexity.

Reference: Zhihu
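
To make the distinction concrete, here is a small back-of-the-envelope example (my own illustration): the multiply-accumulate count (MACs) of a convolution layer is roughly kernel_h × kernel_w × C_in × C_out × H_out × W_out, and one MAC corresponds to two FLOPs (a multiply plus an add); some tools count MACs but label them FLOPs. For ResNet-18's first conv layer (7×7 kernel, 3 → 64 channels, stride 2, 224×224 input, 112×112 output):

# MACs of ResNet-18's first conv layer
k, c_in, c_out, h_out, w_out = 7, 3, 64, 112, 112
macs = k * k * c_in * c_out * h_out * w_out
print(f'{macs / 1e6:.1f} M MACs')  # ~118.0 M multiply-accumulates for this layer alone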

Inspecting the model, its parameter count, and its FLOPs helps us understand the model better and also points to optimization directions for later deployment:

# model.to(device)
_input = torch.randn(1, 3, input_size, input_size).to(device)
flops, params = profile(model, inputs=(_input,))  # custom modules need: custom_ops={YourModule: count_your_model}
flops, params = clever_format([flops, params], '%.6f')
print('FLOPs:', flops, '\tparams:', params)
# FLOPs: 1.818607G 	params: 11.228838M
  [INFO] Register count_convNd() for <class 'torch.nn.modules.conv.Conv2d'>.
  [INFO] Register count_bn() for <class 'torch.nn.modules.batchnorm.BatchNorm2d'>.
  [INFO] Register zero_ops() for <class 'torch.nn.modules.activation.ReLU'>.
  [INFO] Register zero_ops() for <class 'torch.nn.modules.pooling.MaxPool2d'>.
  [WARN] Cannot find rule for <class 'torchvision.models.resnet.BasicBlock'>. Treat it as zero Macs and zero Params.
  [WARN] Cannot find rule for <class 'torch.nn.modules.container.Sequential'>. Treat it as zero Macs and zero Params.
  [INFO] Register count_adap_avgpool() for <class 'torch.nn.modules.pooling.AdaptiveAvgPool2d'>.
  [INFO] Register count_linear() for <class 'torch.nn.modules.linear.Linear'>.
  [WARN] Cannot find rule for <class 'torch.nn.modules.activation.LogSoftmax'>. Treat it as zero Macs and zero Params.
  [WARN] Cannot find rule for <class 'torchvision.models.resnet.ResNet'>. Treat it as zero Macs and zero Params.
  FLOPs: 1.818607G 	params: 11.228838M

Custom modules need their own hook to be counted:
custom_ops={ YourModule: count_your_model }
YourModule: your custom module class
count_your_model: the hook function that counts the ops of that module
Reference: thop/profile.py
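
The thop warnings above show it has no rule for nn.LogSoftmax (it is simply counted as zero ops, which is harmless here). As an illustration, here is a minimal sketch of such a hook, following the (module, inputs, output) signature of thop's built-in counters; the cost model of roughly two operations per element is my own rough assumption, not an official thop rule:

import torch
from torch import nn
from thop import profile

def count_logsoftmax(m, x, y):
    # x is the tuple of inputs to the module; assume ~2 ops per element
    x = x[0]
    total_ops = 2 * x.numel()
    m.total_ops += torch.DoubleTensor([int(total_ops)])

# Register the hook for the module class thop warned about
flops, params = profile(model, inputs=(_input,),
                        custom_ops={nn.LogSoftmax: count_logsoftmax})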

# model.to(device)
summary(model, (3, input_size, input_size))
    ----------------------------------------------------------------
            Layer (type)               Output Shape         Param #
    ================================================================
                Conv2d-1         [-1, 64, 112, 112]           9,408
           BatchNorm2d-2         [-1, 64, 112, 112]             128
                  ReLU-3         [-1, 64, 112, 112]               0
             MaxPool2d-4           [-1, 64, 56, 56]               0
                Conv2d-5           [-1, 64, 56, 56]          36,864
           BatchNorm2d-6           [-1, 64, 56, 56]             128
                  ReLU-7           [-1, 64, 56, 56]               0
                Conv2d-8           [-1, 64, 56, 56]          36,864
           BatchNorm2d-9           [-1, 64, 56, 56]             128
                 ReLU-10           [-1, 64, 56, 56]               0
           BasicBlock-11           [-1, 64, 56, 56]               0
               Conv2d-12           [-1, 64, 56, 56]          36,864
          BatchNorm2d-13           [-1, 64, 56, 56]             128
                 ReLU-14           [-1, 64, 56, 56]               0
               Conv2d-15           [-1, 64, 56, 56]          36,864
          BatchNorm2d-16           [-1, 64, 56, 56]             128
                 ReLU-17           [-1, 64, 56, 56]               0
           BasicBlock-18           [-1, 64, 56, 56]               0
               Conv2d-19          [-1, 128, 28, 28]          73,728
          BatchNorm2d-20          [-1, 128, 28, 28]             256
                 ReLU-21          [-1, 128, 28, 28]               0
               Conv2d-22          [-1, 128, 28, 28]         147,456
          BatchNorm2d-23          [-1, 128, 28, 28]             256
               Conv2d-24          [-1, 128, 28, 28]           8,192
          BatchNorm2d-25          [-1, 128, 28, 28]             256
                 ReLU-26          [-1, 128, 28, 28]               0
           BasicBlock-27          [-1, 128, 28, 28]               0
               Conv2d-28          [-1, 128, 28, 28]         147,456
          BatchNorm2d-29          [-1, 128, 28, 28]             256
                 ReLU-30          [-1, 128, 28, 28]               0
               Conv2d-31          [-1, 128, 28, 28]         147,456
          BatchNorm2d-32          [-1, 128, 28, 28]             256
                 ReLU-33          [-1, 128, 28, 28]               0
           BasicBlock-34          [-1, 128, 28, 28]               0
               Conv2d-35          [-1, 256, 14, 14]         294,912
          BatchNorm2d-36          [-1, 256, 14, 14]             512
                 ReLU-37          [-1, 256, 14, 14]               0
               Conv2d-38          [-1, 256, 14, 14]         589,824
          BatchNorm2d-39          [-1, 256, 14, 14]             512
               Conv2d-40          [-1, 256, 14, 14]          32,768
          BatchNorm2d-41          [-1, 256, 14, 14]             512
                 ReLU-42          [-1, 256, 14, 14]               0
           BasicBlock-43          [-1, 256, 14, 14]               0
               Conv2d-44          [-1, 256, 14, 14]         589,824
          BatchNorm2d-45          [-1, 256, 14, 14]             512
                 ReLU-46          [-1, 256, 14, 14]               0
               Conv2d-47          [-1, 256, 14, 14]         589,824
          BatchNorm2d-48          [-1, 256, 14, 14]             512
                 ReLU-49          [-1, 256, 14, 14]               0
           BasicBlock-50          [-1, 256, 14, 14]               0
               Conv2d-51            [-1, 512, 7, 7]       1,179,648
          BatchNorm2d-52            [-1, 512, 7, 7]           1,024
                 ReLU-53            [-1, 512, 7, 7]               0
               Conv2d-54            [-1, 512, 7, 7]       2,359,296
          BatchNorm2d-55            [-1, 512, 7, 7]           1,024
               Conv2d-56            [-1, 512, 7, 7]         131,072
          BatchNorm2d-57            [-1, 512, 7, 7]           1,024
                 ReLU-58            [-1, 512, 7, 7]               0
           BasicBlock-59            [-1, 512, 7, 7]               0
               Conv2d-60            [-1, 512, 7, 7]       2,359,296
          BatchNorm2d-61            [-1, 512, 7, 7]           1,024
                 ReLU-62            [-1, 512, 7, 7]               0
               Conv2d-63            [-1, 512, 7, 7]       2,359,296
          BatchNorm2d-64            [-1, 512, 7, 7]           1,024
                 ReLU-65            [-1, 512, 7, 7]               0
           BasicBlock-66            [-1, 512, 7, 7]               0
    AdaptiveAvgPool2d-67            [-1, 512, 1, 1]               0
               Linear-68                  [-1, 102]          52,326
           LogSoftmax-69                  [-1, 102]               0
    ================================================================
    Total params: 11,228,838
    Trainable params: 11,228,838
    Non-trainable params: 0
    ----------------------------------------------------------------
    Input size (MB): 0.57
    Forward/backward pass size (MB): 62.79
    Params size (MB): 42.83
    Estimated Total Size (MB): 106.20
    ----------------------------------------------------------------
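
As a quick check on the "Params size" figure above (my own arithmetic): the parameter memory corresponds to 4 bytes per float32 parameter, and 11,228,838 × 4 bytes is indeed about 42.83 MB.

# 11,228,838 float32 parameters at 4 bytes each, converted to MiB
params_mb = 11_228_838 * 4 / 1024 ** 2
print(f'{params_mb:.2f} MB')  # ~42.83 MB, matching the summary above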

Testing the Model

A single image is used to check that the whole pipeline runs without problems. Only the model's predicted confidences are printed here; since the weights are randomly initialized these values are meaningless, so they are not visualized, as there is nothing to compare them against. In later parts, after the model has been trained, visualizing its predictions on test images will give an intuitive view of inference quality and help judge how good the model is.

def process_image(image):
    ''' Preprocess a PIL image and return a normalized tensor
    '''
    image = TF.resize(image, 256)

    # Center-crop to 224x224
    upper_pixel = (image.height - 224) // 2
    left_pixel = (image.width - 224) // 2
    image = TF.crop(image, upper_pixel, left_pixel, 224, 224)

    image = TF.to_tensor(image)
    image = TF.normalize(image, normalize_mean, normalize_std)

    return image
def predict(image_path, model, topk=5):
    ''' Load an image, run inference, and return the top-k probabilities and classes
    '''
    image = Image.open(image_path)
    image = process_image(image)

    with torch.no_grad():
        model.eval()

        image = image.view(1, 3, 224, 224)
        image = image.to(device)

        predictions = model(image)

        # The model outputs log-probabilities (LogSoftmax), so exponentiate to get probabilities
        predictions = torch.exp(predictions)
        top_ps, top_class = predictions.topk(topk, dim=1)

    return top_ps, top_class
category = 30
image_name = 'image_03475.jpg'
image_path = data_dir + f'/valid/{category}/{image_name}'

probs, classes = predict(image_path, model)
print(probs)
print(classes)
    tensor([[0.0301, 0.0275, 0.0264, 0.0257, 0.0219]], device='cuda:0')
    tensor([[73, 76,  9, 62, 32]], device='cuda:0')
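
With random weights these indices are arbitrary, but once the model is trained on an ImageFolder dataset they map back to folder names through class_to_idx. A hedged sketch of inverting that mapping, assuming flower_data/valid keeps the per-category folder layout used for the test image above:

# Hypothetical index-to-folder lookup; only meaningful once the model has
# been trained against this ImageFolder (random weights give arbitrary indices)
valid_dataset = datasets.ImageFolder(data_dir + '/valid')
idx_to_class = {v: k for k, v in valid_dataset.class_to_idx.items()}
# Note: ImageFolder sorts folder names lexicographically ('1', '10', '100', ...),
# so class index 73 is generally not the folder named '73'
print([idx_to_class[i] for i in classes[0].tolist()])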