Twelve-Category Cat Classification

I. Project Background

1. Competition description

  • This competition asks participants to classify twelve kinds of cats, a classic image-classification task in computer vision. Since image classification underpins most other vision tasks, it is a good way to get started with CV quickly.

2. Dataset description

  • The competition dataset contains images of 12 kinds of cats, split into a training set and a test set.
  • Training set: high-resolution color images together with their class labels; 2160 cat images in total, with an annotation file.
  • Test set: color images only; 240 cat images in total, without an annotation file.

II. Data Processing

1. Unzip PaddleClas and the dataset

  • Because building python-opencv stalls when installing paddleclas via pip, the PaddleClas archive was downloaded from GitHub and uploaded to AI Studio instead.
  • Note: since the unzipped PaddleClas is used and we switch into the PaddleClas directory, it is recommended to use absolute paths for all file paths.

In [16]

# Unzip the dataset
!unzip -oqn /home/aistudio/data/data10954/cat_12_test.zip -d data/
!unzip -oqn /home/aistudio/data/data10954/cat_12_train.zip -d data/
# Unzip PaddleClas
!unzip -oqn /home/aistudio/PaddleClas-release-2.2.zip
%cd PaddleClas-release-2.2

2. Build train.txt, val.txt, and test.txt

  • train.txt: each line is one sample in the form "absolute image path\tlabel", separated by a tab (an illustrative sample line is shown below).
  • val.txt: same format as train.txt; a portion of the training set is held out as the validation set.
  • test.txt: each line is one sample (absolute image path only).
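For example, a line of train.txt would look like the following, with a literal tab between the path and the label (the file name is only a placeholder):

    /home/aistudio/data/cat_12_train/xxxx.jpg	3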

In [15]

import os
import numpy as np


# Write a list of samples into a .txt file
def data_to_txt(datas, save_path):
    with open(save_path, 'w') as f:
        for i in datas:
            f.write(f'{i}\n')

# Build the labeled sample list
datas_with_label = []
with open('/home/aistudio/data/data10954/train_list.txt', 'r') as f:
    for line in f.readlines():
        line = line.strip()
        datas_with_label.append(f'/home/aistudio/data/{line}')  # absolute image path \t label

# Shuffle the labeled sample list
np.random.shuffle(datas_with_label)

# Split into training and validation sets at an 8:2 ratio
train_datas = datas_with_label[len(datas_with_label)//10*2:] 
val_datas = datas_with_label[:len(datas_with_label)//10*2]
print('train_datas len:', len(train_datas))
print('val_datas len:', len(val_datas))

# Write train.txt and val.txt
data_to_txt(train_datas, '/home/aistudio/train.txt')
data_to_txt(val_datas, '/home/aistudio/val.txt')

# Build the test sample list
test_datas = []
test_dir = '/home/aistudio/data/cat_12_test'
for i in os.listdir(test_dir):
    test_datas.append(os.path.join(test_dir, i))
print('test_datas len:', len(test_datas))

# Write test.txt
data_to_txt(test_datas, '/home/aistudio/test.txt')

3. Custom dataset class

In [3]

import paddle
import numpy as np
from PIL import Image


class CatDataset(paddle.io.Dataset):
    def __init__(self, txtpath, mode='train', transform=None):
        super(CatDataset, self).__init__()

        assert mode in ['train', 'val', 'test'], "mode is one of ['train', 'val', 'test']"
        self.mode = mode
        self.transform = transform
        self.data = []

        with open(txtpath, 'r') as f:
            for line in f.readlines():
                line = line.strip()
                if mode != 'test':
                    self.data.append([line.split('\t')[0], line.split('\t')[1]])
                else:
                    self.data.append(line)
    
    def __getitem__(self, idx):
        if self.mode != 'test':
            img = Image.open(self.data[idx][0]).convert('RGB')
            label = self.data[idx][1]
            if self.transform:
                img = self.transform(img)
            return img.astype('float32'), np.array(label, dtype='int64')
        else:
            img = Image.open(self.data[idx]).convert('RGB')
            if self.transform:
                img = self.transform(img)
            return img.astype('float32')
    
    def __len__(self):
        return len(self.data)
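A quick sanity check of the dataset class can be useful before building the DataLoaders. Below is a minimal sketch; it assumes the train.txt built in the previous step exists and uses a plain ToTensor transform purely for illustration.

In [ ]

import paddle.vision.transforms as T

# Sketch: read one sample to confirm that paths, shapes, and labels are parsed correctly
check_dataset = CatDataset(txtpath='/home/aistudio/train.txt', mode='train', transform=T.ToTensor())
print('number of samples:', len(check_dataset))
img0, label0 = check_dataset[0]
print('img0 shape:', img0.shape, '; label0:', label0)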

III. Model Construction

1. Build the DataLoaders

In [14]

import paddle
import paddle.vision.transforms as T


# Batch size
batch_size = 64  # reduce this if memory runs out
# Input image size
size = 480  # tunable
# transform
train_transform = T.Compose([
    T.Resize(size=size),
    T.RandomRotation(degrees=30, interpolation='bilinear'),
    T.ColorJitter(0.3, 0.3, 0.3, 0.3),
    T.CenterCrop(size=size),
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))
])
eval_transform = T.Compose([
    T.Resize(size=size),
    T.CenterCrop(size=size),
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))
])

# dataset
train_dataset = CatDataset(txtpath='/home/aistudio/train.txt', mode='train', transform=train_transform)
val_dataset = CatDataset(txtpath='/home/aistudio/val.txt', mode='val', transform=eval_transform)
test_dataset = CatDataset(txtpath='/home/aistudio/test.txt', mode='test', transform=eval_transform)

# dataloader
train_dataloader = paddle.io.DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
val_dataloader = paddle.io.DataLoader(dataset=val_dataset, batch_size=batch_size)
test_dataloader = paddle.io.DataLoader(dataset=test_dataset, batch_size=batch_size)

# print
print('train_dataloader len:', len(train_dataloader))
print('val_dataloader len:', len(val_dataloader))
print('test_dataloader len:', len(test_dataloader))

2. Build the network

In [13]

import paddle
from paddle.metric import Accuracy
from ppcls.arch.backbone import model_zoo


# Number of classes
num_classes = 12
# Model
model = model_zoo.resnet_vc.ResNet50_vc(class_num=num_classes)  # the model can be swapped
# Optimizer
optimizer = paddle.optimizer.Adam(learning_rate=1e-4, parameters=model.parameters(), weight_decay=1e-5)  # the optimizer can be swapped
# Loss function
loss = paddle.nn.CrossEntropyLoss()
# Metric
acc = Accuracy()

# Wrap with the high-level API
model = paddle.Model(model)
model.prepare(optimizer, loss, acc)

# Print the model summary
model.summary((batch_size, 3) + (size, )*2)

In [ ]

# model.save('/home/aistudio/ResNet50_vc_random')

IV. Model Training

1. Training

In [12]

# Number of training epochs
epochs = 10

# Train the model
model.fit(train_dataloader, val_dataloader, epochs=epochs, verbose=1)

In [ ]

# model.save('/home/aistudio/ResNet50_vc_training')

2. Fine-tuning on training + validation data

  • Merge the previously split training and validation sets

In [11]

all_datas = []
with open('/home/aistudio/train.txt') as f:
    for line in f.readlines():
        all_datas.append(line.strip())

with open('/home/aistudio/val.txt') as f:
    for line in f.readlines():
        all_datas.append(line.strip())

# Write all sample paths and labels into one .txt file
with open('/home/aistudio/all.txt', 'w') as f:
    for line in all_datas:
        f.write(f'{line}\n')

# dataset
all_dataset = CatDataset(txtpath='/home/aistudio/all.txt', mode='train', transform=train_transform)
# dataloader
all_dataloader = paddle.io.DataLoader(dataset=all_dataset, batch_size=batch_size, shuffle=True)
# print
print('all_dataloader len:', len(all_dataloader))
# Optimizer with a smaller learning rate for fine-tuning
optimizer = paddle.optimizer.Adam(learning_rate=1e-6, parameters=model.parameters(), weight_decay=1e-6)
# Reconfigure the high-level API wrapper
model.prepare(optimizer, loss, acc)
# Train the model on the merged data
model.fit(all_dataloader, epochs=2, verbose=1)

V. Prediction and Saving the Results

  • model.predict returns [ [batch1, batch2, …] ]: the outer list has one entry per model output, and each entry is a list of per-batch results.

In [10]

import numpy as np
import pandas as pd


# Run prediction on the test set
results = model.predict(test_dataloader)

# Take the argmax over each prediction to get the class id
new_results = []

for batch in results[0]:
    for result in batch:
        new_results.append(np.argmax(result))

print('new_results len:', len(new_results), '; new_results[0] =', new_results[0])

# Collect the image file names
names = []
for line in test_dataset.data:
    names.append(line.split('/')[-1])
print('names len:', len(names))

# Save the results
pd_results = pd.DataFrame({'names':names, 'results':new_results})
pd_results.to_csv('/home/aistudio/result.csv', header=False, index=False)
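Since header=False and index=False are used, each line of result.csv is simply "image name,predicted class id", for example (the file name is only a placeholder):

    xxxx.jpg,3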

VI. Transfer-Learning Comparison Experiments

1. Randomly initialized parameters: visualizing what individual first-layer channels extract

  • Visualize what kind of features the first-layer parameters of a randomly initialized model actually extract
  • You can also use this method to inspect each part of the model in more detail

In [ ]

import paddle
import paddle.nn as nn
import paddle.vision.transforms as T
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image


# Read the image
ori_img = Image.open('/home/aistudio/data/cat_12_train/LTMkHx9w2nfsRiZec3bEVtmujpv7qS1y.jpg').convert('RGB')
# Normalization transform
transform = T.Compose([
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))
])
# Normalize the image
img = transform(ori_img)  # img.shape: (H, W, C) -> (C, H, W)
# Wrap the image as a tensor with a batch dimension to match Paddle's convolution API
# Conv2D expects a 4-D Tensor of shape (N, C, H, W) or (N, H, W, C); the default is (N, C, H, W)
img = paddle.to_tensor([img], dtype='float32')

# Load the saved parameters
state_dict = paddle.load('/home/aistudio/ResNet50_vc_random.pdparams')
# Print the parameter names; the first conv layer's weight is named conv1_1._conv.weight
# print(state_dict.keys())
# state_dict['conv1_1._conv.weight'].shape: (out_channels, in_channels, H, W)
print("state_dict['conv1_1._conv.weight'].shape:", state_dict['conv1_1._conv.weight'].shape)

# Take the first 2 kernels of conv1_1._conv.weight
# conv1_1_kernel.shape: (2, 3, H, W)
conv1_1_kernel = state_dict['conv1_1._conv.weight'][0:2]

# Loop over the three input channels, applying each channel's kernels separately
out1_list = []
for i in range(conv1_1_kernel.shape[1]):
    # conv1_1_kernelx.shape: (2, 1, H, W)
    conv1_1_kernelx = conv1_1_kernel[:, i:i+1]
    print("conv1_1_kernelx.shape:", conv1_1_kernelx.shape)
    # Build a single-channel convolution; the hyperparameters must match those of the chosen model
    conv = nn.Conv2D(in_channels=1, out_channels=2, kernel_size=3, stride=2, bias_attr=False)
    key = conv.weight.name
    conv.set_state_dict({key:conv1_1_kernelx})
    # Feature extraction
    out = conv(img[:, i:i+1])
    out1_list.append(out.numpy()[0, 0])
    out1_list.append(out.numpy()[0, 1])
    # out.shape: (1, 2, H', W')
    print("out.shape:", out.shape)

out1_list.insert(0, ori_img)

2. The same procedure as in 1, using the randomly initialized parameters after training

In [ ]

import paddle
import paddle.nn as nn
import paddle.vision.transforms as T
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image


# Read the image
ori_img = Image.open('/home/aistudio/data/cat_12_train/LTMkHx9w2nfsRiZec3bEVtmujpv7qS1y.jpg').convert('RGB')
# Normalization transform
transform = T.Compose([
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))
])
# Normalize the image
img = transform(ori_img)  # img.shape: (H, W, C) -> (C, H, W)
# Wrap the image as a tensor with a batch dimension to match Paddle's convolution API
# Conv2D expects a 4-D Tensor of shape (N, C, H, W) or (N, H, W, C); the default is (N, C, H, W)
img = paddle.to_tensor([img], dtype='float32')

# Load the saved parameters
state_dict = paddle.load('/home/aistudio/ResNet50_vc_training.pdparams')
# Print the parameter names; the first conv layer's weight is named conv1_1._conv.weight
# print(state_dict.keys())
# state_dict['conv1_1._conv.weight'].shape: (out_channels, in_channels, H, W)
print("state_dict['conv1_1._conv.weight'].shape:", state_dict['conv1_1._conv.weight'].shape)

# Take the first 2 kernels of conv1_1._conv.weight
# conv1_1_kernel.shape: (2, 3, H, W)
conv1_1_kernel = state_dict['conv1_1._conv.weight'][0:2]

# Loop over the three input channels, applying each channel's kernels separately
out2_list = []
for i in range(conv1_1_kernel.shape[1]):
    # conv1_1_kernelx.shape: (2, 1, H, W)
    conv1_1_kernelx = conv1_1_kernel[:, i:i+1]
    print("conv1_1_kernelx.shape:", conv1_1_kernelx.shape)
    # Build a single-channel convolution; the hyperparameters must match those of the chosen model
    conv = nn.Conv2D(in_channels=1, out_channels=2, kernel_size=3, stride=2, bias_attr=False)
    key = conv.weight.name
    conv.set_state_dict({key:conv1_1_kernelx})
    # Feature extraction
    out = conv(img[:, i:i+1])
    out2_list.append(out.numpy()[0, 0])
    out2_list.append(out.numpy()[0, 1])
    # out.shape: (1, 2, H', W')
    print("out.shape:", out.shape)

out2_list.insert(0, ori_img)

3. The same procedure as in 1, using pretrained parameters

In [ ]

import paddle
import paddle.nn as nn
import paddle.vision.transforms as T
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image


# Read the image
ori_img = Image.open('/home/aistudio/data/cat_12_train/LTMkHx9w2nfsRiZec3bEVtmujpv7qS1y.jpg').convert('RGB')
# Normalization transform
transform = T.Compose([
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))
])
# Normalize the image
img = transform(ori_img)  # img.shape: (H, W, C) -> (C, H, W)
# Wrap the image as a tensor with a batch dimension to match Paddle's convolution API
# Conv2D expects a 4-D Tensor of shape (N, C, H, W) or (N, H, W, C); the default is (N, C, H, W)
img = paddle.to_tensor([img], dtype='float32')

# Load the saved parameters
state_dict = paddle.load('/home/aistudio/ResNet50_vc_pretrained.pdparams')
# Print the parameter names; the first conv layer's weight is named conv1_1._conv.weight
# print(state_dict.keys())
# state_dict['conv1_1._conv.weight'].shape: (out_channels, in_channels, H, W)
print("state_dict['conv1_1._conv.weight'].shape:", state_dict['conv1_1._conv.weight'].shape)

# Take the first 2 kernels of conv1_1._conv.weight
# conv1_1_kernel.shape: (2, 3, H, W)
conv1_1_kernel = state_dict['conv1_1._conv.weight'][0:2]

# Loop over the three input channels, applying each channel's kernels separately
out3_list = []
for i in range(conv1_1_kernel.shape[1]):
    # conv1_1_kernelx.shape: (2, 1, H, W)
    conv1_1_kernelx = conv1_1_kernel[:, i:i+1]
    print("conv1_1_kernelx.shape:", conv1_1_kernelx.shape)
    # Build a single-channel convolution; the hyperparameters must match those of the chosen model
    conv = nn.Conv2D(in_channels=1, out_channels=2, kernel_size=3, stride=2, bias_attr=False)
    key = conv.weight.name
    conv.set_state_dict({key:conv1_1_kernelx})
    # Feature extraction
    out = conv(img[:, i:i+1])
    out3_list.append(out.numpy()[0, 0])
    out3_list.append(out.numpy()[0, 1])
    # out.shape: (1, 2, H', W')
    print("out.shape:", out.shape)

out3_list.insert(0, ori_img)

4. Visualization comparison

In [ ]

import numpy as np
import matplotlib.pyplot as plt


# One row per map: left = random init, middle = after training, right = pretrained
plt.figure(figsize=(30, 30))
for i in range(len(out1_list)):
    plt.subplot(len(out1_list), 1, i+1)
    plt.axis('off')
    plt.imshow(np.concatenate([np.asarray(out1_list[i]), np.asarray(out2_list[i]), np.asarray(out3_list[i])], axis=1))
plt.show()


VII. Summary

  • The project builds on models from PaddleClas and uses the Paddle high-level API, which greatly reduces the amount of code.
  • Possible improvements
  1. Fine-tune the various hyperparameters
  2. Add or remove data augmentations to find a better combination
  3. Swap in a different model
  4. Add a learning-rate decay schedule (a sketch is shown after this list)
  5. Use a pretrained model and fine-tune it
  6. Ensemble multiple models by voting on their predictions
  7. Multi-scale prediction
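As a concrete illustration of item 4, the fixed learning rate could be replaced by one of Paddle's built-in schedulers. The cell below is only a sketch: the scheduler choice and values are illustrative, it reuses epochs, model, loss, and acc from the earlier cells, and how often the scheduler is stepped depends on the callbacks passed to model.fit.

In [ ]

import paddle

# Example only: cosine decay instead of a constant learning rate of 1e-4
lr_scheduler = paddle.optimizer.lr.CosineAnnealingDecay(learning_rate=1e-4, T_max=epochs)
optimizer = paddle.optimizer.Adam(learning_rate=lr_scheduler, parameters=model.parameters(), weight_decay=1e-5)
model.prepare(optimizer, loss, acc)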