cats_dogs classification

Table of Contents

Preface

Folder layout

Train/test split

Backbone network

Data processing

Training script

Test script

Results


 

Preface

It took me two days of training and code study, but I now understand essentially every piece of the code and can finally tie together the scattered knowledge I had picked up before.

Kaggle is a platform where developers and data scientists can host machine learning competitions, share datasets, and write and publish code; it offers plenty of good projects and resources for anyone learning machine learning or deep learning. I recently started learning a very popular deep learning framework, PyTorch, so in this post we will use PyTorch to implement a classic entry-level image recognition project: cat vs. dog classification.
Deep learning starts with data, so let us look at the data first. The dataset contains 25000 cat and dog images in total, 12500 of each class. Download: https://www.kaggle.com/c/dogs-vs-cats-redux-kernels-edition/data. The download contains two folders, train and test1, used for training and testing respectively. The train folder holds the 25000 images, named cat.0.jpg, cat.1.jpg, dog.0.jpg, dog.1.jpg, and so on.

 

Folder layout

imgs holds the raw downloaded images, not yet split by class. data stores the split dataset, log holds the TensorBoard event files, output holds the saved weight files, and test is where the images for single-image prediction are placed. The layout is sketched below.
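
A rough sketch of the assumed project layout (the data subfolders are created by make_file.py in the next section):

project/
├── imgs/      raw Kaggle images (cat.0.jpg ... dog.12499.jpg)
├── data/      split dataset created by make_file.py
│   ├── train/cats, train/dogs
│   └── test/cats,  test/dogs
├── log/       TensorBoard event files
├── output/    saved weight files (.pth)
└── test/      images for single-image prediction (test.py)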

Train/test split

make_file.py 

Note the two paths: original_dataset_dir points at the raw downloaded images, and base_dir is where the split data will be written.

The data will be split into the data folder (the script below uses ./data_2 as base_dir; point it at whatever folder you intend to pass to train.py). Make sure the numbers of cat and dog photos are equal; around 1000 of each is enough, with names like cat.0.jpg, cat.1.jpg, ..., dog.0.jpg, dog.1.jpg.

import os
import numpy as np
import shutil
# Path to the raw Kaggle dataset
original_dataset_dir = './imgs'
total_num = int(len(os.listdir(original_dataset_dir)) / 2)
random_idx = np.array(range(total_num))
np.random.shuffle(random_idx)

# Destination for the split dataset (should match the --data-dir passed to train.py, default ./data)
base_dir = './data_2'
if not os.path.exists(base_dir):
    os.mkdir(base_dir)

# Train/test split
sub_dirs = ['train', 'test']
animals = ['cats', 'dogs']
train_idx = random_idx[:int(total_num * 0.8)]
test_idx = random_idx[int(total_num * 0.8):]  # start where the training indices end, so train and test do not overlap
numbers = [train_idx, test_idx]
for idx, sub_dir in enumerate(sub_dirs):
    dir = os.path.join(base_dir, sub_dir)
    if not os.path.exists(dir):
        os.mkdir(dir)
    for animal in animals:
        animal_dir = os.path.join(dir, animal)  #
        if not os.path.exists(animal_dir):
            os.mkdir(animal_dir)
        fnames = [animal[:-1] + '.{}.jpg'.format(i) for i in numbers[idx]]
        for fname in fnames:
            src = os.path.join(original_dataset_dir, fname)
            dst = os.path.join(animal_dir, fname)
            shutil.copyfile(src, dst)

        # Verify the number of images copied into each split folder
        print(animal_dir + ' total images : %d' % (len(os.listdir(animal_dir))))

The output is as follows:

 

Backbone network

SimpleNet.py, with four convolutional blocks.

import torch.nn as nn
import torch
class SimpleNet(nn.Module):

    def __init__(self):
        super(SimpleNet, self).__init__()
        # Four convolutional blocks for feature extraction
        self.conv1 = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, stride=1, padding=0),
            nn.BatchNorm2d(8),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        self.conv2 = nn.Sequential(
            nn.Conv2d(in_channels=8, out_channels=16, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        self.conv3 = nn.Sequential(
            nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, stride=1, padding=0),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        self.conv4 = nn.Sequential(
            nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, stride=1, padding=0),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        # self.conv5 = nn.Sequential(
        #     nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=1, padding=0),
        #     nn.BatchNorm2d(128),
        #     nn.ReLU(),
        #     nn.MaxPool2d(2)
        # )
        # Classifier
        self.classifier = nn.Sequential(
            nn.Linear(64 * 7 * 7, 512),
            nn.Linear(512, 2)
        )

    def forward(self, x):
        x = self.conv1(x)
        #print(x.shape)
        x = self.conv2(x)
        #print(x.shape)
        x = self.conv3(x)
        #print(x.shape)
        x = self.conv4(x)
        #print(x.shape)
      # x=  self.conv5(x)
        #print(x.shape)
        x = x.view(x.size(0), -1)
       # print(x.shape)
        x = self.classifier(x)
        #print(x.shape)
        return x

if __name__ == '__main__':
    model = SimpleNet()
    print(model)
    inputs = torch.randn(4,3,150,150)
    output = model(inputs)
    print(output.shape)

The printed network structure (and the intermediate feature-map shapes, with the debug prints enabled) is:

SimpleNet(
  (conv1): Sequential(
    (0): Conv2d(3, 8, kernel_size=(3, 3), stride=(1, 1))
    (1): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (conv2): Sequential(
    (0): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (conv3): Sequential(
    (0): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1))
    (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (conv4): Sequential(
    (0): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1))
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (classifier): Sequential(
    (0): Linear(in_features=3136, out_features=512, bias=True)
    (1): Linear(in_features=512, out_features=2, bias=True)
  )
)
torch.Size([4, 8, 74, 74])
torch.Size([4, 16, 37, 37])
torch.Size([4, 32, 17, 17])
torch.Size([4, 64, 7, 7])
torch.Size([4, 2])
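
Where does the 64 * 7 * 7 input of the classifier come from? The spatial size can be traced by hand with the standard convolution/pooling output-size formula. Below is a minimal sketch (not part of the project code) for a 150x150 input; the sizes match the shapes printed above.

def conv_out(size, kernel=3, stride=1, padding=0):
    # floor((size + 2*padding - kernel) / stride) + 1
    return (size + 2 * padding - kernel) // stride + 1

size = 150
for padding in [0, 1, 0, 0]:                # paddings of conv1..conv4
    size = conv_out(size, 3, 1, padding)    # 3x3 convolution
    size = conv_out(size, 2, 2, 0)          # 2x2 max pooling, stride 2
    print(size)                             # 74, 37, 17, 7

# 64 channels * 7 * 7 = 3136, the in_features of the first Linear layer.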

Data processing

ImageDataset.py. Note that every image is resized and center-cropped to 150x150.

transforms.Compose handles the preprocessing, while image_datasets and dataloaders take care of loading the data.

import os
import torch
from torch.utils.data import DataLoader
import argparse
from torchvision import transforms,datasets
from PIL import Image

def readImg(path):
    # Mirror torchvision's default loader and force 3-channel RGB
    return Image.open(path).convert('RGB')

def ImageDataset(args):
    data_transforms = {
        'train':transforms.Compose([
            transforms.Resize(150),
            transforms.CenterCrop(150),
            #transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
        ]),
        'test':transforms.Compose([
            transforms.Resize(150),
            transforms.CenterCrop(150),
            #transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
        ]),
        }
    data_dir = args.data_dir
    image_datasets = {x:datasets.ImageFolder(os.path.join(data_dir,x),
                        data_transforms[x],loader=readImg)
                        for x in ['train','test']}
    dataloaders = {x: DataLoader(image_datasets[x],
                        batch_size=args.batch_size, shuffle=(x == 'train'),
                        num_workers=args.num_workers)
                        for x in ['train', 'test']}
    dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'test']}
    class_names = image_datasets['train'].classes
    return dataloaders,dataset_sizes,class_names
if __name__=='__main__':
    parser = argparse.ArgumentParser(description='classification')
    parser.add_argument('--data-dir', type=str, default='./data')
    parser.add_argument('--num-workers', type=int, default=0)
    parser.add_argument('--batch-size', type=int, default=4)
    args = parser.parse_args()
    dataloaders, dataset_sizes, class_names = ImageDataset(args)
    print(len(dataloaders))
    print(class_names)
    print(dataset_sizes)
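
For reference, datasets.ImageFolder assigns label indices from the subfolder names in alphabetical order, so with the layout created by make_file.py, cats maps to 0 and dogs maps to 1. The snippet below is a minimal sanity check (not part of the original scripts, assuming the ./data layout described above) that pulls a single batch:

import argparse
from ImageDataset import ImageDataset

# Build the args by hand instead of parsing the command line
args = argparse.Namespace(data_dir='./data', batch_size=4, num_workers=0)
dataloaders, dataset_sizes, class_names = ImageDataset(args)

print(class_names)            # ['cats', 'dogs'] -> label indices 0 and 1
images, labels = next(iter(dataloaders['train']))
print(images.shape, labels)   # torch.Size([4, 3, 150, 150]) and a tensor of 0/1 labels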

Training script

train.py

'''
Training script
'''
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import time
import os
import json
from math import ceil
import argparse
import copy
from ImageDataset import ImageDataset
from SimpleNet import SimpleNet
from tensorboardX import SummaryWriter

writer = SummaryWriter(log_dir='log')
def train_model(args, model, criterion, optimizer, scheduler, num_epochs, dataset_sizes, use_gpu):
    since = time.time()
    best_model_wts = copy.deepcopy(model.state_dict())  # snapshot of the weights; updated when test accuracy improves
    best_acc = 0.0
    device = torch.device('cuda' if use_gpu else 'cpu')
    for epoch in range(args.start_epoch, num_epochs):
        # Each epoch has a training phase and a test (validation) phase
        for phase in ['train', 'test']:
            if phase == 'train':
                scheduler.step(epoch)
                # Set the model to training mode
                model.train()
            else:
                # Set the model to evaluation mode
                model.eval()

            running_loss = 0.0
            running_corrects = 0
            tic_batch = time.time()
            # Iterate over the data, one batch at a time
            # (dataloaders is the module-level dict created in __main__ below)
            for i, (inputs, labels) in enumerate(dataloaders[phase]):
                inputs = inputs.to(device)
                labels = labels.to(device)
                # Zero the parameter gradients
                optimizer.zero_grad()
                # Forward pass
                # Track gradients only in the training phase
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)
                    # Backward pass and optimizer step only in the training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # Accumulate loss and accuracy statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

                batch_loss = running_loss / (i * args.batch_size + inputs.size(0))
                batch_acc = running_corrects.double() / (i * args.batch_size + inputs.size(0))

                if phase == 'train' and (i + 1) % args.print_freq == 0:
                    print(
                        '[Epoch {}/{}]-[batch:{}/{}] lr:{:.6f} {} Loss: {:.6f}  Acc: {:.4f}  Time: {:.4f} sec/batch'.format(
                            epoch + 1, num_epochs, i + 1, ceil(dataset_sizes[phase] / args.batch_size),
                            scheduler.get_lr()[0], phase, batch_loss, batch_acc,
                            (time.time() - tic_batch) / args.print_freq))
                    tic_batch = time.time()

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            if epoch == 0 and os.path.exists('result.txt'):
                os.remove('result.txt')  # start each run with a fresh result log
            with open('result.txt', 'a') as f:
                f.write('Epoch:{}/{} {} Loss: {:.4f} Acc: {:.4f} \n'.format(epoch + 1, num_epochs, phase, epoch_loss,
                                                                            epoch_acc))

            print('{}, Epoch-{} Loss: {:.4f} Acc: {:.4f}'.format(phase,epoch, epoch_loss, epoch_acc))

            writer.add_scalar(phase + '/Loss', epoch_loss, epoch)
            writer.add_scalar(phase + '/Acc', epoch_acc, epoch)

        if (epoch + 1) % args.save_epoch_freq == 0:
            if not os.path.exists(args.save_path):
                os.makedirs(args.save_path)
            torch.save(model.state_dict(), os.path.join(args.save_path, "epoch_" + str(epoch) + ".pth"))

        # Deep-copy the weights whenever the test accuracy improves
        if phase == 'test' and epoch_acc > best_acc:
            best_acc = epoch_acc
            best_model_wts = copy.deepcopy(model.state_dict())

    # Save the model graph to TensorBoard
    writer.add_graph(model, (inputs,))

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
    print('Best val Accuracy: {:4f}'.format(best_acc))

    # Load the best model weights before returning
    model.load_state_dict(best_model_wts)
    return model


if __name__ == '__main__':

    parser = argparse.ArgumentParser(description='classification')
    # Root directory of the image data
    parser.add_argument('--data-dir', type=str, default='data')
    parser.add_argument('--batch-size', type=int, default=4)
    parser.add_argument('--num-epochs', type=int, default=50)
    parser.add_argument('--lr', type=float, default=0.001)
    parser.add_argument('--num-workers', type=int, default=0)
    parser.add_argument('--print-freq', type=int, default=1)
    parser.add_argument('--save-epoch-freq', type=int, default=1)
    parser.add_argument('--save-path', type=str, default='output')
    parser.add_argument('--resume', type=str, default='', help='For training from one checkpoint')
    parser.add_argument('--start-epoch', type=int, default=0, help='Corresponding to the epoch of resume')
    args = parser.parse_args()

    # read data
    dataloaders, dataset_sizes, class_names = ImageDataset(args)

    with open('class_names.json', 'w') as f:
        json.dump(class_names, f)

    # use gpu or not
    use_gpu = torch.cuda.is_available()
    print("use_gpu:{}".format(use_gpu))

    # get model
    model = SimpleNet()

    if args.resume:
        if os.path.isfile(args.resume):
            print(("=> loading checkpoint '{}'".format(args.resume)))
            model.load_state_dict(torch.load(args.resume))
        else:
            print(("=> no checkpoint found at '{}'".format(args.resume)))

    if use_gpu:
        model = torch.nn.DataParallel(model)
        model.to(torch.device('cuda'))
    else:
        model.to(torch.device('cpu'))

    # Cross-entropy loss
    criterion = nn.CrossEntropyLoss()

    # Adam optimizer over all model parameters
    optimizer_ft = optim.Adam(model.parameters(), lr=args.lr, weight_decay=0.00004)

    # Decay LR by a factor of 0.98 every 1 epoch
    exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=1, gamma=0.98)

    model = train_model(args=args,
                        model=model,
                        criterion=criterion,
                        optimizer=optimizer_ft,
                        scheduler=exp_lr_scheduler,
                        num_epochs=args.num_epochs,
                        dataset_sizes=dataset_sizes,
                        use_gpu=use_gpu)

    torch.save(model.state_dict(), os.path.join(args.save_path, 'best_model_wts.pth'))

    writer.close()
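
With the defaults above, training can be launched from the command line roughly as follows (a sketch; adjust the paths to your own layout):

python train.py --data-dir data --batch-size 4 --num-epochs 50

To resume from a saved checkpoint (weights are written as epoch_<n>.pth in output/):

python train.py --data-dir data --resume output/epoch_9.pth --start-epoch 10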

Test script

test.py

Testing approach: build the model, load the trained weights, switch to model.eval(), preprocess the input image the same way as during training, run a forward pass, and take the argmax of the output as the predicted class.

'''
Single-image classification test
'''

from PIL import Image
from torchvision import transforms
import torch
import torch.nn as nn
import os
import json
from SimpleNet import SimpleNet

def predict_image(model, image_path):
    image = Image.open(image_path).convert('RGB')

    # Center-crop the 150x150 region at test time, matching the training preprocessing
    transformation1 = transforms.Compose([
        transforms.Resize(150),
        transforms.CenterCrop(150),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])

    ])

    # Preprocess the image
    image_tensor = transformation1(image).float()

    # Add a batch dimension, since PyTorch models expect batched input
    image_tensor = image_tensor.unsqueeze_(0)

    # Move the input to the same device as the model's parameters
    device = next(model.parameters()).device
    inputs = image_tensor.to(device)

    # Predict the class of the image
    output = model(inputs)

    index = output.data.cpu().numpy().argmax()

    return index

if __name__ == '__main__':

    best_model_path = './output/best_model_wts.pth'
    model = SimpleNet()
    model = nn.DataParallel(model)
    # map_location='cpu' lets GPU-trained weights load on a CPU-only machine
    model.load_state_dict(torch.load(best_model_path, map_location='cpu'))
    model.to(torch.device('cuda' if torch.cuda.is_available() else 'cpu'))
    model.eval()

    with open('class_names.json', 'r') as f:
        class_names = json.load(f)

    img_path = './test/cat179.jpg'
    predict_class = class_names[predict_image(model, img_path)]
    print(predict_class)
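
To classify every image in the test folder instead of a single file, the loop below is a minimal sketch (not part of the original script) that can be appended inside the __main__ block above; it reuses predict_image, model, class_names and the os import already present:

    # Hedged extension: predict every image under ./test
    test_dir = './test'
    for fname in sorted(os.listdir(test_dir)):
        if fname.lower().endswith(('.jpg', '.jpeg', '.png')):
            idx = predict_image(model, os.path.join(test_dir, fname))
            print(fname, '->', class_names[idx])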

Results

Both the train and test curves converge, so the result is reasonably good. The per-epoch loss and accuracy are written to the log directory, so the curves can be viewed with TensorBoard (tensorboard --logdir log).
