Li Hongyi 2020 Machine Learning Homework 3: Convolutional Neural Network


A note before we start: this post follows Li Hongyi's machine learning homework instructions (accessing them requires a VPN) and essentially reproduces the code step by step. The official instructions use Google Colab (a free cloud platform provided by Google); I use Jupyter Notebook instead.


The data used in this post can be downloaded from Baidu Netdisk (extraction code: akti).
For some reason my own Baidu Cloud share stopped working today, so use the dataset compiled by a Bilibili user instead (it contains the data for homeworks 1-13).
Download and unzip the homework 3 data, make sure it contains the three folders training, testing, and validation, and save them to a directory of your own.


[My environment: Anaconda3 + Jupyter Notebook, Python 3.6.8]


Assignment: the collected data are all photos of food, in 11 classes: Bread, Dairy product, Dessert, Egg, Fried food, Meat, Noodles/Pasta, Rice, Seafood, Soup, and Vegetable/Fruit. We will build a CNN to classify the food images.

We can use a deep learning framework (e.g. TensorFlow or PyTorch) to build the network quickly. Here we use PyTorch; if you are new to it, the official "Deep Learning with PyTorch: A 60 Minute Blitz" tutorial covers enough of the framework to follow along.


Now follow along with me step by step~~
Before starting, import the libraries we need:

  • pandas: a powerful toolkit for analyzing structured data
  • numpy: a Python extension library supporting large multi-dimensional arrays and matrix operations
  • os: a module for working with paths and files
  • time: provides various time-related functions
  • PyTorch: the deep learning framework (when installing, note whether you have a GPU)
  • OpenCV: an open-source image processing library

If any library is missing, install it yourself (for Jupyter Notebook / Anaconda: activate your environment and run conda install <package-name>).

import os
import numpy as np
import cv2
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import pandas as pd
from torch.utils.data import DataLoader, Dataset
import time
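
If you are not sure whether your PyTorch installation has GPU support, a quick check (my own addition, not part of the original homework) is:

print(torch.__version__)         # PyTorch version
print(torch.cuda.is_available()) # True if a CUDA-capable GPU is visible

The training code below calls .cuda(), so this tells you in advance whether it will run as written.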

Create a function readfile to read the image files:

def readfile(path, label):
    # label is a boolean indicating whether to return the y labels as well
    image_dir = sorted(os.listdir(path))
    x = np.zeros((len(image_dir), 128, 128, 3), dtype=np.uint8)
    y = np.zeros((len(image_dir)), dtype=np.uint8)
    for i, file in enumerate(image_dir):
        img = cv2.imread(os.path.join(path, file))  # note: OpenCV loads images in BGR order
        x[i, :, :] = cv2.resize(img, (128, 128))
        if label:
            y[i] = int(file.split("_")[0])  # the class id is the filename prefix, e.g. "0_123.jpg" -> 0
    if label:
        return x, y
    else:
        return x

Set the path where the data is stored and load it:

workspace_dir = 'E:\\jupyter\\data\\hw3\\food-11'
print("Reading data")
train_x, train_y = readfile(os.path.join(workspace_dir, "training"), True)
print("Size of training data = {}".format(len(train_x)))
val_x, val_y = readfile(os.path.join(workspace_dir, "validation"), True)
print("Size of validation data = {}".format(len(val_x)))
test_x = readfile(os.path.join(workspace_dir, "testing"), False)
print("Size of Testing data = {}".format(len(test_x)))

Print the sizes to check:

Reading data
Size of training data = 9866
Size of validation data = 3430
Size of Testing data = 3347

The training set has 9866 images, the validation set has 3430, and the test set has 3347.

Next, apply data augmentation, which effectively enlarges the training data by generating randomly varied versions of each image:

# data augmentation for training
train_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomHorizontalFlip(), # randomly flip images horizontally
    transforms.RandomRotation(15), # randomly rotate images by up to 15 degrees
    transforms.ToTensor(), # convert to a tensor and normalize values to [0, 1] (data normalization)
])
# no data augmentation at test time
test_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.ToTensor(),
])
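
As a quick sanity check (my own sketch, not in the original homework), you can apply train_transform to a single loaded image and confirm it comes back as a 3x128x128 tensor with values in [0, 1]:

sample = train_transform(train_x[0])  # numpy HWC uint8 -> PIL -> augmented float tensor
print(sample.shape)                    # torch.Size([3, 128, 128])
print(sample.min().item(), sample.max().item())  # values lie in [0, 1]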

In PyTorch, we can use Dataset and DataLoader from torch.utils.data to "wrap" the data, which makes the subsequent training and testing much more convenient.

class ImgDataset(Dataset):
    def __init__(self, x, y=None, transform=None):
        self.x = x
        # label is required to be a LongTensor
        self.y = y
        if y is not None:
            self.y = torch.LongTensor(y)
        self.transform = transform
    def __len__(self):
        return len(self.x)
    def __getitem__(self, index):
        X = self.x[index]
        if self.transform is not None:
            X = self.transform(X)
        if self.y is not None:
            Y = self.y[index]
            return X, Y
        else:
            return X

During training we use mini-batches (which speeds up parameter updates). Set the batch_size (128 here) and wrap the data:

batch_size = 128
train_set = ImgDataset(train_x, train_y, train_transform)
val_set = ImgDataset(val_x, val_y, test_transform)
train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_set, batch_size=batch_size, shuffle=False)
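
To verify the wrapping works (a sketch of my own), fetch one batch from the loader and check the shapes:

images, labels = next(iter(train_loader))
print(images.shape)  # torch.Size([128, 3, 128, 128])
print(labels.shape)  # torch.Size([128])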

The data pipeline is now in place.
Next, use PyTorch to build the model.
We use nn.Conv2d, nn.BatchNorm2d, nn.ReLU, and nn.MaxPool2d to build a CNN with 5 convolutional blocks:

  • nn.Conv2d: convolutional layer
  • nn.BatchNorm2d: batch normalization
  • nn.ReLU: activation layer
  • nn.MaxPool2d: max pooling layer

After the convolutional blocks, the features pass through a 3-layer fully connected network that outputs the final classification.

The code is as follows:

class Classifier(nn.Module):
    def __init__(self):
        super(Classifier, self).__init__()
        # torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)
        # torch.nn.MaxPool2d(kernel_size, stride, padding)
        # input dimensions: [3, 128, 128]
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 64, 3, 1, 1),  # [64, 128, 128]
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),      # [64, 64, 64]

            nn.Conv2d(64, 128, 3, 1, 1), # [128, 64, 64]
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),      # [128, 32, 32]

            nn.Conv2d(128, 256, 3, 1, 1), # [256, 32, 32]
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),      # [256, 16, 16]

            nn.Conv2d(256, 512, 3, 1, 1), # [512, 16, 16]
            nn.BatchNorm2d(512),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),       # [512, 8, 8]
            
            nn.Conv2d(512, 512, 3, 1, 1), # [512, 8, 8]
            nn.BatchNorm2d(512),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),       # [512, 4, 4]
        )
        self.fc = nn.Sequential(
            nn.Linear(512*4*4, 1024),
            nn.ReLU(),
            nn.Linear(1024, 512),
            nn.ReLU(),
            nn.Linear(512, 11)
        )

    def forward(self, x):
        out = self.cnn(x)
        out = out.view(out.size()[0], -1)
        return self.fc(out)
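
Before training, a quick shape check (my own sketch; net is a throwaway name) confirms that a dummy batch flows through the network and produces one score per class:

net = Classifier()                   # untrained model, on the CPU
dummy = torch.randn(2, 3, 128, 128)  # a fake batch of 2 images
print(net(dummy).shape)              # expected: torch.Size([2, 11])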

Once the model is built, we can start training.

model = Classifier().cuda()
loss = nn.CrossEntropyLoss() # cross-entropy loss
optimizer = torch.optim.Adam(model.parameters(), lr=0.001) # use Adam as the optimizer
num_epoch = 30  # number of epochs

# training
for epoch in range(num_epoch):
    epoch_start_time = time.time()
    train_acc = 0.0  # accumulate accuracy and loss for each epoch
    train_loss = 0.0
    val_acc = 0.0
    val_loss = 0.0

    model.train() # make sure the model is in train mode (enables Dropout, etc.)
    for i, data in enumerate(train_loader):
        optimizer.zero_grad() # zero the gradients of the model parameters
        train_pred = model(data[0].cuda()) # forward pass to compute predictions
        batch_loss = loss(train_pred, data[1].cuda()) # compute loss (prediction and label must both be on the CPU or both on the GPU)
        batch_loss.backward() # back-propagate to compute each parameter's gradient
        optimizer.step() # update the parameters with the gradients
        train_acc += np.sum(np.argmax(train_pred.cpu().data.numpy(), axis=1) == data[1].numpy())
        train_loss += batch_loss.item()

    model.eval()
    with torch.no_grad():
        for i, data in enumerate(val_loader):
            val_pred = model(data[0].cuda())
            batch_loss = loss(val_pred, data[1].cuda())

            val_acc += np.sum(np.argmax(val_pred.cpu().data.numpy(), axis=1) == data[1].numpy())
            val_loss += batch_loss.item()

        # print the results
        print('[%03d/%03d] %2.2f sec(s) Train Acc: %3.6f Loss: %3.6f | Val Acc: %3.6f loss: %3.6f' % \
            (epoch + 1, num_epoch, time.time()-epoch_start_time, \
             train_acc/len(train_set), train_loss/len(train_set), val_acc/len(val_set), val_loss/len(val_set)))

Run it to start training.
Note: without CUDA acceleration the code above may not run as written; you can train on the CPU instead, or run it on Google Colab.
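
If you want a device-agnostic version, a minimal sketch (my own, not in the original homework) replaces the hard-coded .cuda() calls:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = Classifier().to(device)
# then, inside the training/validation loops, move each batch with
# data[0].to(device) and data[1].to(device) instead of calling .cuda()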
Training takes a little while...

[001/030] 27.24 sec(s) Train Acc: 0.237888 Loss: 0.017894 | Val Acc: 0.269096 loss: 0.016925
[002/030] 21.78 sec(s) Train Acc: 0.346949 Loss: 0.014638 | Val Acc: 0.387172 loss: 0.014022
[003/030] 21.92 sec(s) Train Acc: 0.398439 Loss: 0.013529 | Val Acc: 0.373761 loss: 0.013851
[004/030] 21.91 sec(s) Train Acc: 0.437563 Loss: 0.012768 | Val Acc: 0.324198 loss: 0.015293
[005/030] 22.03 sec(s) Train Acc: 0.462700 Loss: 0.012086 | Val Acc: 0.451603 loss: 0.012170
[006/030] 21.90 sec(s) Train Acc: 0.501318 Loss: 0.011245 | Val Acc: 0.393878 loss: 0.014743
[007/030] 21.87 sec(s) Train Acc: 0.524427 Loss: 0.010855 | Val Acc: 0.421574 loss: 0.013640
[008/030] 21.88 sec(s) Train Acc: 0.558889 Loss: 0.010173 | Val Acc: 0.334111 loss: 0.017887
[009/030] 21.88 sec(s) Train Acc: 0.571052 Loss: 0.009840 | Val Acc: 0.484257 loss: 0.012081
[010/030] 21.88 sec(s) Train Acc: 0.601155 Loss: 0.009098 | Val Acc: 0.479009 loss: 0.012725
[011/030] 21.91 sec(s) Train Acc: 0.607642 Loss: 0.008935 | Val Acc: 0.516910 loss: 0.010935
[012/030] 21.90 sec(s) Train Acc: 0.632982 Loss: 0.008324 | Val Acc: 0.513411 loss: 0.011666
[013/030] 21.91 sec(s) Train Acc: 0.652139 Loss: 0.007836 | Val Acc: 0.581633 loss: 0.009890
[014/030] 21.92 sec(s) Train Acc: 0.674032 Loss: 0.007296 | Val Acc: 0.572886 loss: 0.010333
[015/030] 21.90 sec(s) Train Acc: 0.688830 Loss: 0.007036 | Val Acc: 0.487755 loss: 0.013652
[016/030] 21.90 sec(s) Train Acc: 0.700892 Loss: 0.006790 | Val Acc: 0.597376 loss: 0.010099
[017/030] 21.89 sec(s) Train Acc: 0.719542 Loss: 0.006308 | Val Acc: 0.567930 loss: 0.010902
[018/030] 21.95 sec(s) Train Acc: 0.732009 Loss: 0.006244 | Val Acc: 0.560641 loss: 0.011560
[019/030] 21.87 sec(s) Train Acc: 0.747213 Loss: 0.005798 | Val Acc: 0.468805 loss: 0.015137
[020/030] 21.92 sec(s) Train Acc: 0.738192 Loss: 0.005893 | Val Acc: 0.618950 loss: 0.009629
[021/030] 21.84 sec(s) Train Acc: 0.751571 Loss: 0.005559 | Val Acc: 0.634985 loss: 0.009336
[022/030] 21.84 sec(s) Train Acc: 0.767484 Loss: 0.005285 | Val Acc: 0.603207 loss: 0.010479
[023/030] 21.81 sec(s) Train Acc: 0.782789 Loss: 0.004811 | Val Acc: 0.658601 loss: 0.009200
[024/030] 21.84 sec(s) Train Acc: 0.810055 Loss: 0.004360 | Val Acc: 0.420408 loss: 0.019338
[025/030] 21.85 sec(s) Train Acc: 0.801642 Loss: 0.004485 | Val Acc: 0.658017 loss: 0.009225
[026/030] 21.82 sec(s) Train Acc: 0.833671 Loss: 0.003745 | Val Acc: 0.643440 loss: 0.010182
[027/030] 21.84 sec(s) Train Acc: 0.832860 Loss: 0.003715 | Val Acc: 0.634694 loss: 0.010398
[028/030] 21.83 sec(s) Train Acc: 0.842489 Loss: 0.003504 | Val Acc: 0.638192 loss: 0.011266
[029/030] 21.82 sec(s) Train Acc: 0.839651 Loss: 0.003549 | Val Acc: 0.630029 loss: 0.010706
[030/030] 21.82 sec(s) Train Acc: 0.855159 Loss: 0.003248 | Val Acc: 0.667638 loss: 0.009832

These are my training results: the model reaches 85.52% accuracy on the training set but only 66.76% on the validation set, so it appears to be overfitting... not great.

Let's combine the training and validation sets and train once more to see the effect.

Merge the training set and validation set into a single training set:

train_val_x = np.concatenate((train_x, val_x), axis=0)
train_val_y = np.concatenate((train_y, val_y), axis=0)
train_val_set = ImgDataset(train_val_x, train_val_y, train_transform)
train_val_loader = DataLoader(train_val_set, batch_size=batch_size, shuffle=True)

Train again:

model_best = Classifier().cuda()
loss = nn.CrossEntropyLoss() # classification task, so use CrossEntropyLoss
optimizer = torch.optim.Adam(model_best.parameters(), lr=0.001) # use Adam as the optimizer
num_epoch = 30

for epoch in range(num_epoch):
    epoch_start_time = time.time()
    train_acc = 0.0
    train_loss = 0.0

    model_best.train()
    for i, data in enumerate(train_val_loader):
        optimizer.zero_grad()
        train_pred = model_best(data[0].cuda())
        batch_loss = loss(train_pred, data[1].cuda())
        batch_loss.backward()
        optimizer.step()

        train_acc += np.sum(np.argmax(train_pred.cpu().data.numpy(), axis=1) == data[1].numpy())
        train_loss += batch_loss.item()

    # print the results
    print('[%03d/%03d] %2.2f sec(s) Train Acc: %3.6f Loss: %3.6f' % \
      (epoch + 1, num_epoch, time.time()-epoch_start_time, \
      train_acc/len(train_val_set), train_loss/len(train_val_set)))

Run it:

[001/030] 26.32 sec(s) Train Acc: 0.253911 Loss: 0.017218
[002/030] 26.08 sec(s) Train Acc: 0.371540 Loss: 0.014059
[003/030] 26.08 sec(s) Train Acc: 0.424789 Loss: 0.012766
[004/030] 26.14 sec(s) Train Acc: 0.474128 Loss: 0.011835
[005/030] 26.15 sec(s) Train Acc: 0.516095 Loss: 0.010933
[006/030] 26.10 sec(s) Train Acc: 0.556333 Loss: 0.010062
[007/030] 26.99 sec(s) Train Acc: 0.589576 Loss: 0.009341
[008/030] 27.90 sec(s) Train Acc: 0.615298 Loss: 0.008768
[009/030] 28.58 sec(s) Train Acc: 0.636658 Loss: 0.008259
[010/030] 27.47 sec(s) Train Acc: 0.669901 Loss: 0.007618
[011/030] 26.39 sec(s) Train Acc: 0.687575 Loss: 0.007043
[012/030] 27.26 sec(s) Train Acc: 0.707055 Loss: 0.006702
[013/030] 27.37 sec(s) Train Acc: 0.720743 Loss: 0.006370
[014/030] 26.93 sec(s) Train Acc: 0.730144 Loss: 0.006119
[015/030] 27.02 sec(s) Train Acc: 0.743908 Loss: 0.005744
[016/030] 27.15 sec(s) Train Acc: 0.766396 Loss: 0.005277
[017/030] 27.49 sec(s) Train Acc: 0.771661 Loss: 0.005122
[018/030] 27.22 sec(s) Train Acc: 0.789260 Loss: 0.004735
[019/030] 27.08 sec(s) Train Acc: 0.803400 Loss: 0.004456
[020/030] 27.11 sec(s) Train Acc: 0.813403 Loss: 0.004269
[021/030] 27.07 sec(s) Train Acc: 0.818818 Loss: 0.004065
[022/030] 27.03 sec(s) Train Acc: 0.835289 Loss: 0.003661
[023/030] 27.03 sec(s) Train Acc: 0.846420 Loss: 0.003390
[024/030] 27.41 sec(s) Train Acc: 0.856423 Loss: 0.003150
[025/030] 27.09 sec(s) Train Acc: 0.859807 Loss: 0.003017
[026/030] 27.20 sec(s) Train Acc: 0.878911 Loss: 0.002682
[027/030] 27.07 sec(s) Train Acc: 0.893126 Loss: 0.002409
[028/030] 27.00 sec(s) Train Acc: 0.892223 Loss: 0.002377
[029/030] 27.01 sec(s) Train Acc: 0.897864 Loss: 0.002257
[030/030] 27.10 sec(s) Train Acc: 0.902978 Loss: 0.002161

The model now reaches 90.3% accuracy on the combined training set, about 5 points higher than before.
But I suspect it is still overfitting.
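
If you want to fight the overfitting directly, one common tweak (my own suggestion, not part of the original homework; model_reg is a hypothetical name) is to add Dropout to the fully connected head:

model_reg = Classifier().cuda()
model_reg.fc = nn.Sequential(
    nn.Linear(512*4*4, 1024),
    nn.ReLU(),
    nn.Dropout(0.5),  # randomly zero 50% of activations during training
    nn.Linear(1024, 512),
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(512, 11),
).cuda()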

Let's run the model on the test set and look at its predictions:

test_set = ImgDataset(test_x, transform=test_transform)
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False)

model_best.eval()
prediction = []
with torch.no_grad():
    for i, data in enumerate(test_loader):
        test_pred = model_best(data.cuda())
        test_label = np.argmax(test_pred.cpu().data.numpy(), axis=1)
        for y in test_label:
            prediction.append(y)
# save the predictions
with open("predict.csv", 'w') as f:
    f.write('Id,Category\n')
    for i, y in  enumerate(prediction):
        f.write('{},{}\n'.format(i, y))

Open the predict.csv file to see the predictions on the test set.

Since the test set has no labels, we cannot directly measure the model's accuracy; instead, pick a few images from testing and check whether the predicted labels look right.
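
For example, a quick visual spot-check (my own sketch; assumes matplotlib is installed and that the class indices follow the order listed in the assignment):

import matplotlib.pyplot as plt

classes = ["Bread", "Dairy product", "Dessert", "Egg", "Fried food", "Meat",
           "Noodles/Pasta", "Rice", "Seafood", "Soup", "Vegetable/Fruit"]
idx = 0  # pick any test-image index
plt.imshow(cv2.cvtColor(test_x[idx], cv2.COLOR_BGR2RGB))  # OpenCV stores BGR
plt.title(classes[prediction[idx]])
plt.show()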

I compared the predictions for a few pizza images, and the errors still seemed fairly large.

If you find any mistakes, please point them out. Thanks!
