Week 5 Notes: ShuffleNet, EfficientNet & Transfer Learning

Videos + Papers

ShuffleNet v1 & v2

ShuffleNet v1


This network introduces the idea of channel shuffle, which works better than plain group convolution. A ShuffleNet unit is built from group convolutions (GConv) and depthwise convolutions (DWConv): it reduces the parameter count while still letting information flow between the different groups.
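As a sketch of the channel shuffle operation (my own PyTorch illustration, not the official implementation): reshape the channel dimension into (groups, channels_per_group), transpose those two axes, and flatten back, so channels from different groups end up interleaved.

```python
import torch

def channel_shuffle(x, groups):
    # x: (N, C, H, W); rearrange channels so that information
    # can flow between the groups of a preceding group convolution
    n, c, h, w = x.size()
    x = x.view(n, groups, c // groups, h, w)  # split channels into groups
    x = x.transpose(1, 2).contiguous()        # swap group and channel axes
    return x.view(n, c, h, w)                 # flatten back

x = torch.arange(8).float().view(1, 8, 1, 1)
y = channel_shuffle(x, groups=2)
print(y.view(-1).tolist())  # [0.0, 4.0, 1.0, 5.0, 2.0, 6.0, 3.0, 7.0]
```

After the shuffle, each group of the next group convolution sees channels that originated in every group of the previous one.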

ShuffleNet v2

Conclusions:
Factors that affect model inference speed:

  • FLOPs (computational complexity)
  • MAC (memory access cost)
  • degree of parallelism
  • target platform (GPU vs. ARM)

Four guidelines for designing efficient networks

  1. With FLOPs held constant, MAC is minimized when a convolution layer's input and output feature maps have the same number of channels
  2. With FLOPs held constant, MAC grows as the number of groups in a group convolution (GConv) increases
  3. The more fragmented the network design, the slower it runs
  4. The cost of element-wise operations is not negligible
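Guideline 1 can be sanity-checked with a quick calculation: for a 1×1 convolution over an h×w feature map with c1 input and c2 output channels, FLOPs are h·w·c1·c2 while MAC is h·w·(c1+c2) + c1·c2. Holding c1·c2 (and thus FLOPs) fixed, the balanced split gives the smallest MAC:

```python
def mac_1x1(h, w, c1, c2):
    # memory access cost of a 1x1 convolution:
    # read input (h*w*c1) + write output (h*w*c2) + read weights (c1*c2)
    return h * w * (c1 + c2) + c1 * c2

# the three splits below all have c1*c2 = 4096, i.e. identical FLOPs
for c1, c2 in [(64, 64), (32, 128), (16, 256)]:
    print(c1, c2, mac_1x1(56, 56, c1, c2))
```

The balanced (64, 64) split gives the lowest MAC, and MAC keeps growing as the channel counts become more unbalanced.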

EfficientNet

Core idea of the paper

The baseline network, EfficientNet-B0, was obtained through NAS (neural architecture search).

EfficientNet B0

In the same way, scaling the network's width (along with depth and input resolution) yields the B0–B7 family.
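The compound scaling rule from the EfficientNet paper can be sketched as follows (α=1.2, β=1.1, γ=1.15 are the constants the paper reports from a grid search on B0; the mapping of φ to model index is only approximate):

```python
# compound scaling constants reported for EfficientNet-B0
# (found by grid search under the constraint alpha * beta^2 * gamma^2 ~= 2)
alpha, beta, gamma = 1.2, 1.1, 1.15

def compound_scale(phi):
    depth = alpha ** phi       # multiplier on the number of layers
    width = beta ** phi        # multiplier on the number of channels
    resolution = gamma ** phi  # multiplier on the input image size
    return depth, width, resolution

for phi in range(4):  # phi grows roughly with the model index B0, B1, ...
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution x{r:.2f}")
```

Scaling all three dimensions with one coefficient φ keeps them balanced, instead of growing only width or only depth.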

multi-head self-attention

Self-attention is essentially the attention formula: Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, where Q, K, V are the query, key, and value projections of the input.
Multi-head self-attention runs several self-attention operations in parallel and combines their outputs, similar in spirit to group convolution.
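A minimal PyTorch sketch of multi-head self-attention (my own illustration, not from the course code): project to Q/K/V, split into heads, apply softmax(QKᵀ/√d_k)V per head, then concatenate and project back.

```python
import math
import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, dim, num_heads):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, dim * 3)   # fused Q, K, V projection
        self.proj = nn.Linear(dim, dim)      # output projection

    def forward(self, x):                    # x: (B, N, dim)
        b, n, d = x.shape
        qkv = self.qkv(x).reshape(b, n, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)  # each: (B, heads, N, head_dim)
        attn = (q @ k.transpose(-2, -1)) / math.sqrt(self.head_dim)
        attn = attn.softmax(dim=-1)           # attention weights per head
        out = (attn @ v).transpose(1, 2).reshape(b, n, d)  # concat heads
        return self.proj(out)

mhsa = MultiHeadSelfAttention(dim=64, num_heads=8)
out = mhsa(torch.randn(2, 10, 64))
print(out.shape)  # torch.Size([2, 10, 64])
```

Each head attends over the same sequence but in its own low-dimensional subspace, analogous to how each group in a group convolution sees its own channel slice.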

Code Exercises

VGG Cats vs. Dogs

First, the original code, which uses SGD as the optimizer.



After switching to Adam, accuracy improved, approaching that of ResNet50.


AI Art Appreciation Challenge

Starting from the champion's code, I simplified some steps (e.g. the data augmentation and result voting) and changed others (e.g. the final training and testing code), arriving at the code below. In the practice competition it reached fairly high accuracy; due to time and hardware constraints I did not train further or tune the hyperparameters.

Code

First, import the packages:

import os
import time
import copy
import sys
import csv
import math
import shutil
import glob
import collections

import cv2
import numpy as np
import pandas as pd
from PIL import Image
from tqdm import tqdm

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torch.optim import lr_scheduler
from torch.utils.data import Dataset, DataLoader
from torch.nn.parameter import Parameter
from torch.backends import cudnn

import torchvision
from torchvision import models, transforms, datasets, utils
from torchvision.models import EfficientNet_B4_Weights, EfficientNet_B3_Weights, ResNet50_Weights

# import IBN_ResNet as ibn
# from sklearn.model_selection import train_test_split, StratifiedKFold, KFold

Mount Google Drive (make sure Art.zip has been uploaded to the Drive first):

from google.colab import drive
drive.mount('/content/drive')

Unzip the archive and change into the folder:

!unzip "/content/drive/MyDrive/Art.zip" -d "/content/sample_data/art"
os.chdir("/content/sample_data/art/")

Next, organize the dataset and preprocess the images. This follows the champion code, with a few parameters changed to fit the newer version of PyTorch.

def Train_Testloader(data):
    np.random.shuffle(data)
    lab_dict,label = dictmap(data,num=1)
    train_size = int((len(data)) * 0.8)
    # test_size = len(data_jpg) - train_size
    train_idx = data[:train_size]
    test_idx = data[train_size:]
    train_dataset = mydataset(train_idx,lab_dict,loader=default_loader,mode='train')
    test_dataset = mydataset(test_idx,lab_dict,loader=default_loader,mode='test')
    
    train_loader = DataLoader(train_dataset,batch_size=8,shuffle=True,num_workers=10,pin_memory=True)
    test_loader = DataLoader(test_dataset,batch_size=8,shuffle=False,num_workers=10,pin_memory=True)
    
    return train_loader,test_loader
def dictmap(data, num=1):
    # map each "train/<id>.jpg" path to its label, looked up in the
    # global `file` DataFrame read from train.csv
    label = []
    fnames = []
    for line in data:
        name = int(line.split('.')[0].split('/')[num])
        fnames.append(name)
        label.append(file['label'][name])
    
    lab_dict = dict(zip(fnames,label))
    
    return lab_dict,label


def default_loader(path):
    return Image.open(path).convert('RGB')
class mydataset(Dataset):
    def __init__(self,data,lab_dict,loader=default_loader,mode='train'):
        super(mydataset,self).__init__() 
        la = []

        for line in data:
            la.append(lab_dict[int(line.split('.')[0].split('/')[1])])
            
        
        self.data = data
        self.loader = loader
        self.mode = mode
        if self.mode == 'train':
            self.transforms = transforms.Compose([
                transforms.Resize(660),
                transforms.RandomCrop(600),#660 600 93.125
                transforms.RandomHorizontalFlip(),
                # transforms.RandomVerticalFlip(),
                # transforms.RandomRotation(40),
                # transforms.RandomAffine(20),
                # transforms.ColorJitter(0.3,0.6,0.3,0.2),#0.6
                transforms.ToTensor(),
                transforms.Normalize(mean=[0.485,0.456,0.406],std=[0.229,0.224,0.225]),
                transforms.RandomErasing(p=0.6,scale=(0.02,0.33),ratio=(0.3,0.33),value=0,inplace=False)
            ])
        else:
            # keep random crop/flip at evaluation time too, so test-time
            # augmentation (TTA) can vote over several random views later
            self.transforms = transforms.Compose([
                transforms.Resize(660),
                transforms.RandomCrop(600),
                transforms.RandomHorizontalFlip(),
                # transforms.RandomVerticalFlip(),
                # transforms.RandomRotation(40),
                # transforms.RandomAffine(20),
                transforms.ToTensor(),
                transforms.Normalize(mean=[0.485,0.456,0.406],std=[0.229,0.224,0.225])
            ])
        
        self.la = la
        
    
    def __getitem__(self,index):
        fn = self.data[index] 
        label = self.la[index]
        img = self.loader(fn)
        if self.transforms is not None:
            img = self.transforms(img)
        return img,torch.from_numpy(np.array(label))
    
    def __len__(self):
        return len(self.data)

One important point here: pay attention to the class distribution of the dataset. My own data-splitting function did not account for it, which led to a large gap between train_acc and val_acc.
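A hypothetical fix (not the code I actually used) is a stratified split: sample each class separately so train and validation share the same label distribution.

```python
import numpy as np

def stratified_split(files, labels, train_frac=0.8, seed=0):
    # hypothetical helper: split each class separately so train and
    # validation sets share the same label distribution
    rng = np.random.default_rng(seed)
    train_files, val_files = [], []
    for lab in set(labels):
        idx = [i for i, l in enumerate(labels) if l == lab]
        rng.shuffle(idx)
        cut = int(len(idx) * train_frac)
        train_files += [files[i] for i in idx[:cut]]
        val_files += [files[i] for i in idx[cut:]]
    return train_files, val_files

files = [f"{i}.jpg" for i in range(100)]
labels = [i % 4 for i in range(100)]  # four balanced classes
tr, va = stratified_split(files, labels)
print(len(tr), len(va))  # 80 20
```

sklearn's StratifiedKFold (commented out in the imports above) implements the same idea, with cross-validation folds on top.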
Next, build the dataset:

data_jpg = np.array(glob.glob('train/*.jpg'))
file = pd.read_csv('train.csv')
train_loader,test_loader = Train_Testloader(data_jpg)

The network follows the champion code, using a pretrained efficientnet_b3. No layers are frozen here: freezing all but the last few layers turned out to work poorly, so the whole network is trained.

net = models.efficientnet_b3(weights = EfficientNet_B3_Weights.IMAGENET1K_V1)
net.classifier[0] = nn.Dropout(p=0.4, inplace=True)   
net.classifier[1] = nn.Linear(1536,49)

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net.to(device)
optimizer = torch.optim.Adam(net.parameters(),lr=3e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
criterion = nn.CrossEntropyLoss()

def train_model(model, criterion, optimizer, scheduler, num_epochs, train_loader, test_loader):
    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0
    best_epoch = 0
    trainx_loss = []
    trainx_acc = []
    testx_loss = []
    testx_acc = []
    
    
    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-'*20)
        
        start = time.time()
        
        train_data_lenth = 0
        test_data_lenth = 0    
        running_loss = 0.0
        running_corrects = 0
        test_loss = 0.0 
        test_corrects = 0
        
        
        model.train()
        #train    
        for data in train_loader:
            inputs,labels = data
            inputs = inputs.to(device)
            labels = labels.to(device)
            #zero the parameter gradients
            optimizer.zero_grad()
            
            #mixup
            # inputs, targets_a, targets_b, lam = mixup_data(inputs, labels, alpha=0.2, use_cuda=True)
            outputs = model(inputs)
            #forward
            
            #mixup loss
            # loss_mix = mixup_criterion(targets_a,targets_b,lam)
            # loss = loss_mix(criterion,outputs)
            
            _,preds = torch.max(outputs.data,1)
            #original loss
            loss = criterion(outputs,labels)

            #backward
            loss.backward()
            optimizer.step()
            
            #statistics
            running_loss +=loss.item() * inputs.size(0)
            running_corrects += torch.sum(preds == labels.data).item()
            train_data_lenth += inputs.size(0)
        
        scheduler.step()
        
        epoch_loss = running_loss / train_data_lenth
        trainx_loss.append(epoch_loss)
        epoch_acc = running_corrects / train_data_lenth
        trainx_acc.append(epoch_acc)
        run_time = time.time() - start
        print('Train Loss:{:.4f} Acc{:.4f} Time{:.2f}s'.format(epoch_loss,epoch_acc,run_time))

        
        
        #test
        
        model.eval()
        with torch.no_grad():
            for data in test_loader:
                inputs,labels = data
                inputs = inputs.to(device)
                labels = labels.to(device)
                #forward
                outputs = model(inputs)
                _,preds = torch.max(outputs.data,1)
                loss = criterion(outputs,labels)
                test_loss +=loss.item() * inputs.size(0)
                test_corrects += torch.sum(preds == labels.data).item()
                test_data_lenth += inputs.size(0) 
            
            
            Test_loss = test_loss / test_data_lenth
            testx_loss.append(Test_loss)
            Test_acc = test_corrects / test_data_lenth
            testx_acc.append(Test_acc)
            
            print('Test Loss:{:.4f} Acc{:.4f}'.format(Test_loss,Test_acc))
    
            
            if Test_acc>best_acc:
                best_acc = Test_acc
                best_epoch = epoch
                best_model_wts = copy.deepcopy(model.state_dict())
            print('Best Acc:{:.4f}'.format(best_acc))
        
        # checkpoint the best weights so far after every epoch
        pthpath = '/content/resnet.pth'
        torch.save(best_model_wts, pthpath)
        
    print('Training complete, Best Epoch: {:.0f} Best Acc: {:.4f}'.format(best_epoch, best_acc))
train_model(net,criterion,optimizer,scheduler,20,train_loader,test_loader)

The training part above also follows the champion code; some functions were removed or modified to fit the Colab environment. Next, run inference on the test set with TTA:

net.eval()
trans = transforms.Compose([
                transforms.Resize(660),
                transforms.RandomCrop(600),
                transforms.RandomHorizontalFlip(),
                # transforms.RandomVerticalFlip(),
                # transforms.RandomRotation(40),
                # transforms.RandomAffine(20),
                transforms.ToTensor(),
                transforms.Normalize(mean=[0.485,0.456,0.406],std=[0.229,0.224,0.225])
            ])
TTA_times = 9
submit = {'uuid': [], 'label': []}
with torch.no_grad():
  for i in range(0, 800):
    img_path = '/content/sample_data/art/test/%d.jpg' % i
    raw_img = Image.open(img_path).convert('RGB')
    results = np.zeros(49)
    for j in range(TTA_times):
      img = trans(raw_img)
      img = img.unsqueeze(0).to(device)
      out = net(img)
      out = torch.softmax(out, dim=1)
      _, pred = torch.max(out.cpu(), dim=1)

      results[pred] += 1
    pred = np.argmax(results)
    print(i, ',', pred)
    submit['uuid'].append(i)
    submit['label'].append(pred)
df = pd.DataFrame(submit)
df.to_csv(os.path.join('/content/sample_data/art/', 'result.csv'), encoding='utf-8', index=False, header=False)

Finally, the results are written out as a CSV. This part does not use the champion's code; I used a different approach that worked better, and did not apply voting across multiple models.
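One alternative to the hard voting in the TTA loop above is to average the softmax probabilities across the augmented views, which keeps confidence information that argmax voting throws away. A sketch with a hypothetical helper (not the code actually submitted):

```python
import torch
import torch.nn as nn

def tta_predict(model, views):
    # average softmax probabilities over the augmented views, then argmax;
    # soft averaging keeps per-class confidence instead of only vote counts
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(v), dim=1) for v in views])
    return probs.mean(dim=0).argmax(dim=1)

model = nn.Linear(4, 3)                        # stand-in classifier for the sketch
views = [torch.randn(1, 4) for _ in range(9)]  # 9 augmented views, as in the loop above
print(tta_predict(model, views).shape)  # torch.Size([1])
```

In the real pipeline, `views` would be nine applications of the `trans` transform to `raw_img`, and `model` would be the trained `net`.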

Problems Encountered

  1. The training-set and validation-set results differed a lot, probably because the data distribution was not taken into account
  2. The pace feels too fast lately; studying two tracks at once leaves little time to consolidate the basics, so I need to set aside time to read books on my own.