GT Seminar 5
I. Notes and Caveats:
Automated testing (OLD)
To produce comprehensive and fair competition/project results, the platform fetches the trained model, runs the developer's test code to generate outputs, computes the evaluation metric from those outputs, and ranks all developers' algorithms. Re-enter the online coding environment and write the test code under /project/ev_sdk, i.e. standardize the test code's inputs and outputs according to the competition/project rules.
EV_SDK is the company's in-house standard model interface for automated testing and later model deployment. To simplify competitors' work, the SDK used for competition auto-testing has been streamlined, and either a C++ or a Python wrapper can be chosen. For details on wrapping the SDK, see the README.md under /project/ev_sdk in the online coding environment.
For simplicity, this section describes wrapping the SDK the Python way. When a test is launched through the Python interface, the system only runs the code in /project/ev_sdk/src/ji.py, so you must adapt src/ji.py to your own model name, model inputs/outputs, and inference logic.
**The ji.py shown here targets resnet18/resnet50 with the usual transforms.**
- Implement model initialization:

```python
# src/ji.py
def init():
    # Model file selected at test time
    pth = '/usr/local/ev_sdk/model/models.pkl'
    model = torch.load(pth)
    return model
```
- Implement model inference:

```python
# src/ji.py
def process_image(net, input_image, args=None):
    img = input_image
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = Image.fromarray(img)
    # Apply the matching transforms: resize, ToTensor, normalize.
    # Note that ToTensor must come before Normalize.
    norm_mean = [0.485, 0.456, 0.406]
    norm_std = [0.229, 0.224, 0.225]
    pic_transform = transforms.Compose([transforms.Resize(112),
                                        transforms.ToTensor(),
                                        transforms.Normalize(norm_mean, norm_std)])
    img = pic_transform(img)
    img = np.array(img)
    img = img.transpose(0, 2, 1)  # swap H and W to match the training layout
    img = torch.tensor([img])
    img = img.to(device)
    net.eval()
    with torch.no_grad():
        out = net(img)
    print(out)
    _, pred = torch.max(out.data, 1)
    data = json.dumps({'class': class_dict[pred[0].item()]}, indent=4)
    return data
```
The return value of the process_image interface must be a JSON-formatted string that conforms to the required schema. Wrap the result into whatever input/output format the project specifies. The platform's sample code targets an object-detection project, so you need to add detection-class information appropriate to your actual project.
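For instance, a classification result serialized the way process_image does it might look like this (the label map here is a made-up stand-in for the real class.txt contents):

```python
import json

# Hypothetical label map; the real one is loaded from class.txt at init time.
class_dict = {0: "cardboard", 1: "glass"}
pred_index = 1

# process_image must return a JSON *string*, not a dict.
data = json.dumps({"class": class_dict[pred_index]}, indent=4)
print(data)
```

The auto-tester parses this string back with a JSON parser, so any non-serializable value (e.g. a tensor) in the dict would fail the test.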
Current best performance score + accuracy
The new ji.py:
```python
import cv2
import json
import numpy as np
import torch
from skimage import io, transform, color
from PIL import Image
from torchvision import transforms, models
from data_Augmentation import *

# your own model setup
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def init():
    # Model file selected at test time
    pth = '/usr/local/ev_sdk/model/models.pkl'
    model = torch.load(pth)
    print('load model finish')
    return model

# Build the class dictionary from the training labels
class_dict = {}
f = open('/usr/local/ev_sdk/src/class.txt', 'r')
a = f.read()
class_dict = eval(a)
class_dict = {value: key for key, value in class_dict.items()}
f.close()

def process_image(net, input_image, args=None):
    print('begin process')
    transform = get_transforms(input_size=224, test_size=224, backbone=None)
    print('get transform')
    img = input_image
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = Image.fromarray(img)
    # Apply the matching transforms: resize, ToTensor, normalize.
    # ToTensor must come before Normalize.
    norm_mean = [0.485, 0.456, 0.406]
    norm_std = [0.229, 0.224, 0.225]
    # pic_transform = transforms.Compose([transforms.Resize(112), transforms.ToTensor(), transforms.Normalize(norm_mean, norm_std)])
    pic_transform = transform['val_test']  # get_transforms must expose this key
    img = pic_transform(img)
    img = np.array(img)
    img = img.transpose(0, 2, 1)
    img = torch.tensor([img])
    img = img.to(device)
    net.eval()
    with torch.no_grad():
        out = net(img)
    print(out)
    _, pred = torch.max(out.data, 1)
    data = json.dumps({'class': class_dict[pred[0].item()]}, indent=4)
    print(data)
    return data

if __name__ == '__main__':
    net = init()
    x = cv2.imread('../pic/0.jpg')
    process_image(net, x)
```
The main change here is the transform; the new model's transform is covered in the next section.
A few caveats:
- When training a model, call `model.train()` beforehand; when testing, call `model.eval()`.
- The code still runs if these two calls are omitted, because they only affect layers that behave differently between training and testing, such as Batch Normalization and Dropout.
- When training and testing with PyTorch, always remember to switch the instantiated model into the right train/eval mode. In eval() mode the framework freezes BN and Dropout: instead of per-batch statistics it uses the values learned during training. Otherwise, once the test batch_size is small, BN layers can easily distort the output (e.g. severe color distortion in generated images).
```python
class Inpaint_Network():
    ...

model = Inpaint_Network()
# train:
model.train(mode=True)
...
# test:
model.eval()
```
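What model.train()/model.eval() changes for a BatchNorm layer can be sketched in a few lines of numpy (a simplified 1-D BN without the affine parameters):

```python
import numpy as np

def batchnorm(x, state, training, momentum=0.1, eps=1e-5):
    """Minimal 1-D BatchNorm sketch (no affine parameters)."""
    if training:
        # training mode: normalize with the current batch statistics
        mean, var = x.mean(0), x.var(0)
        # and update the running statistics, as model.train() mode does
        state["mean"] = (1 - momentum) * state["mean"] + momentum * mean
        state["var"] = (1 - momentum) * state["var"] + momentum * var
    else:
        # eval mode: reuse the stored running statistics, ignore the batch
        mean, var = state["mean"], state["var"]
    return (x - mean) / np.sqrt(var + eps)

state = {"mean": np.zeros(2), "var": np.ones(2)}
batch = np.array([[1.0, 2.0], [3.0, 4.0]])
train_out = batchnorm(batch, state, training=True)
eval_out = batchnorm(batch, state, training=False)
# With a tiny batch the two modes give very different outputs,
# which is why forgetting model.eval() distorts test results.
```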
Training operates on mini-batches, whereas testing usually processes single images, so there is no mini-batch at test time. Since the parameters are fixed once training finishes, each batch's mean and variance no longer change, so the statistics accumulated over all training batches are used directly. That is why Batch Normalization behaves differently in training and testing.
Another important point, and the reason our earlier results were barely better than random guessing: our normalize step did not match the one used when loading data at training time.
Finally, a note on convert_model.sh: be sure to remove the un-quantized model.dlc, otherwise the quantized result will keep coming out as the GPU version.
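A sketch of what that cleanup step in convert_model.sh could look like (the directory and filenames below are placeholders, not the platform's actual paths):

```shell
# Hypothetical fragment for convert_model.sh; the path below is a placeholder.
# Remove the stale, un-quantized model.dlc before quantizing, so the quantized
# artifact cannot be shadowed by the float (GPU) version.
MODEL_DIR=/tmp/ev_sdk_model_demo
mkdir -p "$MODEL_DIR"
touch "$MODEL_DIR/model.dlc"   # pretend a float model is lying around
rm -f "$MODEL_DIR/model.dlc"   # delete it before running quantization
```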
II. Models Tried
- resnet18, 50 epochs, regular transforms; backbone resnet18:

```python
train_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.RandomAffine(degrees=0, translate=(0.05, 0.05)),
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(norm_mean, norm_std),
])
```
- Optimizer choice:

```python
optimizer = optim.SGD(resnet18.parameters(), lr=LR, momentum=0.9)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=lr_decay_step, gamma=0.1)
```
- resnet34: essentially a doubled resnet18. Results improved slightly, but the gain was small, so no details are shown.
- resnet50, 10 epochs, with regular data augmentation added; overfitting was very severe.
- efficient_net_b4: a lighter architecture with better performance than resnet. The code is below; the transform uses the version matched to this model. We ran 30 epochs with no fine-grained tuning and only a single pass. The network is relatively complex, so the performance score cannot be maxed out, and early versions also overfit badly. A new loss function, LabelSmoothSoftmaxCE, was used here for label smoothing.
```python
#!/usr/bin/python
# -*- encoding: utf-8 -*-
import torch
import torch.nn as nn

class LabelSmoothSoftmaxCE(nn.Module):
    def __init__(self,
                 lb_pos=0.9,
                 lb_neg=0.005,
                 reduction='mean',
                 lb_ignore=255,
                 ):
        super(LabelSmoothSoftmaxCE, self).__init__()
        self.lb_pos = lb_pos
        self.lb_neg = lb_neg
        self.reduction = reduction
        self.lb_ignore = lb_ignore
        self.log_softmax = nn.LogSoftmax(1)

    def forward(self, logits, label):
        logs = self.log_softmax(logits)
        ignore = label.data.cpu() == self.lb_ignore
        n_valid = (ignore == 0).sum()
        label = label.clone()
        label[ignore] = 0
        lb_one_hot = logits.data.clone().zero_().scatter_(1, label.unsqueeze(1), 1)
        label = self.lb_pos * lb_one_hot + self.lb_neg * (1 - lb_one_hot)
        ignore = ignore.nonzero()
        _, M = ignore.size()
        a, *b = ignore.chunk(M, dim=1)
        label[[a, torch.arange(label.size(1)), *b]] = 0
        if self.reduction == 'mean':
            loss = -torch.sum(torch.sum(logs * label, dim=1)) / n_valid
        elif self.reduction == 'none':
            loss = -torch.sum(logs * label, dim=1)
        return loss

if __name__ == '__main__':
    torch.manual_seed(15)
    criteria = LabelSmoothSoftmaxCE(lb_pos=0.9, lb_neg=5e-3)
    net1 = nn.Sequential(
        nn.Conv2d(3, 3, kernel_size=3, stride=2, padding=1),
    )
    net1.cuda()
    net1.train()
    net2 = nn.Sequential(
        nn.Conv2d(3, 3, kernel_size=3, stride=2, padding=1),
    )
    net2.cuda()
    net2.train()

    with torch.no_grad():
        inten = torch.randn(2, 3, 5, 5).cuda()
        lbs = torch.randint(0, 3, [2, 5, 5]).cuda()
        lbs[1, 3, 4] = 255
        lbs[1, 2, 3] = 255
        print(lbs)

    import torch.nn.functional as F
    logits1 = net1(inten)
    logits1 = F.interpolate(logits1, inten.size()[2:], mode='bilinear')
    logits2 = net2(inten)
    logits2 = F.interpolate(logits2, inten.size()[2:], mode='bilinear')
    # loss1 = criteria1(logits1, lbs)
    loss = criteria(logits1, lbs)
    # print(loss.detach().cpu())
    loss.backward()
```
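As a numeric illustration of what LabelSmoothSoftmaxCE optimizes (a simplified sketch in numpy that ignores the lb_ignore handling), the hard one-hot target is replaced by lb_pos for the true class and lb_neg everywhere else:

```python
import numpy as np

def smooth_targets(label, num_classes, lb_pos=0.9, lb_neg=0.005):
    """Build the smoothed target vector used instead of a hard one-hot."""
    t = np.full(num_classes, lb_neg)
    t[label] = lb_pos
    return t

def smooth_ce(logits, label, lb_pos=0.9, lb_neg=0.005):
    # numerically stable log-softmax
    z = logits - logits.max()
    logp = z - np.log(np.exp(z).sum())
    return -(smooth_targets(label, len(logits), lb_pos, lb_neg) * logp).sum()

logits = np.array([2.0, 0.5, 0.1])
# Hard cross-entropy would only look at class 0; the smoothed loss also
# penalizes over-confident logits for the other classes.
loss = smooth_ce(logits, 0)
```

This discourages the network from driving the true-class probability all the way to 1, which is the regularizing effect that helped with the overfitting mentioned above.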
![](https://i.loli.net/2021/05/13/OYv7f8HcjsIPXbk.png)
![](https://i.loli.net/2021/05/14/OrCmoWTGifRaX6N.png)
![](https://i.loli.net/2021/05/13/g4PhECH8FXZQ5B1.jpg)
efficient_net.py (b3, 288 input); the b0 variant with 224 input was also tried.
valid_acc
![](https://i.loli.net/2021/05/14/rl2RNKfzOgdhnPM.png)
valid_loss
![](https://i.loli.net/2021/05/14/KOzCnJ9WDNakc71.png)
train_acc
![](https://i.loli.net/2021/05/14/u7XaxSQRoV6E8Zl.png)
train_loss
![](https://i.loli.net/2021/05/14/AtjS2OJgh7DikRm.png)
The efficientnet training code:
```python
# -*- coding: utf-8 -*-
from torchvision import datasets, transforms
import torch
import numpy as np
import matplotlib.pyplot as plt
from torch import nn
import torch.optim as optim
import argparse
import warnings
import torch.optim.lr_scheduler as lr_scheduler
from torch.utils.data.dataloader import default_collate  # default batch collation
from efficientnet_pytorch import EfficientNet
from trash_dataloader import TrashDataset
from label_smooth import LabelSmoothSoftmaxCE
import os
from torch.utils.data import DataLoader
from ev_toolkit import plot_tool

os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'
warnings.filterwarnings("ignore")

# Number of classes in the dataset
num_classes = 146
# Batch size for training (change depending on how much memory you have)
batch_size = 64
# Number of epochs to train for
EPOCH = 50
# Flag for feature extracting. When False, we finetune the whole model,
# when True we only update the reshaped layer params
# feature_extract = True
feature_extract = False
# Number of epochs already trained
pre_epoch = 0

def my_collate_fn(batch):
    '''
    Each element of batch looks like (data, label).
    '''
    # drop samples whose data is None
    batch = list(filter(lambda x: x[0] is not None, batch))
    if len(batch) == 0:
        return torch.Tensor()
    return default_collate(batch)  # collate the filtered batch the default way

net = EfficientNet.from_pretrained('efficientnet-b0')
num_ftrs = net._fc.in_features
net._fc = nn.Linear(num_ftrs, num_classes)
# print the network
print(net)

# Detect if we have a GPU available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Train on multiple GPUs, test on a single GPU
if torch.cuda.device_count() > 1:
    print("Let's use", torch.cuda.device_count(), "GPUs!")
    net = nn.DataParallel(net)
    net.to(device)
# Send the model to GPU
net = net.to(device)

norm_mean = [0.485, 0.456, 0.406]
norm_std = [0.229, 0.224, 0.225]
train_transform = transforms.Compose([
    transforms.Resize((112, 112)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(norm_mean, norm_std),
])
val_transform = transforms.Compose([
    transforms.Resize((112, 112)),
    transforms.ToTensor(),
    transforms.Normalize(norm_mean, norm_std),
])

# Create training and validation datasets
train_dir = '../../../../home/data'
train_data = TrashDataset(data_dir=train_dir, transform=train_transform)
valid_data = TrashDataset(data_dir=train_dir, transform=val_transform)
train_loader = DataLoader(dataset=train_data, batch_size=batch_size, shuffle=True)
valid_loader = DataLoader(dataset=valid_data, batch_size=batch_size)

# Command-line arguments, Linux-style
parser = argparse.ArgumentParser(description='PyTorch DeepNetwork Training')
parser.add_argument('--outf', default='./model/model',
                    help='folder to output images and model checkpoints')
args = parser.parse_args()

params_to_update = net.parameters()
print("Params to learn:")
if feature_extract:
    params_to_update = []
    for name, param in net.named_parameters():
        if param.requires_grad == True:
            params_to_update.append(param)
            print("\t", name)
else:
    for name, param in net.named_parameters():
        if param.requires_grad == True:
            print("\t", name)

def main():
    train_curve = list()
    train_acc = list()
    ii = 0
    LR = 1e-3  # learning rate
    best_acc = 0  # best test accuracy so far
    print("Start Training, DeepNetwork!")
    # criterion
    criterion = LabelSmoothSoftmaxCE()
    # optimizer (Adam)
    optimizer = optim.Adam(params_to_update, lr=LR, betas=(0.9, 0.999), eps=1e-9)
    # scheduler
    scheduler = lr_scheduler.ReduceLROnPlateau(optimizer, mode='max', factor=0.7,
                                               patience=3, verbose=True)
    with open("./log/acc.txt", "w") as f:
        with open("./log/log.txt", "w") as f2:
            for epoch in range(pre_epoch, EPOCH):
                # scheduler.step(epoch)
                print('\nEpoch: %d' % (epoch + 1))
                net.train()
                sum_loss = 0.0
                correct = 0.0
                total = 0.0
                for i, data in enumerate(train_loader):
                    # prepare data
                    length = len(train_loader)
                    input, target = data
                    input, target = input.to(device), target.to(device)
                    optimizer.zero_grad()
                    # forward + backward
                    output = net(input)
                    loss = criterion(output, target)
                    loss.backward()
                    optimizer.step()
                    # log loss and accuracy every 50 batches
                    sum_loss += loss.item()
                    _, predicted = torch.max(output.data, 1)
                    total += target.size(0)
                    correct += predicted.eq(target.data).cpu().sum()
                    train_curve.append(loss.item())
                    train_acc.append(correct / total)
                    if (i + 1 + epoch * length) % 50 == 0:  # was `/ 50 == 0`, which never fires
                        print('[epoch:%d, iter:%d] Loss: %.03f | Acc: %.3f%% '
                              % (epoch + 1, (i + 1 + epoch * length), sum_loss / (i + 1),
                                 100. * float(correct) / float(total)))
                        f2.write('%03d %05d |Loss: %.03f | Acc: %.3f%% '
                                 % (epoch + 1, (i + 1 + epoch * length), sum_loss / (i + 1),
                                    100. * float(correct) / float(total)))
                        f2.write('\n')
                        f2.flush()
                # measure accuracy after every epoch
                print("Waiting Test!")
                with torch.no_grad():
                    correct = 0
                    total = 0
                    for data in valid_loader:
                        net.eval()
                        images, labels = data
                        images, labels = images.to(device), labels.to(device)
                        outputs = net(images)
                        # take the highest-scoring class (index into outputs.data)
                        _, predicted = torch.max(outputs.data, 1)
                        total += labels.size(0)
                        correct += (predicted == labels).cpu().sum()
                    print('Test accuracy: %.3f%%' % (100. * float(correct) / float(total)))
                    acc = 100. * float(correct) / float(total)
                    scheduler.step(acc)
                    # append each test result to acc.txt
                    if (ii % 1 == 0):
                        print('Saving model......')
                        # torch.save(net, '%s/net_%03d.pth' % (args.outf, epoch + 1))
                        f.write("EPOCH=%03d,Accuracy= %.3f%%" % (epoch + 1, acc))
                        f.write('\n')
                        f.flush()
                    # record the best test accuracy in best_acc.txt
                    if acc > best_acc:
                        f3 = open("./log/best_acc.txt", "w")
                        f3.write("EPOCH=%d,best_acc= %.3f%%" % (epoch + 1, acc))
                        f3.close()
                        best_acc = acc
    print("Training Finished, TotalEPOCH=%d" % EPOCH)
    path_model = '../models/202159efficient_20epoch/models.pkl'
    torch.save(net, path_model)
    print('{} is saved!'.format(path_model))
    return train_curve, train_acc

def save_plot(train_curve, train_acc):
    train_x = list(range(len(train_curve)))
    train_loss = np.array(train_curve)
    train_acc = np.array(train_acc)
    train_iters = len(train_loader)

    fig_loss = plt.figure(figsize=(10, 6))
    plt.plot(train_x, train_loss)
    plot_tool.update_plot(name='loss', img=plt.gcf())
    fig_loss.savefig('../result-graphs/loss.png')

    fig_acc = plt.figure(figsize=(10, 6))
    plt.plot(train_x, train_acc)
    plot_tool.update_plot(name='acc', img=plt.gcf())
    fig_acc = plt.gcf()
    fig_acc.savefig('../result-graphs/acc.png')
    print('acc-loss curves saved')

if __name__ == "__main__":
    train_curve, train_acc = main()
    save_plot(train_curve, train_acc)
```
Optimizer: same as for resnet18 above.
- Backbone resnext50_32x4d + improved optimizer and data augmentation. ResNeXt is a 2016 refinement of residual networks: its parameter count is comparable to resnet50, but its structure merges Inception-style parallel (grouped) convolutions into the layers, giving accuracy roughly on par with resnet101.
data_Augmentation.py:
```python
# Data augmentation
import random
import math
import torch
from PIL import Image, ImageOps, ImageFilter
from torchvision import transforms

class Resize(object):
    def __init__(self, size, interpolation=Image.BILINEAR):
        self.size = size
        self.interpolation = interpolation

    def __call__(self, img):
        # pad to the target aspect ratio before resizing
        ratio = self.size[0] / self.size[1]
        w, h = img.size
        if w / h < ratio:
            t = int(h * ratio)
            w_padding = (t - w) // 2
            img = img.crop((-w_padding, 0, w + w_padding, h))
        else:
            t = int(w / ratio)
            h_padding = (t - h) // 2
            img = img.crop((0, -h_padding, w, h + h_padding))
        img = img.resize(self.size, self.interpolation)
        return img

class RandomRotate(object):
    def __init__(self, degree, p=0.5):
        self.degree = degree
        self.p = p

    def __call__(self, img):
        if random.random() < self.p:
            rotate_degree = random.uniform(-1 * self.degree, self.degree)
            img = img.rotate(rotate_degree, Image.BILINEAR)
        return img

class RandomGaussianBlur(object):
    def __init__(self, p=0.5):
        self.p = p

    def __call__(self, img):
        if random.random() < self.p:
            img = img.filter(ImageFilter.GaussianBlur(
                radius=random.random()))
        return img

def get_train_transform(mean, std, size):
    train_transform = transforms.Compose([
        Resize((int(size * (256 / 224)), int(size * (256 / 224)))),
        transforms.RandomCrop(size),
        transforms.RandomHorizontalFlip(),
        # RandomRotate(15, 0.3),
        # RandomGaussianBlur(),
        transforms.ToTensor(),
        transforms.Normalize(mean=mean, std=std),
    ])
    return train_transform

def get_test_transform(mean, std, size):
    return transforms.Compose([
        Resize((int(size * (256 / 224)), int(size * (256 / 224)))),
        transforms.CenterCrop(size),
        transforms.ToTensor(),
        transforms.Normalize(mean=mean, std=std),
    ])

def get_transforms(input_size=288, test_size=288, backbone=None):
    mean, std = [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]
    if backbone is not None and backbone in ['pnasnet5large', 'nasnetamobile']:
        mean, std = [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]
    transformations = {}
    transformations['train'] = get_train_transform(mean, std, input_size)
    transformations['val'] = get_test_transform(mean, std, test_size)
    # ji.py looks the test transform up under 'val_test', so expose it
    # under that key as well to avoid a KeyError
    transformations['val_test'] = transformations['val']
    return transformations
```
self_optimizer.py:

```python
# New optimizers
import errno
import os
import sys
import time
import math
import torch.nn as nn
import torch.nn.init as init
from torch.autograd import Variable
import torch
import shutil
# import adabound
# from utils.radam import RAdam, AdamW
import torchvision.transforms as transforms
from torch.optim.optimizer import Optimizer, required

class RAdam(Optimizer):
    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8, weight_decay=0):
        defaults = dict(lr=lr, betas=betas, eps=eps, weight_decay=weight_decay)
        self.buffer = [[None, None, None] for ind in range(10)]
        super(RAdam, self).__init__(params, defaults)

    def __setstate__(self, state):
        super(RAdam, self).__setstate__(state)

    def step(self, closure=None):
        loss = None
        if closure is not None:
            loss = closure()
        for group in self.param_groups:
            for p in group['params']:
                if p.grad is None:
                    continue
                grad = p.grad.data.float()
                if grad.is_sparse:
                    raise RuntimeError('RAdam does not support sparse gradients')
                p_data_fp32 = p.data.float()
                state = self.state[p]
                if len(state) == 0:
                    state['step'] = 0
                    state['exp_avg'] = torch.zeros_like(p_data_fp32)
                    state['exp_avg_sq'] = torch.zeros_like(p_data_fp32)
                else:
                    state['exp_avg'] = state['exp_avg'].type_as(p_data_fp32)
                    state['exp_avg_sq'] = state['exp_avg_sq'].type_as(p_data_fp32)
                exp_avg, exp_avg_sq = state['exp_avg'], state['exp_avg_sq']
                beta1, beta2 = group['betas']
                exp_avg_sq.mul_(beta2).addcmul_(1 - beta2, grad, grad)
                exp_avg.mul_(beta1).add_(1 - beta1, grad)
                state['step'] += 1
                buffered = self.buffer[int(state['step'] % 10)]
                if state['step'] == buffered[0]:
                    N_sma, step_size = buffered[1], buffered[2]
                else:
                    buffered[0] = state['step']
                    beta2_t = beta2 ** state['step']
                    N_sma_max = 2 / (1 - beta2) - 1
                    N_sma = N_sma_max - 2 * state['step'] * beta2_t / (1 - beta2_t)
                    buffered[1] = N_sma
                    # more conservative since it's an approximated value
                    if N_sma >= 5:
                        step_size = math.sqrt(
                            (1 - beta2_t) * (N_sma - 4) / (N_sma_max - 4) * (N_sma - 2) / N_sma * N_sma_max / (
                                N_sma_max - 2)) / (1 - beta1 ** state['step'])
                    else:
                        step_size = 1.0 / (1 - beta1 ** state['step'])
                    buffered[2] = step_size
                if group['weight_decay'] != 0:
                    p_data_fp32.add_(-group['weight_decay'] * group['lr'], p_data_fp32)
                # more conservative since it's an approximated value
                if N_sma >= 5:
                    denom = exp_avg_sq.sqrt().add_(group['eps'])
                    p_data_fp32.addcdiv_(-step_size * group['lr'], exp_avg, denom)
                else:
                    p_data_fp32.add_(-step_size * group['lr'], exp_avg)
                p.data.copy_(p_data_fp32)
        return loss

class PlainRAdam(Optimizer):
    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8, weight_decay=0):
        defaults = dict(lr=lr, betas=betas, eps=eps, weight_decay=weight_decay)
        super(PlainRAdam, self).__init__(params, defaults)

    def __setstate__(self, state):
        super(PlainRAdam, self).__setstate__(state)

    def step(self, closure=None):
        loss = None
        if closure is not None:
            loss = closure()
        for group in self.param_groups:
            for p in group['params']:
                if p.grad is None:
                    continue
                grad = p.grad.data.float()
                if grad.is_sparse:
                    raise RuntimeError('RAdam does not support sparse gradients')
                p_data_fp32 = p.data.float()
                state = self.state[p]
                if len(state) == 0:
                    state['step'] = 0
                    state['exp_avg'] = torch.zeros_like(p_data_fp32)
                    state['exp_avg_sq'] = torch.zeros_like(p_data_fp32)
                else:
                    state['exp_avg'] = state['exp_avg'].type_as(p_data_fp32)
                    state['exp_avg_sq'] = state['exp_avg_sq'].type_as(p_data_fp32)
                exp_avg, exp_avg_sq = state['exp_avg'], state['exp_avg_sq']
                beta1, beta2 = group['betas']
                exp_avg_sq.mul_(beta2).addcmul_(1 - beta2, grad, grad)
                exp_avg.mul_(beta1).add_(1 - beta1, grad)
                state['step'] += 1
                beta2_t = beta2 ** state['step']
                N_sma_max = 2 / (1 - beta2) - 1
                N_sma = N_sma_max - 2 * state['step'] * beta2_t / (1 - beta2_t)
                if group['weight_decay'] != 0:
                    p_data_fp32.add_(-group['weight_decay'] * group['lr'], p_data_fp32)
                # more conservative since it's an approximated value
                if N_sma >= 5:
                    step_size = group['lr'] * math.sqrt(
                        (1 - beta2_t) * (N_sma - 4) / (N_sma_max - 4) * (N_sma - 2) / N_sma * N_sma_max / (
                            N_sma_max - 2)) / (1 - beta1 ** state['step'])
                    denom = exp_avg_sq.sqrt().add_(group['eps'])
                    p_data_fp32.addcdiv_(-step_size, exp_avg, denom)
                else:
                    step_size = group['lr'] / (1 - beta1 ** state['step'])
                    p_data_fp32.add_(-step_size, exp_avg)
                p.data.copy_(p_data_fp32)
        return loss

class AdamW(Optimizer):
    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8, weight_decay=0, warmup=0):
        defaults = dict(lr=lr, betas=betas, eps=eps,
                        weight_decay=weight_decay, warmup=warmup)
        super(AdamW, self).__init__(params, defaults)

    def __setstate__(self, state):
        super(AdamW, self).__setstate__(state)

    def step(self, closure=None):
        loss = None
        if closure is not None:
            loss = closure()
        for group in self.param_groups:
            for p in group['params']:
                if p.grad is None:
                    continue
                grad = p.grad.data.float()
                if grad.is_sparse:
                    raise RuntimeError('Adam does not support sparse gradients, please consider SparseAdam instead')
                p_data_fp32 = p.data.float()
                state = self.state[p]
                if len(state) == 0:
                    state['step'] = 0
                    state['exp_avg'] = torch.zeros_like(p_data_fp32)
                    state['exp_avg_sq'] = torch.zeros_like(p_data_fp32)
                else:
                    state['exp_avg'] = state['exp_avg'].type_as(p_data_fp32)
                    state['exp_avg_sq'] = state['exp_avg_sq'].type_as(p_data_fp32)
                exp_avg, exp_avg_sq = state['exp_avg'], state['exp_avg_sq']
                beta1, beta2 = group['betas']
                state['step'] += 1
                exp_avg_sq.mul_(beta2).addcmul_(1 - beta2, grad, grad)
                exp_avg.mul_(beta1).add_(1 - beta1, grad)
                denom = exp_avg_sq.sqrt().add_(group['eps'])
                bias_correction1 = 1 - beta1 ** state['step']
                bias_correction2 = 1 - beta2 ** state['step']
                if group['warmup'] > state['step']:
                    scheduled_lr = 1e-8 + state['step'] * group['lr'] / group['warmup']
                else:
                    scheduled_lr = group['lr']
                step_size = scheduled_lr * math.sqrt(bias_correction2) / bias_correction1
                if group['weight_decay'] != 0:
                    p_data_fp32.add_(-group['weight_decay'] * scheduled_lr, p_data_fp32)
                p_data_fp32.addcdiv_(-step_size, exp_avg, denom)
                p.data.copy_(p_data_fp32)
        return loss

__all__ = ['get_mean_and_std', 'init_params', 'mkdir_p', 'AverageMeter', 'get_optimizer', 'save_checkpoint']

def get_mean_and_std(dataset):
    '''Compute the mean and std value of dataset.'''
    dataloader = trainloader = torch.utils.data.DataLoader(dataset, batch_size=1, shuffle=True, num_workers=2)
    mean = torch.zeros(3)
    std = torch.zeros(3)
    print('==> Computing mean and std..')
    for inputs, targets in dataloader:
        for i in range(3):
            mean[i] += inputs[:, i, :, :].mean()
            std[i] += inputs[:, i, :, :].std()
    mean.div_(len(dataset))
    std.div_(len(dataset))
    return mean, std

def init_params(net):
    '''Init layer parameters.'''
    for m in net.modules():
        if isinstance(m, nn.Conv2d):
            init.kaiming_normal(m.weight, mode='fan_out')
            if m.bias is not None:
                init.constant(m.bias, 0)
        elif isinstance(m, nn.BatchNorm2d):
            init.constant(m.weight, 1)
            init.constant(m.bias, 0)
        elif isinstance(m, nn.Linear):
            init.normal(m.weight, std=1e-3)
            if m.bias is not None:
                init.constant(m.bias, 0)

def mkdir_p(path):
    '''make dir if not exist'''
    try:
        os.makedirs(path)
    except OSError as exc:  # Python >2.5
        if exc.errno == errno.EEXIST and os.path.isdir(path):
            pass
        else:
            raise

class AverageMeter(object):
    """Computes and stores the average and current value
    Imported from https://github.com/pytorch/examples/blob/master/imagenet/main.py#L247-L262
    """
    def __init__(self):
        self.reset()

    def reset(self):
        self.val = 0
        self.avg = 0
        self.sum = 0
        self.count = 0

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count

def get_optimizer(model, args):
    parameters = []
    for name, param in model.named_parameters():
        if 'fc' in name or 'class' in name or 'last_linear' in name or 'ca' in name or 'sa' in name:
            parameters.append({'params': param, 'lr': args.lr * args.lr_fc_times})
        else:
            parameters.append({'params': param, 'lr': args.lr})
    if args.optimizer == 'sgd':
        return torch.optim.SGD(parameters,
                               # model.parameters(),
                               args.lr,
                               momentum=args.momentum, nesterov=args.nesterov,
                               weight_decay=args.weight_decay)
    elif args.optimizer == 'rmsprop':
        return torch.optim.RMSprop(parameters,
                                   # model.parameters(),
                                   args.lr,
                                   alpha=args.alpha,
                                   weight_decay=args.weight_decay)
    elif args.optimizer == 'adam':
        return torch.optim.Adam(parameters,
                                # model.parameters(),
                                args.lr,
                                betas=(args.beta1, args.beta2),
                                weight_decay=args.weight_decay)
    elif args.optimizer == 'AdaBound':
        return adabound.AdaBound(parameters,
                                 # model.parameters(),
                                 lr=args.lr, final_lr=args.final_lr)
    elif args.optimizer == 'radam':
        return RAdam(parameters, lr=args.lr, betas=(args.beta1, args.beta2),
                     weight_decay=args.weight_decay)
    else:
        raise NotImplementedError

def save_checkpoint(state, is_best, single=True, checkpoint='checkpoint', filename='checkpoint.pth.tar'):
    if single:
        fold = ''
    else:
        fold = str(state['fold']) + '_'
    cur_name = 'checkpoint.pth.tar'
    filepath = os.path.join(checkpoint, fold + cur_name)
    curpath = os.path.join(checkpoint, fold + 'model_cur.pth')
    torch.save(state, filepath)
    torch.save(state['state_dict'], curpath)
    if is_best and state['epoch'] >= 5:
        model_name = 'model_' + str(state['epoch']) + '_' + str(int(round(state['train_acc'] * 100, 0))) + '_' + str(int(round(state['acc'] * 100, 0))) + '.pth'
        model_path = os.path.join(checkpoint, fold + model_name)
        torch.save(state['state_dict'], model_path)

def save_checkpoint2(state, is_best, checkpoint='checkpoint', filename='checkpoint.pth.tar'):
    # best_model = '/application/search/qlmx/clover/garbage/code/image_classfication/predict/'
    fold = str(state['fold']) + '_'
    filepath = os.path.join(checkpoint, fold + filename)
    model_path = os.path.join(checkpoint, fold + 'model_cur.pth')
    torch.save(state, filepath)
    torch.save(state['state_dict'], model_path)
    if is_best:
        shutil.copyfile(filepath, os.path.join(checkpoint, fold + 'model_best.pth.tar'))
        shutil.copyfile(model_path, os.path.join(checkpoint, fold + 'model_best.pth'))
```
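The heart of RAdam's step is the variance-rectification test `N_sma >= 5`. A pure-Python sketch of that quantity (same formula as in the optimizer above) shows why the first few steps fall back to an un-rectified, momentum-only update:

```python
import math

def radam_n_sma(step, beta2=0.999):
    """Length of the approximated SMA used by RAdam's rectification test."""
    beta2_t = beta2 ** step
    n_sma_max = 2 / (1 - beta2) - 1
    return n_sma_max - 2 * step * beta2_t / (1 - beta2_t)

# Early in training N_sma < 5, so RAdam skips the adaptive denominator
# (the second moment is still too noisy); later N_sma grows past 5 and
# the rectified adaptive update kicks in.
early = radam_n_sma(1)
late = radam_n_sma(1000)
```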
- This version has one more change: the training and validation sets are re-split. Note that the command running the preprocessing script preprocess.py must be added to start_train.sh.
preprocess.py:
```python
# Utility script
import os
import random
import shutil
from shutil import copy2

def data_set_split(src_data_folder, target_data_folder, train_scale=0.9, val_scale=0.1):
    '''
    Read the source data folder and create the split train/ and val/ folders.
    :param src_data_folder: source folder, e.g. .../data_split/src_data
    :param target_data_folder: target folder, e.g. .../data_split/target_data
    :param train_scale: training-set ratio
    :param val_scale: validation-set ratio
    :return:
    '''
    print("Starting dataset split")
    class_names = os.listdir(src_data_folder)
    # create the split folders under the target directory
    split_names = ['train', 'val']
    for split_name in split_names:
        split_path = os.path.join(target_data_folder, split_name)
        if os.path.isdir(split_path):
            pass
        else:
            os.mkdir(split_path)
        # then create one folder per class under split_path
        for class_name in class_names:
            class_split_path = os.path.join(split_path, class_name)
            if os.path.isdir(class_split_path):
                pass
            else:
                os.mkdir(class_split_path)
    # split the dataset by ratio and copy the image files,
    # iterating class by class
    for class_name in class_names:
        current_class_data_path = os.path.join(src_data_folder, class_name)
        current_all_data = os.listdir(current_class_data_path)
        current_data_length = len(current_all_data)
        current_data_index_list = list(range(current_data_length))
        random.shuffle(current_data_index_list)

        train_folder = os.path.join(os.path.join(target_data_folder, 'train'), class_name)
        val_folder = os.path.join(os.path.join(target_data_folder, 'val'), class_name)
        train_stop_flag = current_data_length * train_scale
        val_stop_flag = current_data_length * (train_scale + val_scale)
        current_idx = 0
        train_num = 0
        val_num = 0
        for i in current_data_index_list:
            src_img_path = os.path.join(current_class_data_path, current_all_data[i])
            if current_idx <= train_stop_flag:
                copy2(src_img_path, train_folder)
                # print("{} copied to {}".format(src_img_path, train_folder))
                train_num = train_num + 1
            elif (current_idx > train_stop_flag) and (current_idx <= val_stop_flag):
                copy2(src_img_path, val_folder)
                # print("{} copied to {}".format(src_img_path, val_folder))
                val_num = val_num + 1
            current_idx = current_idx + 1
        print("*********************************{}*************************************".format(class_name))
        print("train set {}: {} images".format(train_folder, train_num))
        print("val set {}: {} images".format(val_folder, val_num))

if __name__ == '__main__':
    base_dir = "../../../../home/data"
    num = os.listdir(base_dir)
    src_data_folder = os.path.join(base_dir, num[0])
    os.makedirs("./split_data/", exist_ok=True)
    target_data_folder = "./split_data/"
    data_set_split(src_data_folder, target_data_folder)
```
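The per-class split logic in data_set_split reduces to an index-based sketch (simplified: it returns shuffled indices instead of copying files, and is seeded for reproducibility):

```python
import random

def split_indices(n, train_scale=0.9, seed=0):
    """Shuffle n indices and cut them at train_scale,
    as data_set_split does for each class folder."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(n * train_scale)
    return idx[:cut], idx[cut:]

train_idx, val_idx = split_indices(100)  # 90 train indices, 10 val indices
```

Shuffling before cutting is what keeps the split unbiased with respect to file ordering; doing it per class (as the script does) keeps the class distribution the same in train and val.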
- Not yet tested (code ready): ResNet + attention mechanism (CBAM).
Note: since ResNet's structure must stay unchanged, CBAM cannot be inserted inside the blocks (it can be, but then the pretrained weights no longer fit, because the architecture changes). Adding it after the first convolution layer and after the last one does not alter the backbone, so the pretrained weights can still be used.
import torch.nn as nn
import math
try:
from torch.hub import load_state_dict_from_url
except ImportError:
from torch.utils.model_zoo import load_url as load_state_dict_from_url
import torch
## 通道注意力机制
class ChannelAttention(nn.Module):
def __init__(self, in_planes, ratio=16):
super(ChannelAttention, self).__init__()
self.avg_pool = nn.AdaptiveAvgPool2d(1)
self.max_pool = nn.AdaptiveMaxPool2d(1)
self.fc1 = nn.Conv2d(in_planes, in_planes // 16, 1, bias=False)
self.relu1 = nn.ReLU()
self.fc2 = nn.Conv2d(in_planes // 16, in_planes, 1, bias=False)
self.sigmoid = nn.Sigmoid()
def forward(self, x):
avg_out = self.fc2(self.relu1(self.fc1(self.avg_pool(x))))
max_out = self.fc2(self.relu1(self.fc1(self.max_pool(x))))
out = avg_out + max_out
return self.sigmoid(out)
## 空间注意力机制
class SpatialAttention(nn.Module):
def __init__(self, kernel_size=7):
super(SpatialAttention, self).__init__()
assert kernel_size in (3, 7), 'kernel size must be 3 or 7'
padding = 3 if kernel_size == 7 else 1
self.conv1 = nn.Conv2d(2, 1, kernel_size, padding=padding, bias=False)
self.sigmoid = nn.Sigmoid()
def forward(self, x):
avg_out = torch.mean(x, dim=1, keepdim=True)
max_out, _ = torch.max(x, dim=1, keepdim=True)
x = torch.cat([avg_out, max_out], dim=1)
x = self.conv1(x)
return self.sigmoid(x)
class ResNet(nn.Module):
def __init__(self, block, layers, num_classes=1000, zero_init_residual=False,
groups=1, width_per_group=64, replace_stride_with_dilation=None,
norm_layer=None):
super(ResNet, self).__init__()
if norm_layer is None:
norm_layer = nn.BatchNorm2d
self._norm_layer = norm_layer
self.inplanes = 64
self.dilation = 1
if replace_stride_with_dilation is None:
# each element in the tuple indicates if we should replace
# the 2x2 stride with a dilated convolution instead
replace_stride_with_dilation = [False, False, False]
if len(replace_stride_with_dilation) != 3:
raise ValueError("replace_stride_with_dilation should be None "
"or a 3-element tuple, got {}".format(replace_stride_with_dilation))
self.groups = groups
self.base_width = width_per_group
self.conv1 = nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2, padding=3,
bias=False)
self.bn1 = norm_layer(self.inplanes)
self.relu = nn.ReLU(inplace=True)
# 网络的第一层加入注意力机制
self.ca = ChannelAttention(self.inplanes)
self.sa = SpatialAttention()
self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
self.layer1 = self._make_layer(block, 64, layers[0])
self.layer2 = self._make_layer(block, 128, layers[1], stride=2,
dilate=replace_stride_with_dilation[0])
self.layer3 = self._make_layer(block, 256, layers[2], stride=2,
dilate=replace_stride_with_dilation[1])
self.layer4 = self._make_layer(block, 512, layers[3], stride=2,
dilate=replace_stride_with_dilation[2])
# 网络的卷积层的最后一层加入注意力机制
self.ca1 = ChannelAttention(self.inplanes)
self.sa1 = SpatialAttention()
self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
self.fc = nn.Linear(512 * block.expansion, num_classes)
for m in self.modules():
if isinstance(m, nn.Conv2d):
nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
nn.init.constant_(m.weight, 1)
nn.init.constant_(m.bias, 0)
# Zero-initialize the last BN in each residual branch,
# so that the residual branch starts with zeros, and each residual block behaves like an identity.
# This improves the model by 0.2~0.3% according to https://arxiv.org/abs/1706.02677
if zero_init_residual:
for m in self.modules():
if isinstance(m, Bottleneck):
nn.init.constant_(m.bn3.weight, 0)
elif isinstance(m, BasicBlock):
nn.init.constant_(m.bn2.weight, 0)
    def _make_layer(self, block, planes, blocks, stride=1, dilate=False):
        norm_layer = self._norm_layer
        downsample = None
        previous_dilation = self.dilation
        if dilate:
            self.dilation *= stride
            stride = 1
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                conv1x1(self.inplanes, planes * block.expansion, stride),
                norm_layer(planes * block.expansion),
            )
        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample, self.groups,
                            self.base_width, previous_dilation, norm_layer))
        self.inplanes = planes * block.expansion
        for _ in range(1, blocks):
            layers.append(block(self.inplanes, planes, groups=self.groups,
                                base_width=self.base_width, dilation=self.dilation,
                                norm_layer=norm_layer))
        return nn.Sequential(*layers)
    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.ca(x) * x
        x = self.sa(x) * x
        x = self.maxpool(x)
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = self.ca1(x) * x
        x = self.sa1(x) * x
        x = self.avgpool(x)
        x = x.reshape(x.size(0), -1)
        x = self.fc(x)
        return x
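The `ChannelAttention` and `SpatialAttention` modules referenced above are defined elsewhere in the project; a minimal CBAM-style sketch that matches the call sites (`ca(x) * x` expects per-channel weights, `sa(x) * x` a single-channel spatial map) might look like:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention: squeeze the spatial dims with avg/max pooling,
    pass both through a shared bottleneck MLP, and emit per-channel weights."""
    def __init__(self, in_planes, ratio=16):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.max_pool = nn.AdaptiveMaxPool2d(1)
        self.fc = nn.Sequential(
            nn.Conv2d(in_planes, in_planes // ratio, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_planes // ratio, in_planes, 1, bias=False),
        )
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # (N, C, 1, 1) weights in [0, 1]
        return self.sigmoid(self.fc(self.avg_pool(x)) + self.fc(self.max_pool(x)))

class SpatialAttention(nn.Module):
    """Spatial attention: pool over the channel dim, concatenate the two maps,
    and convolve down to a single-channel spatial weight map."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg_out = torch.mean(x, dim=1, keepdim=True)
        max_out, _ = torch.max(x, dim=1, keepdim=True)
        # (N, 1, H, W) weights in [0, 1]
        return self.sigmoid(self.conv(torch.cat([avg_out, max_out], dim=1)))
```

CBAM applies channel attention first and spatial attention second, which matches the order used in `forward` above.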
III. Handling Data Imbalance
The minority class ('WKY') is oversampled by repeating its training file list three times, and one of several random augmentations is applied each time a sample is drawn:
import os
import random
import torch
from torch.utils.data import Dataset
import torchvision
from PIL import Image

class MedicalDataset(Dataset):
    def __init__(self, root, split, data_ratio=1.0, ret_name=False):
        assert split in ['train', 'val', 'test']
        self.ret_name = ret_name
        self.cls_to_ind_dict = dict()
        self.ind_to_cls_dict = list()
        self.img_list = list()
        self.cls_list = list()
        self.cls_num = dict()
        classes = ['WA', 'WKY']
        if split == 'test':
            for idx, cls in enumerate(classes):
                self.cls_to_ind_dict[cls] = idx
                self.ind_to_cls_dict.append(cls)
                img_list = sorted(os.listdir(os.path.join(root, split, cls)))
                self.cls_num[cls] = len(img_list)
                for img_fp in img_list:
                    self.img_list.append(os.path.join(root, split, cls, img_fp))
                    self.cls_list.append(idx)
        else:
            img_list_temp, cls_list_temp = [], []
            for idx, cls in enumerate(classes):
                self.cls_to_ind_dict[cls] = idx
                self.ind_to_cls_dict.append(cls)
                if cls == 'WA':  # the WA training set does not need to be expanded
                    img_list = sorted(os.listdir(os.path.join(root, split, cls)))
                    self.cls_num[cls] = len(img_list)
                    for img_fp in img_list:
                        self.img_list.append(os.path.join(root, split, cls, img_fp))
                        self.cls_list.append(idx)
                    print(cls, '=======================')
                    print(len(self.img_list), len(self.cls_list))
                else:
                    img_list = sorted(os.listdir(os.path.join(root, split, cls)))
                    for img_fp in img_list:
                        img_list_temp.append(os.path.join(root, split, cls, img_fp))
                        cls_list_temp.append(idx)
                    img_list_temp = [val for val in img_list_temp for i in range(3)]  # repeat the original img_list three times
                    cls_list_temp = [val for val in cls_list_temp for i in range(3)]
                    self.cls_num[cls] = len(img_list_temp)  # record the new per-class count
                    print(cls, '=======================')
                    print(len(img_list_temp), len(cls_list_temp))
            self.img_list = self.img_list + img_list_temp
            self.cls_list = self.cls_list + cls_list_temp
            print(len(self.img_list), len(self.cls_list))
        # forced horizontal flip (p=1)
        self.trans0 = torchvision.transforms.Compose([torchvision.transforms.Resize(256),
                                                      torchvision.transforms.RandomCrop(224),
                                                      torchvision.transforms.RandomHorizontalFlip(p=1),
                                                      torchvision.transforms.ToTensor(),
                                                      torchvision.transforms.Normalize([0.485, 0.456, 0.406],
                                                                                       [0.229, 0.224, 0.225])
                                                      ])
        # forced vertical flip (p=1)
        self.trans1 = torchvision.transforms.Compose([torchvision.transforms.Resize(256),
                                                      torchvision.transforms.RandomCrop(224),
                                                      torchvision.transforms.RandomVerticalFlip(p=1),
                                                      torchvision.transforms.ToTensor(),
                                                      torchvision.transforms.Normalize([0.485, 0.456, 0.406],
                                                                                       [0.229, 0.224, 0.225])
                                                      ])
        # random rotation in [-90, 90] degrees
        self.trans2 = torchvision.transforms.Compose([torchvision.transforms.Resize(256),
                                                      torchvision.transforms.RandomCrop(224),
                                                      torchvision.transforms.RandomRotation(90),
                                                      torchvision.transforms.ToTensor(),
                                                      torchvision.transforms.Normalize([0.485, 0.456, 0.406],
                                                                                       [0.229, 0.224, 0.225])
                                                      ])
        # brightness jitter: brightness=1 samples a factor in [0, 2]; a factor of 1 leaves the image unchanged
        self.trans3 = torchvision.transforms.Compose([torchvision.transforms.Resize(256),
                                                      torchvision.transforms.RandomCrop(224),
                                                      torchvision.transforms.ColorJitter(brightness=1),
                                                      torchvision.transforms.ToTensor(),
                                                      torchvision.transforms.Normalize([0.485, 0.456, 0.406],
                                                                                       [0.229, 0.224, 0.225])
                                                      ])
        # contrast jitter: contrast=2 samples a factor in [0, 3]
        self.trans4 = torchvision.transforms.Compose([torchvision.transforms.Resize(256),
                                                      torchvision.transforms.RandomCrop(224),
                                                      torchvision.transforms.ColorJitter(contrast=2),
                                                      torchvision.transforms.ToTensor(),
                                                      torchvision.transforms.Normalize([0.485, 0.456, 0.406],
                                                                                       [0.229, 0.224, 0.225])
                                                      ])
        # hue shift
        self.trans5 = torchvision.transforms.Compose([torchvision.transforms.Resize(256),
                                                      torchvision.transforms.RandomCrop(224),
                                                      torchvision.transforms.ColorJitter(hue=0.5),
                                                      torchvision.transforms.ToTensor(),
                                                      torchvision.transforms.Normalize([0.485, 0.456, 0.406],
                                                                                       [0.229, 0.224, 0.225])
                                                      ])
        # combined brightness/contrast/hue jitter
        self.trans6 = torchvision.transforms.Compose([torchvision.transforms.Resize(256),
                                                      torchvision.transforms.RandomCrop(224),
                                                      torchvision.transforms.ColorJitter(brightness=1, contrast=2, hue=0.5),
                                                      torchvision.transforms.ToTensor(),
                                                      torchvision.transforms.Normalize([0.485, 0.456, 0.406],
                                                                                       [0.229, 0.224, 0.225])
                                                      ])
        self.trans_list = [self.trans0, self.trans1, self.trans2, self.trans3, self.trans4, self.trans5, self.trans6]
    def __getitem__(self, index):
        name = self.img_list[index]
        img = Image.open(name).convert('RGB')  # ensure 3 channels before Normalize
        num = random.randint(0, 6)  # pick one of the seven transforms at random
        img = self.trans_list[num](img)
        label = self.cls_list[index]
        if self.ret_name:
            return img, label, name
        else:
            return img, label

    def __len__(self):
        return len(self.img_list)
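Duplicating the minority-class file list is one way to rebalance; a lighter-weight alternative (illustrative values, not from the project) is PyTorch's `WeightedRandomSampler`, which draws each sample with probability inversely proportional to its class size and leaves the dataset itself unchanged:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

# Hypothetical two-class label list: class 0 has six samples, class 1 has two.
labels = torch.tensor([0, 0, 0, 0, 0, 0, 1, 1])
class_counts = torch.bincount(labels)                # tensor([6, 2])
sample_weights = 1.0 / class_counts[labels].float()  # each sample weighted by 1 / its class size

# Draw len(labels) indices per epoch, with replacement, so both classes
# appear with roughly equal frequency in each epoch.
sampler = WeightedRandomSampler(sample_weights, num_samples=len(labels), replacement=True)

dataset = TensorDataset(torch.arange(len(labels)), labels)
loader = DataLoader(dataset, batch_size=4, sampler=sampler)
```

Because the rebalancing happens at draw time, the `Dataset` stays unmodified; pass `sampler=` to the `DataLoader` instead of `shuffle=True`.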
IV. Further Plans
- The competition deadline is next Thursday, May 20.
- The current plan is to combine the relatively mature label-smoothing technique with the attention mechanism and train without loading a pretrained model, using ResNeXt-50 as the backbone network.
- Using the misclassifications from automatic testing and the per-class distribution of the data, check class by class where the problems lie. The most obvious finding so far is that trash images in the solar-panel category cannot be correctly recognized; this calls for targeted sampling methods for imbalanced samples to improve that category's precision and recall.
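The label smoothing mentioned above can be written as a small drop-in replacement for cross-entropy; in the sketch below (the function name is my own) a probability mass of `eps` is spread uniformly over all classes:

```python
import torch
import torch.nn.functional as F

def label_smoothing_ce(logits, target, eps=0.1):
    """Cross-entropy against a smoothed target distribution:
    the true class keeps 1 - eps, and the remaining eps is split
    evenly across all n classes (true class included)."""
    log_probs = F.log_softmax(logits, dim=-1)
    nll = -log_probs.gather(-1, target.unsqueeze(-1)).squeeze(-1)  # standard NLL term
    smooth = -log_probs.mean(dim=-1)                               # uniform-distribution term
    return ((1.0 - eps) * nll + eps * smooth).mean()
```

With `eps=0` this reduces exactly to `F.cross_entropy`, which makes a convenient sanity check before training.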