Dive into Deep Learning: leaf-classify in practice
Basic approach
- Data augmentation: flips (horizontal, vertical), translation, rotation, brightness/contrast jitter, Normalize
```python
import albumentations as albu
from albumentations.pytorch import ToTensorV2

def get_transform(mode):
    if mode == 'train':
        transform = albu.Compose([
            albu.Resize(256, 256),
            albu.HorizontalFlip(p=0.5),   # pass p by keyword; the first positional arg is always_apply
            albu.VerticalFlip(p=0.5),
            albu.Rotate(limit=180, p=0.6),
            albu.RandomBrightnessContrast(),  # random brightness/contrast jitter
            albu.ShiftScaleRotate(shift_limit=0.25, scale_limit=0.1, rotate_limit=0),
            albu.Normalize(),
            ToTensorV2()])
    else:
        transform = albu.Compose([
            albu.Resize(256, 256),
            albu.Normalize(),
            ToTensorV2()])
    return transform
```
- Model ensembling (resnet50d, resnest50d, seresnext50_32x4d)
Each of these pretrained models is pulled from the timm model zoo and fine-tuned on the leaf 🍃 dataset. Training uses 5-fold cross-validation on the training set, saving several checkpoints per fold, so the ensemble generalizes well. The per-model predictions are then pooled and combined by majority vote (taking the mode) to produce the final result.
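The "take the mode" step can be sketched in pure NumPy; the helper name and the sample prediction arrays below are illustrative, not from the original code:

```python
import numpy as np

def vote(predictions):
    """Majority vote across models.

    predictions: array-like of shape (n_models, n_samples) holding
    predicted class indices. Ties resolve to the smallest class index.
    """
    preds = np.asarray(predictions)
    # For each sample (column), count votes per class and take the argmax
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, preds)

# e.g. three models' predicted class indices for four samples
p1 = [3, 0, 2, 1]
p2 = [3, 1, 2, 1]
p3 = [0, 1, 2, 2]
final = vote([p1, p2, p3])  # majority vote per sample: [3, 1, 2, 1]
```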
Training strategy
- Loss function (criterion): cross-entropy
- Optimizer: AdamW (Adam with decoupled weight decay; similar in spirit to Adam + L2 regularization, but the decay is applied directly to the weights instead of through the gradient)
- Learning rate: a scheduler with cosine annealing (CosineAnnealingLR)
- 5-fold cross-validation, for better generalization
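A minimal sketch of how these pieces fit together, using sklearn's `KFold` and PyTorch's scheduler; the placeholder parameter list, sample count, and hyperparameter values are illustrative, not taken from the original script:

```python
import torch
from sklearn.model_selection import KFold

# A single placeholder parameter stands in for the real model's weights
params = [torch.nn.Parameter(torch.zeros(1))]

kf = KFold(n_splits=5, shuffle=True, random_state=42)
samples = list(range(25))  # stand-in for the training examples
for fold, (train_idx, valid_idx) in enumerate(kf.split(samples)):
    # Fresh optimizer and cosine-annealed schedule for every fold
    optimizer = torch.optim.AdamW(params, lr=1e-4, weight_decay=1e-3)
    # lr decays from 1e-4 down to eta_min over T_max epochs
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
        optimizer, T_max=10, eta_min=1e-6)
    for epoch in range(10):
        # ... train on train_idx, validate on valid_idx ...
        optimizer.step()
        scheduler.step()
```

Re-creating the optimizer and scheduler per fold keeps the folds independent: each restarts from the base learning rate rather than continuing the previous fold's decayed schedule.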
Basic workflow
- Read the data and build the dataset: define a custom class that subclasses `torch.utils.data.Dataset` and implements `__init__`, `__getitem__`, and `__len__`
```python
import cv2
from torch.utils.data import Dataset

class LeafDataset(Dataset):
    def __init__(self, image_paths, labels, mode='train'):
        self.mode = mode
        self.transform = get_transform(self.mode)
        self.image_paths = image_paths
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, index):
        image_path = self.image_paths[index]
        image = cv2.imread(image_path)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR; convert to RGB
        image = self.transform(image=image)['image']
        label = self.labels[index]
        label = class_to_num[label]  # map class name to integer index
        return image, label
```
- Load the dataset with `torch.utils.data.DataLoader`
```python
from torch.utils.data import DataLoader

train_dataset = LeafDataset(image_paths=train_paths.values, labels=train_labels.values, mode='train')
valid_dataset = LeafDataset(image_paths=valid_paths.values, labels=valid_labels.values, mode='valid')
train_loader = DataLoader(train_dataset, batch_size=params['batch_size'],
                          shuffle=True,  # shuffle the training set each epoch
                          num_workers=params['num_workers'])
valid_loader = DataLoader(valid_dataset, batch_size=params['batch_size'],
                          shuffle=False, num_workers=params['num_workers'])
```
- Define the network, loss function, and optimizer
- For each batch (images, labels) drawn from the loader:
    - feed the images through the network
    - compute the loss between the output and the labels
    - zero the optimizer's gradients
    - backpropagate the loss and take a gradient-descent step
```python
import numpy as np
import torch
from tqdm import tqdm

# Training function: runs one epoch and returns the mean loss and accuracy
def train(train_loader, model, criterion, optimizer, epoch, params):
    model.train()
    train_losses = []
    train_accs = []
    for images, labels in tqdm(train_loader):
        images = images.to(params['device'])
        labels = labels.to(params['device'])
        output = model(images)
        loss = criterion(output, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        acc = (output.cpu().argmax(dim=-1) == labels.cpu()).float().mean()
        train_losses.append(loss.item())
        train_accs.append(acc.item())  # .item() so the averages below are plain floats
    train_loss = np.mean(train_losses)
    train_acc = np.mean(train_accs)
    return train_loss, train_acc
```
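The validation counterpart mirrors the training function with the same bookkeeping, but switches to `model.eval()` and disables gradients. A sketch, assuming the same `params` dict; this is not the author's original code:

```python
import numpy as np
import torch
from tqdm import tqdm

def validate(valid_loader, model, criterion, params):
    """Evaluate on the validation set; no parameter updates."""
    model.eval()
    valid_losses = []
    valid_accs = []
    with torch.no_grad():  # no gradients needed for evaluation
        for images, labels in tqdm(valid_loader):
            images = images.to(params['device'])
            labels = labels.to(params['device'])
            output = model(images)
            loss = criterion(output, labels)
            acc = (output.cpu().argmax(dim=-1) == labels.cpu()).float().mean()
            valid_losses.append(loss.item())
            valid_accs.append(acc.item())
    return np.mean(valid_losses), np.mean(valid_accs)
```

Calling `train` and `validate` once per epoch, and keeping the checkpoint with the best validation accuracy, gives the per-fold models that feed the voting ensemble.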