Machine Learning HW3

秀得水乱流

Last modified: 2023-01-01 15:30:01

Tags: deep learning, neural networks, CNN

First published: 2022-12-31 19:28:45

Article link: https://blog.csdn.net/weixin_46260468/article/details/128506068

Task:

1. Use a CNN to classify food images into 11 classes.

2. Apply data augmentation.

3. Use residual connections to improve model performance.

Dataset:

Training set: 9,866 labeled images; validation set: 3,430 labeled images; test set: 3,347 images.

Baseline:

Simple : 0.50099

Medium : 0.73207 Training Augmentation + Train Longer

Strong : 0.81872 Training Augmentation + Model Design + Train Looonger (+ Cross Validation + Ensemble)

Boss : 0.88446 Training Augmentation + Model Design + Test Time Augmentation + Train Looonger (+ Cross Validation + Ensemble)

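Simple baseline:

The sample code applies no data augmentation and stops training after only a few epochs. The model is also complex relative to the amount of data available. The resulting accuracy is shown above.

1. Following the TA's hints, first add data augmentation operations: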

from torchvision import transforms

train_tfm = transforms.Compose([
    # Resize the image into a fixed shape (height = width = 128)
    # You may add some transforms here.
    transforms.RandomResizedCrop((128, 128), scale=(0.7, 1.0)),  # random crop, then resize

    # Geometric transforms
    transforms.RandomHorizontalFlip(0.5),
    transforms.RandomVerticalFlip(0.5),
    transforms.RandomRotation(180),
    transforms.RandomAffine(30),

    # Pixel-level transforms
    transforms.RandomGrayscale(p=0.2),

    # ToTensor() should be the last one of the transforms.
    transforms.ToTensor(),
])

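About data augmentation: to obtain more data, we only need to make small changes to the existing dataset, such as flips, translations, and rotations. The network treats the changed images as different pictures, which enriches the dataset.

Geometric transforms: apply a geometric transformation to the image, including flipping, rotation, translation, cropping, warping, and scaling.

Pixel-level transforms: these are color-type augmentations. Common ones include noise, blur, color transformation, erasing, and filling.

One point needs attention during augmentation: we must make sure we do not add irrelevant data. Because the model is trained on limited data, data augmentation is an indispensable step.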

2. Network structure: add dropout layers to reduce overfitting.

self.fc = nn.Sequential(
    nn.Linear(512*4*4, 1024),
    nn.ReLU(),
    nn.Dropout(0.25),  # drop probability p = 0.25
    nn.Linear(1024, 512),
    nn.ReLU(),
    nn.Dropout(0.25),
    nn.Linear(512, 11)
)
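As a quick check of what a dropout layer does, the sketch below applies `nn.Dropout` with p = 0.25 to a tensor of ones in training mode and in evaluation mode. The tensor size and variable names are illustrative assumptions.

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.25)
x = torch.ones(1, 1000)

drop.train()                   # training mode: elements are zeroed at random
y_train = drop(x)
kept = y_train[y_train != 0]   # surviving elements, scaled by 1 / (1 - p)

drop.eval()                    # evaluation mode: dropout does nothing
y_eval = drop(x)

print("fraction zeroed in train mode:", float((y_train == 0).float().mean()))
print("value of surviving elements:", float(kept[0]))
print("eval output unchanged:", torch.equal(y_eval, x))
```

Because dropout is only active in training mode, the model should be switched with `model.train()` before training and `model.eval()` before validation or testing.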

3. Tune hyperparameters: training epochs and patience.

# The number of training epochs and patience.
n_epochs = 50
patience = 8 # If no improvement in 'patience' epochs, early stop
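The role of `patience` can be sketched with a small simulation. The helper `run_with_early_stopping` and the accuracy values are hypothetical. The stopping rule used here (stop once `patience` consecutive epochs bring no improvement) is an assumption, and the exact comparison in the actual training script may differ.

```python
def run_with_early_stopping(val_accs, patience):
    """Simulate early stopping over a list of per-epoch validation accuracies.

    Returns (best_acc, best_epoch, stopped_epoch), with epochs counted from 0.
    """
    best_acc, best_epoch, stale = float("-inf"), -1, 0
    stopped_epoch = len(val_accs) - 1
    for epoch, acc in enumerate(val_accs):
        if acc > best_acc:
            # New best model: this is where a checkpoint would be saved.
            best_acc, best_epoch, stale = acc, epoch, 0
        else:
            stale += 1
            if stale >= patience:  # assumed convention, see the note above
                stopped_epoch = epoch
                break
    return best_acc, best_epoch, stopped_epoch


# Made-up validation accuracies, for illustration only.
accs = [0.50, 0.55, 0.60, 0.59, 0.58, 0.60, 0.57, 0.61]
print(run_with_early_stopping(accs, patience=3))
print(run_with_early_stopping(accs, patience=8))
```

With a small patience the run stops before the late improvement at the last epoch. A larger patience lets training continue long enough to reach it, at the cost of more epochs.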

After this first round of optimization, the accuracy improved as shown in the figure:

Using a Residual Implementation

class Residual_Network(nn.Module):
    def __init__(self):
        super(Residual_Network, self).__init__()
        
        self.cnn_layer1 = nn.Sequential(
            nn.Conv2d(3, 64, 3, 1, 1),
            nn.BatchNorm2d(64),
        )

        self.cnn_layer2 = nn.Sequential(
            nn.Conv2d(64, 64, 3, 1, 1),
            nn.BatchNorm2d(64),
        )

        self.cnn_layer3 = nn.Sequential(
            nn.Conv2d(64, 128, 3, 2, 1),
            nn.BatchNorm2d(128),
        )

        self.cnn_layer4 = nn.Sequential(
            nn.Conv2d(128, 128, 3, 1, 1),
            nn.BatchNorm2d(128),
        )
        self.cnn_layer5 = nn.Sequential(
            nn.Conv2d(128, 256, 3, 2, 1),
            nn.BatchNorm2d(256),
        )
        self.cnn_layer6 = nn.Sequential(
            nn.Conv2d(256, 256, 3, 1, 1),
            nn.BatchNorm2d(256),
        )
        self.fc_layer = nn.Sequential(
            nn.Linear(256* 32* 32, 256),
            nn.ReLU(),
            nn.Linear(256, 11)
        )
        self.relu = nn.ReLU()

    def forward(self, x):
        # input (x): [batch_size, 3, 128, 128]
        # output: [batch_size, 11]

        # Extract features with the convolutional layers.
        x1 = self.relu(self.cnn_layer1(x))

        # Skip connection: cnn_layer2 keeps 64 channels and the spatial size,
        # so its input x1 can be added to its output.
        x2 = self.relu(self.cnn_layer2(x1) + x1)

        # cnn_layer3 changes the channel count and resolution, so no skip connection here.
        x3 = self.relu(self.cnn_layer3(x2))

        x4 = self.relu(self.cnn_layer4(x3) + x3)

        x5 = self.relu(self.cnn_layer5(x4))

        x6 = self.relu(self.cnn_layer6(x5) + x5)

        # The extracted feature map must be flattened before the fully-connected layers.
        xout = x6.flatten(1)

        # The fully-connected layers transform the features into the final logits.
        xout = self.fc_layer(xout)
        return xout

model = Residual_Network().to(device)
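The input size `256 * 32 * 32` of the first linear layer follows from the standard convolution output-size formula. The sketch below traces the spatial size through the six convolutions, all with kernel 3 and padding 1. The helper `conv_out` is a hypothetical name introduced for illustration.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 128                           # input images are 128 x 128
for stride in (1, 1, 2, 1, 2, 1):    # strides of cnn_layer1 .. cnn_layer6
    size = conv_out(size, stride=stride)

flat = 256 * size * size             # 256 channels after cnn_layer6
print(size, flat)
```

Only the two stride-2 layers halve the resolution (128 to 64 to 32), so the flattened feature vector has 256 * 32 * 32 entries.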

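The essential formula of a residual network is f(x) = g(x) + x, where x is the input of a block and g(x) is the output of the block's layers.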