Deep Learning Week 4: MobileNet

贝叶贝叶贝叶斯

Tags: deep learning, artificial intelligence, CNN

Article link: https://blog.csdn.net/weixin_45573034/article/details/126115820

MobileNet v1

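Traditional convolutional neural networks need a lot of memory and computation, which makes them impractical on embedded devices. For example, the weights of the 152-layer ResNet can reach 644M, a file size that is essentially unusable on a mobile device.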

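MobileNet, proposed by Google, is a lightweight CNN aimed at embedded devices. At the cost of some accuracy, it greatly reduces the number of parameters and the amount of computation, so that the model runs well on embedded hardware.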

Depthwise (DW) Convolution
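
As a minimal sketch (not taken from the original post), a depthwise convolution can be written in PyTorch as an `nn.Conv2d` whose `groups` equals the number of input channels, so that each kernel filters a single channel. MobileNet v1 follows it with a 1x1 pointwise convolution that mixes the channels. The class name `DepthwiseSeparableConv`, the helper `count_conv_weights` and the channel sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    """Depthwise 3x3 conv followed by a pointwise 1x1 conv (illustrative name)."""

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.depthwise = nn.Sequential(
            # groups=in_channels: one 3x3 kernel per input channel
            nn.Conv2d(in_channels, in_channels, kernel_size=3, stride=stride,
                      padding=1, groups=in_channels, bias=False),
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
        )
        self.pointwise = nn.Sequential(
            # 1x1 convolution combines information across channels
            nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


def count_conv_weights(module):
    """Count convolution weights only, ignoring BatchNorm parameters."""
    return sum(m.weight.numel() for m in module.modules()
               if isinstance(m, nn.Conv2d))


separable = DepthwiseSeparableConv(32, 64)
standard = nn.Conv2d(32, 64, kernel_size=3, padding=1, bias=False)

x = torch.randn(1, 32, 56, 56)
y = separable(x)
print(y.shape)                        # same output shape as the standard conv
print(count_conv_weights(separable))  # 3*3*32 + 32*64 weights
print(count_conv_weights(standard))   # 3*3*32*64 weights
```

For this layer size the separable version needs roughly one eighth of the convolution weights of the standard 3x3 convolution.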

Network structure

Model parameter comparison

As the figure above shows, MobileNet's accuracy on ImageNet is close to that of VGG16, while its computation and parameter count are far smaller than VGG16's.

Hyperparameter settings

$\alpha$ is a width multiplier: it scales the number of convolution kernels used in each layer.

The figure above shows how different values of $\alpha$ affect the model.

$\beta$ is the resolution parameter, which controls the size of the input image.

The figure above shows how changing the input image size affects the model's accuracy and parameters.
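
To make the effect of the two hyperparameters concrete, here is a small sketch that evaluates the multiply-add cost of one depthwise separable layer, $D_K \cdot D_K \cdot \alpha M \cdot (\beta D_F)^2 + \alpha M \cdot \alpha N \cdot (\beta D_F)^2$, as given in the MobileNet v1 paper (which calls the resolution multiplier $\rho$). The function name `separable_mult_adds` and the layer size (512 channels in and out, 14x14 feature map) are assumptions chosen for illustration.

```python
def separable_mult_adds(kernel, in_ch, out_ch, feat, alpha=1.0, beta=1.0):
    """Multiply-adds of one depthwise separable layer (hypothetical helper).

    alpha scales the channel counts, beta scales the feature-map side length.
    """
    m = int(alpha * in_ch)
    n = int(alpha * out_ch)
    f = int(beta * feat)
    depthwise = kernel * kernel * m * f * f
    pointwise = m * n * f * f
    return depthwise + pointwise


base = separable_mult_adds(3, 512, 512, 14)
half_width = separable_mult_adds(3, 512, 512, 14, alpha=0.5)
half_res = separable_mult_adds(3, 512, 512, 14, beta=0.5)

for name, cost in [("alpha=1.0, beta=1.0", base),
                   ("alpha=0.5, beta=1.0", half_width),
                   ("alpha=1.0, beta=0.5", half_res)]:
    print(f"{name}: {cost} mult-adds ({cost / base:.3f} of baseline)")
```

The cost falls roughly with $\alpha^2$ and $\beta^2$, which is why small reductions in width or resolution save a lot of computation.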

MobileNet v2

Inverted residual block

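In this block, the last layer uses a linear activation. The reason is that ReLU destroys a considerable amount of information in low-dimensional features, and the last layer outputs a low-dimensional feature matrix, so a linear activation is used there to reduce the loss.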
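
A simplified sketch of such a block in PyTorch is given below. It follows the general MobileNet v2 pattern (1x1 expansion, 3x3 depthwise convolution, 1x1 linear projection), but the class name `InvertedResidual`, the default `expand_ratio=6` and the tensor sizes are assumptions for illustration, not code from the original post.

```python
import torch
import torch.nn as nn


class InvertedResidual(nn.Module):
    """Expand -> depthwise -> linear projection (simplified, illustrative)."""

    def __init__(self, in_channels, out_channels, stride=1, expand_ratio=6):
        super().__init__()
        hidden = in_channels * expand_ratio
        # The shortcut is only valid when input and output shapes match
        self.use_shortcut = stride == 1 and in_channels == out_channels
        self.block = nn.Sequential(
            # 1x1 convolution expands to a higher dimension
            nn.Conv2d(in_channels, hidden, kernel_size=1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 3x3 depthwise convolution
            nn.Conv2d(hidden, hidden, kernel_size=3, stride=stride,
                      padding=1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 1x1 projection back to a low dimension, with NO activation
            nn.Conv2d(hidden, out_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels),
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_shortcut else out


x = torch.randn(2, 24, 28, 28)
same = InvertedResidual(24, 24)(x)            # shortcut is used
down = InvertedResidual(24, 32, stride=2)(x)  # no shortcut, spatial size halves
print(same.shape, down.shape)
```

Note that the final 1x1 convolution is followed only by batch normalization, which is the linear activation described above.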

MobileNet v3

MobileNet v3 vs. v2

Here Top-1 represents the model's accuracy and P-1 represents its inference speed. Compared with v2, v3 is both more accurate and faster.

Block improvements

(1) An SE module is added

(2) The activation functions are updated

The h-swish activation replaces swish, and h-sigmoid replaces sigmoid.

$\mathrm{ReLU6}(x)=\min(\max(x,0),6)$

$\text{h-sigmoid}(x)=\frac{\mathrm{ReLU6}(x+3)}{6}$

$\mathrm{swish}(x)=x\cdot \sigma(x)$

$\text{h-swish}(x)=x\cdot \frac{\mathrm{ReLU6}(x+3)}{6}$
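
The four formulas can be checked numerically with a few lines of plain Python. This is only an illustrative sketch; the function names simply mirror the formulas above.

```python
import math


def relu6(x):
    return min(max(x, 0.0), 6.0)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def h_sigmoid(x):
    # Piecewise-linear approximation of sigmoid
    return relu6(x + 3.0) / 6.0


def swish(x):
    return x * sigmoid(x)


def h_swish(x):
    # Same shape as swish, but without the exponential
    return x * relu6(x + 3.0) / 6.0


for x in [-4.0, -1.0, 0.0, 1.0, 4.0]:
    print(f"x={x:+.1f}  sigmoid={sigmoid(x):.4f}  h-sigmoid={h_sigmoid(x):.4f}  "
          f"swish={swish(x):+.4f}  h-swish={h_swish(x):+.4f}")
```

The hard versions need only additions, multiplications and clamping, which makes them cheaper to evaluate than the exponential in sigmoid.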

SENet

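Much existing work on convolution improves network performance along the spatial dimension. SENet instead uses the relationships between channels to optimize the network, which is the idea behind Squeeze-and-Excitation Networks (SENet for short).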

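First comes the Squeeze operation: features are compressed along the spatial dimensions so that each two-dimensional feature channel becomes a single real number. This number has, to some extent, a global receptive field, and the output dimension matches the number of input feature channels. Next comes the Excitation operation, a mechanism similar to the gates in recurrent neural networks. It generates a weight for each feature channel through parameters that are learned to explicitly model the correlations between channels. Last comes the Reweight operation: the weights output by Excitation are treated as the importance of each feature channel after feature selection, and they are multiplied channel by channel onto the earlier features, which recalibrates the original features along the channel dimension.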

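In the figure above, global average pooling is used as the Squeeze operation. Two fully connected layers then form a bottleneck structure that models the correlations between channels and outputs as many weights as there are input feature channels.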

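Compared with using a single fully connected layer, this has two advantages: 1) it has more non-linearity and can better fit complex correlations between channels; 2) it greatly reduces the number of parameters and the amount of computation. A sigmoid gate then produces normalized weights between 0 and 1, and finally a Scale operation applies the normalized weights to the features of each channel.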
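
The Squeeze, Excitation and Scale steps described above can be sketched as a small PyTorch module. The class name `SEBlock`, the reduction ratio of 16 and the tensor sizes are assumptions for illustration; the post itself does not give an implementation.

```python
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Squeeze-and-Excitation block (illustrative sketch)."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        # Squeeze: global average pooling turns each channel into one number
        self.squeeze = nn.AdaptiveAvgPool2d(1)
        # Excitation: two FC layers forming a bottleneck, then a sigmoid gate
        self.excitation = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def channel_weights(self, x):
        b, c, _, _ = x.shape
        return self.excitation(self.squeeze(x).view(b, c))

    def forward(self, x):
        b, c, _, _ = x.shape
        # Scale: multiply every channel by its learned weight in [0, 1]
        return x * self.channel_weights(x).view(b, c, 1, 1)


se = SEBlock(64)
x = torch.randn(2, 64, 8, 8)
y = se(x)
w = se.channel_weights(x)

bottleneck_params = sum(p.numel() for p in se.parameters())
single_fc_params = 64 * 64 + 64  # one 64 -> 64 FC layer with bias
print(y.shape, bottleneck_params, single_fc_params)
```

With 64 channels the bottleneck needs 580 parameters, against 4160 for a single 64-to-64 fully connected layer, which illustrates the second advantage listed above.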

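SENet is very simple to construct and easy to deploy, and it requires no new functions or layers. For example, SE-ResNet-50 has about 10% more parameters than ResNet-50, but this increase comes from the two fully connected layers. Experiments found that removing the SE setup from the 3 building blocks of the last stage reduces the 10% increase to 2%, with almost no loss in accuracy.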

The SE module is applied to ImageNet experiments below:

As can be seen, using the SE module does reduce the model's error, and SE-ResNet-101 even clearly outperforms the deeper ResNet-152.

3D convolution vs. 2D convolution

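A 2D convolution operates on a two-dimensional matrix, for example a convolution over an image. A 3D convolution slides a three-dimensional kernel over a volume to produce its output. 3D convolution can be applied to tasks such as video classification and image segmentation.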
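
The difference is easiest to see in the tensor shapes. The following sketch compares `nn.Conv2d` and `nn.Conv3d` in PyTorch; the tensor sizes are arbitrary values chosen for illustration.

```python
import torch
import torch.nn as nn

# 2D convolution: the kernel slides over height and width only
image = torch.randn(1, 3, 25, 25)        # (batch, channels, height, width)
conv2d = nn.Conv2d(3, 8, kernel_size=3)
out2d = conv2d(image)
print(out2d.shape)                       # torch.Size([1, 8, 23, 23])

# 3D convolution: the kernel also slides along a third (depth) axis
volume = torch.randn(1, 1, 30, 25, 25)   # (batch, channels, depth, height, width)
conv3d = nn.Conv3d(1, 8, kernel_size=(7, 3, 3))
out3d = conv3d(volume)
print(out3d.shape)                       # torch.Size([1, 8, 24, 23, 23])
```

In the 3D case the depth axis shrinks from 30 to 24 because the kernel spans 7 positions along it, just as height and width shrink from 25 to 23 with a kernel size of 3.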

Coding exercise

This exercise implements HybridSN for hyperspectral image classification.

Load the dataset

import wget
url_1 = 'http://www.ehu.eus/ccwintco/uploads/6/67/Indian_pines_corrected.mat'
url_2 = 'http://www.ehu.eus/ccwintco/uploads/c/c4/Indian_pines_gt.mat'
Indian_pines_corrected = wget.download(url_1)
Indian_pines_gt = wget.download(url_2)

Build the HybridSN network

import torch
import torch.nn as nn
import torch.nn.functional as F


class HybridSN(nn.Module):
    def __init__(self):
        super(HybridSN, self).__init__()
        # Three 3D convolutions extract joint spectral-spatial features.
        # Expected input shape: (batch, 1, 30, 25, 25)
        self.conv3d_1 = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=(7, 3, 3), stride=1, padding=0),
            nn.BatchNorm3d(8),
            nn.ReLU(inplace=True),
        )
        self.conv3d_2 = nn.Sequential(
            nn.Conv3d(8, 16, kernel_size=(5, 3, 3), stride=1, padding=0),
            nn.BatchNorm3d(16),
            nn.ReLU(inplace=True),
        )
        self.conv3d_3 = nn.Sequential(
            nn.Conv3d(16, 32, kernel_size=(3, 3, 3), stride=1, padding=0),
            nn.BatchNorm3d(32),
            nn.ReLU(inplace=True),
        )

        # 2D convolution over the 32 * 18 = 576 stacked feature maps
        self.conv2d_4 = nn.Sequential(
            nn.Conv2d(576, 64, kernel_size=(3, 3), stride=1, padding=0),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
        )
        # 64 * 17 * 17 = 18496 features after flattening
        self.fc1 = nn.Linear(18496, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 16)
        self.dropout = nn.Dropout(p=0.4)

    def forward(self, x):
        out = self.conv3d_1(x)
        out = self.conv3d_2(out)
        out = self.conv3d_3(out)
        # Merge channel and spectral axes: (B, 32, 18, 19, 19) -> (B, 576, 19, 19)
        out = self.conv2d_4(out.reshape(out.shape[0], -1, 19, 19))
        # Flatten to (B, 18496) for the fully connected layers
        out = out.reshape(out.shape[0], -1)
        out = F.relu(self.dropout(self.fc1(out)))
        out = F.relu(self.dropout(self.fc2(out)))
        out = self.fc3(out)
        return out
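
The hard-coded sizes 576 and 18496 in the network above follow from the input shape (1, 30, 25, 25), that is, 30 PCA components and 25x25 patches as configured later in the post. The short sketch below re-derives them with the standard convolution output-size formula; the helper name `conv_out` is hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


# Spectral depth and spatial size of one input patch
depth, height, width = 30, 25, 25

# Kernel sizes of the three 3D convolutions
for kd, kh, kw in [(7, 3, 3), (5, 3, 3), (3, 3, 3)]:
    depth = conv_out(depth, kd)
    height = conv_out(height, kh)
    width = conv_out(width, kw)

# The 32 output channels and the remaining depth are merged for the 2D conv
channels_2d = 32 * depth

# The 3x3 2D convolution with 64 output channels, then flattening
height_2d, width_2d = conv_out(height, 3), conv_out(width, 3)
flat_features = 64 * height_2d * width_2d

print(depth, height, width)   # size after the 3D convolutions
print(channels_2d)            # input channels of conv2d_4
print(flat_features)          # input features of fc1
```

If `patch_size` or `pca_components` is changed, these two constants and the `19, 19` in the reshape have to be recomputed in the same way.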

Load and split the dataset
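
The code in this section calls three helpers, `applyPCA`, `createImageCubes` and `splitTrainTestSet`, that the post does not define. The following is one plausible implementation, inferred only from how they are called; the argument names beyond those used in the post (`removeZeroLabels`, `randomState`) and the assumption that label 0 means "unlabeled" are mine, and the demo at the end uses random data rather than the Indian Pines scene.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split


def applyPCA(X, numComponents):
    """Reduce the spectral axis of an (H, W, bands) cube with PCA."""
    flat = X.reshape(-1, X.shape[2])
    reduced = PCA(n_components=numComponents, whiten=True).fit_transform(flat)
    return reduced.reshape(X.shape[0], X.shape[1], numComponents)


def createImageCubes(X, y, windowSize=5, removeZeroLabels=True):
    """Cut a windowSize x windowSize patch centered on every labeled pixel."""
    margin = (windowSize - 1) // 2
    # Zero-pad the two spatial axes so border pixels also get full patches
    padded = np.pad(X, ((margin, margin), (margin, margin), (0, 0)))
    patches, labels = [], []
    for r in range(X.shape[0]):
        for c in range(X.shape[1]):
            if removeZeroLabels and y[r, c] == 0:
                continue  # assumption: label 0 marks unlabeled pixels
            patches.append(padded[r:r + windowSize, c:c + windowSize])
            labels.append(y[r, c])
    patches = np.stack(patches)
    labels = np.asarray(labels, dtype=np.int64)
    if removeZeroLabels:
        labels -= 1  # shift classes 1..N to 0..N-1
    return patches, labels


def splitTrainTestSet(X, y, testRatio, randomState=345):
    """Stratified train/test split; returns Xtrain, Xtest, ytrain, ytest."""
    return train_test_split(X, y, test_size=testRatio,
                            random_state=randomState, stratify=y)


# Tiny synthetic demo (random data, not the real dataset)
rng = np.random.default_rng(0)
X_demo = rng.random((6, 6, 10))
y_demo = np.tile([0, 1, 2], 12).reshape(6, 6)

X_demo_pca = applyPCA(X_demo, numComponents=3)
cubes, cube_labels = createImageCubes(X_demo_pca, y_demo, windowSize=5)
Xtr, Xte, ytr, yte = splitTrainTestSet(cubes, cube_labels, 0.5)
print(X_demo_pca.shape, cubes.shape, Xtr.shape, Xte.shape)
```

On the real data the patch array is large (one 25x25x30 patch per labeled pixel), so memory use is worth keeping an eye on.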

import numpy as np
import scipy.io as sio
import torch

# applyPCA, createImageCubes and splitTrainTestSet are helper functions
# that must be defined before this cell runs.

# Number of land-cover classes
class_num = 16
X = sio.loadmat('Indian_pines_corrected.mat')['indian_pines_corrected']
y = sio.loadmat('Indian_pines_gt.mat')['indian_pines_gt']

# Fraction of samples used for testing
test_ratio = 0.90
# Size of the patch extracted around each pixel
patch_size = 25
# Number of principal components kept after PCA
pca_components = 30

print('Hyperspectral data shape: ', X.shape)
print('Label shape: ', y.shape)

print('\n... ... PCA transformation ... ...')
X_pca = applyPCA(X, numComponents=pca_components)
print('Data shape after PCA: ', X_pca.shape)

print('\n... ... create data cubes ... ...')
X_pca, y = createImageCubes(X_pca, y, windowSize=patch_size)
print('Data cube X shape: ', X_pca.shape)
print('Data cube y shape: ', y.shape)

print('\n... ... create train & test data ... ...')
Xtrain, Xtest, ytrain, ytest = splitTrainTestSet(X_pca, y, test_ratio)
print('Xtrain shape: ', Xtrain.shape)
print('Xtest  shape: ', Xtest.shape)

# Reshape Xtrain and Xtest to (N, H, W, C, 1), the channels-last layout Keras expects
Xtrain = Xtrain.reshape(-1, patch_size, patch_size, pca_components, 1)
Xtest  = Xtest.reshape(-1, patch_size, patch_size, pca_components, 1)
print('before transpose: Xtrain shape: ', Xtrain.shape)
print('before transpose: Xtest  shape: ', Xtest.shape)

# PyTorch expects (N, 1, C, H, W), so the axes have to be transposed
Xtrain = Xtrain.transpose(0, 4, 3, 1, 2)
Xtest  = Xtest.transpose(0, 4, 3, 1, 2)
print('after transpose: Xtrain shape: ', Xtrain.shape)
print('after transpose: Xtest  shape: ', Xtest.shape)


# Training dataset
class TrainDS(torch.utils.data.Dataset):
    def __init__(self):
        self.len = Xtrain.shape[0]
        self.x_data = torch.FloatTensor(Xtrain)
        self.y_data = torch.LongTensor(ytrain)

    def __getitem__(self, index):
        # Return the sample and its label for the given index
        return self.x_data[index], self.y_data[index]

    def __len__(self):
        # Return the number of samples
        return self.len


# Testing dataset
class TestDS(torch.utils.data.Dataset):
    def __init__(self):
        self.len = Xtest.shape[0]
        self.x_data = torch.FloatTensor(Xtest)
        self.y_data = torch.LongTensor(ytest)

    def __getitem__(self, index):
        # Return the sample and its label for the given index
        return self.x_data[index], self.y_data[index]

    def __len__(self):
        # Return the number of samples
        return self.len


# Create trainloader and testloader
trainset = TrainDS()
testset  = TestDS()
train_loader = torch.utils.data.DataLoader(dataset=trainset, batch_size=32, shuffle=True, num_workers=0)
test_loader  = torch.utils.data.DataLoader(dataset=testset,  batch_size=32, shuffle=False, num_workers=0)

Train the network

import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
from tqdm import tqdm

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)
# Move the network to the GPU if one is available
net = HybridSN().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(net.parameters(), lr=0.001)

# Start training
num_epochs = 100
total_loss = 0
loss_epoch = []
net.train()
for epoch in tqdm(range(num_epochs)):
    for inputs, labels in train_loader:
        inputs = inputs.to(device)
        labels = labels.to(device)
        # Reset the gradients held by the optimizer
        optimizer.zero_grad()
        # Forward pass + backward pass + parameter update
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    # "loss avg" is the accumulated batch loss divided by the number of epochs so far
    print('[Epoch: %d]   [loss avg: %.4f]   [current loss: %.4f]' % (epoch + 1, total_loss / (epoch + 1), loss.item()))
    loss_epoch.append(loss.item())

# Plot the loss of the last batch of every epoch
train_epoch = np.arange(1, num_epochs + 1)
plt.plot(train_epoch, loss_epoch, 'o')
plt.show()

Training results:

Test the model

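import numpy as np
import torch
from sklearn.metrics import classification_report

# Switch BatchNorm and Dropout to inference behavior before testing
net.eval()
predictions = []
# Model testing; no gradients are needed here
with torch.no_grad():
    for inputs, _ in test_loader:
        inputs = inputs.to(device)
        outputs = net(inputs)
        predictions.append(outputs.argmax(dim=1).cpu().numpy())
y_pred_test = np.concatenate(predictions)

# Generate the classification report
classification = classification_report(ytest, y_pred_test, digits=4)
print(classification)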

Test results: