UNeXt Paper Study Notes

Notes from reading the UNeXt paper.



Abstract

General idea: earlier UNet models and their variants achieve high accuracy on segmentation tasks, but they are parameter-heavy and cannot run on point-of-care devices. The authors therefore propose UNeXt. Its working principles and results (the improvements) are summarized below.
How it works:
(1). We propose a tokenized MLP block where we efficiently tokenize and project the convolutional features and use MLPs to model the representation.
(Overall model design: conv + MLP.)

(2). To further boost the performance, we propose shifting the channels of the inputs while feeding in to MLPs so as to focus on learning local dependencies. (Using tokenized MLPs in latent space reduces the number of parameters and computational complexity while being able to result in a better representation to help segmentation.)
(Channel shifting plus MLP projection.)

(3). The network also consists of skip connections between various levels of encoder and decoder. (This is the typical UNet fusion of image features between the encoder and the decoder.)

Results:
We test UNeXt on multiple medical image segmentation datasets and show that we reduce the number of parameters by 72x, decrease the computational complexity by 68x, and improve the inference speed by 10x while also obtaining better segmentation performance over the state-of-the-art medical image segmentation architectures.
(Parameter count and inference time are reduced while good accuracy is maintained.)

Keywords: Medical Image Segmentation. MLP. Point-of-Care.

1. Introduction

Segmentation is an important task for medical imaging systems. Over the years many effective and robust models have been proposed, with UNet as a milestone, and transformer-based models followed later. However, these works paid little attention to model size. (Note that almost all the above works have focused on improving the performance of the network but do not focus much on the computational complexity, inference time or the number of parameters, which are essential in many real-world applications.)

Point-of-care imaging has clear benefits, which motivates the paper: In this work, we focus on solving this problem and design an efficient network that has less computational overhead, low number of parameters, a faster inference time while also maintaining a good performance. Purpose: Designing such a network is essential to suit the shifting trends of medical imaging from laboratory to bed-side.

Inspired by recent MLP-based work, the authors designed the UNeXt architecture. Its characteristics:
We still follow a 5-layer deep encoder-decoder architecture of UNet with skip connections but change the design of each block.
(1): a convolutional stage followed by an MLP stage
(2): fewer filters
(3): tokenized MLP

Summary:
In summary, this paper makes the following contributions:

  1. We propose UNeXt, the first convolutional MLP-based network for image segmentation.
  2. We propose a novel tokenized MLP block with axial shifts to efficiently learn a good representation at the latent space.
  3. We successfully improve the performance on medical image segmentation tasks while having less parameters, high inference speed, and low computational complexity.

2. UNeXt

Overall model architecture:
[Figure: original architecture diagram from the paper]
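(1) We follow C1 = 32, C2 = 64, C3 = 128, C4 = 160, and C5 = 256. Fewer filters are used in order to reduce the parameter count and the computational complexity.
(2) Convolutional Stage: each convolutional block consists of a conv layer + BatchNorm layer + max-pooling layer + ReLU activation. The decoder upsamples with bilinear interpolation.
(3) Shifted MLP: at the code level, the [B, C, H, W] feature map is tokenized to [B, N, C], groups of channels are shifted along one spatial axis (width or height), the shifted tokens are passed through an MLP, and the result is reshaped back to [B, C, H, W]. After the MLP layer, a DWConv supplies the positional encoding.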
Shifting:
[Figure: shift operation, original figure from the paper]
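The shift step can be sketched in plain Python to make the idea concrete. This is a minimal illustration, not the authors' code: the function names `shift_1d` and `axial_shift`, the default of five channel groups, and the zero-filling at the borders are all assumptions made here, following the "axial shifts" wording in the paper's contributions.

```python
def shift_1d(values, offset):
    """Shift a list by `offset` positions (positive = toward higher indices).

    Positions that have no source element are filled with 0.
    """
    n = len(values)
    return [values[i - offset] if 0 <= i - offset < n else 0 for i in range(n)]


def axial_shift(feature_map, axis, shift_size=5):
    """Shift groups of channels along one spatial axis.

    feature_map: nested list indexed as [channel][row][column].
    axis: "w" to shift along the width, "h" to shift along the height.
    shift_size: hypothetical number of channel groups; group k is shifted
                by k - shift_size // 2, e.g. -2, -1, 0, 1, 2 for five groups.
    """
    pad = shift_size // 2
    channels = len(feature_map)
    group_len = -(-channels // shift_size)  # ceiling division
    shifted = []
    for index, channel in enumerate(feature_map):
        offset = index // group_len - pad
        if axis == "w":
            shifted.append([shift_1d(row, offset) for row in channel])
        else:
            # Shift every column, then transpose back to row-major layout.
            columns = [shift_1d(list(col), offset) for col in zip(*channel)]
            shifted.append([list(row) for row in zip(*columns)])
    return shifted


# Five channels, each holding a single row 1..5.
demo = [[[1, 2, 3, 4, 5]] for _ in range(5)]
for channel in axial_shift(demo, "w"):
    print(channel[0])
```

Because each group of channels sees the input at a different offset, the MLP that follows mixes information from neighboring positions, which is how the shift lets an MLP focus on local dependencies.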

Formulas from the paper:
[Equation image from the paper, not reproduced here]
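To see why the narrower channel widths C1 to C5 matter, the sketch below traces feature-map shapes through five encoder stages and counts the weights of a plain chain of 3x3 convolutions. Several things here are assumptions for illustration only: that every stage halves the spatial resolution, that the wider comparison list reflects commonly used UNet widths, and that every stage is a single 3x3 conv (in UNeXt the last stages are tokenized MLP blocks, so these totals are not the model's real parameter count and do not reproduce the 72x figure).

```python
UNEXT_CHANNELS = [32, 64, 128, 160, 256]    # C1..C5 as quoted above
WIDER_CHANNELS = [64, 128, 256, 512, 1024]  # assumption: commonly used UNet widths


def trace_shapes(channels, height, width):
    """Return (C, H, W) after each stage, assuming each stage halves H and W."""
    shapes = []
    for c in channels:
        height, width = height // 2, width // 2
        shapes.append((c, height, width))
    return shapes


def conv3x3_params(c_in, c_out):
    """Weights plus biases of one 3x3 convolution."""
    return 3 * 3 * c_in * c_out + c_out


def chain_params(channels, c_in=3):
    """Total parameters of one 3x3 conv per stage, chained input to output."""
    total = 0
    for c_out in channels:
        total += conv3x3_params(c_in, c_out)
        c_in = c_out
    return total


for shape in trace_shapes(UNEXT_CHANNELS, 512, 512):
    print(shape)
print("narrow chain:", chain_params(UNEXT_CHANNELS))
print("wide chain:  ", chain_params(WIDER_CHANNELS))
```

Since a convolution's weight count grows with the product of its input and output channels, capping the deepest stages at 160 and 256 channels removes most of the cost that the widest layers would otherwise carry.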

3. Experiments and Results

We develop UNeXt using PyTorch framework. We use a combination of binary cross entropy (BCE) and dice loss to train UNeXt.
[Image from the paper, not reproduced here]
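A combined BCE and dice loss can be sketched in plain Python as follows. The notes only say "a combination", so the weight of 0.5 on the BCE term, the smoothing constant, and the function names are assumptions made for this illustration; inputs are flat lists of predicted probabilities and binary targets.

```python
import math


def bce(probs, targets, eps=1e-7):
    """Mean binary cross entropy over flat lists of probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1e-5):
    """One minus the soft dice coefficient."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def bce_dice_loss(probs, targets, bce_weight=0.5):
    """Weighted sum of the two terms; bce_weight is an assumed value."""
    return bce_weight * bce(probs, targets) + dice_loss(probs, targets)


good = bce_dice_loss([0.9, 0.1, 0.8, 0.2], [1, 0, 1, 0])
bad = bce_dice_loss([0.1, 0.9, 0.2, 0.8], [1, 0, 1, 0])
print(good < bad)
```

The BCE term scores every pixel independently, while the dice term measures overlap over the whole mask, which helps when the foreground region is small relative to the image.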

We use an Adam optimizer with a learning rate of 0.0001 and momentum of 0.9. We also use a cosine annealing learning rate scheduler with a minimum learning rate up to 0.00001. The batch size is set equal to 8. We train UNeXt for a total of 400 epochs. We perform an 80-20 random split thrice across the dataset and report the mean and variance.
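The schedule and the evaluation protocol quoted above can be sketched with the standard library alone. This is a hedged illustration: the cosine formula is the usual closed form for cosine annealing without restarts, and the helper names, the seeds, and the assumption that the schedule spans all 400 epochs are choices made here, not details given in the notes.

```python
import math
import random


def cosine_annealing_lr(epoch, total_epochs=400, base_lr=1e-4, min_lr=1e-5):
    """Learning rate at `epoch`, decaying from base_lr to min_lr along a cosine."""
    cosine = math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + cosine)


def random_split(num_samples, seed):
    """Return (train_indices, test_indices) for one 80-20 random split."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    cut = num_samples * 4 // 5
    return indices[:cut], indices[cut:]


for epoch in (0, 100, 200, 300, 400):
    print(epoch, cosine_annealing_lr(epoch))

# Three independent splits; the seeds 0, 1, 2 are arbitrary.
splits = [random_split(100, seed) for seed in range(3)]
print([(len(train), len(test)) for train, test in splits])
```

A metric would then be computed once per split and summarized with `statistics.mean` and `statistics.variance` to obtain the reported mean and variance.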
The authors' experimental comparison:
[Figures: comparison results from the paper]

Ablation study

[Figures: ablation results from the paper]
