《A two-stage 3D Unet framework for multi-class segmentation on full resolution image》2018

1. Abstract

Deep convolutional neural networks (CNNs) have been intensively used for multi-class segmentation of data from different modalities and achieved state-of-the-art performances. However, a common problem when dealing with large, high resolution 3D data is that the volumes input into the deep CNNs have to be either cropped or downsampled due to limited memory capacity of computing devices. These operations lead to loss of resolution and increment of class imbalance in the input data batches, which can downgrade the performance of segmentation algorithms. Inspired by the architecture of image super-resolution CNN (SRCNN) and self-normalization network (SNN), we developed a two-stage modified U-Net framework that simultaneously learns to detect a ROI within the full volume and to classify voxels without losing the original resolution. Experiments on a variety of multi-modal volumes demonstrated that, when trained with simply weighted Dice coefficients and our customized learning procedure, this framework shows better segmentation performance than state-of-the-art deep CNNs with advanced similarity metrics.

Summary:
1. This paper addresses two problems: (1) loss of resolution in the input data batches fed to deep networks; (2) increment of class imbalance.
2. The authors draw on SRCNN (super-resolution CNN) and SNN (self-normalization network). Both are worth a look for interested readers, but I see little relevance to my current work.
3. The authors propose a two-stage U-Net that detects a region of interest within the full volume and classifies voxels. Since I also want to extract ROIs from micro-expression sequences, this is a useful reference.

Questions and thoughts:
1. How is the model trained with the Dice coefficient?
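On the Dice question: the paper only says it uses "simply weighted dice coefficients", so the exact weighting scheme is not specified. A minimal NumPy sketch of what a weighted soft Dice loss could look like (my own illustration, not the authors' code):

```python
import numpy as np

def weighted_dice_loss(probs, targets, weights, eps=1e-6):
    """Soft Dice loss averaged over classes with per-class weights.

    probs:   (C, D, H, W) predicted class probabilities (softmax output)
    targets: (C, D, H, W) one-hot ground-truth labels
    weights: length-C per-class weights (e.g. inverse class frequency)
    """
    dices = []
    for c in range(probs.shape[0]):
        p, t = probs[c].ravel(), targets[c].ravel()
        inter = (p * t).sum()
        # soft Dice for class c; eps avoids division by zero for empty classes
        dice = (2.0 * inter + eps) / (p.sum() + t.sum() + eps)
        dices.append(weights[c] * dice)
    # loss = 1 - weighted mean Dice over classes
    return 1.0 - sum(dices) / sum(weights)
```

A perfect prediction (probs equal to the one-hot targets) gives a loss of 0; weighting rare classes more heavily is one common way to counter the class imbalance the abstract mentions.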

2. Method

(Fig. 1: overall architecture of the two concatenated modified U-Nets)

Summary:
1. Two-stage == two passes of a 3D U-Net.

Questions and thoughts:
1. What does "3d-2d conv" mean?
2. What does the black box represent?
3. What does "identity" mean?
4. Does the first 3D U-Net use 3D pooling and the second 2D pooling?
5. What is a channel-wise convolution?
6. What does the padding from the downsampling layers to the upsampling layers mean?

The proposed DCNN model classifies all the voxels within an axial slice based on a pre-defined neighborhood of axial slices around it. As shown in Fig. 1, the complete model consists of two concatenated modified U-Nets. Generally following the architecture of the original 2D U-Net, the basic block of this model consists of two convolutional layers, each followed by a nonlinear activation, and a 2 × 2 × 2 pooling layer. Both the contracting (encoding) and the expansive (decoding) paths of the two U-Nets have 4 basic U-Net blocks [7]. (Note that the activation and pooling layers within each block are not shown in Fig. 1 for clarity.) Each of the two concatenated networks has 23 convolutional layers. The final outputs of both networks are produced by a softmax classification layer.
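Since each path has 4 blocks ending in a 2 × 2 × 2 pooling, the contracting path reduces each spatial dimension by a factor of 2⁴ = 16. A small shape-tracking sketch (my own, not from the paper):

```python
def encoder_shapes(shape, n_blocks=4, pool=(2, 2, 2)):
    """Track the feature-map shape after each pooling block of the contracting path."""
    shapes = [shape]
    for _ in range(n_blocks):
        # each 2x2x2 pooling halves every spatial dimension
        shape = tuple(s // p for s, p in zip(shape, pool))
        shapes.append(shape)
    return shapes
```

For a 128³ input, `encoder_shapes((128, 128, 128))` walks down 64³, 32³, 16³ to an 8³ bottleneck, which illustrates why large volumes must be cropped or downsampled before entering the network.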

Summary:
1. Each block of the 2D U-Net also contains two convolutional layers, as do the various U-Net variants.
2. In the 2D U-Net each downsampling stage ends in a 2×2 max pool; in the 3D U-Net each stage ends in a 2×2×2 max pool.
3. The encoders and decoders of the 3D U-Net and the 2D U-Net are alike: each contains 4 blocks.

Questions and thoughts:
1. If each network is followed by a softmax classification layer, a classification result should be produced there, but this is not reflected in the framework figure.
2. Is the left image in the third row the classification result after the softmax classification layer? Why is it not 3D?

The first network (Net1 in Fig. 1) uses the down-sampled 3D volume to make a coarse prediction of the voxel labels. The produced label volume is then resampled to the original resolution. To capture information from a larger effective receptive field, we use slightly dilated 5 × 5 × 5 convolutional kernels with zero-padding, which preserves the shapes of the feature maps. In the n-th block of the contracting path, the dilation rate of the convolutional kernel is 2^n. This pattern is reversed in the expansive path. Each convolutional layer is followed by a rectified linear unit (ReLU), and a dropout layer with a 0.2 dropout rate is attached to each U-Net block. In the test phase, a dynamic-tile layer is introduced between Net1 and Net2 to crop out a region of interest (ROI) from both the input and output volumes of Net1. This layer is removed when performing end-to-end training to simplify implementation.
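The "zero-padding which preserves shapes" works out numerically: for a kernel of size k with dilation d and stride 1, a padding of d·(k−1)/2 keeps the output size equal to the input size. A quick check (my own sketch) using the standard convolution output-size formula:

```python
def conv_out_size(n, k, d=1, p=0, s=1):
    """Output length of a 1-D convolution: floor((n + 2p - d*(k-1) - 1)/s) + 1."""
    return (n + 2 * p - d * (k - 1) - 1) // s + 1

def same_padding(k, d=1):
    """Padding that keeps output size equal to input size (stride 1)."""
    return d * (k - 1) // 2

# e.g. the 5x5x5 kernel with dilation rate 2**n in the n-th contracting block
for n in range(1, 5):
    d = 2 ** n
    assert conv_out_size(64, 5, d=d, p=same_padding(5, d)) == 64
```

So "zero-padding" here means padding with zeros of the appropriate width (4 voxels per side at dilation 2, for instance), not a padding of size 0.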

Summary:
1. The ROI here is obtained by cropping, which is not helpful for my research.

Questions and thoughts:
1. What does "a coarse prediction of the voxel labels" refer to? Is it the output of the first 3D U-Net?
2. How can a 5×5×5 convolutional layer with padding = 0 preserve the shape of the feature maps? (Note: the paper says "zero-padding", i.e. padding with zeros, not a padding of size 0.)
3. Is the dilated convolution used only in the second 3D U-Net, or in both?

The architecture of Net2 is inspired by the deep Super-Resolution Convolutional Neural Network (SRCNN) [15] with skip connections and recursive units [16]. The input of this network is a two-channel 4D volume composed of the output of Net1 and the original data. The convolutional kernel size in the contracting path is 3 × 3 × 3, and 5 × 5 × 5 in the expansive path. Different from Net1, the size of the 3D pooling kernels in the contracting path is 2 × 2 × 1 to keep the number of axial slices. A 3D-2D slice-wise convolution block with 1 × 1 × (K − 1) convolutional kernels is introduced before the expansive path, where K is the number of neighboring slices used to label one single axial slice. No zero-padding is used, so that every K input slices generate one single axial feature map. Furthermore, K should always be an odd number to prevent generating labels for interpolated slices. The following layers before the output of Net2 perform 2D convolutions and pooling.
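A shape sketch for the two operations described above (my own, under the assumption of (H, W, D) ordering with D the number of axial slices): the 2 × 2 × 1 pooling halves the in-plane size while keeping D, and a valid (unpadded) 1 × 1 × k slice-wise convolution shrinks the slice axis to D − k + 1:

```python
def pool_2x2x1(shape):
    """2x2x1 pooling: halve H and W, keep the number of axial slices D."""
    h, w, d = shape
    return (h // 2, w // 2, d)

def slicewise_valid_conv(shape, k):
    """Valid 1x1xk convolution along the slice axis: D -> D - k + 1."""
    h, w, d = shape
    return (h, w, d - k + 1)
```

These general formulas let one check the slice-axis bookkeeping the paper describes for K neighboring slices; for example, a valid kernel spanning all 5 of 5 slices yields a single axial feature map.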

Summary:
1. Net1 is the first 3D U-Net; Net2 is the second 3D U-Net.

Questions and thoughts:
1. What are axial slices?
2. An ordinary one-dimensional convolution is C×1×1; what are the input and output shapes of a 1×1×y one-dimensional convolution?
3. How should "No zero-paddings are used so that every K input slices will generate one single axial feature map" be understood?
4. How should "K should always be an odd number to prevent generating labels for interpolated slices" be understood?

(Table: experimental results)

Questions and thoughts:
1. Do medical segmentation problems always involve both coarse and fine classification?
2. What are L1 and L2, respectively?

3. References

Papers worth further reading:

12. 3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation

13. Hierarchical 3D Fully Convolutional Networks for Multi-Organ Segmentation

14. Automatic 3D Cardiovascular MR Segmentation with Densely-Connected Volumetric ConvNets

17. An Adaptive Sampling Scheme to Efficiently Train Fully Convolutional Networks for Semantic Segmentation

4. Outlook

This write-up was done casually for learning purposes and does not carry much reference value.
The paper has no code on GitHub, so it is not worth studying in depth.
