U-Net, from Getting Started to Understanding: Study Notes and Recommended Resources

My undergraduate advisor gave me this classic image segmentation paper as my first exposure to the field. After two days of looking things up I had basically understood it, put together a slide deck to summarize it, and am writing this short record of it here.
All the websites I consulted while reading the paper are collected at the end of this post for anyone who needs them. Trust me: even if you are a complete CNN beginner, with these resources you can understand this classic introductory work on image segmentation within two days.
Link to the original paper: U-Net: Convolutional Networks for Biomedical Image Segmentation


Goal of Image Segmentation

  • To assign a class label to each pixel of an image, i.e., pixel-wise image classification.
  • The output is a high-resolution map of the same size as the input, where each pixel is classified into a specific class; in this case, foreground (cells) and background.

U-Net structure

Downsampling (Encoding)

  • “Contracting path to capture context”
  • Each downsampling step halves the x-y size and doubles the number of convolution kernels (feature channels).
  • Enlarges the receptive field, so deeper features capture more context.
  • Reduces overfitting and computational cost.
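As a sanity check, the size/channel bookkeeping of the contracting path can be traced for the paper's 572×572 input (each step applies two 3×3 valid convolutions, then a 2×2 max pooling); a minimal sketch:

```python
def contracting_path_shapes(size=572, channels=64, steps=4):
    """Trace (spatial size, channel count) through the U-Net encoder."""
    shapes = []
    for _ in range(steps):
        size -= 4            # two 3x3 valid convolutions, each trims 2 pixels
        shapes.append((size, channels))
        size //= 2           # 2x2 max pooling halves the x-y size
        channels *= 2        # the next step doubles the number of kernels
    return shapes

print(contracting_path_shapes())
# reproduces the encoder block sizes in the paper's Figure 1:
# [(568, 64), (280, 128), (136, 256), (64, 512)]
```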

Upsampling (Decoding)

  • “Expanding path to localize precisely”
  • Each upsampling step doubles the x-y size and halves the number of convolution kernels.
  • Recovers the resolution lost during downsampling.
  • The final layer maps each feature vector to the pixel-wise segmentation map (foreground and background).
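The spatial side of one decoding step can be sketched as follows. Note this is a simplification: the paper uses a learned 2×2 up-convolution, which is approximated here by nearest-neighbour upsampling; both double the x-y size, and halving the channel count lives in the up-convolution's weights, which are omitted.

```python
import numpy as np

def upsample2x(feat):
    """Double the x-y size of a (C, H, W) feature map by
    nearest-neighbour repetition -> (C, 2H, 2W)."""
    return np.repeat(np.repeat(feat, 2, axis=1), 2, axis=2)

x = np.arange(4.0).reshape(1, 2, 2)   # tiny 2x2 single-channel map
y = upsample2x(x)                     # -> shape (1, 4, 4)
```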

Skip connection (Concatenation)

  • Downsampling learns more of the “what” but loses the “where”; upsampling recovers the “where”.
  • Fuses deep, semantically rich features from the decoder with shallow features from the encoder, which have a small receptive field but high resolution.
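Because U-Net uses valid convolutions, the encoder map is larger than the decoder map it is fused with, so the paper crops it before concatenating. A minimal sketch of this crop-and-concatenate step (the 136/104 sizes below are taken from the paper's Figure 1):

```python
import numpy as np

def center_crop(feat, size):
    """Crop a (C, H, W) map to (C, size, size) around its centre."""
    _, h, w = feat.shape
    top, left = (h - size) // 2, (w - size) // 2
    return feat[:, top:top + size, left:left + size]

def skip_concat(encoder_feat, decoder_feat):
    """Crop the high-resolution encoder map to the decoder map's
    x-y size and concatenate the two channel-wise."""
    cropped = center_crop(encoder_feat, decoder_feat.shape[1])
    return np.concatenate([cropped, decoder_feat], axis=0)

enc = np.zeros((64, 136, 136))   # shallow: high resolution, small context
dec = np.zeros((64, 104, 104))   # deep: low resolution, large context
fused = skip_concat(enc, dec)    # -> (128, 104, 104)
```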

Loss Function

  • The cross-entropy loss function combined with a pixel-wise soft-max over the final feature map:
    $$p_k(\mathbf{x})=\frac{\exp(a_k(\mathbf{x}))}{\sum_{k'=1}^{K}\exp(a_{k'}(\mathbf{x}))},\qquad E=\sum_{\mathbf{x}\in\Omega}w(\mathbf{x})\log\bigl(p_{\ell(\mathbf{x})}(\mathbf{x})\bigr)$$
  • To separate touching objects of the same class, a pixel-wise weighted loss is used, where the separating background labels between touching cells obtain a large weight in the loss function.

Overlap-tile Strategy

  • Due to memory limitations, large images can be divided into patches as input.
  • Predicting the pixels of a patch requires context beyond the patch borders.
  • In the border region, the missing context is extrapolated by mirroring the input image; this mirror padding retains information near the borders and also keeps the output size consistent after a series of valid convolutions.
  • This strategy thus allows the seamless segmentation of arbitrarily large images.
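The strategy can be sketched as below; `predict` stands in for the network, mapping a context-padded patch to a prediction for the patch's interior (for simplicity, the image sides are assumed to be multiples of the tile size).

```python
import numpy as np

def predict_tiled(image, tile, margin, predict):
    """Overlap-tile sketch: mirror-pad the image so every tile sees
    `margin` pixels of context (extrapolated at the image border),
    run `predict` on each padded patch, and stitch the tile outputs
    back together seamlessly.
    `predict` maps a (tile + 2*margin)^2 patch to a tile^2 output."""
    padded = np.pad(image, margin, mode="reflect")  # mirror padding
    out = np.zeros_like(image)
    h, w = image.shape
    for i in range(0, h, tile):
        for j in range(0, w, tile):
            patch = padded[i:i + tile + 2 * margin,
                           j:j + tile + 2 * margin]
            out[i:i + tile, j:j + tile] = predict(patch)
    return out

# stand-in "network": identity on the patch interior
crop_center = lambda p: p[1:-1, 1:-1]
img = np.arange(16.0).reshape(4, 4)
stitched = predict_tiled(img, tile=2, margin=1, predict=crop_center)
```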

Data Augmentation

  • Often only a few annotated training samples are available.
  • Elastic deformation: apply different random elastic distortions to the available annotated images to generate new training samples. Such deformations are quite common in tissue and can be simulated efficiently.
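A crude sketch of the idea: the paper draws random displacement vectors on a coarse 3×3 grid (Gaussian, 10-pixel standard deviation) and interpolates them bicubically; for brevity the sketch below upsamples and resamples with nearest neighbour instead.

```python
import numpy as np

def elastic_deform(img, grid=3, alpha=2.0, seed=0):
    """Simplified elastic deformation of a (H, W) image: smooth-ish
    random displacement field from a coarse grid, nearest-neighbour
    resampling (clipped at the borders)."""
    rng = np.random.default_rng(seed)
    h, w = img.shape
    ch, cw = -(-h // grid), -(-w // grid)  # ceil division: cell sizes
    # coarse displacement fields, upsampled to one vector per pixel
    up = lambda d: np.repeat(np.repeat(d, ch, 0), cw, 1)[:h, :w]
    dy = up(rng.normal(0.0, alpha, (grid, grid)))
    dx = up(rng.normal(0.0, alpha, (grid, grid)))
    ys, xs = np.indices((h, w))
    ys = np.clip(np.rint(ys + dy).astype(int), 0, h - 1)
    xs = np.clip(np.rint(xs + dx).astype(int), 0, w - 1)
    return img[ys, xs]

warped = elastic_deform(np.ones((9, 9)))  # same shape, resampled values
```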

Improvements

  • Make greater use of both shallow and deep features by adding more connections between layers, i.e., multi-level feature fusion. (U-Net++)
  • Use more advanced loss functions.
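As one example of a "more advanced" loss (not part of the original U-Net paper), the soft Dice loss is a popular choice for segmentation because it is insensitive to the foreground/background imbalance:

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss. pred: predicted foreground probabilities in
    [0, 1]; target: binary ground-truth mask; eps avoids 0/0 on
    empty masks."""
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

mask = np.ones((2, 2))
perfect = dice_loss(mask, mask)          # ~0: perfect overlap
worst = dice_loss(np.zeros((2, 2)), mask)  # ~1: no overlap
```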

References

  1. Getting Started with U-Net from Scratch
  2. Studying U-Net, by 周纵苇 on 知乎
  3. Concepts and Principles of Convolutional Neural Networks
  4. U-Net vs. FCN: Differences, Medical-Imaging Performance, Network Details, and Innovations
  5. Transposed Convolution
  6. Cross-Entropy Loss Explained
  7. What Are the Properties and Role of the Softmax Function?
  8. Understanding Semantic Segmentation with UNET --A Salt Identification Case Study
  9. What Is the Principle behind Skip Connections, and Why Does U-Net Use Them?
  10. A Brief Analysis of Skip Connections in Deep Learning
  11. How to Intuitively Understand Momentum in SGD?

The pages above were all very helpful for understanding U-Net. They are roughly ordered by importance, so pick whichever you need. Good luck!
