U-Net, from Getting Started to Understanding: Study Notes and Recommended Resources

My undergraduate advisor gave me this classic image segmentation paper as my first exposure to the field. After two days of looking things up I had basically understood it, put together a slide deck to summarize it, and am writing this short record of it here.
All the websites I consulted while reading the paper are collected at the end of this post for anyone who needs them. Trust me: even if you are a complete CNN beginner, with these resources you can understand this classic introductory work on image segmentation within two days.
Link to the original paper: U-Net: Convolutional Networks for Biomedical Image Segmentation


Goal of Image Segmentation

  • To assign a class label to each pixel of an image, i.e., pixel-wise image classification.
  • The output is a high-resolution map of the same size as the input, where each pixel is classified into a specific class; in this case, foreground (cells) and background.

U-Net structure

Downsampling (Encoding)

  • “Contracting path to capture context”
  • Each downsampling step halves the x-y size and doubles the number of convolution kernels (feature channels).
  • Enlarges the receptive field, so deeper features capture more context.
  • Reduces overfitting and computational cost.
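As a sanity check, the size/channel bookkeeping of the contracting path can be traced for the paper's 572×572 input (each step applies two 3×3 valid convolutions, then a 2×2 max pooling); a minimal sketch:

```python
def contracting_path_shapes(size=572, channels=64, steps=4):
    """Trace (spatial size, channel count) through the U-Net encoder."""
    shapes = []
    for _ in range(steps):
        size -= 4            # two 3x3 valid convolutions, each trims 2 pixels
        shapes.append((size, channels))
        size //= 2           # 2x2 max pooling halves the x-y size
        channels *= 2        # the next step doubles the number of kernels
    return shapes

print(contracting_path_shapes())
# reproduces the encoder block sizes in the paper's Figure 1:
# [(568, 64), (280, 128), (136, 256), (64, 512)]
```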

Upsampling (Decoding)

  • “Expanding path to localize precisely”
  • Each upsampling step doubles the x-y size and halves the number of convolution kernels.
  • Recovers the resolution lost during downsampling.
  • The final layer maps each feature vector to the pixel-wise segmentation map (foreground and background).
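The spatial side of one decoding step can be sketched as follows. Note this is a simplification: the paper uses a learned 2×2 up-convolution, which is approximated here by nearest-neighbour upsampling; both double the x-y size, and halving the channel count lives in the up-convolution's weights, which are omitted.

```python
import numpy as np

def upsample2x(feat):
    """Double the x-y size of a (C, H, W) feature map by
    nearest-neighbour repetition -> (C, 2H, 2W)."""
    return np.repeat(np.repeat(feat, 2, axis=1), 2, axis=2)

x = np.arange(4.0).reshape(1, 2, 2)   # tiny 2x2 single-channel map
y = upsample2x(x)                     # -> shape (1, 4, 4)
```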

Skip connection (Concatenation)

  • Downsampling learns more of the “what” but loses the “where”; upsampling recovers the “where”.
  • Fuses deep, semantically rich features from the decoder with shallow features from the encoder, which have a small receptive field but high resolution.
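Because U-Net uses valid convolutions, the encoder map is larger than the decoder map it is fused with, so the paper crops it before concatenating. A minimal sketch of this crop-and-concatenate step (the 136/104 sizes below are taken from the paper's Figure 1):

```python
import numpy as np

def center_crop(feat, size):
    """Crop a (C, H, W) map to (C, size, size) around its centre."""
    _, h, w = feat.shape
    top, left = (h - size) // 2, (w - size) // 2
    return feat[:, top:top + size, left:left + size]

def skip_concat(encoder_feat, decoder_feat):
    """Crop the high-resolution encoder map to the decoder map's
    x-y size and concatenate the two channel-wise."""
    cropped = center_crop(encoder_feat, decoder_feat.shape[1])
    return np.concatenate([cropped, decoder_feat], axis=0)

enc = np.zeros((64, 136, 136))   # shallow: high resolution, small context
dec = np.zeros((64, 104, 104))   # deep: low resolution, large context
fused = skip_concat(enc, dec)    # -> (128, 104, 104)
```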

Loss Function

  • The cross-entropy loss function combined with a pixel-wise soft-max over the final feature map:
    $$p_k(\mathbf{x})=\frac{\exp(a_k(\mathbf{x}))}{\sum_{k'=1}^{K}\exp(a_{k'}(\mathbf{x}))},\qquad E=\sum_{\mathbf{x}\in\Omega}w(\mathbf{x})\log\bigl(p_{\ell(\mathbf{x})}(\mathbf{x})\bigr)$$
  • To separate touching objects of the same class, a pixel-wise weighted loss is used, where the separating background labels between touching cells obtain a large weight in the loss function.

Overlap-tile Strategy

  • Due to memory limitations, large images can be divided into patches as input.
  • Predicting the pixels of a patch requires context beyond the patch borders.
  • In the border region, the missing context is extrapolated by mirroring the input image; this mirror padding retains information near the borders and also keeps the output size consistent after a series of valid convolutions.
  • This strategy thus allows the seamless segmentation of arbitrarily large images.
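The strategy can be sketched as below; `predict` stands in for the network, mapping a context-padded patch to a prediction for the patch's interior (for simplicity, the image sides are assumed to be multiples of the tile size).

```python
import numpy as np

def predict_tiled(image, tile, margin, predict):
    """Overlap-tile sketch: mirror-pad the image so every tile sees
    `margin` pixels of context (extrapolated at the image border),
    run `predict` on each padded patch, and stitch the tile outputs
    back together seamlessly.
    `predict` maps a (tile + 2*margin)^2 patch to a tile^2 output."""
    padded = np.pad(image, margin, mode="reflect")  # mirror padding
    out = np.zeros_like(image)
    h, w = image.shape
    for i in range(0, h, tile):
        for j in range(0, w, tile):
            patch = padded[i:i + tile + 2 * margin,
                           j:j + tile + 2 * margin]
            out[i:i + tile, j:j + tile] = predict(patch)
    return out

# stand-in "network": identity on the patch interior
crop_center = lambda p: p[1:-1, 1:-1]
img = np.arange(16.0).reshape(4, 4)
stitched = predict_tiled(img, tile=2, margin=1, predict=crop_center)
```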

Data Augmentation

  • Often only a few annotated training samples are available.
  • Elastic deformation: apply different random elastic distortions to the available annotated images to generate new training samples. Such deformations are quite common in tissue and can be simulated efficiently.
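A crude sketch of the idea: the paper draws random displacement vectors on a coarse 3×3 grid (Gaussian, 10-pixel standard deviation) and interpolates them bicubically; for brevity the sketch below upsamples and resamples with nearest neighbour instead.

```python
import numpy as np

def elastic_deform(img, grid=3, alpha=2.0, seed=0):
    """Simplified elastic deformation of a (H, W) image: smooth-ish
    random displacement field from a coarse grid, nearest-neighbour
    resampling (clipped at the borders)."""
    rng = np.random.default_rng(seed)
    h, w = img.shape
    ch, cw = -(-h // grid), -(-w // grid)  # ceil division: cell sizes
    # coarse displacement fields, upsampled to one vector per pixel
    up = lambda d: np.repeat(np.repeat(d, ch, 0), cw, 1)[:h, :w]
    dy = up(rng.normal(0.0, alpha, (grid, grid)))
    dx = up(rng.normal(0.0, alpha, (grid, grid)))
    ys, xs = np.indices((h, w))
    ys = np.clip(np.rint(ys + dy).astype(int), 0, h - 1)
    xs = np.clip(np.rint(xs + dx).astype(int), 0, w - 1)
    return img[ys, xs]

warped = elastic_deform(np.ones((9, 9)))  # same shape, resampled values
```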

Improvements

  • Make greater use of both shallow and deep features by adding more connections between layers, i.e., multi-level feature fusion. (U-Net++)
  • Use more advanced loss functions.
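As one example of a "more advanced" loss (not part of the original U-Net paper), the soft Dice loss is a popular choice for segmentation because it is insensitive to the foreground/background imbalance:

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss. pred: predicted foreground probabilities in
    [0, 1]; target: binary ground-truth mask; eps avoids 0/0 on
    empty masks."""
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

mask = np.ones((2, 2))
perfect = dice_loss(mask, mask)          # ~0: perfect overlap
worst = dice_loss(np.zeros((2, 2)), mask)  # ~1: no overlap
```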

References

  1. Getting Started with U-Net from Scratch
  2. Studying U-Net, by 周纵苇 on 知乎
  3. Concepts and Principles of Convolutional Neural Networks
  4. U-Net vs. FCN: Differences, Medical-Imaging Performance, Network Details, and Innovations
  5. Transposed Convolution
  6. Cross-Entropy Loss Explained
  7. What Are the Properties and Role of the Softmax Function?
  8. Understanding Semantic Segmentation with UNET --A Salt Identification Case Study
  9. What Is the Principle behind Skip Connections, and Why Does U-Net Use Them?
  10. A Brief Analysis of Skip Connections in Deep Learning
  11. How to Intuitively Understand Momentum in SGD?

The pages above were all very helpful for understanding U-Net. They are roughly ordered by importance, so pick whichever you need. Good luck!
